From: "NeilBrown" <neilb@suse.de>
To: "Mel Gorman" <mgorman@techsingularity.net>
Cc: "Linux-MM" <linux-mm@kvack.org>, "Theodore Ts'o" <tytso@mit.edu>,
	"Andreas Dilger" <adilger.kernel@dilger.ca>,
	"Darrick J . Wong" <djwong@kernel.org>,
	"Matthew Wilcox" <willy@infradead.org>,
	"Michal Hocko" <mhocko@suse.com>,
	"Dave Chinner" <david@fromorbit.com>,
	"Rik van Riel" <riel@surriel.com>,
	"Vlastimil Babka" <vbabka@suse.cz>,
	"Johannes Weiner" <hannes@cmpxchg.org>,
	"Jonathan Corbet" <corbet@lwn.net>,
	"Linux-fsdevel" <linux-fsdevel@vger.kernel.org>,
	"LKML" <linux-kernel@vger.kernel.org>,
	"Mel Gorman" <mgorman@techsingularity.net>
Subject: Re: [PATCH 1/5] mm/vmscan: Throttle reclaim until some writeback completes if congested
Date: Tue, 21 Sep 2021 09:19:07 +1000
Message-ID: <163217994752.3992.5443677201798473600@noble.neil.brown.name>
In-Reply-To: <20210920085436.20939-2-mgorman@techsingularity.net>

On Mon, 20 Sep 2021, Mel Gorman wrote:
>  
> +void __acct_reclaim_writeback(pg_data_t *pgdat, struct page *page);
> +static inline void acct_reclaim_writeback(struct page *page)
> +{
> +	pg_data_t *pgdat = page_pgdat(page);
> +
> +	if (atomic_read(&pgdat->nr_reclaim_throttled))
> +		__acct_reclaim_writeback(pgdat, page);

The first thing __acct_reclaim_writeback() does is repeat that
atomic_read().
Should we read it once and pass the value in to
__acct_reclaim_writeback(), or is that an unnecessary
micro-optimisation?
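
To make the suggestion concrete, here is an untested user-space sketch
of the shape I have in mind.  The struct and the model_* names are
hypothetical stand-ins, not the kernel types, and the wake-up is
replaced by a counter so the logic can be exercised outside the kernel.

```c
#include <stdatomic.h>

#define SWAP_CLUSTER_MAX 32UL	/* same value the kernel uses */

/* Hypothetical stand-in for the pg_data_t fields the patch touches. */
struct pgdat_model {
	atomic_int nr_reclaim_throttled;	/* tasks currently throttled */
	unsigned long nr_throttled_written;	/* models NR_THROTTLED_WRITTEN */
	unsigned long nr_reclaim_start;		/* counter value when throttling began */
	int wakeups;				/* counts wake-ups, for illustration only */
};

/* Slow path: takes the already-read count instead of reading it again. */
static void model_acct_slow(struct pgdat_model *pgdat, int nr_throttled)
{
	unsigned long nr_written;

	pgdat->nr_throttled_written++;
	nr_written = pgdat->nr_throttled_written - pgdat->nr_reclaim_start;

	if (nr_written > SWAP_CLUSTER_MAX * (unsigned long)nr_throttled)
		pgdat->wakeups++;	/* the patch wakes pgdat->reclaim_wait here */
}

/* Fast path: a single atomic read, reused by the slow path. */
static inline void model_acct(struct pgdat_model *pgdat)
{
	int nr_throttled = atomic_load(&pgdat->nr_reclaim_throttled);

	if (nr_throttled)
		model_acct_slow(pgdat, nr_throttled);
}
```

With one throttled task the counter has to advance past
SWAP_CLUSTER_MAX before the wake-up fires, which matches the test in
the patch.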


> +/*
> + * Account for pages written if tasks are throttled waiting on dirty
> + * pages to clean. If enough pages have been cleaned since throttling
> + * started then wakeup the throttled tasks.
> + */
> +void __acct_reclaim_writeback(pg_data_t *pgdat, struct page *page)
> +{
> +	unsigned long nr_written;
> +	int nr_throttled = atomic_read(&pgdat->nr_reclaim_throttled);
> +
> +	__inc_node_page_state(page, NR_THROTTLED_WRITTEN);
> +	nr_written = node_page_state(pgdat, NR_THROTTLED_WRITTEN) -
> +		READ_ONCE(pgdat->nr_reclaim_start);
> +
> +	if (nr_written > SWAP_CLUSTER_MAX * nr_throttled)
> +		wake_up_interruptible_all(&pgdat->reclaim_wait);

A simple wake_up() could be used here.  "interruptible" is only needed
if non-interruptible waiters should be left alone.  "_all" is only needed
if there are some exclusive waiters.  Neither of these applies, so I think
the simpler interface is best.
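
As a rough illustration of why the two calls behave the same here, this
is a simplified user-space model of the wake loop.  It is a sketch, not
the kernel code: the model_* names and the waiter struct are
hypothetical, and it assumes wake_up() expands to a TASK_NORMAL wake
with nr_exclusive == 1 while wake_up_interruptible_all() expands to a
TASK_INTERRUPTIBLE wake with nr_exclusive == 0.

```c
#include <stddef.h>

/* Values copied for illustration; see include/linux/sched.h. */
#define TASK_INTERRUPTIBLE	0x0001
#define TASK_UNINTERRUPTIBLE	0x0002
#define TASK_NORMAL		(TASK_INTERRUPTIBLE | TASK_UNINTERRUPTIBLE)

/* Hypothetical stand-in for one wait-queue entry. */
struct waiter_model {
	unsigned int state;	/* sleep state of the waiting task */
	int exclusive;		/* models WQ_FLAG_EXCLUSIVE */
	int woken;
};

/*
 * Wake every waiter whose state matches 'mode', stopping once
 * 'nr_exclusive' exclusive waiters have been woken (0 means no limit).
 * Returns the number of waiters woken.
 */
static int model_wake(struct waiter_model *w, size_t n,
		      unsigned int mode, int nr_exclusive)
{
	int count = 0;

	for (size_t i = 0; i < n; i++) {
		if (w[i].woken || !(w[i].state & mode))
			continue;
		w[i].woken = 1;
		count++;
		if (w[i].exclusive && nr_exclusive && !--nr_exclusive)
			break;
	}
	return count;
}

static int model_wake_up(struct waiter_model *w, size_t n)
{
	return model_wake(w, n, TASK_NORMAL, 1);
}

static int model_wake_up_interruptible_all(struct waiter_model *w, size_t n)
{
	return model_wake(w, n, TASK_INTERRUPTIBLE, 0);
}
```

When no waiter is exclusive, the nr_exclusive limit never triggers, so
both helpers wake the same set of interruptible waiters; wake_up()
additionally covers any waiter sleeping uninterruptibly.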


> +}
> +
>  /* possible outcome of pageout() */
>  typedef enum {
>  	/* failed to write page out, page is locked */
> @@ -1412,9 +1453,8 @@ static unsigned int shrink_page_list(struct list_head *page_list,
>  
>  		/*
>  		 * The number of dirty pages determines if a node is marked
> -		 * reclaim_congested which affects wait_iff_congested. kswapd
> -		 * will stall and start writing pages if the tail of the LRU
> -		 * is all dirty unqueued pages.
> +		 * reclaim_congested. kswapd will stall and start writing
> +		 * pages if the tail of the LRU is all dirty unqueued pages.
>  		 */
>  		page_check_dirty_writeback(page, &dirty, &writeback);
>  		if (dirty || writeback)
> @@ -3180,19 +3220,20 @@ static void shrink_node(pg_data_t *pgdat, struct scan_control *sc)
>  		 * If kswapd scans pages marked for immediate
>  		 * reclaim and under writeback (nr_immediate), it
>  		 * implies that pages are cycling through the LRU
> -		 * faster than they are written so also forcibly stall.
> +		 * faster than they are written so forcibly stall
> +		 * until some pages complete writeback.
>  		 */
>  		if (sc->nr.immediate)
> -			congestion_wait(BLK_RW_ASYNC, HZ/10);
> +			reclaim_throttle(pgdat, VMSCAN_THROTTLE_WRITEBACK, HZ/10);
>  	}
>  
>  	/*
>  	 * Tag a node/memcg as congested if all the dirty pages
>  	 * scanned were backed by a congested BDI and

"congested BDI" doesn't mean anything any more.  Is this a good time to
correct that comment?
This comment seems to refer to the test

      sc->nr.dirty && sc->nr.dirty == sc->nr.congested

a few lines down.  But nr.congested is set from nr_congested which
counts when inode_write_congested() is true - almost never - and when 
"writeback and PageReclaim()".

Is that last test the sign that we are cycling through the LRU too fast?
So the comment could become:

   Tag a node/memcg as congested if all the dirty pages were
   already marked for writeback and immediate reclaim (counted in
   nr.congested).

??
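
For reference, this is how I read the counting that feeds that test,
written as an untested user-space sketch.  The structs and model_*
names are hypothetical, the bdi_congested flag stands in for
inode_write_congested(), and the reclaim flag stands in for
PageReclaim(); the real logic lives in shrink_page_list() and may
differ in detail.

```c
#include <stdbool.h>

/* Hypothetical per-page flags as seen by the scan. */
struct page_model {
	bool dirty;
	bool writeback;
	bool bdi_congested;	/* models inode_write_congested(), almost never true */
	bool reclaim;		/* models PageReclaim() */
};

/* Hypothetical subset of the sc->nr counters. */
struct nr_model {
	unsigned int dirty;
	unsigned int congested;
};

/* Account one scanned page into the counters. */
static void model_count(struct nr_model *nr, const struct page_model *p)
{
	if (p->dirty || p->writeback)
		nr->dirty++;

	if (((p->dirty || p->writeback) && p->bdi_congested) ||
	    (p->writeback && p->reclaim))
		nr->congested++;
}

/* The test a few lines below the comment in shrink_node(). */
static bool model_all_congested(const struct nr_model *nr)
{
	return nr->dirty && nr->dirty == nr->congested;
}
```

If the BDI flag is effectively never set, the test only succeeds when
every dirty page scanned was already under writeback and marked for
immediate reclaim, which is what the proposed comment says.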

Patch seems to make sense to me, but I'm not an expert in this area.

Thanks!
NeilBrown

