regressions.lists.linux.dev archive mirror
* [PATCH 1/1] mm: vmscan: Reduce throttling due to a failure to make progress
@ 2021-11-25 15:18 Mel Gorman
  2021-11-26 10:14 ` Vlastimil Babka
                   ` (2 more replies)
  0 siblings, 3 replies; 40+ messages in thread
From: Mel Gorman @ 2021-11-25 15:18 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Michal Hocko, Vlastimil Babka, Alexey Avramov, Rik van Riel,
	Mike Galbraith, Darrick Wong, regressions, Linux-fsdevel,
	Linux-MM, LKML, Mel Gorman

Mike Galbraith, Alexey Avramov and Darrick Wong all reported similar
problems due to reclaim throttling for excessive lengths of time.
In Alexey's case, a memory hog that should go OOM quickly stalls for several
minutes before the OOM killer fires. In Mike and Darrick's cases, a small
memcg environment stalled excessively even though the system had enough
memory overall.

Commit 69392a403f49 ("mm/vmscan: throttle reclaim when no progress is being
made") introduced the problem although commit a19594ca4a8b ("mm/vmscan:
increase the timeout if page reclaim is not making progress") made it
worse. Systems at or near an OOM state that cannot be recovered must
reach OOM quickly and memcg should kill tasks if a memcg is near OOM.

To address this, only stall for the first zone in the zonelist, reduce
the timeout to 1 tick for VMSCAN_THROTTLE_NOPROGRESS and only stall if
the scan control nr_reclaimed is 0 and kswapd is still active.  If kswapd
has stopped reclaiming due to excessive failures, do not stall at all so
that OOM triggers relatively quickly.

Alexey's test case was the most straightforward

	for i in {1..3}; do tail /dev/zero; done

On vanilla 5.16-rc1, this test stalled and was reset after 10 minutes.
After the patch, the test gets killed after roughly 15 seconds which is
the same length of time taken in 5.15.

Link: https://lore.kernel.org/r/99e779783d6c7fce96448a3402061b9dc1b3b602.camel@gmx.de
Link: https://lore.kernel.org/r/20211124011954.7cab9bb4@mail.inbox.lv
Link: https://lore.kernel.org/r/20211022144651.19914-1-mgorman@techsingularity.net

Fixes: 69392a403f49 ("mm/vmscan: throttle reclaim when no progress is being made")
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Tested-by: Darrick J. Wong <djwong@kernel.org>
---
 mm/vmscan.c | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index fb9584641ac7..176ddd28df21 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1057,7 +1057,17 @@ void reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason)
 
 		break;
 	case VMSCAN_THROTTLE_NOPROGRESS:
-		timeout = HZ/2;
+		timeout = 1;
+
+		/*
+		 * If kswapd is disabled, reschedule if necessary but do not
+		 * throttle as the system is likely near OOM.
+		 */
+		if (pgdat->kswapd_failures >= MAX_RECLAIM_RETRIES) {
+			cond_resched();
+			return;
+		}
+
 		break;
 	case VMSCAN_THROTTLE_ISOLATED:
 		timeout = HZ/50;
@@ -3395,7 +3405,7 @@ static void consider_reclaim_throttle(pg_data_t *pgdat, struct scan_control *sc)
 		return;
 
 	/* Throttle if making no progress at high prioities. */
-	if (sc->priority < DEF_PRIORITY - 2)
+	if (sc->priority < DEF_PRIORITY - 2 && !sc->nr_reclaimed)
 		reclaim_throttle(pgdat, VMSCAN_THROTTLE_NOPROGRESS);
 }
 
@@ -3415,6 +3425,7 @@ static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
 	unsigned long nr_soft_scanned;
 	gfp_t orig_mask;
 	pg_data_t *last_pgdat = NULL;
+	pg_data_t *first_pgdat = NULL;
 
 	/*
 	 * If the number of buffer_heads in the machine exceeds the maximum
@@ -3478,14 +3489,18 @@ static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
 			/* need some check for avoid more shrink_zone() */
 		}
 
+		if (!first_pgdat)
+			first_pgdat = zone->zone_pgdat;
+
 		/* See comment about same check for global reclaim above */
 		if (zone->zone_pgdat == last_pgdat)
 			continue;
 		last_pgdat = zone->zone_pgdat;
 		shrink_node(zone->zone_pgdat, sc);
-		consider_reclaim_throttle(zone->zone_pgdat, sc);
 	}
 
+	consider_reclaim_throttle(first_pgdat, sc);
+
 	/*
 	 * Restore to original mask to avoid the impact on the caller if we
 	 * promoted it to __GFP_HIGHMEM.
-- 
2.31.1


* Re: [PATCH 1/1] mm: vmscan: Reduce throttling due to a failure to make progress
  2021-11-25 15:18 [PATCH 1/1] mm: vmscan: Reduce throttling due to a failure to make progress Mel Gorman
@ 2021-11-26 10:14 ` Vlastimil Babka
  2021-11-26 10:26 ` Mike Galbraith
  2021-11-26 16:12 ` Alexey Avramov
  2 siblings, 0 replies; 40+ messages in thread
From: Vlastimil Babka @ 2021-11-26 10:14 UTC (permalink / raw)
  To: Mel Gorman, Andrew Morton
  Cc: Michal Hocko, Alexey Avramov, Rik van Riel, Mike Galbraith,
	Darrick Wong, regressions, Linux-fsdevel, Linux-MM, LKML

On 11/25/21 16:18, Mel Gorman wrote:
> Mike Galbraith, Alexey Avramov and Darrick Wong all reported similar
> problems due to reclaim throttling for excessive lengths of time.
> In Alexey's case, a memory hog that should go OOM quickly stalls for several
> minutes before the OOM killer fires. In Mike and Darrick's cases, a small
> memcg environment stalled excessively even though the system had enough
> memory overall.
> 
> Commit 69392a403f49 ("mm/vmscan: throttle reclaim when no progress is being
> made") introduced the problem although commit a19594ca4a8b ("mm/vmscan:
> increase the timeout if page reclaim is not making progress") made it
> worse. Systems at or near an OOM state that cannot be recovered must
> reach OOM quickly and memcg should kill tasks if a memcg is near OOM.
> 
> To address this, only stall for the first zone in the zonelist, reduce
> the timeout to 1 tick for VMSCAN_THROTTLE_NOPROGRESS and only stall if
> the scan control nr_reclaimed is 0 and kswapd is still active.  If kswapd
> has stopped reclaiming due to excessive failures, do not stall at all so
> that OOM triggers relatively quickly.
> 
> Alexey's test case was the most straightforward
> 
> 	for i in {1..3}; do tail /dev/zero; done
> 
> On vanilla 5.16-rc1, this test stalled and was reset after 10 minutes.
> After the patch, the test gets killed after roughly 15 seconds which is
> the same length of time taken in 5.15.
> 
> Link: https://lore.kernel.org/r/99e779783d6c7fce96448a3402061b9dc1b3b602.camel@gmx.de
> Link: https://lore.kernel.org/r/20211124011954.7cab9bb4@mail.inbox.lv
> Link: https://lore.kernel.org/r/20211022144651.19914-1-mgorman@techsingularity.net

Should probably include Reported-by: tags too?

> Fixes: 69392a403f49 ("mm/vmscan: throttle reclaim when no progress is being made")
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
> Tested-by: Darrick J. Wong <djwong@kernel.org>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

> ---
>  mm/vmscan.c | 21 ++++++++++++++++++---
>  1 file changed, 18 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index fb9584641ac7..176ddd28df21 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1057,7 +1057,17 @@ void reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason)
>  
>  		break;
>  	case VMSCAN_THROTTLE_NOPROGRESS:
> -		timeout = HZ/2;
> +		timeout = 1;
> +
> +		/*
> +		 * If kswapd is disabled, reschedule if necessary but do not
> +		 * throttle as the system is likely near OOM.
> +		 */
> +		if (pgdat->kswapd_failures >= MAX_RECLAIM_RETRIES) {
> +			cond_resched();
> +			return;
> +		}
> +
>  		break;
>  	case VMSCAN_THROTTLE_ISOLATED:
>  		timeout = HZ/50;
> @@ -3395,7 +3405,7 @@ static void consider_reclaim_throttle(pg_data_t *pgdat, struct scan_control *sc)
>  		return;
>  
>  	/* Throttle if making no progress at high prioities. */
> -	if (sc->priority < DEF_PRIORITY - 2)
> +	if (sc->priority < DEF_PRIORITY - 2 && !sc->nr_reclaimed)
>  		reclaim_throttle(pgdat, VMSCAN_THROTTLE_NOPROGRESS);
>  }
>  
> @@ -3415,6 +3425,7 @@ static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
>  	unsigned long nr_soft_scanned;
>  	gfp_t orig_mask;
>  	pg_data_t *last_pgdat = NULL;
> +	pg_data_t *first_pgdat = NULL;
>  
>  	/*
>  	 * If the number of buffer_heads in the machine exceeds the maximum
> @@ -3478,14 +3489,18 @@ static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
>  			/* need some check for avoid more shrink_zone() */
>  		}
>  
> +		if (!first_pgdat)
> +			first_pgdat = zone->zone_pgdat;
> +
>  		/* See comment about same check for global reclaim above */
>  		if (zone->zone_pgdat == last_pgdat)
>  			continue;
>  		last_pgdat = zone->zone_pgdat;
>  		shrink_node(zone->zone_pgdat, sc);
> -		consider_reclaim_throttle(zone->zone_pgdat, sc);
>  	}
>  
> +	consider_reclaim_throttle(first_pgdat, sc);
> +
>  	/*
>  	 * Restore to original mask to avoid the impact on the caller if we
>  	 * promoted it to __GFP_HIGHMEM.
> 


* Re: [PATCH 1/1] mm: vmscan: Reduce throttling due to a failure to make progress
  2021-11-25 15:18 [PATCH 1/1] mm: vmscan: Reduce throttling due to a failure to make progress Mel Gorman
  2021-11-26 10:14 ` Vlastimil Babka
@ 2021-11-26 10:26 ` Mike Galbraith
  2021-11-26 16:12 ` Alexey Avramov
  2 siblings, 0 replies; 40+ messages in thread
From: Mike Galbraith @ 2021-11-26 10:26 UTC (permalink / raw)
  To: Mel Gorman, Andrew Morton
  Cc: Michal Hocko, Vlastimil Babka, Alexey Avramov, Rik van Riel,
	Darrick Wong, regressions, Linux-fsdevel, Linux-MM, LKML

On Thu, 2021-11-25 at 15:18 +0000, Mel Gorman wrote:
> Mike Galbraith, Alexey Avramov and Darrick Wong all reported similar
> problems due to reclaim throttling for excessive lengths of time.
> In Alexey's case, a memory hog that should go OOM quickly stalls for several
> minutes before the OOM killer fires. In Mike and Darrick's cases, a small
> memcg environment stalled excessively even though the system had enough
> memory overall.
>
> Commit 69392a403f49 ("mm/vmscan: throttle reclaim when no progress is being
> made") introduced the problem although commit a19594ca4a8b ("mm/vmscan:
> increase the timeout if page reclaim is not making progress") made it
> worse. Systems at or near an OOM state that cannot be recovered must
> reach OOM quickly and memcg should kill tasks if a memcg is near OOM.
>
> To address this, only stall for the first zone in the zonelist, reduce
> the timeout to 1 tick for VMSCAN_THROTTLE_NOPROGRESS and only stall if
> the scan control nr_reclaimed is 0 and kswapd is still active.  If kswapd
> has stopped reclaiming due to excessive failures, do not stall at all so
> that OOM triggers relatively quickly.

This patch helped a ton with LTP testcases, but the behavior of
test_1() in memcg_regression_test.sh still looks pretty darn bad...

homer:..debug/tracing # tail -1 /trace
    memcg_test_1-4683    [002] .....   282.319617: out_of_memory+0x194/0x440: !!oc->chosen:1
homer:..debug/tracing # grep memcg_test_1-4683 /trace|grep sleepy|wc -l
190 !!!

That one memcg_test_1 instance, of 400 spawned in a memcg with its
limit_in_bytes set to zero, slept 190 times on the way to oom-kill,
leaving a regression test that used to complete in a fraction of a
second still taking over 8 minutes even with the huge improvement
$subject made.

Poking it with the sharp stick below took it down to 20 encounters with
reclaim_throttle(), and a tad under a minute for test_1() to complete.

Reasoning: try_charge_memcg() will loop as long as there is ANY
progress, and each call to try_to_free_mem_cgroup_pages() therein now
entails multiple encounters with reclaim_throttle(), allowing plenty of
time for progress-enabling events to occur and their benefit to be
reaped by the time we return. Looping again and again after having been
throttled numerous times did NOT help at all, so it now seems to
constitute pointless thumb twiddling.  Or?

---
 mm/memcontrol.c |    5 -----
 1 file changed, 5 deletions(-)

--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2584,7 +2584,6 @@ static int try_charge_memcg(struct mem_c
 			unsigned int nr_pages)
 {
 	unsigned int batch = max(MEMCG_CHARGE_BATCH, nr_pages);
-	int nr_retries = MAX_RECLAIM_RETRIES;
 	struct mem_cgroup *mem_over_limit;
 	struct page_counter *counter;
 	enum oom_status oom_status;
@@ -2675,9 +2674,6 @@ static int try_charge_memcg(struct mem_c
 	if (mem_cgroup_wait_acct_move(mem_over_limit))
 		goto retry;

-	if (nr_retries--)
-		goto retry;
-
 	if (gfp_mask & __GFP_RETRY_MAYFAIL)
 		goto nomem;

@@ -2694,7 +2690,6 @@ static int try_charge_memcg(struct mem_c
 		       get_order(nr_pages * PAGE_SIZE));
 	if (oom_status == OOM_SUCCESS) {
 		passed_oom = true;
-		nr_retries = MAX_RECLAIM_RETRIES;
 		goto retry;
 	}
 nomem:


* Re: [PATCH 1/1] mm: vmscan: Reduce throttling due to a failure to make progress
  2021-11-25 15:18 [PATCH 1/1] mm: vmscan: Reduce throttling due to a failure to make progress Mel Gorman
  2021-11-26 10:14 ` Vlastimil Babka
  2021-11-26 10:26 ` Mike Galbraith
@ 2021-11-26 16:12 ` Alexey Avramov
  2021-11-26 16:52   ` Mel Gorman
  2021-11-29 15:01   ` Mel Gorman
  2 siblings, 2 replies; 40+ messages in thread
From: Alexey Avramov @ 2021-11-26 16:12 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Andrew Morton, Michal Hocko, Vlastimil Babka, Rik van Riel,
	Mike Galbraith, Darrick Wong, regressions, Linux-fsdevel,
	Linux-MM, LKML

>After the patch, the test gets killed after roughly 15 seconds which is
>the same length of time taken in 5.15.

In my tests, 5.15 still performs much better.

New question: does timeout=1 make sense? Will it save CPU?

* Re: [PATCH 1/1] mm: vmscan: Reduce throttling due to a failure to make progress
  2021-11-26 16:12 ` Alexey Avramov
@ 2021-11-26 16:52   ` Mel Gorman
  2021-11-27 19:26     ` Alexey Avramov
  2021-11-29 15:01   ` Mel Gorman
  1 sibling, 1 reply; 40+ messages in thread
From: Mel Gorman @ 2021-11-26 16:52 UTC (permalink / raw)
  To: Alexey Avramov
  Cc: Andrew Morton, Michal Hocko, Vlastimil Babka, Rik van Riel,
	Mike Galbraith, Darrick Wong, regressions, Linux-fsdevel,
	Linux-MM, LKML

On Sat, Nov 27, 2021 at 01:12:46AM +0900, Alexey Avramov wrote:
> >After the patch, the test gets killed after roughly 15 seconds which is
> >the same length of time taken in 5.15.
> 
> In my tests, 5.15 still performs much better.
> 

How much better?

> New question: does timeout=1 make sense? Will it save CPU?

It's the minimum stall time available -- it's 1 tick, so the exact stall
time depends on HZ, and yes, the point is to stall waiting for something to
happen. It'll get woken early if another reclaimer makes forward progress.
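
For context, a minimal sketch (not part of the patch) of what one tick of
stall costs in wall-clock terms; jiffies_to_msecs() is the standard kernel
helper, and the values in the comment assume common CONFIG_HZ choices:

	#include <linux/jiffies.h>

	/* One tick of stall: 10ms at HZ=100, 4ms at HZ=250, 1ms at HZ=1000. */
	static unsigned int noprogress_stall_ms(void)
	{
		return jiffies_to_msecs(1);
	}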

This patch on top will stall less. I sent it already but it may not be
clear that I meant it to be applied on top of this patch.

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 176ddd28df21..167ea4f324a8 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3404,8 +3404,8 @@ static void consider_reclaim_throttle(pg_data_t *pgdat, struct scan_control *sc)
 	if (current_is_kswapd())
 		return;
 
-	/* Throttle if making no progress at high prioities. */
-	if (sc->priority < DEF_PRIORITY - 2 && !sc->nr_reclaimed)
+	/* Throttle if making no progress at high priority. */
+	if (sc->priority == 1 && !sc->nr_reclaimed)
 		reclaim_throttle(pgdat, VMSCAN_THROTTLE_NOPROGRESS);
 }
 

-- 
Mel Gorman
SUSE Labs

* Re: [PATCH 1/1] mm: vmscan: Reduce throttling due to a failure to make progress
  2021-11-26 16:52   ` Mel Gorman
@ 2021-11-27 19:26     ` Alexey Avramov
  2021-11-28 10:00       ` Mike Galbraith
  0 siblings, 1 reply; 40+ messages in thread
From: Alexey Avramov @ 2021-11-27 19:26 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Andrew Morton, Michal Hocko, Vlastimil Babka, Rik van Riel,
	Mike Galbraith, Darrick Wong, regressions, Linux-fsdevel,
	Linux-MM, LKML

I will present the results of the new tests here. 

TLDR;
=====
None of Mel's patches prevent stalls in my tests.

Test case:
Running firefox with youtube tab (basic desktop workload) and starting
$ for i in {1..10}; do tail /dev/zero; done
-- 1. with noswap and 2. with SwapTotal > MemTotal and swappiness=0.

I will log PSI metrics using the latest psi2log (part of the nohang [1]
package, src: [2]) and log some meminfo metrics using mem2log [3].

psi2log cmd:
$ psi2log -m 2 -i 1 -l /tmpfs/psi

mem2log cmd:
$ mem2log -i 1 -l /tmpfs/mem

The OS is Debian 9 x86_64 on HDD with ext4+LUKS, MemTotal: 11.5 GiB.

Repo with logs:
https://github.com/hakavlad/cache-tests/tree/main/516-reclaim-throttle

5.15
====

- With noswap, psi2log summary:

2021-11-27 20:51:14,018: Stall times for the last 141.6s:
2021-11-27 20:51:14,018: -----------
2021-11-27 20:51:14,018: some cpu     1.3s, avg 0.9%
2021-11-27 20:51:14,018: -----------
2021-11-27 20:51:14,018: some io      99.1s, avg 70.0%
2021-11-27 20:51:14,018: full io      93.4s, avg 66.0%
2021-11-27 20:51:14,018: -----------
2021-11-27 20:51:14,019: some memory  4.3s, avg 3.1%
2021-11-27 20:51:14,019: full memory  4.2s, avg 3.0%

-- average full memory stall is 0.4s per `tail /dev/zero` run.

full psi2log log:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/515/noswap/psi

full mem2log log:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/515/noswap/mem

- With swappiness=0 and SwapTotal > 0:

2021-11-27 20:57:30,217: Stall times for the last 141.6s:
2021-11-27 20:57:30,217: -----------
2021-11-27 20:57:30,217: some cpu     1.2s, avg 0.8%
2021-11-27 20:57:30,217: -----------
2021-11-27 20:57:30,217: some io      100.7s, avg 71.1%
2021-11-27 20:57:30,218: full io      94.8s, avg 67.0%
2021-11-27 20:57:30,218: -----------
2021-11-27 20:57:30,218: some memory  3.9s, avg 2.8%
2021-11-27 20:57:30,218: full memory  3.9s, avg 2.7%

-- 0.4s memory stall per `tail /dev/zero` run.

full psi2log log:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/515/swappiness0/psi

full mem2log log:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/515/swappiness0/mem


5.16-rc2 vanilla
================

- with noswap

psi2log summary:

2021-11-27 21:45:09,307: Stall times for the last 1086.5s:
2021-11-27 21:45:09,307: -----------
2021-11-27 21:45:09,307: some cpu     4.2s, avg 0.4%
2021-11-27 21:45:09,307: -----------
2021-11-27 21:45:09,307: some io      337.6s, avg 31.1%
2021-11-27 21:45:09,307: full io      325.2s, avg 29.9%
2021-11-27 21:45:09,307: -----------
2021-11-27 21:45:09,307: some memory  888.7s, avg 81.8%
2021-11-27 21:45:09,307: full memory  886.2s, avg 81.6%

average full memory stall is 87s per `tail /dev/zero` run.

full psi2log:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/vanilla/noswap/psi

full mem2log:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/vanilla/noswap/mem

- And even worse with swappiness=0:

2021-11-27 22:33:19,207: Stall times for the last 2677.7s:
2021-11-27 22:33:19,208: -----------
2021-11-27 22:33:19,208: some cpu     14.0s, avg 0.5%
2021-11-27 22:33:19,208: -----------
2021-11-27 22:33:19,208: some io      306.7s, avg 11.5%
2021-11-27 22:33:19,208: full io      292.4s, avg 10.9%
2021-11-27 22:33:19,208: -----------
2021-11-27 22:33:19,208: some memory  2504.8s, avg 93.5%
2021-11-27 22:33:19,209: full memory  2498.2s, avg 93.3%

What a horror!

In dmesg: a lot of
[] Purging GPU memory, 0 pages freed, 0 pages still pinned, 2112 pages left available.
(and dying journald)

psi2log:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/vanilla/swappiness0/psi

mem2log:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/vanilla/swappiness0/mem

dmesg:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/vanilla/dmesg

5.16-rc2 and Mel's 1st patch [4]
================================

- with noswap

Summary:

2021-11-27 23:28:23,707: Stall times for the last 1035.1s:
2021-11-27 23:28:23,707: -----------
2021-11-27 23:28:23,707: some cpu     5.6s, avg 0.5%
2021-11-27 23:28:23,708: -----------
2021-11-27 23:28:23,708: some io      336.5s, avg 32.5%
2021-11-27 23:28:23,708: full io      322.9s, avg 31.2%
2021-11-27 23:28:23,708: -----------
2021-11-27 23:28:23,708: some memory  851.1s, avg 82.2%
2021-11-27 23:28:23,708: full memory  848.3s, avg 82.0%

full psi2log:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/patch1/noswap/psi

mem2log log:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/patch1/noswap/mem

- with swappiness=0

Summary:

2021-11-27 23:50:25,722: Stall times for the last 1076.9s:
2021-11-27 23:50:25,722: -----------
2021-11-27 23:50:25,722: some cpu     12.2s, avg 1.1%
2021-11-27 23:50:25,722: -----------
2021-11-27 23:50:25,723: some io      232.1s, avg 21.6%
2021-11-27 23:50:25,723: full io      218.5s, avg 20.3%
2021-11-27 23:50:25,723: -----------
2021-11-27 23:50:25,723: some memory  957.5s, avg 88.9%
2021-11-27 23:50:25,723: full memory  951.2s, avg 88.3%

It's better than vanilla.

5.16-rc2 and Mel's 2nd patch [5]
================================

- with noswap

Summary:

2021-11-28 00:29:38,558: Stall times for the last 598.7s:
2021-11-28 00:29:38,558: -----------
2021-11-28 00:29:38,558: some cpu     5.2s, avg 0.9%
2021-11-28 00:29:38,558: -----------
2021-11-28 00:29:38,559: some io      235.0s, avg 39.3%
2021-11-28 00:29:38,559: full io      220.2s, avg 36.8%
2021-11-28 00:29:38,559: -----------
2021-11-28 00:29:38,559: some memory  439.0s, avg 73.3%
2021-11-28 00:29:38,559: full memory  437.5s, avg 73.1%

It's a significant improvement, but far from the expected result.

mem:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/patch2/noswap/mem

psi:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/patch2/noswap/psi

In some cases the stall was quite short.

- with swappiness=0

Summary:

2021-11-28 00:53:41,659: Stall times for the last 1239.0s:
2021-11-28 00:53:41,659: -----------
2021-11-28 00:53:41,660: some cpu     9.4s, avg 0.8%
2021-11-28 00:53:41,660: -----------
2021-11-28 00:53:41,660: some io      206.5s, avg 16.7%
2021-11-28 00:53:41,660: full io      195.0s, avg 15.7%
2021-11-28 00:53:41,660: -----------
2021-11-28 00:53:41,660: some memory  1113.0s, avg 89.8%
2021-11-28 00:53:41,661: full memory  1107.1s, avg 89.3%

mem:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/patch2/swappiness0/mem

psi:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/patch2/swappiness0/psi


5.16-rc2 and Mel's 3rd patch [6] on top of 2nd
==============================================

- with noswap

Summary:

2021-11-28 01:35:26,077: Stall times for the last 949.4s:
2021-11-28 01:35:26,077: -----------
2021-11-28 01:35:26,078: some cpu     5.4s, avg 0.6%
2021-11-28 01:35:26,078: -----------
2021-11-28 01:35:26,078: some io      269.0s, avg 28.3%
2021-11-28 01:35:26,078: full io      259.4s, avg 27.3%
2021-11-28 01:35:26,078: -----------
2021-11-28 01:35:26,079: some memory  759.9s, avg 80.0%
2021-11-28 01:35:26,079: full memory  757.5s, avg 79.8%

mem:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/patch3/noswap/mem

psi:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/patch3/noswap/psi

- with swappiness=0

Summary:

2021-11-28 01:59:40,253: Stall times for the last 1127.8s:
2021-11-28 01:59:40,253: -----------
2021-11-28 01:59:40,253: some cpu     6.8s, avg 0.6%
2021-11-28 01:59:40,253: -----------
2021-11-28 01:59:40,253: some io      228.3s, avg 20.2%
2021-11-28 01:59:40,253: full io      218.7s, avg 19.4%
2021-11-28 01:59:40,253: -----------
2021-11-28 01:59:40,253: some memory  987.4s, avg 87.6%
2021-11-28 01:59:40,253: full memory  982.3s, avg 87.1%

mem log:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/patch3/swappiness0/mem

psi log:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/patch3/swappiness0/psi

¯\_(ツ)_/¯

Should I test anything else?

[1] https://github.com/hakavlad/nohang
[2] https://raw.githubusercontent.com/hakavlad/nohang/master/src/psi2log
[3] https://github.com/hakavlad/mem2log
[4] https://lore.kernel.org/lkml/20211124115007.GG3366@techsingularity.net/
[5] https://lore.kernel.org/lkml/20211124143303.GH3366@techsingularity.net/
[6] https://lore.kernel.org/lkml/20211126162416.GK3366@techsingularity.net/

* Re: [PATCH 1/1] mm: vmscan: Reduce throttling due to a failure to make progress
  2021-11-27 19:26     ` Alexey Avramov
@ 2021-11-28 10:00       ` Mike Galbraith
  2021-11-28 11:39         ` Mike Galbraith
                           ` (2 more replies)
  0 siblings, 3 replies; 40+ messages in thread
From: Mike Galbraith @ 2021-11-28 10:00 UTC (permalink / raw)
  To: Alexey Avramov, Mel Gorman
  Cc: Andrew Morton, Michal Hocko, Vlastimil Babka, Rik van Riel,
	Darrick Wong, regressions, Linux-fsdevel, Linux-MM, LKML

On Sun, 2021-11-28 at 04:26 +0900, Alexey Avramov wrote:
> I will present the results of the new tests here.
>
> TLDR;
> =====
> None of Mel's patches prevent stalls in my tests.

Seems there may be a problem with the THROTTLE_WRITEBACK bits..

> $ for i in {1..10}; do tail /dev/zero; done
> -- 1. with noswap

..because the bandaid below (made of 8cd7c588 shards) on top of Mel's
last pulled that one-liner's very pointy fangs.

---
 mm/backing-dev.c |    5 +++++
 mm/vmscan.c      |    8 +++++++-
 2 files changed, 12 insertions(+), 1 deletion(-)

--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -1055,3 +1055,8 @@ long congestion_wait(int sync, long time
 	return ret;
 }
 EXPORT_SYMBOL(congestion_wait);
+
+int async_bdi_congested(void)
+{
+	return atomic_read(&nr_wb_congested[BLK_RW_ASYNC]) != 0;
+}
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1021,6 +1021,8 @@ static void handle_write_error(struct ad
 	unlock_page(page);
 }

+extern int async_bdi_congested(void);
+
 void reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason)
 {
 	wait_queue_head_t *wqh = &pgdat->reclaim_wait[reason];
@@ -1048,6 +1050,10 @@ void reclaim_throttle(pg_data_t *pgdat,
 	 */
 	switch(reason) {
 	case VMSCAN_THROTTLE_WRITEBACK:
+		if (!async_bdi_congested()) {
+			cond_resched();
+			return;
+		}
 		timeout = HZ/10;

 		if (atomic_inc_return(&pgdat->nr_writeback_throttled) == 1) {
@@ -1079,7 +1085,7 @@ void reclaim_throttle(pg_data_t *pgdat,
 	}

 	prepare_to_wait(wqh, &wait, TASK_UNINTERRUPTIBLE);
-	ret = schedule_timeout(timeout);
+	ret = io_schedule_timeout(timeout);
 	finish_wait(wqh, &wait);

 	if (reason == VMSCAN_THROTTLE_WRITEBACK)


* Re: [PATCH 1/1] mm: vmscan: Reduce throttling due to a failure to make progress
  2021-11-28 10:00       ` Mike Galbraith
@ 2021-11-28 11:39         ` Mike Galbraith
  2021-11-28 12:35           ` Mike Galbraith
  2021-11-28 11:58         ` Alexey Avramov
  2021-11-29  8:26         ` Mel Gorman
  2 siblings, 1 reply; 40+ messages in thread
From: Mike Galbraith @ 2021-11-28 11:39 UTC (permalink / raw)
  To: Alexey Avramov, Mel Gorman
  Cc: Andrew Morton, Michal Hocko, Vlastimil Babka, Rik van Riel,
	Darrick Wong, regressions, Linux-fsdevel, Linux-MM, LKML

On Sun, 2021-11-28 at 11:00 +0100, Mike Galbraith wrote:
> On Sun, 2021-11-28 at 04:26 +0900, Alexey Avramov wrote:
> > I will present the results of the new tests here.
> >
> > TLDR;
> > =====
> > None of Mel's patches prevent stalls in my tests.
>
> Seems there may be a problem with the THROTTLE_WRITEBACK bits..

Disregard.  I backed the hacklet out again, and the one-liner that had
ground my box to fine powder was harmless.  Hmm.

	-Mike
>

* Re: [PATCH 1/1] mm: vmscan: Reduce throttling due to a failure to make progress
  2021-11-28 10:00       ` Mike Galbraith
  2021-11-28 11:39         ` Mike Galbraith
@ 2021-11-28 11:58         ` Alexey Avramov
  2021-11-29  8:26         ` Mel Gorman
  2 siblings, 0 replies; 40+ messages in thread
From: Alexey Avramov @ 2021-11-28 11:58 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Mel Gorman, Andrew Morton, Michal Hocko, Vlastimil Babka,
	Rik van Riel, Darrick Wong, regressions, Linux-fsdevel, Linux-MM,
	LKML

I tested this patch [1] on top of Mel's 2nd patch [2].

Test case:
Running firefox with youtube tab (basic desktop workload) and starting
$ for i in {1..10}; do tail /dev/zero; done
-- 1. with noswap and 2. with SwapTotal > MemTotal and swappiness=0.


- With noswap

2021-11-28 20:13:18,755: Stall times for the last 296.0s:
2021-11-28 20:13:18,755: -----------
2021-11-28 20:13:18,755: some cpu     8.6s, avg 2.9%
2021-11-28 20:13:18,755: -----------
2021-11-28 20:13:18,755: some io      253.7s, avg 85.7%
2021-11-28 20:13:18,755: full io      228.9s, avg 77.3%
2021-11-28 20:13:18,755: -----------
2021-11-28 20:13:18,755: some memory  135.1s, avg 45.7%
2021-11-28 20:13:18,755: full memory  134.3s, avg 45.4%

mem:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/patch4/noswap/mem

psi:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/patch4/noswap/psi


- With swappiness=0

2021-11-28 20:29:35,229: Stall times for the last 257.9s:
2021-11-28 20:29:35,229: -----------
2021-11-28 20:29:35,229: some cpu     7.7s, avg 3.0%
2021-11-28 20:29:35,229: -----------
2021-11-28 20:29:35,229: some io      223.1s, avg 86.5%
2021-11-28 20:29:35,230: full io      196.2s, avg 76.1%
2021-11-28 20:29:35,230: -----------
2021-11-28 20:29:35,230: some memory  170.8s, avg 66.2%
2021-11-28 20:29:35,230: full memory  167.9s, avg 65.1%

mem:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/patch4/swappiness0/mem

psi:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/patch4/swappiness0/psi


There are still some freezes, although in some cases there are no significant stalls. 

13-17s average stall vs 0.4s with 5.15

[1] https://lore.kernel.org/lkml/252cd5acd9bf6588ec87ce02884925c737b6a8b7.camel@gmx.de/
[2] https://lore.kernel.org/lkml/20211124143303.GH3366@techsingularity.net/

* Re: [PATCH 1/1] mm: vmscan: Reduce throttling due to a failure to make progress
  2021-11-28 11:39         ` Mike Galbraith
@ 2021-11-28 12:35           ` Mike Galbraith
  2021-11-28 18:38             ` Mike Galbraith
  0 siblings, 1 reply; 40+ messages in thread
From: Mike Galbraith @ 2021-11-28 12:35 UTC (permalink / raw)
  To: Alexey Avramov, Mel Gorman
  Cc: Andrew Morton, Michal Hocko, Vlastimil Babka, Rik van Riel,
	Darrick Wong, regressions, Linux-fsdevel, Linux-MM, LKML

On Sun, 2021-11-28 at 12:39 +0100, Mike Galbraith wrote:
> On Sun, 2021-11-28 at 11:00 +0100, Mike Galbraith wrote:
> > On Sun, 2021-11-28 at 04:26 +0900, Alexey Avramov wrote:
> > > I will present the results of the new tests here.
> > >
> > > TLDR;
> > > =====
> > > None of Mel's patches prevent stalls in my tests.
> >
> > Seems there may be a problem with the THROTTLE_WRITEBACK bits..
>
> Disregard.  I backed the hacklet out again, and the one-liner that had
> ground my box to fine powder was harmless.  Hmm.

Grr, no, disregard the disregard; it's just annoyingly variable, and may
perform akin to 5.15 but most likely won't.  With the bandaid, it _seems_
to be consistently on par with 5.15.

The bandaid likely just breaks the writeback bits all to pieces, but that's
ok; if breakage makes things work better, they need more breaking :)

	-Mike

* Re: [PATCH 1/1] mm: vmscan: Reduce throttling due to a failure to make progress
  2021-11-28 12:35           ` Mike Galbraith
@ 2021-11-28 18:38             ` Mike Galbraith
  0 siblings, 0 replies; 40+ messages in thread
From: Mike Galbraith @ 2021-11-28 18:38 UTC (permalink / raw)
  To: Alexey Avramov, Mel Gorman
  Cc: Andrew Morton, Michal Hocko, Vlastimil Babka, Rik van Riel,
	Darrick Wong, regressions, Linux-fsdevel, Linux-MM, LKML

On Sun, 2021-11-28 at 13:35 +0100, Mike Galbraith wrote:
>
> Bandaid likely just break the writeback bits all to pieces, but that's
> ok, if breakage makes things work better, they need more breaking :)

@@ -1048,6 +1050,10 @@ void reclaim_throttle(pg_data_t *pgdat,
 	 */
 	switch(reason) {
 	case VMSCAN_THROTTLE_WRITEBACK:
+		if (!async_bdi_congested()) {
+			cond_resched();
+			return;
+		}

And indeed, that's the only THROTTLE_WRITEBACK path I've seen taken.
Not pulling the plug made no difference to the tail /dev/zero test,
leaving only the bits I swiped from the defunct wait_iff_congested() as
the cause of the behavior delta.  Actually sleeping there apparently stings.

	-Mike

* Re: [PATCH 1/1] mm: vmscan: Reduce throttling due to a failure to make progress
  2021-11-28 10:00       ` Mike Galbraith
  2021-11-28 11:39         ` Mike Galbraith
  2021-11-28 11:58         ` Alexey Avramov
@ 2021-11-29  8:26         ` Mel Gorman
  2 siblings, 0 replies; 40+ messages in thread
From: Mel Gorman @ 2021-11-29  8:26 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Alexey Avramov, Andrew Morton, Michal Hocko, Vlastimil Babka,
	Rik van Riel, Darrick Wong, regressions, Linux-fsdevel, Linux-MM,
	LKML

On Sun, Nov 28, 2021 at 11:00:59AM +0100, Mike Galbraith wrote:
> On Sun, 2021-11-28 at 04:26 +0900, Alexey Avramov wrote:
> > I will present the results of the new tests here.
> >
> > TLDR;
> > =====
> > None of Mel's patches prevent stalls in my tests.
> 
> Seems there may be a problem with the THROTTLE_WRITEBACK bits..
> 
> > $ for i in {1..10}; do tail /dev/zero; done
> > -- 1. with noswap
> 
> ..because the bandaid below (made of 8cd7c588 shards) on top of Mel's
> last pulled that one-liner's very pointy fangs.
> 

This disables writeback throttling in most cases as bdi congested is not
updated by the block layer.

-- 
Mel Gorman
SUSE Labs

* Re: [PATCH 1/1] mm: vmscan: Reduce throttling due to a failure to make progress
  2021-11-26 16:12 ` Alexey Avramov
  2021-11-26 16:52   ` Mel Gorman
@ 2021-11-29 15:01   ` Mel Gorman
  2021-11-30 10:14     ` Mike Galbraith
  2021-11-30 16:03     ` Alexey Avramov
  1 sibling, 2 replies; 40+ messages in thread
From: Mel Gorman @ 2021-11-29 15:01 UTC (permalink / raw)
  To: Alexey Avramov
  Cc: Andrew Morton, Michal Hocko, Vlastimil Babka, Rik van Riel,
	Mike Galbraith, Darrick Wong, regressions, Linux-fsdevel,
	Linux-MM, LKML

On Sat, Nov 27, 2021 at 01:12:46AM +0900, Alexey Avramov wrote:
> >After the patch, the test gets killed after roughly 15 seconds which is
> >the same length of time taken in 5.15.
> 
> In my tests, 5.15 still performs much better.
> 
> New question: does timeout=1 make sense? Will it save CPU?

Ok, the following on top of 5.16-rc1 survived 8 minutes of watching youtube
on a laptop while "tail /dev/zero" was running within the background. While
there were some very short glitches, they were no worse than 5.15. I've
not reproduced your exact test case yet or the memcg ones yet but sending
now in case I don't complete them before the end of the day.

diff --git a/mm/vmscan.c b/mm/vmscan.c
index fb9584641ac7..1af12072f40e 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1021,6 +1021,39 @@ static void handle_write_error(struct address_space *mapping,
 	unlock_page(page);
 }
 
+bool skip_throttle_noprogress(pg_data_t *pgdat)
+{
+	int reclaimable = 0, write_pending = 0;
+	int i;
+
+	/*
+	 * If kswapd is disabled, reschedule if necessary but do not
+	 * throttle as the system is likely near OOM.
+	 */
+	if (pgdat->kswapd_failures >= MAX_RECLAIM_RETRIES)
+		return true;
+
+	/*
+	 * If there are a lot of dirty/writeback pages then do not
+	 * throttle as throttling will occur when the pages cycle
+	 * towards the end of the LRU if still under writeback.
+	 */
+	for (i = 0; i < MAX_NR_ZONES; i++) {
+		struct zone *zone = pgdat->node_zones + i;
+
+		if (!populated_zone(zone))
+			continue;
+
+		reclaimable += zone_reclaimable_pages(zone);
+		write_pending += zone_page_state_snapshot(zone,
+						  NR_ZONE_WRITE_PENDING);
+	}
+	if (2 * write_pending <= reclaimable)
+		return true;
+
+	return false;
+}
+
 void reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason)
 {
 	wait_queue_head_t *wqh = &pgdat->reclaim_wait[reason];
@@ -1057,7 +1090,13 @@ void reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason)
 
 		break;
 	case VMSCAN_THROTTLE_NOPROGRESS:
-		timeout = HZ/2;
+		if (skip_throttle_noprogress(pgdat)) {
+			cond_resched();
+			return;
+		}
+
+		timeout = 1;
+
 		break;
 	case VMSCAN_THROTTLE_ISOLATED:
 		timeout = HZ/50;
@@ -3386,16 +3425,16 @@ static void consider_reclaim_throttle(pg_data_t *pgdat, struct scan_control *sc)
 	}
 
 	/*
-	 * Do not throttle kswapd on NOPROGRESS as it will throttle on
-	 * VMSCAN_THROTTLE_WRITEBACK if there are too many pages under
-	 * writeback and marked for immediate reclaim at the tail of
-	 * the LRU.
+	 * Do not throttle kswapd or cgroup reclaim on NOPROGRESS as it will
+	 * throttle on VMSCAN_THROTTLE_WRITEBACK if there are too many pages
+	 * under writeback and marked for immediate reclaim at the tail of the
+	 * LRU.
 	 */
-	if (current_is_kswapd())
+	if (current_is_kswapd() || cgroup_reclaim(sc))
 		return;
 
 	/* Throttle if making no progress at high prioities. */
-	if (sc->priority < DEF_PRIORITY - 2)
+	if (sc->priority == 1 && !sc->nr_reclaimed)
 		reclaim_throttle(pgdat, VMSCAN_THROTTLE_NOPROGRESS);
 }
 
@@ -3415,6 +3454,7 @@ static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
 	unsigned long nr_soft_scanned;
 	gfp_t orig_mask;
 	pg_data_t *last_pgdat = NULL;
+	pg_data_t *first_pgdat = NULL;
 
 	/*
 	 * If the number of buffer_heads in the machine exceeds the maximum
@@ -3478,14 +3518,18 @@ static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
 			/* need some check for avoid more shrink_zone() */
 		}
 
+		if (!first_pgdat)
+			first_pgdat = zone->zone_pgdat;
+
 		/* See comment about same check for global reclaim above */
 		if (zone->zone_pgdat == last_pgdat)
 			continue;
 		last_pgdat = zone->zone_pgdat;
 		shrink_node(zone->zone_pgdat, sc);
-		consider_reclaim_throttle(zone->zone_pgdat, sc);
 	}
 
+	consider_reclaim_throttle(first_pgdat, sc);
+
 	/*
 	 * Restore to original mask to avoid the impact on the caller if we
 	 * promoted it to __GFP_HIGHMEM.

* Re: [PATCH 1/1] mm: vmscan: Reduce throttling due to a failure to make progress
  2021-11-29 15:01   ` Mel Gorman
@ 2021-11-30 10:14     ` Mike Galbraith
  2021-11-30 11:22       ` Mel Gorman
  2021-11-30 16:03     ` Alexey Avramov
  1 sibling, 1 reply; 40+ messages in thread
From: Mike Galbraith @ 2021-11-30 10:14 UTC (permalink / raw)
  To: Mel Gorman, Alexey Avramov
  Cc: Andrew Morton, Michal Hocko, Vlastimil Babka, Rik van Riel,
	Darrick Wong, regressions, Linux-fsdevel, Linux-MM, LKML

On Mon, 2021-11-29 at 15:01 +0000, Mel Gorman wrote:
> On Sat, Nov 27, 2021 at 01:12:46AM +0900, Alexey Avramov wrote:
> > > After the patch, the test gets killed after roughly 15 seconds which is
> > > the same length of time taken in 5.15.
> > 
> > In my tests, 5.15 still performs much better.
> > 
> > New question: does timeout=1 make sense? Will it save CPU?
> 
> Ok, the following on top of 5.16-rc1 survived 8 minutes of watching youtube
> on a laptop while "tail /dev/zero" was running within the background. While
> there were some very short glitches, they were no worse than 5.15. I've
> not reproduced your exact test case yet or the memcg ones yet but sending
> now in case I don't complete them before the end of the day.
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index fb9584641ac7..1af12072f40e 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1021,6 +1021,39 @@ static void handle_write_error(struct address_space *mapping,
>         unlock_page(page);
>  }
>  
> +bool skip_throttle_noprogress(pg_data_t *pgdat)
> +{
> +       int reclaimable = 0, write_pending = 0;
> +       int i;
> +
> +       /*
> +        * If kswapd is disabled, reschedule if necessary but do not
> +        * throttle as the system is likely near OOM.
> +        */
> +       if (pgdat->kswapd_failures >= MAX_RECLAIM_RETRIES)
> +               return true;
> +
> +       /*
> +        * If there are a lot of dirty/writeback pages then do not
> +        * throttle as throttling will occur when the pages cycle
> +        * towards the end of the LRU if still under writeback.
> +        */
> +       for (i = 0; i < MAX_NR_ZONES; i++) {
> +               struct zone *zone = pgdat->node_zones + i;
> +
> +               if (!populated_zone(zone))
> +                       continue;
> +
> +               reclaimable += zone_reclaimable_pages(zone);
> +               write_pending += zone_page_state_snapshot(zone,
> +                                                 NR_ZONE_WRITE_PENDING);
> +       }
> +       if (2 * write_pending <= reclaimable)

That is always true here...

	-Mike


* Re: [PATCH 1/1] mm: vmscan: Reduce throttling due to a failure to make progress
  2021-11-30 10:14     ` Mike Galbraith
@ 2021-11-30 11:22       ` Mel Gorman
  2021-11-30 12:00         ` Mike Galbraith
  0 siblings, 1 reply; 40+ messages in thread
From: Mel Gorman @ 2021-11-30 11:22 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Alexey Avramov, Andrew Morton, Michal Hocko, Vlastimil Babka,
	Rik van Riel, Darrick Wong, regressions, Linux-fsdevel, Linux-MM,
	LKML

On Tue, Nov 30, 2021 at 11:14:32AM +0100, Mike Galbraith wrote:
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index fb9584641ac7..1af12072f40e 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -1021,6 +1021,39 @@ static void handle_write_error(struct address_space *mapping,
> >         unlock_page(page);
> >  }
> >  
> > +bool skip_throttle_noprogress(pg_data_t *pgdat)
> > +{
> > +       int reclaimable = 0, write_pending = 0;
> > +       int i;
> > +
> > +       /*
> > +        * If kswapd is disabled, reschedule if necessary but do not
> > +        * throttle as the system is likely near OOM.
> > +        */
> > +       if (pgdat->kswapd_failures >= MAX_RECLAIM_RETRIES)
> > +               return true;
> > +
> > +       /*
> > +        * If there are a lot of dirty/writeback pages then do not
> > +        * throttle as throttling will occur when the pages cycle
> > +        * towards the end of the LRU if still under writeback.
> > +        */
> > +       for (i = 0; i < MAX_NR_ZONES; i++) {
> > +               struct zone *zone = pgdat->node_zones + i;
> > +
> > +               if (!populated_zone(zone))
> > +                       continue;
> > +
> > +               reclaimable += zone_reclaimable_pages(zone);
> > +               write_pending += zone_page_state_snapshot(zone,
> > +                                                 NR_ZONE_WRITE_PENDING);
> > +       }
> > +       if (2 * write_pending <= reclaimable)
> 
> That is always true here...
> 

Always true for you or always true in general?

The intent of the check is "are a majority of reclaimable pages
marked WRITE_PENDING?". It's similar to the check that existed prior
to 132b0d21d21f ("mm/page_alloc: remove the throttling logic from the
page allocator").

-- 
Mel Gorman
SUSE Labs

* Re: [PATCH 1/1] mm: vmscan: Reduce throttling due to a failure to make progress
  2021-11-30 11:22       ` Mel Gorman
@ 2021-11-30 12:00         ` Mike Galbraith
  2021-11-30 12:51           ` Mike Galbraith
  0 siblings, 1 reply; 40+ messages in thread
From: Mike Galbraith @ 2021-11-30 12:00 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Alexey Avramov, Andrew Morton, Michal Hocko, Vlastimil Babka,
	Rik van Riel, Darrick Wong, regressions, Linux-fsdevel, Linux-MM,
	LKML

On Tue, 2021-11-30 at 11:22 +0000, Mel Gorman wrote:
> On Tue, Nov 30, 2021 at 11:14:32AM +0100, Mike Galbraith wrote:
> > >       }
> > > +       if (2 * write_pending <= reclaimable)
> >
> > That is always true here...
> >
>
> Always true for you or always true in general?

"Here" as in the boxen located at my GPS coordinates :)

> The intent of the check is "are a majority of reclaimable pages
> marked WRITE_PENDING?". It's similar to the check that existed prior
> to 132b0d21d21f ("mm/page_alloc: remove the throttling logic from the
> page allocator").

I'll put my trace_printk() back and see if I can't bend-adjust it.

	-Mike


* Re: [PATCH 1/1] mm: vmscan: Reduce throttling due to a failure to make progress
  2021-11-30 12:00         ` Mike Galbraith
@ 2021-11-30 12:51           ` Mike Galbraith
  2021-11-30 13:09             ` Mel Gorman
  0 siblings, 1 reply; 40+ messages in thread
From: Mike Galbraith @ 2021-11-30 12:51 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Alexey Avramov, Andrew Morton, Michal Hocko, Vlastimil Babka,
	Rik van Riel, Darrick Wong, regressions, Linux-fsdevel, Linux-MM,
	LKML

On Tue, 2021-11-30 at 13:00 +0100, Mike Galbraith wrote:
> On Tue, 2021-11-30 at 11:22 +0000, Mel Gorman wrote:
> > On Tue, Nov 30, 2021 at 11:14:32AM +0100, Mike Galbraith wrote:
> > > >       }
> > > > +       if (2 * write_pending <= reclaimable)
> > >
> > > That is always true here...
> > >
> >
> > Always true for you or always true in general?
>
> "Here" as in the boxen located at my GPS coordinates :)
>
> > The intent of the check is "are a majority of reclaimable pages
> > marked WRITE_PENDING?". It's similar to the check that existed prior
> > to 132b0d21d21f ("mm/page_alloc: remove the throttling logic from the
> > page allocator").
>
> I'll put my trace_printk() back and see if I can't bend-adjust it.

As it sits, write_pending is always 0 with tail /dev/zero.

	-Mike

* Re: [PATCH 1/1] mm: vmscan: Reduce throttling due to a failure to make progress
  2021-11-30 12:51           ` Mike Galbraith
@ 2021-11-30 13:09             ` Mel Gorman
  2021-12-01  4:32               ` Mike Galbraith
  0 siblings, 1 reply; 40+ messages in thread
From: Mel Gorman @ 2021-11-30 13:09 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Alexey Avramov, Andrew Morton, Michal Hocko, Vlastimil Babka,
	Rik van Riel, Darrick Wong, regressions, Linux-fsdevel, Linux-MM,
	LKML

On Tue, Nov 30, 2021 at 01:51:10PM +0100, Mike Galbraith wrote:
> On Tue, 2021-11-30 at 13:00 +0100, Mike Galbraith wrote:
> > On Tue, 2021-11-30 at 11:22 +0000, Mel Gorman wrote:
> > > On Tue, Nov 30, 2021 at 11:14:32AM +0100, Mike Galbraith wrote:
> > > > >       }
> > > > > +       if (2 * write_pending <= reclaimable)
> > > >
> > > > That is always true here...
> > > >
> > >
> > > Always true for you or always true in general?
> >
> > "Here" as in the boxen located at my GPS coordinates :)
> >
> > > The intent of the check is "are a majority of reclaimable pages
> > > marked WRITE_PENDING?". It's similar to the check that existed prior
> > > to 132b0d21d21f ("mm/page_alloc: remove the throttling logic from the
> > > page allocator").
> >
> > I'll put my trace_printk() back and see if I can't bend-adjust it.
> 
> As it sits, write_pending is always 0 with tail /dev/zero.
> 

That is not a surprise for the test in question as it doesn't trigger
a case where there are lots of page cache being marked dirty and write
pending.
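
For comparison, a hedged userspace sketch of a workload that would mark
lots of page cache dirty and write pending (the filename and sizes are
made up for illustration); unlike tail /dev/zero, which only grows
anonymous memory, file writes like this feed NR_ZONE_WRITE_PENDING:

	#include <fcntl.h>
	#include <string.h>
	#include <unistd.h>

	int main(void)
	{
		/* Hypothetical scratch file on any writable filesystem. */
		int fd = open("dirty.dat", O_CREAT | O_WRONLY | O_TRUNC, 0644);
		static char buf[1 << 20];
		int i;

		if (fd < 0)
			return 1;
		memset(buf, 0xaa, sizeof(buf));
		for (i = 0; i < 4096; i++)	/* ~4GiB of dirty page cache */
			write(fd, buf, sizeof(buf));
		close(fd);
		return 0;
	}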

-- 
Mel Gorman
SUSE Labs

* Re: [PATCH 1/1] mm: vmscan: Reduce throttling due to a failure to make progress
  2021-11-29 15:01   ` Mel Gorman
  2021-11-30 10:14     ` Mike Galbraith
@ 2021-11-30 16:03     ` Alexey Avramov
  2021-11-30 17:27       ` Mel Gorman
  1 sibling, 1 reply; 40+ messages in thread
From: Alexey Avramov @ 2021-11-30 16:03 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Andrew Morton, Michal Hocko, Vlastimil Babka, Rik van Riel,
	Mike Galbraith, Darrick Wong, regressions, Linux-fsdevel,
	Linux-MM, LKML

I tested this [1] patch on top of 5.16-rc2. It's the same test with 10 tails.

- with noswap

Summary:

2021-11-30 23:32:36,890: Stall times for the last 548.6s:
2021-11-30 23:32:36,890: -----------
2021-11-30 23:32:36,891: some cpu     3.7s, avg 0.7%
2021-11-30 23:32:36,891: -----------
2021-11-30 23:32:36,891: some io      187.6s, avg 34.2%
2021-11-30 23:32:36,891: full io      178.3s, avg 32.5%
2021-11-30 23:32:36,891: -----------
2021-11-30 23:32:36,892: some memory  392.2s, avg 71.5%
2021-11-30 23:32:36,892: full memory  390.7s, avg 71.2%

full psi:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/patch5/noswap/psi

mem:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/patch5/noswap/mem

- with swappiness=0

Summary:

2021-11-30 23:51:48,969: Stall times for the last 919.4s:
2021-11-30 23:51:48,969: -----------
2021-11-30 23:51:48,969: some cpu     5.5s, avg 0.6%
2021-11-30 23:51:48,970: -----------
2021-11-30 23:51:48,970: some io      240.4s, avg 26.2%
2021-11-30 23:51:48,970: full io      230.6s, avg 25.1%
2021-11-30 23:51:48,970: -----------
2021-11-30 23:51:48,970: some memory  806.1s, avg 87.7%
2021-11-30 23:51:48,971: full memory  800.5s, avg 87.1%

psi log:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/patch5/swappiness0/psi

mem log:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/patch5/swappiness0/mem

In some cases stalls were very short, but in many cases they were long.
The result is still not good enough.


offtop
======

The same test with the patch [1] on top of 5.16-rc2 + le9 patch [2] 
with vm.clean_min_kbytes=99000.

- with noswap

Summary:

2021-11-30 23:59:32,209: Stall times for the last 73.1s:
2021-11-30 23:59:32,209: -----------
2021-11-30 23:59:32,209: some cpu     0.4s, avg 0.5%
2021-11-30 23:59:32,209: -----------
2021-11-30 23:59:32,210: some io      5.8s, avg 8.0%
2021-11-30 23:59:32,210: full io      5.3s, avg 7.3%
2021-11-30 23:59:32,210: -----------
2021-11-30 23:59:32,210: some memory  3.3s, avg 4.5%
2021-11-30 23:59:32,210: full memory  3.1s, avg 4.2%

This is just an example of what a result close to the expected
result might be (note especially the io pressure values).

full psi:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/patch5/noswap_le9_min99k/psi

mem:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/patch5/noswap_le9_min99k/mem

[1] https://lore.kernel.org/lkml/20211129150117.GO3366@techsingularity.net/
[2] https://lore.kernel.org/all/20211130201652.2218636d@mail.inbox.lv/

* Re: [PATCH 1/1] mm: vmscan: Reduce throttling due to a failure to make progress
  2021-11-30 16:03     ` Alexey Avramov
@ 2021-11-30 17:27       ` Mel Gorman
  2021-11-30 17:59         ` Mike Galbraith
  2021-11-30 18:38         ` Alexey Avramov
  0 siblings, 2 replies; 40+ messages in thread
From: Mel Gorman @ 2021-11-30 17:27 UTC (permalink / raw)
  To: Alexey Avramov
  Cc: Andrew Morton, Michal Hocko, Vlastimil Babka, Rik van Riel,
	Mike Galbraith, Darrick Wong, regressions, Linux-fsdevel,
	Linux-MM, LKML

On Wed, Dec 01, 2021 at 01:03:48AM +0900, Alexey Avramov wrote:
> I tested this [1] patch on top of 5.16-rc2. It's the same test with 10 tails.
> 
> - with noswap
> 
> Summary:
> 
> 2021-11-30 23:32:36,890: Stall times for the last 548.6s:
> 2021-11-30 23:32:36,890: -----------
> 2021-11-30 23:32:36,891: some cpu     3.7s, avg 0.7%
> 2021-11-30 23:32:36,891: -----------
> 2021-11-30 23:32:36,891: some io      187.6s, avg 34.2%
> 2021-11-30 23:32:36,891: full io      178.3s, avg 32.5%
> 2021-11-30 23:32:36,891: -----------
> 2021-11-30 23:32:36,892: some memory  392.2s, avg 71.5%
> 2021-11-30 23:32:36,892: full memory  390.7s, avg 71.2%
> 
> full psi:
> https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/patch5/noswap/psi
> 
> mem:
> https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/patch5/noswap/mem
> 

Ok, taking just noswap in isolation, this is what I saw when running
firefox + youtube vido and running tail /dev/zero 10 times in a row

2021-11-30 17:10:11,817: =================================
2021-11-30 17:10:11,817: Peak values:  avg10  avg60 avg300
2021-11-30 17:10:11,817: -----------  ------ ------ ------
2021-11-30 17:10:11,817: some cpu       1.00   0.96   0.56
2021-11-30 17:10:11,817: -----------  ------ ------ ------
2021-11-30 17:10:11,817: some io        0.24   0.06   0.04
2021-11-30 17:10:11,817: full io        0.24   0.06   0.01
2021-11-30 17:10:11,817: -----------  ------ ------ ------
2021-11-30 17:10:11,817: some memory    2.48   0.51   0.38
2021-11-30 17:10:11,817: full memory    2.48   0.51   0.37
2021-11-30 17:10:11,817: =================================
2021-11-30 17:10:11,817: Stall times for the last 53.7s:
2021-11-30 17:10:11,817: -----------
2021-11-30 17:10:11,817: some cpu     0.4s, avg 0.8%
2021-11-30 17:10:11,817: -----------
2021-11-30 17:10:11,817: some io      0.1s, avg 0.2%
2021-11-30 17:10:11,817: full io      0.1s, avg 0.2%
2021-11-30 17:10:11,817: -----------
2021-11-30 17:10:11,817: some memory  0.3s, avg 0.6%
2021-11-30 17:10:11,817: full memory  0.3s, avg 0.6%

Obviously a fairly different experience and most likely due to the
underlying storage.

Can you run the same test but after doing this

$ echo 1 > /sys/kernel/debug/tracing/events/vmscan/mm_vmscan_throttled/enable
$ cat /sys/kernel/debug/tracing/trace_pipe > trace.out

and send me the trace.out file please?

-- 
Mel Gorman
SUSE Labs

* Re: [PATCH 1/1] mm: vmscan: Reduce throttling due to a failure to make progress
  2021-11-30 17:27       ` Mel Gorman
@ 2021-11-30 17:59         ` Mike Galbraith
  2021-12-01 13:01           ` Mel Gorman
  2021-11-30 18:38         ` Alexey Avramov
  1 sibling, 1 reply; 40+ messages in thread
From: Mike Galbraith @ 2021-11-30 17:59 UTC (permalink / raw)
  To: Mel Gorman, Alexey Avramov
  Cc: Andrew Morton, Michal Hocko, Vlastimil Babka, Rik van Riel,
	Darrick Wong, regressions, Linux-fsdevel, Linux-MM, LKML

On Tue, 2021-11-30 at 17:27 +0000, Mel Gorman wrote:
>
> Obviously a fairly different experience and most likely due to the
> underlying storage.

I bet a virtual nickel this is the sore spot.

---
 mm/vmscan.c |   10 ++++++++++
 1 file changed, 10 insertions(+)

--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3360,7 +3360,17 @@ static void shrink_node(pg_data_t *pgdat
 	if (!current_is_kswapd() && current_may_throttle() &&
 	    !sc->hibernation_mode &&
 	    test_bit(LRUVEC_CONGESTED, &target_lruvec->flags))
+#if 0
 		reclaim_throttle(pgdat, VMSCAN_THROTTLE_WRITEBACK);
+#else
+		/*
+		 * wait_iff_congested() used to live here and was
+		 * _apparently_ a misspelled cond_resched()?????
+		 * In any case, we are most definitely NOT starting
+		 * any IO in reclaim_throttle(), so why bother?
+		 */
+		cond_resched();
+#endif

 	if (should_continue_reclaim(pgdat, sc->nr_reclaimed - nr_reclaimed,
 				    sc))


* Re: [PATCH 1/1] mm: vmscan: Reduce throttling due to a failure to make progress
  2021-11-30 17:27       ` Mel Gorman
  2021-11-30 17:59         ` Mike Galbraith
@ 2021-11-30 18:38         ` Alexey Avramov
  2021-12-01 14:00           ` Mel Gorman
  1 sibling, 1 reply; 40+ messages in thread
From: Alexey Avramov @ 2021-11-30 18:38 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Andrew Morton, Michal Hocko, Vlastimil Babka, Rik van Riel,
	Mike Galbraith, Darrick Wong, regressions, Linux-fsdevel,
	Linux-MM, LKML

>due to the
>underlying storage.

I agree.

>and send me the trace.out file please?

https://drive.google.com/file/d/1FBjAmXwXakWPPjcn6K-B50S04vyySQO6/view

Typical entries:

           Timer-10841   [006] ..... 14341.639496: mm_vmscan_throttled: nid=0 usec_timeout=100000 usect_delayed=100000 reason=VMSCAN_THROTTLE_WRITEBACK
           gmain-1246    [004] ..... 14341.639498: mm_vmscan_throttled: nid=0 usec_timeout=100000 usect_delayed=100000 reason=VMSCAN_THROTTLE_WRITEBACK
          php7.0-11149   [001] ..... 14341.647498: mm_vmscan_throttled: nid=0 usec_timeout=100000 usect_delayed=100000 reason=VMSCAN_THROTTLE_WRITEBACK
      obfs-local-908     [001] ..... 14341.651510: mm_vmscan_throttled: nid=0 usec_timeout=100000 usect_delayed=100000 reason=VMSCAN_THROTTLE_WRITEBACK
     Web Content-10876   [000] ..... 14341.651522: mm_vmscan_throttled: nid=0 usec_timeout=100000 usect_delayed=100000 reason=VMSCAN_THROTTLE_WRITEBACK
  NetworkManager-812     [004] ..... 14341.667495: mm_vmscan_throttled: nid=0 usec_timeout=100000 usect_delayed=100000 reason=VMSCAN_THROTTLE_WRITEBACK
             tor-941     [000] ..... 14341.667495: mm_vmscan_throttled: nid=0 usec_timeout=100000 usect_delayed=100000 reason=VMSCAN_THROTTLE_WRITEBACK
   IndexedDB #18-11165   [007] ..... 14341.667520: mm_vmscan_throttled: nid=0 usec_timeout=100000 usect_delayed=100000 reason=VMSCAN_THROTTLE_WRITEBACK
           Timer-10755   [002] ..... 14341.667538: mm_vmscan_throttled: nid=0 usec_timeout=100000 usect_delayed=100000 reason=VMSCAN_THROTTLE_WRITEBACK
     JS Watchdog-10884   [000] ..... 14341.699489: mm_vmscan_throttled: nid=0 usec_timeout=100000 usect_delayed=100000 reason=VMSCAN_THROTTLE_WRITEBACK

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/1] mm: vmscan: Reduce throttling due to a failure to make progress
  2021-11-30 13:09             ` Mel Gorman
@ 2021-12-01  4:32               ` Mike Galbraith
  0 siblings, 0 replies; 40+ messages in thread
From: Mike Galbraith @ 2021-12-01  4:32 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Alexey Avramov, Andrew Morton, Michal Hocko, Vlastimil Babka,
	Rik van Riel, Darrick Wong, regressions, Linux-fsdevel, Linux-MM,
	LKML

On Tue, 2021-11-30 at 13:09 +0000, Mel Gorman wrote:
> On Tue, Nov 30, 2021 at 01:51:10PM +0100, Mike Galbraith wrote:
> > On Tue, 2021-11-30 at 13:00 +0100, Mike Galbraith wrote:
> > > On Tue, 2021-11-30 at 11:22 +0000, Mel Gorman wrote:
> > > > On Tue, Nov 30, 2021 at 11:14:32AM +0100, Mike Galbraith wrote:
> > > > > >       }
> > > > > > +       if (2 * write_pending <= reclaimable)
> > > > >
> > > > > That is always true here...
> > > > >
> > > >
> > > > Always true for you or always true in general?
> > >
> > > "Here" as in the boxen located at my GPS coordinates :)
> > >
> > > > The intent of the check is "are a majority of reclaimable pages
> > > > marked WRITE_PENDING?". It's similar to the check that existed prior
> > > > to 132b0d21d21f ("mm/page_alloc: remove the throttling logic from the
> > > > page allocator").
> > >
> > > I'll put my trace_printk() back and see if I can't bend-adjust it.
> >
> > As it sits, write_pending is always 0 with tail /dev/zero.
> >
>
> That is not a surprise for the test in question as it doesn't trigger
> a case where there are lots of page cache being marked dirty and write
> pending.

I found a way to make it go false, around 80% of the time on the way to
the first oom-kill in fact.  Starting 2 memcg stress instances and a
memory-sized bonnie did the trick.  'course getting those started was
the last thing my desktop did before taking up residence on spinning
rust and staying there for the 10-minute test duration; the mouse
pointer didn't even manage to twitch :)

That's way worse than having the box try to swallow /dev/zero, which gets
killed pretty quickly with your latest patch.  The grim reaper tried to
help with the nutty overcommit, but two kills in 10 minutes wasn't enough.

	-Mike


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/1] mm: vmscan: Reduce throttling due to a failure to make progress
  2021-11-30 17:59         ` Mike Galbraith
@ 2021-12-01 13:01           ` Mel Gorman
  2021-12-01 13:52             ` Mike Galbraith
  0 siblings, 1 reply; 40+ messages in thread
From: Mel Gorman @ 2021-12-01 13:01 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Alexey Avramov, Andrew Morton, Michal Hocko, Vlastimil Babka,
	Rik van Riel, Darrick Wong, regressions, Linux-fsdevel, Linux-MM,
	LKML

On Tue, Nov 30, 2021 at 06:59:58PM +0100, Mike Galbraith wrote:
> On Tue, 2021-11-30 at 17:27 +0000, Mel Gorman wrote:
> >
> > Obviously a fairly different experience and most likely due to the
> > underlying storage.
> 
> I bet a virtual nickle this is the sore spot.
> 

You win a virtual nickle!

Using an older laptop with slower storage and less memory, a frequency
analysis of the stall reasons and sources showed the top triggering
event was

Event count:                6210
 mm_vmscan_throttled: nid=0 usec_timeout=100000 usect_delayed=xxx reason=VMSCAN_THROTTLE_WRITEBACK

 => trace_event_raw_event_mm_vmscan_throttled <ffffffff9987224a>
 => reclaim_throttle <ffffffff99873df2>
 => shrink_node <ffffffff99875bd5>
 => do_try_to_free_pages <ffffffff99875cf8>
 => try_to_free_pages <ffffffff998772e3>
 => __alloc_pages_slowpath.constprop.114 <ffffffff998c2ad9>
 => __alloc_pages <ffffffff998c366b>
 => folio_alloc <ffffffff998e4107>
 => page_cache_ra_unbounded <ffffffff99868fab>
 => filemap_fault <ffffffff9985cb13>
 => __do_fault <ffffffff99899361>
 => __handle_mm_fault <ffffffff998a1470>
 => handle_mm_fault <ffffffff998a19ba>
 => do_user_addr_fault <ffffffff99688734>
 => exc_page_fault <ffffffff9a07bcd7>
 => asm_exc_page_fault <ffffffff9a200ace>

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/1] mm: vmscan: Reduce throttling due to a failure to make progress
  2021-12-01 13:01           ` Mel Gorman
@ 2021-12-01 13:52             ` Mike Galbraith
  2021-12-01 15:06               ` Mel Gorman
  0 siblings, 1 reply; 40+ messages in thread
From: Mike Galbraith @ 2021-12-01 13:52 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Alexey Avramov, Andrew Morton, Michal Hocko, Vlastimil Babka,
	Rik van Riel, Darrick Wong, regressions, Linux-fsdevel, Linux-MM,
	LKML

On Wed, 2021-12-01 at 13:01 +0000, Mel Gorman wrote:
> On Tue, Nov 30, 2021 at 06:59:58PM +0100, Mike Galbraith wrote:
> > On Tue, 2021-11-30 at 17:27 +0000, Mel Gorman wrote:
> > >
> > > Obviously a fairly different experience and most likely due to
> > > the
> > > underlying storage.
> >
> > I bet a virtual nickle this is the sore spot.
> >
>
> You win a virtual nickle!

I'm rich I'm rich... oh dang, virtual.

I went back to 5.15 and confirmed that wait_iff_congested() never slept
with the try-to-eat-/dev/zero load.  Nor did it with the insane
overcommit swap storm from hell, with as much IO going on as my little
box is capable of generating, making the surrounding congestion bits
look... downright expendable.

	-Mike

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/1] mm: vmscan: Reduce throttling due to a failure to make progress
  2021-11-30 18:38         ` Alexey Avramov
@ 2021-12-01 14:00           ` Mel Gorman
  2021-12-01 17:29             ` Darrick J. Wong
                               ` (2 more replies)
  0 siblings, 3 replies; 40+ messages in thread
From: Mel Gorman @ 2021-12-01 14:00 UTC (permalink / raw)
  To: Alexey Avramov
  Cc: Andrew Morton, Michal Hocko, Vlastimil Babka, Rik van Riel,
	Mike Galbraith, Darrick Wong, regressions, Linux-fsdevel,
	Linux-MM, LKML

On Wed, Dec 01, 2021 at 03:38:36AM +0900, Alexey Avramov wrote:
> >due to the
> >underlying storage.
> 
> I agree.
> 
> >and send me the trace.out file please?
> 
> https://drive.google.com/file/d/1FBjAmXwXakWPPjcn6K-B50S04vyySQO6/view
> 
> Typical entries:
> 
>            Timer-10841   [006] ..... 14341.639496: mm_vmscan_throttled: nid=0 usec_timeout=100000 usect_delayed=100000 reason=VMSCAN_THROTTLE_WRITEBACK
>            gmain-1246    [004] ..... 14341.639498: mm_vmscan_throttled: nid=0 usec_timeout=100000 usect_delayed=100000 reason=VMSCAN_THROTTLE_WRITEBACK

Ok, the bulk of the stalls were for the same reason. Using a laptop with
slower storage (SSD, but slower than the first laptop), the stalls were
almost all for the same reason and from one callsite -- shrink_node.

I've included another patch below against 5.16-rc1 but it'll apply to
5.16-rc3. Using the same test I get

Swap off
--------
Kernel: 5.15
2021-12-01 13:06:23,432: Peak values:  avg10  avg60 avg300
2021-12-01 13:06:23,432: -----------  ------ ------ ------
2021-12-01 13:06:23,433: some cpu       4.72   2.68   1.25
2021-12-01 13:06:23,433: -----------  ------ ------ ------
2021-12-01 13:06:23,433: some io        7.63   3.29   1.03
2021-12-01 13:06:23,433: full io        4.28   1.91   0.63
2021-12-01 13:06:23,433: -----------  ------ ------ ------
2021-12-01 13:06:23,433: some memory   13.28   4.80   1.20
2021-12-01 13:06:23,434: full memory   12.22   4.44   1.11
2021-12-01 13:06:23,434: Stall times for the last 45.5s:
2021-12-01 13:06:23,434: -----------
2021-12-01 13:06:23,434: some cpu     1.8s, avg 4.0%
2021-12-01 13:06:23,435: -----------
2021-12-01 13:06:23,435: some io      2.9s, avg 6.5%
2021-12-01 13:06:23,435: full io      1.8s, avg 4.0%
2021-12-01 13:06:23,435: -----------
2021-12-01 13:06:23,435: some memory  4.4s, avg 9.6%
2021-12-01 13:06:23,435: full memory  4.0s, avg 8.8%
2021-12-01 13:06:23,435:

Kernel: 5.16-rc1
2021-12-01 13:13:33,662: Peak values:  avg10  avg60 avg300
2021-12-01 13:13:33,663: -----------  ------ ------ ------
2021-12-01 13:13:33,663: some cpu       1.97   1.84   0.80
2021-12-01 13:13:33,663: -----------  ------ ------ ------
2021-12-01 13:13:33,663: some io       49.60  26.97  13.26
2021-12-01 13:13:33,663: full io       45.91  24.56  12.13
2021-12-01 13:13:33,663: -----------  ------ ------ ------
2021-12-01 13:13:33,663: some memory   98.49  93.94  54.09
2021-12-01 13:13:33,663: full memory   98.49  93.75  53.82
2021-12-01 13:13:33,664: Stall times for the last 291.0s:
2021-12-01 13:13:33,664: -----------
2021-12-01 13:13:33,664: some cpu     2.3s, avg 0.8%
2021-12-01 13:13:33,664: -----------
2021-12-01 13:13:33,664: some io      57.2s, avg 19.7%
2021-12-01 13:13:33,664: full io      52.9s, avg 18.2%
2021-12-01 13:13:33,664: -----------
2021-12-01 13:13:33,664: some memory  244.8s, avg 84.1%
2021-12-01 13:13:33,664: full memory  242.9s, avg 83.5%
2021-12-01 13:13:33,664:

Kernel: 5.16-rc1-v3r4
2021-12-01 13:16:46,910: Peak values:  avg10  avg60 avg300
2021-12-01 13:16:46,910: -----------  ------ ------ ------
2021-12-01 13:16:46,910: some cpu       2.91   2.04   0.97
2021-12-01 13:16:46,910: -----------  ------ ------ ------
2021-12-01 13:16:46,910: some io        6.59   2.94   1.00
2021-12-01 13:16:46,910: full io        3.95   1.82   0.62
2021-12-01 13:16:46,910: -----------  ------ ------ ------
2021-12-01 13:16:46,910: some memory   10.29   3.67   0.90
2021-12-01 13:16:46,911: full memory    9.82   3.46   0.84
2021-12-01 13:16:46,911: Stall times for the last 44.4s:
2021-12-01 13:16:46,911: -----------
2021-12-01 13:16:46,911: some cpu     1.2s, avg 2.7%
2021-12-01 13:16:46,911: -----------
2021-12-01 13:16:46,911: some io      2.4s, avg 5.4%
2021-12-01 13:16:46,911: full io      1.5s, avg 3.5%
2021-12-01 13:16:46,911: -----------
2021-12-01 13:16:46,911: some memory  3.2s, avg 7.1%
2021-12-01 13:16:46,911: full memory  3.0s, avg 6.7%
2021-12-01 13:16:46,911:

So stall times with v5.15 and with the patch are vaguely similar, while
stall times for 5.16-rc1 are terrible.

Swap on
-------

Kernel: 5.15
2021-12-01 13:23:04,392: Peak values:  avg10  avg60 avg300
2021-12-01 13:23:04,392: -----------  ------ ------ ------
2021-12-01 13:23:04,393: some cpu       8.02   5.19   2.93
2021-12-01 13:23:04,393: -----------  ------ ------ ------
2021-12-01 13:23:04,393: some io       77.30  61.75  37.00
2021-12-01 13:23:04,393: full io       62.33  50.53  29.98
2021-12-01 13:23:04,393: -----------  ------ ------ ------
2021-12-01 13:23:04,393: some memory   82.05  66.42  39.74
2021-12-01 13:23:04,393: full memory   73.27  57.79  34.39
2021-12-01 13:23:04,393: Stall times for the last 272.6s:
2021-12-01 13:23:04,393: -----------
2021-12-01 13:23:04,393: some cpu     13.7s, avg 5.0%
2021-12-01 13:23:04,393: -----------
2021-12-01 13:23:04,393: some io      167.4s, avg 61.4%
2021-12-01 13:23:04,393: full io      135.0s, avg 49.5%
2021-12-01 13:23:04,394: -----------
2021-12-01 13:23:04,394: some memory  180.4s, avg 66.2%
2021-12-01 13:23:04,394: full memory  155.9s, avg 57.2%
2021-12-01 13:23:04,394:

Kernel: 5.16-rc1
2021-12-01 13:35:01,025: Peak values:  avg10  avg60 avg300
2021-12-01 13:35:01,025: -----------  ------ ------ ------
2021-12-01 13:35:01,025: some cpu       4.03   2.51   1.46
2021-12-01 13:35:01,025: -----------  ------ ------ ------
2021-12-01 13:35:01,025: some io       61.02  38.02  25.97
2021-12-01 13:35:01,025: full io       53.30  32.69  22.18
2021-12-01 13:35:01,025: -----------  ------ ------ ------
2021-12-01 13:35:01,025: some memory   99.02  87.77  68.67
2021-12-01 13:35:01,025: full memory   98.66  85.18  65.45
2021-12-01 13:35:01,026: Stall times for the last 620.8s:
2021-12-01 13:35:01,026: -----------
2021-12-01 13:35:01,026: some cpu     11.9s, avg 1.9%
2021-12-01 13:35:01,026: -----------
2021-12-01 13:35:01,026: some io      182.9s, avg 29.5%
2021-12-01 13:35:01,026: full io      156.0s, avg 25.1%
2021-12-01 13:35:01,026: -----------
2021-12-01 13:35:01,026: some memory  463.1s, avg 74.6%
2021-12-01 13:35:01,026: full memory  437.0s, avg 70.4%
2021-12-01 13:35:01,026:

Kernel: 5.16-rc1-v3r4
2021-12-01 13:42:43,066: Peak values:  avg10  avg60 avg300
2021-12-01 13:42:43,066: -----------  ------ ------ ------
2021-12-01 13:42:43,066: some cpu       6.45   3.62   2.21
2021-12-01 13:42:43,066: -----------  ------ ------ ------
2021-12-01 13:42:43,066: some io       74.16  58.74  34.14
2021-12-01 13:42:43,066: full io       62.77  48.21  27.97
2021-12-01 13:42:43,066: -----------  ------ ------ ------
2021-12-01 13:42:43,066: some memory   78.82  62.85  36.39
2021-12-01 13:42:43,067: full memory   71.42  55.12  31.92
2021-12-01 13:42:43,067: Stall times for the last 257.2s:
2021-12-01 13:42:43,067: -----------
2021-12-01 13:42:43,067: some cpu     9.9s, avg 3.9%
2021-12-01 13:42:43,067: -----------
2021-12-01 13:42:43,067: some io      150.9s, avg 58.7%
2021-12-01 13:42:43,067: full io      123.2s, avg 47.9%
2021-12-01 13:42:43,067: -----------
2021-12-01 13:42:43,067: some memory  161.5s, avg 62.8%
2021-12-01 13:42:43,067: full memory  141.6s, avg 55.1%
2021-12-01 13:42:43,067:

Again 5.16-rc1 stuttered badly but the new patch was comparable to 5.15.

As my baseline figures are very different to yours due to differences in
storage, can you test the following please?

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 58e744b78c2c..936dc0b6c226 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -277,6 +277,7 @@ enum vmscan_throttle_state {
 	VMSCAN_THROTTLE_WRITEBACK,
 	VMSCAN_THROTTLE_ISOLATED,
 	VMSCAN_THROTTLE_NOPROGRESS,
+	VMSCAN_THROTTLE_CONGESTED,
 	NR_VMSCAN_THROTTLE,
 };
 
diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index f25a6149d3ba..ca2e9009a651 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -30,12 +30,14 @@
 #define _VMSCAN_THROTTLE_WRITEBACK	(1 << VMSCAN_THROTTLE_WRITEBACK)
 #define _VMSCAN_THROTTLE_ISOLATED	(1 << VMSCAN_THROTTLE_ISOLATED)
 #define _VMSCAN_THROTTLE_NOPROGRESS	(1 << VMSCAN_THROTTLE_NOPROGRESS)
+#define _VMSCAN_THROTTLE_CONGESTED	(1 << VMSCAN_THROTTLE_CONGESTED)
 
 #define show_throttle_flags(flags)						\
 	(flags) ? __print_flags(flags, "|",					\
 		{_VMSCAN_THROTTLE_WRITEBACK,	"VMSCAN_THROTTLE_WRITEBACK"},	\
 		{_VMSCAN_THROTTLE_ISOLATED,	"VMSCAN_THROTTLE_ISOLATED"},	\
-		{_VMSCAN_THROTTLE_NOPROGRESS,	"VMSCAN_THROTTLE_NOPROGRESS"}	\
+		{_VMSCAN_THROTTLE_NOPROGRESS,	"VMSCAN_THROTTLE_NOPROGRESS"},	\
+		{_VMSCAN_THROTTLE_CONGESTED,	"VMSCAN_THROTTLE_CONGESTED"}	\
 		) : "VMSCAN_THROTTLE_NONE"
 
 
diff --git a/mm/vmscan.c b/mm/vmscan.c
index fb9584641ac7..e3f2dd1e8cd9 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1021,6 +1021,39 @@ static void handle_write_error(struct address_space *mapping,
 	unlock_page(page);
 }
 
+bool skip_throttle_noprogress(pg_data_t *pgdat)
+{
+	int reclaimable = 0, write_pending = 0;
+	int i;
+
+	/*
+	 * If kswapd is disabled, reschedule if necessary but do not
+	 * throttle as the system is likely near OOM.
+	 */
+	if (pgdat->kswapd_failures >= MAX_RECLAIM_RETRIES)
+		return true;
+
+	/*
+	 * If there are a lot of dirty/writeback pages then do not
+	 * throttle as throttling will occur when the pages cycle
+	 * towards the end of the LRU if still under writeback.
+	 */
+	for (i = 0; i < MAX_NR_ZONES; i++) {
+		struct zone *zone = pgdat->node_zones + i;
+
+		if (!populated_zone(zone))
+			continue;
+
+		reclaimable += zone_reclaimable_pages(zone);
+		write_pending += zone_page_state_snapshot(zone,
+						  NR_ZONE_WRITE_PENDING);
+	}
+	if (2 * write_pending <= reclaimable)
+		return true;
+
+	return false;
+}
+
 void reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason)
 {
 	wait_queue_head_t *wqh = &pgdat->reclaim_wait[reason];
@@ -1056,8 +1089,16 @@ void reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason)
 		}
 
 		break;
+	case VMSCAN_THROTTLE_CONGESTED:
+		fallthrough;
 	case VMSCAN_THROTTLE_NOPROGRESS:
-		timeout = HZ/2;
+		if (skip_throttle_noprogress(pgdat)) {
+			cond_resched();
+			return;
+		}
+
+		timeout = 1;
+
 		break;
 	case VMSCAN_THROTTLE_ISOLATED:
 		timeout = HZ/50;
@@ -3321,7 +3362,7 @@ static void shrink_node(pg_data_t *pgdat, struct scan_control *sc)
 	if (!current_is_kswapd() && current_may_throttle() &&
 	    !sc->hibernation_mode &&
 	    test_bit(LRUVEC_CONGESTED, &target_lruvec->flags))
-		reclaim_throttle(pgdat, VMSCAN_THROTTLE_WRITEBACK);
+		reclaim_throttle(pgdat, VMSCAN_THROTTLE_CONGESTED);
 
 	if (should_continue_reclaim(pgdat, sc->nr_reclaimed - nr_reclaimed,
 				    sc))
@@ -3386,16 +3427,16 @@ static void consider_reclaim_throttle(pg_data_t *pgdat, struct scan_control *sc)
 	}
 
 	/*
-	 * Do not throttle kswapd on NOPROGRESS as it will throttle on
-	 * VMSCAN_THROTTLE_WRITEBACK if there are too many pages under
-	 * writeback and marked for immediate reclaim at the tail of
-	 * the LRU.
+	 * Do not throttle kswapd or cgroup reclaim on NOPROGRESS as it will
+	 * throttle on VMSCAN_THROTTLE_WRITEBACK if there are too many pages
+	 * under writeback and marked for immediate reclaim at the tail of the
+	 * LRU.
 	 */
-	if (current_is_kswapd())
+	if (current_is_kswapd() || cgroup_reclaim(sc))
 		return;
 
 	/* Throttle if making no progress at high prioities. */
-	if (sc->priority < DEF_PRIORITY - 2)
+	if (sc->priority == 1 && !sc->nr_reclaimed)
 		reclaim_throttle(pgdat, VMSCAN_THROTTLE_NOPROGRESS);
 }
 
@@ -3415,6 +3456,7 @@ static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
 	unsigned long nr_soft_scanned;
 	gfp_t orig_mask;
 	pg_data_t *last_pgdat = NULL;
+	pg_data_t *first_pgdat = NULL;
 
 	/*
 	 * If the number of buffer_heads in the machine exceeds the maximum
@@ -3478,14 +3520,18 @@ static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
 			/* need some check for avoid more shrink_zone() */
 		}
 
+		if (!first_pgdat)
+			first_pgdat = zone->zone_pgdat;
+
 		/* See comment about same check for global reclaim above */
 		if (zone->zone_pgdat == last_pgdat)
 			continue;
 		last_pgdat = zone->zone_pgdat;
 		shrink_node(zone->zone_pgdat, sc);
-		consider_reclaim_throttle(zone->zone_pgdat, sc);
 	}
 
+	consider_reclaim_throttle(first_pgdat, sc);
+
 	/*
 	 * Restore to original mask to avoid the impact on the caller if we
 	 * promoted it to __GFP_HIGHMEM.
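
The key heuristic above is the "2 * write_pending <= reclaimable" check
in skip_throttle_noprogress(), i.e. "are a majority of reclaimable pages
marked write-pending?". A rough worked example (figures invented for
illustration): with reclaimable = 1000 pages and write_pending = 600,
2 * 600 = 1200 > 1000 and the task sleeps for a tick; with write_pending
= 400, 2 * 400 = 800 <= 1000 and the throttle is skipped with just a
cond_resched().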

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/1] mm: vmscan: Reduce throttling due to a failure to make progress
  2021-12-01 13:52             ` Mike Galbraith
@ 2021-12-01 15:06               ` Mel Gorman
  0 siblings, 0 replies; 40+ messages in thread
From: Mel Gorman @ 2021-12-01 15:06 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Alexey Avramov, Andrew Morton, Michal Hocko, Vlastimil Babka,
	Rik van Riel, Darrick Wong, regressions, Linux-fsdevel, Linux-MM,
	LKML

On Wed, Dec 01, 2021 at 02:52:01PM +0100, Mike Galbraith wrote:
> On Wed, 2021-12-01 at 13:01 +0000, Mel Gorman wrote:
> > On Tue, Nov 30, 2021 at 06:59:58PM +0100, Mike Galbraith wrote:
> > > On Tue, 2021-11-30 at 17:27 +0000, Mel Gorman wrote:
> > > >
> > > > Obviously a fairly different experience and most likely due to
> > > > the
> > > > underlying storage.
> > >
> > > I bet a virtual nickle this is the sore spot.
> > >
> >
> > You win a virtual nickle!
> 
> I'm rich I'm rich... oh dang, virtual.
> 
> I went back to 5.15, and confirmed that wait_iff_congested() did not
> ever sleep with the try to eat /dev/zero load.  Nor did it with insane
> overcommit swap storm from hell with as much IO going on as my little
> box is capable of generating, making the surrounding congestion bits
> look.. down right expendable.
> 

wait_iff_congested() was broken once the block layer stopped tracking
congestion; it became a glorified cond_resched() in most cases. This is
why the series aimed to remove the reliance on
congestion_wait()/wait_iff_congested().
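
As an illustration (a simplified sketch of the pre-5.16 behaviour, not
the verbatim code), with the block layer never marking a bdi congested
the function effectively reduced to:

	long wait_iff_congested(int sync, long timeout)
	{
		/* nothing is ever marked congested, so never sleep */
		cond_resched();
		return timeout;
	}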

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/1] mm: vmscan: Reduce throttling due to a failure to make progress
  2021-12-01 14:00           ` Mel Gorman
@ 2021-12-01 17:29             ` Darrick J. Wong
  2021-12-02  9:43               ` Mel Gorman
  2021-12-02  3:11             ` Mike Galbraith
  2021-12-02 11:42             ` Alexey Avramov
  2 siblings, 1 reply; 40+ messages in thread
From: Darrick J. Wong @ 2021-12-01 17:29 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Alexey Avramov, Andrew Morton, Michal Hocko, Vlastimil Babka,
	Rik van Riel, Mike Galbraith, regressions, Linux-fsdevel,
	Linux-MM, LKML

On Wed, Dec 01, 2021 at 02:00:05PM +0000, Mel Gorman wrote:
> On Wed, Dec 01, 2021 at 03:38:36AM +0900, Alexey Avramov wrote:
> > >due to the
> > >underlying storage.
> > 
> > I agree.
> > 
> > >and send me the trace.out file please?
> > 
> > https://drive.google.com/file/d/1FBjAmXwXakWPPjcn6K-B50S04vyySQO6/view
> > 
> > Typical entries:
> > 
> >            Timer-10841   [006] ..... 14341.639496: mm_vmscan_throttled: nid=0 usec_timeout=100000 usect_delayed=100000 reason=VMSCAN_THROTTLE_WRITEBACK
> >            gmain-1246    [004] ..... 14341.639498: mm_vmscan_throttled: nid=0 usec_timeout=100000 usect_delayed=100000 reason=VMSCAN_THROTTLE_WRITEBACK
> 
> Ok, the bulk of the stalls were for the same reason. Using a laptop with
> slower storage (SSD, but slower than the first laptop), the stalls were
> almost all for the same reason and from one callsite -- shrink_node.
> 
> I've included another patch below against 5.16-rc1 but it'll apply to
> 5.16-rc3. Using the same test I get
> 
> Swap off
> --------
> Kernel: 5.15
> 2021-12-01 13:06:23,432: Peak values:  avg10  avg60 avg300
> 2021-12-01 13:06:23,432: -----------  ------ ------ ------
> 2021-12-01 13:06:23,433: some cpu       4.72   2.68   1.25
> 2021-12-01 13:06:23,433: -----------  ------ ------ ------
> 2021-12-01 13:06:23,433: some io        7.63   3.29   1.03
> 2021-12-01 13:06:23,433: full io        4.28   1.91   0.63
> 2021-12-01 13:06:23,433: -----------  ------ ------ ------
> 2021-12-01 13:06:23,433: some memory   13.28   4.80   1.20
> 2021-12-01 13:06:23,434: full memory   12.22   4.44   1.11
> 2021-12-01 13:06:23,434: Stall times for the last 45.5s:
> 2021-12-01 13:06:23,434: -----------
> 2021-12-01 13:06:23,434: some cpu     1.8s, avg 4.0%
> 2021-12-01 13:06:23,435: -----------
> 2021-12-01 13:06:23,435: some io      2.9s, avg 6.5%
> 2021-12-01 13:06:23,435: full io      1.8s, avg 4.0%
> 2021-12-01 13:06:23,435: -----------
> 2021-12-01 13:06:23,435: some memory  4.4s, avg 9.6%
> 2021-12-01 13:06:23,435: full memory  4.0s, avg 8.8%
> 2021-12-01 13:06:23,435:
> 
> Kernel: 5.16-rc1
> 2021-12-01 13:13:33,662: Peak values:  avg10  avg60 avg300
> 2021-12-01 13:13:33,663: -----------  ------ ------ ------
> 2021-12-01 13:13:33,663: some cpu       1.97   1.84   0.80
> 2021-12-01 13:13:33,663: -----------  ------ ------ ------
> 2021-12-01 13:13:33,663: some io       49.60  26.97  13.26
> 2021-12-01 13:13:33,663: full io       45.91  24.56  12.13
> 2021-12-01 13:13:33,663: -----------  ------ ------ ------
> 2021-12-01 13:13:33,663: some memory   98.49  93.94  54.09
> 2021-12-01 13:13:33,663: full memory   98.49  93.75  53.82
> 2021-12-01 13:13:33,664: Stall times for the last 291.0s:
> 2021-12-01 13:13:33,664: -----------
> 2021-12-01 13:13:33,664: some cpu     2.3s, avg 0.8%
> 2021-12-01 13:13:33,664: -----------
> 2021-12-01 13:13:33,664: some io      57.2s, avg 19.7%
> 2021-12-01 13:13:33,664: full io      52.9s, avg 18.2%
> 2021-12-01 13:13:33,664: -----------
> 2021-12-01 13:13:33,664: some memory  244.8s, avg 84.1%
> 2021-12-01 13:13:33,664: full memory  242.9s, avg 83.5%
> 2021-12-01 13:13:33,664:
> 
> Kernel: 5.16-rc1-v3r4
> 2021-12-01 13:16:46,910: Peak values:  avg10  avg60 avg300
> 2021-12-01 13:16:46,910: -----------  ------ ------ ------
> 2021-12-01 13:16:46,910: some cpu       2.91   2.04   0.97
> 2021-12-01 13:16:46,910: -----------  ------ ------ ------
> 2021-12-01 13:16:46,910: some io        6.59   2.94   1.00
> 2021-12-01 13:16:46,910: full io        3.95   1.82   0.62
> 2021-12-01 13:16:46,910: -----------  ------ ------ ------
> 2021-12-01 13:16:46,910: some memory   10.29   3.67   0.90
> 2021-12-01 13:16:46,911: full memory    9.82   3.46   0.84
> 2021-12-01 13:16:46,911: Stall times for the last 44.4s:
> 2021-12-01 13:16:46,911: -----------
> 2021-12-01 13:16:46,911: some cpu     1.2s, avg 2.7%
> 2021-12-01 13:16:46,911: -----------
> 2021-12-01 13:16:46,911: some io      2.4s, avg 5.4%
> 2021-12-01 13:16:46,911: full io      1.5s, avg 3.5%
> 2021-12-01 13:16:46,911: -----------
> 2021-12-01 13:16:46,911: some memory  3.2s, avg 7.1%
> 2021-12-01 13:16:46,911: full memory  3.0s, avg 6.7%
> 2021-12-01 13:16:46,911:
> 
> So stall times with v5.15 and with the patch are vaguely similar, while
> stall times for 5.16-rc1 are terrible.
> 
> Swap on
> -------
> 
> Kernel: 5.15
> 2021-12-01 13:23:04,392: Peak values:  avg10  avg60 avg300
> 2021-12-01 13:23:04,392: -----------  ------ ------ ------
> 2021-12-01 13:23:04,393: some cpu       8.02   5.19   2.93
> 2021-12-01 13:23:04,393: -----------  ------ ------ ------
> 2021-12-01 13:23:04,393: some io       77.30  61.75  37.00
> 2021-12-01 13:23:04,393: full io       62.33  50.53  29.98
> 2021-12-01 13:23:04,393: -----------  ------ ------ ------
> 2021-12-01 13:23:04,393: some memory   82.05  66.42  39.74
> 2021-12-01 13:23:04,393: full memory   73.27  57.79  34.39
> 2021-12-01 13:23:04,393: Stall times for the last 272.6s:
> 2021-12-01 13:23:04,393: -----------
> 2021-12-01 13:23:04,393: some cpu     13.7s, avg 5.0%
> 2021-12-01 13:23:04,393: -----------
> 2021-12-01 13:23:04,393: some io      167.4s, avg 61.4%
> 2021-12-01 13:23:04,393: full io      135.0s, avg 49.5%
> 2021-12-01 13:23:04,394: -----------
> 2021-12-01 13:23:04,394: some memory  180.4s, avg 66.2%
> 2021-12-01 13:23:04,394: full memory  155.9s, avg 57.2%
> 2021-12-01 13:23:04,394:
> 
> Kernel: 5.16-rc1
> 2021-12-01 13:35:01,025: Peak values:  avg10  avg60 avg300
> 2021-12-01 13:35:01,025: -----------  ------ ------ ------
> 2021-12-01 13:35:01,025: some cpu       4.03   2.51   1.46
> 2021-12-01 13:35:01,025: -----------  ------ ------ ------
> 2021-12-01 13:35:01,025: some io       61.02  38.02  25.97
> 2021-12-01 13:35:01,025: full io       53.30  32.69  22.18
> 2021-12-01 13:35:01,025: -----------  ------ ------ ------
> 2021-12-01 13:35:01,025: some memory   99.02  87.77  68.67
> 2021-12-01 13:35:01,025: full memory   98.66  85.18  65.45
> 2021-12-01 13:35:01,026: Stall times for the last 620.8s:
> 2021-12-01 13:35:01,026: -----------
> 2021-12-01 13:35:01,026: some cpu     11.9s, avg 1.9%
> 2021-12-01 13:35:01,026: -----------
> 2021-12-01 13:35:01,026: some io      182.9s, avg 29.5%
> 2021-12-01 13:35:01,026: full io      156.0s, avg 25.1%
> 2021-12-01 13:35:01,026: -----------
> 2021-12-01 13:35:01,026: some memory  463.1s, avg 74.6%
> 2021-12-01 13:35:01,026: full memory  437.0s, avg 70.4%
> 2021-12-01 13:35:01,026:
> 
> Kernel: 5.16-rc1-v3r4
> 2021-12-01 13:42:43,066: Peak values:  avg10  avg60 avg300
> 2021-12-01 13:42:43,066: -----------  ------ ------ ------
> 2021-12-01 13:42:43,066: some cpu       6.45   3.62   2.21
> 2021-12-01 13:42:43,066: -----------  ------ ------ ------
> 2021-12-01 13:42:43,066: some io       74.16  58.74  34.14
> 2021-12-01 13:42:43,066: full io       62.77  48.21  27.97
> 2021-12-01 13:42:43,066: -----------  ------ ------ ------
> 2021-12-01 13:42:43,066: some memory   78.82  62.85  36.39
> 2021-12-01 13:42:43,067: full memory   71.42  55.12  31.92
> 2021-12-01 13:42:43,067: Stall times for the last 257.2s:
> 2021-12-01 13:42:43,067: -----------
> 2021-12-01 13:42:43,067: some cpu     9.9s, avg 3.9%
> 2021-12-01 13:42:43,067: -----------
> 2021-12-01 13:42:43,067: some io      150.9s, avg 58.7%
> 2021-12-01 13:42:43,067: full io      123.2s, avg 47.9%
> 2021-12-01 13:42:43,067: -----------
> 2021-12-01 13:42:43,067: some memory  161.5s, avg 62.8%
> 2021-12-01 13:42:43,067: full memory  141.6s, avg 55.1%
> 2021-12-01 13:42:43,067:
> 
> Again 5.16-rc1 stuttered badly but the new patch was comparable to 5.15.
> 
> As my baseline figures are very different to yours due to differences in
> storage, can you test the following please?

I don't know if this was directed at me, but I reran my swapfile
testcase on 5.16-rc3 and found that it had nearly the same runtime as it
did in 5.15.

--D

> 
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 58e744b78c2c..936dc0b6c226 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -277,6 +277,7 @@ enum vmscan_throttle_state {
>  	VMSCAN_THROTTLE_WRITEBACK,
>  	VMSCAN_THROTTLE_ISOLATED,
>  	VMSCAN_THROTTLE_NOPROGRESS,
> +	VMSCAN_THROTTLE_CONGESTED,
>  	NR_VMSCAN_THROTTLE,
>  };
>  
> diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
> index f25a6149d3ba..ca2e9009a651 100644
> --- a/include/trace/events/vmscan.h
> +++ b/include/trace/events/vmscan.h
> @@ -30,12 +30,14 @@
>  #define _VMSCAN_THROTTLE_WRITEBACK	(1 << VMSCAN_THROTTLE_WRITEBACK)
>  #define _VMSCAN_THROTTLE_ISOLATED	(1 << VMSCAN_THROTTLE_ISOLATED)
>  #define _VMSCAN_THROTTLE_NOPROGRESS	(1 << VMSCAN_THROTTLE_NOPROGRESS)
> +#define _VMSCAN_THROTTLE_CONGESTED	(1 << VMSCAN_THROTTLE_CONGESTED)
>  
>  #define show_throttle_flags(flags)						\
>  	(flags) ? __print_flags(flags, "|",					\
>  		{_VMSCAN_THROTTLE_WRITEBACK,	"VMSCAN_THROTTLE_WRITEBACK"},	\
>  		{_VMSCAN_THROTTLE_ISOLATED,	"VMSCAN_THROTTLE_ISOLATED"},	\
> -		{_VMSCAN_THROTTLE_NOPROGRESS,	"VMSCAN_THROTTLE_NOPROGRESS"}	\
> +		{_VMSCAN_THROTTLE_NOPROGRESS,	"VMSCAN_THROTTLE_NOPROGRESS"},	\
> +		{_VMSCAN_THROTTLE_CONGESTED,	"VMSCAN_THROTTLE_CONGESTED"}	\
>  		) : "VMSCAN_THROTTLE_NONE"
>  
>  
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index fb9584641ac7..e3f2dd1e8cd9 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1021,6 +1021,39 @@ static void handle_write_error(struct address_space *mapping,
>  	unlock_page(page);
>  }
>  
> +bool skip_throttle_noprogress(pg_data_t *pgdat)
> +{
> +	int reclaimable = 0, write_pending = 0;
> +	int i;
> +
> +	/*
> +	 * If kswapd is disabled, reschedule if necessary but do not
> +	 * throttle as the system is likely near OOM.
> +	 */
> +	if (pgdat->kswapd_failures >= MAX_RECLAIM_RETRIES)
> +		return true;
> +
> +	/*
> +	 * If there are a lot of dirty/writeback pages then do not
> +	 * throttle as throttling will occur when the pages cycle
> +	 * towards the end of the LRU if still under writeback.
> +	 */
> +	for (i = 0; i < MAX_NR_ZONES; i++) {
> +		struct zone *zone = pgdat->node_zones + i;
> +
> +		if (!populated_zone(zone))
> +			continue;
> +
> +		reclaimable += zone_reclaimable_pages(zone);
> +		write_pending += zone_page_state_snapshot(zone,
> +						  NR_ZONE_WRITE_PENDING);
> +	}
> +	if (2 * write_pending <= reclaimable)
> +		return true;
> +
> +	return false;
> +}
> +
>  void reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason)
>  {
>  	wait_queue_head_t *wqh = &pgdat->reclaim_wait[reason];
> @@ -1056,8 +1089,16 @@ void reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason)
>  		}
>  
>  		break;
> +	case VMSCAN_THROTTLE_CONGESTED:
> +		fallthrough;
>  	case VMSCAN_THROTTLE_NOPROGRESS:
> -		timeout = HZ/2;
> +		if (skip_throttle_noprogress(pgdat)) {
> +			cond_resched();
> +			return;
> +		}
> +
> +		timeout = 1;
> +
>  		break;
>  	case VMSCAN_THROTTLE_ISOLATED:
>  		timeout = HZ/50;
> @@ -3321,7 +3362,7 @@ static void shrink_node(pg_data_t *pgdat, struct scan_control *sc)
>  	if (!current_is_kswapd() && current_may_throttle() &&
>  	    !sc->hibernation_mode &&
>  	    test_bit(LRUVEC_CONGESTED, &target_lruvec->flags))
> -		reclaim_throttle(pgdat, VMSCAN_THROTTLE_WRITEBACK);
> +		reclaim_throttle(pgdat, VMSCAN_THROTTLE_CONGESTED);
>  
>  	if (should_continue_reclaim(pgdat, sc->nr_reclaimed - nr_reclaimed,
>  				    sc))
> @@ -3386,16 +3427,16 @@ static void consider_reclaim_throttle(pg_data_t *pgdat, struct scan_control *sc)
>  	}
>  
>  	/*
> -	 * Do not throttle kswapd on NOPROGRESS as it will throttle on
> -	 * VMSCAN_THROTTLE_WRITEBACK if there are too many pages under
> -	 * writeback and marked for immediate reclaim at the tail of
> -	 * the LRU.
> +	 * Do not throttle kswapd or cgroup reclaim on NOPROGRESS as it will
> +	 * throttle on VMSCAN_THROTTLE_WRITEBACK if there are too many pages
> +	 * under writeback and marked for immediate reclaim at the tail of the
> +	 * LRU.
>  	 */
> -	if (current_is_kswapd())
> +	if (current_is_kswapd() || cgroup_reclaim(sc))
>  		return;
>  
>  	/* Throttle if making no progress at high prioities. */
> -	if (sc->priority < DEF_PRIORITY - 2)
> +	if (sc->priority == 1 && !sc->nr_reclaimed)
>  		reclaim_throttle(pgdat, VMSCAN_THROTTLE_NOPROGRESS);
>  }
>  
> @@ -3415,6 +3456,7 @@ static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
>  	unsigned long nr_soft_scanned;
>  	gfp_t orig_mask;
>  	pg_data_t *last_pgdat = NULL;
> +	pg_data_t *first_pgdat = NULL;
>  
>  	/*
>  	 * If the number of buffer_heads in the machine exceeds the maximum
> @@ -3478,14 +3520,18 @@ static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
>  			/* need some check for avoid more shrink_zone() */
>  		}
>  
> +		if (!first_pgdat)
> +			first_pgdat = zone->zone_pgdat;
> +
>  		/* See comment about same check for global reclaim above */
>  		if (zone->zone_pgdat == last_pgdat)
>  			continue;
>  		last_pgdat = zone->zone_pgdat;
>  		shrink_node(zone->zone_pgdat, sc);
> -		consider_reclaim_throttle(zone->zone_pgdat, sc);
>  	}
>  
> +	consider_reclaim_throttle(first_pgdat, sc);
> +
>  	/*
>  	 * Restore to original mask to avoid the impact on the caller if we
>  	 * promoted it to __GFP_HIGHMEM.
> 
> -- 
> Mel Gorman
> SUSE Labs

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/1] mm: vmscan: Reduce throttling due to a failure to make progress
  2021-12-01 14:00           ` Mel Gorman
  2021-12-01 17:29             ` Darrick J. Wong
@ 2021-12-02  3:11             ` Mike Galbraith
  2021-12-02 10:13               ` Mel Gorman
  2021-12-02 11:42             ` Alexey Avramov
  2 siblings, 1 reply; 40+ messages in thread
From: Mike Galbraith @ 2021-12-02  3:11 UTC (permalink / raw)
  To: Mel Gorman, Alexey Avramov
  Cc: Andrew Morton, Michal Hocko, Vlastimil Babka, Rik van Riel,
	Darrick Wong, regressions, Linux-fsdevel, Linux-MM, LKML

On Wed, 2021-12-01 at 14:00 +0000, Mel Gorman wrote:
>
> I've included another patch below against 5.16-rc1 but it'll apply to
> 5.16-rc3. Using the same test I get

LTP testcases that stalled my box no longer do, nor can I manually
trigger any unexpected behavior.

	-Mike

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/1] mm: vmscan: Reduce throttling due to a failure to make progress
  2021-12-01 17:29             ` Darrick J. Wong
@ 2021-12-02  9:43               ` Mel Gorman
  2021-12-02 16:09                 ` Darrick J. Wong
  0 siblings, 1 reply; 40+ messages in thread
From: Mel Gorman @ 2021-12-02  9:43 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Alexey Avramov, Andrew Morton, Michal Hocko, Vlastimil Babka,
	Rik van Riel, Mike Galbraith, regressions, Linux-fsdevel,
	Linux-MM, LKML

On Wed, Dec 01, 2021 at 09:29:20AM -0800, Darrick J. Wong wrote:
> > Again 5.16-rc1 stuttered badly but the new patch was comparable to 5.15.
> > 
> > As my baseline figures are very different to yours due to differences in
> > storage, can you test the following please?
> 
> I don't know if this was directed at me, but I reran my swapfile
> testcase on 5.16-rc3 and found that it had nearly the same runtime as it
> did in 5.15.
> 

Thanks Darrick. Can I add the following?

Tested-by: Darrick J. Wong <djwong@kernel.org>

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/1] mm: vmscan: Reduce throttling due to a failure to make progress
  2021-12-02  3:11             ` Mike Galbraith
@ 2021-12-02 10:13               ` Mel Gorman
  2021-12-02 10:51                 ` Mike Galbraith
  0 siblings, 1 reply; 40+ messages in thread
From: Mel Gorman @ 2021-12-02 10:13 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Alexey Avramov, Andrew Morton, Michal Hocko, Vlastimil Babka,
	Rik van Riel, Darrick Wong, regressions, Linux-fsdevel, Linux-MM,
	LKML

On Thu, Dec 02, 2021 at 04:11:38AM +0100, Mike Galbraith wrote:
> On Wed, 2021-12-01 at 14:00 +0000, Mel Gorman wrote:
> >
> > I've included another patch below against 5.16-rc1 but it'll apply to
> > 5.16-rc3. Using the same test I get
> 
> LTP testcases that stalled my box no longer do, nor can I manually
> trigger any unexpected behavior.
> 

Excellent, can I add the following?

Tested-by: Mike Galbraith <efault@gmx.de>

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/1] mm: vmscan: Reduce throttling due to a failure to make progress
  2021-12-02 10:13               ` Mel Gorman
@ 2021-12-02 10:51                 ` Mike Galbraith
  0 siblings, 0 replies; 40+ messages in thread
From: Mike Galbraith @ 2021-12-02 10:51 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Alexey Avramov, Andrew Morton, Michal Hocko, Vlastimil Babka,
	Rik van Riel, Darrick Wong, regressions, Linux-fsdevel, Linux-MM,
	LKML

On Thu, 2021-12-02 at 10:13 +0000, Mel Gorman wrote:
> On Thu, Dec 02, 2021 at 04:11:38AM +0100, Mike Galbraith wrote:
> > On Wed, 2021-12-01 at 14:00 +0000, Mel Gorman wrote:
> > >
> > > I've included another patch below against 5.16-rc1 but it'll
> > > apply to
> > > 5.16-rc3. Using the same test I get
> >
> > LTP testcases that stalled my box no longer do, nor can I manually
> > trigger any unexpected behavior.
> >
>
> Excellent, can I add the following?

Sure.

> Tested-by: Mike Galbraith <efault@gmx.de>
>


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/1] mm: vmscan: Reduce throttling due to a failure to make progress
  2021-12-01 14:00           ` Mel Gorman
  2021-12-01 17:29             ` Darrick J. Wong
  2021-12-02  3:11             ` Mike Galbraith
@ 2021-12-02 11:42             ` Alexey Avramov
  2021-12-02 12:14               ` Alexey Avramov
  2021-12-02 12:15               ` Mel Gorman
  2 siblings, 2 replies; 40+ messages in thread
From: Alexey Avramov @ 2021-12-02 11:42 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Andrew Morton, Michal Hocko, Vlastimil Babka, Rik van Riel,
	Mike Galbraith, Darrick Wong, regressions, Linux-fsdevel,
	Linux-MM, LKML

I tested this [1] patch on top of 5.16-rc2, the same tests.

- with noswap:

2021-12-02 19:41:19,279: Stall times for the last 146.5s:
2021-12-02 19:41:19,279: -----------
2021-12-02 19:41:19,279: some cpu     1.1s, avg 0.8%
2021-12-02 19:41:19,279: -----------
2021-12-02 19:41:19,279: some io      116.2s, avg 79.3%
2021-12-02 19:41:19,280: full io      109.6s, avg 74.8%
2021-12-02 19:41:19,280: -----------
2021-12-02 19:41:19,280: some memory  3.9s, avg 2.6%
2021-12-02 19:41:19,280: full memory  3.8s, avg 2.6%

Excellent!

psi log:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/patch6/noswap/psi

mem log:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/patch6/noswap/mem

- with swappiness=0

2021-12-02 19:46:04,860: Stall times for the last 144.5s:
2021-12-02 19:46:04,860: -----------
2021-12-02 19:46:04,860: some cpu     1.1s, avg 0.8%
2021-12-02 19:46:04,860: -----------
2021-12-02 19:46:04,860: some io      106.9s, avg 74.0%
2021-12-02 19:46:04,861: full io      101.3s, avg 70.1%
2021-12-02 19:46:04,861: -----------
2021-12-02 19:46:04,861: some memory  99.6s, avg 68.9%
2021-12-02 19:46:04,861: full memory  95.6s, avg 66.2%

PSI mem pressure was high, but there were no long stalls.

psi log:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/patch6/swappiness0/psi

mem log:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/patch6/swappiness0/mem

[1] https://lore.kernel.org/all/20211201140005.GU3366@techsingularity.net/

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/1] mm: vmscan: Reduce throttling due to a failure to make progress
  2021-12-02 11:42             ` Alexey Avramov
@ 2021-12-02 12:14               ` Alexey Avramov
  2021-12-02 12:15               ` Mel Gorman
  1 sibling, 0 replies; 40+ messages in thread
From: Alexey Avramov @ 2021-12-02 12:14 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Andrew Morton, Michal Hocko, Vlastimil Babka, Rik van Riel,
	Mike Galbraith, Darrick Wong, regressions, Linux-fsdevel,
	Linux-MM, LKML

and trace.out with swappiness=0:

typical entries:

 Chrome_~dThread-3463    [003] .....  5492.490187: mm_vmscan_throttled: nid=0 usec_timeout=20000 usect_delayed=4000 reason=VMSCAN_THROTTLE_ISOLATED
 DNS Res~ver #12-5720    [005] .....  5492.490207: mm_vmscan_throttled: nid=0 usec_timeout=20000 usect_delayed=4000 reason=VMSCAN_THROTTLE_ISOLATED
            tail-6015    [001] .....  5492.490545: mm_vmscan_throttled: nid=0 usec_timeout=20000 usect_delayed=4000 reason=VMSCAN_THROTTLE_ISOLATED

https://drive.google.com/file/d/1oXOVPFyNmDe2-PFUIAMHMPffZo1N_7vl/view

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/1] mm: vmscan: Reduce throttling due to a failure to make progress
  2021-12-02 11:42             ` Alexey Avramov
  2021-12-02 12:14               ` Alexey Avramov
@ 2021-12-02 12:15               ` Mel Gorman
  2021-12-02 12:22                 ` Alexey Avramov
  1 sibling, 1 reply; 40+ messages in thread
From: Mel Gorman @ 2021-12-02 12:15 UTC (permalink / raw)
  To: Alexey Avramov
  Cc: Andrew Morton, Michal Hocko, Vlastimil Babka, Rik van Riel,
	Mike Galbraith, Darrick Wong, regressions, Linux-fsdevel,
	Linux-MM, LKML

On Thu, Dec 02, 2021 at 08:42:29PM +0900, Alexey Avramov wrote:
> I tested this [1] patch on top of 5.16-rc2, the same tests.
> 

Thanks for all the tests you've done for this, it's much appreciated.

> - with noswap:
> 
> 2021-12-02 19:41:19,279: Stall times for the last 146.5s:
> 2021-12-02 19:41:19,279: -----------
> 2021-12-02 19:41:19,279: some cpu     1.1s, avg 0.8%
> 2021-12-02 19:41:19,279: -----------
> 2021-12-02 19:41:19,279: some io      116.2s, avg 79.3%
> 2021-12-02 19:41:19,280: full io      109.6s, avg 74.8%
> 2021-12-02 19:41:19,280: -----------
> 2021-12-02 19:41:19,280: some memory  3.9s, avg 2.6%
> 2021-12-02 19:41:19,280: full memory  3.8s, avg 2.6%
> 
> Excellent!
> 
> psi log:
> https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/patch6/noswap/psi
> 
> mem log:
> https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/patch6/noswap/mem
> 
> - with swappiness=0
> 
> 2021-12-02 19:46:04,860: Stall times for the last 144.5s:
> 2021-12-02 19:46:04,860: -----------
> 2021-12-02 19:46:04,860: some cpu     1.1s, avg 0.8%
> 2021-12-02 19:46:04,860: -----------
> 2021-12-02 19:46:04,860: some io      106.9s, avg 74.0%
> 2021-12-02 19:46:04,861: full io      101.3s, avg 70.1%
> 2021-12-02 19:46:04,861: -----------
> 2021-12-02 19:46:04,861: some memory  99.6s, avg 68.9%
> 2021-12-02 19:46:04,861: full memory  95.6s, avg 66.2%
> 
> PSI mem pressure was high, but there were no long stalls.
> 

Great, so can I assume you're ok with this version? If yes, can I add
the following?

Tested-by: Alexey Avramov <hakavlad@inbox.lv>

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/1] mm: vmscan: Reduce throttling due to a failure to make progress
  2021-12-02 12:15               ` Mel Gorman
@ 2021-12-02 12:22                 ` Alexey Avramov
  0 siblings, 0 replies; 40+ messages in thread
From: Alexey Avramov @ 2021-12-02 12:22 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Andrew Morton, Michal Hocko, Vlastimil Babka, Rik van Riel,
	Mike Galbraith, Darrick Wong, regressions, Linux-fsdevel,
	Linux-MM, LKML

>can I add

sure

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/1] mm: vmscan: Reduce throttling due to a failure to make progress
  2021-12-02  9:43               ` Mel Gorman
@ 2021-12-02 16:09                 ` Darrick J. Wong
  0 siblings, 0 replies; 40+ messages in thread
From: Darrick J. Wong @ 2021-12-02 16:09 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Alexey Avramov, Andrew Morton, Michal Hocko, Vlastimil Babka,
	Rik van Riel, Mike Galbraith, regressions, Linux-fsdevel,
	Linux-MM, LKML

On Thu, Dec 02, 2021 at 09:43:32AM +0000, Mel Gorman wrote:
> On Wed, Dec 01, 2021 at 09:29:20AM -0800, Darrick J. Wong wrote:
> > > Again 5.16-rc1 stuttered badly but the new patch was comparable to 5.15.
> > > 
> > > As my baseline figures are very different to yours due to differences in
> > > storage, can you test the following please?
> > 
> > I don't know if this was directed at me, but I reran my swapfile
> > testcase on 5.16-rc3 and found that it had nearly the same runtime as it
> > did in 5.15.
> > 
> 
> Thanks Darrick. Can I add the following?
> 
> Tested-by: Darrick J. Wong <djwong@kernel.org>

Yes, please! :)

--D

> -- 
> Mel Gorman
> SUSE Labs

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/1] mm: vmscan: Reduce throttling due to a failure to make progress
  2021-12-02 13:18 Mel Gorman
  2021-12-02 14:51 ` kernel test robot
@ 2021-12-02 16:02 ` kernel test robot
  1 sibling, 0 replies; 40+ messages in thread
From: kernel test robot @ 2021-12-02 16:02 UTC (permalink / raw)
  To: Mel Gorman, Andrew Morton
  Cc: llvm, kbuild-all, Linux Memory Management List, Michal Hocko,
	Vlastimil Babka, Alexey Avramov, Rik van Riel, Mike Galbraith,
	Darrick Wong, regressions, Linux-fsdevel

Hi Mel,

I love your patch! Perhaps something to improve:

[auto build test WARNING on hnaz-mm/master]

url:    https://github.com/0day-ci/linux/commits/Mel-Gorman/mm-vmscan-Reduce-throttling-due-to-a-failure-to-make-progress/20211202-212004
base:   https://github.com/hnaz/linux-mm master
config: x86_64-buildonly-randconfig-r001-20211202 (https://download.01.org/0day-ci/archive/20211203/202112030001.HUiErCyK-lkp@intel.com/config)
compiler: clang version 14.0.0 (https://github.com/llvm/llvm-project 4b553297ef3ee4dc2119d5429adf3072e90fac38)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/01dada07590ae9c69a9415ba9af96d5ae184d861
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Mel-Gorman/mm-vmscan-Reduce-throttling-due-to-a-failure-to-make-progress/20211202-212004
        git checkout 01dada07590ae9c69a9415ba9af96d5ae184d861
        # save the config file to linux build tree
        mkdir build_dir
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> mm/vmscan.c:1024:6: warning: no previous prototype for function 'skip_throttle_noprogress' [-Wmissing-prototypes]
   bool skip_throttle_noprogress(pg_data_t *pgdat)
        ^
   mm/vmscan.c:1024:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   bool skip_throttle_noprogress(pg_data_t *pgdat)
   ^
   static 
   1 warning generated.


vim +/skip_throttle_noprogress +1024 mm/vmscan.c

  1023	
> 1024	bool skip_throttle_noprogress(pg_data_t *pgdat)
  1025	{
  1026		int reclaimable = 0, write_pending = 0;
  1027		int i;
  1028	
  1029		/*
  1030		 * If kswapd is disabled, reschedule if necessary but do not
  1031		 * throttle as the system is likely near OOM.
  1032		 */
  1033		if (pgdat->kswapd_failures >= MAX_RECLAIM_RETRIES)
  1034			return true;
  1035	
  1036		/*
  1037		 * If there are a lot of dirty/writeback pages then do not
  1038		 * throttle as throttling will occur when the pages cycle
  1039		 * towards the end of the LRU if still under writeback.
  1040		 */
  1041		for (i = 0; i < MAX_NR_ZONES; i++) {
  1042			struct zone *zone = pgdat->node_zones + i;
  1043	
  1044			if (!populated_zone(zone))
  1045				continue;
  1046	
  1047			reclaimable += zone_reclaimable_pages(zone);
  1048			write_pending += zone_page_state_snapshot(zone,
  1049							  NR_ZONE_WRITE_PENDING);
  1050		}
  1051		if (2 * write_pending <= reclaimable)
  1052			return true;
  1053	
  1054		return false;
  1055	}
  1056	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/1] mm: vmscan: Reduce throttling due to a failure to make progress
  2021-12-02 13:18 Mel Gorman
@ 2021-12-02 14:51 ` kernel test robot
  2021-12-02 16:02 ` kernel test robot
  1 sibling, 0 replies; 40+ messages in thread
From: kernel test robot @ 2021-12-02 14:51 UTC (permalink / raw)
  To: Mel Gorman, Andrew Morton
  Cc: kbuild-all, Linux Memory Management List, Michal Hocko,
	Vlastimil Babka, Alexey Avramov, Rik van Riel, Mike Galbraith,
	Darrick Wong, regressions, Linux-fsdevel

Hi Mel,

I love your patch! Perhaps something to improve:

[auto build test WARNING on hnaz-mm/master]

url:    https://github.com/0day-ci/linux/commits/Mel-Gorman/mm-vmscan-Reduce-throttling-due-to-a-failure-to-make-progress/20211202-212004
base:   https://github.com/hnaz/linux-mm master
config: um-i386_defconfig (https://download.01.org/0day-ci/archive/20211202/202112022232.gARZWe0c-lkp@intel.com/config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce (this is a W=1 build):
        # https://github.com/0day-ci/linux/commit/01dada07590ae9c69a9415ba9af96d5ae184d861
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Mel-Gorman/mm-vmscan-Reduce-throttling-due-to-a-failure-to-make-progress/20211202-212004
        git checkout 01dada07590ae9c69a9415ba9af96d5ae184d861
        # save the config file to linux build tree
        mkdir build_dir
        make W=1 O=build_dir ARCH=um SUBARCH=i386 SHELL=/bin/bash

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> mm/vmscan.c:1024:6: warning: no previous prototype for 'skip_throttle_noprogress' [-Wmissing-prototypes]
    1024 | bool skip_throttle_noprogress(pg_data_t *pgdat)
         |      ^~~~~~~~~~~~~~~~~~~~~~~~


vim +/skip_throttle_noprogress +1024 mm/vmscan.c

  1023	
> 1024	bool skip_throttle_noprogress(pg_data_t *pgdat)
  1025	{
  1026		int reclaimable = 0, write_pending = 0;
  1027		int i;
  1028	
  1029		/*
  1030		 * If kswapd is disabled, reschedule if necessary but do not
  1031		 * throttle as the system is likely near OOM.
  1032		 */
  1033		if (pgdat->kswapd_failures >= MAX_RECLAIM_RETRIES)
  1034			return true;
  1035	
  1036		/*
  1037		 * If there are a lot of dirty/writeback pages then do not
  1038		 * throttle as throttling will occur when the pages cycle
  1039		 * towards the end of the LRU if still under writeback.
  1040		 */
  1041		for (i = 0; i < MAX_NR_ZONES; i++) {
  1042			struct zone *zone = pgdat->node_zones + i;
  1043	
  1044			if (!populated_zone(zone))
  1045				continue;
  1046	
  1047			reclaimable += zone_reclaimable_pages(zone);
  1048			write_pending += zone_page_state_snapshot(zone,
  1049							  NR_ZONE_WRITE_PENDING);
  1050		}
  1051		if (2 * write_pending <= reclaimable)
  1052			return true;
  1053	
  1054		return false;
  1055	}
  1056	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH 1/1] mm: vmscan: Reduce throttling due to a failure to make progress
@ 2021-12-02 13:18 Mel Gorman
  2021-12-02 14:51 ` kernel test robot
  2021-12-02 16:02 ` kernel test robot
  0 siblings, 2 replies; 40+ messages in thread
From: Mel Gorman @ 2021-12-02 13:18 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Michal Hocko, Vlastimil Babka, Alexey Avramov, Rik van Riel,
	Mike Galbraith, Darrick Wong, regressions, Linux-fsdevel,
	Linux-MM, LKML, Mel Gorman

Mike Galbraith, Alexey Avramov and Darrick Wong all reported similar
problems due to reclaim throttling for excessive lengths of time.
In Alexey's case, a memory hog that should go OOM quickly stalls for
several minutes before being killed. In Mike and Darrick's cases, a small
memcg environment stalled excessively even though the system had enough
memory overall.

Commit 69392a403f49 ("mm/vmscan: throttle reclaim when no progress is being
made") introduced the problem although commit a19594ca4a8b ("mm/vmscan:
increase the timeout if page reclaim is not making progress") made it
worse. Systems at or near an OOM state that cannot be recovered must
reach OOM quickly and memcg should kill tasks if a memcg is near OOM.

To address this, only stall for the first zone in the zonelist, reduce
the timeout to 1 tick for VMSCAN_THROTTLE_NOPROGRESS and only stall if
the scan control nr_reclaimed is 0, kswapd is still active and there are
excessive pages pending writeback. If kswapd has stopped reclaiming due
to excessive failures, do not stall at all so that OOM triggers relatively
quickly. Similarly, if an LRU is simply congested, only lightly throttle
similar to NOPROGRESS.
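
(For scale: a 1-tick timeout is between 1ms at HZ=1000 and 10ms at
HZ=100, versus 500ms for the previous HZ/2 timeout.)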

Alexey's original case was the most straightforward

	for i in {1..3}; do tail /dev/zero; done

On vanilla 5.16-rc1, this test stalled heavily; after the patch, the test
completes in a few seconds, similar to 5.15.

Alexey's second test case added watching a YouTube video while tail runs
10 times. On 5.15, playback only jitters slightly; 5.16-rc1 stalls a lot,
with many frames missing and numerous audio glitches. With this patch
applied, the video plays similarly to 5.15.

Link: https://lore.kernel.org/r/99e779783d6c7fce96448a3402061b9dc1b3b602.camel@gmx.de
Link: https://lore.kernel.org/r/20211124011954.7cab9bb4@mail.inbox.lv
Link: https://lore.kernel.org/r/20211022144651.19914-1-mgorman@techsingularity.net

Reported-and-tested-by: Alexey Avramov <hakavlad@inbox.lv>
Reported-and-tested-by: Mike Galbraith <efault@gmx.de>
Reported-and-tested-by: Darrick J. Wong <djwong@kernel.org>
Fixes: 69392a403f49 ("mm/vmscan: throttle reclaim when no progress is being made")
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
 include/linux/mmzone.h        |  1 +
 include/trace/events/vmscan.h |  4 ++-
 mm/vmscan.c                   | 64 ++++++++++++++++++++++++++++++-----
 3 files changed, 59 insertions(+), 10 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 58e744b78c2c..936dc0b6c226 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -277,6 +277,7 @@ enum vmscan_throttle_state {
 	VMSCAN_THROTTLE_WRITEBACK,
 	VMSCAN_THROTTLE_ISOLATED,
 	VMSCAN_THROTTLE_NOPROGRESS,
+	VMSCAN_THROTTLE_CONGESTED,
 	NR_VMSCAN_THROTTLE,
 };
 
diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index f25a6149d3ba..ca2e9009a651 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -30,12 +30,14 @@
 #define _VMSCAN_THROTTLE_WRITEBACK	(1 << VMSCAN_THROTTLE_WRITEBACK)
 #define _VMSCAN_THROTTLE_ISOLATED	(1 << VMSCAN_THROTTLE_ISOLATED)
 #define _VMSCAN_THROTTLE_NOPROGRESS	(1 << VMSCAN_THROTTLE_NOPROGRESS)
+#define _VMSCAN_THROTTLE_CONGESTED	(1 << VMSCAN_THROTTLE_CONGESTED)
 
 #define show_throttle_flags(flags)						\
 	(flags) ? __print_flags(flags, "|",					\
 		{_VMSCAN_THROTTLE_WRITEBACK,	"VMSCAN_THROTTLE_WRITEBACK"},	\
 		{_VMSCAN_THROTTLE_ISOLATED,	"VMSCAN_THROTTLE_ISOLATED"},	\
-		{_VMSCAN_THROTTLE_NOPROGRESS,	"VMSCAN_THROTTLE_NOPROGRESS"}	\
+		{_VMSCAN_THROTTLE_NOPROGRESS,	"VMSCAN_THROTTLE_NOPROGRESS"},	\
+		{_VMSCAN_THROTTLE_CONGESTED,	"VMSCAN_THROTTLE_CONGESTED"}	\
 		) : "VMSCAN_THROTTLE_NONE"
 
 
diff --git a/mm/vmscan.c b/mm/vmscan.c
index fb9584641ac7..e3f2dd1e8cd9 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1021,6 +1021,39 @@ static void handle_write_error(struct address_space *mapping,
 	unlock_page(page);
 }
 
+bool skip_throttle_noprogress(pg_data_t *pgdat)
+{
+	int reclaimable = 0, write_pending = 0;
+	int i;
+
+	/*
+	 * If kswapd is disabled, reschedule if necessary but do not
+	 * throttle as the system is likely near OOM.
+	 */
+	if (pgdat->kswapd_failures >= MAX_RECLAIM_RETRIES)
+		return true;
+
+	/*
+	 * If there are a lot of dirty/writeback pages then do not
+	 * throttle as throttling will occur when the pages cycle
+	 * towards the end of the LRU if still under writeback.
+	 */
+	for (i = 0; i < MAX_NR_ZONES; i++) {
+		struct zone *zone = pgdat->node_zones + i;
+
+		if (!populated_zone(zone))
+			continue;
+
+		reclaimable += zone_reclaimable_pages(zone);
+		write_pending += zone_page_state_snapshot(zone,
+						  NR_ZONE_WRITE_PENDING);
+	}
+	if (2 * write_pending <= reclaimable)
+		return true;
+
+	return false;
+}
+
 void reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason)
 {
 	wait_queue_head_t *wqh = &pgdat->reclaim_wait[reason];
@@ -1056,8 +1089,16 @@ void reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason)
 		}
 
 		break;
+	case VMSCAN_THROTTLE_CONGESTED:
+		fallthrough;
 	case VMSCAN_THROTTLE_NOPROGRESS:
-		timeout = HZ/2;
+		if (skip_throttle_noprogress(pgdat)) {
+			cond_resched();
+			return;
+		}
+
+		timeout = 1;
+
 		break;
 	case VMSCAN_THROTTLE_ISOLATED:
 		timeout = HZ/50;
@@ -3321,7 +3362,7 @@ static void shrink_node(pg_data_t *pgdat, struct scan_control *sc)
 	if (!current_is_kswapd() && current_may_throttle() &&
 	    !sc->hibernation_mode &&
 	    test_bit(LRUVEC_CONGESTED, &target_lruvec->flags))
-		reclaim_throttle(pgdat, VMSCAN_THROTTLE_WRITEBACK);
+		reclaim_throttle(pgdat, VMSCAN_THROTTLE_CONGESTED);
 
 	if (should_continue_reclaim(pgdat, sc->nr_reclaimed - nr_reclaimed,
 				    sc))
@@ -3386,16 +3427,16 @@ static void consider_reclaim_throttle(pg_data_t *pgdat, struct scan_control *sc)
 	}
 
 	/*
-	 * Do not throttle kswapd on NOPROGRESS as it will throttle on
-	 * VMSCAN_THROTTLE_WRITEBACK if there are too many pages under
-	 * writeback and marked for immediate reclaim at the tail of
-	 * the LRU.
+	 * Do not throttle kswapd or cgroup reclaim on NOPROGRESS as it will
+	 * throttle on VMSCAN_THROTTLE_WRITEBACK if there are too many pages
+	 * under writeback and marked for immediate reclaim at the tail of the
+	 * LRU.
 	 */
-	if (current_is_kswapd())
+	if (current_is_kswapd() || cgroup_reclaim(sc))
 		return;
 
 	/* Throttle if making no progress at high prioities. */
-	if (sc->priority < DEF_PRIORITY - 2)
+	if (sc->priority == 1 && !sc->nr_reclaimed)
 		reclaim_throttle(pgdat, VMSCAN_THROTTLE_NOPROGRESS);
 }
 
@@ -3415,6 +3456,7 @@ static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
 	unsigned long nr_soft_scanned;
 	gfp_t orig_mask;
 	pg_data_t *last_pgdat = NULL;
+	pg_data_t *first_pgdat = NULL;
 
 	/*
 	 * If the number of buffer_heads in the machine exceeds the maximum
@@ -3478,14 +3520,18 @@ static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
 			/* need some check for avoid more shrink_zone() */
 		}
 
+		if (!first_pgdat)
+			first_pgdat = zone->zone_pgdat;
+
 		/* See comment about same check for global reclaim above */
 		if (zone->zone_pgdat == last_pgdat)
 			continue;
 		last_pgdat = zone->zone_pgdat;
 		shrink_node(zone->zone_pgdat, sc);
-		consider_reclaim_throttle(zone->zone_pgdat, sc);
 	}
 
+	consider_reclaim_throttle(first_pgdat, sc);
+
 	/*
 	 * Restore to original mask to avoid the impact on the caller if we
 	 * promoted it to __GFP_HIGHMEM.
-- 
2.31.1



end of thread

Thread overview: 40+ messages
2021-11-25 15:18 [PATCH 1/1] mm: vmscan: Reduce throttling due to a failure to make progress Mel Gorman
2021-11-26 10:14 ` Vlastimil Babka
2021-11-26 10:26 ` Mike Galbraith
2021-11-26 16:12 ` Alexey Avramov
2021-11-26 16:52   ` Mel Gorman
2021-11-27 19:26     ` Alexey Avramov
2021-11-28 10:00       ` Mike Galbraith
2021-11-28 11:39         ` Mike Galbraith
2021-11-28 12:35           ` Mike Galbraith
2021-11-28 18:38             ` Mike Galbraith
2021-11-28 11:58         ` Alexey Avramov
2021-11-29  8:26         ` Mel Gorman
2021-11-29 15:01   ` Mel Gorman
2021-11-30 10:14     ` Mike Galbraith
2021-11-30 11:22       ` Mel Gorman
2021-11-30 12:00         ` Mike Galbraith
2021-11-30 12:51           ` Mike Galbraith
2021-11-30 13:09             ` Mel Gorman
2021-12-01  4:32               ` Mike Galbraith
2021-11-30 16:03     ` Alexey Avramov
2021-11-30 17:27       ` Mel Gorman
2021-11-30 17:59         ` Mike Galbraith
2021-12-01 13:01           ` Mel Gorman
2021-12-01 13:52             ` Mike Galbraith
2021-12-01 15:06               ` Mel Gorman
2021-11-30 18:38         ` Alexey Avramov
2021-12-01 14:00           ` Mel Gorman
2021-12-01 17:29             ` Darrick J. Wong
2021-12-02  9:43               ` Mel Gorman
2021-12-02 16:09                 ` Darrick J. Wong
2021-12-02  3:11             ` Mike Galbraith
2021-12-02 10:13               ` Mel Gorman
2021-12-02 10:51                 ` Mike Galbraith
2021-12-02 11:42             ` Alexey Avramov
2021-12-02 12:14               ` Alexey Avramov
2021-12-02 12:15               ` Mel Gorman
2021-12-02 12:22                 ` Alexey Avramov
2021-12-02 13:18 Mel Gorman
2021-12-02 14:51 ` kernel test robot
2021-12-02 16:02 ` kernel test robot
