linux-kernel.vger.kernel.org archive mirror
* [RFC Patch] mm/vmscan.c: not inherit classzone_idx from previous reclaim
@ 2020-02-09  7:41 Wei Yang
  2020-02-11 10:42 ` Mel Gorman
  0 siblings, 1 reply; 7+ messages in thread
From: Wei Yang @ 2020-02-09  7:41 UTC (permalink / raw)
  To: akpm; +Cc: linux-mm, linux-kernel, shakeelb, yang.shi, mgorman, Wei Yang

Before commit e716f2eb24de ("mm, vmscan: prevent kswapd sleeping
prematurely due to mismatched classzone_idx"), classzone_idx could have
two possibilities on a new loop based on whether there is a wakeup
during reclaiming:

  * 0 if no wakeup
  * the classzone_idx requested by the wakeup

As described in its changelog, that commit intended to change the
first case to (MAX_NR_ZONES - 1) to avoid some premature sleeps. But it
does not achieve that goal.

There are two versions of kswapd_classzone_idx() since this change:

  * commit e716f2eb24de ("mm, vmscan: prevent kswapd sleeping
    prematurely due to mismatched classzone_idx")
  * commit dffcac2cb88e ("mm/vmscan.c: prevent useless kswapd loops")

Both of them return the classzone_idx passed as the 2nd
parameter when (pgdat->kswapd_classzone_idx == MAX_NR_ZONES). This
means that if there is no wakeup during reclaim, kswapd reuses the
classzone_idx from the previous round when trying to sleep.

This patch fixes the logic by using (MAX_NR_ZONES - 1) for the first
case.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
---
 mm/vmscan.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index e7b647f70407..ea2f0abef1d4 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3879,7 +3879,7 @@ static void kswapd_try_to_sleep(pg_data_t *pgdat, int alloc_order, int reclaim_o
 static int kswapd(void *p)
 {
 	unsigned int alloc_order, reclaim_order;
-	unsigned int classzone_idx = MAX_NR_ZONES - 1;
+	unsigned int classzone_idx;
 	pg_data_t *pgdat = (pg_data_t*)p;
 	struct task_struct *tsk = current;
 	const struct cpumask *cpumask = cpumask_of_node(pgdat->node_id);
@@ -3908,7 +3908,7 @@ static int kswapd(void *p)
 		bool ret;
 
 		alloc_order = reclaim_order = pgdat->kswapd_order;
-		classzone_idx = kswapd_classzone_idx(pgdat, classzone_idx);
+		classzone_idx = kswapd_classzone_idx(pgdat, MAX_NR_ZONES - 1);
 
 kswapd_try_sleep:
 		kswapd_try_to_sleep(pgdat, alloc_order, reclaim_order,
-- 
2.17.1

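For illustration, the fallback behaviour described in the changelog can be
modelled in user space. This is a minimal sketch under stated assumptions,
not the kernel code: the zone enum is simplified, and pgdat_model,
model_classzone_idx(), next_idx_inherit() and next_idx_patched() are
hypothetical names. The branch taken when a wakeup is pending is simplified
to return the requested index.

```c
#include <assert.h>

/* Simplified stand-in for the kernel's zone enum (assumption). */
enum zone_type { ZONE_DMA, ZONE_DMA32, ZONE_NORMAL, ZONE_MOVABLE, MAX_NR_ZONES };

/* Hypothetical stand-in for the one pg_data_t field that matters here. */
struct pgdat_model {
	enum zone_type kswapd_classzone_idx;	/* MAX_NR_ZONES: no wakeup pending */
};

/*
 * Model of the helper as the changelog describes it: when no wakeup
 * request is pending, the caller-supplied fallback is returned unchanged.
 */
static enum zone_type model_classzone_idx(const struct pgdat_model *pgdat,
					  enum zone_type fallback)
{
	assert(fallback < MAX_NR_ZONES);
	if (pgdat->kswapd_classzone_idx == MAX_NR_ZONES)
		return fallback;
	return pgdat->kswapd_classzone_idx;
}

/* Behaviour before the patch: the fallback is the previous round's value. */
static enum zone_type next_idx_inherit(const struct pgdat_model *pgdat,
				       enum zone_type prev)
{
	return model_classzone_idx(pgdat, prev);
}

/* Behaviour with the patch: the fallback is always the highest zone index. */
static enum zone_type next_idx_patched(const struct pgdat_model *pgdat)
{
	return model_classzone_idx(pgdat, MAX_NR_ZONES - 1);
}
```

With no wakeup pending, next_idx_inherit() hands back whatever the previous
round used, while next_idx_patched() always yields MAX_NR_ZONES - 1. When a
wakeup is pending, both return the requested index.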


* Re: [RFC Patch] mm/vmscan.c: not inherit classzone_idx from previous reclaim
  2020-02-09  7:41 [RFC Patch] mm/vmscan.c: not inherit classzone_idx from previous reclaim Wei Yang
@ 2020-02-11 10:42 ` Mel Gorman
  2020-02-12  2:25   ` Wei Yang
  0 siblings, 1 reply; 7+ messages in thread
From: Mel Gorman @ 2020-02-11 10:42 UTC (permalink / raw)
  To: Wei Yang; +Cc: akpm, linux-mm, linux-kernel, shakeelb, yang.shi

On Sun, Feb 09, 2020 at 03:41:45PM +0800, Wei Yang wrote:
> Before commit e716f2eb24de ("mm, vmscan: prevent kswapd sleeping
> prematurely due to mismatched classzone_idx"), classzone_idx could have
> two possibilities on a new loop based on whether there is a wakeup
> during reclaiming:
> 
>   * 0 if no wakeup
>   * the classzone_idx requested by the wakeup
> 
> As described in its changelog, that commit intended to change the
> first case to (MAX_NR_ZONES - 1) to avoid some premature sleeps. But it
> does not achieve that goal.
> 
> There are two versions of kswapd_classzone_idx() since this change:
> 
>   * commit e716f2eb24de ("mm, vmscan: prevent kswapd sleeping
>     prematurely due to mismatched classzone_idx")
>   * commit dffcac2cb88e ("mm/vmscan.c: prevent useless kswapd loops")
> 
> Both of them return the classzone_idx passed as the 2nd
> parameter when (pgdat->kswapd_classzone_idx == MAX_NR_ZONES). This
> means that if there is no wakeup during reclaim, kswapd reuses the
> classzone_idx from the previous round when trying to sleep.
> 

This is somewhat intended.

> This patch fixes the logic by using (MAX_NR_ZONES - 1) for the first
> case.
> 

Ok, what is the user-visible impact that is fixed by this patch or is
this based on code review only? Please describe the test case exactly
and the before and after results. I ask because this area is a magnet for
regressions and intuitive ideas often lead to counter-intuitive results.

-- 
Mel Gorman
SUSE Labs


* Re: [RFC Patch] mm/vmscan.c: not inherit classzone_idx from previous reclaim
  2020-02-11 10:42 ` Mel Gorman
@ 2020-02-12  2:25   ` Wei Yang
  2020-02-12  7:43     ` Mel Gorman
  0 siblings, 1 reply; 7+ messages in thread
From: Wei Yang @ 2020-02-12  2:25 UTC (permalink / raw)
  To: Mel Gorman; +Cc: Wei Yang, akpm, linux-mm, linux-kernel, shakeelb, yang.shi

On Tue, Feb 11, 2020 at 10:42:23AM +0000, Mel Gorman wrote:
>On Sun, Feb 09, 2020 at 03:41:45PM +0800, Wei Yang wrote:
>> Before commit e716f2eb24de ("mm, vmscan: prevent kswapd sleeping
>> prematurely due to mismatched classzone_idx"), classzone_idx could have
>> two possibilities on a new loop based on whether there is a wakeup
>> during reclaiming:
>> 
>>   * 0 if no wakeup
>>   * the classzone_idx requested by the wakeup
>> 
>> As described in its changelog, that commit intended to change the
>> first case to (MAX_NR_ZONES - 1) to avoid some premature sleeps. But it
>> does not achieve that goal.
>> 
>> There are two versions of kswapd_classzone_idx() since this change:
>> 
>>   * commit e716f2eb24de ("mm, vmscan: prevent kswapd sleeping
>>     prematurely due to mismatched classzone_idx")
>>   * commit dffcac2cb88e ("mm/vmscan.c: prevent useless kswapd loops")
>> 
>> Both of them return the classzone_idx passed as the 2nd
>> parameter when (pgdat->kswapd_classzone_idx == MAX_NR_ZONES). This
>> means that if there is no wakeup during reclaim, kswapd reuses the
>> classzone_idx from the previous round when trying to sleep.
>> 
>
>This is somewhat intended.
>
>> This patch fixes the logic by using (MAX_NR_ZONES - 1) for the first
>> case.
>> 
>
>Ok, what is the user-visible impact that is fixed by this patch or is
>this based on code review only? Please describe the test case exactly
>and the before and after results. I ask because this area is a magnet for
>regressions and intuitive ideas often lead to counter-intuitive results.
>

This is based on code review only. I know your concern. This is an area more
like art than engineering :-)

Would you mind sharing some ideas on why we intend to inherit the
classzone_idx? For kswapd_order, in contrast, we restart at 0 if there is no
wakeup during reclaim.

I am curious about the idea behind this design :-)
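
For illustration, the asymmetry in question can be modelled as follows. This
is a minimal sketch, not the kernel code: kswapd_request, kswapd_round,
read_request() and MODEL_MAX_NR_ZONES are hypothetical names, and the
assumption that both request fields are reset once kswapd has read them is
not stated in this thread.

```c
#include <assert.h>

/* Hypothetical constant standing in for the kernel's MAX_NR_ZONES. */
enum { MODEL_MAX_NR_ZONES = 4 };

/* Pending request recorded by wakeups (models two pg_data_t fields). */
struct kswapd_request {
	unsigned int order;		/* 0 when no wakeup is pending */
	unsigned int classzone_idx;	/* MODEL_MAX_NR_ZONES when none pending */
};

/* What one kswapd round ends up working with. */
struct kswapd_round {
	unsigned int order;
	unsigned int classzone_idx;
};

/*
 * Read the pending request for the next round, then reset it.
 * The order is taken as-is, so it drops back to 0 without a wakeup.
 * The classzone_idx falls back to the previous round's value instead.
 */
static struct kswapd_round read_request(struct kswapd_request *req,
					unsigned int prev_classzone_idx)
{
	struct kswapd_round cur;

	assert(prev_classzone_idx < MODEL_MAX_NR_ZONES);
	cur.order = req->order;
	if (req->classzone_idx == MODEL_MAX_NR_ZONES)
		cur.classzone_idx = prev_classzone_idx;
	else
		cur.classzone_idx = req->classzone_idx;

	/* Assumed reset of the pending request after it has been read. */
	req->order = 0;
	req->classzone_idx = MODEL_MAX_NR_ZONES;
	return cur;
}
```

Calling read_request() twice without a wakeup in between gives order 0 on
the second call, while classzone_idx keeps the value from the first call.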

>-- 
>Mel Gorman
>SUSE Labs

-- 
Wei Yang
Help you, Help me


* Re: [RFC Patch] mm/vmscan.c: not inherit classzone_idx from previous reclaim
  2020-02-12  2:25   ` Wei Yang
@ 2020-02-12  7:43     ` Mel Gorman
  2020-02-14  2:05       ` Wei Yang
  0 siblings, 1 reply; 7+ messages in thread
From: Mel Gorman @ 2020-02-12  7:43 UTC (permalink / raw)
  To: Wei Yang; +Cc: akpm, linux-mm, linux-kernel, shakeelb, yang.shi

On Wed, Feb 12, 2020 at 10:25:55AM +0800, Wei Yang wrote:
> On Tue, Feb 11, 2020 at 10:42:23AM +0000, Mel Gorman wrote:
> >On Sun, Feb 09, 2020 at 03:41:45PM +0800, Wei Yang wrote:
> >> Before commit e716f2eb24de ("mm, vmscan: prevent kswapd sleeping
> >> prematurely due to mismatched classzone_idx"), classzone_idx could have
> >> two possibilities on a new loop based on whether there is a wakeup
> >> during reclaiming:
> >> 
> >>   * 0 if no wakeup
> >>   * the classzone_idx requested by the wakeup
> >> 
> >> As described in its changelog, that commit intended to change the
> >> first case to (MAX_NR_ZONES - 1) to avoid some premature sleeps. But it
> >> does not achieve that goal.
> >> 
> >> There are two versions of kswapd_classzone_idx() since this change:
> >> 
> >>   * commit e716f2eb24de ("mm, vmscan: prevent kswapd sleeping
> >>     prematurely due to mismatched classzone_idx")
> >>   * commit dffcac2cb88e ("mm/vmscan.c: prevent useless kswapd loops")
> >> 
> >> Both of them return the classzone_idx passed as the 2nd
> >> parameter when (pgdat->kswapd_classzone_idx == MAX_NR_ZONES). This
> >> means that if there is no wakeup during reclaim, kswapd reuses the
> >> classzone_idx from the previous round when trying to sleep.
> >> 
> >
> >This is somewhat intended.
> >
> >> This patch fixes the logic by using (MAX_NR_ZONES - 1) for the first
> >> case.
> >> 
> >
> >Ok, what is the user-visible impact that is fixed by this patch or is
> >this based on code review only? Please describe the test case exactly
> >and the before and after results. I ask because this area is a magnet for
> >regressions and intuitive ideas often lead to counter-intuitive results.
> >
> 
> This is based on code review only. I know your concern. This is an area more
> like art than engineering :-)
> 

Then I'm afraid that until there is a corner case identified and a
description of the impact it's

Nacked-by: Mel Gorman <mgorman@techsingularity.net>

> Would you mind sharing some ideas on why we intend to inherit the
> classzone_idx? For kswapd_order, in contrast, we restart at 0 if there is no
> wakeup during reclaim.
> 

Broadly speaking it was driven by cases whereby kswapd either a) fell
asleep prematurely and there were many stalls in direct reclaim before
kswapd recovered, b) stalls in direct reclaim immediately after kswapd went
to sleep or c) kswapd reclaimed for lower zones and went to sleep while
parallel tasks were direct reclaiming in higher zones or higher orders.

-- 
Mel Gorman
SUSE Labs


* Re: [RFC Patch] mm/vmscan.c: not inherit classzone_idx from previous reclaim
  2020-02-12  7:43     ` Mel Gorman
@ 2020-02-14  2:05       ` Wei Yang
  2020-02-14  2:48         ` Matthew Wilcox
  0 siblings, 1 reply; 7+ messages in thread
From: Wei Yang @ 2020-02-14  2:05 UTC (permalink / raw)
  To: Mel Gorman; +Cc: Wei Yang, akpm, linux-mm, linux-kernel, shakeelb, yang.shi

On Wed, Feb 12, 2020 at 07:43:33AM +0000, Mel Gorman wrote:
>On Wed, Feb 12, 2020 at 10:25:55AM +0800, Wei Yang wrote:
>> On Tue, Feb 11, 2020 at 10:42:23AM +0000, Mel Gorman wrote:
>> >On Sun, Feb 09, 2020 at 03:41:45PM +0800, Wei Yang wrote:
>> >> Before commit e716f2eb24de ("mm, vmscan: prevent kswapd sleeping
>> >> prematurely due to mismatched classzone_idx"), classzone_idx could have
>> >> two possibilities on a new loop based on whether there is a wakeup
>> >> during reclaiming:
>> >> 
>> >>   * 0 if no wakeup
>> >>   * the classzone_idx requested by the wakeup
>> >> 
>> >> As described in its changelog, that commit intended to change the
>> >> first case to (MAX_NR_ZONES - 1) to avoid some premature sleeps. But it
>> >> does not achieve that goal.
>> >> 
>> >> There are two versions of kswapd_classzone_idx() since this change:
>> >> 
>> >>   * commit e716f2eb24de ("mm, vmscan: prevent kswapd sleeping
>> >>     prematurely due to mismatched classzone_idx")
>> >>   * commit dffcac2cb88e ("mm/vmscan.c: prevent useless kswapd loops")
>> >> 
>> >> Both of them return the classzone_idx passed as the 2nd
>> >> parameter when (pgdat->kswapd_classzone_idx == MAX_NR_ZONES). This
>> >> means that if there is no wakeup during reclaim, kswapd reuses the
>> >> classzone_idx from the previous round when trying to sleep.
>> >> 
>> >
>> >This is somewhat intended.
>> >
>> >> This patch fixes the logic by using (MAX_NR_ZONES - 1) for the first
>> >> case.
>> >> 
>> >
>> >Ok, what is the user-visible impact that is fixed by this patch or is
>> >this based on code review only? Please describe the test case exactly
>> >and the before and after results. I ask because this area is a magnet for
>> >regressions and intuitive ideas often lead to counter-intuitive results.
>> >
>> 
>> This is based on code review only. I know your concern. This is an area more
>> like art than engineering :-)
>> 
>
>Then I'm afraid that until there is a corner case identified and a
>description of the impact it's
>
>Nacked-by: Mel Gorman <mgorman@techsingularity.net>
>

Yep, no problem. I would be glad to get some ideas from you.

>> Would you mind sharing some ideas on why we intend to inherit the
>> classzone_idx? For kswapd_order, in contrast, we restart at 0 if there is no
>> wakeup during reclaim.
>> 
>
>Broadly speaking it was driven by cases whereby kswapd either a) fell
>asleep prematurely and there were many stalls in direct reclaim before
>kswapd recovered, b) stalls in direct reclaim immediately after kswapd went
>to sleep or c) kswapd reclaimed for lower zones and went to sleep while
>parallel tasks were direct reclaiming in higher zones or higher orders.
>

Thanks for your explanation. I am trying to understand the connection between
those cases and the behavior of kswapd.

In summary, all three cases are related to direct reclaim, which happens at
three different points in kswapd's cycle:

   a) premature sleep
   b) full sleep
   c) full sleep after reclaiming for lower zones

Hmm... I am not sure about the difference between b) and c). In both, it looks
like direct reclaim happens while kswapd is sleeping.

If I am correct, direct reclaim here is performed by the function
__perform_reclaim(). Its order and zone index are taken from the allocation
parameters, so it doesn't affect pgdat{.kswapd_order, .kswapd_classzone_idx}.

Direct reclaim does affect the pgdat state: after reclaiming, we may have more
available pages. But I am stuck on the connection between direct reclaim and
kswapd. Would you mind shedding more light on this part?
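
For illustration, the distinction drawn here can be sketched as a user-space
model. This is only a sketch of the understanding stated in this message,
not the kernel code: node_state, alloc_request, sketch_wakeup_kswapd() and
sketch_direct_reclaim() are hypothetical names, and merging wakeup requests
by taking the maximum is an assumption not stated in this thread.

```c
#include <assert.h>

/* Hypothetical constant standing in for the kernel's MAX_NR_ZONES. */
enum { SKETCH_MAX_NR_ZONES = 4 };

/* Models the per-node fields discussed in this thread. */
struct node_state {
	unsigned int kswapd_order;
	unsigned int kswapd_classzone_idx;	/* SKETCH_MAX_NR_ZONES: none pending */
	unsigned long free_pages;
};

/* Parameters that come from the allocation being attempted. */
struct alloc_request {
	unsigned int order;
	unsigned int classzone_idx;
};

/* Wakeup path: the request is recorded in the node for kswapd to read. */
static void sketch_wakeup_kswapd(struct node_state *node,
				 const struct alloc_request *rq)
{
	assert(rq->classzone_idx < SKETCH_MAX_NR_ZONES);
	if (node->kswapd_classzone_idx == SKETCH_MAX_NR_ZONES ||
	    rq->classzone_idx > node->kswapd_classzone_idx)
		node->kswapd_classzone_idx = rq->classzone_idx;
	if (rq->order > node->kswapd_order)
		node->kswapd_order = rq->order;
}

/*
 * Direct reclaim path: order and zone index come from the allocation,
 * so only the amount of free memory changes, not the kswapd_* fields.
 */
static void sketch_direct_reclaim(struct node_state *node,
				  const struct alloc_request *rq,
				  unsigned long nr_reclaimed)
{
	assert(rq->classzone_idx < SKETCH_MAX_NR_ZONES);
	node->free_pages += nr_reclaimed;
}
```

In this model, a direct reclaim leaves kswapd_order and kswapd_classzone_idx
untouched, so kswapd learns about a higher zone or order only through the
wakeup path.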

>-- 
>Mel Gorman
>SUSE Labs

-- 
Wei Yang
Help you, Help me


* Re: [RFC Patch] mm/vmscan.c: not inherit classzone_idx from previous reclaim
  2020-02-14  2:05       ` Wei Yang
@ 2020-02-14  2:48         ` Matthew Wilcox
  2020-02-14  7:35           ` Wei Yang
  0 siblings, 1 reply; 7+ messages in thread
From: Matthew Wilcox @ 2020-02-14  2:48 UTC (permalink / raw)
  To: Wei Yang; +Cc: Mel Gorman, akpm, linux-mm, linux-kernel, shakeelb, yang.shi

On Fri, Feb 14, 2020 at 10:05:15AM +0800, Wei Yang wrote:
> On Wed, Feb 12, 2020 at 07:43:33AM +0000, Mel Gorman wrote:
> >Broadly speaking it was driven by cases whereby kswapd either a) fell
> >asleep prematurely and there were many stalls in direct reclaim before
> >kswapd recovered, b) stalls in direct reclaim immediately after kswapd went
> >to sleep or c) kswapd reclaimed for lower zones and went to sleep while
> >parallel tasks were direct reclaiming in higher zones or higher orders.
> 
> Thanks for your explanation. I am trying to understand the connection between
> those cases and the behavior of kswapd.
> 
> In summary, all three cases are related to direct reclaim, which happens at
> three different points in kswapd's cycle:

Reclaim performed by kswapd is the opposite of direct reclaim.  Direct
reclaim is reclaim initiated by a task which is trying to allocate memory.
If a task cannot perform direct reclaim itself, it may ask kswapd to
attempt to reclaim memory for it.



* Re: [RFC Patch] mm/vmscan.c: not inherit classzone_idx from previous reclaim
  2020-02-14  2:48         ` Matthew Wilcox
@ 2020-02-14  7:35           ` Wei Yang
  0 siblings, 0 replies; 7+ messages in thread
From: Wei Yang @ 2020-02-14  7:35 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Wei Yang, Mel Gorman, akpm, linux-mm, linux-kernel, shakeelb, yang.shi

On Thu, Feb 13, 2020 at 06:48:06PM -0800, Matthew Wilcox wrote:
>On Fri, Feb 14, 2020 at 10:05:15AM +0800, Wei Yang wrote:
>> On Wed, Feb 12, 2020 at 07:43:33AM +0000, Mel Gorman wrote:
>> >Broadly speaking it was driven by cases whereby kswapd either a) fell
>> >asleep prematurely and there were many stalls in direct reclaim before
>> >kswapd recovered, b) stalls in direct reclaim immediately after kswapd went
>> >to sleep or c) kswapd reclaimed for lower zones and went to sleep while
>> >parallel tasks were direct reclaiming in higher zones or higher orders.
>> 
>> Thanks for your explanation. I am trying to understand the connection between
>> those cases and the behavior of kswapd.
>> 
>> In summary, all three cases are related to direct reclaim, which happens at
>> three different points in kswapd's cycle:
>
>Reclaim performed by kswapd is the opposite of direct reclaim.  Direct
>reclaim is reclaim initiated by a task which is trying to allocate memory.
>If a task cannot perform direct reclaim itself, it may ask kswapd to
>attempt to reclaim memory for it.

Not totally opposite, I think.

They both reclaim memory, but after direct reclaim some of the freed memory
will be allocated.

Is this the difference you wanted to point out?

-- 
Wei Yang
Help you, Help me

