* [PATCH] mm/vmscan.c: no need to double-check if free pages are under high-watermark @ 2022-01-02 3:31 skseofh 2022-01-06 0:17 ` Andrew Morton 2022-01-06 9:46 ` Mel Gorman 0 siblings, 2 replies; 7+ messages in thread From: skseofh @ 2022-01-02 3:31 UTC (permalink / raw) To: akpm, linux-mm, linux-kernel; +Cc: Daero Lee From: Daero Lee <skseofh@gmail.com> In kswapd_try_to_sleep function, to check whether kswapd can sleep, the prepare_kswapd_sleep function is called twice. If free pages are below high-watermark in the first call, the @remaining variable is not updated at 0 and the prepare_kswapd_sleep function is called for the second time. I think it is necessary to set the initial value of the @remaining to a non-zero value to prevent consecutive calls to the same function. Signed-off-by: Daero Lee <skseofh@gmail.com> --- mm/vmscan.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/mm/vmscan.c b/mm/vmscan.c index 700434db5735..1217ecec5bbb 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -4331,7 +4331,7 @@ static int balance_pgdat(pg_data_t *pgdat, int order, int highest_zoneidx) /* * Return the order kswapd stopped reclaiming at as * prepare_kswapd_sleep() takes it into account. If another caller - * entered the allocator slow path while kswapd was awake, order will + * entered the allqocator slow path while kswapd was awake, order will * remain at the higher level. */ return sc.order; @@ -4355,7 +4355,7 @@ static enum zone_type kswapd_highest_zoneidx(pg_data_t *pgdat, static void kswapd_try_to_sleep(pg_data_t *pgdat, int alloc_order, int reclaim_order, unsigned int highest_zoneidx) { - long remaining = 0; + long remaining = ~0; DEFINE_WAIT(wait); if (freezing(current) || kthread_should_stop()) -- 2.25.1 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] mm/vmscan.c: no need to double-check if free pages are under high-watermark 2022-01-02 3:31 [PATCH] mm/vmscan.c: no need to double-check if free pages are under high-watermark skseofh @ 2022-01-06 0:17 ` Andrew Morton 2022-01-06 9:46 ` Mel Gorman 1 sibling, 0 replies; 7+ messages in thread From: Andrew Morton @ 2022-01-06 0:17 UTC (permalink / raw) To: skseofh; +Cc: linux-mm, linux-kernel, Mel Gorman, Joonsoo Kim On Sun, 2 Jan 2022 12:31:29 +0900 skseofh@gmail.com wrote: > From: Daero Lee <skseofh@gmail.com> > > In kswapd_try_to_sleep function, to check whether kswapd can sleep, > the prepare_kswapd_sleep function is called twice. > > If free pages are below high-watermark in the first call, > the @remaining variable is not updated at 0 and the > prepare_kswapd_sleep function is called for the second time. > > I think it is necessary to set the initial value of the > @remaining to a non-zero value to prevent consecutive calls > to the same function. Seems right. Mel & Joonsoo, could you please check? > --- a/mm/vmscan.c > +++ b/mm/vmscan.c > @@ -4331,7 +4331,7 @@ static int balance_pgdat(pg_data_t *pgdat, int order, int highest_zoneidx) > /* > * Return the order kswapd stopped reclaiming at as > * prepare_kswapd_sleep() takes it into account. If another caller > - * entered the allocator slow path while kswapd was awake, order will > + * entered the allqocator slow path while kswapd was awake, order will whoops. > * remain at the higher level. > */ > return sc.order; > @@ -4355,7 +4355,7 @@ static enum zone_type kswapd_highest_zoneidx(pg_data_t *pgdat, > static void kswapd_try_to_sleep(pg_data_t *pgdat, int alloc_order, int reclaim_order, > unsigned int highest_zoneidx) > { > - long remaining = 0; > + long remaining = ~0; > DEFINE_WAIT(wait); > > if (freezing(current) || kthread_should_stop()) ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] mm/vmscan.c: no need to double-check if free pages are under high-watermark 2022-01-02 3:31 [PATCH] mm/vmscan.c: no need to double-check if free pages are under high-watermark skseofh 2022-01-06 0:17 ` Andrew Morton @ 2022-01-06 9:46 ` Mel Gorman 2022-01-06 12:03 ` DaeRo Lee 1 sibling, 1 reply; 7+ messages in thread From: Mel Gorman @ 2022-01-06 9:46 UTC (permalink / raw) To: skseofh; +Cc: akpm, linux-mm, linux-kernel On Sun, Jan 02, 2022 at 12:31:29PM +0900, skseofh@gmail.com wrote: > From: Daero Lee <skseofh@gmail.com> > > In kswapd_try_to_sleep function, to check whether kswapd can sleep, > the prepare_kswapd_sleep function is called twice. > > If free pages are below high-watermark in the first call, > the @remaining variable is not updated at 0 and the > prepare_kswapd_sleep function is called for the second time. > > I think it is necessary to set the initial value of the > @remaining to a non-zero value to prevent consecutive calls > to the same function. > > Signed-off-by: Daero Lee <skseofh@gmail.com> > --- > mm/vmscan.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/mm/vmscan.c b/mm/vmscan.c > index 700434db5735..1217ecec5bbb 100644 > --- a/mm/vmscan.c > +++ b/mm/vmscan.c > @@ -4331,7 +4331,7 @@ static int balance_pgdat(pg_data_t *pgdat, int order, int highest_zoneidx) > /* > * Return the order kswapd stopped reclaiming at as > * prepare_kswapd_sleep() takes it into account. If another caller > - * entered the allocator slow path while kswapd was awake, order will > + * entered the allqocator slow path while kswapd was awake, order will > * remain at the higher level. > */ > return sc.order; This hunk just adds a typo, drop it. > @@ -4355,7 +4355,7 @@ static enum zone_type kswapd_highest_zoneidx(pg_data_t *pgdat, > static void kswapd_try_to_sleep(pg_data_t *pgdat, int alloc_order, int reclaim_order, > unsigned int highest_zoneidx) > { > - long remaining = 0; > + long remaining = ~0; > DEFINE_WAIT(wait); > > if (freezing(current) || kthread_should_stop()) While this does avoid calling prepare_kswapd_sleep() twice if the pgdat is balanced on the first try, it then does not restore the vmstat thresholds and doesn't call schedul() for kswapd to go to sleep. I think you did spot a problem but I suspect you want something like the following untested patch diff --git a/mm/vmscan.c b/mm/vmscan.c index 700434db5735..40784693c840 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -4355,7 +4355,8 @@ static enum zone_type kswapd_highest_zoneidx(pg_data_t *pgdat, static void kswapd_try_to_sleep(pg_data_t *pgdat, int alloc_order, int reclaim_order, unsigned int highest_zoneidx) { - long remaining = 0; + long remaining; + bool balanced; DEFINE_WAIT(wait); if (freezing(current) || kthread_should_stop()) @@ -4370,7 +4371,8 @@ static void kswapd_try_to_sleep(pg_data_t *pgdat, int alloc_order, int reclaim_o * eligible zone balanced that it's also unlikely that compaction will * succeed. */ - if (prepare_kswapd_sleep(pgdat, reclaim_order, highest_zoneidx)) { + balanced = prepare_kswapd_sleep(pgdat, reclaim_order, highest_zoneidx); + if (balanced) { /* * Compaction records what page blocks it recently failed to * isolate pages from and skips them in the future scanning. @@ -4387,6 +4389,10 @@ static void kswapd_try_to_sleep(pg_data_t *pgdat, int alloc_order, int reclaim_o remaining = schedule_timeout(HZ/10); + /* Is pgdat balanced after a short sleep? */ + balanced = prepare_kswapd_sleep(pgdat, reclaim_order, + highest_zoneidx); + /* * If woken prematurely then reset kswapd_highest_zoneidx and * order. The values will either be from a wakeup request or @@ -4406,11 +4412,11 @@ static void kswapd_try_to_sleep(pg_data_t *pgdat, int alloc_order, int reclaim_o } /* - * After a short sleep, check if it was a premature sleep. If not, then - * go fully to sleep until explicitly woken up. + * If balanced to the high watermark, restore vmstat thresholds and + * kswapd goes to sleep. If kswapd remains awake, account whether + * the low or high watermark was hit quickly. */ - if (!remaining && - prepare_kswapd_sleep(pgdat, reclaim_order, highest_zoneidx)) { + if (balanced) { trace_mm_vmscan_kswapd_sleep(pgdat->node_id); /* ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] mm/vmscan.c: no need to double-check if free pages are under high-watermark 2022-01-06 9:46 ` Mel Gorman @ 2022-01-06 12:03 ` DaeRo Lee 2022-01-06 12:57 ` Mel Gorman 0 siblings, 1 reply; 7+ messages in thread From: DaeRo Lee @ 2022-01-06 12:03 UTC (permalink / raw) To: Mel Gorman; +Cc: akpm, linux-mm, linux-kernel > > @@ -4355,7 +4355,7 @@ static enum zone_type kswapd_highest_zoneidx(pg_data_t *pgdat, > > static void kswapd_try_to_sleep(pg_data_t *pgdat, int alloc_order, int reclaim_order, > > unsigned int highest_zoneidx) > > { > > - long remaining = 0; > > + long remaining = ~0; > > DEFINE_WAIT(wait); > > > > if (freezing(current) || kthread_should_stop()) > > While this does avoid calling prepare_kswapd_sleep() twice if the pgdat > is balanced on the first try, it then does not restore the vmstat > thresholds and doesn't call schedul() for kswapd to go to sleep. I intended not to call prepare_kswapd_sleep() twice when the pgdat is NOT balanced on the first try:) > diff --git a/mm/vmscan.c b/mm/vmscan.c > index 700434db5735..40784693c840 100644 > --- a/mm/vmscan.c > +++ b/mm/vmscan.c > @@ -4355,7 +4355,8 @@ static enum zone_type kswapd_highest_zoneidx(pg_data_t *pgdat, > static void kswapd_try_to_sleep(pg_data_t *pgdat, int alloc_order, int reclaim_order, > unsigned int highest_zoneidx) > { > - long remaining = 0; > + long remaining; > + bool balanced; > DEFINE_WAIT(wait); > > if (freezing(current) || kthread_should_stop()) > @@ -4370,7 +4371,8 @@ static void kswapd_try_to_sleep(pg_data_t *pgdat, int alloc_order, int reclaim_o > * eligible zone balanced that it's also unlikely that compaction will > * succeed. > */ > - if (prepare_kswapd_sleep(pgdat, reclaim_order, highest_zoneidx)) { > + balanced = prepare_kswapd_sleep(pgdat, reclaim_order, highest_zoneidx); > + if (balanced) { > /* > * Compaction records what page blocks it recently failed to > * isolate pages from and skips them in the future scanning. > @@ -4387,6 +4389,10 @@ static void kswapd_try_to_sleep(pg_data_t *pgdat, int alloc_order, int reclaim_o > > remaining = schedule_timeout(HZ/10); > > + /* Is pgdat balanced after a short sleep? */ > + balanced = prepare_kswapd_sleep(pgdat, reclaim_order, > + highest_zoneidx); > + > /* > * If woken prematurely then reset kswapd_highest_zoneidx and > * order. The values will either be from a wakeup request or > @@ -4406,11 +4412,11 @@ static void kswapd_try_to_sleep(pg_data_t *pgdat, int alloc_order, int reclaim_o > } > > /* > - * After a short sleep, check if it was a premature sleep. If not, then > - * go fully to sleep until explicitly woken up. > + * If balanced to the high watermark, restore vmstat thresholds and > + * kswapd goes to sleep. If kswapd remains awake, account whether > + * the low or high watermark was hit quickly. > */ > - if (!remaining && > - prepare_kswapd_sleep(pgdat, reclaim_order, highest_zoneidx)) { > + if (balanced) { > trace_mm_vmscan_kswapd_sleep(pgdat->node_id); > > /* But, I think what you did is more readable and nice. Thanks! > > diff --git a/mm/vmscan.c b/mm/vmscan.c > > index 700434db5735..1217ecec5bbb 100644 > > --- a/mm/vmscan.c > > +++ b/mm/vmscan.c > > @@ -4331,7 +4331,7 @@ static int balance_pgdat(pg_data_t *pgdat, int order, int highest_zoneidx) > > /* > > * Return the order kswapd stopped reclaiming at as > > * prepare_kswapd_sleep() takes it into account. If another caller > > - * entered the allocator slow path while kswapd was awake, order will > > + * entered the allqocator slow path while kswapd was awake, order will > > * remain at the higher level. > > */ > > return sc.order; > > This hunk just adds a typo, drop it. Sorry about that;; ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] mm/vmscan.c: no need to double-check if free pages are under high-watermark 2022-01-06 12:03 ` DaeRo Lee @ 2022-01-06 12:57 ` Mel Gorman 2022-03-26 15:50 ` Wei Yang 0 siblings, 1 reply; 7+ messages in thread From: Mel Gorman @ 2022-01-06 12:57 UTC (permalink / raw) To: DaeRo Lee; +Cc: akpm, linux-mm, linux-kernel On Thu, Jan 06, 2022 at 09:03:34PM +0900, DaeRo Lee wrote: > > > @@ -4355,7 +4355,7 @@ static enum zone_type kswapd_highest_zoneidx(pg_data_t *pgdat, > > > static void kswapd_try_to_sleep(pg_data_t *pgdat, int alloc_order, int reclaim_order, > > > unsigned int highest_zoneidx) > > > { > > > - long remaining = 0; > > > + long remaining = ~0; > > > DEFINE_WAIT(wait); > > > > > > if (freezing(current) || kthread_should_stop()) > > > > While this does avoid calling prepare_kswapd_sleep() twice if the pgdat > > is balanced on the first try, it then does not restore the vmstat > > thresholds and doesn't call schedul() for kswapd to go to sleep. > > I intended not to call prepare_kswapd_sleep() twice when the pgdat is NOT > balanced on the first try:) > Stupid typo on my part. > > @@ -4406,11 +4412,11 @@ static void kswapd_try_to_sleep(pg_data_t *pgdat, int alloc_order, int reclaim_o > > } > > > > /* > > - * After a short sleep, check if it was a premature sleep. If not, then > > - * go fully to sleep until explicitly woken up. > > + * If balanced to the high watermark, restore vmstat thresholds and > > + * kswapd goes to sleep. If kswapd remains awake, account whether > > + * the low or high watermark was hit quickly. > > */ > > - if (!remaining && > > - prepare_kswapd_sleep(pgdat, reclaim_order, highest_zoneidx)) { > > + if (balanced) { > > trace_mm_vmscan_kswapd_sleep(pgdat->node_id); > > > > /* > > But, I think what you did is more readable and nice. > Thanks! > Feel free to pick it up, rerun your tests to ensure it's behaving as expected and resend! Include something in the changelog about user-visible effects if any (or a note saying that it reduces unnecssary overhead) and resend with me added to the cc. > > > @@ -4331,7 +4331,7 @@ static int balance_pgdat(pg_data_t *pgdat, int order, int highest_zoneidx) > > > /* > > > * Return the order kswapd stopped reclaiming at as > > > * prepare_kswapd_sleep() takes it into account. If another caller > > > - * entered the allocator slow path while kswapd was awake, order will > > > + * entered the allqocator slow path while kswapd was awake, order will > > > * remain at the higher level. > > > */ > > > return sc.order; > > > > This hunk just adds a typo, drop it. > > Sorry about that;; No need to be sorry, it happens :) Thanks! -- Mel Gorman SUSE Labs ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] mm/vmscan.c: no need to double-check if free pages are under high-watermark 2022-01-06 12:57 ` Mel Gorman @ 2022-03-26 15:50 ` Wei Yang 2022-04-07 13:56 ` Mel Gorman 0 siblings, 1 reply; 7+ messages in thread From: Wei Yang @ 2022-03-26 15:50 UTC (permalink / raw) To: Mel Gorman; +Cc: DaeRo Lee, akpm, linux-mm, linux-kernel On Thu, Jan 06, 2022 at 12:57:58PM +0000, Mel Gorman wrote: >On Thu, Jan 06, 2022 at 09:03:34PM +0900, DaeRo Lee wrote: >> > > @@ -4355,7 +4355,7 @@ static enum zone_type kswapd_highest_zoneidx(pg_data_t *pgdat, >> > > static void kswapd_try_to_sleep(pg_data_t *pgdat, int alloc_order, int reclaim_order, >> > > unsigned int highest_zoneidx) >> > > { >> > > - long remaining = 0; >> > > + long remaining = ~0; >> > > DEFINE_WAIT(wait); >> > > >> > > if (freezing(current) || kthread_should_stop()) >> > >> > While this does avoid calling prepare_kswapd_sleep() twice if the pgdat >> > is balanced on the first try, it then does not restore the vmstat >> > thresholds and doesn't call schedul() for kswapd to go to sleep. >> >> I intended not to call prepare_kswapd_sleep() twice when the pgdat is NOT >> balanced on the first try:) >> > >Stupid typo on my part. > >> > @@ -4406,11 +4412,11 @@ static void kswapd_try_to_sleep(pg_data_t *pgdat, int alloc_order, int reclaim_o >> > } >> > >> > /* >> > - * After a short sleep, check if it was a premature sleep. If not, then >> > - * go fully to sleep until explicitly woken up. >> > + * If balanced to the high watermark, restore vmstat thresholds and >> > + * kswapd goes to sleep. If kswapd remains awake, account whether >> > + * the low or high watermark was hit quickly. >> > */ >> > - if (!remaining && >> > - prepare_kswapd_sleep(pgdat, reclaim_order, highest_zoneidx)) { >> > + if (balanced) { >> > trace_mm_vmscan_kswapd_sleep(pgdat->node_id); >> > >> > /* >> >> But, I think what you did is more readable and nice. >> Thanks! >> > >Feel free to pick it up, rerun your tests to ensure it's behaving as >expected and resend! Include something in the changelog about user-visible >effects if any (or a note saying that it reduces unnecssary overhead) >and resend with me added to the cc. > Hi, All Seems this thread stops here. I don't see following patch and current upstream doesn't include this change. May I continue this? Of course, with author-ship from DaeRo Lee <skseofh@gmail.com>. Mel, Would you mind suggesting some cases that I could do to see the effects from this change? Such as the overhead or throughput? Or what cases you expect? -- Wei Yang Help you, Help me ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] mm/vmscan.c: no need to double-check if free pages are under high-watermark 2022-03-26 15:50 ` Wei Yang @ 2022-04-07 13:56 ` Mel Gorman 0 siblings, 0 replies; 7+ messages in thread From: Mel Gorman @ 2022-04-07 13:56 UTC (permalink / raw) To: Wei Yang; +Cc: DaeRo Lee, akpm, linux-mm, linux-kernel On Sat, Mar 26, 2022 at 03:50:22PM +0000, Wei Yang wrote: > On Thu, Jan 06, 2022 at 12:57:58PM +0000, Mel Gorman wrote: > >On Thu, Jan 06, 2022 at 09:03:34PM +0900, DaeRo Lee wrote: > >> > > @@ -4355,7 +4355,7 @@ static enum zone_type kswapd_highest_zoneidx(pg_data_t *pgdat, > >> > > static void kswapd_try_to_sleep(pg_data_t *pgdat, int alloc_order, int reclaim_order, > >> > > unsigned int highest_zoneidx) > >> > > { > >> > > - long remaining = 0; > >> > > + long remaining = ~0; > >> > > DEFINE_WAIT(wait); > >> > > > >> > > if (freezing(current) || kthread_should_stop()) > >> > > >> > While this does avoid calling prepare_kswapd_sleep() twice if the pgdat > >> > is balanced on the first try, it then does not restore the vmstat > >> > thresholds and doesn't call schedul() for kswapd to go to sleep. > >> > >> I intended not to call prepare_kswapd_sleep() twice when the pgdat is NOT > >> balanced on the first try:) > >> > > > >Stupid typo on my part. > > > >> > @@ -4406,11 +4412,11 @@ static void kswapd_try_to_sleep(pg_data_t *pgdat, int alloc_order, int reclaim_o > >> > } > >> > > >> > /* > >> > - * After a short sleep, check if it was a premature sleep. If not, then > >> > - * go fully to sleep until explicitly woken up. > >> > + * If balanced to the high watermark, restore vmstat thresholds and > >> > + * kswapd goes to sleep. If kswapd remains awake, account whether > >> > + * the low or high watermark was hit quickly. > >> > */ > >> > - if (!remaining && > >> > - prepare_kswapd_sleep(pgdat, reclaim_order, highest_zoneidx)) { > >> > + if (balanced) { > >> > trace_mm_vmscan_kswapd_sleep(pgdat->node_id); > >> > > >> > /* > >> > >> But, I think what you did is more readable and nice. > >> Thanks! > >> > > > >Feel free to pick it up, rerun your tests to ensure it's behaving as > >expected and resend! Include something in the changelog about user-visible > >effects if any (or a note saying that it reduces unnecssary overhead) > >and resend with me added to the cc. > > > > Hi, All > > Seems this thread stops here. I don't see following patch and current upstream > doesn't include this change. > > May I continue this? Of course, with author-ship from DaeRo Lee <skseofh@gmail.com>. > I've no objections. When I said "Feel free to pick it up", I meant that I was ok with you taking the patch and putting your team on it. > Mel, > > Would you mind suggesting some cases that I could do to see the effects from > this change? Such as the overhead or throughput? Or what cases you expect? > I don't have any suggestions on artificially triggering it. I had assumed you had encountered the bug in practice and had a test case but it would be ok to note that the patch is a theoretical fix based on code review. -- Mel Gorman SUSE Labs ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2022-04-07 13:58 UTC | newest] Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2022-01-02 3:31 [PATCH] mm/vmscan.c: no need to double-check if free pages are under high-watermark skseofh 2022-01-06 0:17 ` Andrew Morton 2022-01-06 9:46 ` Mel Gorman 2022-01-06 12:03 ` DaeRo Lee 2022-01-06 12:57 ` Mel Gorman 2022-03-26 15:50 ` Wei Yang 2022-04-07 13:56 ` Mel Gorman
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.