linux-mm.kvack.org archive mirror
* [PATCH v3] mm/compaction: fix set skip in fast_find_migrateblock
@ 2022-07-13  6:20 Chuyi Zhou
  2022-07-13 15:28 ` Andrew Morton
  0 siblings, 1 reply; 11+ messages in thread
From: Chuyi Zhou @ 2022-07-13  6:20 UTC (permalink / raw)
  To: linux-mm; +Cc: zhouchuyi

From: zhouchuyi <zhouchuyi@bytedance.com>

When we successfully find a pageblock in fast_find_migrateblock(),
the skip flag will be set for that block through set_pageblock_skip().
However, when entering isolate_migratepages_block(), the whole pageblock
will be skipped due to the branch
'if (!valid_page && IS_ALIGNED(low_pfn, pageblock_nr_pages))'.
Eventually we will goto isolate_abort and isolate nothing. That makes
fast_find_migrateblock() useless.
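
For context, the check in question in isolate_migratepages_block()
looks roughly like this (a sketch; the exact code varies by kernel
version):

	if (!valid_page && IS_ALIGNED(low_pfn, pageblock_nr_pages)) {
		/* isolation_suitable() is false when the skip bit is set */
		if (!isolation_suitable(cc, page)) {
			low_pfn = end_pfn;
			page = NULL;
			goto isolate_abort;
		}
		valid_page = page;
	}

With the skip bit already set by fast_find_migrateblock(), the abort
path is taken before anything is isolated.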

In this patch, when we find a suitable pageblock in
fast_find_migrateblock(), we do nothing and let
isolate_migratepages_block() set the skip flag on the pageblock after
scanning it. Normally, we would isolate some pages from the fast-found
block.

I used mmtests/thpscale-madvhugepage to test it. Here is the result:
                            baseline               patch
Amean     fault-both-1      1331.66 (   0.00%)     1261.04 *   5.30%*
Amean     fault-both-3      1383.95 (   0.00%)     1191.69 *  13.89%*
Amean     fault-both-5      1568.13 (   0.00%)     1445.20 *   7.84%*
Amean     fault-both-7      1819.62 (   0.00%)     1555.13 *  14.54%*
Amean     fault-both-12     1106.96 (   0.00%)     1149.43 *  -3.84%*
Amean     fault-both-18     2196.93 (   0.00%)     1875.77 *  14.62%*
Amean     fault-both-24     2642.69 (   0.00%)     2671.21 *  -1.08%*
Amean     fault-both-30     2901.89 (   0.00%)     2857.32 *   1.54%*
Amean     fault-both-32     3747.00 (   0.00%)     3479.23 *   7.15%*

Fixes: 70b44595eafe9 ("mm, compaction: use free lists to quickly locate a migration source")

Signed-off-by: zhouchuyi <zhouchuyi@bytedance.com>
---
 mm/compaction.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 962d05d1e187..abc7b0834471 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1850,7 +1850,6 @@ static unsigned long fast_find_migrateblock(struct compact_control *cc)
 					pfn = cc->zone->zone_start_pfn;
 				cc->fast_search_fail = 0;
 				found_block = true;
-				set_pageblock_skip(freepage);
 				break;
 			}
 		}
-- 
2.20.1




* Re: [PATCH v3] mm/compaction: fix set skip in fast_find_migrateblock
  2022-07-13  6:20 [PATCH v3] mm/compaction: fix set skip in fast_find_migrateblock Chuyi Zhou
@ 2022-07-13 15:28 ` Andrew Morton
  2022-07-14 11:50   ` Mel Gorman
  0 siblings, 1 reply; 11+ messages in thread
From: Andrew Morton @ 2022-07-13 15:28 UTC (permalink / raw)
  To: Chuyi Zhou; +Cc: linux-mm, Mel Gorman

(cc Mel)

On Wed, 13 Jul 2022 14:20:09 +0800 Chuyi Zhou <zhouchuyi@bytedance.com> wrote:

> From: zhouchuyi <zhouchuyi@bytedance.com>
> 
> When we successfully find a pageblock in fast_find_migrateblock(),
> the skip flag will be set for that block through set_pageblock_skip().
> However, when entering isolate_migratepages_block(), the whole pageblock
> will be skipped due to the branch
> 'if (!valid_page && IS_ALIGNED(low_pfn, pageblock_nr_pages))'.
> Eventually we will goto isolate_abort and isolate nothing. That makes
> fast_find_migrateblock() useless.
> 
> In this patch, when we find a suitable pageblock in
> fast_find_migrateblock(), we do nothing and let
> isolate_migratepages_block() set the skip flag on the pageblock after
> scanning it. Normally, we would isolate some pages from the fast-found
> block.
> 
> I used mmtests/thpscale-madvhugepage to test it. Here is the result:
>                             baseline               patch
> Amean     fault-both-1      1331.66 (   0.00%)     1261.04 *   5.30%*
> Amean     fault-both-3      1383.95 (   0.00%)     1191.69 *  13.89%*
> Amean     fault-both-5      1568.13 (   0.00%)     1445.20 *   7.84%*
> Amean     fault-both-7      1819.62 (   0.00%)     1555.13 *  14.54%*
> Amean     fault-both-12     1106.96 (   0.00%)     1149.43 *  -3.84%*
> Amean     fault-both-18     2196.93 (   0.00%)     1875.77 *  14.62%*
> Amean     fault-both-24     2642.69 (   0.00%)     2671.21 *  -1.08%*
> Amean     fault-both-30     2901.89 (   0.00%)     2857.32 *   1.54%*
> Amean     fault-both-32     3747.00 (   0.00%)     3479.23 *   7.15%*
> 
> Fixes: 70b44595eafe9 ("mm, compaction: use free lists to quickly locate a migration source")
> 
> Signed-off-by: zhouchuyi <zhouchuyi@bytedance.com>
> ---
>  mm/compaction.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 962d05d1e187..abc7b0834471 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -1850,7 +1850,6 @@ static unsigned long fast_find_migrateblock(struct compact_control *cc)
>  					pfn = cc->zone->zone_start_pfn;
>  				cc->fast_search_fail = 0;
>  				found_block = true;
> -				set_pageblock_skip(freepage);
>  				break;
>  			}
>  		}
> -- 
> 2.20.1
> 



* Re: [PATCH v3] mm/compaction: fix set skip in fast_find_migrateblock
  2022-07-13 15:28 ` Andrew Morton
@ 2022-07-14 11:50   ` Mel Gorman
  2022-07-15 15:26     ` Chuyi Zhou
  0 siblings, 1 reply; 11+ messages in thread
From: Mel Gorman @ 2022-07-14 11:50 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Chuyi Zhou, linux-mm

On Wed, Jul 13, 2022 at 08:28:14AM -0700, Andrew Morton wrote:
> (cc Mel)
> 
> On Wed, 13 Jul 2022 14:20:09 +0800 Chuyi Zhou <zhouchuyi@bytedance.com> wrote:
> 
> > From: zhouchuyi <zhouchuyi@bytedance.com>
> > 
> > When we successfully find a pageblock in fast_find_migrateblock(),
> > the skip flag will be set for that block through set_pageblock_skip().
> > However, when entering isolate_migratepages_block(), the whole pageblock
> > will be skipped due to the branch
> > 'if (!valid_page && IS_ALIGNED(low_pfn, pageblock_nr_pages))'.
> > Eventually we will goto isolate_abort and isolate nothing. That makes
> > fast_find_migrateblock() useless.
> > 

It's not very clear from the changelog *why* this failed, because
superficially !valid_page will be true for the first pageblock and there
is a reasonable expectation it will be aligned. Is the following accurate
based on your analysis?

	However, when entering isolate_migratepages_block(), the first
	pageblock will be skipped in the branch 'if (!valid_page &&
	IS_ALIGNED(low_pfn, pageblock_nr_pages))' as isolation_suitable
	returns false due to the skip bit set by fast_find_migrateblock().

If so, please update the changelog as a reviewer handling backports may
wonder what exactly is wrong with that branch.
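
For reference, isolation_suitable() reads roughly as follows (a sketch
of the mm/compaction.c of the time):

	static inline bool isolation_suitable(struct compact_control *cc,
						struct page *page)
	{
		if (cc->ignore_skip_hint)
			return true;

		/* A set skip bit makes the pageblock look unsuitable */
		return !get_pageblock_skip(page);
	}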

Second, what guarantees a block returned by fast_find that is not
aligned gets marked skipped after it is scanned? The set_pageblock_skip
is only called when there is a valid page and it may not be set if
!IS_ALIGNED(low_pfn). Is something like this untested hunk also necessary?

diff --git a/mm/compaction.c b/mm/compaction.c
index 1f89b969c12b..112346b2f716 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -888,8 +888,9 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 		 * COMPACT_CLUSTER_MAX at a time so the second call must
 		 * not falsely conclude that the block should be skipped.
 		 */
-		if (!valid_page && IS_ALIGNED(low_pfn, pageblock_nr_pages)) {
-			if (!isolation_suitable(cc, page)) {
+		if (!valid_page) {
+			if (IS_ALIGNED(low_pfn, pageblock_nr_pages) &&
+			    !isolation_suitable(cc, page)) {
 				low_pfn = end_pfn;
 				page = NULL;
 				goto isolate_abort;

> > In this patch, when we find a suitable pageblock in
> > fast_find_migrateblock(), we do nothing and let
> > isolate_migratepages_block() set the skip flag on the pageblock after
> > scanning it. Normally, we would isolate some pages from the fast-found
> > block.
> > 
> > I used mmtests/thpscale-madvhugepage to test it. Here is the result:
> >                             baseline               patch
> > Amean     fault-both-1      1331.66 (   0.00%)     1261.04 *   5.30%*
> > Amean     fault-both-3      1383.95 (   0.00%)     1191.69 *  13.89%*
> > Amean     fault-both-5      1568.13 (   0.00%)     1445.20 *   7.84%*
> > Amean     fault-both-7      1819.62 (   0.00%)     1555.13 *  14.54%*
> > Amean     fault-both-12     1106.96 (   0.00%)     1149.43 *  -3.84%*
> > Amean     fault-both-18     2196.93 (   0.00%)     1875.77 *  14.62%*
> > Amean     fault-both-24     2642.69 (   0.00%)     2671.21 *  -1.08%*
> > Amean     fault-both-30     2901.89 (   0.00%)     2857.32 *   1.54%*
> > Amean     fault-both-32     3747.00 (   0.00%)     3479.23 *   7.15%*
> > 
> > Fixes: 70b44595eafe9 ("mm, compaction: use free lists to quickly locate a migration source")
> > 
> > Signed-off-by: zhouchuyi <zhouchuyi@bytedance.com>

No need for a newline between Fixes and Signed-off-by. The Signed-off-by
should have your full name, not a username.

-- 
Mel Gorman
SUSE Labs



* Re: [PATCH v3] mm/compaction: fix set skip in fast_find_migrateblock
  2022-07-14 11:50   ` Mel Gorman
@ 2022-07-15 15:26     ` Chuyi Zhou
  2022-07-19  8:28       ` Mel Gorman
  0 siblings, 1 reply; 11+ messages in thread
From: Chuyi Zhou @ 2022-07-15 15:26 UTC (permalink / raw)
  To: Mel Gorman, Andrew Morton; +Cc: linux-mm



On 2022/7/14 7:50 PM, Mel Gorman wrote:
> On Wed, Jul 13, 2022 at 08:28:14AM -0700, Andrew Morton wrote:
>> (cc Mel)
>>
>> On Wed, 13 Jul 2022 14:20:09 +0800 Chuyi Zhou <zhouchuyi@bytedance.com> wrote:
>>
>>> From: zhouchuyi <zhouchuyi@bytedance.com>
>>>
>>> When we successfully find a pageblock in fast_find_migrateblock(),
>>> the skip flag will be set for that block through set_pageblock_skip().
>>> However, when entering isolate_migratepages_block(), the whole pageblock
>>> will be skipped due to the branch
>>> 'if (!valid_page && IS_ALIGNED(low_pfn, pageblock_nr_pages))'.
>>> Eventually we will goto isolate_abort and isolate nothing. That makes
>>> fast_find_migrateblock() useless.
>>>
> 
> It's not very clear from the changelog *why* this failed, because
> superficially !valid_page will be true for the first pageblock and there
> is a reasonable expectation it will be aligned. Is the following accurate
> based on your analysis?
> 
> 	However, when entering isolate_migratepages_block(), the first
> 	pageblock will be skipped in the branch 'if (!valid_page &&
> 	IS_ALIGNED(low_pfn, pageblock_nr_pages))' as isolation_suitable
> 	returns false due to the skip bit set by fast_find_migrateblock().
> 
> If so, please update the changelog as a reviewer handling backports may
> wonder what exactly is wrong with that branch.
> 
Hi Mel, thanks for your review.

If fast scanning fails, the returned block may not be aligned, because
we get the pfn from *free_pfn*. When fast-find succeeds, the return
value *pfn* is obtained from pageblock_start_pfn() and is passed to
isolate_migratepages_block() as low_pfn. I think the value obtained
from pageblock_start_pfn() should normally be aligned with
pageblock_nr_pages. I have tested this with printk. Maybe I am missing
something important?
     pfn = pageblock_start_pfn(free_pfn);
     ...
     found_block = true;
     set_pageblock_skip(freepage);
     break;
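
(For reference, pageblock_start_pfn() just rounds a pfn down to the
first pfn of its pageblock, roughly as below; the exact definition
varies by kernel version:

	/* first pfn of the pageblock containing @pfn */
	#define pageblock_start_pfn(pfn)  ((pfn) & ~(pageblock_nr_pages - 1))

so a pfn obtained from it is pageblock-aligned by construction.)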
> Second, what guarantees a block returned by fast_find that is not
> aligned gets marked skipped after it is scanned? The set_pageblock_skip
> is only called when there is a valid page and it may not be set if
> !IS_ALIGNED(low_pfn). Is something like this untested hunk also necessary?
> 
You are right, we do need some mechanism to ensure the block is marked
skipped after scanning a !IS_ALIGNED block from fast_find. However, I
think the following code may not work, because *skip_updated* has
already been set:
     if (!skip_updated) {
         skip_updated = true;
         if (test_and_set_skip(cc, page, low_pfn))
             goto isolate_abort;
     }
Why not ignore skip_updated after scanning a block, like this:

diff --git a/mm/compaction.c b/mm/compaction.c
index 962d05d1e187..1c388c45f127 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1180,7 +1180,7 @@ isolate_migratepages_block(struct compact_control 
*cc, unsigned long low_pfn,
          * rescanned twice in a row.
          */
         if (low_pfn == end_pfn && (!nr_isolated || cc->rescan)) {
-               if (valid_page && !skip_updated)
+               if (valid_page)
                         set_pageblock_skip(valid_page);
                 update_cached_migrate(cc, low_pfn);
         }

> diff --git a/mm/compaction.c b/mm/compaction.c
> index 1f89b969c12b..112346b2f716 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -888,8 +888,9 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>   		 * COMPACT_CLUSTER_MAX at a time so the second call must
>   		 * not falsely conclude that the block should be skipped.
>   		 */
> -		if (!valid_page && IS_ALIGNED(low_pfn, pageblock_nr_pages)) {
> -			if (!isolation_suitable(cc, page)) {
> +		if (!valid_page) {
> +			if (IS_ALIGNED(low_pfn, pageblock_nr_pages) &&
> +			    !isolation_suitable(cc, page)) {
>   				low_pfn = end_pfn;
>   				page = NULL;
>   				goto isolate_abort;
> 
>>> In this patch, when we find a suitable pageblock in
>>> fast_find_migrateblock(), we do nothing and let
>>> isolate_migratepages_block() set the skip flag on the pageblock after
>>> scanning it. Normally, we would isolate some pages from the fast-found
>>> block.
>>>
>>> I used mmtests/thpscale-madvhugepage to test it. Here is the result:
>>>                              baseline               patch
>>> Amean     fault-both-1      1331.66 (   0.00%)     1261.04 *   5.30%*
>>> Amean     fault-both-3      1383.95 (   0.00%)     1191.69 *  13.89%*
>>> Amean     fault-both-5      1568.13 (   0.00%)     1445.20 *   7.84%*
>>> Amean     fault-both-7      1819.62 (   0.00%)     1555.13 *  14.54%*
>>> Amean     fault-both-12     1106.96 (   0.00%)     1149.43 *  -3.84%*
>>> Amean     fault-both-18     2196.93 (   0.00%)     1875.77 *  14.62%*
>>> Amean     fault-both-24     2642.69 (   0.00%)     2671.21 *  -1.08%*
>>> Amean     fault-both-30     2901.89 (   0.00%)     2857.32 *   1.54%*
>>> Amean     fault-both-32     3747.00 (   0.00%)     3479.23 *   7.15%*
>>>
>>> Fixes: 70b44595eafe9 ("mm, compaction: use free lists to quickly locate a migration source")
>>>
>>> Signed-off-by: zhouchuyi <zhouchuyi@bytedance.com>
> 
> No need for a newline between Fixes and Signed-off-by. The Signed-off-by
> should have your full name, not a username.
> 



* Re: [PATCH v3] mm/compaction: fix set skip in fast_find_migrateblock
  2022-07-15 15:26     ` Chuyi Zhou
@ 2022-07-19  8:28       ` Mel Gorman
  2022-08-15  1:24         ` Andrew Morton
                           ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Mel Gorman @ 2022-07-19  8:28 UTC (permalink / raw)
  To: Chuyi Zhou; +Cc: Andrew Morton, linux-mm

On Fri, Jul 15, 2022 at 11:26:01PM +0800, Chuyi Zhou wrote:
> 
> 
> On 2022/7/14 7:50 PM, Mel Gorman wrote:
> > On Wed, Jul 13, 2022 at 08:28:14AM -0700, Andrew Morton wrote:
> > > (cc Mel)
> > > 
> > > On Wed, 13 Jul 2022 14:20:09 +0800 Chuyi Zhou <zhouchuyi@bytedance.com> wrote:
> > > 
> > > > From: zhouchuyi <zhouchuyi@bytedance.com>
> > > > 
> > > > When we successfully find a pageblock in fast_find_migrateblock(),
> > > > the skip flag will be set for that block through set_pageblock_skip().
> > > > However, when entering isolate_migratepages_block(), the whole pageblock
> > > > will be skipped due to the branch
> > > > 'if (!valid_page && IS_ALIGNED(low_pfn, pageblock_nr_pages))'.
> > > > Eventually we will goto isolate_abort and isolate nothing. That makes
> > > > fast_find_migrateblock() useless.
> > > > 
> > 
> > It's not very clear from the changelog *why* this failed, because
> > superficially !valid_page will be true for the first pageblock and there
> > is a reasonable expectation it will be aligned. Is the following accurate
> > based on your analysis?
> > 
> > 	However, when entering isolate_migratepages_block(), the first
> > 	pageblock will be skipped in the branch 'if (!valid_page &&
> > 	IS_ALIGNED(low_pfn, pageblock_nr_pages))' as isolation_suitable
> > 	returns false due to the skip bit set by fast_find_migrateblock().
> > 
> > If so, please update the changelog as a reviewer handling backports may
> > wonder what exactly is wrong with that branch.
> > 
> Hi Mel, thanks for your review.
> 
> If fast scanning fails, the returned block may not be aligned, because we
> get the pfn from *free_pfn*. When fast-find succeeds, the return value
> *pfn* is obtained from pageblock_start_pfn() and is passed to
> isolate_migratepages_block() as low_pfn. I think the value obtained from
> pageblock_start_pfn() should normally be aligned with pageblock_nr_pages.
> I have tested this with printk. Maybe I am missing something important?
>     pfn = pageblock_start_pfn(free_pfn);
>     ...
>     found_block = true;
>     set_pageblock_skip(freepage);
>     break;

The returned block indeed may not be aligned; it could simply be a
restart. The changelog could still do with a little clarification, but
your patch is still fine.

> > Second, what guarantees a block returned by fast_find that is not
> > aligned gets marked skipped after it is scanned? The set_pageblock_skip
> > is only called when there is a valid page and it may not be set if
> > !IS_ALIGNED(low_pfn). Is something like this untested hunk also necessary?
> > 
> You are right, we do need some mechanism to ensure the block is marked
> skipped after scanning a !IS_ALIGNED block from fast_find. However, I
> think the following code may not work, because *skip_updated* has
> already been set:
>     if (!skip_updated) {
>         skip_updated = true;
>         if (test_and_set_skip(cc, page, low_pfn))
>             goto isolate_abort;
>     }
> Why not ignore skip_updated after scanning a block, like this:
> 
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 962d05d1e187..1c388c45f127 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -1180,7 +1180,7 @@ isolate_migratepages_block(struct compact_control *cc,
> unsigned long low_pfn,
>          * rescanned twice in a row.
>          */
>         if (low_pfn == end_pfn && (!nr_isolated || cc->rescan)) {
> -               if (valid_page && !skip_updated)
> +               if (valid_page)
>                         set_pageblock_skip(valid_page);
>                 update_cached_migrate(cc, low_pfn);
>         }
> 

Because valid_page still needs to be set but ensuring valid_page is set
is sufficient.

The pageblock will be marked for skip on the first locking of a lruvec

                        /* Try get exclusive access under lock */
                        if (!skip_updated) {
                                skip_updated = true;
                                if (test_and_set_skip(cc, page, low_pfn))
                                        goto isolate_abort;
                        }

That will happen regardless of alignment. The intent here was to avoid
multiple scanners trying to isolate the same pageblock. The second
check for update happens at the end of the block

        if (low_pfn == end_pfn && (!nr_isolated || cc->rescan)) {
                if (valid_page && !skip_updated)
                        set_pageblock_skip(valid_page);
                update_cached_migrate(cc, low_pfn);
        }

And it's the second one that requires a valid_page. If valid_page is
available then the pageblock will get marked skip in most cases that
matter. If anything, it can get set prematurely due to "Try get
exclusive access" if the pageblock is not fully scanned, but checking
whether the pageblock should be *cleared* after a partial scan would be
a different patch, and it is not even clear that it is worth the
complexity.

-- 
Mel Gorman
SUSE Labs



* Re: [PATCH v3] mm/compaction: fix set skip in fast_find_migrateblock
  2022-07-19  8:28       ` Mel Gorman
@ 2022-08-15  1:24         ` Andrew Morton
       [not found]           ` <70a434b2-7f1c-7fad-a7b7-cb038a13fd2c@bytedance.com>
       [not found]         ` <7d2bbb38-a96c-8212-8f89-915cd2c8668f@bytedance.com>
  2023-01-09 19:43         ` Vlastimil Babka
  2 siblings, 1 reply; 11+ messages in thread
From: Andrew Morton @ 2022-08-15  1:24 UTC (permalink / raw)
  To: Mel Gorman; +Cc: Chuyi Zhou, linux-mm

On Tue, 19 Jul 2022 09:28:54 +0100 Mel Gorman <mgorman@techsingularity.net> wrote:

> The returned block indeed may not be aligned; it could simply be a
> restart. The changelog could still do with a little clarification, but
> your patch is still fine.

I'm not seeing a v4 so I merged the v3.  It would be great if someone
could please send along an updated changelog?



* Re: [PATCH v3] mm/compaction: fix set skip in fast_find_migrateblock
       [not found]           ` <70a434b2-7f1c-7fad-a7b7-cb038a13fd2c@bytedance.com>
@ 2022-08-15  3:22             ` Chuyi Zhou
  0 siblings, 0 replies; 11+ messages in thread
From: Chuyi Zhou @ 2022-08-15  3:22 UTC (permalink / raw)
  To: Andrew Morton, Mel Gorman; +Cc: linux-mm



On 2022/8/15 11:20 AM, Chuyi Zhou wrote:
> 
> 
> On 2022/8/15 9:24 AM, Andrew Morton wrote:
>> On Tue, 19 Jul 2022 09:28:54 +0100 Mel Gorman 
>> <mgorman@techsingularity.net> wrote:
>>
>>> The returned block indeed may not be aligned; it could simply be a
>>> restart. The changelog could still do with a little clarification, but
>>> your patch is still fine.
>>
>> I'm not seeing a v4 so I merged the v3.  It would be great if someone
>> could please send along an updated changelog?
> 
> Hi Morton, here is the updated changelog:
> fast_find_migrateblock() can return a block aligned with
> pageblock_nr_pages. When we successfully find a block, we use
> pageblock_start_pfn(free_pfn) to get the first pfn of the pageblock
> and the block is marked skip through *set_pageblock_skip*; the value
> obtained from pageblock_start_pfn() is normally aligned with
> *pageblock_nr_pages*. Then the *first pfn* is passed to
> isolate_migratepages_block(), where the whole pageblock will be
> skipped due to the branch
> 'if (!valid_page && IS_ALIGNED(low_pfn, pageblock_nr_pages))', because
> !valid_page will be true for the first pageblock and low_pfn will be
> aligned with pageblock_nr_pages as mentioned above. Eventually we
> goto isolate_abort and isolate nothing. That makes
> fast_find_migrateblock() useless.
> In this patch, when we find a suitable pageblock in
> fast_find_migrateblock(), we do nothing and let
> isolate_migratepages_block() set the skip flag on the pageblock after
> scanning it. Normally, we would isolate some pages from the fast-found
> block.
> I used mmtests/thpscale-madvhugepage to test it. Here is the result:
>                               baseline               patch
> Amean     fault-both-1      1331.66 (   0.00%)     1261.04 *   5.30%*
> Amean     fault-both-3      1383.95 (   0.00%)     1191.69 *  13.89%*
> Amean     fault-both-5      1568.13 (   0.00%)     1445.20 *   7.84%*
> Amean     fault-both-7      1819.62 (   0.00%)     1555.13 *  14.54%*
> Amean     fault-both-12     1106.96 (   0.00%)     1149.43 *  -3.84%*
> Amean     fault-both-18     2196.93 (   0.00%)     1875.77 *  14.62%*
> Amean     fault-both-24     2642.69 (   0.00%)     2671.21 *  -1.08%*
> Amean     fault-both-30     2901.89 (   0.00%)     2857.32 *   1.54%*
> Amean     fault-both-32     3747.00 (   0.00%)     3479.23 *   7.15%*
> 
> Fixes: 70b44595eafe9 ("mm, compaction: use free lists to quickly locate a migration source")
> Signed-off-by: zhouchuyi <zhouchuyi@bytedance.com>
> =======================================================================
> Mel thinks it is possible that fast_find_migrateblock() can return
> a pageblock not aligned with pageblock_nr_pages. In that case,
> isolate_migratepages_block() would not set the skip flag.
> I want to fix this problem in v4.



* Re: [PATCH v3] mm/compaction: fix set skip in fast_find_migrateblock
       [not found]         ` <7d2bbb38-a96c-8212-8f89-915cd2c8668f@bytedance.com>
@ 2022-08-17  3:10           ` Chuyi Zhou
  0 siblings, 0 replies; 11+ messages in thread
From: Chuyi Zhou @ 2022-08-17  3:10 UTC (permalink / raw)
  To: Mel Gorman; +Cc: Andrew Morton, linux-mm



On 2022/8/17 11:10 AM, Chuyi Zhou wrote:
> 
> 
> On 2022/7/19 4:28 PM, Mel Gorman wrote:
>> On Fri, Jul 15, 2022 at 11:26:01PM +0800, Chuyi Zhou wrote:
>>>
>>>
>>> On 2022/7/14 7:50 PM, Mel Gorman wrote:
>>>> On Wed, Jul 13, 2022 at 08:28:14AM -0700, Andrew Morton wrote:
>>>>> (cc Mel)
>>>>>
>>>>> On Wed, 13 Jul 2022 14:20:09 +0800 Chuyi Zhou 
>>>>> <zhouchuyi@bytedance.com> wrote:
>>>>>
>>>>>> From: zhouchuyi <zhouchuyi@bytedance.com>
>>>>>>
>>>>>> When we successfully find a pageblock in fast_find_migrateblock(),
>>>>>> the skip flag will be set for that block through set_pageblock_skip().
>>>>>> However, when entering isolate_migratepages_block(), the whole
>>>>>> pageblock will be skipped due to the branch
>>>>>> 'if (!valid_page && IS_ALIGNED(low_pfn, pageblock_nr_pages))'.
>>>>>> Eventually we will goto isolate_abort and isolate nothing. That makes
>>>>>> fast_find_migrateblock() useless.
>>>>>>
>>>>
>>>> It's not very clear from the changelog *why* this failed, because
>>>> superficially !valid_page will be true for the first pageblock and 
>>>> there
>>>> is a reasonable expectation it will be aligned. Is the following 
>>>> accurate
>>>> based on your analysis?
>>>>
>>>>     However, when entering isolate_migratepages_block(), the first
>>>>     pageblock will be skipped in the branch 'if (!valid_page &&
>>>>     IS_ALIGNED(low_pfn, pageblock_nr_pages))' as isolation_suitable
>>>>     returns false due to the skip bit set by fast_find_migrateblock().
>>>>
>>>> If so, please update the changelog as a reviewer handling backports may
>>>> wonder what exactly is wrong with that branch.
>>>>
>>> Hi Mel, thanks for your review.
>>>
>>> If fast scanning fails, the returned block may not be aligned,
>>> because we get the pfn from *free_pfn*. When fast-find succeeds, the
>>> return value *pfn* is obtained from pageblock_start_pfn() and is
>>> passed to isolate_migratepages_block() as low_pfn. I think the value
>>> obtained from pageblock_start_pfn() should normally be aligned with
>>> pageblock_nr_pages. I have tested this with printk. Maybe I am
>>> missing something important?
>>>      pfn = pageblock_start_pfn(free_pfn);
>>>      ...
>>>      found_block = true;
>>>      set_pageblock_skip(freepage);
>>>      break;
>>
>> The returned block indeed may not be aligned; it could simply be a
>> restart. The changelog could still do with a little clarification, but
>> your patch is still fine.
>>
>>>> Second, what guarantees a block returned by fast_find that is not
>>>> aligned gets marked skipped after it is scanned? The set_pageblock_skip
>>>> is only called when there is a valid page and it may not be set if
>>>> !IS_ALIGNED(low_pfn). Is something like this untested hunk also 
>>>> necessary?
>>>>
>>> You are right, we do need some mechanism to ensure the block is
>>> marked skipped after scanning a !IS_ALIGNED block from fast_find.
>>> However, I think the following code may not work, because
>>> *skip_updated* has already been set:
>>>      if (!skip_updated) {
>>>          skip_updated = true;
>>>          if (test_and_set_skip(cc, page, low_pfn))
>>>              goto isolate_abort;
>>>      }
>>> Why not ignore skip_updated after scanning a block, like this:
>>>
>>> diff --git a/mm/compaction.c b/mm/compaction.c
>>> index 962d05d1e187..1c388c45f127 100644
>>> --- a/mm/compaction.c
>>> +++ b/mm/compaction.c
>>> @@ -1180,7 +1180,7 @@ isolate_migratepages_block(struct 
>>> compact_control *cc,
>>> unsigned long low_pfn,
>>>           * rescanned twice in a row.
>>>           */
>>>          if (low_pfn == end_pfn && (!nr_isolated || cc->rescan)) {
>>> -               if (valid_page && !skip_updated)
>>> +               if (valid_page)
>>>                          set_pageblock_skip(valid_page);
>>>                  update_cached_migrate(cc, low_pfn);
>>>          }
>>>
>>
>> Because valid_page still needs to be set but ensuring valid_page is set
>> is sufficient.
>>
>> The pageblock will be marked for skip on the first locking of a lruvec
>>
>>                          /* Try get exclusive access under lock */
>>                          if (!skip_updated) {
>>                                  skip_updated = true;
>>                                  if (test_and_set_skip(cc, page, 
>> low_pfn))
>>                                          goto isolate_abort;
>>                          }
>>
> In test_and_set_skip(cc, page, low_pfn), if
> !IS_ALIGNED(low_pfn, pageblock_nr_pages) is true, the block will not
> get marked skip; furthermore, *skip_updated* will be set to true, which
> makes the second check useless even when valid_page has been set.
>> That will happen regardless of alignment. The intent here was to avoid
>> multiple scanners trying to isolate the same pageblock. The second
>> check for update happens at the end of the block
>>
>>          if (low_pfn == end_pfn && (!nr_isolated || cc->rescan)) {
>>                  if (valid_page && !skip_updated)
>>                          set_pageblock_skip(valid_page);
>>                  update_cached_migrate(cc, low_pfn);
>>          }
>>
> One way to guarantee that a non-aligned pageblock gets marked is to
> ensure valid_page is set and to ignore skip_updated in the second
> check. Another way is to ignore IS_ALIGNED(pfn, pageblock_nr_pages) in
> test_and_set_skip().
> However, both of them may increase the chance of marking a pageblock
> skip more than once.
> Thanks.
>> And it's the second one that requires a valid_page. If valid_page is
>> available then the pageblock will get marked skip in most cases that
>> matter. If anything, it can get set prematurely due to "Try get
>> exclusive access" if the pageblock is not fully scanned, but checking
>> whether the pageblock should be *cleared* after a partial scan would be
>> a different patch, and it is not even clear that it is worth the
>> complexity.
>>



* Re: [PATCH v3] mm/compaction: fix set skip in fast_find_migrateblock
  2022-07-19  8:28       ` Mel Gorman
  2022-08-15  1:24         ` Andrew Morton
       [not found]         ` <7d2bbb38-a96c-8212-8f89-915cd2c8668f@bytedance.com>
@ 2023-01-09 19:43         ` Vlastimil Babka
  2023-01-11 14:21           ` Mel Gorman
  2 siblings, 1 reply; 11+ messages in thread
From: Vlastimil Babka @ 2023-01-09 19:43 UTC (permalink / raw)
  To: Mel Gorman, Chuyi Zhou; +Cc: Andrew Morton, linux-mm

On 7/19/22 10:28, Mel Gorman wrote:
> On Fri, Jul 15, 2022 at 11:26:01PM +0800, Chuyi Zhou wrote:
>> 
>> 
>> ??? 2022/7/14 ??????7:50, Mel Gorman ??????:
>> > On Wed, Jul 13, 2022 at 08:28:14AM -0700, Andrew Morton wrote:
>> > > (cc Mel)
>> > > 
>> > > On Wed, 13 Jul 2022 14:20:09 +0800 Chuyi Zhou <zhouchuyi@bytedance.com> wrote:
>> > > 
>> > > > From: zhouchuyi <zhouchuyi@bytedance.com>
>> > > > 
>> > > > When we successfully find a pageblock in fast_find_migrateblock(),
>> > > > the skip flag will be set for that block through set_pageblock_skip().
>> > > > However, when entering isolate_migratepages_block(), the whole pageblock
>> > > > will be skipped due to the branch
>> > > > 'if (!valid_page && IS_ALIGNED(low_pfn, pageblock_nr_pages))'.
>> > > > Eventually we will goto isolate_abort and isolate nothing. That makes
>> > > > fast_find_migrateblock() useless.
>> > > > 
>> > 
>> > It's not very clear from the changelog *why* this failed, because
>> > superficially !valid_page will be true for the first pageblock and there
>> > is a reasonable expectation it will be aligned. Is the following accurate
>> > based on your analysis?
>> > 
>> > 	However, when entering isolate_migratepages_block(), the first
>> > 	pageblock will be skipped in the branch 'if (!valid_page &&
>> > 	IS_ALIGNED(low_pfn, pageblock_nr_pages))' as isolation_suitable
>> > 	returns false due to the skip bit set by fast_find_migrateblock().
>> > 
>> > If so, please update the changelog as a reviewer handling backports may
>> > wonder what exactly is wrong with that branch.
>> > 
>> Hi Mel, thanks for your review.
>> 
>> If fast scanning fails, the returned block may not be aligned, because we
>> get the pfn from *free_pfn*. When fast-find succeeds, the return value
>> *pfn* is obtained from pageblock_start_pfn() and is passed to
>> isolate_migratepages_block() as low_pfn. I think the value obtained from
>> pageblock_start_pfn() should normally be aligned with pageblock_nr_pages.
>> I have tested this with printk. Maybe I am missing something important?
>>     pfn = pageblock_start_pfn(free_pfn);
>>     ...
>>     found_block = true;
>>     set_pageblock_skip(freepage);
>>     break;
> 
> The returned block indeed may not be aligned; it could simply be a
> restart. The changelog could still do with a little clarification, but
> your patch is still fine.
> 
>> > Second, what guarantees a block returned by fast_find that is not
>> > aligned gets marked skipped after it is scanned? The set_pageblock_skip
>> > is only called when there is a valid page and it may not be set if
>> > !IS_ALIGNED(low_pfn). Is something like this untested hunk also necessary?
>> > 
>> You are right, we do need some machenism to ensure mark skipped after
>> scanned a !IS_ALIGNED block from fast_find. However, I think the following
>> code may not working. Because *skip_updated* has been reset:
>>     if (!skip_updated) {
>>         skip_updated = true;
>>         if (test_and_set_skip(cc, page, low_pfn))
>>             goto isolate_abort;
>>     }
>> Why not ignore skip_updated after scanned a block, just like this:
>> 
>> diff --git a/mm/compaction.c b/mm/compaction.c
>> index 962d05d1e187..1c388c45f127 100644
>> --- a/mm/compaction.c
>> +++ b/mm/compaction.c
>> @@ -1180,7 +1180,7 @@ isolate_migratepages_block(struct compact_control *cc,
>> unsigned long low_pfn,
>>          * rescanned twice in a row.
>>          */
>>         if (low_pfn == end_pfn && (!nr_isolated || cc->rescan)) {
>> -               if (valid_page && !skip_updated)
>> +               if (valid_page)
>>                         set_pageblock_skip(valid_page);
>>                 update_cached_migrate(cc, low_pfn);
>>         }
>> 
> 
> Because valid_page still needs to be set but ensuring valid_page is set
> is sufficient.
> 
> The pageblock will be marked for skip on the first locking of a lruvec
> 
>                         /* Try get exclusive access under lock */
>                         if (!skip_updated) {
>                                 skip_updated = true;
>                                 if (test_and_set_skip(cc, page, low_pfn))
>                                         goto isolate_abort;
>                         }
> 
> That will happen regardless of alignment.

I'm not sure about that, as test_and_set_skip() has "if
(!pageblock_aligned(pfn)) return false". So I think it only takes the
first pfn in the pageblock not being PageLRU for us not to try to get
exclusive access for it, and the following pfns will not be aligned,
and thus we never set_pageblock_skip() through this path. If we have
nr_isolated > 0 and not cc->rescan, we might not set_pageblock_skip()
through the hunk above either.
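
For reference, a sketch of test_and_set_skip() as of this thread,
simplified from the 6.1-era mm/compaction.c:

	static bool test_and_set_skip(struct compact_control *cc,
				      struct page *page, unsigned long pfn)
	{
		bool skip;

		/* Do not update if skip hint is being ignored */
		if (cc->ignore_skip_hint)
			return false;

		/* Only a pageblock-aligned pfn may set the skip bit */
		if (!pageblock_aligned(pfn))
			return false;

		skip = get_pageblock_skip(page);
		if (!skip && !cc->no_set_skip_hint)
			set_pageblock_skip(page);

		return skip;
	}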

I've been checking this area because we have some reports [1] of upstream
6.1 causing some long loops in khugepaged, pegging 100% CPU in compaction.
Tracepoint data suggests we keep (successfully) isolating migratepages over
and over through fast_find_migrateblock() (the pfn ranges are not linearly
increasing but there's probably a cycle as we rotate the freelist), but are
failing to migrate them (for some reason). Between 6.0 and 6.1 there's this
patch as commit 7efc3b726103, so I strongly suspect it's the root cause and
will be providing a kernel with a revert in the bug to confirm.
Consider this an early heads up :)

[1] https://bugzilla.suse.com/show_bug.cgi?id=1206848

> The intent here was to avoid
> multiple scanners trying to isolate the same pageblock. The second
> check for update happens at the end of the block
> 
>         if (low_pfn == end_pfn && (!nr_isolated || cc->rescan)) {
>                 if (valid_page && !skip_updated)
>                         set_pageblock_skip(valid_page);
>                 update_cached_migrate(cc, low_pfn);
>         }
> 
> And it's the second one that requires a valid_page. If valid_page is
> available then the pageblock will get marked skip in most cases that
> matter. If anything, it can get set prematurely due to "Try get
> exclusive access" if the pageblock is not fully scanned, but checking
> whether the pageblock should be *cleared* after a partial scan would be
> a different patch, and it is not even clear that it is worth the
> complexity.
> 




* Re: [PATCH v3] mm/compaction: fix set skip in fast_find_migrateblock
  2023-01-09 19:43         ` Vlastimil Babka
@ 2023-01-11 14:21           ` Mel Gorman
  2023-01-13 17:13             ` Vlastimil Babka
  0 siblings, 1 reply; 11+ messages in thread
From: Mel Gorman @ 2023-01-11 14:21 UTC (permalink / raw)
  To: Vlastimil Babka; +Cc: Chuyi Zhou, Andrew Morton, linux-mm

On Mon, Jan 09, 2023 at 08:43:40PM +0100, Vlastimil Babka wrote:
> On 7/19/22 10:28, Mel Gorman wrote:
> > On Fri, Jul 15, 2022 at 11:26:01PM +0800, Chuyi Zhou wrote:
> >> You are right, we do need some mechanism to ensure the block is
> >> marked skipped after scanning a !IS_ALIGNED block from fast_find.
> >> However, I think the following code may not work, because
> >> *skip_updated* has already been set:
> >>     if (!skip_updated) {
> >>         skip_updated = true;
> >>         if (test_and_set_skip(cc, page, low_pfn))
> >>             goto isolate_abort;
> >>     }
> >> Why not ignore skip_updated after scanning a block, like this:
> >> 
> >> diff --git a/mm/compaction.c b/mm/compaction.c
> >> index 962d05d1e187..1c388c45f127 100644
> >> --- a/mm/compaction.c
> >> +++ b/mm/compaction.c
> >> @@ -1180,7 +1180,7 @@ isolate_migratepages_block(struct compact_control *cc,
> >> unsigned long low_pfn,
> >>          * rescanned twice in a row.
> >>          */
> >>         if (low_pfn == end_pfn && (!nr_isolated || cc->rescan)) {
> >> -               if (valid_page && !skip_updated)
> >> +               if (valid_page)
> >>                         set_pageblock_skip(valid_page);
> >>                 update_cached_migrate(cc, low_pfn);
> >>         }
> >> 
> > 
> > Because valid_page still needs to be set but ensuring valid_page is set
> > is sufficient.
> > 
> > The pageblock will be marked for skip on the first locking of a lruvec
> > 
> >                         /* Try get exclusive access under lock */
> >                         if (!skip_updated) {
> >                                 skip_updated = true;
> >                                 if (test_and_set_skip(cc, page, low_pfn))
> >                                         goto isolate_abort;
> >                         }
> > 
> > That will happen regardless of alignment.
> 
> I'm not sure about that, as test_and_set_skip() has "if
> (!pageblock_aligned(pfn)) return false". So I think it only takes the
> first pfn in the pageblock not being PageLRU for us not to try to get
> exclusive access for it, and the following pfns will not be aligned,
> and thus we never set_pageblock_skip() through this path. If we have
> nr_isolated > 0 and not cc->rescan, we might not set_pageblock_skip()
> through the hunk above either.
> 

True.

> I've been checking this area because we have some reports [1] of upstream
> 6.1 causing some long loops in khugepaged, pegging 100% CPU in compaction.
> Tracepoint data suggests we keep (successfully) isolating migratepages over
> and over through fast_find_migrateblock() (the pfn ranges are not linearly
> increasing but there's probably a cycle as we rotate the freelist), but are
> failing to migrate them (for some reason). Between 6.0 and 6.1 there's this
> patch as commit 7efc3b726103, so I strongly suspect it's the root cause and
> will be providing a kernel with a revert in the bug to confirm.
> Consider this an early heads up :)
> 
> [1] https://bugzilla.suse.com/show_bug.cgi?id=1206848

The trace shows that there is a lot of rescanning between a small subset
of pageblocks because the conditions for marking the block skip are not
met. The scan is not reaching the end of the pageblock because enough
pages were isolated, but none were migrated successfully. Eventually it
circles back to the same block.

Pageblock skipping is tricky because we are trying to minimise both
latency and excessive scanning. However, tracking exactly when a block
is fully scanned requires an excessive amount of state.

One potential mitigating factor is to forcibly scan a pageblock when
migrations fail even though it could be for transient reasons such as
page writeback or page dirty. This will sometimes migrate too many pages
(worst case 500) but pageblocks will be marked skip.

This is not tested at all

diff --git a/mm/compaction.c b/mm/compaction.c
index ca1603524bbe..9242fcaeb09c 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -2387,6 +2387,7 @@ compact_zone(struct compact_control *cc, struct capture_control *capc)
 			cc->rescan = true;
 		}
 
+retry_rescan:
 		switch (isolate_migratepages(cc)) {
 		case ISOLATE_ABORT:
 			ret = COMPACT_CONTENDED;
@@ -2429,15 +2430,28 @@ compact_zone(struct compact_control *cc, struct capture_control *capc)
 				goto out;
 			}
 			/*
-			 * We failed to migrate at least one page in the current
-			 * order-aligned block, so skip the rest of it.
+			 * If an ASYNC or SYNC_LIGHT fails to migrate a page
+			 * within the current order-aligned block, rescan the
+			 * entire block. Once the whole block is scanned, it'll
+			 * be marked "skip" to avoid rescanning in the near
+			 * future. This will isolate more pages than necessary
+			 * for the request but avoid loops due to
+			 * fast_find_migrateblock revisiting blocks that were
+			 * recently partially scanned.
 			 */
-			if (cc->direct_compaction &&
-						(cc->mode == MIGRATE_ASYNC)) {
-				cc->migrate_pfn = block_end_pfn(
-						cc->migrate_pfn - 1, cc->order);
-				/* Draining pcplists is useless in this case */
-				last_migrated_pfn = 0;
+			if (cc->direct_compaction && !cc->rescan &&
+						(cc->mode < MIGRATE_SYNC)) {
+				cc->rescan = true;
+
+				/*
+				 * Draining pcplists does not help THP if
+				 * any page failed to migrate. Even after
+				 * drain, the pageblock will not be free.
+				 */
+				if (cc->order == HUGETLB_PAGE_ORDER)
+					last_migrated_pfn = 0;
+
+				goto retry_rescan;
 			}
 		}
 



* Re: [PATCH v3] mm/compaction: fix set skip in fast_find_migrateblock
  2023-01-11 14:21           ` Mel Gorman
@ 2023-01-13 17:13             ` Vlastimil Babka
  0 siblings, 0 replies; 11+ messages in thread
From: Vlastimil Babka @ 2023-01-13 17:13 UTC (permalink / raw)
  To: Mel Gorman; +Cc: Chuyi Zhou, Andrew Morton, linux-mm, Michal Hocko

On 1/11/23 15:21, Mel Gorman wrote:
> On Mon, Jan 09, 2023 at 08:43:40PM +0100, Vlastimil Babka wrote:
>> On 7/19/22 10:28, Mel Gorman wrote:
>> > On Fri, Jul 15, 2022 at 11:26:01PM +0800, Chuyi Zhou wrote:
>> >> You are right, we do need some mechanism to ensure the block is
>> >> marked skipped after scanning a !IS_ALIGNED block from fast_find.
>> >> However, I think the following code may not work, because
>> >> *skip_updated* has already been set:
>> >>     if (!skip_updated) {
>> >>         skip_updated = true;
>> >>         if (test_and_set_skip(cc, page, low_pfn))
>> >>             goto isolate_abort;
>> >>     }
>> >> Why not ignore skip_updated after scanning a block, like this:
>> >> 
>> >> diff --git a/mm/compaction.c b/mm/compaction.c
>> >> index 962d05d1e187..1c388c45f127 100644
>> >> --- a/mm/compaction.c
>> >> +++ b/mm/compaction.c
>> >> @@ -1180,7 +1180,7 @@ isolate_migratepages_block(struct compact_control *cc,
>> >> unsigned long low_pfn,
>> >>          * rescanned twice in a row.
>> >>          */
>> >>         if (low_pfn == end_pfn && (!nr_isolated || cc->rescan)) {
>> >> -               if (valid_page && !skip_updated)
>> >> +               if (valid_page)
>> >>                         set_pageblock_skip(valid_page);
>> >>                 update_cached_migrate(cc, low_pfn);
>> >>         }
>> >> 
>> > 
>> > Because valid_page still needs to be set but ensuring valid_page is set
>> > is sufficient.
>> > 
>> > The pageblock will be marked for skip on the first locking of a lruvec
>> > 
>> >                         /* Try get exclusive access under lock */
>> >                         if (!skip_updated) {
>> >                                 skip_updated = true;
>> >                                 if (test_and_set_skip(cc, page, low_pfn))
>> >                                         goto isolate_abort;
>> >                         }
>> > 
>> > That will happen regardless of alignment.
>> 
>> I'm not sure about that, as test_and_set_skip() has "if
>> (!pageblock_aligned(pfn)) return false". So I think it only takes the
>> first pfn in the pageblock not being PageLRU for us not to try to get
>> exclusive access for it, and the following pfns will not be aligned,
>> and thus we never set_pageblock_skip() through this path. If we have
>> nr_isolated > 0 and not cc->rescan, we might not set_pageblock_skip()
>> through the hunk above either.
>> 
> 
> True.
> 
>> I've been checking this area because we have some reports [1] of upstream
>> 6.1 causing some long loops in khugepaged, pegging 100% CPU in compaction.
>> Tracepoint data suggests we keep (successfully) isolating migratepages over
>> and over through fast_find_migrateblock() (the pfn ranges are not linearly
>> increasing but there's probably a cycle as we rotate the freelist), but are
>> failing to migrate them (for some reason). Between 6.0 and 6.1 there's this
>> patch as commit 7efc3b726103, so I strongly suspect it's the root cause and
>> will be providing a kernel with a revert in the bug to confirm.
>> Consider this an early heads up :)
>> 
>> [1] https://bugzilla.suse.com/show_bug.cgi?id=1206848
> 
> The trace shows that there is a lot of rescanning between a small subset
> of pageblocks because the conditions for marking the block skip are not
> met. The scan is not reaching the end of the pageblock because enough
> pages were isolated, but none were migrated successfully. Eventually it
> circles back to the same block.
> 
> Pageblock skipping is tricky because we are trying to minimise both
> latency and excessive scanning. However, tracking exactly when a block
> is fully scanned requires an excessive amount of state.
> 
> One potential mitigating factor is to forcibly scan a pageblock when
> migrations fail even though it could be for transient reasons such as
> page writeback or page dirty. This will sometimes migrate too many pages
> (worst case 500) but pageblocks will be marked skip.
> 
> This is not tested at all
> 
> diff --git a/mm/compaction.c b/mm/compaction.c
> index ca1603524bbe..9242fcaeb09c 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -2387,6 +2387,7 @@ compact_zone(struct compact_control *cc, struct capture_control *capc)
>  			cc->rescan = true;
>  		}
>  
> +retry_rescan:
>  		switch (isolate_migratepages(cc)) {
>  		case ISOLATE_ABORT:
>  			ret = COMPACT_CONTENDED;
> @@ -2429,15 +2430,28 @@ compact_zone(struct compact_control *cc, struct capture_control *capc)
>  				goto out;
>  			}
>  			/*
> -			 * We failed to migrate at least one page in the current
> -			 * order-aligned block, so skip the rest of it.
> +			 * If an ASYNC or SYNC_LIGHT fails to migrate a page
> +			 * within the current order-aligned block, rescan the
> +			 * entire block. Once the whole block is scanned, it'll
> +			 * be marked "skip" to avoid rescanning in the near
> +			 * future. This will isolate more pages than necessary
> +			 * for the request but avoid loops due to
> +			 * fast_find_migrateblock revisiting blocks that were
> +			 * recently partially scanned.
>  			 */
> -			if (cc->direct_compaction &&
> -						(cc->mode == MIGRATE_ASYNC)) {
> -				cc->migrate_pfn = block_end_pfn(
> -						cc->migrate_pfn - 1, cc->order);
> -				/* Draining pcplists is useless in this case */
> -				last_migrated_pfn = 0;
> +			if (cc->direct_compaction && !cc->rescan &&
> +						(cc->mode < MIGRATE_SYNC)) {
> +				cc->rescan = true;
> +
> +				/*
> +				 * Draining pcplists does not help THP if
> +				 * any page failed to migrate. Even after
> +				 * drain, the pageblock will not be free.
> +				 */
> +				if (cc->order == HUGETLB_PAGE_ORDER)
> +					last_migrated_pfn = 0;
> +
> +				goto retry_rescan;

I think it won't work as expected, because we will not be rescanning the
same pageblock; fast_find_migrateblock() will AFAICS pick a different one,
and there's nothing to prevent it.
But the possible compaction states are too complex for me to be confident
about a quick fix. I think it will be safer to revert the patch for 6.2 and
6.1 stable, and redo it more carefully.

>  			}
>  		}
>  




end of thread

Thread overview: 11+ messages
2022-07-13  6:20 [PATCH v3] mm/compaction: fix set skip in fast_find_migrateblock Chuyi Zhou
2022-07-13 15:28 ` Andrew Morton
2022-07-14 11:50   ` Mel Gorman
2022-07-15 15:26     ` Chuyi Zhou
2022-07-19  8:28       ` Mel Gorman
2022-08-15  1:24         ` Andrew Morton
     [not found]           ` <70a434b2-7f1c-7fad-a7b7-cb038a13fd2c@bytedance.com>
2022-08-15  3:22             ` Chuyi Zhou
     [not found]         ` <7d2bbb38-a96c-8212-8f89-915cd2c8668f@bytedance.com>
2022-08-17  3:10           ` Chuyi Zhou
2023-01-09 19:43         ` Vlastimil Babka
2023-01-11 14:21           ` Mel Gorman
2023-01-13 17:13             ` Vlastimil Babka
