linux-kernel.vger.kernel.org archive mirror
From: Zi Yan <ziy@nvidia.com>
To: Qian Cai <quic_qiancai@quicinc.com>
Cc: David Hildenbrand <david@redhat.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	virtualization@lists.linux-foundation.org,
	Vlastimil Babka <vbabka@suse.cz>,
	Mel Gorman <mgorman@techsingularity.net>,
	Eric Ren <renzhengeek@gmail.com>, Mike Rapoport <rppt@kernel.org>,
	Oscar Salvador <osalvador@suse.de>,
	Christophe Leroy <christophe.leroy@csgroup.eu>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH v11 0/6] Use pageblock_order for cma and alloc_contig_range alignment.
Date: Fri, 20 May 2022 09:43:40 -0400
Message-ID: <48D48FDC-B8DD-41C9-B56A-EBD7314883AB@nvidia.com>
In-Reply-To: <Yod71OhUa3VWWPCG@qian>

On 20 May 2022, at 7:30, Qian Cai wrote:

> On Thu, May 19, 2022 at 05:35:15PM -0400, Zi Yan wrote:
>> Do you have a complete reproducer? From your printout, it is clear that a 512-page compound
>> page caused the infinite loop, because the page was not migrated and the code kept
>> retrying. But __alloc_contig_migrate_range() is supposed to return non-zero to tell the
>> code the page cannot be migrated and the code will goto failed without retrying. It will be
>> great you can share what exactly has run after boot, so that I can reproduce locally to
>> identify what makes __alloc_contig_migrate_range() return 0 without migrating the page.
>
> The reproducer is just to run the same script I shared with you previously
> multiple times instead. It is still quite reproducible here as it usually
> happens within a hour.
>
> $ for i in `seq 1 100`; do ./flip_mem.py; done
>
>> Can you also try the patch below to see if it fixes the infinite loop?
>>
>> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
>> index b3f074d1682e..abde1877bbcb 100644
>> --- a/mm/page_isolation.c
>> +++ b/mm/page_isolation.c
>> @@ -417,10 +417,9 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, gfp_t gfp_flags,
>>                                 order = 0;
>>                                 outer_pfn = pfn;
>>                                 while (!PageBuddy(pfn_to_page(outer_pfn))) {
>> -                                       if (++order >= MAX_ORDER) {
>> -                                               outer_pfn = pfn;
>> -                                               break;
>> -                                       }
>> +                                       /* abort if the free page cannot be found */
>> +                                       if (++order >= MAX_ORDER)
>> +                                               goto failed;
>>                                         outer_pfn &= ~0UL << order;
>>                                 }
>>                                 pfn = outer_pfn;
>>
>
> Can you explain a bit how this patch is the right thing to do here? I am a
> little bit worry about shooting into the dark. Otherwise, I'll be running
> the off-by-one part over the weekend to see if that helps.

The code kept retrying to migrate a 512-page compound page, so it seems to me
that __alloc_contig_migrate_range() did not migrate the page but returned
0 every time. Otherwise, the "if (ret) goto failed;" check would have bailed
out of the loop already. The original code above assumed that a free page can
always be found after __alloc_contig_migrate_range() succeeds, so it retries
when no free page is found. Your infinite loop shows that this assumption does
not hold. The new code stops retrying when no free page can be found.
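
To make the changed loop easier to reason about, here is a minimal user-space
sketch in C of the search it performs. This is not the kernel code: the names
sketch_find_free_head(), sketch_page_is_buddy(), SKETCH_MAX_ORDER and the
free_head[] array are hypothetical stand-ins, and the value 11 is only an
assumption about MAX_ORDER. It models only the walk to coarser alignments and
the new early exit, not the migration or the retry around it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: stands in for the kernel's MAX_ORDER; 11 is illustrative. */
#define SKETCH_MAX_ORDER 11

/*
 * Hypothetical stand-in for PageBuddy(pfn_to_page(pfn)):
 * free_head[pfn] is true when pfn is the first page of a free buddy block.
 */
static bool sketch_page_is_buddy(const bool *free_head, unsigned long pfn)
{
	return free_head[pfn];
}

/*
 * Starting at pfn, align down to progressively larger orders until the
 * head of a free buddy block is found. Returns 0 and stores the head in
 * *outer_pfn on success. Returns -1 when no free head is found before
 * the order reaches SKETCH_MAX_ORDER, which models the "goto failed" in
 * the patch instead of resetting outer_pfn and retrying.
 */
static int sketch_find_free_head(const bool *free_head, unsigned long pfn,
				 unsigned long *outer_pfn)
{
	unsigned int order = 0;
	unsigned long cur = pfn;

	assert(free_head != NULL && outer_pfn != NULL);

	while (!sketch_page_is_buddy(free_head, cur)) {
		/* abort if the free page cannot be found */
		if (++order >= SKETCH_MAX_ORDER)
			return -1;
		cur &= ~0UL << order;
	}

	*outer_pfn = cur;
	return 0;
}
```

With a free head marked at pfn 512, a search starting from pfn 700 walks down
to 512 and succeeds. With no free head marked at all, the search returns -1
rather than handing the caller the original pfn to retry on.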

I will dig into it deeper to make sure it is the correct fix. I will
update you when I am done.

Thanks.

--
Best Regards,
Yan, Zi
