All of lore.kernel.org
 help / color / mirror / Atom feed
From: John Hubbard <jhubbard@nvidia.com>
To: Mike Kravetz <mike.kravetz@oracle.com>, Minchan Kim <minchan@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	syzbot <syzbot+acf65ca584991f3cc447@syzkaller.appspotmail.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	llvm@lists.linux.dev, nathan@kernel.org, ndesaulniers@google.com,
	syzkaller-bugs@googlegroups.com, trix@redhat.com,
	Matthew Wilcox <willy@infradead.org>,
	Stephen Rothwell <sfr@canb.auug.org.au>,
	David Hildenbrand <david@redhat.com>
Subject: Re: [syzbot] WARNING in follow_hugetlb_page
Date: Wed, 18 May 2022 00:12:18 -0700	[thread overview]
Message-ID: <d0abf3db-0726-81b9-5a44-36cb1736a155@nvidia.com> (raw)
In-Reply-To: <6d281052-485c-5e17-4f1c-ef5689831450@oracle.com>

On 5/16/22 20:37, Mike Kravetz wrote:

> Later, that code was modified (for performance reasons) in commit  0fa5bc4023c18 to do a single try_grab_compound_page() instead of multiple
> calls to try_grab_page().  At the time it was not noticed that
> try_grab_compound_page had a check for 'pinnable' when try_grab_page did
> not.  So, I think this commit actually changed the behavior.

This makes me unhappy with the try_grab_page() situation: the name
doesn't give you any hint that it now has extra policy in there.
Furthermore, the policy is unaware of all of the call sites and
is already getting it wrong.

After looking through the call sites, I'm leaning slightly toward pulling
is_pinnable_page() out, and letting the call sites decide if they want
or need to apply that policy.

Just because try_grab_folio() is a choke point, does not mean that it
is the right place for the is_pinnable_page() checks. And while it's
nice to avoid scattering is_pinnable_page() all over the place, it's
more important to *not* have it when it is not correct.

There is a counter-argument, though, which goes something like,
"try_grab_folio(FOLL_PIN) is never ever correct if the page is not
pinnable". But that's not as clear cut as one might think. See below.

> 
>>
>> try_grab_folio()
>>      /*
>>       * Can't do FOLL_LONGTERM + FOLL_PIN gup fast path if not in a
>>       * right zone, so fail and let the caller fall back to the slow
>>       * path.
>>       */
>>      if (unlikely((flags & FOLL_LONGTERM) &&
>>               !is_pinnable_page(page))) /* which we just changed */
>>          return NULL;
>>
>> ...and now I'm starting to think that this warning might fire even with
>> the corrected check for MIGRATE_CMA || MIGRATE_ISOLATE. Because
>> try_grab_folio() didn't always have this early exit and it is starting
>> to look wrong.
>>
>> Simply attempting to pin a non-pinnable huge page would hit this
>> warning. Adding additional reasons that a page is not pinnable (which
>> the patch does) could make this more likely to fire.
> 
> Yes, that is correct.  One could easily allocate a hugetlb page from
> CMA and trigger this warning.
> 

Thanks for confirming that!

>>
>> I need to look at this a little more closely, it is making me wonder
>> whether the is_pinnable_page() check is a problem in this path. The
>> comment in try_grab_folio() indicates that the early return is a hack
>> (it assumes that the caller is in the gup fast path), and maybe the hack
>> is just wrong here--I think we're actually on the slow gup path. Not
>> good.
>>
>> Mike, any thoughts here?
>>
> 
> Do you know why try_grab_compound_page(now try_grab_folio) checks for
> pinnable when try_grab_page does not?

I think it's just an oversight. The CMA and migrate-out logic was sort of
retrofitted and I think it's just incomplete, yet.

> 
> Then I guess the next question is 'Should we allow pinning of hugetlb pages
> in these areas?'.  My first thought would be no.  But, recall it was 'allowed'
> until that commit which changed try_grab_page to try_grab_compound_page.
> In the 'common' case of compaction, we do not attempt to migrate/move hugetlb
> pages (last time I looked), so long term pinning should not be an issue.
> However, for things like memory offline or alloc_contig_range() we want to
> migrate hugetlb pages, so this would be an issue there.
> 
> At a minimum, I think the warning should go.

Agreed. That, and either a) bring try_grab_folio() and try_grab_page() into
the same behavior with respect to checking for pinnable, or b) lifting
is_pinnable_page() out of try_grab_folio() and letting the callers decide, as
mentioned above.

Your point above makes it seem like the flexibility of (b) might be better...

thanks,
-- 
John Hubbard
NVIDIA



  reply	other threads:[~2022-05-18  7:12 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-13  9:03 [syzbot] WARNING in follow_hugetlb_page syzbot
2022-05-13 16:43 ` syzbot
2022-05-13 17:26   ` Andrew Morton
2022-05-13 18:09     ` Mike Kravetz
2022-05-13 22:48       ` Mike Kravetz
2022-05-13 23:19         ` Andrew Morton
2022-05-13 23:54           ` Minchan Kim
2022-05-14  0:09             ` John Hubbard
2022-05-14  0:26               ` Minchan Kim
2022-05-14  0:56                 ` John Hubbard
2022-05-14  1:16                   ` John Hubbard
2022-05-17  3:37                   ` Mike Kravetz
2022-05-18  7:12                     ` John Hubbard [this message]
2022-05-20 22:19                     ` Minchan Kim
2022-05-20 22:56                       ` John Hubbard
2022-05-20 23:25                         ` Minchan Kim
2022-05-20 23:31                         ` Mike Kravetz
2022-05-20 23:43                           ` Minchan Kim
2022-05-21  0:04                             ` Mike Kravetz
2022-05-21 15:24                               ` Minchan Kim
2022-05-21 15:51                                 ` David Hildenbrand
2022-05-21 16:36                                   ` Minchan Kim
2022-05-21 16:46                                     ` David Hildenbrand
2022-05-21 18:25                                       ` Minchan Kim
2022-05-21 23:50                                         ` Mike Kravetz
2022-05-14  0:18             ` Andrew Morton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d0abf3db-0726-81b9-5a44-36cb1736a155@nvidia.com \
    --to=jhubbard@nvidia.com \
    --cc=akpm@linux-foundation.org \
    --cc=david@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=llvm@lists.linux.dev \
    --cc=mike.kravetz@oracle.com \
    --cc=minchan@kernel.org \
    --cc=nathan@kernel.org \
    --cc=ndesaulniers@google.com \
    --cc=sfr@canb.auug.org.au \
    --cc=syzbot+acf65ca584991f3cc447@syzkaller.appspotmail.com \
    --cc=syzkaller-bugs@googlegroups.com \
    --cc=trix@redhat.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.