From: Yang Shi <shy828301@gmail.com>
To: Jann Horn <jannh@google.com>
Cc: John Hubbard <jhubbard@nvidia.com>,
	Matthew Wilcox <willy@infradead.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linux-MM <linux-mm@kvack.org>,
	kernel list <linux-kernel@vger.kernel.org>,
	"Kirill A . Shutemov" <kirill@shutemov.name>,
	Jan Kara <jack@suse.cz>, stable <stable@vger.kernel.org>
Subject: Re: [PATCH v2] mm/gup: fix try_grab_compound_head() race with split_huge_page()
Date: Tue, 15 Jun 2021 16:10:46 -0700	[thread overview]
Message-ID: <CAHbLzkomex+fgC8RyogXu-s5o2UrORMO6D2yTsSXW5Wo5z9WRA@mail.gmail.com> (raw)
In-Reply-To: <CAG48ez3Vbcvh4AisU7=ukeJeSjHGTKQVd0NOU6XOpRru7oP_ig@mail.gmail.com>

On Tue, Jun 15, 2021 at 5:10 AM Jann Horn <jannh@google.com> wrote:
>
> On Tue, Jun 15, 2021 at 8:37 AM John Hubbard <jhubbard@nvidia.com> wrote:
> > On 6/14/21 6:20 PM, Jann Horn wrote:
> > > try_grab_compound_head() is used to grab a reference to a page from
> > > get_user_pages_fast(), which is only protected against concurrent
> > > freeing of page tables (via local_irq_save()), but not against
> > > concurrent TLB flushes, freeing of data pages, or splitting of compound
> > > pages.
> [...]
> > Reviewed-by: John Hubbard <jhubbard@nvidia.com>
>
> Thanks!
>
> [...]
> > > @@ -55,8 +72,23 @@ static inline struct page *try_get_compound_head(struct page *page, int refs)
> > >       if (WARN_ON_ONCE(page_ref_count(head) < 0))
> > >               return NULL;
> > >       if (unlikely(!page_cache_add_speculative(head, refs)))
> > >               return NULL;
> > > +
> > > +     /*
> > > +      * At this point we have a stable reference to the head page; but it
> > > +      * could be that between the compound_head() lookup and the refcount
> > > +      * increment, the compound page was split, in which case we'd end up
> > > +      * holding a reference on a page that has nothing to do with the page
> > > +      * we were given anymore.
> > > +      * So now that the head page is stable, recheck that the pages still
> > > +      * belong together.
> > > +      */
> > > +     if (unlikely(compound_head(page) != head)) {
> >
> > I was just wondering about what all could happen here. Such as: page gets split,
> > reallocated into a different-sized compound page, one that still has page pointing
> > to head. I think that's OK, because we don't look at or change other huge page
> > fields.
> >
> > But I thought I'd mention the idea in case anyone else has any clever ideas about
> > how this simple check might be insufficient here. It seems fine to me, but I
> > routinely lack enough imagination about concurrent operations. :)
>
> Hmmm... I think the scariest aspect here is probably the interaction
> with concurrent allocation of a compound page on architectures with
> store-store reordering (like ARM). *If* the page allocator handled
> compound pages with lockless, non-atomic percpu freelists, I think it
> might be possible that the zeroing of tail_page->compound_head in
> put_page() could be reordered after the page has been freed,
> reallocated and set to refcount 1 again?
>
> That shouldn't be possible at the moment, but it is still a bit scary.

It might become possible with Mel's "mm/page_alloc: Allow high-order
pages to be stored on the per-cpu lists" patch
(https://patchwork.kernel.org/project/linux-mm/patch/20210611135753.GC30378@techsingularity.net/).
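
For reference, here is a rough userspace model of the "speculative ref
+ recheck" pattern the hunk above adds. It is purely illustrative: the
names fake_page, head_of() and try_grab_head() are invented, not kernel
APIs, and C11 atomics stand in for the kernel's page refcount helpers.
It just shows why the recheck catches a split that races with the
refcount bump:

#include <stdatomic.h>
#include <stddef.h>

struct fake_page {
	atomic_int refcount;                        /* meaningful on head pages only */
	_Atomic(struct fake_page *) compound_head;  /* NULL if this page is a head */
};

static struct fake_page *head_of(struct fake_page *page)
{
	struct fake_page *head = atomic_load(&page->compound_head);

	return head ? head : page;
}

/* Try to take @refs references on the head of @page; NULL means "retry". */
static struct fake_page *try_grab_head(struct fake_page *page, int refs)
{
	struct fake_page *head = head_of(page);
	int old = atomic_load(&head->refcount);

	/* Speculative grab: only bump the count if the head is still live. */
	do {
		if (old <= 0)
			return NULL;
	} while (!atomic_compare_exchange_weak(&head->refcount, &old, old + refs));

	/*
	 * Recheck: if the compound page was split between head_of() and the
	 * refcount bump, @page no longer points at @head and we are holding
	 * references on an unrelated page; drop them and make the caller retry.
	 */
	if (head_of(page) != head) {
		atomic_fetch_sub(&head->refcount, refs);
		return NULL;
	}
	return head;
}

A NULL return here just means the fast path has to back off, which is
roughly what the GUP-fast callers do once the speculative reference has
been dropped.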

>
>
> I think the lockless page cache code also has to deal with somewhat
> similar ordering concerns when it uses page_cache_get_speculative(),
> e.g. in mapping_get_entry() - first it looks up a page pointer with
> xas_load(), and any access to the page later on would be a _dependent
> load_, but if the page then gets freed, reallocated, and inserted into
> the page cache again before the refcount increment and the re-check
> using xas_reload(), then there would be no data dependency from
> xas_reload() to the following use of the page...
>

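The page-cache lookup Jann describes follows the same shape. A minimal
sketch, again with invented names (slot, lookup_slot) standing in for
the XArray slot and mapping_get_entry(), might look like the following.
Note that the recheck only helps if the refcount RMW is ordered before
the reload (seq_cst here), and, as Jann points out, later accesses to
the entry's contents carry no data dependency on that reload:

#include <stdatomic.h>
#include <stddef.h>

struct entry {
	atomic_int refcount;
};

/* One slot of a lockless index, standing in for an XArray slot. */
static _Atomic(struct entry *) slot;

static struct entry *lookup_slot(void)
{
	struct entry *e = atomic_load(&slot);   /* the "xas_load()" step */
	int old;

	if (!e)
		return NULL;

	/* Speculative reference, like page_cache_get_speculative(). */
	old = atomic_load(&e->refcount);
	do {
		if (old <= 0)
			return NULL;            /* entry already freed; caller retries */
	} while (!atomic_compare_exchange_weak(&e->refcount, &old, old + 1));

	/*
	 * Recheck the slot (the "xas_reload()" step): the entry may have been
	 * freed, reallocated and reinserted meanwhile.  The seq_cst RMW above
	 * keeps this load from being satisfied before the refcount bump; what
	 * it does NOT give us is any data dependency from this reload to later
	 * reads of *e, which is the subtle part mentioned above.
	 */
	if (atomic_load(&slot) != e) {
		atomic_fetch_sub(&e->refcount, 1);
		return NULL;
	}
	return e;
}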