From: Jann Horn <jannh@google.com>
To: John Hubbard <jhubbard@nvidia.com>, Matthew Wilcox <willy@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Linux-MM <linux-mm@kvack.org>,
	kernel list <linux-kernel@vger.kernel.org>,
	"Kirill A . Shutemov" <kirill@shutemov.name>,
	Jan Kara <jack@suse.cz>, stable <stable@vger.kernel.org>
Subject: Re: [PATCH v2] mm/gup: fix try_grab_compound_head() race with split_huge_page()
Date: Tue, 15 Jun 2021 14:09:38 +0200
Message-ID: <CAG48ez3Vbcvh4AisU7=ukeJeSjHGTKQVd0NOU6XOpRru7oP_ig@mail.gmail.com>
In-Reply-To: <50d828d1-2ce6-21b4-0e27-fb15daa77561@nvidia.com>

On Tue, Jun 15, 2021 at 8:37 AM John Hubbard <jhubbard@nvidia.com> wrote:
> On 6/14/21 6:20 PM, Jann Horn wrote:
> > try_grab_compound_head() is used to grab a reference to a page from
> > get_user_pages_fast(), which is only protected against concurrent
> > freeing of page tables (via local_irq_save()), but not against
> > concurrent TLB flushes, freeing of data pages, or splitting of compound
> > pages.
[...]
> Reviewed-by: John Hubbard <jhubbard@nvidia.com>

Thanks!

[...]
> > @@ -55,8 +72,23 @@ static inline struct page *try_get_compound_head(struct page *page, int refs)
> >       if (WARN_ON_ONCE(page_ref_count(head) < 0))
> >               return NULL;
> >       if (unlikely(!page_cache_add_speculative(head, refs)))
> >               return NULL;
> > +
> > +     /*
> > +      * At this point we have a stable reference to the head page; but it
> > +      * could be that between the compound_head() lookup and the refcount
> > +      * increment, the compound page was split, in which case we'd end up
> > +      * holding a reference on a page that has nothing to do with the page
> > +      * we were given anymore.
> > +      * So now that the head page is stable, recheck that the pages still
> > +      * belong together.
> > +      */
> > +     if (unlikely(compound_head(page) != head)) {
>
> I was just wondering about what all could happen here. Such as: page gets split,
> reallocated into a different-sized compound page, one that still has page pointing
> to head. I think that's OK, because we don't look at or change other huge page
> fields.
>
> But I thought I'd mention the idea in case anyone else has any clever ideas about
> how this simple check might be insufficient here. It seems fine to me, but I
> routinely lack enough imagination about concurrent operations. :)

Hmmm... I think the scariest aspect here is probably the interaction
with concurrent allocation of a compound page on architectures with
store-store reordering (like ARM). *If* the page allocator handled
compound pages with lockless, non-atomic percpu freelists, I think it
might be possible that the zeroing of tail_page->compound_head in
put_page() could be reordered after the page has been freed,
reallocated and set to refcount 1 again?

That shouldn't be possible at the moment, but it is still a bit scary.


I think the lockless page cache code also has to deal with somewhat
similar ordering concerns when it uses page_cache_get_speculative(),
e.g. in mapping_get_entry(). First it looks up a page pointer with
xas_load(), and any later access to the page would be a _dependent
load_. But if the page then gets freed, reallocated, and inserted into
the page cache again before the refcount increment and the re-check
using xas_reload(), then there would be no data dependency from
xas_reload() to the following use of the page...
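
The "look up, take a speculative reference, then re-check" pattern
discussed above can be sketched in userspace C11. This is only a toy
model under stated assumptions, not the kernel code: struct toy_page,
toy_get_unless_zero(), toy_put(), toy_try_get_head() and the test-only
toy_simulate_split_of hook are all hypothetical names. Dropping the
speculative reference when the re-check fails is also an assumption,
since the quoted hunk is cut off at that point. The sketch uses
sequentially consistent atomics throughout, so it sidesteps the
weak-ordering question raised above rather than answering it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct page; NOT the kernel's layout. */
struct toy_page {
	atomic_int refcount;
	_Atomic(struct toy_page *) head; /* points to itself if not a tail */
};

/* Test-only hook (hypothetical): if set, this page is "split off" in the
 * window between the head lookup and the refcount increment. */
struct toy_page *toy_simulate_split_of;

void toy_page_init(struct toy_page *p, int refcount, struct toy_page *head)
{
	atomic_init(&p->refcount, refcount);
	atomic_init(&p->head, head);
}

/* Add refs to the count unless it is zero (page already on its way out). */
bool toy_get_unless_zero(struct toy_page *p, int refs)
{
	int old = atomic_load(&p->refcount);

	do {
		if (old == 0)
			return false;
	} while (!atomic_compare_exchange_weak(&p->refcount, &old, old + refs));
	return true;
}

void toy_put(struct toy_page *p, int refs)
{
	atomic_fetch_sub(&p->refcount, refs);
}

/* Returns the head with refs extra references held, or NULL on failure. */
struct toy_page *toy_try_get_head(struct toy_page *page, int refs)
{
	struct toy_page *head;

	assert(refs > 0);
	head = atomic_load(&page->head);

	/* Race window: a concurrent split could happen right here. */
	if (toy_simulate_split_of) {
		atomic_store(&toy_simulate_split_of->head, toy_simulate_split_of);
		toy_simulate_split_of = NULL;
	}

	if (!toy_get_unless_zero(head, refs))
		return NULL;

	/* The head is now pinned; re-check that page still belongs to it. */
	if (atomic_load(&page->head) != head) {
		toy_put(head, refs); /* assumed cleanup: drop the stale reference */
		return NULL;
	}
	return head;
}
```

Calling toy_try_get_head() on a tail whose head has a nonzero refcount
returns the head with the extra references held. With
toy_simulate_split_of pointing at that tail, the call returns NULL and
leaves the old head's refcount unchanged.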
