From: Yang Shi <shy828301@gmail.com>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Jann Horn <jannh@google.com>, John Hubbard <jhubbard@nvidia.com>,
	Matthew Wilcox <willy@infradead.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linux-MM <linux-mm@kvack.org>,
	kernel list <linux-kernel@vger.kernel.org>,
	"Kirill A . Shutemov" <kirill@shutemov.name>,
	Jan Kara <jack@suse.cz>, stable <stable@vger.kernel.org>
Subject: Re: [PATCH v2] mm/gup: fix try_grab_compound_head() race with split_huge_page()
Date: Wed, 16 Jun 2021 11:40:50 -0700
Message-ID: <CAHbLzkpa5MQBtYcRPWu4vNDn=Q8SKStQ-9wKYWogqRrMR3Aonw@mail.gmail.com>
In-Reply-To: <6d21f8cb-4b72-bdec-386c-684ddbcdada1@suse.cz>

On Wed, Jun 16, 2021 at 10:27 AM Vlastimil Babka <vbabka@suse.cz> wrote:
>
> On 6/16/21 1:10 AM, Yang Shi wrote:
> > On Tue, Jun 15, 2021 at 5:10 AM Jann Horn <jannh@google.com> wrote:
> >>
> >> On Tue, Jun 15, 2021 at 8:37 AM John Hubbard <jhubbard@nvidia.com> wrote:
> >> > On 6/14/21 6:20 PM, Jann Horn wrote:
> >> > > try_grab_compound_head() is used to grab a reference to a page from
> >> > > get_user_pages_fast(), which is only protected against concurrent
> >> > > freeing of page tables (via local_irq_save()), but not against
> >> > > concurrent TLB flushes, freeing of data pages, or splitting of compound
> >> > > pages.
> >> [...]
> >> > Reviewed-by: John Hubbard <jhubbard@nvidia.com>
> >>
> >> Thanks!
> >>
> >> [...]
> >> > > @@ -55,8 +72,23 @@ static inline struct page *try_get_compound_head(struct page *page, int refs)
> >> > >  	if (WARN_ON_ONCE(page_ref_count(head) < 0))
> >> > >  		return NULL;
> >> > >  	if (unlikely(!page_cache_add_speculative(head, refs)))
> >> > >  		return NULL;
> >> > > +
> >> > > +	/*
> >> > > +	 * At this point we have a stable reference to the head page; but it
> >> > > +	 * could be that between the compound_head() lookup and the refcount
> >> > > +	 * increment, the compound page was split, in which case we'd end up
> >> > > +	 * holding a reference on a page that has nothing to do with the page
> >> > > +	 * we were given anymore.
> >> > > +	 * So now that the head page is stable, recheck that the pages still
> >> > > +	 * belong together.
> >> > > +	 */
> >> > > +	if (unlikely(compound_head(page) != head)) {
> >> >
> >> > I was just wondering about what all could happen here. Such as: page gets split,
> >> > reallocated into a different-sized compound page, one that still has page pointing
> >> > to head. I think that's OK, because we don't look at or change other huge page
> >> > fields.
> >> >
> >> > But I thought I'd mention the idea in case anyone else has any clever ideas about
> >> > how this simple check might be insufficient here. It seems fine to me, but I
> >> > routinely lack enough imagination about concurrent operations. :)
> >>
> >> Hmmm... I think the scariest aspect here is probably the interaction
> >> with concurrent allocation of a compound page on architectures with
> >> store-store reordering (like ARM). *If* the page allocator handled
> >> compound pages with lockless, non-atomic percpu freelists, I think it
> >> might be possible that the zeroing of tail_page->compound_head in
> >> put_page() could be reordered after the page has been freed,
> >> reallocated and set to refcount 1 again?
> >>
> >> That shouldn't be possible at the moment, but it is still a bit scary.
> >
> > It might be possible after Mel's "mm/page_alloc: Allow high-order
> > pages to be stored on the per-cpu lists" patch
> > (https://patchwork.kernel.org/project/linux-mm/patch/20210611135753.GC30378@techsingularity.net/).
>
> Those would be percpu indeed, but not "lockless, non-atomic", no? They are
> protected by a local_lock.

The local_lock is *not* a lock on a non-PREEMPT_RT kernel, IIUC. It
only disables preemption and IRQs, and disabling preemption is a no-op
on a non-preemptible kernel. Disabling IRQs does guarantee atomic
context, but I'm not sure whether that is equivalent to the "atomic
freelists" Jann had in mind.

>
> >>
> >>
> >> I think the lockless page cache code also has to deal with somewhat
> >> similar ordering concerns when it uses page_cache_get_speculative(),
> >> e.g. in mapping_get_entry() - first it looks up a page pointer with
> >> xas_load(), and any access to the page later on would be a _dependent
> >> load_, but if the page then gets freed, reallocated, and inserted into
> >> the page cache again before the refcount increment and the re-check
> >> using xas_reload(), then there would be no data dependency from
> >> xas_reload() to the following use of the page...
> >>
> >
>
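For readers following the thread, here is a minimal userspace sketch of the "look up head, take a speculative reference, then recheck" pattern that the quoted hunk adds. It is a model only, not the kernel code: struct model_page, the model_* helpers and race_hook are hypothetical names invented for this illustration, C11 atomics stand in for the kernel's refcount primitives, and dropping the references when the recheck fails is an assumption, since the quoted hunk is cut off at the opening brace. The sketch also does not reproduce the store-store reordering concern discussed above; it only shows why the recheck catches a split that lands between the lookup and the refcount increment.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; NOT the kernel layout. */
struct model_page {
	atomic_int refcount;               /* 0 means "free" in this model */
	_Atomic(struct model_page *) head; /* head page; points to itself for a head page */
};

static struct model_page *model_compound_head(struct model_page *page)
{
	return atomic_load_explicit(&page->head, memory_order_acquire);
}

/* Add refs unless the count is zero (loosely models page_cache_add_speculative()). */
static bool model_add_speculative(struct model_page *head, int refs)
{
	int old = atomic_load_explicit(&head->refcount, memory_order_relaxed);

	do {
		if (old == 0)
			return false; /* page is free, do not resurrect it */
	} while (!atomic_compare_exchange_weak_explicit(&head->refcount, &old,
							old + refs,
							memory_order_acq_rel,
							memory_order_relaxed));
	return true;
}

/* Models a split: the tail page stops pointing at its old head. */
static void model_split(struct model_page *tail)
{
	atomic_store_explicit(&tail->head, tail, memory_order_release);
}

/*
 * race_hook is a test-only hook (an assumption of this sketch) that runs
 * between the head lookup and the refcount increment, which is the window
 * the recheck is meant to cover.
 */
static struct model_page *
model_try_get_compound_head(struct model_page *page, int refs,
			    void (*race_hook)(struct model_page *))
{
	struct model_page *head = model_compound_head(page);

	assert(refs > 0);
	if (race_hook)
		race_hook(page);

	if (!model_add_speculative(head, refs))
		return NULL;

	/* The head is now pinned; recheck that page still belongs to it. */
	if (model_compound_head(page) != head) {
		/* Assumed cleanup: give back the references we just took. */
		atomic_fetch_sub_explicit(&head->refcount, refs,
					  memory_order_release);
		return NULL;
	}
	return head;
}
```

Calling model_try_get_compound_head() with a NULL hook returns the head and raises its count by refs; passing model_split as the hook makes the recheck fail, and the head's count ends up unchanged.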