linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Pavel Tatashin <pasha.tatashin@soleen.com>
To: Jason Gunthorpe <jgg@ziepe.ca>
Cc: LKML <linux-kernel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Vlastimil Babka <vbabka@suse.cz>, Michal Hocko <mhocko@suse.com>,
	David Hildenbrand <david@redhat.com>,
	Oscar Salvador <osalvador@suse.de>,
	Dan Williams <dan.j.williams@intel.com>,
	Sasha Levin <sashal@kernel.org>,
	Tyler Hicks <tyhicks@linux.microsoft.com>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	mike.kravetz@oracle.com, Steven Rostedt <rostedt@goodmis.org>,
	Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Mel Gorman <mgorman@suse.de>,
	Matthew Wilcox <willy@infradead.org>,
	David Rientjes <rientjes@google.com>,
	John Hubbard <jhubbard@nvidia.com>
Subject: Re: [PATCH 6/6] mm/gup: migrate pinned pages out of movable zone
Date: Thu, 3 Dec 2020 14:15:36 -0500	[thread overview]
Message-ID: <CA+CK2bAErUyg26UNKQJwaZ7wsv4+qv4Ws58Vdmv+QZfLDyrkeQ@mail.gmail.com> (raw)
In-Reply-To: <CA+CK2bBRqK6_ZOkTbz8qe4Gz6T5pwt-KjJQ0+NzYOC1qO1sPMg@mail.gmail.com>

> > > > Looking at this code some more.. How is it even correct?
> > > >
> > > > 1633                            if (!isolate_lru_page(head)) {
> > > > 1634                                    list_add_tail(&head->lru, &cma_page_list);
> > > >
> > > > Here we are only running under the read side of the mmap sem so multiple
> > > > GUPs can be calling that sequence in parallel. I don't see any
> > > > obvious exclusion that will prevent corruption of head->lru. The first
> > > > GUP thread to do isolate_lru_page() will ClearPageLRU() and the second
> > > > GUP thread will be a NOP for isolate_lru_page().
> > > >
> > > > They will both race list_add_tail and other list ops. That is not OK.
> > >
> > > Good question. I studied it, and I do not see how this is OK. Worse,
> > > this race is also exposable as a syscall instead of via driver: two
> > > move_pages() run simultaneously. Perhaps in other places?
> > >
> > > move_pages()
> > >   kernel_move_pages()
> > >     mmget()
> > >     do_pages_move()
> > >       add_page_for_migratio()
> > >          mmap_read_lock(mm);
> > >          list_add_tail(&head->lru, pagelist); <- Not protected
> >
> > When this was CMA only it might have been rarer to trigger, but this
> > move stuff sounds like it makes it much more broadly, eg on typical
> > servers with RDMA exposed/etc
> >
> > Seems like it needs fixing as part of this too :\
>
> Just to clarify the stack that I showed above is outside of gup, it is
> the same issue that you pointed out that happens elsewhere. I suspect
> there might be more. All of them should be addressed together.

Hi Jason,

I studied some more, and I think this is not a race:
list_add_tail(&head->lru, &cma_page_list) is called only when
isolate_lru_page(head) succeeds.
isolate_lru_page(head) succeeds only when PageLRU(head) is true.
However, in this function we also clear LRU flag before returning
success.
This means, that if we race with another thread, the other thread
won't get to unprotected list_add_tail(&head->lru, &cma_page_list)
until head is is back on LRU list.

Please let me know if I am missing anything.

Thank you,
Pasha

  reply	other threads:[~2020-12-03 19:16 UTC|newest]

Thread overview: 67+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-02  5:23 [PATCH 0/6] prohibit pinning pages in ZONE_MOVABLE Pavel Tatashin
2020-12-02  5:23 ` [PATCH 1/6] mm/gup: perform check_dax_vmas only when FS_DAX is enabled Pavel Tatashin
2020-12-02 16:22   ` Ira Weiny
2020-12-02 18:15     ` Pavel Tatashin
2020-12-02 16:29   ` Jason Gunthorpe
2020-12-02 18:16     ` Pavel Tatashin
2020-12-03  7:59   ` John Hubbard
2020-12-03 14:52     ` Pavel Tatashin
2020-12-02  5:23 ` [PATCH 2/6] mm/gup: don't pin migrated cma pages in movable zone Pavel Tatashin
2020-12-02 16:31   ` David Hildenbrand
2020-12-02 18:17     ` Pavel Tatashin
2020-12-03  8:01   ` John Hubbard
2020-12-03  8:46   ` Michal Hocko
2020-12-03 14:58     ` Pavel Tatashin
2020-12-02  5:23 ` [PATCH 3/6] mm/gup: make __gup_longterm_locked common Pavel Tatashin
2020-12-02 16:31   ` Ira Weiny
2020-12-02 16:33     ` Ira Weiny
2020-12-02 18:19       ` Pavel Tatashin
2020-12-03  0:03         ` Pavel Tatashin
2020-12-03  8:03   ` John Hubbard
2020-12-03 15:02     ` Pavel Tatashin
2020-12-02  5:23 ` [PATCH 4/6] mm cma: rename PF_MEMALLOC_NOCMA to PF_MEMALLOC_NOMOVABLE Pavel Tatashin
2020-12-03  8:04   ` John Hubbard
2020-12-03 15:02     ` Pavel Tatashin
2020-12-03  8:57   ` Michal Hocko
2020-12-03 15:02     ` Pavel Tatashin
2020-12-02  5:23 ` [PATCH 5/6] mm: honor PF_MEMALLOC_NOMOVABLE for all allocations Pavel Tatashin
2020-12-03  8:17   ` John Hubbard
2020-12-03 15:06     ` Pavel Tatashin
2020-12-03 16:51       ` John Hubbard
2020-12-03  9:17   ` Michal Hocko
2020-12-03 15:15     ` Pavel Tatashin
2020-12-04  8:43       ` Michal Hocko
2020-12-04  8:54         ` Michal Hocko
2020-12-04 16:07           ` Pavel Tatashin
2020-12-02  5:23 ` [PATCH 6/6] mm/gup: migrate pinned pages out of movable zone Pavel Tatashin
2020-12-02 16:35   ` Jason Gunthorpe
2020-12-03  0:19     ` Pavel Tatashin
2020-12-03  1:08       ` Jason Gunthorpe
2020-12-03  1:34         ` Pavel Tatashin
2020-12-03 14:17           ` Jason Gunthorpe
2020-12-03 16:40             ` Pavel Tatashin
2020-12-03 16:59               ` Jason Gunthorpe
2020-12-03 17:14                 ` Pavel Tatashin
2020-12-03 19:15                   ` Pavel Tatashin [this message]
2020-12-03 19:36                     ` Jason Gunthorpe
2020-12-04 16:24                       ` Pavel Tatashin
2020-12-04 17:06                         ` Jason Gunthorpe
2020-12-04 20:05             ` Daniel Jordan
2020-12-04 20:16               ` Pavel Tatashin
2020-12-08  2:27                 ` Daniel Jordan
2020-12-04 20:52               ` Jason Gunthorpe
2020-12-08  2:48                 ` Daniel Jordan
2020-12-08 13:24                   ` Jason Gunthorpe
2020-12-03  8:22   ` John Hubbard
2020-12-03 15:55     ` Pavel Tatashin
2020-12-04  4:13   ` Joonsoo Kim
2020-12-04 17:43     ` Pavel Tatashin
2020-12-07  7:13       ` Joonsoo Kim
2020-12-04  4:02 ` [PATCH 0/6] prohibit pinning pages in ZONE_MOVABLE Joonsoo Kim
2020-12-04 15:55   ` Pavel Tatashin
2020-12-04 16:10     ` Jason Gunthorpe
2020-12-04 17:50       ` Pavel Tatashin
2020-12-04 18:01         ` David Hildenbrand
2020-12-04 18:10           ` Pavel Tatashin
2020-12-07  7:12         ` Joonsoo Kim
2020-12-07 12:13           ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CA+CK2bAErUyg26UNKQJwaZ7wsv4+qv4Ws58Vdmv+QZfLDyrkeQ@mail.gmail.com \
    --to=pasha.tatashin@soleen.com \
    --cc=akpm@linux-foundation.org \
    --cc=dan.j.williams@intel.com \
    --cc=david@redhat.com \
    --cc=iamjoonsoo.kim@lge.com \
    --cc=jgg@ziepe.ca \
    --cc=jhubbard@nvidia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@suse.de \
    --cc=mhocko@suse.com \
    --cc=mike.kravetz@oracle.com \
    --cc=mingo@redhat.com \
    --cc=osalvador@suse.de \
    --cc=peterz@infradead.org \
    --cc=rientjes@google.com \
    --cc=rostedt@goodmis.org \
    --cc=sashal@kernel.org \
    --cc=tyhicks@linux.microsoft.com \
    --cc=vbabka@suse.cz \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).