From: Dan Williams <dan.j.williams@intel.com> To: "Jérôme Glisse" <jglisse@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com>, John Hubbard <john.hubbard@gmail.com>, Andrew Morton <akpm@linux-foundation.org>, Linux MM <linux-mm@kvack.org>, Jan Kara <jack@suse.cz>, tom@talpey.com, Al Viro <viro@zeniv.linux.org.uk>, benve@cisco.com, Christoph Hellwig <hch@infradead.org>, Christopher Lameter <cl@linux.com>, "Dalessandro, Dennis" <dennis.dalessandro@intel.com>, Doug Ledford <dledford@redhat.com>, Jason Gunthorpe <jgg@ziepe.ca>, Matthew Wilcox <willy@infradead.org>, Michal Hocko <mhocko@kernel.org>, mike.marciniszyn@intel.com, rcampbell@nvidia.com, Linux Kernel Mailing List <linux-kernel@vger.kernel.org>, linux-fsdevel <linux-fsdevel@vger.kernel.org> Subject: Re: [PATCH 1/2] mm: introduce put_user_page*(), placeholder versions Date: Tue, 4 Dec 2018 16:40:07 -0800 Message-ID: <CAPcyv4hhkB09eBzPErVntZrNYE0RP5+6Z6+sP0kq6ng9ZOzTfg@mail.gmail.com> (raw) In-Reply-To: <20181205003648.GT2937@redhat.com> On Tue, Dec 4, 2018 at 4:37 PM Jerome Glisse <jglisse@redhat.com> wrote: > > On Tue, Dec 04, 2018 at 03:03:02PM -0800, Dan Williams wrote: > > On Tue, Dec 4, 2018 at 1:56 PM John Hubbard <jhubbard@nvidia.com> wrote: > > > > > > On 12/4/18 12:28 PM, Dan Williams wrote: > > > > On Mon, Dec 3, 2018 at 4:17 PM <john.hubbard@gmail.com> wrote: > > > >> > > > >> From: John Hubbard <jhubbard@nvidia.com> > > > >> > > > >> Introduces put_user_page(), which simply calls put_page(). > > > >> This provides a way to update all get_user_pages*() callers, > > > >> so that they call put_user_page(), instead of put_page(). > > > >> > > > >> Also introduces put_user_pages(), and a few dirty/locked variations, > > > >> as a replacement for release_pages(), and also as a replacement > > > >> for open-coded loops that release multiple pages. > > > >> These may be used for subsequent performance improvements, > > > >> via batching of pages to be released. > > > >> > > > >> This is the first step of fixing the problem described in [1]. The steps > > > >> are: > > > >> > > > >> 1) (This patch): provide put_user_page*() routines, intended to be used > > > >> for releasing pages that were pinned via get_user_pages*(). > > > >> > > > >> 2) Convert all of the call sites for get_user_pages*(), to > > > >> invoke put_user_page*(), instead of put_page(). This involves dozens of > > > >> call sites, and will take some time. > > > >> > > > >> 3) After (2) is complete, use get_user_pages*() and put_user_page*() to > > > >> implement tracking of these pages. This tracking will be separate from > > > >> the existing struct page refcounting. > > > >> > > > >> 4) Use the tracking and identification of these pages, to implement > > > >> special handling (especially in writeback paths) when the pages are > > > >> backed by a filesystem. Again, [1] provides details as to why that is > > > >> desirable. > > > > > > > > I thought at Plumbers we talked about using a page bit to tag pages > > > > that have had their reference count elevated by get_user_pages()? That > > > > way there is no need to distinguish put_page() from put_user_page() it > > > > just happens internally to put_page(). At the conference Matthew was > > > > offering to free up a page bit for this purpose. > > > > > > > > > > ...but then, upon further discussion in that same session, we realized that > > > that doesn't help. You need a reference count. Otherwise a random put_page > > > could affect your dma-pinned pages, etc, etc. > > > > Ok, sorry, I mis-remembered. So, you're effectively trying to capture > > the end of the page pin event separate from the final 'put' of the > > page? Makes sense. > > > > > I was not able to actually find any place where a single additional page > > > bit would help our situation, which is why this still uses LRU fields for > > > both the two bits required (the RFC [1] still applies), and the dma_pinned_count. > > > > Except the LRU fields are already in use for ZONE_DEVICE pages... how > > does this proposal interact with those? > > > > > [1] https://lore.kernel.org/r/20181110085041.10071-7-jhubbard@nvidia.com > > > > > > >> [1] https://lwn.net/Articles/753027/ : "The Trouble with get_user_pages()" > > > >> > > > >> Reviewed-by: Jan Kara <jack@suse.cz> > > > > > > > > Wish, you could have been there Jan. I'm missing why it's safe to > > > > assume that a single put_user_page() is paired with a get_user_page()? > > > > > > > > > > A put_user_page() per page, or a put_user_pages() for an array of pages. See > > > patch 0002 for several examples. > > > > Yes, however I was more concerned about validation and trying to > > locate missed places where put_page() is used instead of > > put_user_page(). > > > > It would be interesting to see if we could have a debug mode where > > get_user_pages() returned dynamically allocated pages from a known > > address range and catch drivers that operate on a user-pinned page > > without using the proper helper to 'put' it. I think we might also > > need a ref_user_page() for drivers that may do their own get_page() > > and expect the dma_pinned_count to also increase. > > Total crazy idea for this, but this is the right time of day > for this (for me at least it is beer time :)) What about mapping > all struct page in two different range of kernel virtual address > and when get user space is use it returns a pointer from the second > range of kernel virtual address to the struct page. Then in put_page > you know for sure if the code putting the page got it from GUP or > from somewhere else. page_to_pfn() would need some trickery to > handle that. Yes, exactly what I was thinking, if only as a debug mode since instrumenting every pfn/page translation would be expensive. > Dunno if we are running out of kernel virtual address (outside > 32bits that i believe we are trying to shot down quietly behind > the bar). There's room, KASAN is in a roughly similar place.
next prev parent reply index Thread overview: 213+ messages / expand[flat|nested] mbox.gz Atom feed top 2018-12-04 0:17 [PATCH 0/2] put_user_page*(): start converting the call sites john.hubbard 2018-12-04 0:17 ` [PATCH 1/2] mm: introduce put_user_page*(), placeholder versions john.hubbard 2018-12-04 7:53 ` Mike Rapoport 2018-12-05 1:40 ` John Hubbard 2018-12-04 20:28 ` Dan Williams 2018-12-04 21:56 ` John Hubbard 2018-12-04 23:03 ` Dan Williams 2018-12-05 0:36 ` Jerome Glisse 2018-12-05 0:40 ` Dan Williams [this message] 2018-12-05 0:59 ` John Hubbard 2018-12-05 0:58 ` John Hubbard 2018-12-05 1:00 ` Dan Williams 2018-12-05 1:15 ` Matthew Wilcox 2018-12-05 1:44 ` Jerome Glisse 2018-12-05 1:57 ` John Hubbard 2018-12-07 2:45 ` John Hubbard 2018-12-07 19:16 ` Jerome Glisse 2018-12-07 19:26 ` Dan Williams 2018-12-07 19:40 ` Jerome Glisse 2018-12-08 0:52 ` John Hubbard 2018-12-08 2:24 ` Jerome Glisse 2018-12-10 10:28 ` Jan Kara 2018-12-12 15:03 ` Jerome Glisse 2018-12-12 16:27 ` Dan Williams 2018-12-12 17:02 ` Jerome Glisse 2018-12-12 17:49 ` Dan Williams 2018-12-12 19:07 ` John Hubbard 2018-12-12 21:30 ` Jerome Glisse 2018-12-12 21:40 ` Dan Williams 2018-12-12 21:53 ` Jerome Glisse 2018-12-12 22:11 ` Matthew Wilcox 2018-12-12 22:16 ` Jerome Glisse 2018-12-12 23:37 ` Jason Gunthorpe 2018-12-12 23:46 ` John Hubbard 2018-12-12 23:54 ` Dan Williams 2018-12-13 0:01 ` Jerome Glisse 2018-12-13 0:18 ` Dan Williams 2018-12-13 0:44 ` Jerome Glisse 2018-12-13 3:26 ` Jason Gunthorpe 2018-12-13 3:20 ` Jason Gunthorpe 2018-12-13 12:43 ` Jerome Glisse 2018-12-13 13:40 ` Tom Talpey 2018-12-13 14:18 ` Jerome Glisse 2018-12-13 14:51 ` Tom Talpey 2018-12-13 15:18 ` Jerome Glisse 2018-12-13 18:12 ` Tom Talpey 2018-12-13 19:18 ` Jerome Glisse 2018-12-14 10:41 ` Jan Kara 2018-12-14 15:25 ` Jerome Glisse 2018-12-12 21:56 ` John Hubbard 2018-12-12 22:04 ` Jerome Glisse 2018-12-12 22:11 ` John Hubbard 2018-12-12 22:14 ` Jerome Glisse 2018-12-12 22:17 ` John Hubbard 2018-12-12 21:46 ` Dave Chinner 2018-12-12 21:59 ` Jerome Glisse 2018-12-13 0:51 ` Dave Chinner 2018-12-13 2:02 ` Jerome Glisse 2018-12-13 15:56 ` Christopher Lameter 2018-12-13 16:02 ` Jerome Glisse 2018-12-14 6:00 ` Dave Chinner 2018-12-14 15:13 ` Jerome Glisse 2018-12-14 3:52 ` John Hubbard 2018-12-14 5:21 ` Dan Williams 2018-12-14 6:11 ` John Hubbard 2018-12-14 15:20 ` Jerome Glisse 2018-12-14 19:38 ` Dan Williams 2018-12-14 19:48 ` Matthew Wilcox 2018-12-14 19:53 ` Dave Hansen 2018-12-14 20:03 ` Matthew Wilcox 2018-12-14 20:17 ` Dan Williams 2018-12-14 20:29 ` Matthew Wilcox 2018-12-15 0:41 ` John Hubbard 2018-12-17 8:56 ` Jan Kara 2018-12-17 18:28 ` Dan Williams 2018-12-14 15:43 ` Jan Kara 2018-12-16 21:58 ` Dave Chinner 2018-12-17 18:11 ` Jerome Glisse 2018-12-17 18:34 ` Matthew Wilcox 2018-12-17 19:48 ` Jerome Glisse 2018-12-17 19:51 ` Matthew Wilcox 2018-12-17 19:54 ` Jerome Glisse 2018-12-17 19:59 ` Matthew Wilcox 2018-12-17 20:55 ` Jerome Glisse 2018-12-17 21:03 ` Matthew Wilcox 2018-12-17 21:15 ` Jerome Glisse 2018-12-18 1:09 ` Dave Chinner 2018-12-18 6:12 ` Darrick J. Wong 2018-12-18 9:30 ` Jan Kara 2018-12-18 23:29 ` John Hubbard 2018-12-19 2:07 ` Jerome Glisse 2018-12-19 11:08 ` Jan Kara 2018-12-20 10:54 ` John Hubbard 2018-12-20 16:50 ` Jerome Glisse 2018-12-20 16:57 ` Dan Williams 2018-12-20 16:49 ` Jerome Glisse 2019-01-03 1:55 ` Jerome Glisse 2019-01-03 3:27 ` John Hubbard 2019-01-03 14:57 ` Jerome Glisse 2019-01-03 9:26 ` Jan Kara 2019-01-03 14:44 ` Jerome Glisse 2019-01-11 2:59 ` John Hubbard 2019-01-11 2:59 ` John Hubbard 2019-01-11 16:51 ` Jerome Glisse 2019-01-11 16:51 ` Jerome Glisse 2019-01-12 1:04 ` John Hubbard 2019-01-12 1:04 ` John Hubbard 2019-01-12 2:02 ` Jerome Glisse 2019-01-12 2:02 ` Jerome Glisse 2019-01-12 2:38 ` John Hubbard 2019-01-12 2:38 ` John Hubbard 2019-01-12 2:46 ` Jerome Glisse 2019-01-12 2:46 ` Jerome Glisse 2019-01-12 3:06 ` John Hubbard 2019-01-12 3:06 ` John Hubbard 2019-01-12 3:25 ` Jerome Glisse 2019-01-12 3:25 ` Jerome Glisse 2019-01-12 20:46 ` John Hubbard 2019-01-12 20:46 ` John Hubbard 2019-01-14 14:54 ` Jan Kara 2019-01-14 14:54 ` Jan Kara 2019-01-14 17:21 ` Jerome Glisse 2019-01-14 17:21 ` Jerome Glisse 2019-01-14 19:09 ` John Hubbard 2019-01-14 19:09 ` John Hubbard 2019-01-15 8:34 ` Jan Kara 2019-01-15 8:34 ` Jan Kara 2019-01-15 21:39 ` John Hubbard 2019-01-15 21:39 ` John Hubbard 2019-01-15 8:07 ` Jan Kara 2019-01-15 8:07 ` Jan Kara 2019-01-15 17:15 ` Jerome Glisse 2019-01-15 17:15 ` Jerome Glisse 2019-01-15 21:56 ` John Hubbard 2019-01-15 21:56 ` John Hubbard 2019-01-15 22:12 ` Jerome Glisse 2019-01-15 22:12 ` Jerome Glisse 2019-01-16 0:44 ` John Hubbard 2019-01-16 0:44 ` John Hubbard 2019-01-16 1:56 ` Jerome Glisse 2019-01-16 1:56 ` Jerome Glisse 2019-01-16 2:01 ` Dan Williams 2019-01-16 2:01 ` Dan Williams 2019-01-16 2:23 ` Jerome Glisse 2019-01-16 2:23 ` Jerome Glisse 2019-01-16 4:34 ` Dave Chinner 2019-01-16 4:34 ` Dave Chinner 2019-01-16 14:50 ` Jerome Glisse 2019-01-16 14:50 ` Jerome Glisse 2019-01-16 22:51 ` Dave Chinner 2019-01-16 22:51 ` Dave Chinner 2019-01-16 11:38 ` Jan Kara 2019-01-16 11:38 ` Jan Kara 2019-01-16 13:08 ` Jerome Glisse 2019-01-16 13:08 ` Jerome Glisse 2019-01-17 5:42 ` John Hubbard 2019-01-17 5:42 ` John Hubbard 2019-01-17 15:21 ` Jerome Glisse 2019-01-17 15:21 ` Jerome Glisse 2019-01-18 0:16 ` Dave Chinner 2019-01-18 1:59 ` Jerome Glisse 2019-01-17 9:30 ` Jan Kara 2019-01-17 9:30 ` Jan Kara 2019-01-17 15:17 ` Jerome Glisse 2019-01-17 15:17 ` Jerome Glisse 2019-01-22 15:24 ` Jan Kara 2019-01-22 16:46 ` Jerome Glisse 2019-01-23 18:02 ` Jan Kara 2019-01-23 19:04 ` Jerome Glisse 2019-01-29 0:22 ` John Hubbard 2019-01-29 1:23 ` Jerome Glisse 2019-01-29 6:41 ` John Hubbard 2019-01-29 10:12 ` Jan Kara 2019-01-30 2:21 ` John Hubbard 2019-01-17 5:25 ` John Hubbard 2019-01-17 5:25 ` John Hubbard 2019-01-17 9:04 ` Jan Kara 2019-01-17 9:04 ` Jan Kara 2019-01-12 3:14 ` Jerome Glisse 2019-01-12 3:14 ` Jerome Glisse 2018-12-18 10:33 ` Jan Kara 2018-12-18 23:42 ` Dave Chinner 2018-12-19 3:03 ` Jason Gunthorpe 2018-12-19 5:26 ` Dan Williams 2018-12-19 11:19 ` Jan Kara 2018-12-19 10:28 ` Dave Chinner 2018-12-19 11:35 ` Jan Kara 2018-12-19 16:56 ` Jason Gunthorpe 2018-12-19 22:33 ` Dave Chinner 2018-12-20 9:07 ` Jan Kara 2018-12-20 16:54 ` Jerome Glisse 2018-12-19 13:24 ` Jan Kara 2018-12-08 5:18 ` Matthew Wilcox 2018-12-12 19:13 ` John Hubbard 2018-12-08 7:16 ` Dan Williams 2018-12-08 16:33 ` Jerome Glisse 2018-12-08 16:48 ` Christoph Hellwig 2018-12-08 17:47 ` Jerome Glisse 2018-12-08 18:26 ` Christoph Hellwig 2018-12-08 18:45 ` Jerome Glisse 2018-12-08 18:09 ` Dan Williams 2018-12-08 18:12 ` Christoph Hellwig 2018-12-11 6:18 ` Dave Chinner 2018-12-05 5:52 ` Dan Williams 2018-12-05 11:16 ` Jan Kara 2018-12-04 0:17 ` [PATCH 2/2] infiniband/mm: convert put_page() to put_user_page*() john.hubbard 2018-12-04 17:10 ` [PATCH 0/2] put_user_page*(): start converting the call sites David Laight 2018-12-05 1:05 ` John Hubbard 2018-12-05 14:08 ` David Laight 2018-12-28 8:37 ` Pavel Machek 2019-02-08 7:56 [PATCH 0/2] mm: put_user_page() call site conversion first john.hubbard 2019-02-08 7:56 ` [PATCH 1/2] mm: introduce put_user_page*(), placeholder versions john.hubbard 2019-02-08 10:32 ` Mike Rapoport 2019-02-08 20:44 ` John Hubbard
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=CAPcyv4hhkB09eBzPErVntZrNYE0RP5+6Z6+sP0kq6ng9ZOzTfg@mail.gmail.com \ --to=dan.j.williams@intel.com \ --cc=akpm@linux-foundation.org \ --cc=benve@cisco.com \ --cc=cl@linux.com \ --cc=dennis.dalessandro@intel.com \ --cc=dledford@redhat.com \ --cc=hch@infradead.org \ --cc=jack@suse.cz \ --cc=jgg@ziepe.ca \ --cc=jglisse@redhat.com \ --cc=jhubbard@nvidia.com \ --cc=john.hubbard@gmail.com \ --cc=linux-fsdevel@vger.kernel.org \ --cc=linux-kernel@vger.kernel.org \ --cc=linux-mm@kvack.org \ --cc=mhocko@kernel.org \ --cc=mike.marciniszyn@intel.com \ --cc=rcampbell@nvidia.com \ --cc=tom@talpey.com \ --cc=viro@zeniv.linux.org.uk \ --cc=willy@infradead.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
Linux-Fsdevel Archive on lore.kernel.org Archives are clonable: git clone --mirror https://lore.kernel.org/linux-fsdevel/0 linux-fsdevel/git/0.git # If you have public-inbox 1.1+ installed, you may # initialize and index your mirror using the following commands: public-inbox-init -V2 linux-fsdevel linux-fsdevel/ https://lore.kernel.org/linux-fsdevel \ linux-fsdevel@vger.kernel.org public-inbox-index linux-fsdevel Example config snippet for mirrors Newsgroup available over NNTP: nntp://nntp.lore.kernel.org/org.kernel.vger.linux-fsdevel AGPL code for this site: git clone https://public-inbox.org/public-inbox.git