From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Subject: Re: [PATCH 1/2] mm: introduce put_user_page*(), placeholder versions
To: Dan Williams, John Hubbard
CC: Andrew Morton, Linux MM, Jan Kara, Al Viro, Christoph Hellwig,
 Christopher Lameter, "Dalessandro, Dennis", Doug Ledford, Jason Gunthorpe,
 Jérôme Glisse, Matthew Wilcox, Michal Hocko,
 Linux Kernel Mailing List, linux-fsdevel
References: <20181204001720.26138-1-jhubbard@nvidia.com>
 <20181204001720.26138-2-jhubbard@nvidia.com>
From: John Hubbard
Message-ID:
Date: Tue, 4 Dec 2018 13:56:36 -0800
MIME-Version: 1.0
In-Reply-To:
Content-Type: text/plain; charset="utf-8"
Content-Language: en-US-large
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
List-ID:

On 12/4/18 12:28 PM, Dan Williams wrote:
> On Mon, Dec 3, 2018 at 4:17 PM wrote:
>>
>> From: John Hubbard
>>
>> Introduces put_user_page(), which simply calls put_page().
>> This provides a way to update all get_user_pages*() callers,
>> so that they call put_user_page(), instead of put_page().
>>
>> Also introduces put_user_pages(), and a few dirty/locked variations,
>> as a replacement for release_pages(), and also as a replacement
>> for open-coded loops that release multiple pages.
>> These may be used for subsequent performance improvements,
>> via batching of pages to be released.
>>
>> This is the first step of fixing the problem described in [1]. The steps
>> are:
>>
>> 1) (This patch): provide put_user_page*() routines, intended to be used
>>    for releasing pages that were pinned via get_user_pages*().
>>
>> 2) Convert all of the call sites for get_user_pages*(), to
>>    invoke put_user_page*(), instead of put_page(). This involves dozens of
>>    call sites, and will take some time.
>>
>> 3) After (2) is complete, use get_user_pages*() and put_user_page*() to
>>    implement tracking of these pages. This tracking will be separate from
>>    the existing struct page refcounting.
>>
>> 4) Use the tracking and identification of these pages, to implement
>>    special handling (especially in writeback paths) when the pages are
>>    backed by a filesystem. Again, [1] provides details as to why that is
>>    desirable.
>
> I thought at Plumbers we talked about using a page bit to tag pages
> that have had their reference count elevated by get_user_pages()? That
> way there is no need to distinguish put_page() from put_user_page() it
> just happens internally to put_page(). At the conference Matthew was
> offering to free up a page bit for this purpose.
>

...but then, upon further discussion in that same session, we realized that
that doesn't help. You need a reference count. Otherwise a random put_page
could affect your dma-pinned pages, etc, etc.

I was not able to actually find any place where a single additional page
bit would help our situation, which is why this still uses LRU fields for
both the two bits required (the RFC [1] still applies), and the
dma_pinned_count.

[1] https://lore.kernel.org/r/20181110085041.10071-7-jhubbard@nvidia.com

>> [1] https://lwn.net/Articles/753027/ : "The Trouble with get_user_pages()"
>>
>> Reviewed-by: Jan Kara
>
> Wish, you could have been there Jan. I'm missing why it's safe to
> assume that a single put_user_page() is paired with a get_user_page()?
>

A put_user_page() per page, or a put_user_pages() for an array of pages. See
patch 0002 for several examples.

thanks,
-- 
John Hubbard
NVIDIA
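
To make the pairing concrete, here is a minimal userspace sketch, not the
kernel code from the patch. It assumes a toy struct page with a plain integer
refcount, a hypothetical toy_get_user_pages() helper that stands in for the
pin taken by get_user_pages*(), and a put_user_pages(pages, npages) signature
that is only a guess at the shape of the real routine. The one thing taken
from the patch description is that the placeholder put_user_page() simply
calls put_page().

```c
#include <assert.h>
#include <stddef.h>

/* Toy userspace stand-in for the kernel's struct page (hypothetical). */
struct page {
	int refcount;
};

/* Stand-in for put_page(): drop one reference. */
static void put_page(struct page *page)
{
	assert(page->refcount > 0);
	page->refcount--;
}

/*
 * Hypothetical stand-in for the pin taken by get_user_pages*():
 * one extra reference per page in the array.
 */
static void toy_get_user_pages(struct page **pages, size_t npages)
{
	for (size_t i = 0; i < npages; i++)
		pages[i]->refcount++;
}

/* Placeholder version, as described in the patch: simply calls put_page(). */
static void put_user_page(struct page *page)
{
	put_page(page);
}

/* Release an array of pinned pages, one put_user_page() per page. */
static void put_user_pages(struct page **pages, size_t npages)
{
	for (size_t i = 0; i < npages; i++)
		put_user_page(pages[i]);
}
```

In this model, every reference taken by toy_get_user_pages() is dropped by
exactly one put_user_page(), either directly or through put_user_pages(), so
each page's refcount returns to its value from before the pin.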