From: Jerome Glisse <jglisse@redhat.com>
To: Jan Kara <jack@suse.cz>
Cc: John Hubbard <jhubbard@nvidia.com>,
	Matthew Wilcox <willy@infradead.org>,
	Dave Chinner <david@fromorbit.com>,
	Dan Williams <dan.j.williams@intel.com>,
	John Hubbard <john.hubbard@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linux MM <linux-mm@kvack.org>,
	tom@talpey.com, Al Viro <viro@zeniv.linux.org.uk>,
	benve@cisco.com, Christoph Hellwig <hch@infradead.org>,
	Christopher Lameter <cl@linux.com>,
	"Dalessandro, Dennis" <dennis.dalessandro@intel.com>,
	Doug Ledford <dledford@redhat.com>,
	Jason Gunthorpe <jgg@ziepe.ca>,
	Michal Hocko <mhocko@kernel.org>,
	mike.marciniszyn@intel.com, rcampbell@nvidia.com,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH 1/2] mm: introduce put_user_page*(), placeholder versions
Date: Wed, 16 Jan 2019 08:08:14 -0500
Message-ID: <20190116130813.GA3617@redhat.com>
Message-ID: <20190116130814.RRLFoLsFOQKt6Es_4GgOCdVDWw3hxWaVH3Wg1D9tra0@z>
In-Reply-To: <20190116113819.GD26069@quack2.suse.cz>

On Wed, Jan 16, 2019 at 12:38:19PM +0100, Jan Kara wrote:
> On Tue 15-01-19 09:07:59, Jan Kara wrote:
> > Agreed. So with page lock it would actually look like:
> >
> > get_page_pin()
> > 	lock_page(page);
> > 	wait_for_stable_page();
> > 	atomic_add(&page->_refcount, PAGE_PIN_BIAS);
> > 	unlock_page(page);
> >
> > And if we perform page_pinned() check under page lock, then if
> > page_pinned() returned false, we are sure page is not and will not be
> > pinned until we drop the page lock (and also until page writeback is
> > completed if needed).
>
> After some more thought, why do we even need wait_for_stable_page() and
> lock_page() in get_page_pin()?
>
> During writepage page_mkclean() will write protect all page tables. So
> there can be no new writeable GUP pins until we unlock the page as all such
> GUPs will have to first go through fault and ->page_mkwrite() handler. And
> that will wait on page lock and do wait_for_stable_page() for us anyway.
> Am I just confused?

Yeah, with the page lock it should synchronize on the pte, but you still
need to check for writeback. IIRC the page is unlocked after the file
system has queued up the write, so the page can be unlocked with writeback
still pending (PageWriteback() == true), and I am not sure that in that
state we can safely let anyone write to the page. I am assuming that in
some cases the block device also expects stable page content (RAID stuff).

So the PageWriteback() test is not only for racing page_mkclean()/
test_set_page_writeback() and GUP, but also for pending writeback.

> That actually touches on another question I wanted to get opinions on. GUP
> can be for read and GUP can be for write (that is one of GUP flags).
> Filesystems with page cache generally have issues only with GUP for write
> as it can currently corrupt data, unexpectedly dirty page etc. DAX & memory
> hotplug have issues with both (DAX cannot truncate page pinned in any way,
> memory hotplug will just loop in kernel until the page gets unpinned). So
> we probably want to track both types of GUP pins and page-cache based
> filesystems will take the hit even if they don't have to for read-pins?

Yes, the distinction between read and write would be nice. With the
mapcount solution you can increment the mapcount only for GUP(write=true).
With pin bias the issue is that a large number of read pins can trigger a
false positive, i.e. you would do:

    GUP(vaddr, write)
        ...
        if (write)
            atomic_add(page->refcount, PAGE_PIN_BIAS)
        else
            atomic_inc(page->refcount)

    PUP(page, write)
        if (write)
            atomic_add(page->refcount, -PAGE_PIN_BIAS)
        else
            atomic_dec(page->refcount)

I am guessing a false positive caused by too many read GUPs is ok, as it
should be unlikely, and when it happens we take the hit.

Cheers,
Jérôme
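The read/write accounting in the GUP()/PUP() pseudocode above can be
sketched in userspace C roughly as follows. This is only an illustration
under stated assumptions, not kernel code: struct fake_page, gup_pin(),
pup_unpin() and page_maybe_pinned() are hypothetical names, the value of
PAGE_PIN_BIAS is picked arbitrarily (the thread does not fix one), and
"pinned" is assumed to mean "refcount >= PAGE_PIN_BIAS".

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bias value; chosen only for this illustration. */
#define PAGE_PIN_BIAS 1024

/* Stand-in for struct page: only the reference count matters here. */
struct fake_page {
	atomic_int refcount;
};

/* GUP side: a write pin adds the bias, a read pin adds a plain reference. */
static void gup_pin(struct fake_page *page, bool write)
{
	atomic_fetch_add(&page->refcount, write ? PAGE_PIN_BIAS : 1);
}

/* PUP side: must be called with the same 'write' value used when pinning. */
static void pup_unpin(struct fake_page *page, bool write)
{
	int delta = write ? PAGE_PIN_BIAS : 1;
	int old = atomic_fetch_sub(&page->refcount, delta);

	/* Catch unbalanced unpins in this sketch. */
	assert(old >= delta);
}

/*
 * Assumed check: the page is treated as write-pinned once the count
 * reaches the bias. Enough read pins also reach the bias, which is
 * the false positive discussed above.
 */
static bool page_maybe_pinned(struct fake_page *page)
{
	return atomic_load(&page->refcount) >= PAGE_PIN_BIAS;
}
```

With a page that starts at refcount 1, a single gup_pin(page, true) makes
page_maybe_pinned() return true, and so do PAGE_PIN_BIAS calls to
gup_pin(page, false), even though no write pin exists. That second case is
the false positive; it costs the filesystem an unnecessary slow path but
is never a missed pin.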