LKML Archive on lore.kernel.org
 help / Atom feed
From: Jan Kara <jack@suse.cz>
To: John Hubbard <jhubbard@nvidia.com>
Cc: Jerome Glisse <jglisse@redhat.com>, Jan Kara <jack@suse.cz>,
	Matthew Wilcox <willy@infradead.org>,
	Dave Chinner <david@fromorbit.com>,
	Dan Williams <dan.j.williams@intel.com>,
	John Hubbard <john.hubbard@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linux MM <linux-mm@kvack.org>,
	tom@talpey.com, Al Viro <viro@zeniv.linux.org.uk>,
	benve@cisco.com, Christoph Hellwig <hch@infradead.org>,
	Christopher Lameter <cl@linux.com>,
	"Dalessandro, Dennis" <dennis.dalessandro@intel.com>,
	Doug Ledford <dledford@redhat.com>,
	Jason Gunthorpe <jgg@ziepe.ca>, Michal Hocko <mhocko@kernel.org>,
	mike.marciniszyn@intel.com, rcampbell@nvidia.com,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH 1/2] mm: introduce put_user_page*(), placeholder versions
Date: Mon, 14 Jan 2019 15:54:47 +0100
Message-ID: <20190114145447.GJ13316@quack2.suse.cz> (raw)
In-Reply-To: <b6f4ed36-fc8d-1f9b-8c74-b12f61d496ae@nvidia.com>

On Fri 11-01-19 19:06:08, John Hubbard wrote:
> On 1/11/19 6:46 PM, Jerome Glisse wrote:
> > On Fri, Jan 11, 2019 at 06:38:44PM -0800, John Hubbard wrote:
> > [...]
> > 
> >>>> The other idea that you and Dan (and maybe others) pointed out was a debug
> >>>> option, which we'll certainly need in order to safely convert all the call
> >>>> sites. (Mirror the mappings at a different kernel offset, so that put_page()
> >>>> and put_user_page() can verify that the right call was made.)  That will be
> >>>> a separate patchset, as you recommended.
> >>>>
> >>>> I'll even go as far as recommending the page lock itself. I realize that this 
> >>>> adds overhead to gup(), but we *must* hold off page_mkclean(), and I believe
> >>>> that this (below) has similar overhead to the notes above--but is *much* easier
> >>>> to verify correct. (If the page lock is unacceptable due to being so widely used,
> >>>> then I'd recommend using another page bit to do the same thing.)
> >>>
> >>> Please page lock is pointless and it will not work for GUP fast. The above
> >>> scheme do work and is fine. I spend the day again thinking about all memory
> >>> ordering and i do not see any issues.
> >>>
> >>
> >> Why is it that page lock cannot be used for gup fast, btw?
> > 
> > Well it can not happen within the preempt disable section. But after
> > as a post pass before GUP_fast return and after reenabling preempt then
> > it is fine like it would be for regular GUP. But locking page for GUP
> > is also likely to slow down some workload (with direct-IO).
> > 
> 
> Right, and so to crux of the matter: taking an uncontended page lock
> involves pretty much the same set of operations that your approach does.
> (If gup ends up contended with the page lock for other reasons than these
> paths, that seems surprising.) I'd expect very similar performance.
> 
> But the page lock approach leads to really dramatically simpler code (and
> code reviews, let's not forget). Any objection to my going that
> direction, and keeping this idea as a Plan B? I think the next step will
> be, once again, to gather some performance metrics, so maybe that will
> help us decide.

FWIW I agree that using page lock for protecting page pinning (and thus
avoid races with page_mkclean()) looks simpler to me as well and I'm not
convinced there will be measurable difference to the more complex scheme
with barriers Jerome suggests unless that page lock contended. Jerome is
right that you cannot just do lock_page() in gup_fast() path. There you
have to do trylock_page() and if that fails just bail out to the slow gup
path.

Regarding places other than page_mkclean() that need to check pinned state:
Definitely page migration will want to check whether the page is pinned or
not so that it can deal differently with short-term page references vs
longer-term pins.

Also there is one more idea I had how to record number of pins in the page:

#define PAGE_PIN_BIAS	1024

get_page_pin()
	atomic_add(&page->_refcount, PAGE_PIN_BIAS);

put_page_pin();
	atomic_add(&page->_refcount, -PAGE_PIN_BIAS);

page_pinned(page)
	(atomic_read(&page->_refcount) - page_mapcount(page)) > PAGE_PIN_BIAS

This is pretty trivial scheme. It still gives us 22-bits for page pins
which should be plenty (but we should check for that and bail with error if
it would overflow). Also there will be no false negatives and false
positives only if there are more than 1024 non-page-table references to the
page which I expect to be rare (we might want to also subtract
hpage_nr_pages() for radix tree references to avoid excessive false
positives for huge pages although at this point I don't think they would
matter). Thoughts?

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

  parent reply index

Thread overview: 179+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-04  0:17 [PATCH 0/2] put_user_page*(): start converting the call sites john.hubbard
2018-12-04  0:17 ` [PATCH 1/2] mm: introduce put_user_page*(), placeholder versions john.hubbard
2018-12-04  7:53   ` Mike Rapoport
2018-12-05  1:40     ` John Hubbard
2018-12-04 20:28   ` Dan Williams
2018-12-04 21:56     ` John Hubbard
2018-12-04 23:03       ` Dan Williams
2018-12-05  0:36         ` Jerome Glisse
2018-12-05  0:40           ` Dan Williams
2018-12-05  0:59             ` John Hubbard
2018-12-05  0:58         ` John Hubbard
2018-12-05  1:00           ` Dan Williams
2018-12-05  1:15           ` Matthew Wilcox
2018-12-05  1:44             ` Jerome Glisse
2018-12-05  1:57               ` John Hubbard
2018-12-07  2:45                 ` John Hubbard
2018-12-07 19:16                   ` Jerome Glisse
2018-12-07 19:26                     ` Dan Williams
2018-12-07 19:40                       ` Jerome Glisse
2018-12-08  0:52                     ` John Hubbard
2018-12-08  2:24                       ` Jerome Glisse
2018-12-10 10:28                         ` Jan Kara
2018-12-12 15:03                           ` Jerome Glisse
2018-12-12 16:27                             ` Dan Williams
2018-12-12 17:02                               ` Jerome Glisse
2018-12-12 17:49                                 ` Dan Williams
2018-12-12 19:07                                   ` John Hubbard
2018-12-12 21:30                               ` Jerome Glisse
2018-12-12 21:40                                 ` Dan Williams
2018-12-12 21:53                                   ` Jerome Glisse
2018-12-12 22:11                                     ` Matthew Wilcox
2018-12-12 22:16                                       ` Jerome Glisse
2018-12-12 23:37                                     ` Jason Gunthorpe
2018-12-12 23:46                                       ` John Hubbard
2018-12-12 23:54                                       ` Dan Williams
2018-12-13  0:01                                       ` Jerome Glisse
2018-12-13  0:18                                         ` Dan Williams
2018-12-13  0:44                                           ` Jerome Glisse
2018-12-13  3:26                                             ` Jason Gunthorpe
2018-12-13  3:20                                         ` Jason Gunthorpe
2018-12-13 12:43                                           ` Jerome Glisse
2018-12-13 13:40                                             ` Tom Talpey
2018-12-13 14:18                                               ` Jerome Glisse
2018-12-13 14:51                                                 ` Tom Talpey
2018-12-13 15:18                                                   ` Jerome Glisse
2018-12-13 18:12                                                     ` Tom Talpey
2018-12-13 19:18                                                       ` Jerome Glisse
2018-12-14 10:41                                             ` Jan Kara
2018-12-14 15:25                                               ` Jerome Glisse
2018-12-12 21:56                                 ` John Hubbard
2018-12-12 22:04                                   ` Jerome Glisse
2018-12-12 22:11                                     ` John Hubbard
2018-12-12 22:14                                       ` Jerome Glisse
2018-12-12 22:17                                         ` John Hubbard
2018-12-12 21:46                             ` Dave Chinner
2018-12-12 21:59                               ` Jerome Glisse
2018-12-13  0:51                                 ` Dave Chinner
2018-12-13  2:02                                   ` Jerome Glisse
2018-12-13 15:56                                     ` Christopher Lameter
2018-12-13 16:02                                       ` Jerome Glisse
2018-12-14  6:00                                     ` Dave Chinner
2018-12-14 15:13                                       ` Jerome Glisse
2018-12-14  3:52                                   ` John Hubbard
2018-12-14  5:21                                     ` Dan Williams
2018-12-14  6:11                                       ` John Hubbard
2018-12-14 15:20                                         ` Jerome Glisse
2018-12-14 19:38                                         ` Dan Williams
2018-12-14 19:48                                           ` Matthew Wilcox
2018-12-14 19:53                                             ` Dave Hansen
2018-12-14 20:03                                               ` Matthew Wilcox
2018-12-14 20:17                                                 ` Dan Williams
2018-12-14 20:29                                                   ` Matthew Wilcox
2018-12-15  0:41                                                 ` John Hubbard
2018-12-17  8:56                                           ` Jan Kara
2018-12-17 18:28                                             ` Dan Williams
2018-12-14 15:43                               ` Jan Kara
2018-12-16 21:58                                 ` Dave Chinner
2018-12-17 18:11                                   ` Jerome Glisse
2018-12-17 18:34                                     ` Matthew Wilcox
2018-12-17 19:48                                       ` Jerome Glisse
2018-12-17 19:51                                         ` Matthew Wilcox
2018-12-17 19:54                                           ` Jerome Glisse
2018-12-17 19:59                                             ` Matthew Wilcox
2018-12-17 20:55                                               ` Jerome Glisse
2018-12-17 21:03                                                 ` Matthew Wilcox
2018-12-17 21:15                                                   ` Jerome Glisse
2018-12-18  1:09                                       ` Dave Chinner
2018-12-18  6:12                                       ` Darrick J. Wong
2018-12-18  9:30                                       ` Jan Kara
2018-12-18 23:29                                         ` John Hubbard
2018-12-19  2:07                                           ` Jerome Glisse
2018-12-19 11:08                                             ` Jan Kara
2018-12-20 10:54                                               ` John Hubbard
2018-12-20 16:50                                                 ` Jerome Glisse
2018-12-20 16:57                                                   ` Dan Williams
2018-12-20 16:49                                               ` Jerome Glisse
2019-01-03  1:55                                               ` Jerome Glisse
2019-01-03  3:27                                                 ` John Hubbard
2019-01-03 14:57                                                   ` Jerome Glisse
2019-01-03  9:26                                                 ` Jan Kara
2019-01-03 14:44                                                   ` Jerome Glisse
2019-01-11  2:59                                                     ` John Hubbard
2019-01-11 16:51                                                       ` Jerome Glisse
2019-01-12  1:04                                                         ` John Hubbard
2019-01-12  2:02                                                           ` Jerome Glisse
2019-01-12  2:38                                                             ` John Hubbard
2019-01-12  2:46                                                               ` Jerome Glisse
2019-01-12  3:06                                                                 ` John Hubbard
2019-01-12  3:25                                                                   ` Jerome Glisse
2019-01-12 20:46                                                                     ` John Hubbard
2019-01-14 14:54                                                                   ` Jan Kara [this message]
2019-01-14 17:21                                                                     ` Jerome Glisse
2019-01-14 19:09                                                                       ` John Hubbard
2019-01-15  8:34                                                                         ` Jan Kara
2019-01-15 21:39                                                                           ` John Hubbard
2019-01-15  8:07                                                                       ` Jan Kara
2019-01-15 17:15                                                                         ` Jerome Glisse
2019-01-15 21:56                                                                           ` John Hubbard
2019-01-15 22:12                                                                             ` Jerome Glisse
2019-01-16  0:44                                                                               ` John Hubbard
2019-01-16  1:56                                                                                 ` Jerome Glisse
2019-01-16  2:01                                                                                   ` Dan Williams
2019-01-16  2:23                                                                                     ` Jerome Glisse
2019-01-16  4:34                                                                                       ` Dave Chinner
2019-01-16 14:50                                                                                         ` Jerome Glisse
2019-01-16 22:51                                                                                           ` Dave Chinner
2019-01-16 11:38                                                                         ` Jan Kara
2019-01-16 13:08                                                                           ` Jerome Glisse
2019-01-17  5:42                                                                             ` John Hubbard
2019-01-17 15:21                                                                               ` Jerome Glisse
2019-01-18  0:16                                                                                 ` Dave Chinner
2019-01-18  1:59                                                                                   ` Jerome Glisse
2019-01-17  9:30                                                                             ` Jan Kara
2019-01-17 15:17                                                                               ` Jerome Glisse
2019-01-22 15:24                                                                                 ` Jan Kara
2019-01-22 16:46                                                                                   ` Jerome Glisse
2019-01-23 18:02                                                                                     ` Jan Kara
2019-01-23 19:04                                                                                       ` Jerome Glisse
2019-01-29  0:22                                                                                         ` John Hubbard
2019-01-29  1:23                                                                                           ` Jerome Glisse
2019-01-29  6:41                                                                                             ` John Hubbard
2019-01-29 10:12                                                                                               ` Jan Kara
2019-01-30  2:21                                                                                                 ` John Hubbard
2019-01-17  5:25                                                                         ` John Hubbard
2019-01-17  9:04                                                                           ` Jan Kara
2019-01-12  3:14                                                               ` Jerome Glisse
2018-12-18 10:33                                   ` Jan Kara
2018-12-18 23:42                                     ` Dave Chinner
2018-12-19  3:03                                       ` Jason Gunthorpe
2018-12-19  5:26                                         ` Dan Williams
2018-12-19 11:19                                           ` Jan Kara
2018-12-19 10:28                                         ` Dave Chinner
2018-12-19 11:35                                           ` Jan Kara
2018-12-19 16:56                                             ` Jason Gunthorpe
2018-12-19 22:33                                             ` Dave Chinner
2018-12-20  9:07                                               ` Jan Kara
2018-12-20 16:54                                               ` Jerome Glisse
2018-12-19 13:24                                       ` Jan Kara
2018-12-08  5:18                       ` Matthew Wilcox
2018-12-12 19:13                         ` John Hubbard
2018-12-08  7:16                       ` Dan Williams
2018-12-08 16:33                         ` Jerome Glisse
2018-12-08 16:48                           ` Christoph Hellwig
2018-12-08 17:47                             ` Jerome Glisse
2018-12-08 18:26                               ` Christoph Hellwig
2018-12-08 18:45                                 ` Jerome Glisse
2018-12-08 18:09                             ` Dan Williams
2018-12-08 18:12                               ` Christoph Hellwig
2018-12-11  6:18                               ` Dave Chinner
2018-12-05  5:52             ` Dan Williams
2018-12-05 11:16       ` Jan Kara
2018-12-04  0:17 ` [PATCH 2/2] infiniband/mm: convert put_page() to put_user_page*() john.hubbard
2018-12-04 17:10 ` [PATCH 0/2] put_user_page*(): start converting the call sites David Laight
2018-12-05  1:05   ` John Hubbard
2018-12-05 14:08     ` David Laight
2018-12-28  8:37       ` Pavel Machek
2019-02-08  7:56 [PATCH 0/2] mm: put_user_page() call site conversion first john.hubbard
2019-02-08  7:56 ` [PATCH 1/2] mm: introduce put_user_page*(), placeholder versions john.hubbard
2019-02-08 10:32   ` Mike Rapoport
2019-02-08 20:44     ` John Hubbard

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190114145447.GJ13316@quack2.suse.cz \
    --to=jack@suse.cz \
    --cc=akpm@linux-foundation.org \
    --cc=benve@cisco.com \
    --cc=cl@linux.com \
    --cc=dan.j.williams@intel.com \
    --cc=david@fromorbit.com \
    --cc=dennis.dalessandro@intel.com \
    --cc=dledford@redhat.com \
    --cc=hch@infradead.org \
    --cc=jgg@ziepe.ca \
    --cc=jglisse@redhat.com \
    --cc=jhubbard@nvidia.com \
    --cc=john.hubbard@gmail.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=mike.marciniszyn@intel.com \
    --cc=rcampbell@nvidia.com \
    --cc=tom@talpey.com \
    --cc=viro@zeniv.linux.org.uk \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org linux-kernel@archiver.kernel.org
	public-inbox-index lkml


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/ public-inbox