From: Michal Hocko <mhocko@kernel.org>
To: Jan Kara <jack@suse.cz>
Cc: John Hubbard <jhubbard@nvidia.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Andrew Morton <akpm@linux-foundation.org>,
	Christoph Hellwig <hch@infradead.org>,
	Ira Weiny <ira.weiny@intel.com>, Jason Gunthorpe <jgg@ziepe.ca>,
	Jerome Glisse <jglisse@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	Dan Williams <dan.j.williams@intel.com>,
	Daniel Black <daniel@linux.ibm.com>,
	Matthew Wilcox <willy@infradead.org>,
	Mike Kravetz <mike.kravetz@oracle.com>
Subject: Re: [PATCH 1/3] mm/mlock.c: convert put_page() to put_user_page*()
Date: Fri, 9 Aug 2019 19:52:10 +0200
Message-ID: <20190809175210.GR18351@dhcp22.suse.cz>
In-Reply-To: <20190809135813.GF17568@quack2.suse.cz>

On Fri 09-08-19 15:58:13, Jan Kara wrote:
> On Fri 09-08-19 10:23:07, Michal Hocko wrote:
> > On Fri 09-08-19 10:12:48, Vlastimil Babka wrote:
> > > On 8/9/19 12:59 AM, John Hubbard wrote:
> > > >>> That's true. However, I'm not sure munlocking is where the
> > > >>> put_user_page() machinery is intended to be used anyway? These are
> > > >>> short-term pins for struct page manipulation, not e.g. dirtying of page
> > > >>> contents. Reading commit fc1d8e7cca2d I don't think this case falls
> > > >>> within the reasoning there. Perhaps not all GUP users should be
> > > >>> converted to the planned separate GUP tracking, and instead we should
> > > >>> have a GUP/follow_page_mask() variant that keeps using get_page/put_page?
> > > >>>  
> > > >>
> > > >> Interesting. So far, the approach has been to get all the gup callers to
> > > >> release via put_user_page(), but if we add in Jan's and Ira's vaddr_pin_pages()
> > > >> wrapper, then maybe we could leave some sites unconverted.
> > > >>
> > > >> However, in order to do so, we would have to change things so that we have
> > > >> one set of APIs (gup) that do *not* increment a pin count, and another set
> > > >> (vaddr_pin_pages) that do. 
> > > >>
> > > >> Is that where we want to go...?
> > > >>
> > > 
> > > We already have a FOLL_LONGTERM flag, isn't that somehow related? And if
> > > it's not exactly the same thing, perhaps a new gup flag to distinguish
> > > which kind of pinning to use?
> > 
> > Agreed. This is a shining example of how forcing all existing gup users into
> > the new scheme is suboptimal at best. Not to mention the overall
> > fragility mentioned elsewhere. I dislike the conversion even more now.
> > 
> > Sorry if this was already discussed, but why is the new pinning not
> > bound to FOLL_LONGTERM only (ideally hidden by an interface so that users
> > do not have to care about the flag)?
> 
> The new tracking cannot be bound to FOLL_LONGTERM. Anything that gets a page
> reference and then touches page data (e.g. direct IO) needs the new kind of
> tracking so that the filesystem knows someone is messing with the page data.
> So what John is trying to address is a different (although related) problem
> to someone pinning a page for a long time.

OK, I see. Thanks for the clarification.

> In principle, I'm not strongly opposed to a new FOLL flag to determine
> whether a pin or an ordinary page reference will be acquired, at least as an
> internal implementation detail inside mm/gup.c. But I would really like to
> discourage new GUP users from taking just a page reference, as the most clueless
> users (drivers) usually need a pin in the sense John implements. So in
> terms of API I'd strongly prefer to deprecate GUP as an API, provide
> vaddr_pin_pages() for drivers to get their buffer pages pinned and then for
> those few users who really know what they are doing (and who are not
> interested in page contents) we can have APIs like follow_page() to get a
> page reference from a virtual address.

Yes, going with a dedicated API sounds much better to me. Whether a
dedicated FOLL flag is used internally is not that important. I am also
in favor of making the underlying gup truly internal to the core kernel.
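
To make the split discussed above concrete, here is a minimal userspace
sketch, not kernel code. It assumes a toy page structure with a separate
pin counter; the names toy_page, toy_get_ref(), toy_pin() and
toy_page_maybe_pinned() are hypothetical and exist only for illustration.
The idea it models is the one Jan describes: a plain reference
(follow_page()-style) stays invisible to the filesystem, while a pin
(vaddr_pin_pages()-style) is something the filesystem can detect.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct page. This is NOT the kernel's layout. */
struct toy_page {
	int refcount;	/* all references, ordinary and pinned */
	int pincount;	/* references held by users that touch page data */
};

/* Hypothetical follow_page()-like path: ordinary reference only. */
static void toy_get_ref(struct toy_page *p)
{
	p->refcount++;
}

static void toy_put_ref(struct toy_page *p)
{
	p->refcount--;
}

/* Hypothetical vaddr_pin_pages()-like path: reference plus a tracked pin. */
static void toy_pin(struct toy_page *p)
{
	p->refcount++;
	p->pincount++;
}

/* Release side of the pin; must pair with toy_pin(), not toy_get_ref(). */
static void toy_unpin(struct toy_page *p)
{
	p->pincount--;
	p->refcount--;
}

/* What a filesystem could ask before, say, writing the page back. */
static bool toy_page_maybe_pinned(const struct toy_page *p)
{
	return p->pincount > 0;
}

/* Small walk-through: a plain reference is not reported as a pin. */
static void toy_demo(void)
{
	struct toy_page page = { .refcount = 1, .pincount = 0 };

	toy_get_ref(&page);			/* short-term struct page user */
	assert(!toy_page_maybe_pinned(&page));

	toy_pin(&page);				/* e.g. a driver doing DMA */
	assert(toy_page_maybe_pinned(&page));

	toy_unpin(&page);
	toy_put_ref(&page);
	assert(page.refcount == 1);
}
```

In this model the release call has to match the acquire call, which is
why a dedicated API pair is easier to get right than a flag that callers
must remember on both sides.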

Thanks!
-- 
Michal Hocko
SUSE Labs
