linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jerome Glisse <jglisse@redhat.com>
To: Jan Kara <jack@suse.cz>
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-block@vger.kernel.org, linux-mm@kvack.org,
	John Hubbard <jhubbard@nvidia.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Johannes Thumshirn <jthumshirn@suse.de>,
	Christoph Hellwig <hch@lst.de>, Jens Axboe <axboe@kernel.dk>,
	Ming Lei <ming.lei@redhat.com>,
	Dave Chinner <david@fromorbit.com>,
	Jason Gunthorpe <jgg@ziepe.ca>,
	Matthew Wilcox <willy@infradead.org>
Subject: Re: [PATCH v1 10/15] block: add gup flag to bio_add_page()/bio_add_pc_page()/__bio_add_page()
Date: Tue, 16 Apr 2019 13:07:24 -0400	[thread overview]
Message-ID: <20190416170723.GC3254@redhat.com> (raw)
In-Reply-To: <20190416164658.GB17148@quack2.suse.cz>

On Tue, Apr 16, 2019 at 06:46:58PM +0200, Jan Kara wrote:
> On Mon 15-04-19 11:24:33, Jerome Glisse wrote:
> > On Mon, Apr 15, 2019 at 04:59:52PM +0200, Jan Kara wrote:
> > > Hi Jerome!
> > > 
> > > On Thu 11-04-19 17:08:29, jglisse@redhat.com wrote:
> > > > From: Jérôme Glisse <jglisse@redhat.com>
> > > > 
> > > > We want to keep track of how we got a reference on page added to bio_vec
> > > > ie wether the page was reference through GUP (get_user_page*) or not. So
> > > > add a flag to bio_add_page()/bio_add_pc_page()/__bio_add_page() to that
> > > > effect.
> > > 
> > > Thanks for writing this patch set! Looking through patches like this one,
> > > I'm a bit concerned. With so many bio_add_page() callers it's difficult to
> > > get things right and not regress in the future. I'm wondering whether the
> > > things won't be less error-prone if we required that all page reference
> > > from bio are gup-like (not necessarily taken by GUP, if creator of the bio
> > > gets to struct page he needs via some other means (e.g. page cache lookup),
> > > he could just use get_gup_pin() helper we'd provide).  After all, a page
> > > reference in bio means that the page is pinned for the duration of IO and
> > > can be DMAed to/from so it even makes some sense to track the reference
> > > like that. Then bio_put() would just unconditionally do put_user_page() and
> > > we won't have to propagate the information in the bio.
> > > 
> > > Do you think this would be workable and easier?
> > 
> > It might be workable but i am not sure it is any simpler. bio_add_page*()
> > does not take page reference it is up to the caller to take the proper
> > page reference so the complexity would be push there (just in a different
> > place) so i don't think it would be any simpler. This means that we would
> > have to update more code than this patchset does.
> 
> I agree that the amount of work in this patch set is about the same
> (although you don't have to pass the information about reference type in
> the biovec so you save the complexities there). But for the future the
> rule that "bio references must be gup-pins" is IMO easier to grasp for
> developers and you can reasonably assert it in bio_add_page().
> 
> > This present patch is just a coccinelle semantic patch and even if it
> > is scary to see that many call site, they are not that many that need
> > to worry about the GUP parameter and they all are in patch 11, 12, 13
> > and 14.
> > 
> > So i believe this patchset is simpler than converting everyone to take
> > a GUP like page reference. Also doing so means we loose the information
> > about GUP kind of defeat the purpose. So i believe it would be better
> > to limit special reference to GUP only pages.
> 
> So what's the difference whether the page reference has been acquired via
> GUP or via some other means? I don't think that really matters. If say
> infiniband introduced new ioctl() that takes file descriptor, offset, and
> length and just takes pages from page cache and attaches them to their RDMA
> scatter-gather lists, then they'd need to use 'pin' references anyway...
> 
> Then why do we work on differentiating between GUP pins and other page
> references?  Because it matters what the reference is going to be used for
> and what is it's lifetime. And generally GUP references are used to do IO
> to/from page and may even be controlled by userspace so that's why we need
> to make them different. But in principle the 'gup-pin' reference is not about
> the fact that the reference has been obtained from GUP but about the fact
> that it is used to do IO. Hence I think that the rule "bio references must
> be gup-pins" makes some sense.

It will break things like page protection i am working on (KSM for file
back page). Pages can go through bio for mundane reasons (crypto, network,
gpu, ...) that have nothing to do with I/O for fs and do not have to block
any of the fs operation. If we GUP bias all those pages then we will
effectively make the situation worse in that pages will have a high likely-
hood to always look GUPed while it is just going through some bio for one
of those mundane reasons (and page is not being written to just use as
a source).

I understand why conceptualy it looks appealing but we would be loosing
information here. I really want to be able to determine if a page is GUPed
or not. If we GUP bias everyone in bio then we loose that.

Also i want to point out that the complexity of biasing all page in bio
are much bigger than this patchset it will require changes to all call
site of bio_add_page*() at very least.

Cheers,
Jérôme

  parent reply	other threads:[~2019-04-16 17:07 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-04-11 21:08 [PATCH v1 00/15] Keep track of GUPed pages in fs and block jglisse
2019-04-11 21:08 ` [PATCH v1 01/15] fs/direct-io: fix trailing whitespace issues jglisse
2019-04-11 21:08 ` [PATCH v1 02/15] iov_iter: add helper to test if an iter would use GUP jglisse
2019-04-11 21:08 ` [PATCH v1 03/15] block: introduce bvec_page()/bvec_set_page() to get/set bio_vec.bv_page jglisse
2019-04-11 21:08 ` [PATCH v1 04/15] block: introduce BIO_VEC_INIT() macro to initialize bio_vec structure jglisse
2019-04-11 21:08 ` [PATCH v1 05/15] block: replace all bio_vec->bv_page by bvec_page()/bvec_set_page() jglisse
2019-04-11 21:08 ` [PATCH v1 06/15] block: convert bio_vec.bv_page to bv_pfn to store pfn and not page jglisse
2019-04-11 21:08 ` [PATCH v1 07/15] block: add bvec_put_page_dirty*() to replace put_page(bvec_page()) jglisse
2019-04-11 21:08 ` [PATCH v1 08/15] block: use bvec_put_page() instead of put_page(bvec_page()) jglisse
2019-04-11 21:08 ` [PATCH v1 09/15] block: bvec_put_page_dirty* instead of set_page_dirty* and bvec_put_page jglisse
2019-04-11 21:08 ` [PATCH v1 10/15] block: add gup flag to bio_add_page()/bio_add_pc_page()/__bio_add_page() jglisse
2019-04-15 14:59   ` Jan Kara
2019-04-15 15:24     ` Jerome Glisse
2019-04-16 16:46       ` Jan Kara
2019-04-16 16:54         ` Dan Williams
2019-04-16 17:07         ` Jerome Glisse [this message]
2019-04-16  0:22     ` Jerome Glisse
2019-04-16 16:52       ` Jan Kara
2019-04-16 18:32         ` Jerome Glisse
2019-04-11 21:08 ` [PATCH v1 11/15] block: make sure bio_add_page*() knows page that are coming from GUP jglisse
2019-04-11 21:08 ` [PATCH v1 12/15] fs/direct-io: keep track of wether a page is coming from GUP or not jglisse
2019-04-11 23:14   ` Dave Chinner
2019-04-12  0:08     ` Jerome Glisse
2019-04-11 21:08 ` [PATCH v1 13/15] fs/splice: use put_user_page() when appropriate jglisse
2019-04-11 21:08 ` [PATCH v1 14/15] fs: use bvec_set_gup_page() where appropriate jglisse
2019-04-11 21:08 ` [PATCH v1 15/15] ceph: use put_user_pages() instead of ceph_put_page_vector() jglisse
2019-04-15  7:46   ` Yan, Zheng
2019-04-15 15:11     ` Jerome Glisse
2019-04-16  0:00 ` [PATCH v1 00/15] Keep track of GUPed pages in fs and block Dave Chinner
     [not found] ` <2c124cc4-b97e-ee28-2926-305bc6bc74bd@plexistor.com>
2019-04-16 18:47   ` Jerome Glisse
2019-04-16 18:59   ` Kent Overstreet
2019-04-16 19:12     ` Dan Williams
2019-04-16 19:49       ` Jerome Glisse
2019-04-17 21:53         ` Dan Williams
2019-04-17 22:28           ` Jerome Glisse
2019-04-17 23:32             ` Dan Williams
2019-04-18 10:42             ` Jan Kara
2019-04-18 14:27               ` Jerome Glisse
2019-04-18 15:30                 ` Jan Kara
2019-04-18 15:36                   ` Jerome Glisse
2019-04-18 18:03               ` Dan Williams
     [not found]       ` <ccac6c5a-7120-0455-88de-ca321b01e825@plexistor.com>
2019-04-16 19:57         ` Jerome Glisse
     [not found]           ` <41e2d7e1-104b-a006-2824-015ca8c76cc8@gmail.com>
2019-04-16 23:16             ` Jerome Glisse
     [not found]               ` <fa00a2ff-3664-3165-7af8-9d9c53238245@plexistor.com>
2019-04-17  2:03                 ` Jerome Glisse
2019-04-17 21:19                   ` Jerome Glisse
2019-04-16 23:34             ` Jerome Glisse
2019-04-17 21:54         ` Dan Williams

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190416170723.GC3254@redhat.com \
    --to=jglisse@redhat.com \
    --cc=axboe@kernel.dk \
    --cc=dan.j.williams@intel.com \
    --cc=david@fromorbit.com \
    --cc=hch@lst.de \
    --cc=jack@suse.cz \
    --cc=jgg@ziepe.ca \
    --cc=jhubbard@nvidia.com \
    --cc=jthumshirn@suse.de \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=ming.lei@redhat.com \
    --cc=viro@zeniv.linux.org.uk \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).