All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: John Hubbard <jhubbard@nvidia.com>
Cc: Jan Kara <jack@suse.cz>, Al Viro <viro@zeniv.linux.org.uk>,
	Andrew Morton <akpm@linux-foundation.org>,
	Jens Axboe <axboe@kernel.dk>, Miklos Szeredi <miklos@szeredi.hu>,
	Christoph Hellwig <hch@infradead.org>,
	"Darrick J . Wong" <djwong@kernel.org>,
	Trond Myklebust <trond.myklebust@hammerspace.com>,
	Anna Schumaker <anna@kernel.org>,
	Logan Gunthorpe <logang@deltatee.com>,
	linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-xfs@vger.kernel.org, linux-nfs@vger.kernel.org,
	linux-mm@kvack.org, LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 5/6] NFS: direct-io: convert to FOLL_PIN pages
Date: Wed, 31 Aug 2022 11:43:49 +0200	[thread overview]
Message-ID: <20220831094349.boln4jjajkdtykx3@quack3> (raw)
In-Reply-To: <a53b2d14-687a-16c9-2f63-4f94876f8b3c@nvidia.com>

On Mon 29-08-22 12:59:26, John Hubbard wrote:
> On 8/29/22 09:08, Jan Kara wrote:
> >> However, the core block/bio conversion in patch 4 still does depend upon
> >> a key assumption, which I got from a 2019 email discussion with
> >> Christoph Hellwig and others here [1], which says:
> >>
> >>     "All pages released by bio_release_pages should come from
> >>      get_get_user_pages...".
> >>
> >> I really hope that still holds true. Otherwise this whole thing is in
> >> trouble.
> >>
> >> [1] https://lore.kernel.org/kvm/20190724053053.GA18330@infradead.org/
> > 
> > Well as far as I've checked that discussion, Christoph was aware of pipe
> > pages etc. (i.e., bvecs) entering direct IO code. But he had some patches
> > [2] which enabled GUP to work for bvecs as well (using the kernel mapping
> > under the hood AFAICT from a quick glance at the series). I suppose we
> > could also handle this in __iov_iter_get_pages_alloc() by grabbing pin
> > reference instead of plain get_page() for the case of bvec iter. That way
> > we should have only pinned pages in bio_release_pages() even for the bvec
> > case.
> 
> OK, thanks, that looks viable. So, that approach assumes that the
> remaining two cases in __iov_iter_get_pages_alloc() will never end up
> being released via bio_release_pages():
> 
>     iov_iter_is_pipe(i)
>     iov_iter_is_xarray(i)
> 
> I'm actually a little worried about ITER_XARRAY, which is a recent addition.
> It seems to be used in ways that are similar to ITER_BVEC, and cephfs is
> using it. It's probably OK for now, for this series, which doesn't yet
> convert cephfs.

So after looking into that a bit more, I think a clean approach would be to
provide iov_iter_pin_pages2() and iov_iter_pages_alloc2(), under the hood
in __iov_iter_get_pages_alloc() make sure we use pin_user_page() instead of
get_page() in all the cases (using this in pipe_get_pages() and
iter_xarray_get_pages() is easy) and then make all bio handling use the
pinning variants for iters. I think at least iov_iter_is_pipe() case needs
to be handled as well because as I wrote above, pipe pages can enter direct
IO code e.g. for splice(2).

Also I think that all iov_iter_get_pages2() (or the _alloc2 variant) users
actually do want the "pin page" semantics in the end (they are accessing
page contents) so eventually we should convert them all to
iov_iter_pin_pages2() and remove iov_iter_get_pages2() altogether. But this
will take some more conversion work with networking etc. so I'd start with
converting bios only.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

  reply	other threads:[~2022-08-31  9:44 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-27  8:36 [PATCH 0/6] convert most filesystems to pin_user_pages_fast() John Hubbard
2022-08-27  8:36 ` [PATCH 1/6] mm/gup: introduce pin_user_page() John Hubbard
2022-08-29 12:07   ` David Hildenbrand
2022-08-29 19:33     ` John Hubbard
2022-08-30 12:17       ` David Hildenbrand
2022-08-30 21:42         ` John Hubbard
2022-08-31  0:06         ` John Hubbard
2022-08-27  8:36 ` [PATCH 2/6] block: add dio_w_*() wrappers for pin, unpin user pages John Hubbard
2022-08-27 22:27   ` Andrew Morton
2022-08-27 23:59     ` John Hubbard
2022-08-28  0:12       ` Andrew Morton
2022-08-28  0:31         ` John Hubbard
2022-08-28  1:07           ` John Hubbard
2022-08-27  8:36 ` [PATCH 3/6] iov_iter: new iov_iter_pin_pages*() routines John Hubbard
2022-08-27 22:46   ` Al Viro
2022-08-27 22:48     ` John Hubbard
2022-08-27  8:36 ` [PATCH 4/6] block, bio, fs: convert most filesystems to pin_user_pages_fast() John Hubbard
2022-08-27  8:36 ` [PATCH 5/6] NFS: direct-io: convert to FOLL_PIN pages John Hubbard
2022-08-27 22:48   ` Al Viro
2022-08-27 23:55     ` John Hubbard
2022-08-28  0:38       ` Al Viro
2022-08-28  0:39         ` Al Viro
2022-08-28  0:46           ` John Hubbard
2022-08-29  4:59           ` John Hubbard
2022-08-29 16:08             ` Jan Kara
2022-08-29 19:59               ` John Hubbard
2022-08-31  9:43                 ` Jan Kara [this message]
2022-08-31 18:02                   ` John Hubbard
2022-09-01  0:38                   ` Al Viro
2022-09-01  9:06                     ` Jan Kara
2022-08-27  8:36 ` [PATCH 6/6] fuse: convert direct IO paths to use FOLL_PIN John Hubbard
  -- strict thread matches above, loose matches on Subject: below --
2022-02-27  9:34 [PATCH 0/6] block, fs: convert most Direct IO cases to FOLL_PIN jhubbard.send.patches
2022-02-27  9:34 ` [PATCH 5/6] NFS: direct-io: convert to FOLL_PIN pages jhubbard.send.patches

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220831094349.boln4jjajkdtykx3@quack3 \
    --to=jack@suse.cz \
    --cc=akpm@linux-foundation.org \
    --cc=anna@kernel.org \
    --cc=axboe@kernel.dk \
    --cc=djwong@kernel.org \
    --cc=hch@infradead.org \
    --cc=jhubbard@nvidia.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=logang@deltatee.com \
    --cc=miklos@szeredi.hu \
    --cc=trond.myklebust@hammerspace.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.