All of lore.kernel.org
 help / color / mirror / Atom feed
From: Al Viro <viro@zeniv.linux.org.uk>
To: Jan Kara <jack@suse.cz>
Cc: John Hubbard <jhubbard@nvidia.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Jens Axboe <axboe@kernel.dk>, Miklos Szeredi <miklos@szeredi.hu>,
	Christoph Hellwig <hch@infradead.org>,
	"Darrick J . Wong" <djwong@kernel.org>,
	Trond Myklebust <trond.myklebust@hammerspace.com>,
	Anna Schumaker <anna@kernel.org>,
	Logan Gunthorpe <logang@deltatee.com>,
	linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-xfs@vger.kernel.org, linux-nfs@vger.kernel.org,
	linux-mm@kvack.org, LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 5/6] NFS: direct-io: convert to FOLL_PIN pages
Date: Thu, 1 Sep 2022 01:38:21 +0100	[thread overview]
Message-ID: <Yw/+/U9GFaNnARdk@ZenIV> (raw)
In-Reply-To: <20220831094349.boln4jjajkdtykx3@quack3>

On Wed, Aug 31, 2022 at 11:43:49AM +0200, Jan Kara wrote:

> So after looking into that a bit more, I think a clean approach would be to
> provide iov_iter_pin_pages2() and iov_iter_pages_alloc2(), under the hood
> in __iov_iter_get_pages_alloc() make sure we use pin_user_page() instead of
> get_page() in all the cases (using this in pipe_get_pages() and
> iter_xarray_get_pages() is easy) and then make all bio handling use the
> pinning variants for iters. I think at least iov_iter_is_pipe() case needs
> to be handled as well because as I wrote above, pipe pages can enter direct
> IO code e.g. for splice(2).
> 
> Also I think that all iov_iter_get_pages2() (or the _alloc2 variant) users
> actually do want the "pin page" semantics in the end (they are accessing
> page contents) so eventually we should convert them all to
> iov_iter_pin_pages2() and remove iov_iter_get_pages2() altogether. But this
> will take some more conversion work with networking etc. so I'd start with
> converting bios only.

Not sure, TBH...

FWIW, quite a few of the callers of iov_iter_get_pages2() do *NOT* need to
grab any references for BVEC/XARRAY/PIPE cases.  What's more, it would be
bloody useful to have a variant that doesn't grab references for
!iter->user_backed case - that could be usable for KVEC as well, simplifying
several callers.

Requirements:
	* recepients of those struct page * should have a way to make
dropping the page refs conditional (obviously); bio machinery can be told
to do so.
	* callers should *NOT* do something like
	"set an ITER_BVEC iter, with page references grabbed and stashed in
bio_vec array, call async read_iter() and drop the references in array - the
refs we grab in dio will serve"
Note that for sync IO that pattern is fine whether we grab/drop anything
inside read_iter(); for async we could take depopulating the bio_vec
array to the IO completion or downstream of that.
	* the code dealing with the references returned by iov_iter_..._pages
should *NOT* play silly buggers with refcounts - something like "I'll grab
a reference, start DMA and report success; page will stay around until I
get around to dropping the ref and callers don't need to wait for that" deep
in the bowels of infinibad stack (or something equally tasteful) is seriously
asking for trouble.

Future plans from the last cycle included iov_iter_find_pages{,_alloc}() that
would *not* grab references on anything other than IOVEC and UBUF (would advance
the iterator, same as iov_iter_get_pages2(), though). Then iov_iter_get_...()
would become a wrapper for that.  After that - look into switching the users
of ..._get_... to ..._find_....   Hadn't done much in that direction yet,
though - need to redo the analysis first.

That primitive might very well do FOLL_PIN instead of FOLL_GET for IOVEC and
UBUF...

  parent reply	other threads:[~2022-09-01  0:38 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-27  8:36 [PATCH 0/6] convert most filesystems to pin_user_pages_fast() John Hubbard
2022-08-27  8:36 ` [PATCH 1/6] mm/gup: introduce pin_user_page() John Hubbard
2022-08-29 12:07   ` David Hildenbrand
2022-08-29 19:33     ` John Hubbard
2022-08-30 12:17       ` David Hildenbrand
2022-08-30 21:42         ` John Hubbard
2022-08-31  0:06         ` John Hubbard
2022-08-27  8:36 ` [PATCH 2/6] block: add dio_w_*() wrappers for pin, unpin user pages John Hubbard
2022-08-27 22:27   ` Andrew Morton
2022-08-27 23:59     ` John Hubbard
2022-08-28  0:12       ` Andrew Morton
2022-08-28  0:31         ` John Hubbard
2022-08-28  1:07           ` John Hubbard
2022-08-27  8:36 ` [PATCH 3/6] iov_iter: new iov_iter_pin_pages*() routines John Hubbard
2022-08-27 22:46   ` Al Viro
2022-08-27 22:48     ` John Hubbard
2022-08-27  8:36 ` [PATCH 4/6] block, bio, fs: convert most filesystems to pin_user_pages_fast() John Hubbard
2022-08-27  8:36 ` [PATCH 5/6] NFS: direct-io: convert to FOLL_PIN pages John Hubbard
2022-08-27 22:48   ` Al Viro
2022-08-27 23:55     ` John Hubbard
2022-08-28  0:38       ` Al Viro
2022-08-28  0:39         ` Al Viro
2022-08-28  0:46           ` John Hubbard
2022-08-29  4:59           ` John Hubbard
2022-08-29 16:08             ` Jan Kara
2022-08-29 19:59               ` John Hubbard
2022-08-31  9:43                 ` Jan Kara
2022-08-31 18:02                   ` John Hubbard
2022-09-01  0:38                   ` Al Viro [this message]
2022-09-01  9:06                     ` Jan Kara
2022-08-27  8:36 ` [PATCH 6/6] fuse: convert direct IO paths to use FOLL_PIN John Hubbard
  -- strict thread matches above, loose matches on Subject: below --
2022-02-27  9:34 [PATCH 0/6] block, fs: convert most Direct IO cases to FOLL_PIN jhubbard.send.patches
2022-02-27  9:34 ` [PATCH 5/6] NFS: direct-io: convert to FOLL_PIN pages jhubbard.send.patches

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Yw/+/U9GFaNnARdk@ZenIV \
    --to=viro@zeniv.linux.org.uk \
    --cc=akpm@linux-foundation.org \
    --cc=anna@kernel.org \
    --cc=axboe@kernel.dk \
    --cc=djwong@kernel.org \
    --cc=hch@infradead.org \
    --cc=jack@suse.cz \
    --cc=jhubbard@nvidia.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=logang@deltatee.com \
    --cc=miklos@szeredi.hu \
    --cc=trond.myklebust@hammerspace.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.