From: David Hildenbrand <david@redhat.com>
To: David Howells <dhowells@redhat.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Christoph Hellwig <hch@infradead.org>
Cc: Matthew Wilcox <willy@infradead.org>,
	Jens Axboe <axboe@kernel.dk>, Jan Kara <jack@suse.cz>,
	Jeff Layton <jlayton@kernel.org>,
	Logan Gunthorpe <logang@deltatee.com>,
	linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org,
	linux-kernel@vger.kernel.org, Christoph Hellwig <hch@lst.de>,
	John Hubbard <jhubbard@nvidia.com>,
	linux-mm@kvack.org
Subject: Re: [PATCH v7 2/8] iov_iter: Add a function to extract a page list from an iterator
Date: Mon, 23 Jan 2023 12:28:40 +0100
Message-ID: <246ba813-698b-8696-7f4d-400034a3380b@redhat.com>
In-Reply-To: <20230120175556.3556978-3-dhowells@redhat.com>

On 20.01.23 18:55, David Howells wrote:
> Add a function, iov_iter_extract_pages(), to extract a list of pages from
> an iterator.  The pages may be returned with a reference added or a pin
> added or neither, depending on the type of iterator and the direction of
> transfer.  The caller must pass FOLL_READ_FROM_MEM or FOLL_WRITE_TO_MEM
> as part of gup_flags to indicate how the iterator contents are to be used.
> 
> Add a second function, iov_iter_extract_mode(), to determine how the
> cleanup should be done.
> 
> There are three cases:
> 
>   (1) Transfer *into* an ITER_IOVEC or ITER_UBUF iterator.
> 
>       Extracted pages will have pins obtained on them (but not references)
>       so that fork() doesn't CoW the pages incorrectly whilst the I/O is in
>       progress.
> 
>       iov_iter_extract_mode() will return FOLL_PIN for this case.  The
>       caller should use something like unpin_user_page() to dispose of the
>       page.
> 
>   (2) Transfer is *out of* an ITER_IOVEC or ITER_UBUF iterator.
> 
>       Extracted pages will have references obtained on them, but not pins.
> 
>       iov_iter_extract_mode() will return FOLL_GET.  The caller should use
>       something like put_page() for page disposal.
> 
>   (3) Any other sort of iterator.
> 
>       No refs or pins are obtained on the page; the assumption is made that
>       the caller will manage page retention.  ITER_ALLOW_P2PDMA is not
>       permitted.
> 
>       iov_iter_extract_mode() will return 0.  The pages don't need
>       additional disposal.
> 
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Al Viro <viro@zeniv.linux.org.uk>
> cc: Christoph Hellwig <hch@lst.de>
> cc: John Hubbard <jhubbard@nvidia.com>
> cc: Matthew Wilcox <willy@infradead.org>
> cc: linux-fsdevel@vger.kernel.org
> cc: linux-mm@kvack.org
> Link: https://lore.kernel.org/r/166920903885.1461876.692029808682876184.stgit@warthog.procyon.org.uk/ # v2
> Link: https://lore.kernel.org/r/166997421646.9475.14837976344157464997.stgit@warthog.procyon.org.uk/ # v3
> Link: https://lore.kernel.org/r/167305163883.1521586.10777155475378874823.stgit@warthog.procyon.org.uk/ # v4
> Link: https://lore.kernel.org/r/167344728530.2425628.9613910866466387722.stgit@warthog.procyon.org.uk/ # v5
> Link: https://lore.kernel.org/r/167391053207.2311931.16398133457201442907.stgit@warthog.procyon.org.uk/ # v6
> ---
> 
> Notes:
>      ver #7)
>       - Switch to passing in iter-specific flags rather than FOLL_* flags.
>       - Drop the direction flags for now.
>       - Use ITER_ALLOW_P2PDMA to request FOLL_PCI_P2PDMA.
>       - Disallow use of ITER_ALLOW_P2PDMA with non-user-backed iter.
>       - Add support for extraction from KVEC-type iters.
>       - Use iov_iter_advance() rather than open-coding it.
>       - Make BVEC- and KVEC-type skip over initial empty vectors.
>      
>      ver #6)
>       - Add back the function to indicate the cleanup mode.
>       - Drop the cleanup_mode return arg to iov_iter_extract_pages().
>       - Pass FOLL_SOURCE/DEST_BUF in gup_flags.  Check this against the iter
>         data_source.
>      
>      ver #4)
>       - Use ITER_SOURCE/DEST instead of WRITE/READ.
>       - Allow additional FOLL_* flags, such as FOLL_PCI_P2PDMA to be passed in.
>      
>      ver #3)
>       - Switch to using EXPORT_SYMBOL_GPL to prevent indirect 3rd-party access
>         to get/pin_user_pages_fast()[1].
> 
>   include/linux/uio.h |  28 +++
>   lib/iov_iter.c      | 424 ++++++++++++++++++++++++++++++++++++++++++++
>   2 files changed, 452 insertions(+)
> 
> diff --git a/include/linux/uio.h b/include/linux/uio.h
> index 46d5080314c6..a4233049ab7a 100644
> --- a/include/linux/uio.h
> +++ b/include/linux/uio.h
> @@ -363,4 +363,32 @@ static inline void iov_iter_ubuf(struct iov_iter *i, unsigned int direction,
>   /* Flags for iov_iter_get/extract_pages*() */
>   #define ITER_ALLOW_P2PDMA	0x01	/* Allow P2PDMA on the extracted pages */
>   
> +ssize_t iov_iter_extract_pages(struct iov_iter *i, struct page ***pages,
> +			       size_t maxsize, unsigned int maxpages,
> +			       unsigned int extract_flags, size_t *offset0);
> +
> +/**
> + * iov_iter_extract_mode - Indicate how pages from the iterator will be retained
> + * @iter: The iterator
> + * @extract_flags: How the iterator is to be used
> + *
> + * Examine the iterator and @extract_flags and indicate by returning FOLL_PIN,
> + * FOLL_GET or 0 as to how, if at all, pages extracted from the iterator will
> + * be retained by the extraction function.
> + *
> + * FOLL_GET indicates that the pages will have a reference taken on them that
> + * the caller must put.  This can be done for DMA/async DIO write from a page.
> + *
> + * FOLL_PIN indicates that the pages will have a pin placed on them that the
> + * caller must unpin.  This must be done for DMA/async DIO read to a page to
> + * avoid CoW problems in fork.
> + *
> + * 0 indicates that no measures are taken and that it's up to the caller to
> + * retain the pages.
> + */
> +#define iov_iter_extract_mode(iter, extract_flags) \
> +	(user_backed_iter(iter) ?				\
> +	 (iter->data_source == ITER_SOURCE) ?			\
> +	 FOLL_GET : FOLL_PIN : 0)
> +
>

How does this work align with the goal of no longer using FOLL_GET for 
O_DIRECT? We should get rid of any FOLL_GET usage for accessing page 
content.
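
To make the concern concrete, here is a rough userspace model of the
three-way decision the quoted iov_iter_extract_mode() macro makes, and of
the cleanup the patch description pairs with each result. This is only a
sketch: the MODEL_* constants, struct model_iter and both helper functions
are hypothetical stand-ins, not kernel definitions, and the numeric values
are arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in values for illustration only; NOT the kernel's definitions. */
enum { MODEL_ITER_DEST = 0, MODEL_ITER_SOURCE = 1 };
enum { MODEL_FOLL_GET = 1, MODEL_FOLL_PIN = 2 };

/* Hypothetical, reduced model of the iterator state the macro inspects. */
struct model_iter {
	bool user_backed;	/* models ITER_IOVEC or ITER_UBUF */
	int data_source;	/* MODEL_ITER_SOURCE or MODEL_ITER_DEST */
};

/* Mirrors the three-way decision of the quoted macro. */
static int model_extract_mode(const struct model_iter *iter)
{
	assert(iter != NULL);

	if (!iter->user_backed)
		return 0;	/* case (3): caller manages retention */

	return iter->data_source == MODEL_ITER_SOURCE ?
		MODEL_FOLL_GET :	/* case (2): transfer out of user memory */
		MODEL_FOLL_PIN;		/* case (1): transfer into user memory */
}

/* Names the disposal call the patch description suggests for each mode. */
static const char *model_cleanup(int mode)
{
	switch (mode) {
	case MODEL_FOLL_PIN:
		return "unpin_user_page";
	case MODEL_FOLL_GET:
		return "put_page";
	default:
		return "none";
	}
}
```

In this model the FOLL_GET result is reached only for a user-backed
iterator used as the data source, which is the branch the question above
is about.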

@John, any comments?

-- 
Thanks,

David / dhildenb

