* [PATCH v13 00/12] iov_iter: Improve page extraction (pin or just list)
@ 2023-02-09 10:29 David Howells
From: David Howells @ 2023-02-09 10:29 UTC
  To: Jens Axboe, Al Viro, Christoph Hellwig
  Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton,
	David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, linux-fsdevel, linux-block, linux-kernel, linux-mm

Hi Jens, Al, Christoph,

Here are patches to provide support for extracting pages from an iov_iter
and to use this in the extraction functions in the block layer bio code.

The patches make the following changes:

 (1) Change generic_file_splice_read() to no longer use ITER_PIPE for a
     read from an O_DIRECT file fd, but rather to load up an ITER_BVEC
     iterator with sufficient pages and use that instead.  This avoids a
     problem[2] where __iomap_dio_rw() calls iov_iter_revert() to shorten
     the iterator after racing with truncation.  The reversion causes the
     pipe iterator to release the pages it was retaining prematurely,
     while the read is still in progress, which caused memory corruption.

 (2) Change generic_file_splice_read() to no longer use ITER_PIPE for a
     read from a buffered file fd, but rather to get pages directly from
     the pagecache, using filemap_get_pages() to do all the readahead,
     reading, waiting and extraction, and then feed the pages directly
     into the pipe.

 (3) Alter filemap_get_pages() so that it doesn't take an iterator (which
     we don't have in (2)); instead, the count and a flag indicating
     whether we can handle partially uptodate pages are passed in and on
     down to its subsidiary functions.

 (4) Remove ITER_PIPE and its paraphernalia as generic_file_splice_read()
     was the only user.

 (5) Add a function, iov_iter_extract_pages(), to replace
     iov_iter_get_pages*().  It gets refs on, pins or just lists the pages
     as appropriate to the iterator type.

     Add a function, iov_iter_extract_will_pin(), that indicates from the
     iterator type how the cleanup is to be performed, returning true if
     the pages will need unpinning, false otherwise.

 (6) Make the bio struct carry a pair of flags to indicate the cleanup
     mode.  BIO_NO_PAGE_REF is replaced with BIO_PAGE_REFFED (indicating
     that FOLL_GET was used), and BIO_PAGE_PINNED (indicating that
     FOLL_PIN was used) is added.

     BIO_PAGE_REFFED will go away, but at the moment fs/direct-io.c sets it
     and this series does not fully address that file.

 (7) Add a function, bio_release_page(), to release a page appropriately to
     the cleanup mode indicated by the BIO_PAGE_* flags.

 (8) Make the iter-to-bio code use iov_iter_extract_pages() to retain the
     pages appropriately and clean them up later.

 (9) Fix bio_flagged() so that it doesn't prevent a gcc optimisation.
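
To make the cleanup rules in (5) to (7) concrete, here is a minimal
userspace sketch.  It is a model under stated assumptions, not the kernel
code: every model_* and MODEL_* name is made up for illustration, the
counters in struct model_page stand in for real pin and ref counts, and the
rule that only user-backed iterators (UBUF/IOVEC) get pinned is an
assumption about how iov_iter_extract_will_pin() decides.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins only; none of these are kernel types. */
enum model_iter_type {
	MODEL_ITER_UBUF,
	MODEL_ITER_IOVEC,
	MODEL_ITER_BVEC,
	MODEL_ITER_KVEC,
	MODEL_ITER_XARRAY,
};

#define MODEL_BIO_PAGE_PINNED	(1U << 0)	/* pages were pinned */
#define MODEL_BIO_PAGE_REFFED	(1U << 1)	/* pages had a ref taken */

struct model_page { int pins; int refs; };
struct model_bio { unsigned int flags; };

/*
 * Models iov_iter_extract_will_pin().  Assumption: only iterators that
 * describe userspace memory need their pages unpinned afterwards.
 */
bool model_extract_will_pin(enum model_iter_type type)
{
	return type == MODEL_ITER_UBUF || type == MODEL_ITER_IOVEC;
}

/* Record the cleanup mode on the bio at the point of extraction. */
void model_bio_note_extraction(struct model_bio *bio,
			       enum model_iter_type type)
{
	if (model_extract_will_pin(type))
		bio->flags |= MODEL_BIO_PAGE_PINNED;
}

/* Models bio_release_page(): undo whatever the flags say was done. */
void model_bio_release_page(const struct model_bio *bio,
			    struct model_page *page)
{
	if (bio->flags & MODEL_BIO_PAGE_PINNED) {
		assert(page->pins > 0);
		page->pins--;
	} else if (bio->flags & MODEL_BIO_PAGE_REFFED) {
		assert(page->refs > 0);
		page->refs--;
	}
	/* Neither flag set: the page was only listed, nothing to undo. */
}
```

The point of the model is that the decision is made once, from the iterator
type, and carried in the bio flags, so the release path never has to look
at the iterator again.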

I've pushed the patches here also:

	https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=iov-extract

David

Changes:
========
ver #13)
 - Only use allocation in advance and ITER_BVEC for DIO read-splice.
 - Make buffered read-splice get pages directly from the pagecache.
 - Alter filemap_get_pages() & co. so that it doesn't need an iterator.

ver #12)
 - Added the missing __bitwise on the iov_iter_extraction_t typedef.
 - Rebased on -rc7.
 - Don't specify FOLL_PIN to pin_user_pages_fast().
 - Inserted patch at front to fix race between DIO read and truncation that
   caused memory corruption when iov_iter_revert() got called on an
   ITER_PIPE iterator[2].
 - Inserted a patch after that to remove the now-unused ITER_PIPE and its
   helper functions.
 - Removed the ITER_PIPE bits from iov_iter_extract_pages().

ver #11)
 - Fix iov_iter_extract_kvec_pages() to include the offset into the page in
   the returned starting offset.
 - Use __bitwise for the extraction flags.

ver #10)
 - Fix use of i->kvec in iov_iter_extract_bvec_pages() to be i->bvec.
 - Drop bio_set_cleanup_mode(), open coding it instead.

ver #9)
 - It's now not permitted to use FOLL_PIN outside of mm/, so:
 - Change iov_iter_extract_mode() into iov_iter_extract_will_pin() and
   return true/false instead of FOLL_PIN/0.
 - Drop folio_put_unpin() and page_put_unpin() and instead call
   unpin_user_page() (and put_page()) directly as necessary.
 - Make __bio_release_pages() call bio_release_page() instead of
   unpin_user_page() as there's no BIO_* -> FOLL_* translation to do.
 - Drop the FOLL_* renumbering patch.
 - Change extract_flags to extraction_flags.

ver #8)
 - Import Christoph Hellwig's changes.
   - Split the conversion-to-extraction patch.
   - Drop the extract_flags arg from iov_iter_extract_mode().
   - Don't default bios to BIO_PAGE_REFFED, but set explicitly.
 - Switch FOLL_PIN and FOLL_GET when renumbering so PIN is at bit 0.
 - Switch BIO_PAGE_PINNED and BIO_PAGE_REFFED so PINNED is at bit 0.
 - We should always be using FOLL_PIN (not FOLL_GET) for DIO, so adjust the
   patches for that.

ver #7)
 - For now, drop the parts to pass the I/O direction to iov_iter_*pages*()
   as it turned out to be a lot more complicated, with places not setting
   IOCB_WRITE when they should, for example.
 - Drop all the patches that changed things other than the block layer's
   bio handling.  The netfslib and cifs changes can go into a separate
   patchset.
 - Add support for extracting pages from KVEC-type iterators.
 - When extracting from BVEC/KVEC, skip over empty vecs at the front.
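
   As a rough illustration of the last point, together with the starting
   offset fix noted under ver #11, the following userspace sketch skips
   exhausted vectors and then works out where in its page the first byte
   lies.  All model_* names and the fixed 4096-byte page size are
   assumptions for the sake of the example; this is not the code in
   lib/iov_iter.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u	/* assumed page size for the model */

struct model_kvec {
	uintptr_t base;		/* start address of the segment */
	size_t len;		/* length of the segment in bytes */
};

struct model_iter {
	const struct model_kvec *vec;	/* current segment */
	size_t nr_segs;			/* segments left, including *vec */
	size_t iov_offset;		/* bytes consumed from *vec */
};

/*
 * Step over segments that have no bytes left.  Returns the number of
 * bytes available in the first usable segment, or 0 if none remain.
 */
size_t model_skip_empty(struct model_iter *it)
{
	while (it->nr_segs > 0) {
		assert(it->iov_offset <= it->vec->len);
		size_t left = it->vec->len - it->iov_offset;

		if (left > 0)
			return left;
		it->vec++;
		it->nr_segs--;
		it->iov_offset = 0;
	}
	return 0;
}

/*
 * Offset of the first unconsumed byte within its page.  Only valid
 * after model_skip_empty() has returned non-zero.
 */
size_t model_first_page_offset(const struct model_iter *it)
{
	return (it->vec->base + it->iov_offset) % MODEL_PAGE_SIZE;
}
```

   Without the skip, an empty leading vector would make the extraction
   report zero pages even though later vectors still hold data.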

ver #6)
 - Fix write() syscall and co. not setting IOCB_WRITE.
 - Added iocb_is_read() and iocb_is_write() to check IOCB_WRITE.
 - Use op_is_write() in bio_copy_user_iov().
 - Drop the iterator direction checks from smbd_recv().
 - Define FOLL_SOURCE_BUF and FOLL_DEST_BUF and pass them in as part of
   gup_flags to iov_iter_get/extract_pages*().
 - Replace iov_iter_get_pages*2() with iov_iter_get_pages*() and remove.
 - Add back the function to indicate the cleanup mode.
 - Drop the cleanup_mode return arg to iov_iter_extract_pages().
 - Provide a helper to clean up a page.
 - Renumbered FOLL_GET and FOLL_PIN and made BIO_PAGE_REFFED/PINNED have
   the same numerical values, enforced with an assertion.
 - Converted AF_ALG, SCSI vhost, generic DIO, FUSE, splice to pipe, 9P and
   NFS.
 - Added in the patches to make CIFS do top-to-bottom iterators and use
   various of the added extraction functions.
 - Added a pair of work-in-progress patches to make sk_buff fragments store
   FOLL_GET and FOLL_PIN.

ver #5)
 - Replace BIO_NO_PAGE_REF with BIO_PAGE_REFFED and split into own patch.
 - Transcribe FOLL_GET/PIN into BIO_PAGE_REFFED/PINNED flags.
 - Add patch to allow bio_flagged() to be combined by gcc.
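
   The cover letter does not spell out what gets in gcc's way.  The sketch
   below assumes the obstacle is an explicit comparison of the masked value
   against zero inside the helper, which is redundant when the return type
   is bool.  The model_* names are illustrative and the struct is a
   userspace stand-in, so treat this as one plausible shape of the change
   rather than the patch itself.

```c
#include <assert.h>
#include <stdbool.h>

struct model_bio { unsigned short flags; };

/* Bit numbers, as opposed to masks. */
enum { MODEL_BIO_PAGE_PINNED, MODEL_BIO_PAGE_REFFED };

/* Assumed older shape: explicit comparison against zero. */
bool model_flagged_cmp(const struct model_bio *bio, unsigned int bit)
{
	return (bio->flags & (1U << bit)) != 0;
}

/* Assumed newer shape: let the conversion to bool do the work. */
bool model_flagged_plain(const struct model_bio *bio, unsigned int bit)
{
	return bio->flags & (1U << bit);
}

/*
 * Two tests that a compiler could merge into a single mask-and-test
 * when the helper is inlined.
 */
bool model_needs_cleanup(const struct model_bio *bio)
{
	return model_flagged_plain(bio, MODEL_BIO_PAGE_REFFED) ||
	       model_flagged_plain(bio, MODEL_BIO_PAGE_PINNED);
}
```

   Both helpers return the same values; any difference would only show up
   in the generated code, which depends on the compiler and its version.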

ver #4)
 - Drop the patch to move the FOLL_* flags to linux/mm_types.h as they're
   no longer referenced by linux/uio.h.
 - Add ITER_SOURCE/DEST cleanup patches.
 - Make iov_iter/netfslib iter extraction patches use ITER_SOURCE/DEST.
 - Allow additional gup_flags to be passed into iov_iter_extract_pages().
 - Add struct bio patch.

ver #3)
 - Switch to using EXPORT_SYMBOL_GPL to prevent indirect 3rd-party access
   to get/pin_user_pages_fast()[1].

ver #2)
 - Rolled the extraction cleanup mode query function into the extraction
   function, returning the indication through the argument list.
 - Fixed patch 4 (extract to scatterlist) to actually use the new
   extraction API.

Link: https://lore.kernel.org/r/Y3zFzdWnWlEJ8X8/@infradead.org/ [1]
Link: https://lore.kernel.org/r/000000000000b0b3c005f3a09383@google.com/ [2]
Link: https://lore.kernel.org/r/166697254399.61150.1256557652599252121.stgit@warthog.procyon.org.uk/ # rfc
Link: https://lore.kernel.org/r/166722777223.2555743.162508599131141451.stgit@warthog.procyon.org.uk/ # rfc
Link: https://lore.kernel.org/r/166732024173.3186319.18204305072070871546.stgit@warthog.procyon.org.uk/ # rfc
Link: https://lore.kernel.org/r/166869687556.3723671.10061142538708346995.stgit@warthog.procyon.org.uk/ # rfc
Link: https://lore.kernel.org/r/166920902005.1461876.2786264600108839814.stgit@warthog.procyon.org.uk/ # v2
Link: https://lore.kernel.org/r/166997419665.9475.15014699817597102032.stgit@warthog.procyon.org.uk/ # v3
Link: https://lore.kernel.org/r/167305160937.1521586.133299343565358971.stgit@warthog.procyon.org.uk/ # v4
Link: https://lore.kernel.org/r/167344725490.2425628.13771289553670112965.stgit@warthog.procyon.org.uk/ # v5
Link: https://lore.kernel.org/r/167391047703.2311931.8115712773222260073.stgit@warthog.procyon.org.uk/ # v6
Link: https://lore.kernel.org/r/20230120175556.3556978-1-dhowells@redhat.com/ # v7
Link: https://lore.kernel.org/r/20230123173007.325544-1-dhowells@redhat.com/ # v8
Link: https://lore.kernel.org/r/20230124170108.1070389-1-dhowells@redhat.com/ # v9
Link: https://lore.kernel.org/r/20230125210657.2335748-1-dhowells@redhat.com/ # v10
Link: https://lore.kernel.org/r/20230126141626.2809643-1-dhowells@redhat.com/ # v11
Link: https://lore.kernel.org/r/20230207171305.3716974-1-dhowells@redhat.com/ # v12

Christoph Hellwig (1):
  block: Replace BIO_NO_PAGE_REF with BIO_PAGE_REFFED with inverted
    logic

David Howells (11):
  splice: Fix O_DIRECT file read splice to avoid reversion of ITER_PIPE
  mm: Pass info, not iter, into filemap_get_pages() and unstatic it
  splice: Do splice read from a buffered file without using ITER_PIPE
  iov_iter: Kill ITER_PIPE
  iov_iter: Define flags to qualify page extraction.
  iov_iter: Add a function to extract a page list from an iterator
  iomap: Don't get an reference on ZERO_PAGE for direct I/O block
    zeroing
  block: Fix bio_flagged() so that gcc can better optimise it
  block: Add BIO_PAGE_PINNED and associated infrastructure
  block: Convert bio_iov_iter_get_pages to use iov_iter_extract_pages
  block: convert bio_map_user_iov to use iov_iter_extract_pages

 block/bio.c               |  33 +-
 block/blk-map.c           |  26 +-
 block/blk.h               |  12 +
 fs/cifs/file.c            |   8 +-
 fs/direct-io.c            |   2 +
 fs/iomap/direct-io.c      |   1 -
 fs/splice.c               | 245 ++++++++++++-
 include/linux/bio.h       |   5 +-
 include/linux/blk_types.h |   3 +-
 include/linux/pagemap.h   |   2 +
 include/linux/uio.h       |  49 ++-
 lib/iov_iter.c            | 713 +++++++++++++++-----------------------
 mm/filemap.c              |  30 +-
 13 files changed, 603 insertions(+), 526 deletions(-)




Thread overview: 35+ messages
2023-02-09 10:29 [PATCH v13 00/12] iov_iter: Improve page extraction (pin or just list) David Howells
2023-02-09 10:29 ` [PATCH v13 01/12] splice: Fix O_DIRECT file read splice to avoid reversion of ITER_PIPE David Howells
2023-02-09 14:53   ` Matthew Wilcox
2023-02-09 15:06   ` David Howells
2023-02-09 16:15   ` [PATCH v14 " David Howells
2023-02-13  8:22     ` Christoph Hellwig
2023-02-15 13:17     ` David Howells
2023-02-15 14:24       ` Christoph Hellwig
2023-02-15 15:56       ` David Howells
2023-02-15 13:42     ` [PATCH] splice: Clean up direct_splice_read() a bit David Howells
2023-02-15 13:47     ` David Howells
2023-02-09 10:29 ` [PATCH v13 02/12] mm: Pass info, not iter, into filemap_get_pages() and unstatic it David Howells
2023-02-13  8:22   ` Christoph Hellwig
2023-02-09 10:29 ` [PATCH v13 03/12] splice: Do splice read from a buffered file without using ITER_PIPE David Howells
2023-02-13  8:28   ` Christoph Hellwig
2023-02-13 10:11   ` David Howells
2023-02-13 10:18     ` Christoph Hellwig
2023-02-13 11:15     ` David Howells
2023-02-13 14:44       ` Christoph Hellwig
2023-02-13 18:06   ` Guenter Roeck
2023-02-13 22:43   ` David Howells
2023-02-13 22:51     ` Guenter Roeck
2023-02-13 23:12     ` David Howells
2023-02-13 23:25       ` Guenter Roeck
2023-02-09 10:29 ` [PATCH v13 04/12] iov_iter: Kill ITER_PIPE David Howells
2023-02-13  8:28   ` Christoph Hellwig
2023-02-09 10:29 ` [PATCH v13 05/12] iov_iter: Define flags to qualify page extraction David Howells
2023-02-09 10:29 ` [PATCH v13 06/12] iov_iter: Add a function to extract a page list from an iterator David Howells
2023-02-09 10:29 ` [PATCH v13 07/12] iomap: Don't get an reference on ZERO_PAGE for direct I/O block zeroing David Howells
2023-02-09 10:29 ` [PATCH v13 08/12] block: Fix bio_flagged() so that gcc can better optimise it David Howells
2023-02-09 10:29 ` [PATCH v13 09/12] block: Replace BIO_NO_PAGE_REF with BIO_PAGE_REFFED with inverted logic David Howells
2023-02-09 10:29 ` [PATCH v13 10/12] block: Add BIO_PAGE_PINNED and associated infrastructure David Howells
2023-02-09 10:29 ` [PATCH v13 11/12] block: Convert bio_iov_iter_get_pages to use iov_iter_extract_pages David Howells
2023-02-09 10:29 ` [PATCH v13 12/12] block: convert bio_map_user_iov " David Howells
2023-02-10 22:31 ` [PATCH v13 00/12] iov_iter: Improve page extraction (pin or just list) Jens Axboe
