linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v21 00/30] splice: Kill ITER_PIPE
@ 2023-05-20  0:00 David Howells
  2023-05-20  0:00 ` [PATCH v21 01/30] splice: Fix filemap of a blockdev David Howells
                   ` (29 more replies)
  0 siblings, 30 replies; 72+ messages in thread
From: David Howells @ 2023-05-20  0:00 UTC (permalink / raw)
  To: Jens Axboe, Al Viro, Christoph Hellwig
  Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton,
	David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm

Hi Jens, Al, Christoph,

I've split off splice patchset and moved the block patches to a separate
branch (though they are dependent on this one).

This patchset kills off ITER_PIPE to avoid a race between truncate,
iov_iter_revert() on the pipe and an as-yet incomplete DMA to a bio with
unpinned/unref'ed pages from an O_DIRECT splice read.  This causes memory
corruption[2].  Instead, we use filemap_splice_read(), which invokes the
buffered file reading code and splices from the pagecache into the pipe;
copy_splice_read(), which bulk-allocates a buffer, reads into it and then
pushes the filled pages into the pipe; or handle it in filesystem-specific
code.

 (1) Rename direct_splice_read() to copy_splice_read().

 (2) Simplify the calculations for the number of pages to be reclaimed in
     copy_splice_read().

 (3) Turn do_splice_to() into a helper, vfs_splice_read(), so that it can
     be used by overlayfs and coda to perform the checks on the lower fs.

 (4) Make vfs_splice_read() jump to copy_splice_read() to handle direct-I/O
     and DAX.

 (5) Provide shmem with its own splice_read to handle non-existent pages
     in the pagecache.  We don't want a ->read_folio() as we don't want to
     populate holes, but filemap_get_pages() requires it.

 (6) Provide overlayfs with its own splice_read to call down to a lower
     layer as overlayfs doesn't provide ->read_folio().

 (7) Provide coda with its own splice_read to call down to a lower layer as
     coda doesn't provide ->read_folio().

 (8) Direct ->splice_read to copy_splice_read() in tty, procfs, kernfs
     and random files as they just copy to the output buffer and don't
     splice pages.

 (9) Provide stubs for afs, ceph, ecryptfs, ext4, f2fs, nfs, ntfs3, ocfs2,
     orangefs, xfs and zonefs to do locking and/or revalidation.

(10) Make cifs use filemap_splice_read().

(11) Replace pointers to generic_file_splice_read() with pointers to
     filemap_splice_read() as DIO and DAX are handled in the caller;
     filesystems can still provide their own alternate ->splice_read() op.

(12) Remove generic_file_splice_read().

(13) Remove ITER_PIPE and its paraphernalia as generic_file_splice_read()
     was the only user.

I've pushed the patches here also:

	https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=kill-iter-pipe

David

Changes:
========
ver #21)
 - Split off the block-layer changes into a separate branch.
 - Check for zero len in vfs_splice_read().
 - Check s_maxbytes in filemap_splice_read().
 - Rename direct_splice_read() to copy_splice_read().
 - The direct I/O and DAX handling needs to be in vfs_splice_read(), not
   generic_file_splice_read(), before ->splice_read() is called.
 - Don't need #ifdef CONFIG_FS_DAX as IS_DAX() is false if !CONFIG_FS_DAX.
 - Replace pointers to generic_file_splice_read() to filemap_splice_read().
 - Remove generic_file_splice_read().
 - In ceph, drop the caps ref.
 - In NFS, Fix pos -> ppos in dprintk().

ver #20)
 - Make direct_splice_read() limit the read to eof for regular files and
   blockdevs.
 - Check against s_maxbytes on the backing store, not a devnode inode.
 - Provide stubs for afs, ceph, ecryptfs, ext4, f2fs, nfs, ntfs3, ocfs2,
   orangefs, xfs and zonefs.
 - Always use direct_splice_read() for 9p, trace and sockets.

ver #19)
 - Remove a missed get_page() on the zeropage in shmem_splice_read().

ver #18)
 - Split out the cifs bits from the patch the switches
   generic_file_splice_read() over to using the non-ITER_PIPE splicing.
 - Don't get/put refs on the zeropage in shmem_splice_read().

ver #17)
 - Rename do_splice_to() to vfs_splice_read() and export it so that it can
   be a helper and make overlayfs and coda use it, allowing duplicate
   checks to be removed.

ver #16)
 - The filemap_get_pages() changes are now upstream.
 - filemap_splice_read() and direct_splice_read() are now upstream.
 - iov_iter_extract_pages() is now upstream.

ver #15)
 - Fixed up some errors in overlayfs_splice_read().

ver #14)
 - Some changes to generic_file_buffered_splice_read():
   - Rename to filemap_splice_read() and move to mm/filemap.c.
   - Create a helper, pipe_head_buf().
   - Use init_sync_kiocb().
 - Some changes to generic_file_direct_splice_read():
   - Use alloc_pages_bulk_array() rather than alloc_pages_bulk_list().
   - Use release_pages() instead of __free_page() in a loop.
   - Rename to direct_splice_read().
 - Rearrange the patches to implement filemap_splice_read() and
   direct_splice_read() separately to changing generic_file_splice_read().
 - Don't call generic_file_splice_read() when there isn't a ->read_folio().
 - Insert patches to fix read_folio-less cases:
   - Make tty, procfs, kernfs and (u)random use direct_splice_read().
   - Make overlayfs and coda call down to a lower layer.
   - Give shmem its own splice-read that doesn't insert missing pages.
 - Fixed a min() with mixed type args on some arches.

ver #13)
 - Only use allocation in advance and ITER_BVEC for DIO read-splice.
 - Make buffered read-splice get pages directly from the pagecache.
 - Alter filemap_get_pages() & co. so that it doesn't need an iterator.

ver #12)
 - Added the missing __bitwise on the iov_iter_extraction_t typedef.
 - Rebased on -rc7.
 - Don't specify FOLL_PIN to pin_user_pages_fast().
 - Inserted patch at front to fix race between DIO read and truncation that
   caused memory corruption when iov_iter_revert() got called on an
   ITER_PIPE iterator[2].
 - Inserted a patch after that to remove the now-unused ITER_PIPE and its
   helper functions.
 - Removed the ITER_PIPE bits from iov_iter_extract_pages().

ver #11)
 - Fix iov_iter_extract_kvec_pages() to include the offset into the page in
   the returned starting offset.
 - Use __bitwise for the extraction flags

ver #10)
 - Fix use of i->kvec in iov_iter_extract_bvec_pages() to be i->bvec.
 - Drop bio_set_cleanup_mode(), open coding it instead.

ver #9)
 - It's now not permitted to use FOLL_PIN outside of mm/, so:
 - Change iov_iter_extract_mode() into iov_iter_extract_will_pin() and
   return true/false instead of FOLL_PIN/0.
 - Drop of folio_put_unpin() and page_put_unpin() and instead call
   unpin_user_page() (and put_page()) directly as necessary.
 - Make __bio_release_pages() call bio_release_page() instead of
   unpin_user_page() as there's no BIO_* -> FOLL_* translation to do.
 - Drop the FOLL_* renumbering patch.
 - Change extract_flags to extraction_flags.

ver #8)
 - Import Christoph Hellwig's changes.
   - Split the conversion-to-extraction patch.
   - Drop the extract_flags arg from iov_iter_extract_mode().
   - Don't default bios to BIO_PAGE_REFFED, but set explicitly.
 - Switch FOLL_PIN and FOLL_GET when renumbering so PIN is at bit 0.
 - Switch BIO_PAGE_PINNED and BIO_PAGE_REFFED so PINNED is at bit 0.
 - We should always be using FOLL_PIN (not FOLL_GET) for DIO, so adjust the
   patches for that.

ver #7)
 - For now, drop the parts to pass the I/O direction to iov_iter_*pages*()
   as it turned out to be a lot more complicated, with places not setting
   IOCB_WRITE when they should, for example.
 - Drop all the patches that changed things other then the block layer's
   bio handling.  The netfslib and cifs changes can go into a separate
   patchset.
 - Add support for extracting pages from KVEC-type iterators.
 - When extracting from BVEC/KVEC, skip over empty vecs at the front.

ver #6)
 - Fix write() syscall and co. not setting IOCB_WRITE.
 - Added iocb_is_read() and iocb_is_write() to check IOCB_WRITE.
 - Use op_is_write() in bio_copy_user_iov().
 - Drop the iterator direction checks from smbd_recv().
 - Define FOLL_SOURCE_BUF and FOLL_DEST_BUF and pass them in as part of
   gup_flags to iov_iter_get/extract_pages*().
 - Replace iov_iter_get_pages*2() with iov_iter_get_pages*() and remove.
 - Add back the function to indicate the cleanup mode.
 - Drop the cleanup_mode return arg to iov_iter_extract_pages().
 - Provide a helper to clean up a page.
 - Renumbered FOLL_GET and FOLL_PIN and made BIO_PAGE_REFFED/PINNED have
   the same numerical values, enforced with an assertion.
 - Converted AF_ALG, SCSI vhost, generic DIO, FUSE, splice to pipe, 9P and
   NFS.
 - Added in the patches to make CIFS do top-to-bottom iterators and use
   various of the added extraction functions.
 - Added a pair of work-in-progess patches to make sk_buff fragments store
   FOLL_GET and FOLL_PIN.

ver #5)
 - Replace BIO_NO_PAGE_REF with BIO_PAGE_REFFED and split into own patch.
 - Transcribe FOLL_GET/PIN into BIO_PAGE_REFFED/PINNED flags.
 - Add patch to allow bio_flagged() to be combined by gcc.

ver #4)
 - Drop the patch to move the FOLL_* flags to linux/mm_types.h as they're
   no longer referenced by linux/uio.h.
 - Add ITER_SOURCE/DEST cleanup patches.
 - Make iov_iter/netfslib iter extraction patches use ITER_SOURCE/DEST.
 - Allow additional gup_flags to be passed into iov_iter_extract_pages().
 - Add struct bio patch.

ver #3)
 - Switch to using EXPORT_SYMBOL_GPL to prevent indirect 3rd-party access
   to get/pin_user_pages_fast()[1].

ver #2)
 - Rolled the extraction cleanup mode query function into the extraction
   function, returning the indication through the argument list.
 - Fixed patch 4 (extract to scatterlist) to actually use the new
   extraction API.

Link: https://lore.kernel.org/r/Y3zFzdWnWlEJ8X8/@infradead.org/ [1]
Link: https://lore.kernel.org/r/000000000000b0b3c005f3a09383@google.com/ [2]
Link: https://lore.kernel.org/r/166697254399.61150.1256557652599252121.stgit@warthog.procyon.org.uk/ # rfc
Link: https://lore.kernel.org/r/166722777223.2555743.162508599131141451.stgit@warthog.procyon.org.uk/ # rfc
Link: https://lore.kernel.org/r/166732024173.3186319.18204305072070871546.stgit@warthog.procyon.org.uk/ # rfc
Link: https://lore.kernel.org/r/166869687556.3723671.10061142538708346995.stgit@warthog.procyon.org.uk/ # rfc
Link: https://lore.kernel.org/r/166920902005.1461876.2786264600108839814.stgit@warthog.procyon.org.uk/ # v2
Link: https://lore.kernel.org/r/166997419665.9475.15014699817597102032.stgit@warthog.procyon.org.uk/ # v3
Link: https://lore.kernel.org/r/167305160937.1521586.133299343565358971.stgit@warthog.procyon.org.uk/ # v4
Link: https://lore.kernel.org/r/167344725490.2425628.13771289553670112965.stgit@warthog.procyon.org.uk/ # v5
Link: https://lore.kernel.org/r/167391047703.2311931.8115712773222260073.stgit@warthog.procyon.org.uk/ # v6
Link: https://lore.kernel.org/r/20230120175556.3556978-1-dhowells@redhat.com/ # v7
Link: https://lore.kernel.org/r/20230123173007.325544-1-dhowells@redhat.com/ # v8
Link: https://lore.kernel.org/r/20230124170108.1070389-1-dhowells@redhat.com/ # v9
Link: https://lore.kernel.org/r/20230125210657.2335748-1-dhowells@redhat.com/ # v10
Link: https://lore.kernel.org/r/20230126141626.2809643-1-dhowells@redhat.com/ # v11
Link: https://lore.kernel.org/r/20230207171305.3716974-1-dhowells@redhat.com/ # v12
Link: https://lore.kernel.org/r/20230209102954.528942-1-dhowells@redhat.com/ # v13
Link: https://lore.kernel.org/r/20230214171330.2722188-1-dhowells@redhat.com/ # v14
Link: https://lore.kernel.org/r/20230308143754.1976726-1-dhowells@redhat.com/ # v16
Link: https://lore.kernel.org/r/20230308165251.2078898-1-dhowells@redhat.com/ # v17
Link: https://lore.kernel.org/r/20230314220757.3827941-1-dhowells@redhat.com/ # v18
Link: https://lore.kernel.org/r/20230315163549.295454-1-dhowells@redhat.com/ # v19
Link: https://lore.kernel.org/r/20230519074047.1739879-1-dhowells@redhat.com/ #v20

Additional patches that got folded in:

Link: https://lore.kernel.org/r/20230213134619.2198965-1-dhowells@redhat.com/ # v1
Link: https://lore.kernel.org/r/20230213153301.2338806-1-dhowells@redhat.com/ # v2
Link: https://lore.kernel.org/r/20230214083710.2547248-1-dhowells@redhat.com/ # v3

David Howells (30):
  splice: Fix filemap of a blockdev
  splice: Make filemap_splice_read() check s_maxbytes
  splice: Rename direct_splice_read() to copy_splice_read()
  splice: Clean up copy_splice_read() a bit
  splice: Make do_splice_to() generic and export it
  splice: Check for zero count in vfs_splice_read()
  splice: Make splice from an O_DIRECT fd use copy_splice_read()
  splice: Make splice from a DAX file use copy_splice_read()
  shmem: Implement splice-read
  overlayfs: Implement splice-read
  coda: Implement splice-read
  tty, proc, kernfs, random: Use copy_splice_read()
  net: Make sock_splice_read() use copy_splice_read() by default
  9p:  Add splice_read stub
  afs: Provide a splice-read stub
  ceph: Provide a splice-read stub
  ecryptfs: Provide a splice-read stub
  ext4: Provide a splice-read stub
  f2fs: Provide a splice-read stub
  nfs: Provide a splice-read stub
  ntfs3: Provide a splice-read stub
  ocfs2: Provide a splice-read stub
  orangefs: Provide a splice-read stub
  xfs: Provide a splice-read stub
  zonefs: Provide a splice-read stub
  splice: Convert trace/seq to use copy_splice_read()
  cifs: Use filemap_splice_read()
  splice: Use filemap_splice_read() instead of
    generic_file_splice_read()
  splice: Remove generic_file_splice_read()
  iov_iter: Kill ITER_PIPE

 block/fops.c            |   2 +-
 drivers/char/random.c   |   4 +-
 drivers/tty/tty_io.c    |   4 +-
 fs/9p/vfs_file.c        |  26 ++-
 fs/adfs/file.c          |   2 +-
 fs/affs/file.c          |   2 +-
 fs/afs/file.c           |  20 +-
 fs/bfs/file.c           |   2 +-
 fs/btrfs/file.c         |   2 +-
 fs/ceph/file.c          |  65 +++++-
 fs/cifs/cifsfs.c        |  12 +-
 fs/cifs/cifsfs.h        |   3 -
 fs/cifs/file.c          |  16 --
 fs/coda/file.c          |  29 ++-
 fs/cramfs/inode.c       |   2 +-
 fs/ecryptfs/file.c      |  27 ++-
 fs/erofs/data.c         |   2 +-
 fs/exfat/file.c         |   2 +-
 fs/ext2/file.c          |   2 +-
 fs/ext4/file.c          |  13 +-
 fs/f2fs/file.c          |  43 +++-
 fs/fat/file.c           |   2 +-
 fs/fuse/file.c          |   2 +-
 fs/gfs2/file.c          |   4 +-
 fs/hfs/inode.c          |   2 +-
 fs/hfsplus/inode.c      |   2 +-
 fs/hostfs/hostfs_kern.c |   2 +-
 fs/hpfs/file.c          |   2 +-
 fs/jffs2/file.c         |   2 +-
 fs/jfs/file.c           |   2 +-
 fs/kernfs/file.c        |   2 +-
 fs/minix/file.c         |   2 +-
 fs/nfs/file.c           |  23 ++-
 fs/nfs/internal.h       |   2 +
 fs/nfs/nfs4file.c       |   2 +-
 fs/nilfs2/file.c        |   2 +-
 fs/ntfs/file.c          |   2 +-
 fs/ntfs3/file.c         |  31 ++-
 fs/ocfs2/file.c         |  41 +++-
 fs/ocfs2/ocfs2_trace.h  |   3 +
 fs/omfs/file.c          |   2 +-
 fs/orangefs/file.c      |  22 +-
 fs/overlayfs/file.c     |  23 ++-
 fs/proc/inode.c         |   4 +-
 fs/proc/proc_sysctl.c   |   2 +-
 fs/proc_namespace.c     |   6 +-
 fs/ramfs/file-mmu.c     |   2 +-
 fs/ramfs/file-nommu.c   |   2 +-
 fs/read_write.c         |   2 +-
 fs/reiserfs/file.c      |   2 +-
 fs/romfs/mmap-nommu.c   |   2 +-
 fs/splice.c             | 108 ++++------
 fs/sysv/file.c          |   2 +-
 fs/ubifs/file.c         |   2 +-
 fs/udf/file.c           |   2 +-
 fs/ufs/file.c           |   2 +-
 fs/vboxsf/file.c        |   2 +-
 fs/xfs/xfs_file.c       |  30 ++-
 fs/xfs/xfs_trace.h      |   2 +-
 fs/zonefs/file.c        |  40 +++-
 include/linux/fs.h      |   8 +-
 include/linux/splice.h  |   3 +
 include/linux/uio.h     |  14 --
 kernel/trace/trace.c    |   2 +-
 lib/iov_iter.c          | 431 +---------------------------------------
 mm/filemap.c            |  10 +-
 mm/shmem.c              | 134 ++++++++++++-
 net/socket.c            |   2 +-
 68 files changed, 657 insertions(+), 616 deletions(-)


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [PATCH v21 01/30] splice: Fix filemap of a blockdev
  2023-05-20  0:00 [PATCH v21 00/30] splice: Kill ITER_PIPE David Howells
@ 2023-05-20  0:00 ` David Howells
  2023-05-20  4:08   ` Christoph Hellwig
  2023-05-20  9:14   ` Christian Brauner
  2023-05-20  0:00 ` [PATCH v21 02/30] splice: Make filemap_splice_read() check s_maxbytes David Howells
                   ` (28 subsequent siblings)
  29 siblings, 2 replies; 72+ messages in thread
From: David Howells @ 2023-05-20  0:00 UTC (permalink / raw)
  To: Jens Axboe, Al Viro, Christoph Hellwig
  Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton,
	David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig,
	Steve French, John Hubbard

Fix filemap_splice_read() to use file->f_mapping->host, not file->f_inode,
as the source of the file size because in the case of a block device,
file->f_inode points to the block-special file (which is typically 0
length) and not the backing store.

Fixes: 07073eb01c5f ("splice: Add a func to do a splice from a buffered file without ITER_PIPE")
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
cc: Steve French <stfrench@microsoft.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: David Hildenbrand <david@redhat.com>
cc: John Hubbard <jhubbard@nvidia.com>
cc: linux-mm@kvack.org
cc: linux-block@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
---
 mm/filemap.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index b4c9bd368b7e..a2006936a6ae 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2900,7 +2900,7 @@ ssize_t filemap_splice_read(struct file *in, loff_t *ppos,
 	do {
 		cond_resched();
 
-		if (*ppos >= i_size_read(file_inode(in)))
+		if (*ppos >= i_size_read(in->f_mapping->host))
 			break;
 
 		iocb.ki_pos = *ppos;
@@ -2916,7 +2916,7 @@ ssize_t filemap_splice_read(struct file *in, loff_t *ppos,
 		 * part of the page is not copied back to userspace (unless
 		 * another truncate extends the file - this is desired though).
 		 */
-		isize = i_size_read(file_inode(in));
+		isize = i_size_read(in->f_mapping->host);
 		if (unlikely(*ppos >= isize))
 			break;
 		end_offset = min_t(loff_t, isize, *ppos + len);


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v21 02/30] splice: Make filemap_splice_read() check s_maxbytes
  2023-05-20  0:00 [PATCH v21 00/30] splice: Kill ITER_PIPE David Howells
  2023-05-20  0:00 ` [PATCH v21 01/30] splice: Fix filemap of a blockdev David Howells
@ 2023-05-20  0:00 ` David Howells
  2023-05-20  4:09   ` Christoph Hellwig
  2023-05-20  9:21   ` Christian Brauner
  2023-05-20  0:00 ` [PATCH v21 03/30] splice: Rename direct_splice_read() to copy_splice_read() David Howells
                   ` (27 subsequent siblings)
  29 siblings, 2 replies; 72+ messages in thread
From: David Howells @ 2023-05-20  0:00 UTC (permalink / raw)
  To: Jens Axboe, Al Viro, Christoph Hellwig
  Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton,
	David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig,
	Steve French, John Hubbard

Make filemap_splice_read() check s_maxbytes analogously to filemap_read().

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Christoph Hellwig <hch@lst.de>
cc: Steve French <stfrench@microsoft.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: David Hildenbrand <david@redhat.com>
cc: John Hubbard <jhubbard@nvidia.com>
cc: linux-mm@kvack.org
cc: linux-block@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
---
 mm/filemap.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/filemap.c b/mm/filemap.c
index a2006936a6ae..0fcb0b80c2e2 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2887,6 +2887,9 @@ ssize_t filemap_splice_read(struct file *in, loff_t *ppos,
 	bool writably_mapped;
 	int i, error = 0;
 
+	if (unlikely(*ppos >= in->f_mapping->host->i_sb->s_maxbytes))
+		return 0;
+
 	init_sync_kiocb(&iocb, in);
 	iocb.ki_pos = *ppos;
 


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v21 03/30] splice: Rename direct_splice_read() to copy_splice_read()
  2023-05-20  0:00 [PATCH v21 00/30] splice: Kill ITER_PIPE David Howells
  2023-05-20  0:00 ` [PATCH v21 01/30] splice: Fix filemap of a blockdev David Howells
  2023-05-20  0:00 ` [PATCH v21 02/30] splice: Make filemap_splice_read() check s_maxbytes David Howells
@ 2023-05-20  0:00 ` David Howells
  2023-05-20  4:09   ` Christoph Hellwig
                     ` (3 more replies)
  2023-05-20  0:00 ` [PATCH v21 04/30] splice: Clean up copy_splice_read() a bit David Howells
                   ` (26 subsequent siblings)
  29 siblings, 4 replies; 72+ messages in thread
From: David Howells @ 2023-05-20  0:00 UTC (permalink / raw)
  To: Jens Axboe, Al Viro, Christoph Hellwig
  Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton,
	David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig,
	Steve French, linux-cifs

Rename direct_splice_read() to copy_splice_read() to better reflect as to
what it does.

Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Christoph Hellwig <hch@lst.de>
cc: Steve French <sfrench@samba.org>
cc: Jens Axboe <axboe@kernel.dk>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: linux-cifs@vger.kernel.org
cc: linux-mm@kvack.org
cc: linux-block@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
---
 fs/cifs/cifsfs.c   |  4 ++--
 fs/cifs/file.c     |  2 +-
 fs/splice.c        | 11 +++++------
 include/linux/fs.h |  6 +++---
 4 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/fs/cifs/cifsfs.c b/fs/cifs/cifsfs.c
index 43a4d8603db3..fa2477bbcc86 100644
--- a/fs/cifs/cifsfs.c
+++ b/fs/cifs/cifsfs.c
@@ -1416,7 +1416,7 @@ const struct file_operations cifs_file_direct_ops = {
 	.fsync = cifs_fsync,
 	.flush = cifs_flush,
 	.mmap = cifs_file_mmap,
-	.splice_read = direct_splice_read,
+	.splice_read = copy_splice_read,
 	.splice_write = iter_file_splice_write,
 	.unlocked_ioctl  = cifs_ioctl,
 	.copy_file_range = cifs_copy_file_range,
@@ -1470,7 +1470,7 @@ const struct file_operations cifs_file_direct_nobrl_ops = {
 	.fsync = cifs_fsync,
 	.flush = cifs_flush,
 	.mmap = cifs_file_mmap,
-	.splice_read = direct_splice_read,
+	.splice_read = copy_splice_read,
 	.splice_write = iter_file_splice_write,
 	.unlocked_ioctl  = cifs_ioctl,
 	.copy_file_range = cifs_copy_file_range,
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index c5fcefdfd797..023496207c18 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -5091,6 +5091,6 @@ ssize_t cifs_splice_read(struct file *in, loff_t *ppos,
 	if (unlikely(!len))
 		return 0;
 	if (in->f_flags & O_DIRECT)
-		return direct_splice_read(in, ppos, pipe, len, flags);
+		return copy_splice_read(in, ppos, pipe, len, flags);
 	return filemap_splice_read(in, ppos, pipe, len, flags);
 }
diff --git a/fs/splice.c b/fs/splice.c
index 3e06611d19ae..2478e065bc53 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -300,12 +300,11 @@ void splice_shrink_spd(struct splice_pipe_desc *spd)
 }
 
 /*
- * Splice data from an O_DIRECT file into pages and then add them to the output
- * pipe.
+ * Copy data from a file into pages and then splice those into the output pipe.
  */
-ssize_t direct_splice_read(struct file *in, loff_t *ppos,
-			   struct pipe_inode_info *pipe,
-			   size_t len, unsigned int flags)
+ssize_t copy_splice_read(struct file *in, loff_t *ppos,
+			 struct pipe_inode_info *pipe,
+			 size_t len, unsigned int flags)
 {
 	struct iov_iter to;
 	struct bio_vec *bv;
@@ -390,7 +389,7 @@ ssize_t direct_splice_read(struct file *in, loff_t *ppos,
 	kfree(bv);
 	return ret;
 }
-EXPORT_SYMBOL(direct_splice_read);
+EXPORT_SYMBOL(copy_splice_read);
 
 /**
  * generic_file_splice_read - splice data from file to a pipe
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 21a981680856..e3c22efa413e 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2752,9 +2752,9 @@ ssize_t vfs_iocb_iter_write(struct file *file, struct kiocb *iocb,
 ssize_t filemap_splice_read(struct file *in, loff_t *ppos,
 			    struct pipe_inode_info *pipe,
 			    size_t len, unsigned int flags);
-ssize_t direct_splice_read(struct file *in, loff_t *ppos,
-			   struct pipe_inode_info *pipe,
-			   size_t len, unsigned int flags);
+ssize_t copy_splice_read(struct file *in, loff_t *ppos,
+			 struct pipe_inode_info *pipe,
+			 size_t len, unsigned int flags);
 extern ssize_t generic_file_splice_read(struct file *, loff_t *,
 		struct pipe_inode_info *, size_t, unsigned int);
 extern ssize_t iter_file_splice_write(struct pipe_inode_info *,


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v21 04/30] splice: Clean up copy_splice_read() a bit
  2023-05-20  0:00 [PATCH v21 00/30] splice: Kill ITER_PIPE David Howells
                   ` (2 preceding siblings ...)
  2023-05-20  0:00 ` [PATCH v21 03/30] splice: Rename direct_splice_read() to copy_splice_read() David Howells
@ 2023-05-20  0:00 ` David Howells
  2023-05-20  9:34   ` Christian Brauner
  2023-05-20  0:00 ` [PATCH v21 05/30] splice: Make do_splice_to() generic and export it David Howells
                   ` (25 subsequent siblings)
  29 siblings, 1 reply; 72+ messages in thread
From: David Howells @ 2023-05-20  0:00 UTC (permalink / raw)
  To: Jens Axboe, Al Viro, Christoph Hellwig
  Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton,
	David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig,
	John Hubbard

Do a couple of cleanups to copy_splice_read():

 (1) Cast to struct page **, not void *.

 (2) Simplify the calculation of the number of pages to keep/reclaim in
     copy_splice_read().

Suggested-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
cc: Jens Axboe <axboe@kernel.dk>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: David Hildenbrand <david@redhat.com>
cc: John Hubbard <jhubbard@nvidia.com>
cc: linux-mm@kvack.org
cc: linux-block@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
---

Notes:
    ver #21)
     - direct_splice_read() got renamed to copy_splice_read().

 fs/splice.c | 19 +++++++------------
 1 file changed, 7 insertions(+), 12 deletions(-)

diff --git a/fs/splice.c b/fs/splice.c
index 2478e065bc53..f9a9be797b0c 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -311,7 +311,7 @@ ssize_t copy_splice_read(struct file *in, loff_t *ppos,
 	struct kiocb kiocb;
 	struct page **pages;
 	ssize_t ret;
-	size_t used, npages, chunk, remain, reclaim;
+	size_t used, npages, chunk, remain, keep = 0;
 	int i;
 
 	/* Work out how much data we can actually add into the pipe */
@@ -325,7 +325,7 @@ ssize_t copy_splice_read(struct file *in, loff_t *ppos,
 	if (!bv)
 		return -ENOMEM;
 
-	pages = (void *)(bv + npages);
+	pages = (struct page **)(bv + npages);
 	npages = alloc_pages_bulk_array(GFP_USER, npages, pages);
 	if (!npages) {
 		kfree(bv);
@@ -348,11 +348,8 @@ ssize_t copy_splice_read(struct file *in, loff_t *ppos,
 	kiocb.ki_pos = *ppos;
 	ret = call_read_iter(in, &kiocb, &to);
 
-	reclaim = npages * PAGE_SIZE;
-	remain = 0;
 	if (ret > 0) {
-		reclaim -= ret;
-		remain = ret;
+		keep = DIV_ROUND_UP(ret, PAGE_SIZE);
 		*ppos = kiocb.ki_pos;
 		file_accessed(in);
 	} else if (ret < 0) {
@@ -365,14 +362,12 @@ ssize_t copy_splice_read(struct file *in, loff_t *ppos,
 	}
 
 	/* Free any pages that didn't get touched at all. */
-	reclaim /= PAGE_SIZE;
-	if (reclaim) {
-		npages -= reclaim;
-		release_pages(pages + npages, reclaim);
-	}
+	if (keep < npages)
+		release_pages(pages + keep, npages - keep);
 
 	/* Push the remaining pages into the pipe. */
-	for (i = 0; i < npages; i++) {
+	remain = ret;
+	for (i = 0; i < keep; i++) {
 		struct pipe_buffer *buf = pipe_head_buf(pipe);
 
 		chunk = min_t(size_t, remain, PAGE_SIZE);


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v21 05/30] splice: Make do_splice_to() generic and export it
  2023-05-20  0:00 [PATCH v21 00/30] splice: Kill ITER_PIPE David Howells
                   ` (3 preceding siblings ...)
  2023-05-20  0:00 ` [PATCH v21 04/30] splice: Clean up copy_splice_read() a bit David Howells
@ 2023-05-20  0:00 ` David Howells
  2023-05-20  9:35   ` Christian Brauner
  2023-05-20  0:00 ` [PATCH v21 06/30] splice: Check for zero count in vfs_splice_read() David Howells
                   ` (24 subsequent siblings)
  29 siblings, 1 reply; 72+ messages in thread
From: David Howells @ 2023-05-20  0:00 UTC (permalink / raw)
  To: Jens Axboe, Al Viro, Christoph Hellwig
  Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton,
	David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig,
	Miklos Szeredi, John Hubbard, linux-unionfs

Rename do_splice_to() to vfs_splice_read() and export it so that it can be
used as a helper when calling down to a lower layer filesystem as it
performs all the necessary checks[1].

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
cc: Miklos Szeredi <miklos@szeredi.hu>
cc: Jens Axboe <axboe@kernel.dk>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: John Hubbard <jhubbard@nvidia.com>
cc: David Hildenbrand <david@redhat.com>
cc: Matthew Wilcox <willy@infradead.org>
cc: linux-unionfs@vger.kernel.org
cc: linux-block@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-mm@kvack.org
Link: https://lore.kernel.org/r/CAJfpeguGksS3sCigmRi9hJdUec8qtM9f+_9jC1rJhsXT+dV01w@mail.gmail.com/ [1]
---
 fs/splice.c            | 27 ++++++++++++++++++++-------
 include/linux/splice.h |  3 +++
 2 files changed, 23 insertions(+), 7 deletions(-)

diff --git a/fs/splice.c b/fs/splice.c
index f9a9be797b0c..d815a69f6589 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -867,12 +867,24 @@ static long do_splice_from(struct pipe_inode_info *pipe, struct file *out,
 	return out->f_op->splice_write(pipe, out, ppos, len, flags);
 }
 
-/*
- * Attempt to initiate a splice from a file to a pipe.
+/**
+ * vfs_splice_read - Read data from a file and splice it into a pipe
+ * @in:		File to splice from
+ * @ppos:	Input file offset
+ * @pipe:	Pipe to splice to
+ * @len:	Number of bytes to splice
+ * @flags:	Splice modifier flags (SPLICE_F_*)
+ *
+ * Splice the requested amount of data from the input file to the pipe.  This
+ * is synchronous as the caller must hold the pipe lock across the entire
+ * operation.
+ *
+ * If successful, it returns the amount of data spliced, 0 if it hit the EOF or
+ * a hole and a negative error code otherwise.
  */
-static long do_splice_to(struct file *in, loff_t *ppos,
-			 struct pipe_inode_info *pipe, size_t len,
-			 unsigned int flags)
+long vfs_splice_read(struct file *in, loff_t *ppos,
+		     struct pipe_inode_info *pipe, size_t len,
+		     unsigned int flags)
 {
 	unsigned int p_space;
 	int ret;
@@ -895,6 +907,7 @@ static long do_splice_to(struct file *in, loff_t *ppos,
 		return warn_unsupported(in, "read");
 	return in->f_op->splice_read(in, ppos, pipe, len, flags);
 }
+EXPORT_SYMBOL_GPL(vfs_splice_read);
 
 /**
  * splice_direct_to_actor - splices data directly between two non-pipes
@@ -964,7 +977,7 @@ ssize_t splice_direct_to_actor(struct file *in, struct splice_desc *sd,
 		size_t read_len;
 		loff_t pos = sd->pos, prev_pos = pos;
 
-		ret = do_splice_to(in, &pos, pipe, len, flags);
+		ret = vfs_splice_read(in, &pos, pipe, len, flags);
 		if (unlikely(ret <= 0))
 			goto out_release;
 
@@ -1112,7 +1125,7 @@ long splice_file_to_pipe(struct file *in,
 	pipe_lock(opipe);
 	ret = wait_for_space(opipe, flags);
 	if (!ret)
-		ret = do_splice_to(in, offset, opipe, len, flags);
+		ret = vfs_splice_read(in, offset, opipe, len, flags);
 	pipe_unlock(opipe);
 	if (ret > 0)
 		wakeup_pipe_readers(opipe);
diff --git a/include/linux/splice.h b/include/linux/splice.h
index a55179fd60fc..8f052c3dae95 100644
--- a/include/linux/splice.h
+++ b/include/linux/splice.h
@@ -76,6 +76,9 @@ extern ssize_t splice_to_pipe(struct pipe_inode_info *,
 			      struct splice_pipe_desc *);
 extern ssize_t add_to_pipe(struct pipe_inode_info *,
 			      struct pipe_buffer *);
+long vfs_splice_read(struct file *in, loff_t *ppos,
+		     struct pipe_inode_info *pipe, size_t len,
+		     unsigned int flags);
 extern ssize_t splice_direct_to_actor(struct file *, struct splice_desc *,
 				      splice_direct_actor *);
 extern long do_splice(struct file *in, loff_t *off_in,


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v21 06/30] splice: Check for zero count in vfs_splice_read()
  2023-05-20  0:00 [PATCH v21 00/30] splice: Kill ITER_PIPE David Howells
                   ` (4 preceding siblings ...)
  2023-05-20  0:00 ` [PATCH v21 05/30] splice: Make do_splice_to() generic and export it David Howells
@ 2023-05-20  0:00 ` David Howells
  2023-05-20  9:38   ` Christian Brauner
  2023-05-20  0:00 ` [PATCH v21 07/30] splice: Make splice from an O_DIRECT fd use copy_splice_read() David Howells
                   ` (23 subsequent siblings)
  29 siblings, 1 reply; 72+ messages in thread
From: David Howells @ 2023-05-20  0:00 UTC (permalink / raw)
  To: Jens Axboe, Al Viro, Christoph Hellwig
  Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton,
	David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig

Make vfs_splice_read() return immediately if the length is 0.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Christoph Hellwig <hch@lst.de>
cc: Jens Axboe <axboe@kernel.dk>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: linux-block@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-mm@kvack.org
---
 fs/splice.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/splice.c b/fs/splice.c
index d815a69f6589..fe3309ffeb26 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -891,6 +891,8 @@ long vfs_splice_read(struct file *in, loff_t *ppos,
 
 	if (unlikely(!(in->f_mode & FMODE_READ)))
 		return -EBADF;
+	if (!len)
+		return 0;
 
 	/* Don't try to read more the pipe has space for. */
 	p_space = pipe->max_usage - pipe_occupancy(pipe->head, pipe->tail);


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v21 07/30] splice: Make splice from an O_DIRECT fd use copy_splice_read()
  2023-05-20  0:00 [PATCH v21 00/30] splice: Kill ITER_PIPE David Howells
                   ` (5 preceding siblings ...)
  2023-05-20  0:00 ` [PATCH v21 06/30] splice: Check for zero count in vfs_splice_read() David Howells
@ 2023-05-20  0:00 ` David Howells
  2023-05-20  4:11   ` Christoph Hellwig
  2023-05-20  9:39   ` Christian Brauner
  2023-05-20  0:00 ` [PATCH v21 08/30] splice: Make splice from a DAX file " David Howells
                   ` (22 subsequent siblings)
  29 siblings, 2 replies; 72+ messages in thread
From: David Howells @ 2023-05-20  0:00 UTC (permalink / raw)
  To: Jens Axboe, Al Viro, Christoph Hellwig
  Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton,
	David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig

Make a read splice from a file descriptor that's open O_DIRECT use
copy_splice_read() to do the reading as filemap_splice_read() is unlikely
to find any pagecache to splice.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Christoph Hellwig <hch@lst.de>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: Jens Axboe <axboe@kernel.dk>
cc: linux-fsdevel@vger.kernel.org
cc: linux-block@vger.kernel.org
cc: linux-mm@kvack.org
---

Notes:
    ver #21)
     - Needs to be in vfs_splice_read(), not generic_file_splice_read().

 fs/splice.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/fs/splice.c b/fs/splice.c
index fe3309ffeb26..76126b1aafcb 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -907,6 +907,12 @@ long vfs_splice_read(struct file *in, loff_t *ppos,
 
 	if (unlikely(!in->f_op->splice_read))
 		return warn_unsupported(in, "read");
+	/*
+	 * O_DIRECT doesn't deal with the pagecache, so we allocate a buffer,
+	 * copy into it and splice that into the pipe.
+	 */
+	if ((in->f_flags & O_DIRECT))
+		return copy_splice_read(in, ppos, pipe, len, flags);
 	return in->f_op->splice_read(in, ppos, pipe, len, flags);
 }
 EXPORT_SYMBOL_GPL(vfs_splice_read);


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v21 08/30] splice: Make splice from a DAX file use copy_splice_read()
  2023-05-20  0:00 [PATCH v21 00/30] splice: Kill ITER_PIPE David Howells
                   ` (6 preceding siblings ...)
  2023-05-20  0:00 ` [PATCH v21 07/30] splice: Make splice from an O_DIRECT fd use copy_splice_read() David Howells
@ 2023-05-20  0:00 ` David Howells
  2023-05-20  4:11   ` Christoph Hellwig
                     ` (3 more replies)
  2023-05-20  0:00 ` [PATCH v21 09/30] shmem: Implement splice-read David Howells
                   ` (21 subsequent siblings)
  29 siblings, 4 replies; 72+ messages in thread
From: David Howells @ 2023-05-20  0:00 UTC (permalink / raw)
  To: Jens Axboe, Al Viro, Christoph Hellwig
  Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton,
	David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig,
	linux-erofs, linux-ext4, linux-xfs

Make a read splice from a DAX file go directly to copy_splice_read() to do
the reading as filemap_splice_read() is unlikely to find any pagecache to
splice.

I think this affects only erofs, Ext2, Ext4, fuse and XFS.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Christoph Hellwig <hch@lst.de>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: Jens Axboe <axboe@kernel.dk>
cc: linux-erofs@lists.ozlabs.org
cc: linux-ext4@vger.kernel.org
cc: linux-xfs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-block@vger.kernel.org
cc: linux-mm@kvack.org
---

Notes:
    ver #21)
     - Don't need #ifdef CONFIG_FS_DAX as IS_DAX() is false if !CONFIG_FS_DAX.
     - Needs to be in vfs_splice_read(), not generic_file_splice_read().

 fs/splice.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/splice.c b/fs/splice.c
index 76126b1aafcb..8268248df3a9 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -908,10 +908,10 @@ long vfs_splice_read(struct file *in, loff_t *ppos,
 	if (unlikely(!in->f_op->splice_read))
 		return warn_unsupported(in, "read");
 	/*
-	 * O_DIRECT doesn't deal with the pagecache, so we allocate a buffer,
-	 * copy into it and splice that into the pipe.
+	 * O_DIRECT and DAX don't deal with the pagecache, so we allocate a
+	 * buffer, copy into it and splice that into the pipe.
 	 */
-	if ((in->f_flags & O_DIRECT))
+	if ((in->f_flags & O_DIRECT) || IS_DAX(in->f_mapping->host))
 		return copy_splice_read(in, ppos, pipe, len, flags);
 	return in->f_op->splice_read(in, ppos, pipe, len, flags);
 }


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v21 09/30] shmem: Implement splice-read
  2023-05-20  0:00 [PATCH v21 00/30] splice: Kill ITER_PIPE David Howells
                   ` (7 preceding siblings ...)
  2023-05-20  0:00 ` [PATCH v21 08/30] splice: Make splice from a DAX file " David Howells
@ 2023-05-20  0:00 ` David Howells
  2023-05-20  0:00 ` [PATCH v21 10/30] overlayfs: " David Howells
                   ` (20 subsequent siblings)
  29 siblings, 0 replies; 72+ messages in thread
From: David Howells @ 2023-05-20  0:00 UTC (permalink / raw)
  To: Jens Axboe, Al Viro, Christoph Hellwig
  Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton,
	David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Daniel Golle, Guenter Roeck,
	Christoph Hellwig, John Hubbard, Hugh Dickins

The new filemap_splice_read() has an implicit expectation via
filemap_get_pages() that ->read_folio() exists if ->readahead() doesn't
fully populate the pagecache of the file it is reading from[1], potentially
leading to a jump to NULL if this doesn't exist.  shmem, however, (and by
extension, tmpfs, ramfs and rootfs), doesn't have ->read_folio(),

Work around this by equipping shmem with its own splice-read
implementation, based on filemap_splice_read(), but able to paste in
zero_page when there's a page missing.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Daniel Golle <daniel@makrotopia.org>
cc: Guenter Roeck <groeck7@gmail.com>
cc: Christoph Hellwig <hch@lst.de>
cc: Jens Axboe <axboe@kernel.dk>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: John Hubbard <jhubbard@nvidia.com>
cc: David Hildenbrand <david@redhat.com>
cc: Matthew Wilcox <willy@infradead.org>
cc: Hugh Dickins <hughd@google.com>
cc: linux-block@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-mm@kvack.org
Link: https://lore.kernel.org/r/Y+pdHFFTk1TTEBsO@makrotopia.org/ [1]
---

Notes:
    ver #19)
     - Remove a missed get_page() on the zero page.
    
    ver #18)
     - Don't take/release a ref on the zero page.

 mm/shmem.c | 134 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 133 insertions(+), 1 deletion(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index e40a08c5c6d7..1f504ed982cf 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2731,6 +2731,138 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
 	return retval ? retval : error;
 }
 
+static bool zero_pipe_buf_get(struct pipe_inode_info *pipe,
+			      struct pipe_buffer *buf)
+{
+	return true;
+}
+
+static void zero_pipe_buf_release(struct pipe_inode_info *pipe,
+				  struct pipe_buffer *buf)
+{
+}
+
+static bool zero_pipe_buf_try_steal(struct pipe_inode_info *pipe,
+				    struct pipe_buffer *buf)
+{
+	return false;
+}
+
+static const struct pipe_buf_operations zero_pipe_buf_ops = {
+	.release	= zero_pipe_buf_release,
+	.try_steal	= zero_pipe_buf_try_steal,
+	.get		= zero_pipe_buf_get,
+};
+
+static size_t splice_zeropage_into_pipe(struct pipe_inode_info *pipe,
+					loff_t fpos, size_t size)
+{
+	size_t offset = fpos & ~PAGE_MASK;
+
+	size = min_t(size_t, size, PAGE_SIZE - offset);
+
+	if (!pipe_full(pipe->head, pipe->tail, pipe->max_usage)) {
+		struct pipe_buffer *buf = pipe_head_buf(pipe);
+
+		*buf = (struct pipe_buffer) {
+			.ops	= &zero_pipe_buf_ops,
+			.page	= ZERO_PAGE(0),
+			.offset	= offset,
+			.len	= size,
+		};
+		pipe->head++;
+	}
+
+	return size;
+}
+
+static ssize_t shmem_file_splice_read(struct file *in, loff_t *ppos,
+				      struct pipe_inode_info *pipe,
+				      size_t len, unsigned int flags)
+{
+	struct inode *inode = file_inode(in);
+	struct address_space *mapping = inode->i_mapping;
+	struct folio *folio = NULL;
+	size_t total_spliced = 0, used, npages, n, part;
+	loff_t isize;
+	int error = 0;
+
+	/* Work out how much data we can actually add into the pipe */
+	used = pipe_occupancy(pipe->head, pipe->tail);
+	npages = max_t(ssize_t, pipe->max_usage - used, 0);
+	len = min_t(size_t, len, npages * PAGE_SIZE);
+
+	do {
+		if (*ppos >= i_size_read(inode))
+			break;
+
+		error = shmem_get_folio(inode, *ppos / PAGE_SIZE, &folio, SGP_READ);
+		if (error) {
+			if (error == -EINVAL)
+				error = 0;
+			break;
+		}
+		if (folio) {
+			folio_unlock(folio);
+
+			if (folio_test_hwpoison(folio)) {
+				error = -EIO;
+				break;
+			}
+		}
+
+		/*
+		 * i_size must be checked after we know the pages are Uptodate.
+		 *
+		 * Checking i_size after the check allows us to calculate
+		 * the correct value for "nr", which means the zero-filled
+		 * part of the page is not copied back to userspace (unless
+		 * another truncate extends the file - this is desired though).
+		 */
+		isize = i_size_read(inode);
+		if (unlikely(*ppos >= isize))
+			break;
+		part = min_t(loff_t, isize - *ppos, len);
+
+		if (folio) {
+			/*
+			 * If users can be writing to this page using arbitrary
+			 * virtual addresses, take care about potential aliasing
+			 * before reading the page on the kernel side.
+			 */
+			if (mapping_writably_mapped(mapping))
+				flush_dcache_folio(folio);
+			folio_mark_accessed(folio);
+			/*
+			 * Ok, we have the page, and it's up-to-date, so we can
+			 * now splice it into the pipe.
+			 */
+			n = splice_folio_into_pipe(pipe, folio, *ppos, part);
+			folio_put(folio);
+			folio = NULL;
+		} else {
+			n = splice_zeropage_into_pipe(pipe, *ppos, len);
+		}
+
+		if (!n)
+			break;
+		len -= n;
+		total_spliced += n;
+		*ppos += n;
+		in->f_ra.prev_pos = *ppos;
+		if (pipe_full(pipe->head, pipe->tail, pipe->max_usage))
+			break;
+
+		cond_resched();
+	} while (len);
+
+	if (folio)
+		folio_put(folio);
+
+	file_accessed(in);
+	return total_spliced ? total_spliced : error;
+}
+
 static loff_t shmem_file_llseek(struct file *file, loff_t offset, int whence)
 {
 	struct address_space *mapping = file->f_mapping;
@@ -3971,7 +4103,7 @@ static const struct file_operations shmem_file_operations = {
 	.read_iter	= shmem_file_read_iter,
 	.write_iter	= generic_file_write_iter,
 	.fsync		= noop_fsync,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= shmem_file_splice_read,
 	.splice_write	= iter_file_splice_write,
 	.fallocate	= shmem_fallocate,
 #endif


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v21 10/30] overlayfs: Implement splice-read
  2023-05-20  0:00 [PATCH v21 00/30] splice: Kill ITER_PIPE David Howells
                   ` (8 preceding siblings ...)
  2023-05-20  0:00 ` [PATCH v21 09/30] shmem: Implement splice-read David Howells
@ 2023-05-20  0:00 ` David Howells
  2023-05-20  9:47   ` Christian Brauner
  2023-05-20  0:00 ` [PATCH v21 11/30] coda: " David Howells
                   ` (19 subsequent siblings)
  29 siblings, 1 reply; 72+ messages in thread
From: David Howells @ 2023-05-20  0:00 UTC (permalink / raw)
  To: Jens Axboe, Al Viro, Christoph Hellwig
  Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton,
	David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig,
	John Hubbard, Miklos Szeredi, linux-unionfs

Implement splice-read for overlayfs by passing the request down a layer
rather than going through generic_file_splice_read() which is going to be
changed to assume that ->read_folio() is present on buffered files.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Christoph Hellwig <hch@lst.de>
cc: Jens Axboe <axboe@kernel.dk>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: John Hubbard <jhubbard@nvidia.com>
cc: David Hildenbrand <david@redhat.com>
cc: Matthew Wilcox <willy@infradead.org>
cc: Miklos Szeredi <miklos@szeredi.hu>
cc: linux-unionfs@vger.kernel.org
cc: linux-block@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-mm@kvack.org
---

Notes:
    ver #17)
     - Use vfs_splice_read() helper rather than open-coding checks.
    
    ver #15)
     - Remove redundant FMODE_CAN_ODIRECT check on real file.
     - Do rw_verify_area() on the real file, not the overlay file.
     - Fix a file leak.

 fs/overlayfs/file.c | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
index 7c04f033aadd..86197882ff35 100644
--- a/fs/overlayfs/file.c
+++ b/fs/overlayfs/file.c
@@ -419,6 +419,27 @@ static ssize_t ovl_write_iter(struct kiocb *iocb, struct iov_iter *iter)
 	return ret;
 }
 
+static ssize_t ovl_splice_read(struct file *in, loff_t *ppos,
+			       struct pipe_inode_info *pipe, size_t len,
+			       unsigned int flags)
+{
+	const struct cred *old_cred;
+	struct fd real;
+	ssize_t ret;
+
+	ret = ovl_real_fdget(in, &real);
+	if (ret)
+		return ret;
+
+	old_cred = ovl_override_creds(file_inode(in)->i_sb);
+	ret = vfs_splice_read(real.file, ppos, pipe, len, flags);
+	revert_creds(old_cred);
+	ovl_file_accessed(in);
+
+	fdput(real);
+	return ret;
+}
+
 /*
  * Calling iter_file_splice_write() directly from overlay's f_op may deadlock
  * due to lock order inversion between pipe->mutex in iter_file_splice_write()
@@ -695,7 +716,7 @@ const struct file_operations ovl_file_operations = {
 	.fallocate	= ovl_fallocate,
 	.fadvise	= ovl_fadvise,
 	.flush		= ovl_flush,
-	.splice_read    = generic_file_splice_read,
+	.splice_read    = ovl_splice_read,
 	.splice_write   = ovl_splice_write,
 
 	.copy_file_range	= ovl_copy_file_range,


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v21 11/30] coda: Implement splice-read
  2023-05-20  0:00 [PATCH v21 00/30] splice: Kill ITER_PIPE David Howells
                   ` (9 preceding siblings ...)
  2023-05-20  0:00 ` [PATCH v21 10/30] overlayfs: " David Howells
@ 2023-05-20  0:00 ` David Howells
  2023-05-20  0:00 ` [PATCH v21 12/30] tty, proc, kernfs, random: Use copy_splice_read() David Howells
                   ` (18 subsequent siblings)
  29 siblings, 0 replies; 72+ messages in thread
From: David Howells @ 2023-05-20  0:00 UTC (permalink / raw)
  To: Jens Axboe, Al Viro, Christoph Hellwig
  Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton,
	David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Jan Harkes,
	Christoph Hellwig, John Hubbard, coda, codalist, linux-unionfs

Implement splice-read for coda by passing the request down a layer rather
than going through generic_file_splice_read() which is going to be changed
to assume that ->read_folio() is present on buffered files.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Jan Harkes <jaharkes@cs.cmu.edu>
cc: Christoph Hellwig <hch@lst.de>
cc: Jens Axboe <axboe@kernel.dk>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: John Hubbard <jhubbard@nvidia.com>
cc: David Hildenbrand <david@redhat.com>
cc: Matthew Wilcox <willy@infradead.org>
cc: coda@cs.cmu.edu
cc: codalist@coda.cs.cmu.edu
cc: linux-unionfs@vger.kernel.org
cc: linux-block@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-mm@kvack.org
---

Notes:
    ver #17)
     - Use vfs_splice_read() helper rather than open-coding checks.

 fs/coda/file.c | 29 ++++++++++++++++++++++++++++-
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/fs/coda/file.c b/fs/coda/file.c
index 3f3c81e6b1ab..12b26bd13564 100644
--- a/fs/coda/file.c
+++ b/fs/coda/file.c
@@ -23,6 +23,7 @@
 #include <linux/slab.h>
 #include <linux/uaccess.h>
 #include <linux/uio.h>
+#include <linux/splice.h>
 
 #include <linux/coda.h>
 #include "coda_psdev.h"
@@ -94,6 +95,32 @@ coda_file_write_iter(struct kiocb *iocb, struct iov_iter *to)
 	return ret;
 }
 
+static ssize_t
+coda_file_splice_read(struct file *coda_file, loff_t *ppos,
+		      struct pipe_inode_info *pipe,
+		      size_t len, unsigned int flags)
+{
+	struct inode *coda_inode = file_inode(coda_file);
+	struct coda_file_info *cfi = coda_ftoc(coda_file);
+	struct file *in = cfi->cfi_container;
+	loff_t ki_pos = *ppos;
+	ssize_t ret;
+
+	ret = venus_access_intent(coda_inode->i_sb, coda_i2f(coda_inode),
+				  &cfi->cfi_access_intent,
+				  len, ki_pos, CODA_ACCESS_TYPE_READ);
+	if (ret)
+		goto finish_read;
+
+	ret = vfs_splice_read(in, ppos, pipe, len, flags);
+
+finish_read:
+	venus_access_intent(coda_inode->i_sb, coda_i2f(coda_inode),
+			    &cfi->cfi_access_intent,
+			    len, ki_pos, CODA_ACCESS_TYPE_READ_FINISH);
+	return ret;
+}
+
 static void
 coda_vm_open(struct vm_area_struct *vma)
 {
@@ -302,5 +329,5 @@ const struct file_operations coda_file_operations = {
 	.open		= coda_open,
 	.release	= coda_release,
 	.fsync		= coda_fsync,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= coda_file_splice_read,
 };


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v21 12/30] tty, proc, kernfs, random: Use copy_splice_read()
  2023-05-20  0:00 [PATCH v21 00/30] splice: Kill ITER_PIPE David Howells
                   ` (10 preceding siblings ...)
  2023-05-20  0:00 ` [PATCH v21 11/30] coda: " David Howells
@ 2023-05-20  0:00 ` David Howells
  2023-05-20  0:00 ` [PATCH v21 13/30] net: Make sock_splice_read() use copy_splice_read() by default David Howells
                   ` (17 subsequent siblings)
  29 siblings, 0 replies; 72+ messages in thread
From: David Howells @ 2023-05-20  0:00 UTC (permalink / raw)
  To: Jens Axboe, Al Viro, Christoph Hellwig
  Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton,
	David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Greg Kroah-Hartman,
	Christoph Hellwig, John Hubbard, Miklos Szeredi, Arnd Bergmann

Use copy_splice_read() for tty, procfs, kernfs and random files rather
than going through generic_file_splice_read() as they just copy the file
into the output buffer and don't splice pages.  This avoids the need for
them to have a ->read_folio() to satisfy filemap_splice_read().

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
cc: Christoph Hellwig <hch@lst.de>
cc: Jens Axboe <axboe@kernel.dk>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: John Hubbard <jhubbard@nvidia.com>
cc: David Hildenbrand <david@redhat.com>
cc: Matthew Wilcox <willy@infradead.org>
cc: Miklos Szeredi <miklos@szeredi.hu>
cc: Arnd Bergmann <arnd@arndb.de>
cc: linux-block@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-mm@kvack.org
---
 drivers/char/random.c | 4 ++--
 drivers/tty/tty_io.c  | 4 ++--
 fs/kernfs/file.c      | 2 +-
 fs/proc/inode.c       | 4 ++--
 fs/proc/proc_sysctl.c | 2 +-
 fs/proc_namespace.c   | 6 +++---
 6 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 253f2ddb8913..3cb37760dfec 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -1546,7 +1546,7 @@ const struct file_operations random_fops = {
 	.compat_ioctl = compat_ptr_ioctl,
 	.fasync = random_fasync,
 	.llseek = noop_llseek,
-	.splice_read = generic_file_splice_read,
+	.splice_read = copy_splice_read,
 	.splice_write = iter_file_splice_write,
 };
 
@@ -1557,7 +1557,7 @@ const struct file_operations urandom_fops = {
 	.compat_ioctl = compat_ptr_ioctl,
 	.fasync = random_fasync,
 	.llseek = noop_llseek,
-	.splice_read = generic_file_splice_read,
+	.splice_read = copy_splice_read,
 	.splice_write = iter_file_splice_write,
 };
 
diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index c84be40fb8df..4737a8f92c2e 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -466,7 +466,7 @@ static const struct file_operations tty_fops = {
 	.llseek		= no_llseek,
 	.read_iter	= tty_read,
 	.write_iter	= tty_write,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= copy_splice_read,
 	.splice_write	= iter_file_splice_write,
 	.poll		= tty_poll,
 	.unlocked_ioctl	= tty_ioctl,
@@ -481,7 +481,7 @@ static const struct file_operations console_fops = {
 	.llseek		= no_llseek,
 	.read_iter	= tty_read,
 	.write_iter	= redirected_tty_write,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= copy_splice_read,
 	.splice_write	= iter_file_splice_write,
 	.poll		= tty_poll,
 	.unlocked_ioctl	= tty_ioctl,
diff --git a/fs/kernfs/file.c b/fs/kernfs/file.c
index 40c4661f15b7..180906c36f51 100644
--- a/fs/kernfs/file.c
+++ b/fs/kernfs/file.c
@@ -1011,7 +1011,7 @@ const struct file_operations kernfs_file_fops = {
 	.release	= kernfs_fop_release,
 	.poll		= kernfs_fop_poll,
 	.fsync		= noop_fsync,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= copy_splice_read,
 	.splice_write	= iter_file_splice_write,
 };
 
diff --git a/fs/proc/inode.c b/fs/proc/inode.c
index f495fdb39151..67b09a1d9433 100644
--- a/fs/proc/inode.c
+++ b/fs/proc/inode.c
@@ -591,7 +591,7 @@ static const struct file_operations proc_iter_file_ops = {
 	.llseek		= proc_reg_llseek,
 	.read_iter	= proc_reg_read_iter,
 	.write		= proc_reg_write,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= copy_splice_read,
 	.poll		= proc_reg_poll,
 	.unlocked_ioctl	= proc_reg_unlocked_ioctl,
 	.mmap		= proc_reg_mmap,
@@ -617,7 +617,7 @@ static const struct file_operations proc_reg_file_ops_compat = {
 static const struct file_operations proc_iter_file_ops_compat = {
 	.llseek		= proc_reg_llseek,
 	.read_iter	= proc_reg_read_iter,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= copy_splice_read,
 	.write		= proc_reg_write,
 	.poll		= proc_reg_poll,
 	.unlocked_ioctl	= proc_reg_unlocked_ioctl,
diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index 8038833ff5b0..ae832e982003 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -868,7 +868,7 @@ static const struct file_operations proc_sys_file_operations = {
 	.poll		= proc_sys_poll,
 	.read_iter	= proc_sys_read,
 	.write_iter	= proc_sys_write,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= copy_splice_read,
 	.splice_write	= iter_file_splice_write,
 	.llseek		= default_llseek,
 };
diff --git a/fs/proc_namespace.c b/fs/proc_namespace.c
index 846f9455ae22..250eb5bf7b52 100644
--- a/fs/proc_namespace.c
+++ b/fs/proc_namespace.c
@@ -324,7 +324,7 @@ static int mountstats_open(struct inode *inode, struct file *file)
 const struct file_operations proc_mounts_operations = {
 	.open		= mounts_open,
 	.read_iter	= seq_read_iter,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= copy_splice_read,
 	.llseek		= seq_lseek,
 	.release	= mounts_release,
 	.poll		= mounts_poll,
@@ -333,7 +333,7 @@ const struct file_operations proc_mounts_operations = {
 const struct file_operations proc_mountinfo_operations = {
 	.open		= mountinfo_open,
 	.read_iter	= seq_read_iter,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= copy_splice_read,
 	.llseek		= seq_lseek,
 	.release	= mounts_release,
 	.poll		= mounts_poll,
@@ -342,7 +342,7 @@ const struct file_operations proc_mountinfo_operations = {
 const struct file_operations proc_mountstats_operations = {
 	.open		= mountstats_open,
 	.read_iter	= seq_read_iter,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= copy_splice_read,
 	.llseek		= seq_lseek,
 	.release	= mounts_release,
 };


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v21 13/30] net: Make sock_splice_read() use copy_splice_read() by default
  2023-05-20  0:00 [PATCH v21 00/30] splice: Kill ITER_PIPE David Howells
                   ` (11 preceding siblings ...)
  2023-05-20  0:00 ` [PATCH v21 12/30] tty, proc, kernfs, random: Use copy_splice_read() David Howells
@ 2023-05-20  0:00 ` David Howells
  2023-05-20  0:00 ` [PATCH v21 14/30] 9p: Add splice_read stub David Howells
                   ` (16 subsequent siblings)
  29 siblings, 0 replies; 72+ messages in thread
From: David Howells @ 2023-05-20  0:00 UTC (permalink / raw)
  To: Jens Axboe, Al Viro, Christoph Hellwig
  Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton,
	David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Christoph Hellwig,
	netdev

Make sock_splice_read() use copy_splice_read() by default as
file_splice_read() will return immediately with 0 as a socket has no
pagecache and is a zero-size file.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Christoph Hellwig <hch@lst.de>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: Jens Axboe <axboe@kernel.dk>
cc: netdev@vger.kernel.org
cc: linux-block@vger.kernel.org
cc: linux-mm@kvack.org
---
 net/socket.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/socket.c b/net/socket.c
index b7e01d0fe082..401778380195 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -1093,7 +1093,7 @@ static ssize_t sock_splice_read(struct file *file, loff_t *ppos,
 	struct socket *sock = file->private_data;
 
 	if (unlikely(!sock->ops->splice_read))
-		return generic_file_splice_read(file, ppos, pipe, len, flags);
+		return copy_splice_read(file, ppos, pipe, len, flags);
 
 	return sock->ops->splice_read(sock, ppos, pipe, len, flags);
 }


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v21 14/30] 9p:  Add splice_read stub
  2023-05-20  0:00 [PATCH v21 00/30] splice: Kill ITER_PIPE David Howells
                   ` (12 preceding siblings ...)
  2023-05-20  0:00 ` [PATCH v21 13/30] net: Make sock_splice_read() use copy_splice_read() by default David Howells
@ 2023-05-20  0:00 ` David Howells
  2023-05-20  0:00 ` [PATCH v21 15/30] afs: Provide a splice-read stub David Howells
                   ` (15 subsequent siblings)
  29 siblings, 0 replies; 72+ messages in thread
From: David Howells @ 2023-05-20  0:00 UTC (permalink / raw)
  To: Jens Axboe, Al Viro, Christoph Hellwig
  Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton,
	David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig,
	Dominique Martinet, Eric Van Hensbergen, Latchesar Ionkov,
	Christian Schoenebeck, v9fs

Add a splice_read stub for 9p.  We should use copy_splice_read() if
9PL_DIRECT is set and filemap_splice_read() otherwise.  Note that this
doesn't seem to be particularly related to O_DIRECT.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Christoph Hellwig <hch@lst.de>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: Jens Axboe <axboe@kernel.dk>
cc: Dominique Martinet <asmadeus@codewreck.org>
cc: Eric Van Hensbergen <ericvh@kernel.org>
cc: Latchesar Ionkov <lucho@ionkov.net>
cc: Christian Schoenebeck <linux_oss@crudebyte.com>
cc: v9fs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
cc: linux-block@vger.kernel.org
cc: linux-mm@kvack.org
---
 fs/9p/vfs_file.c | 26 ++++++++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/fs/9p/vfs_file.c b/fs/9p/vfs_file.c
index 6c31b8c8112d..2996fb00387f 100644
--- a/fs/9p/vfs_file.c
+++ b/fs/9p/vfs_file.c
@@ -374,6 +374,28 @@ v9fs_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
 	return ret;
 }
 
+/*
+ * v9fs_file_splice_read - splice-read from a file
+ * @in: The 9p file to read from
+ * @ppos: Where to find/update the file position
+ * @pipe: The pipe to splice into
+ * @len: The maximum amount of data to splice
+ * @flags: SPLICE_F_* flags
+ */
+static ssize_t v9fs_file_splice_read(struct file *in, loff_t *ppos,
+				     struct pipe_inode_info *pipe,
+				     size_t len, unsigned int flags)
+{
+	struct p9_fid *fid = in->private_data;
+
+	p9_debug(P9_DEBUG_VFS, "fid %d count %zu offset %lld\n",
+		 fid->fid, len, *ppos);
+
+	if (fid->mode & P9L_DIRECT)
+		return copy_splice_read(in, ppos, pipe, len, flags);
+	return filemap_splice_read(in, ppos, pipe, len, flags);
+}
+
 /**
  * v9fs_file_write_iter - write to a file
  * @iocb: The operation parameters
@@ -569,7 +591,7 @@ const struct file_operations v9fs_file_operations = {
 	.release = v9fs_dir_release,
 	.lock = v9fs_file_lock,
 	.mmap = generic_file_readonly_mmap,
-	.splice_read = generic_file_splice_read,
+	.splice_read = v9fs_file_splice_read,
 	.splice_write = iter_file_splice_write,
 	.fsync = v9fs_file_fsync,
 };
@@ -583,7 +605,7 @@ const struct file_operations v9fs_file_operations_dotl = {
 	.lock = v9fs_file_lock_dotl,
 	.flock = v9fs_file_flock_dotl,
 	.mmap = v9fs_file_mmap,
-	.splice_read = generic_file_splice_read,
+	.splice_read = v9fs_file_splice_read,
 	.splice_write = iter_file_splice_write,
 	.fsync = v9fs_file_fsync_dotl,
 };


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v21 15/30] afs: Provide a splice-read stub
  2023-05-20  0:00 [PATCH v21 00/30] splice: Kill ITER_PIPE David Howells
                   ` (13 preceding siblings ...)
  2023-05-20  0:00 ` [PATCH v21 14/30] 9p: Add splice_read stub David Howells
@ 2023-05-20  0:00 ` David Howells
  2023-05-20  0:00 ` [PATCH v21 16/30] ceph: " David Howells
                   ` (14 subsequent siblings)
  29 siblings, 0 replies; 72+ messages in thread
From: David Howells @ 2023-05-20  0:00 UTC (permalink / raw)
  To: Jens Axboe, Al Viro, Christoph Hellwig
  Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton,
	David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig,
	Marc Dionne, linux-afs

Provide a splice_read stub for AFS to call afs_validate() before going into
generic_file_splice_read() so that we're likely to have a callback promise
from the server.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Christoph Hellwig <hch@lst.de>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: Jens Axboe <axboe@kernel.dk>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-block@vger.kernel.org
cc: linux-mm@kvack.org
---
 fs/afs/file.c | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/fs/afs/file.c b/fs/afs/file.c
index 719b31374879..d8a6b09dadf7 100644
--- a/fs/afs/file.c
+++ b/fs/afs/file.c
@@ -25,6 +25,9 @@ static void afs_invalidate_folio(struct folio *folio, size_t offset,
 static bool afs_release_folio(struct folio *folio, gfp_t gfp_flags);
 
 static ssize_t afs_file_read_iter(struct kiocb *iocb, struct iov_iter *iter);
+static ssize_t afs_file_splice_read(struct file *in, loff_t *ppos,
+				    struct pipe_inode_info *pipe,
+				    size_t len, unsigned int flags);
 static void afs_vm_open(struct vm_area_struct *area);
 static void afs_vm_close(struct vm_area_struct *area);
 static vm_fault_t afs_vm_map_pages(struct vm_fault *vmf, pgoff_t start_pgoff, pgoff_t end_pgoff);
@@ -36,7 +39,7 @@ const struct file_operations afs_file_operations = {
 	.read_iter	= afs_file_read_iter,
 	.write_iter	= afs_file_write,
 	.mmap		= afs_file_mmap,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= afs_file_splice_read,
 	.splice_write	= iter_file_splice_write,
 	.fsync		= afs_fsync,
 	.lock		= afs_lock,
@@ -587,3 +590,18 @@ static ssize_t afs_file_read_iter(struct kiocb *iocb, struct iov_iter *iter)
 
 	return generic_file_read_iter(iocb, iter);
 }
+
+static ssize_t afs_file_splice_read(struct file *in, loff_t *ppos,
+				    struct pipe_inode_info *pipe,
+				    size_t len, unsigned int flags)
+{
+	struct afs_vnode *vnode = AFS_FS_I(file_inode(in));
+	struct afs_file *af = in->private_data;
+	int ret;
+
+	ret = afs_validate(vnode, af->key);
+	if (ret < 0)
+		return ret;
+
+	return generic_file_splice_read(in, ppos, pipe, len, flags);
+}


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v21 16/30] ceph: Provide a splice-read stub
  2023-05-20  0:00 [PATCH v21 00/30] splice: Kill ITER_PIPE David Howells
                   ` (14 preceding siblings ...)
  2023-05-20  0:00 ` [PATCH v21 15/30] afs: Provide a splice-read stub David Howells
@ 2023-05-20  0:00 ` David Howells
  2023-05-22  2:12   ` Xiubo Li
  2023-05-20  0:00 ` [PATCH v21 17/30] ecryptfs: " David Howells
                   ` (13 subsequent siblings)
  29 siblings, 1 reply; 72+ messages in thread
From: David Howells @ 2023-05-20  0:00 UTC (permalink / raw)
  To: Jens Axboe, Al Viro, Christoph Hellwig
  Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton,
	David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig, Xiubo Li,
	Ilya Dryomov, ceph-devel

Provide a splice_read stub for Ceph.  This does the inode shutdown check
before proceeding and jumps to copy_splice_read() if the file has inline
data or is a synchronous file.

We try and get FILE_RD and either FILE_CACHE and/or FILE_LAZYIO caps and
hold them across filemap_splice_read().  If we fail to get FILE_CACHE or
FILE_LAZYIO capabilities, we use copy_splice_read() instead.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Christoph Hellwig <hch@lst.de>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: Jens Axboe <axboe@kernel.dk>
cc: Xiubo Li <xiubli@redhat.com>
cc: Ilya Dryomov <idryomov@gmail.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: ceph-devel@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-block@vger.kernel.org
cc: linux-mm@kvack.org
---

Notes:
    ver #21)
     - Need to drop the caps ref.
     - O_DIRECT is handled by the caller.

 fs/ceph/file.c | 65 +++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 64 insertions(+), 1 deletion(-)

diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index f4d8bf7dec88..4285f6cb5d3b 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -1745,6 +1745,69 @@ static ssize_t ceph_read_iter(struct kiocb *iocb, struct iov_iter *to)
 	return ret;
 }
 
+/*
+ * Wrap filemap_splice_read with checks for cap bits on the inode.
+ * Atomically grab references, so that those bits are not released
+ * back to the MDS mid-read.
+ */
+static ssize_t ceph_splice_read(struct file *in, loff_t *ppos,
+				struct pipe_inode_info *pipe,
+				size_t len, unsigned int flags)
+{
+	struct ceph_file_info *fi = in->private_data;
+	struct inode *inode = file_inode(in);
+	struct ceph_inode_info *ci = ceph_inode(inode);
+	ssize_t ret;
+	int want = 0, got = 0;
+	CEPH_DEFINE_RW_CONTEXT(rw_ctx, 0);
+
+	dout("splice_read %p %llx.%llx %llu~%zu trying to get caps on %p\n",
+	     inode, ceph_vinop(inode), *ppos, len, inode);
+
+	if (ceph_inode_is_shutdown(inode))
+		return -ESTALE;
+
+	if (ceph_has_inline_data(ci) ||
+	    (fi->flags & CEPH_F_SYNC))
+		return copy_splice_read(in, ppos, pipe, len, flags);
+
+	ceph_start_io_read(inode);
+
+	want = CEPH_CAP_FILE_CACHE;
+	if (fi->fmode & CEPH_FILE_MODE_LAZY)
+		want |= CEPH_CAP_FILE_LAZYIO;
+
+	ret = ceph_get_caps(in, CEPH_CAP_FILE_RD, want, -1, &got);
+	if (ret < 0)
+		goto out_end;
+
+	if ((got & (CEPH_CAP_FILE_CACHE | CEPH_CAP_FILE_LAZYIO)) == 0) {
+		dout("splice_read/sync %p %llx.%llx %llu~%zu got cap refs on %s\n",
+		     inode, ceph_vinop(inode), *ppos, len,
+		     ceph_cap_string(got));
+
+		ceph_put_cap_refs(ci, got);
+		ceph_end_io_read(inode);
+		return copy_splice_read(in, ppos, pipe, len, flags);
+	}
+
+	dout("splice_read %p %llx.%llx %llu~%zu got cap refs on %s\n",
+	     inode, ceph_vinop(inode), *ppos, len, ceph_cap_string(got));
+
+	rw_ctx.caps = got;
+	ceph_add_rw_context(fi, &rw_ctx);
+	ret = filemap_splice_read(in, ppos, pipe, len, flags);
+	ceph_del_rw_context(fi, &rw_ctx);
+
+	dout("splice_read %p %llx.%llx dropping cap refs on %s = %zd\n",
+	     inode, ceph_vinop(inode), ceph_cap_string(got), ret);
+
+	ceph_put_cap_refs(ci, got);
+out_end:
+	ceph_end_io_read(inode);
+	return ret;
+}
+
 /*
  * Take cap references to avoid releasing caps to MDS mid-write.
  *
@@ -2593,7 +2656,7 @@ const struct file_operations ceph_file_fops = {
 	.lock = ceph_lock,
 	.setlease = simple_nosetlease,
 	.flock = ceph_flock,
-	.splice_read = generic_file_splice_read,
+	.splice_read = ceph_splice_read,
 	.splice_write = iter_file_splice_write,
 	.unlocked_ioctl = ceph_ioctl,
 	.compat_ioctl = compat_ptr_ioctl,


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v21 17/30] ecryptfs: Provide a splice-read stub
  2023-05-20  0:00 [PATCH v21 00/30] splice: Kill ITER_PIPE David Howells
                   ` (15 preceding siblings ...)
  2023-05-20  0:00 ` [PATCH v21 16/30] ceph: " David Howells
@ 2023-05-20  0:00 ` David Howells
  2023-05-20  0:00 ` [PATCH v21 18/30] ext4: " David Howells
                   ` (12 subsequent siblings)
  29 siblings, 0 replies; 72+ messages in thread
From: David Howells @ 2023-05-20  0:00 UTC (permalink / raw)
  To: Jens Axboe, Al Viro, Christoph Hellwig
  Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton,
	David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig,
	Tyler Hicks, ecryptfs

Provide a splice_read stub for ecryptfs to update the access time on the
lower file after the operation.  Splicing from a direct I/O fd will update
the access time when ->read_iter() is called.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Christoph Hellwig <hch@lst.de>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: Jens Axboe <axboe@kernel.dk>
cc: Tyler Hicks <code@tyhicks.com>
cc: ecryptfs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-block@vger.kernel.org
cc: linux-mm@kvack.org
---
 fs/ecryptfs/file.c | 27 ++++++++++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/fs/ecryptfs/file.c b/fs/ecryptfs/file.c
index 268b74499c28..284395587be0 100644
--- a/fs/ecryptfs/file.c
+++ b/fs/ecryptfs/file.c
@@ -44,6 +44,31 @@ static ssize_t ecryptfs_read_update_atime(struct kiocb *iocb,
 	return rc;
 }
 
+/*
+ * ecryptfs_splice_read_update_atime
+ *
+ * generic_file_splice_read updates the atime of upper layer inode.  But, it
+ * doesn't give us a chance to update the atime of the lower layer inode.  This
+ * function is a wrapper to generic_file_read.  It updates the atime of the
+ * lower level inode if generic_file_read returns without any errors. This is
+ * to be used only for file reads.  The function to be used for directory reads
+ * is ecryptfs_read.
+ */
+static ssize_t ecryptfs_splice_read_update_atime(struct file *in, loff_t *ppos,
+						 struct pipe_inode_info *pipe,
+						 size_t len, unsigned int flags)
+{
+	ssize_t rc;
+	const struct path *path;
+
+	rc = generic_file_splice_read(in, ppos, pipe, len, flags);
+	if (rc >= 0) {
+		path = ecryptfs_dentry_to_lower_path(in->f_path.dentry);
+		touch_atime(path);
+	}
+	return rc;
+}
+
 struct ecryptfs_getdents_callback {
 	struct dir_context ctx;
 	struct dir_context *caller;
@@ -414,5 +439,5 @@ const struct file_operations ecryptfs_main_fops = {
 	.release = ecryptfs_release,
 	.fsync = ecryptfs_fsync,
 	.fasync = ecryptfs_fasync,
-	.splice_read = generic_file_splice_read,
+	.splice_read = ecryptfs_splice_read_update_atime,
 };


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v21 18/30] ext4: Provide a splice-read stub
  2023-05-20  0:00 [PATCH v21 00/30] splice: Kill ITER_PIPE David Howells
                   ` (16 preceding siblings ...)
  2023-05-20  0:00 ` [PATCH v21 17/30] ecryptfs: " David Howells
@ 2023-05-20  0:00 ` David Howells
  2023-05-20  4:12   ` Christoph Hellwig
  2023-05-20  7:21   ` David Howells
  2023-05-20  0:00 ` [PATCH v21 19/30] f2fs: " David Howells
                   ` (11 subsequent siblings)
  29 siblings, 2 replies; 72+ messages in thread
From: David Howells @ 2023-05-20  0:00 UTC (permalink / raw)
  To: Jens Axboe, Al Viro, Christoph Hellwig
  Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton,
	David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig,
	Theodore Ts'o, Andreas Dilger, linux-ext4

Provide a splice_read stub for Ext4.  This does the inode shutdown check
before proceeding.  Splicing from DAX files and O_DIRECT fds is handled by
the caller.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Christoph Hellwig <hch@lst.de>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: Jens Axboe <axboe@kernel.dk>
cc: "Theodore Ts'o" <tytso@mit.edu>
cc: Andreas Dilger <adilger.kernel@dilger.ca>
cc: linux-ext4@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-block@vger.kernel.org
cc: linux-mm@kvack.org
---
 fs/ext4/file.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index d101b3b0c7da..9f8bbd9d131c 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -147,6 +147,17 @@ static ssize_t ext4_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
 	return generic_file_read_iter(iocb, to);
 }
 
+static ssize_t ext4_file_splice_read(struct file *in, loff_t *ppos,
+				     struct pipe_inode_info *pipe,
+				     size_t len, unsigned int flags)
+{
+	struct inode *inode = file_inode(in);
+
+	if (unlikely(ext4_forced_shutdown(EXT4_SB(inode->i_sb))))
+		return -EIO;
+	return generic_file_splice_read(in, ppos, pipe, len, flags);
+}
+
 /*
  * Called when an inode is released. Note that this is different
  * from ext4_file_open: open gets called at every open, but release
@@ -957,7 +968,7 @@ const struct file_operations ext4_file_operations = {
 	.release	= ext4_release_file,
 	.fsync		= ext4_sync_file,
 	.get_unmapped_area = thp_get_unmapped_area,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= ext4_file_splice_read,
 	.splice_write	= iter_file_splice_write,
 	.fallocate	= ext4_fallocate,
 };


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v21 19/30] f2fs: Provide a splice-read stub
  2023-05-20  0:00 [PATCH v21 00/30] splice: Kill ITER_PIPE David Howells
                   ` (17 preceding siblings ...)
  2023-05-20  0:00 ` [PATCH v21 18/30] ext4: " David Howells
@ 2023-05-20  0:00 ` David Howells
  2023-07-06  0:18   ` [f2fs-dev] " patchwork-bot+f2fs
  2023-05-20  0:00 ` [PATCH v21 20/30] nfs: " David Howells
                   ` (10 subsequent siblings)
  29 siblings, 1 reply; 72+ messages in thread
From: David Howells @ 2023-05-20  0:00 UTC (permalink / raw)
  To: Jens Axboe, Al Viro, Christoph Hellwig
  Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton,
	David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig,
	Jaegeuk Kim, Chao Yu, linux-f2fs-devel

Provide a splice_read stub for f2fs.  This does some checks and tracing
before calling filemap_splice_read() and will update the iostats
afterwards.  Direct I/O is handled by the caller.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Christoph Hellwig <hch@lst.de>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: Jens Axboe <axboe@kernel.dk>
cc: Jaegeuk Kim <jaegeuk@kernel.org>
cc: Chao Yu <chao@kernel.org>
cc: linux-f2fs-devel@lists.sourceforge.net
cc: linux-fsdevel@vger.kernel.org
cc: linux-block@vger.kernel.org
cc: linux-mm@kvack.org
---
 fs/f2fs/file.c | 43 +++++++++++++++++++++++++++++++++++--------
 1 file changed, 35 insertions(+), 8 deletions(-)

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 5ac53d2627d2..3fce122997ca 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -4367,22 +4367,23 @@ static ssize_t f2fs_dio_read_iter(struct kiocb *iocb, struct iov_iter *to)
 	return ret;
 }
 
-static void f2fs_trace_rw_file_path(struct kiocb *iocb, size_t count, int rw)
+static void f2fs_trace_rw_file_path(struct file *file, loff_t pos, size_t count,
+				    int rw)
 {
-	struct inode *inode = file_inode(iocb->ki_filp);
+	struct inode *inode = file_inode(file);
 	char *buf, *path;
 
 	buf = f2fs_getname(F2FS_I_SB(inode));
 	if (!buf)
 		return;
-	path = dentry_path_raw(file_dentry(iocb->ki_filp), buf, PATH_MAX);
+	path = dentry_path_raw(file_dentry(file), buf, PATH_MAX);
 	if (IS_ERR(path))
 		goto free_buf;
 	if (rw == WRITE)
-		trace_f2fs_datawrite_start(inode, iocb->ki_pos, count,
+		trace_f2fs_datawrite_start(inode, pos, count,
 				current->pid, path, current->comm);
 	else
-		trace_f2fs_dataread_start(inode, iocb->ki_pos, count,
+		trace_f2fs_dataread_start(inode, pos, count,
 				current->pid, path, current->comm);
 free_buf:
 	f2fs_putname(buf);
@@ -4398,7 +4399,8 @@ static ssize_t f2fs_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
 		return -EOPNOTSUPP;
 
 	if (trace_f2fs_dataread_start_enabled())
-		f2fs_trace_rw_file_path(iocb, iov_iter_count(to), READ);
+		f2fs_trace_rw_file_path(iocb->ki_filp, iocb->ki_pos,
+					iov_iter_count(to), READ);
 
 	if (f2fs_should_use_dio(inode, iocb, to)) {
 		ret = f2fs_dio_read_iter(iocb, to);
@@ -4413,6 +4415,30 @@ static ssize_t f2fs_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
 	return ret;
 }
 
+static ssize_t f2fs_file_splice_read(struct file *in, loff_t *ppos,
+				     struct pipe_inode_info *pipe,
+				     size_t len, unsigned int flags)
+{
+	struct inode *inode = file_inode(in);
+	const loff_t pos = *ppos;
+	ssize_t ret;
+
+	if (!f2fs_is_compress_backend_ready(inode))
+		return -EOPNOTSUPP;
+
+	if (trace_f2fs_dataread_start_enabled())
+		f2fs_trace_rw_file_path(in, pos, len, READ);
+
+	ret = filemap_splice_read(in, ppos, pipe, len, flags);
+	if (ret > 0)
+		f2fs_update_iostat(F2FS_I_SB(inode), inode,
+				   APP_BUFFERED_READ_IO, ret);
+
+	if (trace_f2fs_dataread_end_enabled())
+		trace_f2fs_dataread_end(inode, pos, ret);
+	return ret;
+}
+
 static ssize_t f2fs_write_checks(struct kiocb *iocb, struct iov_iter *from)
 {
 	struct file *file = iocb->ki_filp;
@@ -4714,7 +4740,8 @@ static ssize_t f2fs_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
 		ret = preallocated;
 	} else {
 		if (trace_f2fs_datawrite_start_enabled())
-			f2fs_trace_rw_file_path(iocb, orig_count, WRITE);
+			f2fs_trace_rw_file_path(iocb->ki_filp, iocb->ki_pos,
+						orig_count, WRITE);
 
 		/* Do the actual write. */
 		ret = dio ?
@@ -4919,7 +4946,7 @@ const struct file_operations f2fs_file_operations = {
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	= f2fs_compat_ioctl,
 #endif
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= f2fs_file_splice_read,
 	.splice_write	= iter_file_splice_write,
 	.fadvise	= f2fs_file_fadvise,
 };


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v21 20/30] nfs: Provide a splice-read stub
  2023-05-20  0:00 [PATCH v21 00/30] splice: Kill ITER_PIPE David Howells
                   ` (18 preceding siblings ...)
  2023-05-20  0:00 ` [PATCH v21 19/30] f2fs: " David Howells
@ 2023-05-20  0:00 ` David Howells
  2023-05-20  0:00 ` [PATCH v21 21/30] ntfs3: " David Howells
                   ` (9 subsequent siblings)
  29 siblings, 0 replies; 72+ messages in thread
From: David Howells @ 2023-05-20  0:00 UTC (permalink / raw)
  To: Jens Axboe, Al Viro, Christoph Hellwig
  Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton,
	David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig,
	Trond Myklebust, Anna Schumaker, linux-nfs

Provide a splice_read stub for NFS.  This locks the inode around
filemap_splice_read() and revalidates the mapping.  Splicing from direct
I/O is handled by the caller.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Christoph Hellwig <hch@lst.de>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: Jens Axboe <axboe@kernel.dk>
cc: Trond Myklebust <trond.myklebust@hammerspace.com>
cc: Anna Schumaker <anna@kernel.org>
cc: linux-nfs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-block@vger.kernel.org
cc: linux-mm@kvack.org
---

Notes:
    ver #21)
     - Fix pos -> ppos in dprintk().

 fs/nfs/file.c     | 23 ++++++++++++++++++++++-
 fs/nfs/internal.h |  2 ++
 fs/nfs/nfs4file.c |  2 +-
 3 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/fs/nfs/file.c b/fs/nfs/file.c
index f0edf5a36237..f5615fdaa9ed 100644
--- a/fs/nfs/file.c
+++ b/fs/nfs/file.c
@@ -178,6 +178,27 @@ nfs_file_read(struct kiocb *iocb, struct iov_iter *to)
 }
 EXPORT_SYMBOL_GPL(nfs_file_read);
 
+ssize_t
+nfs_file_splice_read(struct file *in, loff_t *ppos, struct pipe_inode_info *pipe,
+		     size_t len, unsigned int flags)
+{
+	struct inode *inode = file_inode(in);
+	ssize_t result;
+
+	dprintk("NFS: splice_read(%pD2, %zu@%lu)\n", in, len, *ppos);
+
+	nfs_start_io_read(inode);
+	result = nfs_revalidate_mapping(inode, in->f_mapping);
+	if (!result) {
+		result = filemap_splice_read(in, ppos, pipe, len, flags);
+		if (result > 0)
+			nfs_add_stats(inode, NFSIOS_NORMALREADBYTES, result);
+	}
+	nfs_end_io_read(inode);
+	return result;
+}
+EXPORT_SYMBOL_GPL(nfs_file_splice_read);
+
 int
 nfs_file_mmap(struct file * file, struct vm_area_struct * vma)
 {
@@ -879,7 +900,7 @@ const struct file_operations nfs_file_operations = {
 	.fsync		= nfs_file_fsync,
 	.lock		= nfs_lock,
 	.flock		= nfs_flock,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= nfs_file_splice_read,
 	.splice_write	= iter_file_splice_write,
 	.check_flags	= nfs_check_flags,
 	.setlease	= simple_nosetlease,
diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
index 3cc027d3bd58..b5f21d35d30e 100644
--- a/fs/nfs/internal.h
+++ b/fs/nfs/internal.h
@@ -416,6 +416,8 @@ static inline __u32 nfs_access_xattr_mask(const struct nfs_server *server)
 int nfs_file_fsync(struct file *file, loff_t start, loff_t end, int datasync);
 loff_t nfs_file_llseek(struct file *, loff_t, int);
 ssize_t nfs_file_read(struct kiocb *, struct iov_iter *);
+ssize_t nfs_file_splice_read(struct file *in, loff_t *ppos, struct pipe_inode_info *pipe,
+			     size_t len, unsigned int flags);
 int nfs_file_mmap(struct file *, struct vm_area_struct *);
 ssize_t nfs_file_write(struct kiocb *, struct iov_iter *);
 int nfs_file_release(struct inode *, struct file *);
diff --git a/fs/nfs/nfs4file.c b/fs/nfs/nfs4file.c
index 2563ed8580f3..4aeadd6e1a6d 100644
--- a/fs/nfs/nfs4file.c
+++ b/fs/nfs/nfs4file.c
@@ -454,7 +454,7 @@ const struct file_operations nfs4_file_operations = {
 	.fsync		= nfs_file_fsync,
 	.lock		= nfs_lock,
 	.flock		= nfs_flock,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= nfs_file_splice_read,
 	.splice_write	= iter_file_splice_write,
 	.check_flags	= nfs_check_flags,
 	.setlease	= nfs4_setlease,


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v21 21/30] ntfs3: Provide a splice-read stub
  2023-05-20  0:00 [PATCH v21 00/30] splice: Kill ITER_PIPE David Howells
                   ` (19 preceding siblings ...)
  2023-05-20  0:00 ` [PATCH v21 20/30] nfs: " David Howells
@ 2023-05-20  0:00 ` David Howells
  2023-05-20  0:00 ` [PATCH v21 22/30] ocfs2: " David Howells
                   ` (8 subsequent siblings)
  29 siblings, 0 replies; 72+ messages in thread
From: David Howells @ 2023-05-20  0:00 UTC (permalink / raw)
  To: Jens Axboe, Al Viro, Christoph Hellwig
  Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton,
	David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig,
	Konstantin Komarov, ntfs3

Provide a splice_read stub for NTFS3 to perform various checks before
allowing the operation to proceed.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Christoph Hellwig <hch@lst.de>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: Jens Axboe <axboe@kernel.dk>
cc: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
cc: ntfs3@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
cc: linux-block@vger.kernel.org
cc: linux-mm@kvack.org
---
 fs/ntfs3/file.c | 31 ++++++++++++++++++++++++++++++-
 1 file changed, 30 insertions(+), 1 deletion(-)

diff --git a/fs/ntfs3/file.c b/fs/ntfs3/file.c
index 9a3d55c367d9..667c9dc68b58 100644
--- a/fs/ntfs3/file.c
+++ b/fs/ntfs3/file.c
@@ -744,6 +744,35 @@ static ssize_t ntfs_file_read_iter(struct kiocb *iocb, struct iov_iter *iter)
 	return generic_file_read_iter(iocb, iter);
 }
 
+static ssize_t ntfs_file_splice_read(struct file *in, loff_t *ppos,
+				     struct pipe_inode_info *pipe,
+				     size_t len, unsigned int flags)
+{
+	struct inode *inode = in->f_mapping->host;
+	struct ntfs_inode *ni = ntfs_i(inode);
+
+	if (is_encrypted(ni)) {
+		ntfs_inode_warn(inode, "encrypted i/o not supported");
+		return -EOPNOTSUPP;
+	}
+
+#ifndef CONFIG_NTFS3_LZX_XPRESS
+	if (ni->ni_flags & NI_FLAG_COMPRESSED_MASK) {
+		ntfs_inode_warn(
+			inode,
+			"activate CONFIG_NTFS3_LZX_XPRESS to read external compressed files");
+		return -EOPNOTSUPP;
+	}
+#endif
+
+	if (is_dedup(ni)) {
+		ntfs_inode_warn(inode, "read deduplicated not supported");
+		return -EOPNOTSUPP;
+	}
+
+	return generic_file_splice_read(in, ppos, pipe, len, flags);
+}
+
 /*
  * ntfs_get_frame_pages
  *
@@ -1159,7 +1188,7 @@ const struct file_operations ntfs_file_operations = {
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	= ntfs_compat_ioctl,
 #endif
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= ntfs_file_splice_read,
 	.mmap		= ntfs_file_mmap,
 	.open		= ntfs_file_open,
 	.fsync		= generic_file_fsync,


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v21 22/30] ocfs2: Provide a splice-read stub
  2023-05-20  0:00 [PATCH v21 00/30] splice: Kill ITER_PIPE David Howells
                   ` (20 preceding siblings ...)
  2023-05-20  0:00 ` [PATCH v21 21/30] ntfs3: " David Howells
@ 2023-05-20  0:00 ` David Howells
  2023-05-22  2:49   ` Joseph Qi
                     ` (2 more replies)
  2023-05-20  0:00 ` [PATCH v21 23/30] orangefs: " David Howells
                   ` (7 subsequent siblings)
  29 siblings, 3 replies; 72+ messages in thread
From: David Howells @ 2023-05-20  0:00 UTC (permalink / raw)
  To: Jens Axboe, Al Viro, Christoph Hellwig
  Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton,
	David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig,
	Mark Fasheh, Joel Becker, Joseph Qi, ocfs2-devel

Provide a splice_read stub for ocfs2.  This emits trace lines and does an
atime lock/update before calling filemap_splice_read().  Splicing from
direct I/O is handled by the caller.

A couple of new tracepoints are added for this purpose.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Christoph Hellwig <hch@lst.de>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: Jens Axboe <axboe@kernel.dk>
cc: Mark Fasheh <mark@fasheh.com>
cc: Joel Becker <jlbec@evilplan.org>
cc: Joseph Qi <joseph.qi@linux.alibaba.com>
cc: ocfs2-devel@oss.oracle.com
cc: linux-fsdevel@vger.kernel.org
cc: linux-block@vger.kernel.org
cc: linux-mm@kvack.org
---
 fs/ocfs2/file.c        | 39 ++++++++++++++++++++++++++++++++++++++-
 fs/ocfs2/ocfs2_trace.h |  3 +++
 2 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
index efb09de4343d..f7e00b5689d5 100644
--- a/fs/ocfs2/file.c
+++ b/fs/ocfs2/file.c
@@ -2581,6 +2581,43 @@ static ssize_t ocfs2_file_read_iter(struct kiocb *iocb,
 	return ret;
 }
 
+static ssize_t ocfs2_file_splice_read(struct file *in, loff_t *ppos,
+				      struct pipe_inode_info *pipe,
+				      size_t len, unsigned int flags)
+{
+	struct inode *inode = file_inode(in);
+	ssize_t ret = 0;
+	int lock_level = 0;
+
+	trace_ocfs2_file_splice_read(inode, in, in->f_path.dentry,
+				     (unsigned long long)OCFS2_I(inode)->ip_blkno,
+				     in->f_path.dentry->d_name.len,
+				     in->f_path.dentry->d_name.name,
+				     0);
+
+	/*
+	 * We're fine letting folks race truncates and extending writes with
+	 * read across the cluster, just like they can locally.  Hence no
+	 * rw_lock during read.
+	 *
+	 * Take and drop the meta data lock to update inode fields like i_size.
+	 * This allows the checks down below generic_file_splice_read() a
+	 * chance of actually working.
+	 */
+	ret = ocfs2_inode_lock_atime(inode, in->f_path.mnt, &lock_level, true);
+	if (ret < 0) {
+		if (ret != -EAGAIN)
+			mlog_errno(ret);
+		goto bail;
+	}
+	ocfs2_inode_unlock(inode, lock_level);
+
+	ret = filemap_splice_read(in, ppos, pipe, len, flags);
+	trace_filemap_splice_read_ret(ret);
+bail:
+	return ret;
+}
+
 /* Refer generic_file_llseek_unlocked() */
 static loff_t ocfs2_file_llseek(struct file *file, loff_t offset, int whence)
 {
@@ -2744,7 +2781,7 @@ const struct file_operations ocfs2_fops = {
 #endif
 	.lock		= ocfs2_lock,
 	.flock		= ocfs2_flock,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= ocfs2_file_splice_read,
 	.splice_write	= iter_file_splice_write,
 	.fallocate	= ocfs2_fallocate,
 	.remap_file_range = ocfs2_remap_file_range,
diff --git a/fs/ocfs2/ocfs2_trace.h b/fs/ocfs2/ocfs2_trace.h
index dc4bce1649c1..b8c3d1702076 100644
--- a/fs/ocfs2/ocfs2_trace.h
+++ b/fs/ocfs2/ocfs2_trace.h
@@ -1319,6 +1319,8 @@ DEFINE_OCFS2_FILE_OPS(ocfs2_file_splice_write);
 
 DEFINE_OCFS2_FILE_OPS(ocfs2_file_read_iter);
 
+DEFINE_OCFS2_FILE_OPS(ocfs2_file_splice_read);
+
 DEFINE_OCFS2_ULL_ULL_ULL_EVENT(ocfs2_truncate_file);
 
 DEFINE_OCFS2_ULL_ULL_EVENT(ocfs2_truncate_file_error);
@@ -1470,6 +1472,7 @@ TRACE_EVENT(ocfs2_prepare_inode_for_write,
 );
 
 DEFINE_OCFS2_INT_EVENT(generic_file_read_iter_ret);
+DEFINE_OCFS2_INT_EVENT(filemap_splice_read_ret);
 
 /* End of trace events for fs/ocfs2/file.c. */
 


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v21 23/30] orangefs: Provide a splice-read stub
  2023-05-20  0:00 [PATCH v21 00/30] splice: Kill ITER_PIPE David Howells
                   ` (21 preceding siblings ...)
  2023-05-20  0:00 ` [PATCH v21 22/30] ocfs2: " David Howells
@ 2023-05-20  0:00 ` David Howells
  2023-05-20  0:00 ` [PATCH v21 24/30] xfs: " David Howells
                   ` (6 subsequent siblings)
  29 siblings, 0 replies; 72+ messages in thread
From: David Howells @ 2023-05-20  0:00 UTC (permalink / raw)
  To: Jens Axboe, Al Viro, Christoph Hellwig
  Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton,
	David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig,
	Mike Marshall, Martin Brandenburg, devel

Provide a splice_read stub for ocfs2.  This increments the read stats and
then locks the inode across the call to filemap_splice_read() and a
revalidation of the mapping.  Splicing from direct I/O is done by the
caller.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Christoph Hellwig <hch@lst.de>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: Jens Axboe <axboe@kernel.dk>
cc: Mike Marshall <hubcap@omnibond.com>
cc: Martin Brandenburg <martin@omnibond.com>
cc: devel@lists.orangefs.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-block@vger.kernel.org
cc: linux-mm@kvack.org
---
 fs/orangefs/file.c | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/fs/orangefs/file.c b/fs/orangefs/file.c
index 1a4301a38aa7..d68372241b30 100644
--- a/fs/orangefs/file.c
+++ b/fs/orangefs/file.c
@@ -337,6 +337,26 @@ static ssize_t orangefs_file_read_iter(struct kiocb *iocb,
 	return ret;
 }
 
+static ssize_t orangefs_file_splice_read(struct file *in, loff_t *ppos,
+					 struct pipe_inode_info *pipe,
+					 size_t len, unsigned int flags)
+{
+	struct inode *inode = file_inode(in);
+	ssize_t ret;
+
+	orangefs_stats.reads++;
+
+	down_read(&inode->i_rwsem);
+	ret = orangefs_revalidate_mapping(inode);
+	if (ret)
+		goto out;
+
+	ret = filemap_splice_read(in, ppos, pipe, len, flags);
+out:
+	up_read(&inode->i_rwsem);
+	return ret;
+}
+
 static ssize_t orangefs_file_write_iter(struct kiocb *iocb,
     struct iov_iter *iter)
 {
@@ -556,7 +576,7 @@ const struct file_operations orangefs_file_operations = {
 	.lock		= orangefs_lock,
 	.mmap		= orangefs_file_mmap,
 	.open		= generic_file_open,
-	.splice_read    = generic_file_splice_read,
+	.splice_read    = orangefs_file_splice_read,
 	.splice_write   = iter_file_splice_write,
 	.flush		= orangefs_flush,
 	.release	= orangefs_file_release,


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v21 24/30] xfs: Provide a splice-read stub
  2023-05-20  0:00 [PATCH v21 00/30] splice: Kill ITER_PIPE David Howells
                   ` (22 preceding siblings ...)
  2023-05-20  0:00 ` [PATCH v21 23/30] orangefs: " David Howells
@ 2023-05-20  0:00 ` David Howells
  2023-05-20  4:13   ` Christoph Hellwig
  2023-05-20  0:00 ` [PATCH v21 25/30] zonefs: " David Howells
                   ` (5 subsequent siblings)
  29 siblings, 1 reply; 72+ messages in thread
From: David Howells @ 2023-05-20  0:00 UTC (permalink / raw)
  To: Jens Axboe, Al Viro, Christoph Hellwig
  Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton,
	David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig,
	Darrick J . Wong, linux-xfs

Provide a splice_read stub for XFS.  This does a stat count and a shutdown
check before proceeding, then emits a new trace line and locks the inode
across the call to filemap_splice_read() and adds to the stats afterwards.
Splicing from direct I/O or DAX is handled by the caller.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Christoph Hellwig <hch@lst.de>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: Jens Axboe <axboe@kernel.dk>
cc: Darrick J. Wong <djwong@kernel.org>
cc: linux-xfs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-block@vger.kernel.org
cc: linux-mm@kvack.org
---
 fs/xfs/xfs_file.c  | 30 +++++++++++++++++++++++++++++-
 fs/xfs/xfs_trace.h |  2 +-
 2 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index aede746541f8..08d632668e94 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -306,6 +306,34 @@ xfs_file_read_iter(
 	return ret;
 }
 
+STATIC ssize_t
+xfs_file_splice_read(
+	struct file		*in,
+	loff_t			*ppos,
+	struct pipe_inode_info	*pipe,
+	size_t			len,
+	unsigned int		flags)
+{
+	struct inode		*inode = file_inode(in);
+	struct xfs_inode	*ip = XFS_I(inode);
+	struct xfs_mount	*mp = ip->i_mount;
+	ssize_t			ret = 0;
+
+	XFS_STATS_INC(mp, xs_read_calls);
+
+	if (xfs_is_shutdown(mp))
+		return -EIO;
+
+	trace_xfs_file_splice_read(ip, *ppos, len);
+
+	xfs_ilock(ip, XFS_IOLOCK_SHARED);
+	ret = filemap_splice_read(in, ppos, pipe, len, flags);
+	xfs_iunlock(ip, XFS_IOLOCK_SHARED);
+	if (ret > 0)
+		XFS_STATS_ADD(mp, xs_read_bytes, ret);
+	return ret;
+}
+
 /*
  * Common pre-write limit and setup checks.
  *
@@ -1423,7 +1451,7 @@ const struct file_operations xfs_file_operations = {
 	.llseek		= xfs_file_llseek,
 	.read_iter	= xfs_file_read_iter,
 	.write_iter	= xfs_file_write_iter,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= xfs_file_splice_read,
 	.splice_write	= iter_file_splice_write,
 	.iopoll		= iocb_bio_iopoll,
 	.unlocked_ioctl	= xfs_file_ioctl,
diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
index cd4ca5b1fcb0..4db669203149 100644
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@ -1445,7 +1445,6 @@ DEFINE_RW_EVENT(xfs_file_direct_write);
 DEFINE_RW_EVENT(xfs_file_dax_write);
 DEFINE_RW_EVENT(xfs_reflink_bounce_dio_write);
 
-
 DECLARE_EVENT_CLASS(xfs_imap_class,
 	TP_PROTO(struct xfs_inode *ip, xfs_off_t offset, ssize_t count,
 		 int whichfork, struct xfs_bmbt_irec *irec),
@@ -1535,6 +1534,7 @@ DEFINE_SIMPLE_IO_EVENT(xfs_zero_eof);
 DEFINE_SIMPLE_IO_EVENT(xfs_end_io_direct_write);
 DEFINE_SIMPLE_IO_EVENT(xfs_end_io_direct_write_unwritten);
 DEFINE_SIMPLE_IO_EVENT(xfs_end_io_direct_write_append);
+DEFINE_SIMPLE_IO_EVENT(xfs_file_splice_read);
 
 DECLARE_EVENT_CLASS(xfs_itrunc_class,
 	TP_PROTO(struct xfs_inode *ip, xfs_fsize_t new_size),


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v21 25/30] zonefs: Provide a splice-read stub
  2023-05-20  0:00 [PATCH v21 00/30] splice: Kill ITER_PIPE David Howells
                   ` (23 preceding siblings ...)
  2023-05-20  0:00 ` [PATCH v21 24/30] xfs: " David Howells
@ 2023-05-20  0:00 ` David Howells
  2023-05-20  0:00 ` [PATCH v21 26/30] splice: Convert trace/seq to use copy_splice_read() David Howells
                   ` (4 subsequent siblings)
  29 siblings, 0 replies; 72+ messages in thread
From: David Howells @ 2023-05-20  0:00 UTC (permalink / raw)
  To: Jens Axboe, Al Viro, Christoph Hellwig
  Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton,
	David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig,
	Darrick J . Wong, linux-xfs

Provide a splice_read stub for zonefs.  This does some checks before
proceeding and locks the inode across the call to filemap_splice_read() and
a size check in case of truncation.  Splicing from direct I/O is handled by
the caller.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Christoph Hellwig <hch@lst.de>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: Jens Axboe <axboe@kernel.dk>
cc: Darrick J. Wong <djwong@kernel.org>
cc: linux-xfs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-block@vger.kernel.org
cc: linux-mm@kvack.org
---
 fs/zonefs/file.c | 40 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 39 insertions(+), 1 deletion(-)

diff --git a/fs/zonefs/file.c b/fs/zonefs/file.c
index 132f01d3461f..65d4c4fe6364 100644
--- a/fs/zonefs/file.c
+++ b/fs/zonefs/file.c
@@ -752,6 +752,44 @@ static ssize_t zonefs_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
 	return ret;
 }
 
+static ssize_t zonefs_file_splice_read(struct file *in, loff_t *ppos,
+				       struct pipe_inode_info *pipe,
+				       size_t len, unsigned int flags)
+{
+	struct inode *inode = file_inode(in);
+	struct zonefs_inode_info *zi = ZONEFS_I(inode);
+	struct zonefs_zone *z = zonefs_inode_zone(inode);
+	loff_t isize;
+	ssize_t ret = 0;
+
+	/* Offline zones cannot be read */
+	if (unlikely(IS_IMMUTABLE(inode) && !(inode->i_mode & 0777)))
+		return -EPERM;
+
+	if (*ppos >= z->z_capacity)
+		return 0;
+
+	inode_lock_shared(inode);
+
+	/* Limit read operations to written data */
+	mutex_lock(&zi->i_truncate_mutex);
+	isize = i_size_read(inode);
+	if (*ppos >= isize)
+		len = 0;
+	else
+		len = min_t(loff_t, len, isize - *ppos);
+	mutex_unlock(&zi->i_truncate_mutex);
+
+	if (len > 0) {
+		ret = filemap_splice_read(in, ppos, pipe, len, flags);
+		if (ret == -EIO)
+			zonefs_io_error(inode, false);
+	}
+
+	inode_unlock_shared(inode);
+	return ret;
+}
+
 /*
  * Write open accounting is done only for sequential files.
  */
@@ -896,7 +934,7 @@ const struct file_operations zonefs_file_operations = {
 	.llseek		= zonefs_file_llseek,
 	.read_iter	= zonefs_file_read_iter,
 	.write_iter	= zonefs_file_write_iter,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= zonefs_file_splice_read,
 	.splice_write	= iter_file_splice_write,
 	.iopoll		= iocb_bio_iopoll,
 };


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v21 26/30] splice: Convert trace/seq to use copy_splice_read()
  2023-05-20  0:00 [PATCH v21 00/30] splice: Kill ITER_PIPE David Howells
                   ` (24 preceding siblings ...)
  2023-05-20  0:00 ` [PATCH v21 25/30] zonefs: " David Howells
@ 2023-05-20  0:00 ` David Howells
  2023-05-20  4:14   ` Christoph Hellwig
                     ` (3 more replies)
  2023-05-20  0:00 ` [PATCH v21 27/30] cifs: Use filemap_splice_read() David Howells
                   ` (3 subsequent siblings)
  29 siblings, 4 replies; 72+ messages in thread
From: David Howells @ 2023-05-20  0:00 UTC (permalink / raw)
  To: Jens Axboe, Al Viro, Christoph Hellwig
  Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton,
	David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig,
	Steven Rostedt, Masami Hiramatsu, linux-trace-kernel

For the splice from the trace seq buffer, just use copy_splice_read().

In the future, something better can probably be done by gifting pages from
seq->buf into the pipe, but that would require changing seq->buf into a
vmap over an array of pages.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Christoph Hellwig <hch@lst.de>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: Jens Axboe <axboe@kernel.dk>
cc: Steven Rostedt <rostedt@goodmis.org>
cc: Masami Hiramatsu <mhiramat@kernel.org>
cc: linux-kernel@vger.kernel.org
cc: linux-trace-kernel@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-block@vger.kernel.org
cc: linux-mm@kvack.org
---
 kernel/trace/trace.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index ebc59781456a..c210d02fac97 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -5171,7 +5171,7 @@ static const struct file_operations tracing_fops = {
 	.open		= tracing_open,
 	.read		= seq_read,
 	.read_iter	= seq_read_iter,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= copy_splice_read,
 	.write		= tracing_write_stub,
 	.llseek		= tracing_lseek,
 	.release	= tracing_release,


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v21 27/30] cifs: Use filemap_splice_read()
  2023-05-20  0:00 [PATCH v21 00/30] splice: Kill ITER_PIPE David Howells
                   ` (25 preceding siblings ...)
  2023-05-20  0:00 ` [PATCH v21 26/30] splice: Convert trace/seq to use copy_splice_read() David Howells
@ 2023-05-20  0:00 ` David Howells
  2023-05-20  0:00 ` [PATCH v21 28/30] splice: Use filemap_splice_read() instead of generic_file_splice_read() David Howells
                   ` (2 subsequent siblings)
  29 siblings, 0 replies; 72+ messages in thread
From: David Howells @ 2023-05-20  0:00 UTC (permalink / raw)
  To: Jens Axboe, Al Viro, Christoph Hellwig
  Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton,
	David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig,
	Paulo Alcantara, Steve French, John Hubbard, linux-cifs

Make cifs use filemap_splice_read() rather than doing its own version of
generic_file_splice_read().

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Steve French <smfrench@gmail.com>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: David Hildenbrand <david@redhat.com>
cc: John Hubbard <jhubbard@nvidia.com>
cc: linux-mm@kvack.org
cc: linux-block@vger.kernel.org
cc: linux-cifs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
---

Notes:
    ver #21)
     - Switch to filemap_splice_read() rather than generic_file_splice_read().
    
    ver #20)
     - Don't remove the export of filemap_splice_read().
    
    ver #18)
     - Split out from change to generic_file_splice_read().

 fs/cifs/cifsfs.c |  8 ++++----
 fs/cifs/cifsfs.h |  3 ---
 fs/cifs/file.c   | 16 ----------------
 3 files changed, 4 insertions(+), 23 deletions(-)

diff --git a/fs/cifs/cifsfs.c b/fs/cifs/cifsfs.c
index fa2477bbcc86..4f4492eb975f 100644
--- a/fs/cifs/cifsfs.c
+++ b/fs/cifs/cifsfs.c
@@ -1376,7 +1376,7 @@ const struct file_operations cifs_file_ops = {
 	.fsync = cifs_fsync,
 	.flush = cifs_flush,
 	.mmap  = cifs_file_mmap,
-	.splice_read = cifs_splice_read,
+	.splice_read = filemap_splice_read,
 	.splice_write = iter_file_splice_write,
 	.llseek = cifs_llseek,
 	.unlocked_ioctl	= cifs_ioctl,
@@ -1396,7 +1396,7 @@ const struct file_operations cifs_file_strict_ops = {
 	.fsync = cifs_strict_fsync,
 	.flush = cifs_flush,
 	.mmap = cifs_file_strict_mmap,
-	.splice_read = cifs_splice_read,
+	.splice_read = filemap_splice_read,
 	.splice_write = iter_file_splice_write,
 	.llseek = cifs_llseek,
 	.unlocked_ioctl	= cifs_ioctl,
@@ -1434,7 +1434,7 @@ const struct file_operations cifs_file_nobrl_ops = {
 	.fsync = cifs_fsync,
 	.flush = cifs_flush,
 	.mmap  = cifs_file_mmap,
-	.splice_read = cifs_splice_read,
+	.splice_read = filemap_splice_read,
 	.splice_write = iter_file_splice_write,
 	.llseek = cifs_llseek,
 	.unlocked_ioctl	= cifs_ioctl,
@@ -1452,7 +1452,7 @@ const struct file_operations cifs_file_strict_nobrl_ops = {
 	.fsync = cifs_strict_fsync,
 	.flush = cifs_flush,
 	.mmap = cifs_file_strict_mmap,
-	.splice_read = cifs_splice_read,
+	.splice_read = filemap_splice_read,
 	.splice_write = iter_file_splice_write,
 	.llseek = cifs_llseek,
 	.unlocked_ioctl	= cifs_ioctl,
diff --git a/fs/cifs/cifsfs.h b/fs/cifs/cifsfs.h
index 74cd6fafb33e..d7274eefc666 100644
--- a/fs/cifs/cifsfs.h
+++ b/fs/cifs/cifsfs.h
@@ -100,9 +100,6 @@ extern ssize_t cifs_strict_readv(struct kiocb *iocb, struct iov_iter *to);
 extern ssize_t cifs_user_writev(struct kiocb *iocb, struct iov_iter *from);
 extern ssize_t cifs_direct_writev(struct kiocb *iocb, struct iov_iter *from);
 extern ssize_t cifs_strict_writev(struct kiocb *iocb, struct iov_iter *from);
-extern ssize_t cifs_splice_read(struct file *in, loff_t *ppos,
-				struct pipe_inode_info *pipe, size_t len,
-				unsigned int flags);
 extern int cifs_flock(struct file *pfile, int cmd, struct file_lock *plock);
 extern int cifs_lock(struct file *, int, struct file_lock *);
 extern int cifs_fsync(struct file *, loff_t, loff_t, int);
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index 023496207c18..375a8037a3f3 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -5078,19 +5078,3 @@ const struct address_space_operations cifs_addr_ops_smallbuf = {
 	.launder_folio = cifs_launder_folio,
 	.migrate_folio = filemap_migrate_folio,
 };
-
-/*
- * Splice data from a file into a pipe.
- */
-ssize_t cifs_splice_read(struct file *in, loff_t *ppos,
-			 struct pipe_inode_info *pipe, size_t len,
-			 unsigned int flags)
-{
-	if (unlikely(*ppos >= file_inode(in)->i_sb->s_maxbytes))
-		return 0;
-	if (unlikely(!len))
-		return 0;
-	if (in->f_flags & O_DIRECT)
-		return copy_splice_read(in, ppos, pipe, len, flags);
-	return filemap_splice_read(in, ppos, pipe, len, flags);
-}


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v21 28/30] splice: Use filemap_splice_read() instead of generic_file_splice_read()
  2023-05-20  0:00 [PATCH v21 00/30] splice: Kill ITER_PIPE David Howells
                   ` (26 preceding siblings ...)
  2023-05-20  0:00 ` [PATCH v21 27/30] cifs: Use filemap_splice_read() David Howells
@ 2023-05-20  0:00 ` David Howells
  2023-05-20  4:14   ` Christoph Hellwig
  2023-05-20  9:56   ` Christian Brauner
  2023-05-20  0:00 ` [PATCH v21 29/30] splice: Remove generic_file_splice_read() David Howells
  2023-05-20  0:00 ` [PATCH v21 30/30] iov_iter: Kill ITER_PIPE David Howells
  29 siblings, 2 replies; 72+ messages in thread
From: David Howells @ 2023-05-20  0:00 UTC (permalink / raw)
  To: Jens Axboe, Al Viro, Christoph Hellwig
  Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton,
	David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig,
	John Hubbard

Replace pointers to generic_file_splice_read() with calls to
filemap_splice_read().

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Christoph Hellwig <hch@lst.de>
cc: Jens Axboe <axboe@kernel.dk>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: David Hildenbrand <david@redhat.com>
cc: John Hubbard <jhubbard@nvidia.com>
cc: linux-mm@kvack.org
cc: linux-block@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
---
 block/fops.c            | 2 +-
 fs/adfs/file.c          | 2 +-
 fs/affs/file.c          | 2 +-
 fs/afs/file.c           | 2 +-
 fs/bfs/file.c           | 2 +-
 fs/btrfs/file.c         | 2 +-
 fs/cramfs/inode.c       | 2 +-
 fs/ecryptfs/file.c      | 4 ++--
 fs/erofs/data.c         | 2 +-
 fs/exfat/file.c         | 2 +-
 fs/ext2/file.c          | 2 +-
 fs/ext4/file.c          | 2 +-
 fs/fat/file.c           | 2 +-
 fs/fuse/file.c          | 2 +-
 fs/gfs2/file.c          | 4 ++--
 fs/hfs/inode.c          | 2 +-
 fs/hfsplus/inode.c      | 2 +-
 fs/hostfs/hostfs_kern.c | 2 +-
 fs/hpfs/file.c          | 2 +-
 fs/jffs2/file.c         | 2 +-
 fs/jfs/file.c           | 2 +-
 fs/minix/file.c         | 2 +-
 fs/nilfs2/file.c        | 2 +-
 fs/ntfs/file.c          | 2 +-
 fs/ntfs3/file.c         | 2 +-
 fs/ocfs2/file.c         | 4 ++--
 fs/omfs/file.c          | 2 +-
 fs/ramfs/file-mmu.c     | 2 +-
 fs/ramfs/file-nommu.c   | 2 +-
 fs/read_write.c         | 2 +-
 fs/reiserfs/file.c      | 2 +-
 fs/romfs/mmap-nommu.c   | 2 +-
 fs/sysv/file.c          | 2 +-
 fs/ubifs/file.c         | 2 +-
 fs/udf/file.c           | 2 +-
 fs/ufs/file.c           | 2 +-
 fs/vboxsf/file.c        | 2 +-
 37 files changed, 40 insertions(+), 40 deletions(-)

diff --git a/block/fops.c b/block/fops.c
index d2e6be4e3d1c..6c9aa028af6e 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -691,7 +691,7 @@ const struct file_operations def_blk_fops = {
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	= compat_blkdev_ioctl,
 #endif
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= filemap_splice_read,
 	.splice_write	= iter_file_splice_write,
 	.fallocate	= blkdev_fallocate,
 };
diff --git a/fs/adfs/file.c b/fs/adfs/file.c
index 754afb14a6ff..ee80718aaeec 100644
--- a/fs/adfs/file.c
+++ b/fs/adfs/file.c
@@ -28,7 +28,7 @@ const struct file_operations adfs_file_operations = {
 	.mmap		= generic_file_mmap,
 	.fsync		= generic_file_fsync,
 	.write_iter	= generic_file_write_iter,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= filemap_splice_read,
 };
 
 const struct inode_operations adfs_file_inode_operations = {
diff --git a/fs/affs/file.c b/fs/affs/file.c
index 8daeed31e1af..e43f2f007ac1 100644
--- a/fs/affs/file.c
+++ b/fs/affs/file.c
@@ -1001,7 +1001,7 @@ const struct file_operations affs_file_operations = {
 	.open		= affs_file_open,
 	.release	= affs_file_release,
 	.fsync		= affs_file_fsync,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= filemap_splice_read,
 };
 
 const struct inode_operations affs_file_inode_operations = {
diff --git a/fs/afs/file.c b/fs/afs/file.c
index d8a6b09dadf7..d37dd201752b 100644
--- a/fs/afs/file.c
+++ b/fs/afs/file.c
@@ -603,5 +603,5 @@ static ssize_t afs_file_splice_read(struct file *in, loff_t *ppos,
 	if (ret < 0)
 		return ret;
 
-	return generic_file_splice_read(in, ppos, pipe, len, flags);
+	return filemap_splice_read(in, ppos, pipe, len, flags);
 }
diff --git a/fs/bfs/file.c b/fs/bfs/file.c
index 57ae5ee6deec..adc2230079c6 100644
--- a/fs/bfs/file.c
+++ b/fs/bfs/file.c
@@ -27,7 +27,7 @@ const struct file_operations bfs_file_operations = {
 	.read_iter	= generic_file_read_iter,
 	.write_iter	= generic_file_write_iter,
 	.mmap		= generic_file_mmap,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= filemap_splice_read,
 };
 
 static int bfs_move_block(unsigned long from, unsigned long to,
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index f649647392e0..71426c6408fa 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -3825,7 +3825,7 @@ static ssize_t btrfs_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
 const struct file_operations btrfs_file_operations = {
 	.llseek		= btrfs_file_llseek,
 	.read_iter      = btrfs_file_read_iter,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= filemap_splice_read,
 	.write_iter	= btrfs_file_write_iter,
 	.splice_write	= iter_file_splice_write,
 	.mmap		= btrfs_file_mmap,
diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c
index 006ef68d7ff6..27c6597aa1be 100644
--- a/fs/cramfs/inode.c
+++ b/fs/cramfs/inode.c
@@ -473,7 +473,7 @@ static unsigned int cramfs_physmem_mmap_capabilities(struct file *file)
 static const struct file_operations cramfs_physmem_fops = {
 	.llseek			= generic_file_llseek,
 	.read_iter		= generic_file_read_iter,
-	.splice_read		= generic_file_splice_read,
+	.splice_read		= filemap_splice_read,
 	.mmap			= cramfs_physmem_mmap,
 #ifndef CONFIG_MMU
 	.get_unmapped_area	= cramfs_physmem_get_unmapped_area,
diff --git a/fs/ecryptfs/file.c b/fs/ecryptfs/file.c
index 284395587be0..ce0a3c5ed0ca 100644
--- a/fs/ecryptfs/file.c
+++ b/fs/ecryptfs/file.c
@@ -47,7 +47,7 @@ static ssize_t ecryptfs_read_update_atime(struct kiocb *iocb,
 /*
  * ecryptfs_splice_read_update_atime
  *
- * generic_file_splice_read updates the atime of upper layer inode.  But, it
+ * filemap_splice_read updates the atime of upper layer inode.  But, it
  * doesn't give us a chance to update the atime of the lower layer inode.  This
  * function is a wrapper to generic_file_read.  It updates the atime of the
  * lower level inode if generic_file_read returns without any errors. This is
@@ -61,7 +61,7 @@ static ssize_t ecryptfs_splice_read_update_atime(struct file *in, loff_t *ppos,
 	ssize_t rc;
 	const struct path *path;
 
-	rc = generic_file_splice_read(in, ppos, pipe, len, flags);
+	rc = filemap_splice_read(in, ppos, pipe, len, flags);
 	if (rc >= 0) {
 		path = ecryptfs_dentry_to_lower_path(in->f_path.dentry);
 		touch_atime(path);
diff --git a/fs/erofs/data.c b/fs/erofs/data.c
index 6fe9a779fa91..db5e4b7636ec 100644
--- a/fs/erofs/data.c
+++ b/fs/erofs/data.c
@@ -448,5 +448,5 @@ const struct file_operations erofs_file_fops = {
 	.llseek		= generic_file_llseek,
 	.read_iter	= erofs_file_read_iter,
 	.mmap		= erofs_file_mmap,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= filemap_splice_read,
 };
diff --git a/fs/exfat/file.c b/fs/exfat/file.c
index e99183a74611..3cbd270e0cba 100644
--- a/fs/exfat/file.c
+++ b/fs/exfat/file.c
@@ -389,7 +389,7 @@ const struct file_operations exfat_file_operations = {
 #endif
 	.mmap		= generic_file_mmap,
 	.fsync		= exfat_file_fsync,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= filemap_splice_read,
 	.splice_write	= iter_file_splice_write,
 };
 
diff --git a/fs/ext2/file.c b/fs/ext2/file.c
index 6b4bebe982ca..d1ae0f0a3726 100644
--- a/fs/ext2/file.c
+++ b/fs/ext2/file.c
@@ -192,7 +192,7 @@ const struct file_operations ext2_file_operations = {
 	.release	= ext2_release_file,
 	.fsync		= ext2_fsync,
 	.get_unmapped_area = thp_get_unmapped_area,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= filemap_splice_read,
 	.splice_write	= iter_file_splice_write,
 };
 
diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index 9f8bbd9d131c..e8261900f4f3 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -155,7 +155,7 @@ static ssize_t ext4_file_splice_read(struct file *in, loff_t *ppos,
 
 	if (unlikely(ext4_forced_shutdown(EXT4_SB(inode->i_sb))))
 		return -EIO;
-	return generic_file_splice_read(in, ppos, pipe, len, flags);
+	return filemap_splice_read(in, ppos, pipe, len, flags);
 }
 
 /*
diff --git a/fs/fat/file.c b/fs/fat/file.c
index 795a4fad5c40..456477946dd9 100644
--- a/fs/fat/file.c
+++ b/fs/fat/file.c
@@ -209,7 +209,7 @@ const struct file_operations fat_file_operations = {
 	.unlocked_ioctl	= fat_generic_ioctl,
 	.compat_ioctl	= compat_ptr_ioctl,
 	.fsync		= fat_file_fsync,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= filemap_splice_read,
 	.splice_write	= iter_file_splice_write,
 	.fallocate	= fat_fallocate,
 };
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 89d97f6188e0..4553124f5406 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -3252,7 +3252,7 @@ static const struct file_operations fuse_file_operations = {
 	.lock		= fuse_file_lock,
 	.get_unmapped_area = thp_get_unmapped_area,
 	.flock		= fuse_file_flock,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= filemap_splice_read,
 	.splice_write	= iter_file_splice_write,
 	.unlocked_ioctl	= fuse_file_ioctl,
 	.compat_ioctl	= fuse_file_compat_ioctl,
diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c
index 300844f50dcd..0f5ad5165361 100644
--- a/fs/gfs2/file.c
+++ b/fs/gfs2/file.c
@@ -1568,7 +1568,7 @@ const struct file_operations gfs2_file_fops = {
 	.fsync		= gfs2_fsync,
 	.lock		= gfs2_lock,
 	.flock		= gfs2_flock,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= filemap_splice_read,
 	.splice_write	= gfs2_file_splice_write,
 	.setlease	= simple_nosetlease,
 	.fallocate	= gfs2_fallocate,
@@ -1599,7 +1599,7 @@ const struct file_operations gfs2_file_fops_nolock = {
 	.open		= gfs2_open,
 	.release	= gfs2_release,
 	.fsync		= gfs2_fsync,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= filemap_splice_read,
 	.splice_write	= gfs2_file_splice_write,
 	.setlease	= generic_setlease,
 	.fallocate	= gfs2_fallocate,
diff --git a/fs/hfs/inode.c b/fs/hfs/inode.c
index 1f7bd068acf0..441d7fc952e3 100644
--- a/fs/hfs/inode.c
+++ b/fs/hfs/inode.c
@@ -694,7 +694,7 @@ static const struct file_operations hfs_file_operations = {
 	.read_iter	= generic_file_read_iter,
 	.write_iter	= generic_file_write_iter,
 	.mmap		= generic_file_mmap,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= filemap_splice_read,
 	.fsync		= hfs_file_fsync,
 	.open		= hfs_file_open,
 	.release	= hfs_file_release,
diff --git a/fs/hfsplus/inode.c b/fs/hfsplus/inode.c
index b21660475ac1..7d1a675e037d 100644
--- a/fs/hfsplus/inode.c
+++ b/fs/hfsplus/inode.c
@@ -372,7 +372,7 @@ static const struct file_operations hfsplus_file_operations = {
 	.read_iter	= generic_file_read_iter,
 	.write_iter	= generic_file_write_iter,
 	.mmap		= generic_file_mmap,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= filemap_splice_read,
 	.fsync		= hfsplus_file_fsync,
 	.open		= hfsplus_file_open,
 	.release	= hfsplus_file_release,
diff --git a/fs/hostfs/hostfs_kern.c b/fs/hostfs/hostfs_kern.c
index 28b4f15c19eb..87998df499f4 100644
--- a/fs/hostfs/hostfs_kern.c
+++ b/fs/hostfs/hostfs_kern.c
@@ -381,7 +381,7 @@ static int hostfs_fsync(struct file *file, loff_t start, loff_t end,
 
 static const struct file_operations hostfs_file_fops = {
 	.llseek		= generic_file_llseek,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= filemap_splice_read,
 	.splice_write	= iter_file_splice_write,
 	.read_iter	= generic_file_read_iter,
 	.write_iter	= generic_file_write_iter,
diff --git a/fs/hpfs/file.c b/fs/hpfs/file.c
index 88952d4a631e..1bb8d97cd9ae 100644
--- a/fs/hpfs/file.c
+++ b/fs/hpfs/file.c
@@ -259,7 +259,7 @@ const struct file_operations hpfs_file_ops =
 	.mmap		= generic_file_mmap,
 	.release	= hpfs_file_release,
 	.fsync		= hpfs_file_fsync,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= filemap_splice_read,
 	.unlocked_ioctl	= hpfs_ioctl,
 	.compat_ioctl	= compat_ptr_ioctl,
 };
diff --git a/fs/jffs2/file.c b/fs/jffs2/file.c
index 96b0275ce957..2345ca3f09ee 100644
--- a/fs/jffs2/file.c
+++ b/fs/jffs2/file.c
@@ -56,7 +56,7 @@ const struct file_operations jffs2_file_operations =
 	.unlocked_ioctl=jffs2_ioctl,
 	.mmap =		generic_file_readonly_mmap,
 	.fsync =	jffs2_fsync,
-	.splice_read =	generic_file_splice_read,
+	.splice_read =	filemap_splice_read,
 	.splice_write = iter_file_splice_write,
 };
 
diff --git a/fs/jfs/file.c b/fs/jfs/file.c
index 2ee35be49de1..01b6912e60f8 100644
--- a/fs/jfs/file.c
+++ b/fs/jfs/file.c
@@ -144,7 +144,7 @@ const struct file_operations jfs_file_operations = {
 	.read_iter	= generic_file_read_iter,
 	.write_iter	= generic_file_write_iter,
 	.mmap		= generic_file_mmap,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= filemap_splice_read,
 	.splice_write	= iter_file_splice_write,
 	.fsync		= jfs_fsync,
 	.release	= jfs_release,
diff --git a/fs/minix/file.c b/fs/minix/file.c
index 0dd05d47724a..906d192ab7f3 100644
--- a/fs/minix/file.c
+++ b/fs/minix/file.c
@@ -19,7 +19,7 @@ const struct file_operations minix_file_operations = {
 	.write_iter	= generic_file_write_iter,
 	.mmap		= generic_file_mmap,
 	.fsync		= generic_file_fsync,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= filemap_splice_read,
 };
 
 static int minix_setattr(struct mnt_idmap *idmap,
diff --git a/fs/nilfs2/file.c b/fs/nilfs2/file.c
index a265d391ffe9..a9eb3487efb2 100644
--- a/fs/nilfs2/file.c
+++ b/fs/nilfs2/file.c
@@ -140,7 +140,7 @@ const struct file_operations nilfs_file_operations = {
 	.open		= generic_file_open,
 	/* .release	= nilfs_release_file, */
 	.fsync		= nilfs_sync_file,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= filemap_splice_read,
 	.splice_write   = iter_file_splice_write,
 };
 
diff --git a/fs/ntfs/file.c b/fs/ntfs/file.c
index c481b14e4fd9..e5e0ed58670b 100644
--- a/fs/ntfs/file.c
+++ b/fs/ntfs/file.c
@@ -1992,7 +1992,7 @@ const struct file_operations ntfs_file_ops = {
 #endif /* NTFS_RW */
 	.mmap		= generic_file_mmap,
 	.open		= ntfs_file_open,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= filemap_splice_read,
 };
 
 const struct inode_operations ntfs_file_inode_ops = {
diff --git a/fs/ntfs3/file.c b/fs/ntfs3/file.c
index 667c9dc68b58..036efd85f60c 100644
--- a/fs/ntfs3/file.c
+++ b/fs/ntfs3/file.c
@@ -770,7 +770,7 @@ static ssize_t ntfs_file_splice_read(struct file *in, loff_t *ppos,
 		return -EOPNOTSUPP;
 	}
 
-	return generic_file_splice_read(in, ppos, pipe, len, flags);
+	return filemap_splice_read(in, ppos, pipe, len, flags);
 }
 
 /*
diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
index f7e00b5689d5..ff673c882816 100644
--- a/fs/ocfs2/file.c
+++ b/fs/ocfs2/file.c
@@ -2601,7 +2601,7 @@ static ssize_t ocfs2_file_splice_read(struct file *in, loff_t *ppos,
 	 * rw_lock during read.
 	 *
 	 * Take and drop the meta data lock to update inode fields like i_size.
-	 * This allows the checks down below generic_file_splice_read() a
+	 * This allows the checks down below filemap_splice_read() a
 	 * chance of actually working.
 	 */
 	ret = ocfs2_inode_lock_atime(inode, in->f_path.mnt, &lock_level, true);
@@ -2827,7 +2827,7 @@ const struct file_operations ocfs2_fops_no_plocks = {
 	.compat_ioctl   = ocfs2_compat_ioctl,
 #endif
 	.flock		= ocfs2_flock,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= filemap_splice_read,
 	.splice_write	= iter_file_splice_write,
 	.fallocate	= ocfs2_fallocate,
 	.remap_file_range = ocfs2_remap_file_range,
diff --git a/fs/omfs/file.c b/fs/omfs/file.c
index 0101f1f87b56..de8f57ee39ec 100644
--- a/fs/omfs/file.c
+++ b/fs/omfs/file.c
@@ -334,7 +334,7 @@ const struct file_operations omfs_file_operations = {
 	.write_iter = generic_file_write_iter,
 	.mmap = generic_file_mmap,
 	.fsync = generic_file_fsync,
-	.splice_read = generic_file_splice_read,
+	.splice_read = filemap_splice_read,
 };
 
 static int omfs_setattr(struct mnt_idmap *idmap,
diff --git a/fs/ramfs/file-mmu.c b/fs/ramfs/file-mmu.c
index 12af0490322f..c7a1aa3c882b 100644
--- a/fs/ramfs/file-mmu.c
+++ b/fs/ramfs/file-mmu.c
@@ -43,7 +43,7 @@ const struct file_operations ramfs_file_operations = {
 	.write_iter	= generic_file_write_iter,
 	.mmap		= generic_file_mmap,
 	.fsync		= noop_fsync,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= filemap_splice_read,
 	.splice_write	= iter_file_splice_write,
 	.llseek		= generic_file_llseek,
 	.get_unmapped_area	= ramfs_mmu_get_unmapped_area,
diff --git a/fs/ramfs/file-nommu.c b/fs/ramfs/file-nommu.c
index 9fbb9b5256f7..efb1b4c1a0a4 100644
--- a/fs/ramfs/file-nommu.c
+++ b/fs/ramfs/file-nommu.c
@@ -43,7 +43,7 @@ const struct file_operations ramfs_file_operations = {
 	.read_iter		= generic_file_read_iter,
 	.write_iter		= generic_file_write_iter,
 	.fsync			= noop_fsync,
-	.splice_read		= generic_file_splice_read,
+	.splice_read		= filemap_splice_read,
 	.splice_write		= iter_file_splice_write,
 	.llseek			= generic_file_llseek,
 };
diff --git a/fs/read_write.c b/fs/read_write.c
index a21ba3be7dbe..b07de77ef126 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -29,7 +29,7 @@ const struct file_operations generic_ro_fops = {
 	.llseek		= generic_file_llseek,
 	.read_iter	= generic_file_read_iter,
 	.mmap		= generic_file_readonly_mmap,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= filemap_splice_read,
 };
 
 EXPORT_SYMBOL(generic_ro_fops);
diff --git a/fs/reiserfs/file.c b/fs/reiserfs/file.c
index b54cc7048f02..8eb3ad3e8ae9 100644
--- a/fs/reiserfs/file.c
+++ b/fs/reiserfs/file.c
@@ -247,7 +247,7 @@ const struct file_operations reiserfs_file_operations = {
 	.fsync = reiserfs_sync_file,
 	.read_iter = generic_file_read_iter,
 	.write_iter = generic_file_write_iter,
-	.splice_read = generic_file_splice_read,
+	.splice_read = filemap_splice_read,
 	.splice_write = iter_file_splice_write,
 	.llseek = generic_file_llseek,
 };
diff --git a/fs/romfs/mmap-nommu.c b/fs/romfs/mmap-nommu.c
index 4578dc45e50a..4520ca413867 100644
--- a/fs/romfs/mmap-nommu.c
+++ b/fs/romfs/mmap-nommu.c
@@ -78,7 +78,7 @@ static unsigned romfs_mmap_capabilities(struct file *file)
 const struct file_operations romfs_ro_fops = {
 	.llseek			= generic_file_llseek,
 	.read_iter		= generic_file_read_iter,
-	.splice_read		= generic_file_splice_read,
+	.splice_read		= filemap_splice_read,
 	.mmap			= romfs_mmap,
 	.get_unmapped_area	= romfs_get_unmapped_area,
 	.mmap_capabilities	= romfs_mmap_capabilities,
diff --git a/fs/sysv/file.c b/fs/sysv/file.c
index 50eb92557a0f..c645f60bdb7f 100644
--- a/fs/sysv/file.c
+++ b/fs/sysv/file.c
@@ -26,7 +26,7 @@ const struct file_operations sysv_file_operations = {
 	.write_iter	= generic_file_write_iter,
 	.mmap		= generic_file_mmap,
 	.fsync		= generic_file_fsync,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= filemap_splice_read,
 };
 
 static int sysv_setattr(struct mnt_idmap *idmap,
diff --git a/fs/ubifs/file.c b/fs/ubifs/file.c
index 979ab1d9d0c3..6738fe43040b 100644
--- a/fs/ubifs/file.c
+++ b/fs/ubifs/file.c
@@ -1669,7 +1669,7 @@ const struct file_operations ubifs_file_operations = {
 	.mmap           = ubifs_file_mmap,
 	.fsync          = ubifs_fsync,
 	.unlocked_ioctl = ubifs_ioctl,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= filemap_splice_read,
 	.splice_write	= iter_file_splice_write,
 	.open		= fscrypt_file_open,
 #ifdef CONFIG_COMPAT
diff --git a/fs/udf/file.c b/fs/udf/file.c
index 8238f742377b..29daf5d5cb67 100644
--- a/fs/udf/file.c
+++ b/fs/udf/file.c
@@ -209,7 +209,7 @@ const struct file_operations udf_file_operations = {
 	.write_iter		= udf_file_write_iter,
 	.release		= udf_release_file,
 	.fsync			= generic_file_fsync,
-	.splice_read		= generic_file_splice_read,
+	.splice_read		= filemap_splice_read,
 	.splice_write		= iter_file_splice_write,
 	.llseek			= generic_file_llseek,
 };
diff --git a/fs/ufs/file.c b/fs/ufs/file.c
index 7e087581be7e..6558882a89ef 100644
--- a/fs/ufs/file.c
+++ b/fs/ufs/file.c
@@ -41,5 +41,5 @@ const struct file_operations ufs_file_operations = {
 	.mmap		= generic_file_mmap,
 	.open           = generic_file_open,
 	.fsync		= generic_file_fsync,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= filemap_splice_read,
 };
diff --git a/fs/vboxsf/file.c b/fs/vboxsf/file.c
index 572aa1c43b37..2307f8037efc 100644
--- a/fs/vboxsf/file.c
+++ b/fs/vboxsf/file.c
@@ -217,7 +217,7 @@ const struct file_operations vboxsf_reg_fops = {
 	.open = vboxsf_file_open,
 	.release = vboxsf_file_release,
 	.fsync = noop_fsync,
-	.splice_read = generic_file_splice_read,
+	.splice_read = filemap_splice_read,
 };
 
 const struct inode_operations vboxsf_reg_iops = {


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v21 29/30] splice: Remove generic_file_splice_read()
  2023-05-20  0:00 [PATCH v21 00/30] splice: Kill ITER_PIPE David Howells
                   ` (27 preceding siblings ...)
  2023-05-20  0:00 ` [PATCH v21 28/30] splice: Use filemap_splice_read() instead of generic_file_splice_read() David Howells
@ 2023-05-20  0:00 ` David Howells
  2023-05-20  4:14   ` Christoph Hellwig
  2023-05-20  9:57   ` Christian Brauner
  2023-05-20  0:00 ` [PATCH v21 30/30] iov_iter: Kill ITER_PIPE David Howells
  29 siblings, 2 replies; 72+ messages in thread
From: David Howells @ 2023-05-20  0:00 UTC (permalink / raw)
  To: Jens Axboe, Al Viro, Christoph Hellwig
  Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton,
	David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig,
	Steve French, John Hubbard, linux-cifs

Remove generic_file_splice_read() as it has been replaced with calls to
filemap_splice_read() and copy_splice_read().

With this, ITER_PIPE is no longer used.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Christoph Hellwig <hch@lst.de>
cc: Jens Axboe <axboe@kernel.dk>
cc: Steve French <smfrench@gmail.com>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: David Hildenbrand <david@redhat.com>
cc: John Hubbard <jhubbard@nvidia.com>
cc: linux-mm@kvack.org
cc: linux-block@vger.kernel.org
cc: linux-cifs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
---

Notes:
    ver #21)
     - Move zero-len check to vfs_splice_read().
     - Move s_maxbytes check to filemap_splice_read().
     - DIO (and DAX) are handled by vfs_splice_read().
    
    ver #20)
     - Use s_maxbytes from the backing store (in->f_mapping), not the front
       inode (especially for a blockdev).
    
    ver #18)
     - Split out the change to cifs to make it use generic_file_splice_read().
     - Split out the unexport of filemap_splice_read() (still needed by cifs).

 fs/splice.c        | 43 -------------------------------------------
 include/linux/fs.h |  2 --
 2 files changed, 45 deletions(-)

diff --git a/fs/splice.c b/fs/splice.c
index 8268248df3a9..9be4cb3b9879 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -386,49 +386,6 @@ ssize_t copy_splice_read(struct file *in, loff_t *ppos,
 }
 EXPORT_SYMBOL(copy_splice_read);
 
-/**
- * generic_file_splice_read - splice data from file to a pipe
- * @in:		file to splice from
- * @ppos:	position in @in
- * @pipe:	pipe to splice to
- * @len:	number of bytes to splice
- * @flags:	splice modifier flags
- *
- * Description:
- *    Will read pages from given file and fill them into a pipe. Can be
- *    used as long as it has more or less sane ->read_iter().
- *
- */
-ssize_t generic_file_splice_read(struct file *in, loff_t *ppos,
-				 struct pipe_inode_info *pipe, size_t len,
-				 unsigned int flags)
-{
-	struct iov_iter to;
-	struct kiocb kiocb;
-	int ret;
-
-	iov_iter_pipe(&to, ITER_DEST, pipe, len);
-	init_sync_kiocb(&kiocb, in);
-	kiocb.ki_pos = *ppos;
-	ret = call_read_iter(in, &kiocb, &to);
-	if (ret > 0) {
-		*ppos = kiocb.ki_pos;
-		file_accessed(in);
-	} else if (ret < 0) {
-		/* free what was emitted */
-		pipe_discard_from(pipe, to.start_head);
-		/*
-		 * callers of ->splice_read() expect -EAGAIN on
-		 * "can't put anything in there", rather than -EFAULT.
-		 */
-		if (ret == -EFAULT)
-			ret = -EAGAIN;
-	}
-
-	return ret;
-}
-EXPORT_SYMBOL(generic_file_splice_read);
-
 const struct pipe_buf_operations default_pipe_buf_ops = {
 	.release	= generic_pipe_buf_release,
 	.try_steal	= generic_pipe_buf_try_steal,
diff --git a/include/linux/fs.h b/include/linux/fs.h
index e3c22efa413e..08ba2ae1d3ce 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2755,8 +2755,6 @@ ssize_t filemap_splice_read(struct file *in, loff_t *ppos,
 ssize_t copy_splice_read(struct file *in, loff_t *ppos,
 			 struct pipe_inode_info *pipe,
 			 size_t len, unsigned int flags);
-extern ssize_t generic_file_splice_read(struct file *, loff_t *,
-		struct pipe_inode_info *, size_t, unsigned int);
 extern ssize_t iter_file_splice_write(struct pipe_inode_info *,
 		struct file *, loff_t *, size_t, unsigned int);
 extern ssize_t generic_splice_sendpage(struct pipe_inode_info *pipe,


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v21 30/30] iov_iter: Kill ITER_PIPE
  2023-05-20  0:00 [PATCH v21 00/30] splice: Kill ITER_PIPE David Howells
                   ` (28 preceding siblings ...)
  2023-05-20  0:00 ` [PATCH v21 29/30] splice: Remove generic_file_splice_read() David Howells
@ 2023-05-20  0:00 ` David Howells
  2023-05-20  4:15   ` Christoph Hellwig
  2023-05-20  9:58   ` Christian Brauner
  29 siblings, 2 replies; 72+ messages in thread
From: David Howells @ 2023-05-20  0:00 UTC (permalink / raw)
  To: Jens Axboe, Al Viro, Christoph Hellwig
  Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton,
	David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig,
	John Hubbard

The ITER_PIPE-type iterator was only used by generic_file_splice_read() and
that has been replaced and removed.  This leaves ITER_PIPE unused - so
remove it too.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
cc: Jens Axboe <axboe@kernel.dk>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: David Hildenbrand <david@redhat.com>
cc: John Hubbard <jhubbard@nvidia.com>
cc: linux-mm@kvack.org
cc: linux-block@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
---

Notes:
    ver #20)
     - Rebase and remove additional pipe reference.

 include/linux/uio.h |  14 --
 lib/iov_iter.c      | 431 +-------------------------------------------
 mm/filemap.c        |   3 +-
 3 files changed, 4 insertions(+), 444 deletions(-)

diff --git a/include/linux/uio.h b/include/linux/uio.h
index 044c1d8c230c..60c342bb7ab8 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -11,7 +11,6 @@
 #include <uapi/linux/uio.h>
 
 struct page;
-struct pipe_inode_info;
 
 typedef unsigned int __bitwise iov_iter_extraction_t;
 
@@ -25,7 +24,6 @@ enum iter_type {
 	ITER_IOVEC,
 	ITER_KVEC,
 	ITER_BVEC,
-	ITER_PIPE,
 	ITER_XARRAY,
 	ITER_DISCARD,
 	ITER_UBUF,
@@ -74,7 +72,6 @@ struct iov_iter {
 				const struct kvec *kvec;
 				const struct bio_vec *bvec;
 				struct xarray *xarray;
-				struct pipe_inode_info *pipe;
 				void __user *ubuf;
 			};
 			size_t count;
@@ -82,10 +79,6 @@ struct iov_iter {
 	};
 	union {
 		unsigned long nr_segs;
-		struct {
-			unsigned int head;
-			unsigned int start_head;
-		};
 		loff_t xarray_start;
 	};
 };
@@ -133,11 +126,6 @@ static inline bool iov_iter_is_bvec(const struct iov_iter *i)
 	return iov_iter_type(i) == ITER_BVEC;
 }
 
-static inline bool iov_iter_is_pipe(const struct iov_iter *i)
-{
-	return iov_iter_type(i) == ITER_PIPE;
-}
-
 static inline bool iov_iter_is_discard(const struct iov_iter *i)
 {
 	return iov_iter_type(i) == ITER_DISCARD;
@@ -286,8 +274,6 @@ void iov_iter_kvec(struct iov_iter *i, unsigned int direction, const struct kvec
 			unsigned long nr_segs, size_t count);
 void iov_iter_bvec(struct iov_iter *i, unsigned int direction, const struct bio_vec *bvec,
 			unsigned long nr_segs, size_t count);
-void iov_iter_pipe(struct iov_iter *i, unsigned int direction, struct pipe_inode_info *pipe,
-			size_t count);
 void iov_iter_discard(struct iov_iter *i, unsigned int direction, size_t count);
 void iov_iter_xarray(struct iov_iter *i, unsigned int direction, struct xarray *xarray,
 		     loff_t start, size_t count);
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 960223ed9199..f18138e0292a 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -14,8 +14,6 @@
 #include <linux/scatterlist.h>
 #include <linux/instrumented.h>
 
-#define PIPE_PARANOIA /* for now */
-
 /* covers ubuf and kbuf alike */
 #define iterate_buf(i, n, base, len, off, __p, STEP) {		\
 	size_t __maybe_unused off = 0;				\
@@ -198,150 +196,6 @@ static int copyin(void *to, const void __user *from, size_t n)
 	return res;
 }
 
-#ifdef PIPE_PARANOIA
-static bool sanity(const struct iov_iter *i)
-{
-	struct pipe_inode_info *pipe = i->pipe;
-	unsigned int p_head = pipe->head;
-	unsigned int p_tail = pipe->tail;
-	unsigned int p_occupancy = pipe_occupancy(p_head, p_tail);
-	unsigned int i_head = i->head;
-	unsigned int idx;
-
-	if (i->last_offset) {
-		struct pipe_buffer *p;
-		if (unlikely(p_occupancy == 0))
-			goto Bad;	// pipe must be non-empty
-		if (unlikely(i_head != p_head - 1))
-			goto Bad;	// must be at the last buffer...
-
-		p = pipe_buf(pipe, i_head);
-		if (unlikely(p->offset + p->len != abs(i->last_offset)))
-			goto Bad;	// ... at the end of segment
-	} else {
-		if (i_head != p_head)
-			goto Bad;	// must be right after the last buffer
-	}
-	return true;
-Bad:
-	printk(KERN_ERR "idx = %d, offset = %d\n", i_head, i->last_offset);
-	printk(KERN_ERR "head = %d, tail = %d, buffers = %d\n",
-			p_head, p_tail, pipe->ring_size);
-	for (idx = 0; idx < pipe->ring_size; idx++)
-		printk(KERN_ERR "[%p %p %d %d]\n",
-			pipe->bufs[idx].ops,
-			pipe->bufs[idx].page,
-			pipe->bufs[idx].offset,
-			pipe->bufs[idx].len);
-	WARN_ON(1);
-	return false;
-}
-#else
-#define sanity(i) true
-#endif
-
-static struct page *push_anon(struct pipe_inode_info *pipe, unsigned size)
-{
-	struct page *page = alloc_page(GFP_USER);
-	if (page) {
-		struct pipe_buffer *buf = pipe_buf(pipe, pipe->head++);
-		*buf = (struct pipe_buffer) {
-			.ops = &default_pipe_buf_ops,
-			.page = page,
-			.offset = 0,
-			.len = size
-		};
-	}
-	return page;
-}
-
-static void push_page(struct pipe_inode_info *pipe, struct page *page,
-			unsigned int offset, unsigned int size)
-{
-	struct pipe_buffer *buf = pipe_buf(pipe, pipe->head++);
-	*buf = (struct pipe_buffer) {
-		.ops = &page_cache_pipe_buf_ops,
-		.page = page,
-		.offset = offset,
-		.len = size
-	};
-	get_page(page);
-}
-
-static inline int last_offset(const struct pipe_buffer *buf)
-{
-	if (buf->ops == &default_pipe_buf_ops)
-		return buf->len;	// buf->offset is 0 for those
-	else
-		return -(buf->offset + buf->len);
-}
-
-static struct page *append_pipe(struct iov_iter *i, size_t size,
-				unsigned int *off)
-{
-	struct pipe_inode_info *pipe = i->pipe;
-	int offset = i->last_offset;
-	struct pipe_buffer *buf;
-	struct page *page;
-
-	if (offset > 0 && offset < PAGE_SIZE) {
-		// some space in the last buffer; add to it
-		buf = pipe_buf(pipe, pipe->head - 1);
-		size = min_t(size_t, size, PAGE_SIZE - offset);
-		buf->len += size;
-		i->last_offset += size;
-		i->count -= size;
-		*off = offset;
-		return buf->page;
-	}
-	// OK, we need a new buffer
-	*off = 0;
-	size = min_t(size_t, size, PAGE_SIZE);
-	if (pipe_full(pipe->head, pipe->tail, pipe->max_usage))
-		return NULL;
-	page = push_anon(pipe, size);
-	if (!page)
-		return NULL;
-	i->head = pipe->head - 1;
-	i->last_offset = size;
-	i->count -= size;
-	return page;
-}
-
-static size_t copy_page_to_iter_pipe(struct page *page, size_t offset, size_t bytes,
-			 struct iov_iter *i)
-{
-	struct pipe_inode_info *pipe = i->pipe;
-	unsigned int head = pipe->head;
-
-	if (unlikely(bytes > i->count))
-		bytes = i->count;
-
-	if (unlikely(!bytes))
-		return 0;
-
-	if (!sanity(i))
-		return 0;
-
-	if (offset && i->last_offset == -offset) { // could we merge it?
-		struct pipe_buffer *buf = pipe_buf(pipe, head - 1);
-		if (buf->page == page) {
-			buf->len += bytes;
-			i->last_offset -= bytes;
-			i->count -= bytes;
-			return bytes;
-		}
-	}
-	if (pipe_full(pipe->head, pipe->tail, pipe->max_usage))
-		return 0;
-
-	push_page(pipe, page, offset, bytes);
-	i->last_offset = -(offset + bytes);
-	i->head = head;
-	i->count -= bytes;
-	return bytes;
-}
-
 /*
  * fault_in_iov_iter_readable - fault in iov iterator for reading
  * @i: iterator
@@ -446,46 +300,6 @@ void iov_iter_init(struct iov_iter *i, unsigned int direction,
 }
 EXPORT_SYMBOL(iov_iter_init);
 
-// returns the offset in partial buffer (if any)
-static inline unsigned int pipe_npages(const struct iov_iter *i, int *npages)
-{
-	struct pipe_inode_info *pipe = i->pipe;
-	int used = pipe->head - pipe->tail;
-	int off = i->last_offset;
-
-	*npages = max((int)pipe->max_usage - used, 0);
-
-	if (off > 0 && off < PAGE_SIZE) { // anon and not full
-		(*npages)++;
-		return off;
-	}
-	return 0;
-}
-
-static size_t copy_pipe_to_iter(const void *addr, size_t bytes,
-				struct iov_iter *i)
-{
-	unsigned int off, chunk;
-
-	if (unlikely(bytes > i->count))
-		bytes = i->count;
-	if (unlikely(!bytes))
-		return 0;
-
-	if (!sanity(i))
-		return 0;
-
-	for (size_t n = bytes; n; n -= chunk) {
-		struct page *page = append_pipe(i, n, &off);
-		chunk = min_t(size_t, n, PAGE_SIZE - off);
-		if (!page)
-			return bytes - n;
-		memcpy_to_page(page, off, addr, chunk);
-		addr += chunk;
-	}
-	return bytes;
-}
-
 static __wsum csum_and_memcpy(void *to, const void *from, size_t len,
 			      __wsum sum, size_t off)
 {
@@ -493,44 +307,10 @@ static __wsum csum_and_memcpy(void *to, const void *from, size_t len,
 	return csum_block_add(sum, next, off);
 }
 
-static size_t csum_and_copy_to_pipe_iter(const void *addr, size_t bytes,
-					 struct iov_iter *i, __wsum *sump)
-{
-	__wsum sum = *sump;
-	size_t off = 0;
-	unsigned int chunk, r;
-
-	if (unlikely(bytes > i->count))
-		bytes = i->count;
-	if (unlikely(!bytes))
-		return 0;
-
-	if (!sanity(i))
-		return 0;
-
-	while (bytes) {
-		struct page *page = append_pipe(i, bytes, &r);
-		char *p;
-
-		if (!page)
-			break;
-		chunk = min_t(size_t, bytes, PAGE_SIZE - r);
-		p = kmap_local_page(page);
-		sum = csum_and_memcpy(p + r, addr + off, chunk, sum, off);
-		kunmap_local(p);
-		off += chunk;
-		bytes -= chunk;
-	}
-	*sump = sum;
-	return off;
-}
-
 size_t _copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
 {
 	if (WARN_ON_ONCE(i->data_source))
 		return 0;
-	if (unlikely(iov_iter_is_pipe(i)))
-		return copy_pipe_to_iter(addr, bytes, i);
 	if (user_backed_iter(i))
 		might_fault();
 	iterate_and_advance(i, bytes, base, len, off,
@@ -552,42 +332,6 @@ static int copyout_mc(void __user *to, const void *from, size_t n)
 	return n;
 }
 
-static size_t copy_mc_pipe_to_iter(const void *addr, size_t bytes,
-				struct iov_iter *i)
-{
-	size_t xfer = 0;
-	unsigned int off, chunk;
-
-	if (unlikely(bytes > i->count))
-		bytes = i->count;
-	if (unlikely(!bytes))
-		return 0;
-
-	if (!sanity(i))
-		return 0;
-
-	while (bytes) {
-		struct page *page = append_pipe(i, bytes, &off);
-		unsigned long rem;
-		char *p;
-
-		if (!page)
-			break;
-		chunk = min_t(size_t, bytes, PAGE_SIZE - off);
-		p = kmap_local_page(page);
-		rem = copy_mc_to_kernel(p + off, addr + xfer, chunk);
-		chunk -= rem;
-		kunmap_local(p);
-		xfer += chunk;
-		bytes -= chunk;
-		if (rem) {
-			iov_iter_revert(i, rem);
-			break;
-		}
-	}
-	return xfer;
-}
-
 /**
  * _copy_mc_to_iter - copy to iter with source memory error exception handling
  * @addr: source kernel address
@@ -607,9 +351,8 @@ static size_t copy_mc_pipe_to_iter(const void *addr, size_t bytes,
  *   alignment and poison alignment assumptions to avoid re-triggering
  *   hardware exceptions.
  *
- * * ITER_KVEC, ITER_PIPE, and ITER_BVEC can return short copies.
- *   Compare to copy_to_iter() where only ITER_IOVEC attempts might return
- *   a short copy.
+ * * ITER_KVEC and ITER_BVEC can return short copies.  Compare to
+ *   copy_to_iter() where only ITER_IOVEC attempts might return a short copy.
  *
  * Return: number of bytes copied (may be %0)
  */
@@ -617,8 +360,6 @@ size_t _copy_mc_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
 {
 	if (WARN_ON_ONCE(i->data_source))
 		return 0;
-	if (unlikely(iov_iter_is_pipe(i)))
-		return copy_mc_pipe_to_iter(addr, bytes, i);
 	if (user_backed_iter(i))
 		might_fault();
 	__iterate_and_advance(i, bytes, base, len, off,
@@ -732,8 +473,6 @@ size_t copy_page_to_iter(struct page *page, size_t offset, size_t bytes,
 		return 0;
 	if (WARN_ON_ONCE(i->data_source))
 		return 0;
-	if (unlikely(iov_iter_is_pipe(i)))
-		return copy_page_to_iter_pipe(page, offset, bytes, i);
 	page += offset / PAGE_SIZE; // first subpage
 	offset %= PAGE_SIZE;
 	while (1) {
@@ -764,8 +503,6 @@ size_t copy_page_to_iter_nofault(struct page *page, unsigned offset, size_t byte
 		return 0;
 	if (WARN_ON_ONCE(i->data_source))
 		return 0;
-	if (unlikely(iov_iter_is_pipe(i)))
-		return copy_page_to_iter_pipe(page, offset, bytes, i);
 	page += offset / PAGE_SIZE; // first subpage
 	offset %= PAGE_SIZE;
 	while (1) {
@@ -818,36 +555,8 @@ size_t copy_page_from_iter(struct page *page, size_t offset, size_t bytes,
 }
 EXPORT_SYMBOL(copy_page_from_iter);
 
-static size_t pipe_zero(size_t bytes, struct iov_iter *i)
-{
-	unsigned int chunk, off;
-
-	if (unlikely(bytes > i->count))
-		bytes = i->count;
-	if (unlikely(!bytes))
-		return 0;
-
-	if (!sanity(i))
-		return 0;
-
-	for (size_t n = bytes; n; n -= chunk) {
-		struct page *page = append_pipe(i, n, &off);
-		char *p;
-
-		if (!page)
-			return bytes - n;
-		chunk = min_t(size_t, n, PAGE_SIZE - off);
-		p = kmap_local_page(page);
-		memset(p + off, 0, chunk);
-		kunmap_local(p);
-	}
-	return bytes;
-}
-
 size_t iov_iter_zero(size_t bytes, struct iov_iter *i)
 {
-	if (unlikely(iov_iter_is_pipe(i)))
-		return pipe_zero(bytes, i);
 	iterate_and_advance(i, bytes, base, len, count,
 		clear_user(base, len),
 		memset(base, 0, len)
@@ -878,32 +587,6 @@ size_t copy_page_from_iter_atomic(struct page *page, unsigned offset, size_t byt
 }
 EXPORT_SYMBOL(copy_page_from_iter_atomic);
 
-static void pipe_advance(struct iov_iter *i, size_t size)
-{
-	struct pipe_inode_info *pipe = i->pipe;
-	int off = i->last_offset;
-
-	if (!off && !size) {
-		pipe_discard_from(pipe, i->start_head); // discard everything
-		return;
-	}
-	i->count -= size;
-	while (1) {
-		struct pipe_buffer *buf = pipe_buf(pipe, i->head);
-		if (off) /* make it relative to the beginning of buffer */
-			size += abs(off) - buf->offset;
-		if (size <= buf->len) {
-			buf->len = size;
-			i->last_offset = last_offset(buf);
-			break;
-		}
-		size -= buf->len;
-		i->head++;
-		off = 0;
-	}
-	pipe_discard_from(pipe, i->head + 1); // discard everything past this one
-}
-
 static void iov_iter_bvec_advance(struct iov_iter *i, size_t size)
 {
 	const struct bio_vec *bvec, *end;
@@ -955,8 +638,6 @@ void iov_iter_advance(struct iov_iter *i, size_t size)
 		iov_iter_iovec_advance(i, size);
 	} else if (iov_iter_is_bvec(i)) {
 		iov_iter_bvec_advance(i, size);
-	} else if (iov_iter_is_pipe(i)) {
-		pipe_advance(i, size);
 	} else if (iov_iter_is_discard(i)) {
 		i->count -= size;
 	}
@@ -970,26 +651,6 @@ void iov_iter_revert(struct iov_iter *i, size_t unroll)
 	if (WARN_ON(unroll > MAX_RW_COUNT))
 		return;
 	i->count += unroll;
-	if (unlikely(iov_iter_is_pipe(i))) {
-		struct pipe_inode_info *pipe = i->pipe;
-		unsigned int head = pipe->head;
-
-		while (head > i->start_head) {
-			struct pipe_buffer *b = pipe_buf(pipe, --head);
-			if (unroll < b->len) {
-				b->len -= unroll;
-				i->last_offset = last_offset(b);
-				i->head = head;
-				return;
-			}
-			unroll -= b->len;
-			pipe_buf_release(pipe, b);
-			pipe->head--;
-		}
-		i->last_offset = 0;
-		i->head = head;
-		return;
-	}
 	if (unlikely(iov_iter_is_discard(i)))
 		return;
 	if (unroll <= i->iov_offset) {
@@ -1079,24 +740,6 @@ void iov_iter_bvec(struct iov_iter *i, unsigned int direction,
 }
 EXPORT_SYMBOL(iov_iter_bvec);
 
-void iov_iter_pipe(struct iov_iter *i, unsigned int direction,
-			struct pipe_inode_info *pipe,
-			size_t count)
-{
-	BUG_ON(direction != READ);
-	WARN_ON(pipe_full(pipe->head, pipe->tail, pipe->ring_size));
-	*i = (struct iov_iter){
-		.iter_type = ITER_PIPE,
-		.data_source = false,
-		.pipe = pipe,
-		.head = pipe->head,
-		.start_head = pipe->head,
-		.last_offset = 0,
-		.count = count
-	};
-}
-EXPORT_SYMBOL(iov_iter_pipe);
-
 /**
  * iov_iter_xarray - Initialise an I/O iterator to use the pages in an xarray
  * @i: The iterator to initialise.
@@ -1224,19 +867,6 @@ bool iov_iter_is_aligned(const struct iov_iter *i, unsigned addr_mask,
 	if (iov_iter_is_bvec(i))
 		return iov_iter_aligned_bvec(i, addr_mask, len_mask);
 
-	if (iov_iter_is_pipe(i)) {
-		size_t size = i->count;
-
-		if (size & len_mask)
-			return false;
-		if (size && i->last_offset > 0) {
-			if (i->last_offset & addr_mask)
-				return false;
-		}
-
-		return true;
-	}
-
 	if (iov_iter_is_xarray(i)) {
 		if (i->count & len_mask)
 			return false;
@@ -1307,14 +937,6 @@ unsigned long iov_iter_alignment(const struct iov_iter *i)
 	if (iov_iter_is_bvec(i))
 		return iov_iter_alignment_bvec(i);
 
-	if (iov_iter_is_pipe(i)) {
-		size_t size = i->count;
-
-		if (size && i->last_offset > 0)
-			return size | i->last_offset;
-		return size;
-	}
-
 	if (iov_iter_is_xarray(i))
 		return (i->xarray_start + i->iov_offset) | i->count;
 
@@ -1367,36 +989,6 @@ static int want_pages_array(struct page ***res, size_t size,
 	return count;
 }
 
-static ssize_t pipe_get_pages(struct iov_iter *i,
-		   struct page ***pages, size_t maxsize, unsigned maxpages,
-		   size_t *start)
-{
-	unsigned int npages, count, off, chunk;
-	struct page **p;
-	size_t left;
-
-	if (!sanity(i))
-		return -EFAULT;
-
-	*start = off = pipe_npages(i, &npages);
-	if (!npages)
-		return -EFAULT;
-	count = want_pages_array(pages, maxsize, off, min(npages, maxpages));
-	if (!count)
-		return -ENOMEM;
-	p = *pages;
-	for (npages = 0, left = maxsize ; npages < count; npages++, left -= chunk) {
-		struct page *page = append_pipe(i, left, &off);
-		if (!page)
-			break;
-		chunk = min_t(size_t, left, PAGE_SIZE - off);
-		get_page(*p++ = page);
-	}
-	if (!npages)
-		return -EFAULT;
-	return maxsize - left;
-}
-
 static ssize_t iter_xarray_populate_pages(struct page **pages, struct xarray *xa,
 					  pgoff_t index, unsigned int nr_pages)
 {
@@ -1547,8 +1139,6 @@ static ssize_t __iov_iter_get_pages_alloc(struct iov_iter *i,
 		}
 		return maxsize;
 	}
-	if (iov_iter_is_pipe(i))
-		return pipe_get_pages(i, pages, maxsize, maxpages, start);
 	if (iov_iter_is_xarray(i))
 		return iter_xarray_get_pages(i, pages, maxsize, maxpages, start);
 	return -EFAULT;
@@ -1638,9 +1228,7 @@ size_t csum_and_copy_to_iter(const void *addr, size_t bytes, void *_csstate,
 	}
 
 	sum = csum_shift(csstate->csum, csstate->off);
-	if (unlikely(iov_iter_is_pipe(i)))
-		bytes = csum_and_copy_to_pipe_iter(addr, bytes, i, &sum);
-	else iterate_and_advance(i, bytes, base, len, off, ({
+	iterate_and_advance(i, bytes, base, len, off, ({
 		next = csum_and_copy_to_user(addr + off, base, len);
 		sum = csum_block_add(sum, next, off);
 		next ? 0 : len;
@@ -1725,15 +1313,6 @@ int iov_iter_npages(const struct iov_iter *i, int maxpages)
 		return iov_npages(i, maxpages);
 	if (iov_iter_is_bvec(i))
 		return bvec_npages(i, maxpages);
-	if (iov_iter_is_pipe(i)) {
-		int npages;
-
-		if (!sanity(i))
-			return 0;
-
-		pipe_npages(i, &npages);
-		return min(npages, maxpages);
-	}
 	if (iov_iter_is_xarray(i)) {
 		unsigned offset = (i->xarray_start + i->iov_offset) % PAGE_SIZE;
 		int npages = DIV_ROUND_UP(offset + i->count, PAGE_SIZE);
@@ -1746,10 +1325,6 @@ EXPORT_SYMBOL(iov_iter_npages);
 const void *dup_iter(struct iov_iter *new, struct iov_iter *old, gfp_t flags)
 {
 	*new = *old;
-	if (unlikely(iov_iter_is_pipe(new))) {
-		WARN_ON(1);
-		return NULL;
-	}
 	if (iov_iter_is_bvec(new))
 		return new->bvec = kmemdup(new->bvec,
 				    new->nr_segs * sizeof(struct bio_vec),
diff --git a/mm/filemap.c b/mm/filemap.c
index 0fcb0b80c2e2..603b562d69b1 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2687,8 +2687,7 @@ ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *iter,
 		if (unlikely(iocb->ki_pos >= i_size_read(inode)))
 			break;
 
-		error = filemap_get_pages(iocb, iter->count, &fbatch,
-					  iov_iter_is_pipe(iter));
+		error = filemap_get_pages(iocb, iter->count, &fbatch, false);
 		if (error < 0)
 			break;
 


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* Re: [PATCH v21 01/30] splice: Fix filemap of a blockdev
  2023-05-20  0:00 ` [PATCH v21 01/30] splice: Fix filemap of a blockdev David Howells
@ 2023-05-20  4:08   ` Christoph Hellwig
  2023-05-20  9:14   ` Christian Brauner
  1 sibling, 0 replies; 72+ messages in thread
From: Christoph Hellwig @ 2023-05-20  4:08 UTC (permalink / raw)
  To: David Howells
  Cc: Jens Axboe, Al Viro, Christoph Hellwig, Matthew Wilcox, Jan Kara,
	Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig,
	Steve French, John Hubbard

Just noticed this now, but filemap of a blockdev sounds weird.

Maybe something like:

splice: use the correct inode in filemap_splice_read

might be better.


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v21 02/30] splice: Make filemap_splice_read() check s_maxbytes
  2023-05-20  0:00 ` [PATCH v21 02/30] splice: Make filemap_splice_read() check s_maxbytes David Howells
@ 2023-05-20  4:09   ` Christoph Hellwig
  2023-05-20  9:21   ` Christian Brauner
  1 sibling, 0 replies; 72+ messages in thread
From: Christoph Hellwig @ 2023-05-20  4:09 UTC (permalink / raw)
  To: David Howells
  Cc: Jens Axboe, Al Viro, Christoph Hellwig, Matthew Wilcox, Jan Kara,
	Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig,
	Steve French, John Hubbard

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v21 03/30] splice: Rename direct_splice_read() to copy_splice_read()
  2023-05-20  0:00 ` [PATCH v21 03/30] splice: Rename direct_splice_read() to copy_splice_read() David Howells
@ 2023-05-20  4:09   ` Christoph Hellwig
  2023-05-20  9:23   ` Christian Brauner
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 72+ messages in thread
From: Christoph Hellwig @ 2023-05-20  4:09 UTC (permalink / raw)
  To: David Howells
  Cc: Jens Axboe, Al Viro, Christoph Hellwig, Matthew Wilcox, Jan Kara,
	Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig,
	Steve French, linux-cifs

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v21 07/30] splice: Make splice from an O_DIRECT fd use copy_splice_read()
  2023-05-20  0:00 ` [PATCH v21 07/30] splice: Make splice from an O_DIRECT fd use copy_splice_read() David Howells
@ 2023-05-20  4:11   ` Christoph Hellwig
  2023-05-20  9:39   ` Christian Brauner
  1 sibling, 0 replies; 72+ messages in thread
From: Christoph Hellwig @ 2023-05-20  4:11 UTC (permalink / raw)
  To: David Howells
  Cc: Jens Axboe, Al Viro, Christoph Hellwig, Matthew Wilcox, Jan Kara,
	Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v21 08/30] splice: Make splice from a DAX file use copy_splice_read()
  2023-05-20  0:00 ` [PATCH v21 08/30] splice: Make splice from a DAX file " David Howells
@ 2023-05-20  4:11   ` Christoph Hellwig
  2023-05-20  9:41   ` Christian Brauner
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 72+ messages in thread
From: Christoph Hellwig @ 2023-05-20  4:11 UTC (permalink / raw)
  To: David Howells
  Cc: Jens Axboe, Al Viro, Christoph Hellwig, Matthew Wilcox, Jan Kara,
	Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig,
	linux-erofs, linux-ext4, linux-xfs

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v21 18/30] ext4: Provide a splice-read stub
  2023-05-20  0:00 ` [PATCH v21 18/30] ext4: " David Howells
@ 2023-05-20  4:12   ` Christoph Hellwig
  2023-05-20  7:21   ` David Howells
  1 sibling, 0 replies; 72+ messages in thread
From: Christoph Hellwig @ 2023-05-20  4:12 UTC (permalink / raw)
  To: David Howells
  Cc: Jens Axboe, Al Viro, Christoph Hellwig, Matthew Wilcox, Jan Kara,
	Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig,
	Theodore Ts'o, Andreas Dilger, linux-ext4

On Sat, May 20, 2023 at 01:00:37AM +0100, David Howells wrote:
> Provide a splice_read stub for Ext4.  This does the inode shutdown check
> before proceeding.  Splicing from DAX files and O_DIRECT fds is handled by
> the caller.

Not sure I'd call this a stub, but then again I'm not a native speaker.

The patch itself looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v21 24/30] xfs: Provide a splice-read stub
  2023-05-20  0:00 ` [PATCH v21 24/30] xfs: " David Howells
@ 2023-05-20  4:13   ` Christoph Hellwig
  0 siblings, 0 replies; 72+ messages in thread
From: Christoph Hellwig @ 2023-05-20  4:13 UTC (permalink / raw)
  To: David Howells
  Cc: Jens Axboe, Al Viro, Christoph Hellwig, Matthew Wilcox, Jan Kara,
	Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig,
	Darrick J . Wong, linux-xfs

Same comment about stub sounding a bit weird as for ext4 and various
patches not in my area, but the changes themselves looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v21 26/30] splice: Convert trace/seq to use copy_splice_read()
  2023-05-20  0:00 ` [PATCH v21 26/30] splice: Convert trace/seq to use copy_splice_read() David Howells
@ 2023-05-20  4:14   ` Christoph Hellwig
  2023-05-21 10:28   ` Masami Hiramatsu
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 72+ messages in thread
From: Christoph Hellwig @ 2023-05-20  4:14 UTC (permalink / raw)
  To: David Howells
  Cc: Jens Axboe, Al Viro, Christoph Hellwig, Matthew Wilcox, Jan Kara,
	Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig,
	Steven Rostedt, Masami Hiramatsu, linux-trace-kernel

s/splice/trace/ ?


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v21 28/30] splice: Use filemap_splice_read() instead of generic_file_splice_read()
  2023-05-20  0:00 ` [PATCH v21 28/30] splice: Use filemap_splice_read() instead of generic_file_splice_read() David Howells
@ 2023-05-20  4:14   ` Christoph Hellwig
  2023-05-20  9:56   ` Christian Brauner
  1 sibling, 0 replies; 72+ messages in thread
From: Christoph Hellwig @ 2023-05-20  4:14 UTC (permalink / raw)
  To: David Howells
  Cc: Jens Axboe, Al Viro, Christoph Hellwig, Matthew Wilcox, Jan Kara,
	Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig,
	John Hubbard

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v21 29/30] splice: Remove generic_file_splice_read()
  2023-05-20  0:00 ` [PATCH v21 29/30] splice: Remove generic_file_splice_read() David Howells
@ 2023-05-20  4:14   ` Christoph Hellwig
  2023-05-20  9:57   ` Christian Brauner
  1 sibling, 0 replies; 72+ messages in thread
From: Christoph Hellwig @ 2023-05-20  4:14 UTC (permalink / raw)
  To: David Howells
  Cc: Jens Axboe, Al Viro, Christoph Hellwig, Matthew Wilcox, Jan Kara,
	Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig,
	Steve French, John Hubbard, linux-cifs

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v21 30/30] iov_iter: Kill ITER_PIPE
  2023-05-20  0:00 ` [PATCH v21 30/30] iov_iter: Kill ITER_PIPE David Howells
@ 2023-05-20  4:15   ` Christoph Hellwig
  2023-05-20  9:58   ` Christian Brauner
  1 sibling, 0 replies; 72+ messages in thread
From: Christoph Hellwig @ 2023-05-20  4:15 UTC (permalink / raw)
  To: David Howells
  Cc: Jens Axboe, Al Viro, Christoph Hellwig, Matthew Wilcox, Jan Kara,
	Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig,
	John Hubbard

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v21 18/30] ext4: Provide a splice-read stub
  2023-05-20  0:00 ` [PATCH v21 18/30] ext4: " David Howells
  2023-05-20  4:12   ` Christoph Hellwig
@ 2023-05-20  7:21   ` David Howells
  2023-05-20  9:01     ` Christoph Hellwig
  1 sibling, 1 reply; 72+ messages in thread
From: David Howells @ 2023-05-20  7:21 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: dhowells, Jens Axboe, Al Viro, Matthew Wilcox, Jan Kara,
	Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig,
	Theodore Ts'o, Andreas Dilger, linux-ext4

Christoph Hellwig <hch@infradead.org> wrote:

> Not sure I'd call this a stub, but then again I'm not a native speaker.

"Wrapper"?

David


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v21 18/30] ext4: Provide a splice-read stub
  2023-05-20  7:21   ` David Howells
@ 2023-05-20  9:01     ` Christoph Hellwig
  2023-05-21  0:26       ` Theodore Ts'o
  0 siblings, 1 reply; 72+ messages in thread
From: Christoph Hellwig @ 2023-05-20  9:01 UTC (permalink / raw)
  To: David Howells
  Cc: Christoph Hellwig, Jens Axboe, Al Viro, Matthew Wilcox, Jan Kara,
	Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig,
	Theodore Ts'o, Andreas Dilger, linux-ext4

On Sat, May 20, 2023 at 08:21:44AM +0100, David Howells wrote:
> Christoph Hellwig <hch@infradead.org> wrote:
> 
> > Not sure I'd call this a stub, but then again I'm not a native speaker.
> 
> "Wrapper"?

That's what I'd call it, yes.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v21 01/30] splice: Fix filemap of a blockdev
  2023-05-20  0:00 ` [PATCH v21 01/30] splice: Fix filemap of a blockdev David Howells
  2023-05-20  4:08   ` Christoph Hellwig
@ 2023-05-20  9:14   ` Christian Brauner
  1 sibling, 0 replies; 72+ messages in thread
From: Christian Brauner @ 2023-05-20  9:14 UTC (permalink / raw)
  To: David Howells
  Cc: Jens Axboe, Al Viro, Christoph Hellwig, Matthew Wilcox, Jan Kara,
	Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Linus Torvalds, linux-fsdevel, linux-block,
	linux-kernel, linux-mm, Christoph Hellwig, Steve French,
	John Hubbard

On Sat, May 20, 2023 at 01:00:20AM +0100, David Howells wrote:
> Fix filemap_splice_read() to use file->f_mapping->host, not file->f_inode,
> as the source of the file size because in the case of a block device,
> file->f_inode points to the block-special file (which is typically 0
> length) and not the backing store.
> 
> Fixes: 07073eb01c5f ("splice: Add a func to do a splice from a buffered file without ITER_PIPE")
> Signed-off-by: David Howells <dhowells@redhat.com>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> cc: Steve French <stfrench@microsoft.com>
> cc: Jens Axboe <axboe@kernel.dk>
> cc: Al Viro <viro@zeniv.linux.org.uk>
> cc: David Hildenbrand <david@redhat.com>
> cc: John Hubbard <jhubbard@nvidia.com>
> cc: linux-mm@kvack.org
> cc: linux-block@vger.kernel.org
> cc: linux-fsdevel@vger.kernel.org
> ---

Reviewed-by: Christian Brauner <brauner@kernel.org>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v21 02/30] splice: Make filemap_splice_read() check s_maxbytes
  2023-05-20  0:00 ` [PATCH v21 02/30] splice: Make filemap_splice_read() check s_maxbytes David Howells
  2023-05-20  4:09   ` Christoph Hellwig
@ 2023-05-20  9:21   ` Christian Brauner
  1 sibling, 0 replies; 72+ messages in thread
From: Christian Brauner @ 2023-05-20  9:21 UTC (permalink / raw)
  To: David Howells
  Cc: Jens Axboe, Al Viro, Christoph Hellwig, Matthew Wilcox, Jan Kara,
	Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Linus Torvalds, linux-fsdevel, linux-block,
	linux-kernel, linux-mm, Christoph Hellwig, Steve French,
	John Hubbard

On Sat, May 20, 2023 at 01:00:21AM +0100, David Howells wrote:
> Make filemap_splice_read() check s_maxbytes analogously to filemap_read().
> 
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Christoph Hellwig <hch@lst.de>
> cc: Steve French <stfrench@microsoft.com>
> cc: Jens Axboe <axboe@kernel.dk>
> cc: Al Viro <viro@zeniv.linux.org.uk>
> cc: David Hildenbrand <david@redhat.com>
> cc: John Hubbard <jhubbard@nvidia.com>
> cc: linux-mm@kvack.org
> cc: linux-block@vger.kernel.org
> cc: linux-fsdevel@vger.kernel.org
> ---
>  mm/filemap.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/mm/filemap.c b/mm/filemap.c
> index a2006936a6ae..0fcb0b80c2e2 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -2887,6 +2887,9 @@ ssize_t filemap_splice_read(struct file *in, loff_t *ppos,
>  	bool writably_mapped;
>  	int i, error = 0;
>  
> +	if (unlikely(*ppos >= in->f_mapping->host->i_sb->s_maxbytes))

Pointer deref galore
Reviewed-by: Christian Brauner <brauner@kernel.org>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v21 03/30] splice: Rename direct_splice_read() to copy_splice_read()
  2023-05-20  0:00 ` [PATCH v21 03/30] splice: Rename direct_splice_read() to copy_splice_read() David Howells
  2023-05-20  4:09   ` Christoph Hellwig
@ 2023-05-20  9:23   ` Christian Brauner
  2023-05-20  9:51   ` David Howells
  2023-05-22  7:55   ` David Howells
  3 siblings, 0 replies; 72+ messages in thread
From: Christian Brauner @ 2023-05-20  9:23 UTC (permalink / raw)
  To: David Howells
  Cc: Jens Axboe, Al Viro, Christoph Hellwig, Matthew Wilcox, Jan Kara,
	Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Linus Torvalds, linux-fsdevel, linux-block,
	linux-kernel, linux-mm, Christoph Hellwig, Steve French,
	linux-cifs

On Sat, May 20, 2023 at 01:00:22AM +0100, David Howells wrote:
> Rename direct_splice_read() to copy_splice_read() to better reflect as to
> what it does.
> 
> Suggested-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Christoph Hellwig <hch@lst.de>
> cc: Steve French <sfrench@samba.org>
> cc: Jens Axboe <axboe@kernel.dk>
> cc: Al Viro <viro@zeniv.linux.org.uk>
> cc: linux-cifs@vger.kernel.org
> cc: linux-mm@kvack.org
> cc: linux-block@vger.kernel.org
> cc: linux-fsdevel@vger.kernel.org
> ---

For the future it'd be nice if exported functions would always get
proper kernel doc,
Reviewed-by: Christian Brauner <brauner@kernel.org>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v21 04/30] splice: Clean up copy_splice_read() a bit
  2023-05-20  0:00 ` [PATCH v21 04/30] splice: Clean up copy_splice_read() a bit David Howells
@ 2023-05-20  9:34   ` Christian Brauner
  0 siblings, 0 replies; 72+ messages in thread
From: Christian Brauner @ 2023-05-20  9:34 UTC (permalink / raw)
  To: David Howells
  Cc: Jens Axboe, Al Viro, Christoph Hellwig, Matthew Wilcox, Jan Kara,
	Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Linus Torvalds, linux-fsdevel, linux-block,
	linux-kernel, linux-mm, Christoph Hellwig, John Hubbard

On Sat, May 20, 2023 at 01:00:23AM +0100, David Howells wrote:
> Do a couple of cleanups to copy_splice_read():
> 
>  (1) Cast to struct page **, not void *.
> 
>  (2) Simplify the calculation of the number of pages to keep/reclaim in
>      copy_splice_read().
> 
> Suggested-by: Christoph Hellwig <hch@infradead.org>
> Signed-off-by: David Howells <dhowells@redhat.com>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> cc: Jens Axboe <axboe@kernel.dk>
> cc: Al Viro <viro@zeniv.linux.org.uk>
> cc: David Hildenbrand <david@redhat.com>
> cc: John Hubbard <jhubbard@nvidia.com>
> cc: linux-mm@kvack.org
> cc: linux-block@vger.kernel.org
> cc: linux-fsdevel@vger.kernel.org
> ---

Reviewed-by: Christian Brauner <brauner@kernel.org>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v21 05/30] splice: Make do_splice_to() generic and export it
  2023-05-20  0:00 ` [PATCH v21 05/30] splice: Make do_splice_to() generic and export it David Howells
@ 2023-05-20  9:35   ` Christian Brauner
  0 siblings, 0 replies; 72+ messages in thread
From: Christian Brauner @ 2023-05-20  9:35 UTC (permalink / raw)
  To: David Howells
  Cc: Jens Axboe, Al Viro, Christoph Hellwig, Matthew Wilcox, Jan Kara,
	Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Linus Torvalds, linux-fsdevel, linux-block,
	linux-kernel, linux-mm, Christoph Hellwig, Miklos Szeredi,
	John Hubbard, linux-unionfs

On Sat, May 20, 2023 at 01:00:24AM +0100, David Howells wrote:
> Rename do_splice_to() to vfs_splice_read() and export it so that it can be
> used as a helper when calling down to a lower layer filesystem as it
> performs all the necessary checks[1].
> 
> Signed-off-by: David Howells <dhowells@redhat.com>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> cc: Miklos Szeredi <miklos@szeredi.hu>
> cc: Jens Axboe <axboe@kernel.dk>
> cc: Al Viro <viro@zeniv.linux.org.uk>
> cc: John Hubbard <jhubbard@nvidia.com>
> cc: David Hildenbrand <david@redhat.com>
> cc: Matthew Wilcox <willy@infradead.org>
> cc: linux-unionfs@vger.kernel.org
> cc: linux-block@vger.kernel.org
> cc: linux-fsdevel@vger.kernel.org
> cc: linux-mm@kvack.org
> Link: https://lore.kernel.org/r/CAJfpeguGksS3sCigmRi9hJdUec8qtM9f+_9jC1rJhsXT+dV01w@mail.gmail.com/ [1]
> ---

Reviewed-by: Christian Brauner <brauner@kernel.org>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v21 06/30] splice: Check for zero count in vfs_splice_read()
  2023-05-20  0:00 ` [PATCH v21 06/30] splice: Check for zero count in vfs_splice_read() David Howells
@ 2023-05-20  9:38   ` Christian Brauner
  0 siblings, 0 replies; 72+ messages in thread
From: Christian Brauner @ 2023-05-20  9:38 UTC (permalink / raw)
  To: David Howells
  Cc: Jens Axboe, Al Viro, Christoph Hellwig, Matthew Wilcox, Jan Kara,
	Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Linus Torvalds, linux-fsdevel, linux-block,
	linux-kernel, linux-mm, Christoph Hellwig

On Sat, May 20, 2023 at 01:00:25AM +0100, David Howells wrote:
> Make vfs_splice_read() return immediately if the length is 0.
> 
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Christoph Hellwig <hch@lst.de>
> cc: Jens Axboe <axboe@kernel.dk>
> cc: Al Viro <viro@zeniv.linux.org.uk>
> cc: linux-block@vger.kernel.org
> cc: linux-fsdevel@vger.kernel.org
> cc: linux-mm@kvack.org
> ---

Reviewed-by: Christian Brauner <brauner@kernel.org>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v21 07/30] splice: Make splice from an O_DIRECT fd use copy_splice_read()
  2023-05-20  0:00 ` [PATCH v21 07/30] splice: Make splice from an O_DIRECT fd use copy_splice_read() David Howells
  2023-05-20  4:11   ` Christoph Hellwig
@ 2023-05-20  9:39   ` Christian Brauner
  1 sibling, 0 replies; 72+ messages in thread
From: Christian Brauner @ 2023-05-20  9:39 UTC (permalink / raw)
  To: David Howells
  Cc: Jens Axboe, Al Viro, Christoph Hellwig, Matthew Wilcox, Jan Kara,
	Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Linus Torvalds, linux-fsdevel, linux-block,
	linux-kernel, linux-mm, Christoph Hellwig

On Sat, May 20, 2023 at 01:00:26AM +0100, David Howells wrote:
> Make a read splice from a file descriptor that's open O_DIRECT use
> copy_splice_read() to do the reading as filemap_splice_read() is unlikely
> to find any pagecache to splice.
> 
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Christoph Hellwig <hch@lst.de>
> cc: Al Viro <viro@zeniv.linux.org.uk>
> cc: Jens Axboe <axboe@kernel.dk>
> cc: linux-fsdevel@vger.kernel.org
> cc: linux-block@vger.kernel.org
> cc: linux-mm@kvack.org
> ---

Reviewed-by: Christian Brauner <brauner@kernel.org>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v21 08/30] splice: Make splice from a DAX file use copy_splice_read()
  2023-05-20  0:00 ` [PATCH v21 08/30] splice: Make splice from a DAX file " David Howells
  2023-05-20  4:11   ` Christoph Hellwig
@ 2023-05-20  9:41   ` Christian Brauner
  2023-05-21  0:28   ` Theodore Ts'o
  2023-05-21 14:55   ` Gao Xiang
  3 siblings, 0 replies; 72+ messages in thread
From: Christian Brauner @ 2023-05-20  9:41 UTC (permalink / raw)
  To: David Howells
  Cc: Jens Axboe, Al Viro, Christoph Hellwig, Matthew Wilcox, Jan Kara,
	Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Linus Torvalds, linux-fsdevel, linux-block,
	linux-kernel, linux-mm, Christoph Hellwig, linux-erofs,
	linux-ext4, linux-xfs

On Sat, May 20, 2023 at 01:00:27AM +0100, David Howells wrote:
> Make a read splice from a DAX file go directly to copy_splice_read() to do
> the reading as filemap_splice_read() is unlikely to find any pagecache to
> splice.
> 
> I think this affects only erofs, Ext2, Ext4, fuse and XFS.
> 
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Christoph Hellwig <hch@lst.de>
> cc: Al Viro <viro@zeniv.linux.org.uk>
> cc: Jens Axboe <axboe@kernel.dk>
> cc: linux-erofs@lists.ozlabs.org
> cc: linux-ext4@vger.kernel.org
> cc: linux-xfs@vger.kernel.org
> cc: linux-fsdevel@vger.kernel.org
> cc: linux-block@vger.kernel.org
> cc: linux-mm@kvack.org
> ---

Fwiw, O_DIRECT and DAX could've just been folded into one patch imho.
Reviewed-by: Christian Brauner <brauner@kernel.org>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v21 10/30] overlayfs: Implement splice-read
  2023-05-20  0:00 ` [PATCH v21 10/30] overlayfs: " David Howells
@ 2023-05-20  9:47   ` Christian Brauner
  0 siblings, 0 replies; 72+ messages in thread
From: Christian Brauner @ 2023-05-20  9:47 UTC (permalink / raw)
  To: David Howells
  Cc: Jens Axboe, Al Viro, Christoph Hellwig, Matthew Wilcox, Jan Kara,
	Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Linus Torvalds, linux-fsdevel, linux-block,
	linux-kernel, linux-mm, Christoph Hellwig, John Hubbard,
	Miklos Szeredi, linux-unionfs, Amir Goldstein

[Cc Amir as well]

On Sat, May 20, 2023 at 01:00:29AM +0100, David Howells wrote:
> Implement splice-read for overlayfs by passing the request down a layer
> rather than going through generic_file_splice_read() which is going to be
> changed to assume that ->read_folio() is present on buffered files.
> 
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Christoph Hellwig <hch@lst.de>
> cc: Jens Axboe <axboe@kernel.dk>
> cc: Al Viro <viro@zeniv.linux.org.uk>
> cc: John Hubbard <jhubbard@nvidia.com>
> cc: David Hildenbrand <david@redhat.com>
> cc: Matthew Wilcox <willy@infradead.org>
> cc: Miklos Szeredi <miklos@szeredi.hu>
> cc: linux-unionfs@vger.kernel.org
> cc: linux-block@vger.kernel.org
> cc: linux-fsdevel@vger.kernel.org
> cc: linux-mm@kvack.org
> ---

Acked-by: Christian Brauner <brauner@kernel.org>

> 
> Notes:
>     ver #17)
>      - Use vfs_splice_read() helper rather than open-coding checks.
>     
>     ver #15)
>      - Remove redundant FMODE_CAN_ODIRECT check on real file.
>      - Do rw_verify_area() on the real file, not the overlay file.
>      - Fix a file leak.
> 
>  fs/overlayfs/file.c | 23 ++++++++++++++++++++++-
>  1 file changed, 22 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
> index 7c04f033aadd..86197882ff35 100644
> --- a/fs/overlayfs/file.c
> +++ b/fs/overlayfs/file.c
> @@ -419,6 +419,27 @@ static ssize_t ovl_write_iter(struct kiocb *iocb, struct iov_iter *iter)
>  	return ret;
>  }
>  
> +static ssize_t ovl_splice_read(struct file *in, loff_t *ppos,
> +			       struct pipe_inode_info *pipe, size_t len,
> +			       unsigned int flags)
> +{
> +	const struct cred *old_cred;
> +	struct fd real;
> +	ssize_t ret;
> +
> +	ret = ovl_real_fdget(in, &real);
> +	if (ret)
> +		return ret;
> +
> +	old_cred = ovl_override_creds(file_inode(in)->i_sb);
> +	ret = vfs_splice_read(real.file, ppos, pipe, len, flags);
> +	revert_creds(old_cred);
> +	ovl_file_accessed(in);
> +
> +	fdput(real);
> +	return ret;
> +}
> +
>  /*
>   * Calling iter_file_splice_write() directly from overlay's f_op may deadlock
>   * due to lock order inversion between pipe->mutex in iter_file_splice_write()
> @@ -695,7 +716,7 @@ const struct file_operations ovl_file_operations = {
>  	.fallocate	= ovl_fallocate,
>  	.fadvise	= ovl_fadvise,
>  	.flush		= ovl_flush,
> -	.splice_read    = generic_file_splice_read,
> +	.splice_read    = ovl_splice_read,
>  	.splice_write   = ovl_splice_write,
>  
>  	.copy_file_range	= ovl_copy_file_range,
> 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v21 03/30] splice: Rename direct_splice_read() to copy_splice_read()
  2023-05-20  0:00 ` [PATCH v21 03/30] splice: Rename direct_splice_read() to copy_splice_read() David Howells
  2023-05-20  4:09   ` Christoph Hellwig
  2023-05-20  9:23   ` Christian Brauner
@ 2023-05-20  9:51   ` David Howells
  2023-05-22  7:55   ` David Howells
  3 siblings, 0 replies; 72+ messages in thread
From: David Howells @ 2023-05-20  9:51 UTC (permalink / raw)
  To: Christian Brauner
  Cc: dhowells, Jens Axboe, Al Viro, Christoph Hellwig, Matthew Wilcox,
	Jan Kara, Jeff Layton, David Hildenbrand, Jason Gunthorpe,
	Logan Gunthorpe, Hillf Danton, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig,
	Steve French, linux-cifs

Christian Brauner <brauner@kernel.org> wrote:

> For the future it'd be nice if exported functions would always get
> proper kernel doc,

Good point.  It wasn't meant to remain exported originally.  I'll add a patch
to add that.

David


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v21 28/30] splice: Use filemap_splice_read() instead of generic_file_splice_read()
  2023-05-20  0:00 ` [PATCH v21 28/30] splice: Use filemap_splice_read() instead of generic_file_splice_read() David Howells
  2023-05-20  4:14   ` Christoph Hellwig
@ 2023-05-20  9:56   ` Christian Brauner
  1 sibling, 0 replies; 72+ messages in thread
From: Christian Brauner @ 2023-05-20  9:56 UTC (permalink / raw)
  To: David Howells
  Cc: Jens Axboe, Al Viro, Christoph Hellwig, Matthew Wilcox, Jan Kara,
	Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Linus Torvalds, linux-fsdevel, linux-block,
	linux-kernel, linux-mm, Christoph Hellwig, John Hubbard

On Sat, May 20, 2023 at 01:00:47AM +0100, David Howells wrote:
> Replace pointers to generic_file_splice_read() with calls to
> filemap_splice_read().
> 
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Christoph Hellwig <hch@lst.de>
> cc: Jens Axboe <axboe@kernel.dk>
> cc: Al Viro <viro@zeniv.linux.org.uk>
> cc: David Hildenbrand <david@redhat.com>
> cc: John Hubbard <jhubbard@nvidia.com>
> cc: linux-mm@kvack.org
> cc: linux-block@vger.kernel.org
> cc: linux-fsdevel@vger.kernel.org
> ---

Reviewed-by: Christian Brauner <brauner@kernel.org>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v21 29/30] splice: Remove generic_file_splice_read()
  2023-05-20  0:00 ` [PATCH v21 29/30] splice: Remove generic_file_splice_read() David Howells
  2023-05-20  4:14   ` Christoph Hellwig
@ 2023-05-20  9:57   ` Christian Brauner
  1 sibling, 0 replies; 72+ messages in thread
From: Christian Brauner @ 2023-05-20  9:57 UTC (permalink / raw)
  To: David Howells
  Cc: Jens Axboe, Al Viro, Christoph Hellwig, Matthew Wilcox, Jan Kara,
	Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Linus Torvalds, linux-fsdevel, linux-block,
	linux-kernel, linux-mm, Christoph Hellwig, Steve French,
	John Hubbard, linux-cifs

On Sat, May 20, 2023 at 01:00:48AM +0100, David Howells wrote:
> Remove generic_file_splice_read() as it has been replaced with calls to
> filemap_splice_read() and copy_splice_read().
> 
> With this, ITER_PIPE is no longer used.
> 
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Christoph Hellwig <hch@lst.de>
> cc: Jens Axboe <axboe@kernel.dk>
> cc: Steve French <smfrench@gmail.com>
> cc: Al Viro <viro@zeniv.linux.org.uk>
> cc: David Hildenbrand <david@redhat.com>
> cc: John Hubbard <jhubbard@nvidia.com>
> cc: linux-mm@kvack.org
> cc: linux-block@vger.kernel.org
> cc: linux-cifs@vger.kernel.org
> cc: linux-fsdevel@vger.kernel.org
> ---

Reviewed-by: Christian Brauner <brauner@kernel.org>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v21 30/30] iov_iter: Kill ITER_PIPE
  2023-05-20  0:00 ` [PATCH v21 30/30] iov_iter: Kill ITER_PIPE David Howells
  2023-05-20  4:15   ` Christoph Hellwig
@ 2023-05-20  9:58   ` Christian Brauner
  1 sibling, 0 replies; 72+ messages in thread
From: Christian Brauner @ 2023-05-20  9:58 UTC (permalink / raw)
  To: David Howells
  Cc: Jens Axboe, Al Viro, Christoph Hellwig, Matthew Wilcox, Jan Kara,
	Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Linus Torvalds, linux-fsdevel, linux-block,
	linux-kernel, linux-mm, Christoph Hellwig, John Hubbard

On Sat, May 20, 2023 at 01:00:49AM +0100, David Howells wrote:
> The ITER_PIPE-type iterator was only used by generic_file_splice_read() and
> that has been replaced and removed.  This leaves ITER_PIPE unused - so
> remove it too.
> 
> Signed-off-by: David Howells <dhowells@redhat.com>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> cc: Jens Axboe <axboe@kernel.dk>
> cc: Al Viro <viro@zeniv.linux.org.uk>
> cc: David Hildenbrand <david@redhat.com>
> cc: John Hubbard <jhubbard@nvidia.com>
> cc: linux-mm@kvack.org
> cc: linux-block@vger.kernel.org
> cc: linux-fsdevel@vger.kernel.org
> ---

Reviewed-by: Christian Brauner <brauner@kernel.org>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v21 18/30] ext4: Provide a splice-read stub
  2023-05-20  9:01     ` Christoph Hellwig
@ 2023-05-21  0:26       ` Theodore Ts'o
  0 siblings, 0 replies; 72+ messages in thread
From: Theodore Ts'o @ 2023-05-21  0:26 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: David Howells, Jens Axboe, Al Viro, Matthew Wilcox, Jan Kara,
	Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig,
	Andreas Dilger, linux-ext4

On Sat, May 20, 2023 at 02:01:47AM -0700, Christoph Hellwig wrote:
> On Sat, May 20, 2023 at 08:21:44AM +0100, David Howells wrote:
> > Christoph Hellwig <hch@infradead.org> wrote:
> > 
> > > Not sure I'd call this a stub, but then again I'm not a native speaker.
> > 
> > "Wrapper"?
> 
> That's what I'd call it, yes.

Agreed, "wrapper" is a better term.

Other than that,

Acked-by: Theodore Ts'o <tytso@mit.edu>



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v21 08/30] splice: Make splice from a DAX file use copy_splice_read()
  2023-05-20  0:00 ` [PATCH v21 08/30] splice: Make splice from a DAX file " David Howells
  2023-05-20  4:11   ` Christoph Hellwig
  2023-05-20  9:41   ` Christian Brauner
@ 2023-05-21  0:28   ` Theodore Ts'o
  2023-05-21 14:55   ` Gao Xiang
  3 siblings, 0 replies; 72+ messages in thread
From: Theodore Ts'o @ 2023-05-21  0:28 UTC (permalink / raw)
  To: David Howells
  Cc: Jens Axboe, Al Viro, Christoph Hellwig, Matthew Wilcox, Jan Kara,
	Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig,
	linux-erofs, linux-ext4, linux-xfs

On Sat, May 20, 2023 at 01:00:27AM +0100, David Howells wrote:
> Make a read splice from a DAX file go directly to copy_splice_read() to do
> the reading as filemap_splice_read() is unlikely to find any pagecache to
> splice.
> 
> I think this affects only erofs, Ext2, Ext4, fuse and XFS.
> 
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Christoph Hellwig <hch@lst.de>
> cc: Al Viro <viro@zeniv.linux.org.uk>
> cc: Jens Axboe <axboe@kernel.dk>
> cc: linux-erofs@lists.ozlabs.org
> cc: linux-ext4@vger.kernel.org

Reviewed-by: Theodore Ts'o <tytso@mit.edu>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v21 26/30] splice: Convert trace/seq to use copy_splice_read()
  2023-05-20  0:00 ` [PATCH v21 26/30] splice: Convert trace/seq to use copy_splice_read() David Howells
  2023-05-20  4:14   ` Christoph Hellwig
@ 2023-05-21 10:28   ` Masami Hiramatsu
  2023-05-21 12:50   ` David Howells
  2023-05-23 14:27   ` Steven Rostedt
  3 siblings, 0 replies; 72+ messages in thread
From: Masami Hiramatsu @ 2023-05-21 10:28 UTC (permalink / raw)
  To: David Howells
  Cc: Jens Axboe, Al Viro, Christoph Hellwig, Matthew Wilcox, Jan Kara,
	Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig,
	Steven Rostedt, Masami Hiramatsu, linux-trace-kernel

Hi David,

On Sat, 20 May 2023 01:00:45 +0100
David Howells <dhowells@redhat.com> wrote:

> For the splice from the trace seq buffer, just use copy_splice_read().

So this is because you will remove generic_file_splice_read() (since
it's buggy), right?

> 
> In the future, something better can probably be done by gifting pages from
> seq->buf into the pipe, but that would require changing seq->buf into a
> vmap over an array of pages.

So what we need is to introduce a vmap? We introduced splice support for
avoiding copy ringbuffer pages, but this drops it. Thus this will drop
performance of splice on ring buffer (trace file). If it is correct,
can you also add a note about that?

Thank you,

> 
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Christoph Hellwig <hch@lst.de>
> cc: Al Viro <viro@zeniv.linux.org.uk>
> cc: Jens Axboe <axboe@kernel.dk>
> cc: Steven Rostedt <rostedt@goodmis.org>
> cc: Masami Hiramatsu <mhiramat@kernel.org>
> cc: linux-kernel@vger.kernel.org
> cc: linux-trace-kernel@vger.kernel.org
> cc: linux-fsdevel@vger.kernel.org
> cc: linux-block@vger.kernel.org
> cc: linux-mm@kvack.org
> ---
>  kernel/trace/trace.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index ebc59781456a..c210d02fac97 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -5171,7 +5171,7 @@ static const struct file_operations tracing_fops = {
>  	.open		= tracing_open,
>  	.read		= seq_read,
>  	.read_iter	= seq_read_iter,
> -	.splice_read	= generic_file_splice_read,
> +	.splice_read	= copy_splice_read,
>  	.write		= tracing_write_stub,
>  	.llseek		= tracing_lseek,
>  	.release	= tracing_release,
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v21 26/30] splice: Convert trace/seq to use copy_splice_read()
  2023-05-20  0:00 ` [PATCH v21 26/30] splice: Convert trace/seq to use copy_splice_read() David Howells
  2023-05-20  4:14   ` Christoph Hellwig
  2023-05-21 10:28   ` Masami Hiramatsu
@ 2023-05-21 12:50   ` David Howells
  2023-05-23 14:27   ` Steven Rostedt
  3 siblings, 0 replies; 72+ messages in thread
From: David Howells @ 2023-05-21 12:50 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: dhowells, Jens Axboe, Al Viro, Christoph Hellwig, Matthew Wilcox,
	Jan Kara, Jeff Layton, David Hildenbrand, Jason Gunthorpe,
	Logan Gunthorpe, Hillf Danton, Christian Brauner, Linus Torvalds,
	linux-fsdevel, linux-block, linux-kernel, linux-mm,
	Christoph Hellwig, Steven Rostedt, linux-trace-kernel

Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote:

> David Howells <dhowells@redhat.com> wrote:
> 
> > For the splice from the trace seq buffer, just use copy_splice_read().
> 
> So this is because you will remove generic_file_splice_read() (since
> it's buggy), right?

An ITER_PIPE iterator has a problem if it gets reverted with other changes I
want to make.  The problem is that it may not be valid to control the lifetime
of the data in the buffer with get_page().  The pages may need a pin taking
(FOLL_PIN) or the lifetime might be controlled with kfree() or rmmod.

> > In the future, something better can probably be done by gifting pages from
> > seq->buf into the pipe, but that would require changing seq->buf into a
> > vmap over an array of pages.
> 
> ... We introduced splice support for avoiding copy ringbuffer pages, but
> this drops it. Thus this will drop performance of splice on ring buffer
> (trace file). If it is correct, can you also add a note about that?

Actually, no.  There is no special splice support for tracing_fops.  You
currently use generic_file_splice_read(), which wends its way down into
seq_read_iter.  However, the seqfile stuff uses kvmalloc() to allocate the
buffer, so you are not allowed to splice page refs from kmalloc'd or vmalloc'd
memory into a pipe, so it doesn't.  It calls copy_to_iter() which will cause
ITER_PIPE to allocate bufferage on an as-needed basis.

copy_splice_read() instead creates an ITER_BVEC and populates it up front
using the bulk allocator, so if you're splicing a lot of data, this ought to
be marginally faster.

> So what we need is to introduce a vmap?

We could implement seq_splice_read().  What we would need to do is to change
how the buffer is allocated: bulk allocate a bunch of arbitrary pages which we
then vmap().  When we need to splice, we read into the buffer, do a vunmap()
and then splice the pages holding the data we used into the pipe.

If we don't manage to splice all the data, we can continue splicing from the
pages we have left next time.  If a read() comes along to view partially
spliced data, we would need to copy from the individual pages.

When we use up all the data, we discard all the pages we might have spliced
from and shuffle down the other pages, call the bulk allocator to replenish
the buffer and then vmap() it again.

Any pages we've spliced from must be discarded and replaced and not rewritten.

If a read() comes without the buffer having been spliced from, it can do as it
does now.

David


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v21 08/30] splice: Make splice from a DAX file use copy_splice_read()
  2023-05-20  0:00 ` [PATCH v21 08/30] splice: Make splice from a DAX file " David Howells
                     ` (2 preceding siblings ...)
  2023-05-21  0:28   ` Theodore Ts'o
@ 2023-05-21 14:55   ` Gao Xiang
  3 siblings, 0 replies; 72+ messages in thread
From: Gao Xiang @ 2023-05-21 14:55 UTC (permalink / raw)
  To: David Howells, Jens Axboe, Al Viro, Christoph Hellwig
  Cc: linux-erofs, linux-block, Hillf Danton, Jan Kara, linux-xfs,
	David Hildenbrand, Linus Torvalds, Jeff Layton,
	Christian Brauner, Matthew Wilcox, linux-kernel, linux-mm,
	Jason Gunthorpe, linux-fsdevel, linux-ext4, Logan Gunthorpe,
	Christoph Hellwig



On 2023/5/20 17:00, David Howells wrote:
> Make a read splice from a DAX file go directly to copy_splice_read() to do
> the reading as filemap_splice_read() is unlikely to find any pagecache to
> splice.
> 
> I think this affects only erofs, Ext2, Ext4, fuse and XFS.
> 
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Christoph Hellwig <hch@lst.de>
> cc: Al Viro <viro@zeniv.linux.org.uk>
> cc: Jens Axboe <axboe@kernel.dk>
> cc: linux-erofs@lists.ozlabs.org
> cc: linux-ext4@vger.kernel.org
> cc: linux-xfs@vger.kernel.org
> cc: linux-fsdevel@vger.kernel.org
> cc: linux-block@vger.kernel.org
> cc: linux-mm@kvack.org

Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>

Thanks,
Gao Xiang


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v21 16/30] ceph: Provide a splice-read stub
  2023-05-20  0:00 ` [PATCH v21 16/30] ceph: " David Howells
@ 2023-05-22  2:12   ` Xiubo Li
  0 siblings, 0 replies; 72+ messages in thread
From: Xiubo Li @ 2023-05-22  2:12 UTC (permalink / raw)
  To: David Howells, Jens Axboe, Al Viro, Christoph Hellwig
  Cc: Matthew Wilcox, Jan Kara, Jeff Layton, David Hildenbrand,
	Jason Gunthorpe, Logan Gunthorpe, Hillf Danton,
	Christian Brauner, Linus Torvalds, linux-fsdevel, linux-block,
	linux-kernel, linux-mm, Christoph Hellwig, Ilya Dryomov,
	ceph-devel


On 5/20/23 08:00, David Howells wrote:
> Provide a splice_read stub for Ceph.  This does the inode shutdown check
> before proceeding and jumps to copy_splice_read() if the file has inline
> data or is a synchronous file.
>
> We try and get FILE_RD and either FILE_CACHE and/or FILE_LAZYIO caps and
> hold them across filemap_splice_read().  If we fail to get FILE_CACHE or
> FILE_LAZYIO capabilities, we use copy_splice_read() instead.
>
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Christoph Hellwig <hch@lst.de>
> cc: Al Viro <viro@zeniv.linux.org.uk>
> cc: Jens Axboe <axboe@kernel.dk>
> cc: Xiubo Li <xiubli@redhat.com>
> cc: Ilya Dryomov <idryomov@gmail.com>
> cc: Jeff Layton <jlayton@kernel.org>
> cc: ceph-devel@vger.kernel.org
> cc: linux-fsdevel@vger.kernel.org
> cc: linux-block@vger.kernel.org
> cc: linux-mm@kvack.org
> ---
>
> Notes:
>      ver #21)
>       - Need to drop the caps ref.
>       - O_DIRECT is handled by the caller.
>
>   fs/ceph/file.c | 65 +++++++++++++++++++++++++++++++++++++++++++++++++-
>   1 file changed, 64 insertions(+), 1 deletion(-)
>
> diff --git a/fs/ceph/file.c b/fs/ceph/file.c
> index f4d8bf7dec88..4285f6cb5d3b 100644
> --- a/fs/ceph/file.c
> +++ b/fs/ceph/file.c
> @@ -1745,6 +1745,69 @@ static ssize_t ceph_read_iter(struct kiocb *iocb, struct iov_iter *to)
>   	return ret;
>   }
>   
> +/*
> + * Wrap filemap_splice_read with checks for cap bits on the inode.
> + * Atomically grab references, so that those bits are not released
> + * back to the MDS mid-read.
> + */
> +static ssize_t ceph_splice_read(struct file *in, loff_t *ppos,
> +				struct pipe_inode_info *pipe,
> +				size_t len, unsigned int flags)
> +{
> +	struct ceph_file_info *fi = in->private_data;
> +	struct inode *inode = file_inode(in);
> +	struct ceph_inode_info *ci = ceph_inode(inode);
> +	ssize_t ret;
> +	int want = 0, got = 0;
> +	CEPH_DEFINE_RW_CONTEXT(rw_ctx, 0);
> +
> +	dout("splice_read %p %llx.%llx %llu~%zu trying to get caps on %p\n",
> +	     inode, ceph_vinop(inode), *ppos, len, inode);
> +
> +	if (ceph_inode_is_shutdown(inode))
> +		return -ESTALE;
> +
> +	if (ceph_has_inline_data(ci) ||
> +	    (fi->flags & CEPH_F_SYNC))
> +		return copy_splice_read(in, ppos, pipe, len, flags);
> +
> +	ceph_start_io_read(inode);
> +
> +	want = CEPH_CAP_FILE_CACHE;
> +	if (fi->fmode & CEPH_FILE_MODE_LAZY)
> +		want |= CEPH_CAP_FILE_LAZYIO;
> +
> +	ret = ceph_get_caps(in, CEPH_CAP_FILE_RD, want, -1, &got);
> +	if (ret < 0)
> +		goto out_end;
> +
> +	if ((got & (CEPH_CAP_FILE_CACHE | CEPH_CAP_FILE_LAZYIO)) == 0) {
> +		dout("splice_read/sync %p %llx.%llx %llu~%zu got cap refs on %s\n",
> +		     inode, ceph_vinop(inode), *ppos, len,
> +		     ceph_cap_string(got));
> +
> +		ceph_put_cap_refs(ci, got);
> +		ceph_end_io_read(inode);
> +		return copy_splice_read(in, ppos, pipe, len, flags);
> +	}
> +
> +	dout("splice_read %p %llx.%llx %llu~%zu got cap refs on %s\n",
> +	     inode, ceph_vinop(inode), *ppos, len, ceph_cap_string(got));
> +
> +	rw_ctx.caps = got;
> +	ceph_add_rw_context(fi, &rw_ctx);
> +	ret = filemap_splice_read(in, ppos, pipe, len, flags);
> +	ceph_del_rw_context(fi, &rw_ctx);
> +
> +	dout("splice_read %p %llx.%llx dropping cap refs on %s = %zd\n",
> +	     inode, ceph_vinop(inode), ceph_cap_string(got), ret);
> +
> +	ceph_put_cap_refs(ci, got);
> +out_end:
> +	ceph_end_io_read(inode);
> +	return ret;
> +}
> +
>   /*
>    * Take cap references to avoid releasing caps to MDS mid-write.
>    *
> @@ -2593,7 +2656,7 @@ const struct file_operations ceph_file_fops = {
>   	.lock = ceph_lock,
>   	.setlease = simple_nosetlease,
>   	.flock = ceph_flock,
> -	.splice_read = generic_file_splice_read,
> +	.splice_read = ceph_splice_read,
>   	.splice_write = iter_file_splice_write,
>   	.unlocked_ioctl = ceph_ioctl,
>   	.compat_ioctl = compat_ptr_ioctl,
>
LGTM.

Reviewed-by: Xiubo Li <xiubli@redhat.com>

Thanks

- Xiubo



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v21 22/30] ocfs2: Provide a splice-read stub
  2023-05-20  0:00 ` [PATCH v21 22/30] ocfs2: " David Howells
@ 2023-05-22  2:49   ` Joseph Qi
  2023-05-22  6:28   ` David Howells
  2023-05-22  6:49   ` David Howells
  2 siblings, 0 replies; 72+ messages in thread
From: Joseph Qi @ 2023-05-22  2:49 UTC (permalink / raw)
  To: David Howells, Jens Axboe, Al Viro, Christoph Hellwig
  Cc: Matthew Wilcox, Jan Kara, Jeff Layton, David Hildenbrand,
	Jason Gunthorpe, Logan Gunthorpe, Hillf Danton,
	Christian Brauner, Linus Torvalds, linux-fsdevel, linux-block,
	linux-kernel, linux-mm, Christoph Hellwig, Mark Fasheh,
	Joel Becker, ocfs2-devel



On 5/20/23 8:00 AM, David Howells wrote:
> Provide a splice_read stub for ocfs2.  This emits trace lines and does an
> atime lock/update before calling filemap_splice_read().  Splicing from
> direct I/O is handled by the caller.
> 
> A couple of new tracepoints are added for this purpose.
> 
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Christoph Hellwig <hch@lst.de>
> cc: Al Viro <viro@zeniv.linux.org.uk>
> cc: Jens Axboe <axboe@kernel.dk>
> cc: Mark Fasheh <mark@fasheh.com>
> cc: Joel Becker <jlbec@evilplan.org>
> cc: Joseph Qi <joseph.qi@linux.alibaba.com>
> cc: ocfs2-devel@oss.oracle.com
> cc: linux-fsdevel@vger.kernel.org
> cc: linux-block@vger.kernel.org
> cc: linux-mm@kvack.org
> ---
>  fs/ocfs2/file.c        | 39 ++++++++++++++++++++++++++++++++++++++-
>  fs/ocfs2/ocfs2_trace.h |  3 +++
>  2 files changed, 41 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
> index efb09de4343d..f7e00b5689d5 100644
> --- a/fs/ocfs2/file.c
> +++ b/fs/ocfs2/file.c
> @@ -2581,6 +2581,43 @@ static ssize_t ocfs2_file_read_iter(struct kiocb *iocb,
>  	return ret;
>  }
>  
> +static ssize_t ocfs2_file_splice_read(struct file *in, loff_t *ppos,
> +				      struct pipe_inode_info *pipe,
> +				      size_t len, unsigned int flags)
> +{
> +	struct inode *inode = file_inode(in);
> +	ssize_t ret = 0;
> +	int lock_level = 0;
> +
> +	trace_ocfs2_file_splice_read(inode, in, in->f_path.dentry,
> +				     (unsigned long long)OCFS2_I(inode)->ip_blkno,
> +				     in->f_path.dentry->d_name.len,
> +				     in->f_path.dentry->d_name.name,
> +				     0);

Better also trace flags here.

> +
> +	/*
> +	 * We're fine letting folks race truncates and extending writes with
> +	 * read across the cluster, just like they can locally.  Hence no
> +	 * rw_lock during read.
> +	 *
> +	 * Take and drop the meta data lock to update inode fields like i_size.
> +	 * This allows the checks down below generic_file_splice_read() a

Now it calls filemap_splice_read().

> +	 * chance of actually working.
> +	 */
> +	ret = ocfs2_inode_lock_atime(inode, in->f_path.mnt, &lock_level, true);

Since prototype is 'int wait', so directly passing '1' seems more appropriate.

> +	if (ret < 0) {
> +		if (ret != -EAGAIN)
> +			mlog_errno(ret);
> +		goto bail;
> +	}
> +	ocfs2_inode_unlock(inode, lock_level);
> +

Don't see direct IO logic now. Am I missing something?

Thanks,
Joseph

> +	ret = filemap_splice_read(in, ppos, pipe, len, flags);
> +	trace_filemap_splice_read_ret(ret);
> +bail:
> +	return ret;
> +}
> +
>  /* Refer generic_file_llseek_unlocked() */
>  static loff_t ocfs2_file_llseek(struct file *file, loff_t offset, int whence)
>  {
> @@ -2744,7 +2781,7 @@ const struct file_operations ocfs2_fops = {
>  #endif
>  	.lock		= ocfs2_lock,
>  	.flock		= ocfs2_flock,
> -	.splice_read	= generic_file_splice_read,
> +	.splice_read	= ocfs2_file_splice_read,
>  	.splice_write	= iter_file_splice_write,
>  	.fallocate	= ocfs2_fallocate,
>  	.remap_file_range = ocfs2_remap_file_range,
> diff --git a/fs/ocfs2/ocfs2_trace.h b/fs/ocfs2/ocfs2_trace.h
> index dc4bce1649c1..b8c3d1702076 100644
> --- a/fs/ocfs2/ocfs2_trace.h
> +++ b/fs/ocfs2/ocfs2_trace.h
> @@ -1319,6 +1319,8 @@ DEFINE_OCFS2_FILE_OPS(ocfs2_file_splice_write);
>  
>  DEFINE_OCFS2_FILE_OPS(ocfs2_file_read_iter);
>  
> +DEFINE_OCFS2_FILE_OPS(ocfs2_file_splice_read);
> +
>  DEFINE_OCFS2_ULL_ULL_ULL_EVENT(ocfs2_truncate_file);
>  
>  DEFINE_OCFS2_ULL_ULL_EVENT(ocfs2_truncate_file_error);
> @@ -1470,6 +1472,7 @@ TRACE_EVENT(ocfs2_prepare_inode_for_write,
>  );
>  
>  DEFINE_OCFS2_INT_EVENT(generic_file_read_iter_ret);
> +DEFINE_OCFS2_INT_EVENT(filemap_splice_read_ret);
>  
>  /* End of trace events for fs/ocfs2/file.c. */
>  

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v21 22/30] ocfs2: Provide a splice-read stub
  2023-05-20  0:00 ` [PATCH v21 22/30] ocfs2: " David Howells
  2023-05-22  2:49   ` Joseph Qi
@ 2023-05-22  6:28   ` David Howells
  2023-05-22  6:34     ` Joseph Qi
  2023-05-22  6:49   ` David Howells
  2 siblings, 1 reply; 72+ messages in thread
From: David Howells @ 2023-05-22  6:28 UTC (permalink / raw)
  To: Joseph Qi
  Cc: dhowells, Jens Axboe, Al Viro, Christoph Hellwig, Matthew Wilcox,
	Jan Kara, Jeff Layton, David Hildenbrand, Jason Gunthorpe,
	Logan Gunthorpe, Hillf Danton, Christian Brauner, Linus Torvalds,
	linux-fsdevel, linux-block, linux-kernel, linux-mm,
	Christoph Hellwig, Mark Fasheh, Joel Becker, ocfs2-devel

Joseph Qi <joseph.qi@linux.alibaba.com> wrote:

> Don't see direct IO logic now. Am I missing something?

See that patch description ;-)

    Provide a splice_read stub for ocfs2.  This emits trace lines and does an
    atime lock/update before calling filemap_splice_read().  Splicing from
    direct I/O is handled by the caller.

David


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v21 22/30] ocfs2: Provide a splice-read stub
  2023-05-22  6:28   ` David Howells
@ 2023-05-22  6:34     ` Joseph Qi
  0 siblings, 0 replies; 72+ messages in thread
From: Joseph Qi @ 2023-05-22  6:34 UTC (permalink / raw)
  To: David Howells
  Cc: Jens Axboe, Al Viro, Christoph Hellwig, Matthew Wilcox, Jan Kara,
	Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig,
	Mark Fasheh, Joel Becker, ocfs2-devel



On 5/22/23 2:28 PM, David Howells wrote:
> Joseph Qi <joseph.qi@linux.alibaba.com> wrote:
> 
>> Don't see direct IO logic now. Am I missing something?
> 
> See that patch description ;-)
> 
>     Provide a splice_read stub for ocfs2.  This emits trace lines and does an
>     atime lock/update before calling filemap_splice_read().  Splicing from
>     direct I/O is handled by the caller.
> 

Oops, missed the patch 7 of the series since I've only received this
one:(
Have checked it on maillist, it's fine for me now.

Thanks,
Joseph

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v21 22/30] ocfs2: Provide a splice-read stub
  2023-05-20  0:00 ` [PATCH v21 22/30] ocfs2: " David Howells
  2023-05-22  2:49   ` Joseph Qi
  2023-05-22  6:28   ` David Howells
@ 2023-05-22  6:49   ` David Howells
  2023-05-22  6:54     ` Joseph Qi
  2 siblings, 1 reply; 72+ messages in thread
From: David Howells @ 2023-05-22  6:49 UTC (permalink / raw)
  To: Joseph Qi
  Cc: dhowells, Jens Axboe, Al Viro, Christoph Hellwig, Matthew Wilcox,
	Jan Kara, Jeff Layton, David Hildenbrand, Jason Gunthorpe,
	Logan Gunthorpe, Hillf Danton, Christian Brauner, Linus Torvalds,
	linux-fsdevel, linux-block, linux-kernel, linux-mm,
	Christoph Hellwig, Mark Fasheh, Joel Becker, ocfs2-devel

So something like the attached changes?  Any suggestions as to how to improve
the comments?

David
---
diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
index f7e00b5689d5..86add13b5f23 100644
--- a/fs/ocfs2/file.c
+++ b/fs/ocfs2/file.c
@@ -2552,7 +2552,7 @@ static ssize_t ocfs2_file_read_iter(struct kiocb *iocb,
 	 *
 	 * Take and drop the meta data lock to update inode fields
 	 * like i_size. This allows the checks down below
-	 * generic_file_read_iter() a chance of actually working.
+	 * copy_splice_read() a chance of actually working.
 	 */
 	ret = ocfs2_inode_lock_atime(inode, filp->f_path.mnt, &lock_level,
 				     !nowait);
@@ -2593,7 +2593,7 @@ static ssize_t ocfs2_file_splice_read(struct file *in, loff_t *ppos,
 				     (unsigned long long)OCFS2_I(inode)->ip_blkno,
 				     in->f_path.dentry->d_name.len,
 				     in->f_path.dentry->d_name.name,
-				     0);
+				     flags);
 
 	/*
 	 * We're fine letting folks race truncates and extending writes with
@@ -2601,10 +2601,10 @@ static ssize_t ocfs2_file_splice_read(struct file *in, loff_t *ppos,
 	 * rw_lock during read.
 	 *
 	 * Take and drop the meta data lock to update inode fields like i_size.
-	 * This allows the checks down below generic_file_splice_read() a
-	 * chance of actually working.
+	 * This allows the checks down below filemap_splice_read() a chance of
+	 * actually working.
 	 */
-	ret = ocfs2_inode_lock_atime(inode, in->f_path.mnt, &lock_level, true);
+	ret = ocfs2_inode_lock_atime(inode, in->f_path.mnt, &lock_level, 1);
 	if (ret < 0) {
 		if (ret != -EAGAIN)
 			mlog_errno(ret);


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* Re: [PATCH v21 22/30] ocfs2: Provide a splice-read stub
  2023-05-22  6:49   ` David Howells
@ 2023-05-22  6:54     ` Joseph Qi
  0 siblings, 0 replies; 72+ messages in thread
From: Joseph Qi @ 2023-05-22  6:54 UTC (permalink / raw)
  To: David Howells
  Cc: Jens Axboe, Al Viro, Christoph Hellwig, Matthew Wilcox, Jan Kara,
	Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig,
	Mark Fasheh, Joel Becker, ocfs2-devel



On 5/22/23 2:49 PM, David Howells wrote:
> So something like the attached changes?  Any suggestions as to how to improve
> the comments?
> 

Looks fine to me now. Thanks.

Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>

> David
> ---
> diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
> index f7e00b5689d5..86add13b5f23 100644
> --- a/fs/ocfs2/file.c
> +++ b/fs/ocfs2/file.c
> @@ -2552,7 +2552,7 @@ static ssize_t ocfs2_file_read_iter(struct kiocb *iocb,
>  	 *
>  	 * Take and drop the meta data lock to update inode fields
>  	 * like i_size. This allows the checks down below
> -	 * generic_file_read_iter() a chance of actually working.
> +	 * copy_splice_read() a chance of actually working.
>  	 */
>  	ret = ocfs2_inode_lock_atime(inode, filp->f_path.mnt, &lock_level,
>  				     !nowait);
> @@ -2593,7 +2593,7 @@ static ssize_t ocfs2_file_splice_read(struct file *in, loff_t *ppos,
>  				     (unsigned long long)OCFS2_I(inode)->ip_blkno,
>  				     in->f_path.dentry->d_name.len,
>  				     in->f_path.dentry->d_name.name,
> -				     0);
> +				     flags);
>  
>  	/*
>  	 * We're fine letting folks race truncates and extending writes with
> @@ -2601,10 +2601,10 @@ static ssize_t ocfs2_file_splice_read(struct file *in, loff_t *ppos,
>  	 * rw_lock during read.
>  	 *
>  	 * Take and drop the meta data lock to update inode fields like i_size.
> -	 * This allows the checks down below generic_file_splice_read() a
> -	 * chance of actually working.
> +	 * This allows the checks down below filemap_splice_read() a chance of
> +	 * actually working.
>  	 */
> -	ret = ocfs2_inode_lock_atime(inode, in->f_path.mnt, &lock_level, true);
> +	ret = ocfs2_inode_lock_atime(inode, in->f_path.mnt, &lock_level, 1);
>  	if (ret < 0) {
>  		if (ret != -EAGAIN)
>  			mlog_errno(ret);

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v21 03/30] splice: Rename direct_splice_read() to copy_splice_read()
  2023-05-20  0:00 ` [PATCH v21 03/30] splice: Rename direct_splice_read() to copy_splice_read() David Howells
                     ` (2 preceding siblings ...)
  2023-05-20  9:51   ` David Howells
@ 2023-05-22  7:55   ` David Howells
  2023-05-22 12:53     ` Christian Brauner
  3 siblings, 1 reply; 72+ messages in thread
From: David Howells @ 2023-05-22  7:55 UTC (permalink / raw)
  To: Christian Brauner
  Cc: dhowells, Jens Axboe, Al Viro, Christoph Hellwig, Matthew Wilcox,
	Jan Kara, Jeff Layton, David Hildenbrand, Jason Gunthorpe,
	Logan Gunthorpe, Hillf Danton, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig,
	Steve French, linux-cifs

> For the future it'd be nice if exported functions would always get
> proper kernel doc,

Something like the attached?

David
---
commit 0362042ba0751fc5457b0548fb9006f9d7dfbeca
Author: David Howells <dhowells@redhat.com>
Date:   Mon May 22 08:34:24 2023 +0100

    splice: kdoc for filemap_splice_read() and copy_splice_read()
    
    Provide kerneldoc comments for filemap_splice_read() and
    copy_splice_read().
    
    Signed-off-by: David Howells <dhowells@redhat.com>
    cc: Christian Brauner <brauner@kernel.org>
    cc: Christoph Hellwig <hch@lst.de>
    cc: Jens Axboe <axboe@kernel.dk>
    cc: Steve French <smfrench@gmail.com>
    cc: Al Viro <viro@zeniv.linux.org.uk>
    cc: linux-mm@kvack.org
    cc: linux-block@vger.kernel.org
    cc: linux-cifs@vger.kernel.org
    cc: linux-fsdevel@vger.kernel.org

diff --git a/fs/splice.c b/fs/splice.c
index 9be4cb3b9879..5292a8fa929d 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -299,8 +299,25 @@ void splice_shrink_spd(struct splice_pipe_desc *spd)
 	kfree(spd->partial);
 }
 
-/*
- * Copy data from a file into pages and then splice those into the output pipe.
+/**
+ * copy_splice_read -  Copy data from a file and splice the copy into a pipe
+ * @in: The file to read from
+ * @ppos: Pointer to the file position to read from
+ * @pipe: The pipe to splice into
+ * @len: The amount to splice
+ * @flags: The SPLICE_F_* flags
+ *
+ * This function allocates a bunch of pages sufficient to hold the requested
+ * amount of data (but limited by the remaining pipe capacity), passes it to
+ * the file's ->read_iter() to read into and then splices the used pages into
+ * the pipe.
+ *
+ * On success, the number of bytes read will be returned and *@ppos will be
+ * updated if appropriate; 0 will be returned if there is no more data to be
+ * read; -EAGAIN will be returned if the pipe had no space, and some other
+ * negative error code will be returned on error.  A short read may occur if
+ * the pipe has insufficient space, we reach the end of the data or we hit a
+ * hole.
  */
 ssize_t copy_splice_read(struct file *in, loff_t *ppos,
 			 struct pipe_inode_info *pipe,
diff --git a/mm/filemap.c b/mm/filemap.c
index 603b562d69b1..1f235a6430fd 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2871,9 +2871,24 @@ size_t splice_folio_into_pipe(struct pipe_inode_info *pipe,
 	return spliced;
 }
 
-/*
- * Splice folios from the pagecache of a buffered (ie. non-O_DIRECT) file into
- * a pipe.
+/**
+ * filemap_splice_read -  Splice data from a file's pagecache into a pipe
+ * @in: The file to read from
+ * @ppos: Pointer to the file position to read from
+ * @pipe: The pipe to splice into
+ * @len: The amount to splice
+ * @flags: The SPLICE_F_* flags
+ *
+ * This function gets folios from a file's pagecache and splices them into the
+ * pipe.  Readahead will be called as necessary to fill more folios.  This may
+ * be used for blockdevs also.
+ *
+ * On success, the number of bytes read will be returned and *@ppos will be
+ * updated if appropriate; 0 will be returned if there is no more data to be
+ * read; -EAGAIN will be returned if the pipe had no space, and some other
+ * negative error code will be returned on error.  A short read may occur if
+ * the pipe has insufficient space, we reach the end of the data or we hit a
+ * hole.
  */
 ssize_t filemap_splice_read(struct file *in, loff_t *ppos,
 			    struct pipe_inode_info *pipe,


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* Re: [PATCH v21 03/30] splice: Rename direct_splice_read() to copy_splice_read()
  2023-05-22  7:55   ` David Howells
@ 2023-05-22 12:53     ` Christian Brauner
  0 siblings, 0 replies; 72+ messages in thread
From: Christian Brauner @ 2023-05-22 12:53 UTC (permalink / raw)
  To: David Howells
  Cc: Jens Axboe, Al Viro, Christoph Hellwig, Matthew Wilcox, Jan Kara,
	Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Linus Torvalds, linux-fsdevel, linux-block,
	linux-kernel, linux-mm, Christoph Hellwig, Steve French,
	linux-cifs

On Mon, May 22, 2023 at 08:55:14AM +0100, David Howells wrote:
> > For the future it'd be nice if exported functions would always get
> > proper kernel doc,
> 
> Something like the attached?
> 
> David
> ---
> commit 0362042ba0751fc5457b0548fb9006f9d7dfbeca
> Author: David Howells <dhowells@redhat.com>
> Date:   Mon May 22 08:34:24 2023 +0100
> 
>     splice: kdoc for filemap_splice_read() and copy_splice_read()
>     
>     Provide kerneldoc comments for filemap_splice_read() and
>     copy_splice_read().
>     
>     Signed-off-by: David Howells <dhowells@redhat.com>
>     cc: Christian Brauner <brauner@kernel.org>
>     cc: Christoph Hellwig <hch@lst.de>
>     cc: Jens Axboe <axboe@kernel.dk>
>     cc: Steve French <smfrench@gmail.com>
>     cc: Al Viro <viro@zeniv.linux.org.uk>
>     cc: linux-mm@kvack.org
>     cc: linux-block@vger.kernel.org
>     cc: linux-cifs@vger.kernel.org
>     cc: linux-fsdevel@vger.kernel.org
> 
> diff --git a/fs/splice.c b/fs/splice.c
> index 9be4cb3b9879..5292a8fa929d 100644
> --- a/fs/splice.c
> +++ b/fs/splice.c
> @@ -299,8 +299,25 @@ void splice_shrink_spd(struct splice_pipe_desc *spd)
>  	kfree(spd->partial);
>  }
>  
> -/*
> - * Copy data from a file into pages and then splice those into the output pipe.
> +/**
> + * copy_splice_read -  Copy data from a file and splice the copy into a pipe
> + * @in: The file to read from
> + * @ppos: Pointer to the file position to read from
> + * @pipe: The pipe to splice into
> + * @len: The amount to splice
> + * @flags: The SPLICE_F_* flags
> + *
> + * This function allocates a bunch of pages sufficient to hold the requested
> + * amount of data (but limited by the remaining pipe capacity), passes it to
> + * the file's ->read_iter() to read into and then splices the used pages into
> + * the pipe.
> + *
> + * On success, the number of bytes read will be returned and *@ppos will be
> + * updated if appropriate; 0 will be returned if there is no more data to be
> + * read; -EAGAIN will be returned if the pipe had no space, and some other
> + * negative error code will be returned on error.  A short read may occur if
> + * the pipe has insufficient space, we reach the end of the data or we hit a
> + * hole.
>   */

I think kdoc expects:

* Return: On success, the number of bytes read will be returned and *@ppos will be
* updated if appropriate; 0 will be returned if there is no more data to be
* read; -EAGAIN will be returned if the pipe had no space, and some other
* negative error code will be returned on error.  A short read may occur if
* the pipe has insufficient space, we reach the end of the data or we hit a
* hole.

and similar for filemap_splice_read() other than that this looks good!

>  ssize_t copy_splice_read(struct file *in, loff_t *ppos,
>  			 struct pipe_inode_info *pipe,
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 603b562d69b1..1f235a6430fd 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -2871,9 +2871,24 @@ size_t splice_folio_into_pipe(struct pipe_inode_info *pipe,
>  	return spliced;
>  }
>  
> -/*
> - * Splice folios from the pagecache of a buffered (ie. non-O_DIRECT) file into
> - * a pipe.
> +/**
> + * filemap_splice_read -  Splice data from a file's pagecache into a pipe
> + * @in: The file to read from
> + * @ppos: Pointer to the file position to read from
> + * @pipe: The pipe to splice into
> + * @len: The amount to splice
> + * @flags: The SPLICE_F_* flags
> + *
> + * This function gets folios from a file's pagecache and splices them into the
> + * pipe.  Readahead will be called as necessary to fill more folios.  This may
> + * be used for blockdevs also.
> + *
> + * On success, the number of bytes read will be returned and *@ppos will be
> + * updated if appropriate; 0 will be returned if there is no more data to be
> + * read; -EAGAIN will be returned if the pipe had no space, and some other
> + * negative error code will be returned on error.  A short read may occur if
> + * the pipe has insufficient space, we reach the end of the data or we hit a
> + * hole.
>   */
>  ssize_t filemap_splice_read(struct file *in, loff_t *ppos,
>  			    struct pipe_inode_info *pipe,
> 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v21 26/30] splice: Convert trace/seq to use copy_splice_read()
  2023-05-20  0:00 ` [PATCH v21 26/30] splice: Convert trace/seq to use copy_splice_read() David Howells
                     ` (2 preceding siblings ...)
  2023-05-21 12:50   ` David Howells
@ 2023-05-23 14:27   ` Steven Rostedt
  3 siblings, 0 replies; 72+ messages in thread
From: Steven Rostedt @ 2023-05-23 14:27 UTC (permalink / raw)
  To: David Howells
  Cc: Jens Axboe, Al Viro, Christoph Hellwig, Matthew Wilcox, Jan Kara,
	Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig,
	Masami Hiramatsu, linux-trace-kernel

On Sat, 20 May 2023 01:00:45 +0100
David Howells <dhowells@redhat.com> wrote:

> For the splice from the trace seq buffer, just use copy_splice_read().
> 
> In the future, something better can probably be done by gifting pages from
> seq->buf into the pipe, but that would require changing seq->buf into a
> vmap over an array of pages.
> 
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Christoph Hellwig <hch@lst.de>
> cc: Al Viro <viro@zeniv.linux.org.uk>
> cc: Jens Axboe <axboe@kernel.dk>
> cc: Steven Rostedt <rostedt@goodmis.org>
> cc: Masami Hiramatsu <mhiramat@kernel.org>
> cc: linux-kernel@vger.kernel.org
> cc: linux-trace-kernel@vger.kernel.org
> cc: linux-fsdevel@vger.kernel.org
> cc: linux-block@vger.kernel.org
> cc: linux-mm@kvack.org
> ---
>  kernel/trace/trace.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index ebc59781456a..c210d02fac97 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -5171,7 +5171,7 @@ static const struct file_operations tracing_fops = {
>  	.open		= tracing_open,
>  	.read		= seq_read,
>  	.read_iter	= seq_read_iter,
> -	.splice_read	= generic_file_splice_read,
> +	.splice_read	= copy_splice_read,

Anyway, for this change:

Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>

-- Steve

>  	.write		= tracing_write_stub,
>  	.llseek		= tracing_lseek,
>  	.release	= tracing_release,


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [f2fs-dev] [PATCH v21 19/30] f2fs: Provide a splice-read stub
  2023-05-20  0:00 ` [PATCH v21 19/30] f2fs: " David Howells
@ 2023-07-06  0:18   ` patchwork-bot+f2fs
  0 siblings, 0 replies; 72+ messages in thread
From: patchwork-bot+f2fs @ 2023-07-06  0:18 UTC (permalink / raw)
  To: David Howells
  Cc: axboe, viro, hch, linux-block, hdanton, jack, david, torvalds,
	jlayton, brauner, willy, linux-kernel, linux-mm, jgg,
	linux-fsdevel, jaegeuk, linux-f2fs-devel, logang, hch

Hello:

This patch was applied to jaegeuk/f2fs.git (dev)
by Jens Axboe <axboe@kernel.dk>:

On Sat, 20 May 2023 01:00:38 +0100 you wrote:
> Provide a splice_read stub for f2fs.  This does some checks and tracing
> before calling filemap_splice_read() and will update the iostats
> afterwards.  Direct I/O is handled by the caller.
> 
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Christoph Hellwig <hch@lst.de>
> cc: Al Viro <viro@zeniv.linux.org.uk>
> cc: Jens Axboe <axboe@kernel.dk>
> cc: Jaegeuk Kim <jaegeuk@kernel.org>
> cc: Chao Yu <chao@kernel.org>
> cc: linux-f2fs-devel@lists.sourceforge.net
> cc: linux-fsdevel@vger.kernel.org
> cc: linux-block@vger.kernel.org
> cc: linux-mm@kvack.org
> 
> [...]

Here is the summary with links:
  - [f2fs-dev,v21,19/30] f2fs: Provide a splice-read stub
    https://git.kernel.org/jaegeuk/f2fs/c/ceb11d0e2da2

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 72+ messages in thread

end of thread, other threads:[~2023-07-06  0:18 UTC | newest]

Thread overview: 72+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-05-20  0:00 [PATCH v21 00/30] splice: Kill ITER_PIPE David Howells
2023-05-20  0:00 ` [PATCH v21 01/30] splice: Fix filemap of a blockdev David Howells
2023-05-20  4:08   ` Christoph Hellwig
2023-05-20  9:14   ` Christian Brauner
2023-05-20  0:00 ` [PATCH v21 02/30] splice: Make filemap_splice_read() check s_maxbytes David Howells
2023-05-20  4:09   ` Christoph Hellwig
2023-05-20  9:21   ` Christian Brauner
2023-05-20  0:00 ` [PATCH v21 03/30] splice: Rename direct_splice_read() to copy_splice_read() David Howells
2023-05-20  4:09   ` Christoph Hellwig
2023-05-20  9:23   ` Christian Brauner
2023-05-20  9:51   ` David Howells
2023-05-22  7:55   ` David Howells
2023-05-22 12:53     ` Christian Brauner
2023-05-20  0:00 ` [PATCH v21 04/30] splice: Clean up copy_splice_read() a bit David Howells
2023-05-20  9:34   ` Christian Brauner
2023-05-20  0:00 ` [PATCH v21 05/30] splice: Make do_splice_to() generic and export it David Howells
2023-05-20  9:35   ` Christian Brauner
2023-05-20  0:00 ` [PATCH v21 06/30] splice: Check for zero count in vfs_splice_read() David Howells
2023-05-20  9:38   ` Christian Brauner
2023-05-20  0:00 ` [PATCH v21 07/30] splice: Make splice from an O_DIRECT fd use copy_splice_read() David Howells
2023-05-20  4:11   ` Christoph Hellwig
2023-05-20  9:39   ` Christian Brauner
2023-05-20  0:00 ` [PATCH v21 08/30] splice: Make splice from a DAX file " David Howells
2023-05-20  4:11   ` Christoph Hellwig
2023-05-20  9:41   ` Christian Brauner
2023-05-21  0:28   ` Theodore Ts'o
2023-05-21 14:55   ` Gao Xiang
2023-05-20  0:00 ` [PATCH v21 09/30] shmem: Implement splice-read David Howells
2023-05-20  0:00 ` [PATCH v21 10/30] overlayfs: " David Howells
2023-05-20  9:47   ` Christian Brauner
2023-05-20  0:00 ` [PATCH v21 11/30] coda: " David Howells
2023-05-20  0:00 ` [PATCH v21 12/30] tty, proc, kernfs, random: Use copy_splice_read() David Howells
2023-05-20  0:00 ` [PATCH v21 13/30] net: Make sock_splice_read() use copy_splice_read() by default David Howells
2023-05-20  0:00 ` [PATCH v21 14/30] 9p: Add splice_read stub David Howells
2023-05-20  0:00 ` [PATCH v21 15/30] afs: Provide a splice-read stub David Howells
2023-05-20  0:00 ` [PATCH v21 16/30] ceph: " David Howells
2023-05-22  2:12   ` Xiubo Li
2023-05-20  0:00 ` [PATCH v21 17/30] ecryptfs: " David Howells
2023-05-20  0:00 ` [PATCH v21 18/30] ext4: " David Howells
2023-05-20  4:12   ` Christoph Hellwig
2023-05-20  7:21   ` David Howells
2023-05-20  9:01     ` Christoph Hellwig
2023-05-21  0:26       ` Theodore Ts'o
2023-05-20  0:00 ` [PATCH v21 19/30] f2fs: " David Howells
2023-07-06  0:18   ` [f2fs-dev] " patchwork-bot+f2fs
2023-05-20  0:00 ` [PATCH v21 20/30] nfs: " David Howells
2023-05-20  0:00 ` [PATCH v21 21/30] ntfs3: " David Howells
2023-05-20  0:00 ` [PATCH v21 22/30] ocfs2: " David Howells
2023-05-22  2:49   ` Joseph Qi
2023-05-22  6:28   ` David Howells
2023-05-22  6:34     ` Joseph Qi
2023-05-22  6:49   ` David Howells
2023-05-22  6:54     ` Joseph Qi
2023-05-20  0:00 ` [PATCH v21 23/30] orangefs: " David Howells
2023-05-20  0:00 ` [PATCH v21 24/30] xfs: " David Howells
2023-05-20  4:13   ` Christoph Hellwig
2023-05-20  0:00 ` [PATCH v21 25/30] zonefs: " David Howells
2023-05-20  0:00 ` [PATCH v21 26/30] splice: Convert trace/seq to use copy_splice_read() David Howells
2023-05-20  4:14   ` Christoph Hellwig
2023-05-21 10:28   ` Masami Hiramatsu
2023-05-21 12:50   ` David Howells
2023-05-23 14:27   ` Steven Rostedt
2023-05-20  0:00 ` [PATCH v21 27/30] cifs: Use filemap_splice_read() David Howells
2023-05-20  0:00 ` [PATCH v21 28/30] splice: Use filemap_splice_read() instead of generic_file_splice_read() David Howells
2023-05-20  4:14   ` Christoph Hellwig
2023-05-20  9:56   ` Christian Brauner
2023-05-20  0:00 ` [PATCH v21 29/30] splice: Remove generic_file_splice_read() David Howells
2023-05-20  4:14   ` Christoph Hellwig
2023-05-20  9:57   ` Christian Brauner
2023-05-20  0:00 ` [PATCH v21 30/30] iov_iter: Kill ITER_PIPE David Howells
2023-05-20  4:15   ` Christoph Hellwig
2023-05-20  9:58   ` Christian Brauner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).