All of lore.kernel.org
 help / color / mirror / Atom feed
From: David Howells <dhowells@redhat.com>
To: Steve French <smfrench@gmail.com>
Cc: David Howells <dhowells@redhat.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Shyam Prasad N <nspmangalore@gmail.com>,
	Rohith Surabattula <rohiths.msft@gmail.com>,
	Tom Talpey <tom@talpey.com>, Stefan Metzmacher <metze@samba.org>,
	Christoph Hellwig <hch@infradead.org>,
	Matthew Wilcox <willy@infradead.org>,
	Jeff Layton <jlayton@kernel.org>,
	linux-cifs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: [PATCH 00/11] smb3: Use iov_iters down to the network transport and fix DIO page pinning
Date: Fri,  3 Feb 2023 20:59:18 +0000	[thread overview]
Message-ID: <20230203205929.2126634-1-dhowells@redhat.com> (raw)

Hi Steve,

Here's an updated version of my patchset to make the cifs/smb3 driver pass
iov_iters down to the lowest layers where they can be passed directly to
the network transport rather than passing lists of pages around.

I've dropped the patch Stefan Metzmacher objected to and rebased on top of
a merge of my two iov_iter_extract_pages() patches onto your for-next
branch.  The merge is so that the same commits are used as are in the
linux-block tree.

The series also deals with some other issues:

 (*) By pinning pages, it fixes the race between concurrent DIO read and
     fork, whereby the pages containing the DIO read buffer may end up
     belonging to the child process and not the parent - with the result
     that the parent might not see the retrieved data.

 (*) cifs shouldn't take refs on pages extracted from non-user-backed
     iterators (eg. KVEC).  With these changes, cifs will apply the
     appropriate cleanup.  Note that there is the possibility the network
     transport might, but that's beyond the scope of this patchset.

 (*) Making it easier to transition to using folios in cifs rather than
     pages by dealing with them through BVEC and XARRAY iterators.


The first couple of patches to provide function to pin or leave unpinned
the pages from an iterator (and not take a ref on them).

 (1) Define qualifying flags for extraction functions.

 (2) Define iov_iter_extract_pages() to do the extraction and
     iov_iter_extract_will_pin() to indicate how it should be cleaned up.

Then there are a couple of patches that add stuff to netfslib that I want
to use there as well as in cifs:

 (3) Add a netfslib function to extract and pin pages from an ITER_IOBUF or
     ITER_UBUF iterator into an ITER_BVEC iterator.

 (4) Add a netfslib function to extract pages from an iterator that's of
     type ITER_UBUF/IOVEC/BVEC/KVEC/XARRAY and add them to a scatterlist.
     The cleanup will need to be done as for iov_iter_extract_pages().

     BVEC, KVEC and XARRAY iterators can be rendered into elements that
     span multiple pages.

Then a fix:

 (5) Fix oops due to uncleared server->smbd_conn in reconnect

Then there are some cifs helpers that work with iterators:

 (6) Implement cifs_splice_read() to use an ITER_BVEC rather than an
     ITER_PIPE, bulk-allocating the pages, attaching them to the bvec,
     doing the I/O and then pushing the pages into the pipe.  This avoids
     the problem with cifs wanting to split the pipe iterator in a later
     patch.

 (7) Add a function to walk through an ITER_BVEC/KVEC/XARRAY iterator and
     add elements to an RDMA SGE list.  Only the DMA addresses are stored,
     and an element may span multiple pages (say if an xarray contains a
     multipage folio).

 (8) Add a function to walk through an ITER_BVEC/KVEC/XARRAY iterator and
     pass the contents into a shash function.

 (9) Add functions to walk through an ITER_XARRAY iterator and perform
     various sorts of cleanup on the folios held therein, to be used on I/O
     completion.

(10) Add a function to read from the transport TCP socket directly into an
     iterator.

Then come the patches that actually do the work of iteratorising cifs:

(11) The main patch.  Replace page lists with iterators.  It extracts the
     pages from ITER_UBUF and ITER_IOVEC iterators to an ITER_BVEC
     iterator, pinning or getting refs on them, before passing them down as
     the I/O may be done from a worker thread.

     The iterator is extracted into a scatterlist in order to talk to the
     crypto interface or to do RDMA.

(12) In the cifs RDMA code, extract the iterator into an RDMA SGE[] list,
     removing the scatterlist intermediate - at least for smbd_send().
     There appear to be other ways for cifs to talk to the RDMA layer that
     don't go through that that I haven't managed to work out.

(13) Remove a chunk of now-unused code.

(14) Fix a problem with encrypted RDMA data read.

(15) Allow DIO to/from KVEC-type iterators.

I've pushed the patches here also:

	https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=iov-cifs

David

Link: https://lore.kernel.org/r/166697254399.61150.1256557652599252121.stgit@warthog.procyon.org.uk/
Link: https://lore.kernel.org/r/20230131182855.4027499-1-dhowells@redhat.com/ # v1

David Howells (11):
  netfs: Add a function to extract a UBUF or IOVEC into a BVEC iterator
  netfs: Add a function to extract an iterator into a scatterlist
  cifs: Implement splice_read to pass down ITER_BVEC not ITER_PIPE
  cifs: Add a function to build an RDMA SGE list from an iterator
  cifs: Add a function to Hash the contents of an iterator
  cifs: Add some helper functions
  cifs: Add a function to read into an iter from a socket
  cifs: Change the I/O paths to use an iterator rather than a page list
  cifs: Build the RDMA SGE list directly from an iterator
  cifs: Remove unused code
  cifs: DIO to/from KVEC-type iterators should now work

 fs/cifs/Kconfig       |    1 +
 fs/cifs/cifsencrypt.c |  172 +++-
 fs/cifs/cifsfs.c      |   12 +-
 fs/cifs/cifsfs.h      |    6 +
 fs/cifs/cifsglob.h    |   66 +-
 fs/cifs/cifsproto.h   |   11 +-
 fs/cifs/cifssmb.c     |   15 +-
 fs/cifs/connect.c     |   14 +
 fs/cifs/file.c        | 1848 +++++++++++++++++++----------------------
 fs/cifs/fscache.c     |   22 +-
 fs/cifs/fscache.h     |   10 +-
 fs/cifs/misc.c        |  128 +--
 fs/cifs/smb2ops.c     |  362 ++++----
 fs/cifs/smb2pdu.c     |   53 +-
 fs/cifs/smbdirect.c   |  535 +++++++-----
 fs/cifs/smbdirect.h   |    7 +-
 fs/cifs/transport.c   |   54 +-
 fs/netfs/Makefile     |    1 +
 fs/netfs/iterator.c   |  372 +++++++++
 fs/splice.c           |    1 +
 include/linux/netfs.h |    8 +
 mm/vmalloc.c          |    1 +
 22 files changed, 2013 insertions(+), 1686 deletions(-)
 create mode 100644 fs/netfs/iterator.c


             reply	other threads:[~2023-02-03 21:00 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-03 20:59 David Howells [this message]
2023-02-03 20:59 ` [PATCH 01/11] netfs: Add a function to extract a UBUF or IOVEC into a BVEC iterator David Howells
2023-02-03 20:59 ` [PATCH 02/11] netfs: Add a function to extract an iterator into a scatterlist David Howells
2023-02-03 20:59 ` [PATCH 03/11] cifs: Implement splice_read to pass down ITER_BVEC not ITER_PIPE David Howells
2023-02-06 15:48   ` Christoph Hellwig
2023-02-03 20:59 ` [PATCH 04/11] cifs: Add a function to build an RDMA SGE list from an iterator David Howells
2023-02-03 20:59 ` [PATCH 05/11] cifs: Add a function to Hash the contents of " David Howells
2023-02-03 20:59 ` [PATCH 06/11] cifs: Add some helper functions David Howells
2023-02-03 20:59 ` [PATCH 07/11] cifs: Add a function to read into an iter from a socket David Howells
2023-02-03 20:59 ` [PATCH 08/11] cifs: Change the I/O paths to use an iterator rather than a page list David Howells
2023-02-03 20:59 ` [PATCH 09/11] cifs: Build the RDMA SGE list directly from an iterator David Howells
2023-02-03 20:59 ` [PATCH 10/11] cifs: Remove unused code David Howells
2023-02-03 20:59 ` [PATCH 11/11] cifs: DIO to/from KVEC-type iterators should now work David Howells
2023-02-10 23:31 [PATCH 00/11] smb3: Use iov_iters down to the network transport and fix DIO page pinning David Howells

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230203205929.2126634-1-dhowells@redhat.com \
    --to=dhowells@redhat.com \
    --cc=hch@infradead.org \
    --cc=jlayton@kernel.org \
    --cc=linux-cifs@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=metze@samba.org \
    --cc=nspmangalore@gmail.com \
    --cc=rohiths.msft@gmail.com \
    --cc=smfrench@gmail.com \
    --cc=tom@talpey.com \
    --cc=viro@zeniv.linux.org.uk \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.