From: Jeff Layton <jlayton@kernel.org>
To: David Howells <dhowells@redhat.com>,
Christian Brauner <christian@brauner.io>,
Gao Xiang <hsiangkao@linux.alibaba.com>,
Dominique Martinet <asmadeus@codewreck.org>
Cc: Matthew Wilcox <willy@infradead.org>,
Steve French <smfrench@gmail.com>,
Marc Dionne <marc.dionne@auristor.com>,
Paulo Alcantara <pc@manguebit.com>,
Shyam Prasad N <sprasad@microsoft.com>,
Tom Talpey <tom@talpey.com>,
Eric Van Hensbergen <ericvh@kernel.org>,
Ilya Dryomov <idryomov@gmail.com>,
netfs@lists.linux.dev, linux-cachefs@redhat.com,
linux-afs@lists.infradead.org, linux-cifs@vger.kernel.org,
linux-nfs@vger.kernel.org, ceph-devel@vger.kernel.org,
v9fs@lists.linux.dev, linux-erofs@lists.ozlabs.org,
linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 00/26] netfs, afs, 9p, cifs: Rework netfs to use ->writepages() to copy to cache
Date: Mon, 15 Apr 2024 08:49:39 -0400
Message-ID: <5c905e500499a07c5e4b0dcf9983b90e8746ed81.camel@kernel.org>
In-Reply-To: <20240328163424.2781320-1-dhowells@redhat.com>
On Thu, 2024-03-28 at 16:33 +0000, David Howells wrote:
> Hi Christian, Willy,
>
> The primary purpose of these patches is to rework the netfslib writeback
> implementation such that pages read from the cache are written to the cache
> through ->writepages(), thereby allowing the fscache page flag to be
> retired.
>
> The reworking also:
>
> (1) builds on top of the new writeback_iter() infrastructure;
>
> (2) makes it possible to use vectored write RPCs as discontiguous streams
> of pages can be accommodated;
>
> (3) makes it easier to do simultaneous content crypto and stream division;
>
> (4) provides support for retrying writes and re-dividing a stream;
>
> (5) replaces the ->launder_folio() op, so that ->writepages() is used
> instead;
>
> (6) uses mempools to allocate the netfs_io_request and netfs_io_subrequest
> structs to avoid allocation failure in the writeback path.
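To illustrate point (6): a minimal userspace sketch of the reserve-pool idea behind mempools, assuming nothing about how the patches actually wire it up. The names sketch_pool, pool_init(), pool_alloc(), pool_free(), failing_alloc() and POOL_MIN are all hypothetical and are not kernel API. The point is only that a reserve filled in advance lets an allocation succeed even when the normal allocator fails.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define POOL_MIN 4	/* hypothetical reserve size */

typedef void *(*alloc_fn)(size_t size);

struct sketch_pool {
	void *reserve[POOL_MIN];	/* preallocated elements */
	int nr_reserved;		/* how many are currently held */
	size_t elem_size;
	alloc_fn alloc;			/* normal allocator, may fail */
};

/* Fill the reserve up front, while failure is still easy to handle. */
static int pool_init(struct sketch_pool *p, size_t elem_size, alloc_fn alloc)
{
	p->nr_reserved = 0;
	p->elem_size = elem_size;
	p->alloc = alloc;
	while (p->nr_reserved < POOL_MIN) {
		void *e = malloc(elem_size);

		if (!e) {
			while (p->nr_reserved > 0)
				free(p->reserve[--p->nr_reserved]);
			return -1;
		}
		p->reserve[p->nr_reserved++] = e;
	}
	return 0;
}

/* Try the normal allocator first; dip into the reserve only if it fails. */
static void *pool_alloc(struct sketch_pool *p)
{
	void *e = p->alloc(p->elem_size);

	if (e)
		return e;
	if (p->nr_reserved > 0)
		return p->reserve[--p->nr_reserved];
	return NULL;	/* reserve exhausted; this sketch just gives up */
}

/* Refill the reserve before handing memory back to the system. */
static void pool_free(struct sketch_pool *p, void *e)
{
	if (p->nr_reserved < POOL_MIN)
		p->reserve[p->nr_reserved++] = e;
	else
		free(e);
}

/* An allocator that always fails, to model memory pressure. */
static void *failing_alloc(size_t size)
{
	(void)size;
	return NULL;
}
```

With failing_alloc as the normal allocator, pool_alloc() still returns an element from the reserve, and pool_free() puts it back so the next caller can make progress.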
>
> Some code that uses the fscache page flag is retained for compatibility
> purposes with nfs and ceph. The code is switched to using the synonymous
> private_2 label instead and marked with deprecation comments. I have a
> separate set of patches that convert cifs to use this code.
>
> -~-
>
> In this new implementation, writeback_iter() is used to pump folios,
> progressively creating two parallel, but separate streams. Either or both
> streams can contain gaps, and the subrequests in each stream can be of
> variable size, don't need to align with each other and don't need to align
> with the folios. (Note that more streams can be added if we have multiple
> servers to duplicate data to.)
>
> Indeed, subrequests can cross folio boundaries, may cover several folios or
> a folio may be spanned by multiple subrequests, e.g.:
>
>                 +---+---+-----+-----+---+----------+
>         Folios: |   |   |     |     |   |          |
>                 +---+---+-----+-----+---+----------+
>
>                   +------+------+     +----+----+
>         Upload:   |      |      |.....|    |    |
>                   +------+------+     +----+----+
>
>                 +------+------+------+------+------+
>         Cache:  |      |      |      |      |      |
>                 +------+------+------+------+------+
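The division into streams can be sketched roughly as below. This is a userspace illustration under stated assumptions, not the code from the series: struct sketch_subreq, divide_stream() and subreqs_spanning_folio() are hypothetical names, and the only rule modelled is that each stream cuts the same byte span by its own maximum subrequest size, without regard to folio boundaries.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_subreq {
	unsigned long long start;	/* file position */
	size_t len;			/* bytes covered */
};

/*
 * Divide the span [start, start + len) into subrequests of at most
 * max_len bytes each. Returns the number of subrequests written to
 * out[], or -1 if out[] is too small or max_len is zero.
 */
static int divide_stream(unsigned long long start, size_t len, size_t max_len,
			 struct sketch_subreq *out, int out_cap)
{
	int n = 0;

	if (max_len == 0)
		return -1;
	while (len > 0) {
		size_t part = len < max_len ? len : max_len;

		if (n == out_cap)
			return -1;
		out[n].start = start;
		out[n].len = part;
		n++;
		start += part;
		len -= part;
	}
	return n;
}

/* Count the subrequests that overlap the folio at [fpos, fpos + flen). */
static int subreqs_spanning_folio(const struct sketch_subreq *sr, int n,
				  unsigned long long fpos, size_t flen)
{
	int i, hits = 0;

	for (i = 0; i < n; i++)
		if (sr[i].start < fpos + flen &&
		    fpos < sr[i].start + sr[i].len)
			hits++;
	return hits;
}
```

As a made-up example, dividing a 40960-byte span with a 16384-byte cap for the upload stream gives three subrequests, while a 12288-byte cap for the cache stream gives four. A 16384-byte folio at offset 12288 is then overlapped by two subrequests in each stream, and none of the subrequest boundaries need coincide with the folio's.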
>
> Data that got read from the server that needs copying to the cache is
> stored in folios that are marked dirty and have folio->private set to a
> special value.
>
> The progressive subrequest construction permits the algorithm to be
> preparing both the next upload to the server and the next write to the
> cache whilst the previous ones are already in progress. Throttling can be
> applied to control the rate of production of subrequests - and, in any
> case, we probably want to write them to the server in ascending order,
> particularly if the file will be extended.
>
> Content crypto can also be prepared at the same time as the subrequests and
> run asynchronously, with the prepped requests being stalled until the
> crypto catches up with them. This might also be useful for transport
> crypto, but that happens at a lower layer, so probably would be harder to
> pull off.
>
> The algorithm is split into three parts:
>
> (1) The issuer. This walks through the data, packaging it up, encrypting
> it and creating subrequests. The part of this that generates
> subrequests only deals with file positions and spans and so is usable
> for DIO/unbuffered writes as well as buffered writes.
>
> (2) The collector. This asynchronously collects completed subrequests,
> unlocks folios, frees crypto buffers and performs any retries. This
> runs in a work queue so that the issuer can return to the caller for
> writeback (so that the VM can have its kswapd thread back) or async
> writes.
>
> Collection is slightly complex as the collector has to work out where
> discontiguities happen in the folio list so that it doesn't try to
> collect folios that weren't included in the write out.
>
> (3) The retryer. This pauses the issuer, waits for all outstanding
> subrequests to complete and then goes through the failed subrequests
> to reissue them. This may involve reprepping them (with cifs, the
> credits must be renegotiated and a subrequest may need splitting), and
> doing RMW for content crypto if there's a conflicting change on the
> server.
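The collector and retryer roles described above can be modelled in userspace roughly as follows. This is only a sketch under simplifying assumptions: one stream, no gaps, no folios, no concurrency. The names sketch_subreq, sketch_state, collect() and reissue_failed() are hypothetical and do not come from the patches.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_state { SR_IN_PROGRESS, SR_DONE, SR_FAILED };

struct sketch_subreq {
	unsigned long long start;	/* file position */
	size_t len;			/* bytes covered */
	enum sketch_state state;
};

/*
 * Collector: return the file position up to which the stream has been
 * written, i.e. the end of the leading run of completed subrequests.
 * Assumes sr[] is sorted by position and has no gaps.
 */
static unsigned long long collect(const struct sketch_subreq *sr, int n)
{
	unsigned long long collected_to = n > 0 ? sr[0].start : 0;
	int i;

	for (i = 0; i < n; i++) {
		if (sr[i].state != SR_DONE)
			break;	/* still in flight, or needs a retry */
		collected_to = sr[i].start + sr[i].len;
	}
	return collected_to;
}

/*
 * Retryer: copy in[] to out[], replacing each failed subrequest with
 * fresh in-progress subrequests of at most new_max bytes, modelling a
 * server that now accepts only smaller writes. Returns the new count,
 * or -1 if out[] is too small or new_max is zero.
 */
static int reissue_failed(const struct sketch_subreq *in, int n,
			  size_t new_max,
			  struct sketch_subreq *out, int out_cap)
{
	int i, m = 0;

	if (new_max == 0)
		return -1;
	for (i = 0; i < n; i++) {
		unsigned long long pos = in[i].start;
		size_t left = in[i].len;

		if (in[i].state != SR_FAILED) {
			if (m == out_cap)
				return -1;
			out[m++] = in[i];
			continue;
		}
		while (left > 0) {
			size_t part = left < new_max ? left : new_max;

			if (m == out_cap)
				return -1;
			out[m].start = pos;
			out[m].len = part;
			out[m].state = SR_IN_PROGRESS;
			m++;
			pos += part;
			left -= part;
		}
	}
	return m;
}
```

In this model a failed subrequest in the middle of the stream holds collection at its start position. Once reissue_failed() has split it and the replacement pieces complete, collect() can advance past it to the end of the stream.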
>
> David
>
> David Howells (26):
> cifs: Fix duplicate fscache cookie warnings
> 9p: Clean up some kdoc and unused var warnings.
> netfs: Update i_blocks when write committed to pagecache
> netfs: Replace PG_fscache by setting folio->private and marking dirty
> mm: Remove the PG_fscache alias for PG_private_2
> netfs: Remove deprecated use of PG_private_2 as a second writeback
> flag
> netfs: Make netfs_io_request::subreq_counter an atomic_t
> netfs: Use subreq_counter to allocate subreq debug_index values
> mm: Provide a means of invalidation without using launder_folio
> cifs: Use alternative invalidation to using launder_folio
> 9p: Use alternative invalidation to using launder_folio
> afs: Use alternative invalidation to using launder_folio
> netfs: Remove ->launder_folio() support
> netfs: Use mempools for allocating requests and subrequests
> mm: Export writeback_iter()
> netfs: Switch to using unsigned long long rather than loff_t
> netfs: Fix writethrough-mode error handling
> netfs: Add some write-side stats and clean up some stat names
> netfs: New writeback implementation
> netfs, afs: Implement helpers for new write code
> netfs, 9p: Implement helpers for new write code
> netfs, cachefiles: Implement helpers for new write code
> netfs: Cut over to using new writeback code
> netfs: Remove the old writeback code
> netfs: Miscellaneous tidy ups
> netfs, afs: Use writeback retry to deal with alternate keys
>
> fs/9p/vfs_addr.c | 60 +--
> fs/9p/vfs_inode_dotl.c | 4 -
> fs/afs/file.c | 8 +-
> fs/afs/internal.h | 6 +-
> fs/afs/validation.c | 4 +-
> fs/afs/write.c | 187 ++++----
> fs/cachefiles/io.c | 75 +++-
> fs/ceph/addr.c | 24 +-
> fs/ceph/inode.c | 2 +
> fs/netfs/Makefile | 3 +-
> fs/netfs/buffered_read.c | 40 +-
> fs/netfs/buffered_write.c | 832 ++++-------------------------------
> fs/netfs/direct_write.c | 30 +-
> fs/netfs/fscache_io.c | 14 +-
> fs/netfs/internal.h | 55 ++-
> fs/netfs/io.c | 155 +------
> fs/netfs/main.c | 55 ++-
> fs/netfs/misc.c | 10 +-
> fs/netfs/objects.c | 81 +++-
> fs/netfs/output.c | 478 --------------------
> fs/netfs/stats.c | 17 +-
> fs/netfs/write_collect.c | 813 ++++++++++++++++++++++++++++++++++
> fs/netfs/write_issue.c | 673 ++++++++++++++++++++++++++++
> fs/nfs/file.c | 8 +-
> fs/nfs/fscache.h | 6 +-
> fs/nfs/write.c | 4 +-
> fs/smb/client/cifsfs.h | 1 -
> fs/smb/client/file.c | 136 +-----
> fs/smb/client/fscache.c | 16 +-
> fs/smb/client/inode.c | 27 +-
> include/linux/fscache.h | 22 +-
> include/linux/netfs.h | 196 +++++----
> include/linux/pagemap.h | 1 +
> include/net/9p/client.h | 2 +
> include/trace/events/netfs.h | 249 ++++++++++-
> mm/filemap.c | 52 ++-
> mm/page-writeback.c | 1 +
> net/9p/Kconfig | 1 +
> net/9p/client.c | 49 +++
> net/9p/trans_fd.c | 1 -
> 40 files changed, 2492 insertions(+), 1906 deletions(-)
> delete mode 100644 fs/netfs/output.c
> create mode 100644 fs/netfs/write_collect.c
> create mode 100644 fs/netfs/write_issue.c
>
This all looks pretty reasonable. There is at least one bugfix that
looks like it ought to go in independently (#17). #19 is huge, complex
and hard to review. That will need some cycles in -next, I think. In any
case, for any patches I didn't comment on, you can add:
Reviewed-by: Jeff Layton <jlayton@kernel.org>