From: Steve French <smfrench@gmail.com>
To: Jeff Layton <jlayton@redhat.com>
Cc: David Howells <dhowells@redhat.com>,
	Trond Myklebust <trondmy@hammerspace.com>,
	Anna Schumaker <anna.schumaker@netapp.com>,
	Steve French <sfrench@samba.org>,
	Dominique Martinet <asmadeus@codewreck.org>,
	CIFS <linux-cifs@vger.kernel.org>,
	ceph-devel@vger.kernel.org, Matthew Wilcox <willy@infradead.org>,
	linux-cachefs@redhat.com,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	linux-mm <linux-mm@kvack.org>,
	linux-afs@lists.infradead.org,
	v9fs-developer@lists.sourceforge.net,
	Christoph Hellwig <hch@lst.de>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-nfs <linux-nfs@vger.kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	David Wysochanski <dwysocha@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 00/33] Network fs helper library & fscache kiocb API [ver #3]
Date: Mon, 15 Feb 2021 18:40:27 -0600
Message-ID: <CAH2r5mvPLivjuE=cbijzGSHOvx-hkWSWbcxpoBnJX-BR9pBskQ@mail.gmail.com>
In-Reply-To: <9e49f96cd80eaf9c8ed267a7fbbcb4c6467ee790.camel@redhat.com>

Jeff,
What are the performance differences you are seeing (positive or
negative) with ceph and netfs, especially with simple examples like
file copy or grep of large files?

It would be good if netfs eased a problem that network filesystems on
Linux have with readahead on large sequential reads: we don't get as
much parallelism as we could, because only one readahead request is
outstanding at a time (so in many cases there is 'dead time' on either
the network or the file server while waiting for the next readpages
request to be issued).  This can be a significant performance problem
for current readpages when network latency is long (or, e.g., when
network encryption is enabled and hardware offload is not available,
so encrypting each packet is time-consuming on the server or client).
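
As a rough illustration of the 'dead time' point, here is a toy model
of a large sequential read.  It is not taken from the netfs patches:
the function name, the batching model, and every number below are
assumptions chosen only to show the shape of the problem.  The model
pays one round-trip latency per batch of in-flight requests and lets
the transfers themselves serialise on the link.

```python
def sequential_read_time_us(n_requests, latency_us, transfer_us, in_flight=1):
    """Estimate the time (in microseconds) to complete n_requests reads.

    Toy model (an assumption, not measured behaviour): requests are
    issued in batches of `in_flight`; the round-trip latency is paid
    once per batch, and each request's data transfer occupies the link
    for `transfer_us`.
    """
    if n_requests < 0 or in_flight < 1:
        raise ValueError("n_requests must be >= 0 and in_flight >= 1")
    # Integer ceiling division: number of batches needed.
    batches = -(-n_requests // in_flight)
    return batches * latency_us + n_requests * transfer_us


# Hypothetical workload: 1024 requests (e.g. a 1 GiB file read in
# 1 MiB pieces), 10 ms round-trip latency, 1 ms of wire time each.
one_at_a_time = sequential_read_time_us(1024, 10_000, 1_000, in_flight=1)
four_in_flight = sequential_read_time_us(1024, 10_000, 1_000, in_flight=4)

print(one_at_a_time)   # 11264000
print(four_in_flight)  # 3584000
```

In this model only the latency term shrinks as more requests are kept
in flight; the transfer term is fixed by the link.  Real behaviour
will also depend on the server, the protocol's flow control, and how
the readahead window is sized.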

Do you see netfs being much faster than current readpages for ceph?

Have you seen much benefit with ceph from throttling readahead using
the current netfs approach to clamping I/O?

On Mon, Feb 15, 2021 at 12:08 PM Jeff Layton <jlayton@redhat.com> wrote:
>
> On Mon, 2021-02-15 at 15:44 +0000, David Howells wrote:
> > Here's a set of patches to do two things:
> >
> >  (1) Add a helper library to handle the new VM readahead interface.  This
> >      is intended to be used unconditionally by the filesystem (whether or
> >      not caching is enabled) and provides a common framework for doing
> >      caching, transparent huge pages and, in the future, possibly fscrypt
> >      and read bandwidth maximisation.  It also allows the netfs and the
> >      cache to align, expand and slice up a read request from the VM in
> >      various ways; the netfs need only provide a function to read a stretch
> >      of data to the pagecache and the helper takes care of the rest.
> >
> >  (2) Add an alternative fscache/cachefiles I/O API that uses the kiocb
> >      facility to do async DIO to transfer data to/from the netfs's pages,
> >      rather than using readpage with wait queue snooping on one side and
> >      vfs_write() on the other.  It also uses less memory, since it doesn't
> >      do buffered I/O on the backing file.
> >
> >      Note that this uses SEEK_HOLE/SEEK_DATA to locate the data available
> >      to be read from the cache.  Whilst this is an improvement from the
> >      bmap interface, it still has a problem with regard to a modern
> >      extent-based filesystem inserting or removing bridging blocks of
> >      zeros.  Fixing that requires a much greater overhaul.
> >
> > This is a step towards overhauling the fscache API.  The change is opt-in
> > on the part of the network filesystem.  A netfs should not try to mix the
> > old and the new API because of conflicting ways of handling pages and the
> > PG_fscache page flag and because it would be mixing DIO with buffered I/O.
> > Further, the helper library can't be used with the old API.
> >
> > This does not change any of the fscache cookie handling APIs or the way
> > invalidation is done.
> >
> > In the near term, I intend to deprecate and remove the old I/O API
> > (fscache_allocate_page{,s}(), fscache_read_or_alloc_page{,s}(),
> > fscache_write_page() and fscache_uncache_page()) and eventually replace
> > most of fscache/cachefiles with something simpler and easier to follow.
> >
> > The patchset contains five parts:
> >
> >  (1) Some helper patches, including provision of an ITER_XARRAY iov
> >      iterator and a function to do readahead expansion.
> >
> >  (2) Patches to add the netfs helper library.
> >
> >  (3) A patch to add the fscache/cachefiles kiocb API.
> >
> >  (4) Patches to add support in AFS for this.
> >
> >  (5) Patches from Jeff Layton to add support in Ceph for this.
> >
> > Dave Wysochanski also has patches for NFS for this, though they're not
> > included on this branch as there's an issue with PNFS.
> >
> > With this, AFS without a cache passes all expected xfstests; with a cache,
> > there's an extra failure, but that's also there before these patches.
> > Fixing that probably requires a greater overhaul.  Ceph and NFS also pass
> > the expected tests.
> >
> > These patches can be found also on:
> >
> >       https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=fscache-netfs-lib
> >
> > For diffing reference, the tag for the 9th Feb pull request is
> > fscache-ioapi-20210203 and can be found in the same repository.
> >
> >
> >
> > Changes
> > =======
> >
> >  (v3) Rolled in the bug fixes.
> >
> >       Adjusted the functions that unlock and wait for PG_fscache according
> >       to Linus's suggestion.
> >
> >       Hold a ref on a page when PG_fscache is set as per Linus's
> >       suggestion.
> >
> >       Dropped NFS support and added Ceph support.
> >
> >  (v2) Fixed some bugs and added NFS support.
> >
> >
> > References
> > ==========
> >
> > These patches have been published for review before, firstly as part of a
> > larger set:
> >
> > Link: https://lore.kernel.org/linux-fsdevel/158861203563.340223.7585359869938129395.stgit@warthog.procyon.org.uk/
> >
> > Link: https://lore.kernel.org/linux-fsdevel/159465766378.1376105.11619976251039287525.stgit@warthog.procyon.org.uk/
> > Link: https://lore.kernel.org/linux-fsdevel/159465784033.1376674.18106463693989811037.stgit@warthog.procyon.org.uk/
> > Link: https://lore.kernel.org/linux-fsdevel/159465821598.1377938.2046362270225008168.stgit@warthog.procyon.org.uk/
> >
> > Link: https://lore.kernel.org/linux-fsdevel/160588455242.3465195.3214733858273019178.stgit@warthog.procyon.org.uk/
> >
> > Then as a cut-down set:
> >
> > Link: https://lore.kernel.org/linux-fsdevel/161118128472.1232039.11746799833066425131.stgit@warthog.procyon.org.uk/
> >
> > Link: https://lore.kernel.org/linux-fsdevel/161161025063.2537118.2009249444682241405.stgit@warthog.procyon.org.uk/
> >
> >
> > Proposals/information about the design have been published here:
> >
> > Link: https://lore.kernel.org/lkml/24942.1573667720@warthog.procyon.org.uk/
> > Link: https://lore.kernel.org/linux-fsdevel/2758811.1610621106@warthog.procyon.org.uk/
> > Link: https://lore.kernel.org/linux-fsdevel/1441311.1598547738@warthog.procyon.org.uk/
> > Link: https://lore.kernel.org/linux-fsdevel/160655.1611012999@warthog.procyon.org.uk/
> >
> > And requests for information:
> >
> > Link: https://lore.kernel.org/linux-fsdevel/3326.1579019665@warthog.procyon.org.uk/
> > Link: https://lore.kernel.org/linux-fsdevel/4467.1579020509@warthog.procyon.org.uk/
> > Link: https://lore.kernel.org/linux-fsdevel/3577430.1579705075@warthog.procyon.org.uk/
> >
> > The NFS parts, though not included here, have been tested by someone who's
> > using fscache in production:
> >
> > Link: https://listman.redhat.com/archives/linux-cachefs/2020-December/msg00000.html
> >
> > I've posted partial patches to try and help 9p and cifs along:
> >
> > Link: https://lore.kernel.org/linux-fsdevel/1514086.1605697347@warthog.procyon.org.uk/
> > Link: https://lore.kernel.org/linux-cifs/1794123.1605713481@warthog.procyon.org.uk/
> > Link: https://lore.kernel.org/linux-fsdevel/241017.1612263863@warthog.procyon.org.uk/
> > Link: https://lore.kernel.org/linux-cifs/270998.1612265397@warthog.procyon.org.uk/
> >
> > David
> > ---
> > David Howells (27):
> >       iov_iter: Add ITER_XARRAY
> >       mm: Add an unlock function for PG_private_2/PG_fscache
> >       mm: Implement readahead_control pageset expansion
> >       vfs: Export rw_verify_area() for use by cachefiles
> >       netfs: Make a netfs helper module
> >       netfs, mm: Move PG_fscache helper funcs to linux/netfs.h
> >       netfs, mm: Add unlock_page_fscache() and wait_on_page_fscache()
> >       netfs: Provide readahead and readpage netfs helpers
> >       netfs: Add tracepoints
> >       netfs: Gather stats
> >       netfs: Add write_begin helper
> >       netfs: Define an interface to talk to a cache
> >       netfs: Hold a ref on a page when PG_private_2 is set
> >       fscache, cachefiles: Add alternate API to use kiocb for read/write to cache
> >       afs: Disable use of the fscache I/O routines
> >       afs: Pass page into dirty region helpers to provide THP size
> >       afs: Print the operation debug_id when logging an unexpected data version
> >       afs: Move key to afs_read struct
> >       afs: Don't truncate iter during data fetch
> >       afs: Log remote unmarshalling errors
> >       afs: Set up the iov_iter before calling afs_extract_data()
> >       afs: Use ITER_XARRAY for writing
> >       afs: Wait on PG_fscache before modifying/releasing a page
> >       afs: Extract writeback extension into its own function
> >       afs: Prepare for use of THPs
> >       afs: Use the fs operation ops to handle FetchData completion
> >       afs: Use new fscache read helper API
> >
> > Jeff Layton (6):
> >       ceph: disable old fscache readpage handling
> >       ceph: rework PageFsCache handling
> >       ceph: fix fscache invalidation
> >       ceph: convert readpage to fscache read helper
> >       ceph: plug write_begin into read helper
> >       ceph: convert ceph_readpages to ceph_readahead
> >
> >
> >  fs/Kconfig                    |    1 +
> >  fs/Makefile                   |    1 +
> >  fs/afs/Kconfig                |    1 +
> >  fs/afs/dir.c                  |  225 ++++---
> >  fs/afs/file.c                 |  470 ++++---------
> >  fs/afs/fs_operation.c         |    4 +-
> >  fs/afs/fsclient.c             |  108 +--
> >  fs/afs/inode.c                |    7 +-
> >  fs/afs/internal.h             |   58 +-
> >  fs/afs/rxrpc.c                |  150 ++---
> >  fs/afs/write.c                |  610 +++++++++--------
> >  fs/afs/yfsclient.c            |   82 +--
> >  fs/cachefiles/Makefile        |    1 +
> >  fs/cachefiles/interface.c     |    5 +-
> >  fs/cachefiles/internal.h      |    9 +
> >  fs/cachefiles/rdwr2.c         |  412 ++++++++++++
> >  fs/ceph/Kconfig               |    1 +
> >  fs/ceph/addr.c                |  535 ++++++---------
> >  fs/ceph/cache.c               |  125 ----
> >  fs/ceph/cache.h               |  101 +--
> >  fs/ceph/caps.c                |   10 +-
> >  fs/ceph/inode.c               |    1 +
> >  fs/ceph/super.h               |    1 +
> >  fs/fscache/Kconfig            |    1 +
> >  fs/fscache/Makefile           |    3 +-
> >  fs/fscache/internal.h         |    3 +
> >  fs/fscache/page.c             |    2 +-
> >  fs/fscache/page2.c            |  117 ++++
> >  fs/fscache/stats.c            |    1 +
> >  fs/internal.h                 |    5 -
> >  fs/netfs/Kconfig              |   23 +
> >  fs/netfs/Makefile             |    5 +
> >  fs/netfs/internal.h           |   97 +++
> >  fs/netfs/read_helper.c        | 1169 +++++++++++++++++++++++++++++++++
> >  fs/netfs/stats.c              |   59 ++
> >  fs/read_write.c               |    1 +
> >  include/linux/fs.h            |    1 +
> >  include/linux/fscache-cache.h |    4 +
> >  include/linux/fscache.h       |   40 +-
> >  include/linux/netfs.h         |  195 ++++++
> >  include/linux/pagemap.h       |    3 +
> >  include/net/af_rxrpc.h        |    2 +-
> >  include/trace/events/afs.h    |   74 +--
> >  include/trace/events/netfs.h  |  201 ++++++
> >  mm/filemap.c                  |   20 +
> >  mm/readahead.c                |   70 ++
> >  net/rxrpc/recvmsg.c           |    9 +-
> >  47 files changed, 3473 insertions(+), 1550 deletions(-)
> >  create mode 100644 fs/cachefiles/rdwr2.c
> >  create mode 100644 fs/fscache/page2.c
> >  create mode 100644 fs/netfs/Kconfig
> >  create mode 100644 fs/netfs/Makefile
> >  create mode 100644 fs/netfs/internal.h
> >  create mode 100644 fs/netfs/read_helper.c
> >  create mode 100644 fs/netfs/stats.c
> >  create mode 100644 include/linux/netfs.h
> >  create mode 100644 include/trace/events/netfs.h
> >
> >
>
> Thanks David,
>
> I did an xfstests run on ceph with a kernel based on this and it seemed
> to do fine. I'll plan to pull this into the ceph-client/testing branch
> and run it through the ceph kclient test harness. There are only a few
> differences from the last run we did, so I'm not expecting big changes,
> but I'll keep you posted.
>
> --
> Jeff Layton <jlayton@redhat.com>
>


-- 
Thanks,

Steve
