From: David Wysochanski <dwysocha@redhat.com>
To: Chuck Lever <chuck.lever@oracle.com>
Cc: Trond Myklebust <trondmy@hammerspace.com>,
	Anna Schumaker <anna.schumaker@netapp.com>,
	Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
	David Howells <dhowells@redhat.com>
Subject: Re: [PATCH v1 0/13] Convert NFS to new netfs and fscache APIs
Date: Sat, 21 Nov 2020 13:28:57 -0500	[thread overview]
Message-ID: <CALF+zOnqDeFS+WHe8XUAhbzhkYOBMp2JrAFvqcHqxKsBDzycwA@mail.gmail.com> (raw)
In-Reply-To: <6773E22E-57CC-4555-8B27-2B52034DD24D@oracle.com>

On Sat, Nov 21, 2020 at 12:16 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>
>
>
> > On Nov 21, 2020, at 12:01 PM, David Wysochanski <dwysocha@redhat.com> wrote:
> >
> > On Sat, Nov 21, 2020 at 11:14 AM Chuck Lever <chuck.lever@oracle.com> wrote:
> >>
> >> Hi Dave-
> >>
> >>> On Nov 21, 2020, at 8:29 AM, Dave Wysochanski <dwysocha@redhat.com> wrote:
> >>>
> >>> These patches update the NFS client to use the new netfs and fscache
> >>> APIs and are at:
> >>> https://github.com/DaveWysochanskiRH/kernel.git
> >>> https://github.com/DaveWysochanskiRH/kernel/commit/94e9633d98a5542ea384b1095290ac6f915fc917
> >>> https://github.com/DaveWysochanskiRH/kernel/releases/tag/fscache-iter-nfs-20201120
> >>>
> >>> The patches are based on David Howells fscache-iter tree at
> >>> https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=fscache-iter
> >>>
> >>> The first 6 patches refactor some of the NFS read code to facilitate
> >>> re-use, the next 6 patches do the conversion to the new API, and the
> >>> last patch is a somewhat awkward fix for a problem seen in final
> >>> testing.
> >>>
> >>> Per David Howells' recent post, note that the new fscache API is
> >>> divided into two separate APIs, a 'netfs' API and an 'fscache' API.
> >>> The netfs API was done to help simplify the IO paths of network
> >>> filesystems, and can be called even when fscache is disabled, thus
> >>> simplifying both readpage and readahead implementations.  However,
> >>> for now these NFS conversion patches only call the netfs API when
> >>> fscache is enabled, similar to the existing NFS code.
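
To make the netfs/fscache split concrete, here is a rough sketch (not
code from this series) of how the NFS readahead entry point hands work
to the netfs layer; the ops fields and signatures follow the
fscache-iter branch as of this posting, and the nfs_netfs_* names are
illustrative only:

static const struct netfs_read_request_ops nfs_netfs_read_ops = {
	.init_rreq	= nfs_init_rreq,	/* attach per-request NFS state */
	.issue_op	= nfs_issue_op,		/* send the READ rpc(s) for one subrequest */
	/* ... is_cache_enabled, clamp_length, done, cleanup, etc. ... */
};

/* Hooked up as the address_space .readahead op; netfs decides, region
 * by region, whether to read from the cache or to call ->issue_op to
 * go to the server. */
static void nfs_netfs_readahead(struct readahead_control *ractl)
{
	netfs_readahead(ractl, &nfs_netfs_read_ops, NULL);
}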
> >>>
> >>> Trond and Anna, I would appreciate your guidance on this patchset.
> >>> At least I would like your feedback regarding the direction
> >>> you would like these patches to go, in particular, the following
> >>> items:
> >>>
> >>> 1. Whether you are ok with using the netfs API unconditionally even
> >>> when fscache is disabled, or prefer this "least invasive to NFS"
> >>> approach.  Note the unconditional use of the netfs API is the
> >>> recommended approach per David's post and the AFS and CEPH
> >>> implementations, but I was unsure if you would accept this
> >>> approach or would prefer to minimize changes to NFS.  Note if
> >>> we keep the current approach to minimize NFS changes, we will
> >>> have to address some problems with page unlocking such as with
> >>> patch 13 in the series.
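
On option 1, a minimal sketch of the "least invasive" dispatch,
assuming a hypothetical nfs_fscache_enabled() predicate and a legacy
path named nfs_readpage_locked() here for illustration (neither name
is from the series; the netfs_readpage() signature is per the
fscache-iter branch):

int nfs_readpage(struct file *file, struct page *page)
{
	struct inode *inode = page_file_mapping(page)->host;

	/* Only route through netfs when caching is in play; otherwise
	 * the existing NFS read path is untouched. */
	if (nfs_fscache_enabled(inode))
		return netfs_readpage(file, page, &nfs_netfs_read_ops, NULL);

	return nfs_readpage_locked(file, page);
}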
> >>>
> >>> 2. Whether to keep the NFS fscache implementation as "read only"
> >>> or whether to add write-through support.  Today we only enable
> >>> fscache when a file is open read-only, and disable / invalidate
> >>> the cache when a file is open for write.
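
For reference, today's on/off behaviour boils down to roughly the
following at open time (heavily condensed from fs/nfs/fscache.c; the
aux data bookkeeping is elided and NULLed out below, and the cookie
calls use the old fscache API):

void nfs_fscache_open_file(struct inode *inode, struct file *filp)
{
	struct fscache_cookie *cookie = nfs_i_fscache(inode);

	if (inode_is_open_for_write(inode)) {
		/* A writer appeared: stop caching and invalidate
		 * whatever the cache holds for this inode. */
		fscache_disable_cookie(cookie, NULL /* aux elided */, true);
	} else {
		/* Read-only open: (re-)enable caching for the inode. */
		fscache_enable_cookie(cookie, NULL /* aux elided */,
				      i_size_read(inode), NULL, NULL);
	}
}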
> >>>
> >>> Still TODO
> >>> 1. Address known issues (lockdep, page unlocking), depending on
> >>> what is decided as far as implementation direction
> >>> a) nfs_issue_op: takes rcu_read_lock but may call nfs_page_alloc()
> >>> with GFP_KERNEL, which may sleep (dhowells noted this in a review;
> >>> see the sketch after this list)
> >>> b) nfs_refresh_inode() takes inode->i_lock but may call
> >>> __fscache_invalidate(), which may sleep (found with lockdep)
> >>> 2. Fixup NFS fscache stats (NFSIOS_FSCACHE_*)
> >>> * Compare with netfs stats and determine if still needed
> >>> 3. Cleanup dfprintks and/or convert to tracepoints
> >>> 4. Further tests (see "Not tested yet")
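
The shape of known issue (a) is a sleeping allocation inside an RCU
read-side critical section; the snippet below is a distillation, not
code from the series. Issue (b) is the same pattern with inode->i_lock
held across a potentially sleeping __fscache_invalidate().

	rcu_read_lock();	/* no sleeping allowed from here on */
	req = nfs_page_alloc();	/* GFP_KERNEL allocation: may sleep */
	rcu_read_unlock();

	/* Possible fixes (assumptions, nothing settled): do the
	 * allocation before taking rcu_read_lock(), or switch to a
	 * non-sleeping GFP mode and handle allocation failure. */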
> >>
> >> Can you say whether your approach has any performance impact?
> > No, I cannot.
> >
> >> In particular, what comparative benchmarks have been run?
> >>
> > No comparisons so far, but note the last bullet - "performance".
> >
> > Are you wondering about performance with/without fscache for this
> > series, or the old vs new fscache, or something else?
>
> I'd like to have some explicit performance-related merge worthiness
> criteria. For example: "No performance regressions, and here's how
> we're going to determine that we're good: fio / iozone / yada with
> NFS/TCP and NFS/RDMA on 100GbE; for very little additional CPU
> cost measured via perf xyzzy. Also some benchmark that measures lock
> contention."
>
> We haven't been especially careful about this in the past when
> reworking the client's primary I/O paths. Nothing unreasonable, but
> it should be stated up front where we want to end up.
>
Makes sense.

> Another approach might be: we're going to start by making fscache
> opt-in. As confidence increases over time and good performance is
> demonstrated, then we'll unify the fscache and non-cached I/O paths.
>
It sounds like what you want is what I've done in this first implementation.
It takes a "least invasive to NFS" approach, staying with the old fscache
on/off logic, even though that is neither ideal nor what was recommended
as the end game for the netfs API.

>
> >>> Checks run
> >>> 1. checkpatch: PASS*
> >>> * a few warnings, mostly trivial fixups, some unrelated to this set
> >>> 2. kernel builds with each patch: PASS
> >>> * each patch in the series builds cleanly, which ensures bisectability
> >>>
> >>> Tests run
> >>> 1. Custom NFS+fscache unit tests for basic operation: PASS*
> >>> * no oops or data corruptions
> >>> * Some op counts are a bit off, but these are mostly due
> >>>   to statistics not yet being implemented properly (NFSIOS_FSCACHE_*)
> >>> 2. cthon04: PASS (test options "-b -g -s -l", fsc,vers=3,4.0,4.1,4.2,sec=sys)
> >>> * No failures or oopses for any version or test options
> >>> 3. iozone tests (fsc,vers=3,4.0,4.1,4.2,sec=sys): PASS
> >>> * No failures or oopses
> >>> 4. kernel build (fsc,vers=3,4.1,4.2): PASS*
> >>> * all builds finish without errors or data corruption
> >>> * one lockdep "scheduling while atomic" splat fired with NFSv4.1,
> >>>   due to an fscache invalidation code path (known issue 'b' above)
> >>> 5. xfstests/generic (fsc,vers=4.2, nofsc,vers=4.2): PASS*
> >>>  * generic/013 (passes, but triggers the i_lock lockdep warning; known issue 'a' above)
> >>>  * NOTE: The following tests failed with errors, but they
> >>>    also fail on vanilla 5.10-rc4 so are not related to this
> >>>    patchset
> >>>    * generic/074 (lockdep invalid wait context - nfs_free_request())
> >>>    * generic/538 (short read)
> >>>    * generic/551 (pread: Unknown error 524, Data verification fail)
> >>>    * generic/568 (ERROR: File grew from 4096 B to 8192 B when writing to the fallocated range)
> >>>
> >>> Not tested yet:
> >>> * error injections (for example, connection disruptions, server errors during IO, etc)
> >>> * pNFS
> >>> * many process mixed read/write on same file
> >>> * performance
> >>>
> >>> Dave Wysochanski (13):
> >>> NFS: Clean up nfs_readpage() and nfs_readpages()
> >>> NFS: In nfs_readpage() only increment NFSIOS_READPAGES when read
> >>>   succeeds
> >>> NFS: Refactor nfs_readpage() and nfs_readpage_async() to use
> >>>   nfs_readdesc
> >>> NFS: Call readpage_async_filler() from nfs_readpage_async()
> >>> NFS: Add nfs_pageio_complete_read() and remove nfs_readpage_async()
> >>> NFS: Allow internal use of read structs and functions
> >>> NFS: Convert fscache_acquire_cookie and fscache_relinquish_cookie
> >>> NFS: Convert fscache_enable_cookie and fscache_disable_cookie
> >>> NFS: Convert fscache invalidation and update aux_data and i_size
> >>> NFS: Convert to the netfs API and nfs_readpage to use netfs_readpage
> >>> NFS: Convert readpage to readahead and use netfs_readahead for fscache
> >>> NFS: Allow NFS use of new fscache API in build
> >>> NFS: Ensure proper page unlocking when reads fail with retryable
> >>>   errors
> >>>
> >>> fs/nfs/Kconfig             |   2 +-
> >>> fs/nfs/direct.c            |   2 +
> >>> fs/nfs/file.c              |  22 ++--
> >>> fs/nfs/fscache-index.c     |  94 --------------
> >>> fs/nfs/fscache.c           | 315 ++++++++++++++++++++-------------------------
> >>> fs/nfs/fscache.h           | 132 +++++++------------
> >>> fs/nfs/inode.c             |   4 +-
> >>> fs/nfs/internal.h          |   8 ++
> >>> fs/nfs/nfs4proc.c          |   2 +-
> >>> fs/nfs/pagelist.c          |   2 +
> >>> fs/nfs/read.c              | 248 ++++++++++++++++-------------------
> >>> fs/nfs/write.c             |   3 +-
> >>> include/linux/nfs_fs.h     |   5 +-
> >>> include/linux/nfs_iostat.h |   2 +-
> >>> include/linux/nfs_page.h   |   1 +
> >>> include/linux/nfs_xdr.h    |   1 +
> >>> 16 files changed, 339 insertions(+), 504 deletions(-)
> >>>
> >>> --
> >>> 1.8.3.1
> >>>
> >>
> >> --
> >> Chuck Lever
>
> --
> Chuck Lever
>
>
>


