All of lore.kernel.org
 help / color / mirror / Atom feed
From: Trond Myklebust <trondmy@hammerspace.com>
To: "chuck.lever@oracle.com" <chuck.lever@oracle.com>
Cc: "jlayton@redhat.com" <jlayton@redhat.com>,
	"linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"bfields@redhat.com" <bfields@redhat.com>
Subject: Re: [PATCH 00/16] Cache open file descriptors in knfsd
Date: Mon, 1 Jul 2019 15:17:05 +0000	[thread overview]
Message-ID: <98d5ef75d1fa2b8775f52d378ca7d8dd1a542ae1.camel@hammerspace.com> (raw)
In-Reply-To: <F1C28446-51F0-4999-AAA6-4FEA9371E36A@oracle.com>

On Mon, 2019-07-01 at 11:02 -0400, Chuck Lever wrote:
> Interesting work! Kudos to you and Jeff.
> 
> 
> > On Jun 30, 2019, at 9:52 AM, Trond Myklebust <trondmy@gmail.com>
> > wrote:
> > 
> > When a NFSv3 READ or WRITE request comes in, the first thing knfsd
> > has
> > to do is open a new file descriptor. While this is often a
> > relatively
> > inexpensive thing to do for most local filesystems, it is usually
> > less
> > so for FUSE, clustered or networked filesystems that are being
> > exported
> > by knfsd.
> 
> True, I haven't measured much effect if any of open and close on
> local
> file systems. It would be valuable if the cover letter provided a
> more
> quantified assessment of the cost for these other use cases. It
> sounds
> plausible to me that they would be more expensive, but I'm wondering
> if
> the additional complexity of an open file cache is warranted and
> effective. Do you have any benchmark results to share?
> 
> Are there particular workloads where you believe open caching will be
> especially beneficial?

I'd expect pretty much anything with a nontrivial open() method. i.e.:
FUSE, GFS2, OCFS2, CEPH, etc. to benefit.

I've seen no slowdowns so far with traditional filesystems: i.e. ext4
and xfs.

Note that the removal of the raparms cache does in many way compensate
for the new need to lookup the struct file.

> > This set of patches attempts to reduce some of that cost by caching
> > open file descriptors so that they may be reused by other incoming
> > READ/WRITE requests for the same file.
> 
> Is the open file cache a single cache per server? Wondering if there
> can be significant interference (eg lock contention or cache
> sloshing)
> between separate workloads on different exports, for example.

The file cache is global. Cache lookups are lockless (i.e. RCU
protected), so there is little contention for the case where there is
already an entry. For the case where we have to add an entry, there is
a mutex that might get contended in the case of workloads with lots of
small file open+closes.

> Do you have any benchmark results that show that removing the raparms
> cache is harmless?

The same information is carried in struct file. The whole raparms cache
was just a hack in order to allow us to port the readahead information
across struct file instances. Now that we are caching the struct file
itself, the raparms hack is unnecessary.

IOW: I haven't seen any slowdowns so far, however I don't have access
to a bleeding edge networking setup that would push this further.

> > One danger when doing this, is that knfsd may end up caching file
> > descriptors for files that have been unlinked. In order to deal
> > with
> > this issue, we use fsnotify to monitor the files, and have hooks to
> > evict those descriptors from the file cache if the i_nlink value
> > goes to 0.
> > 
> > Jeff Layton (12):
> >  sunrpc: add a new cache_detail operation for when a cache is
> > flushed
> >  locks: create a new notifier chain for lease attempts
> >  nfsd: add a new struct file caching facility to nfsd
> >  nfsd: hook up nfsd_write to the new nfsd_file cache
> >  nfsd: hook up nfsd_read to the nfsd_file cache
> >  nfsd: hook nfsd_commit up to the nfsd_file cache
> >  nfsd: convert nfs4_file->fi_fds array to use nfsd_files
> >  nfsd: convert fi_deleg_file and ls_file fields to nfsd_file
> >  nfsd: hook up nfs4_preprocess_stateid_op to the nfsd_file cache
> >  nfsd: have nfsd_test_lock use the nfsd_file cache
> >  nfsd: rip out the raparms cache
> >  nfsd: close cached files prior to a REMOVE or RENAME that would
> >    replace target
> > 
> > Trond Myklebust (4):
> >  notify: export symbols for use by the knfsd file cache
> >  vfs: Export flush_delayed_fput for use by knfsd.
> >  nfsd: Fix up some unused variable warnings
> >  nfsd: Fix the documentation for svcxdr_tmpalloc()
> > 
> > fs/file_table.c                  |   1 +
> > fs/locks.c                       |  62 +++
> > fs/nfsd/Kconfig                  |   1 +
> > fs/nfsd/Makefile                 |   3 +-
> > fs/nfsd/blocklayout.c            |   3 +-
> > fs/nfsd/export.c                 |  13 +
> > fs/nfsd/filecache.c              | 885
> > +++++++++++++++++++++++++++++++
> > fs/nfsd/filecache.h              |  60 +++
> > fs/nfsd/nfs4layouts.c            |  12 +-
> > fs/nfsd/nfs4proc.c               |  83 +--
> > fs/nfsd/nfs4state.c              | 183 ++++---
> > fs/nfsd/nfs4xdr.c                |  31 +-
> > fs/nfsd/nfssvc.c                 |  16 +-
> > fs/nfsd/state.h                  |  10 +-
> > fs/nfsd/trace.h                  | 140 +++++
> > fs/nfsd/vfs.c                    | 295 ++++-------
> > fs/nfsd/vfs.h                    |   9 +-
> > fs/nfsd/xdr4.h                   |  19 +-
> > fs/notify/fsnotify.h             |   2 -
> > fs/notify/group.c                |   2 +
> > fs/notify/mark.c                 |   6 +
> > include/linux/fs.h               |   5 +
> > include/linux/fsnotify_backend.h |   2 +
> > include/linux/sunrpc/cache.h     |   1 +
> > net/sunrpc/cache.c               |   3 +
> > 25 files changed, 1465 insertions(+), 382 deletions(-)
> > create mode 100644 fs/nfsd/filecache.c
> > create mode 100644 fs/nfsd/filecache.h
> > 
> > -- 
> > 2.21.0
> > 
> 
> --
> Chuck Lever
> 
> 
> 
-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com



  reply	other threads:[~2019-07-01 15:17 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-30 13:52 [PATCH 00/16] Cache open file descriptors in knfsd Trond Myklebust
2019-06-30 13:52 ` [PATCH 01/16] sunrpc: add a new cache_detail operation for when a cache is flushed Trond Myklebust
2019-06-30 13:52   ` [PATCH 02/16] locks: create a new notifier chain for lease attempts Trond Myklebust
2019-06-30 13:52     ` [PATCH 03/16] notify: export symbols for use by the knfsd file cache Trond Myklebust
2019-06-30 13:52       ` [PATCH 04/16] vfs: Export flush_delayed_fput for use by knfsd Trond Myklebust
2019-06-30 13:52         ` [PATCH 05/16] nfsd: add a new struct file caching facility to nfsd Trond Myklebust
2019-06-30 13:52           ` [PATCH 06/16] nfsd: hook up nfsd_write to the new nfsd_file cache Trond Myklebust
2019-06-30 13:52             ` [PATCH 07/16] nfsd: hook up nfsd_read to the " Trond Myklebust
2019-06-30 13:52               ` [PATCH 08/16] nfsd: hook nfsd_commit up " Trond Myklebust
2019-06-30 13:52                 ` [PATCH 09/16] nfsd: convert nfs4_file->fi_fds array to use nfsd_files Trond Myklebust
2019-06-30 13:52                   ` [PATCH 10/16] nfsd: convert fi_deleg_file and ls_file fields to nfsd_file Trond Myklebust
2019-06-30 13:52                     ` [PATCH 11/16] nfsd: hook up nfs4_preprocess_stateid_op to the nfsd_file cache Trond Myklebust
2019-06-30 13:52                       ` [PATCH 12/16] nfsd: have nfsd_test_lock use " Trond Myklebust
2019-06-30 13:52                         ` [PATCH 13/16] nfsd: rip out the raparms cache Trond Myklebust
2019-06-30 13:52                           ` [PATCH 14/16] nfsd: close cached files prior to a REMOVE or RENAME that would replace target Trond Myklebust
2019-06-30 13:52                             ` [PATCH 15/16] nfsd: Fix up some unused variable warnings Trond Myklebust
2019-06-30 13:52                               ` [PATCH 16/16] nfsd: Fix the documentation for svcxdr_tmpalloc() Trond Myklebust
2019-06-30 15:57           ` [PATCH 05/16] nfsd: add a new struct file caching facility to nfsd Matthew Wilcox
2019-06-30 16:15             ` Trond Myklebust
2019-06-30 15:27     ` [PATCH 02/16] locks: create a new notifier chain for lease attempts Matthew Wilcox
2019-06-30 15:50       ` Trond Myklebust
2019-07-01 15:02 ` [PATCH 00/16] Cache open file descriptors in knfsd Chuck Lever
2019-07-01 15:17   ` Trond Myklebust [this message]
2019-07-01 15:39     ` Chuck Lever
2019-07-31 22:05 ` J. Bruce Fields

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=98d5ef75d1fa2b8775f52d378ca7d8dd1a542ae1.camel@hammerspace.com \
    --to=trondmy@hammerspace.com \
    --cc=bfields@redhat.com \
    --cc=chuck.lever@oracle.com \
    --cc=jlayton@redhat.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.