linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Trond Myklebust <trondmy@hammerspace.com>
To: "dai.ngo@oracle.com" <dai.ngo@oracle.com>,
	"Anna.Schumaker@netapp.com" <Anna.Schumaker@netapp.com>
Cc: "linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>
Subject: Re: 'ls -lrt' performance issue on large dir while dir is being modified
Date: Wed, 15 Jan 2020 18:54:15 +0000	[thread overview]
Message-ID: <9fdf37ffe4b3f7016a60e3a61c2087a825348b28.camel@hammerspace.com> (raw)
In-Reply-To: <770937d3-9439-db4a-1f6e-59a59f2c08b9@oracle.com>

On Wed, 2020-01-15 at 10:11 -0800, Dai Ngo wrote:
> Hi Anna, Trond,
> 
> Would you please let me know your opinion regarding reverting the
> change in
> nfs_force_use_readdirplus to call nfs_zap_mapping instead of
> invalidate_mapping_pages.
> This change is to prevent the cookie of the READDIRPLUS to be reset
> to 0 while
> an instance of 'ls' is running and the directory is being modified.
> 
> > diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c index 
> > a73e2f8bd8ec..5d4a64555fa7 100644 --- a/fs/nfs/dir.c +++ 
> > b/fs/nfs/dir.c @@ -444,7 +444,7 @@ void 
> > nfs_force_use_readdirplus(struct inode *dir)      if 
> > (nfs_server_capable(dir, NFS_CAP_READDIRPLUS) &&          
> > !list_empty(&nfsi->open_files)) {          
> > set_bit(NFS_INO_ADVISE_RDPLUS, &nfsi->flags); -        
> > invalidate_mapping_pages(dir->i_mapping, 0, -1); +        
> > nfs_zap_mapping(dir, dir->i_mapping);      }  } 
> 
> Thanks,
> -Dai
> 
> On 12/19/19 8:01 PM, Dai Ngo wrote:
> > Hi Anna, Trond,
> > 
> > I made a mistake with the 5.5 numbers. The VM that runs 5.5 has
> > some
> > problems. There is no regression with 5.5, here are the new
> > numbers:
> > 
> > Upstream Linux 5.5.0-rc1 [ORI] 93296: 3m10.917s  197891: 10m35.789s
> > Upstream Linux 5.5.0-rc1 [MOD] 98614: 1m59.649s  192801: 3m55.003s
> > 
> > My apologies for the mistake.
> > 
> > Now there is no regression with 5.5, I'd like to get your opinion
> > regarding the change to revert the call from
> > invalidate_mapping_pages
> > to nfs_zap_mapping in nfs_force_use_readdirplus to prevent the
> > current 'ls' from restarting the READDIRPLUS3 from cookie 0. I'm
> > not quite sure about the intention of the prior change from
> > nfs_zap_mapping to invalidate_mapping_pages so that is why I'm
> > seeking advise. Or do you have any suggestions to achieve the same?
> > 
> > Thanks,
> > -Dai
> > 
> > On 12/17/19 4:34 PM, Dai Ngo wrote:
> > > Hi,
> > > 
> > > I'd like to report an issue with 'ls -lrt' on NFSv3 client takes
> > > a very long time to display the content of a large directory
> > > (100k - 200k files) while the directory is being modified by
> > > another NFSv3 client.
> > > 
> > > The problem can be reproduced using 3 systems. One system serves
> > > as the NFS server, one system runs as the client that doing the
> > > 'ls -lrt' and another system runs the client that creates files
> > > on the server.
> > >     Client1 creates files using this simple script:
> > > 
> > > > #!/bin/sh
> > > > 
> > > > if [ $# -lt 2 ]; then
> > > >         echo "Usage: $0 number_of_files base_filename"
> > > >         exit
> > > > fi    nfiles=$1
> > > > fname=$2
> > > > echo "creating $nfiles files using filename[$fname]..."
> > > > i=0         while [ i -lt $nfiles ] ;
> > > > do            i=`expr $i + 1`
> > > >         echo "xyz" > $fname$i
> > > >         echo "$fname$i" done
> > > 
> > > Client2 runs 'time ls -lrt /tmp/mnt/bd1 |wc -l' in a loop.
> > > 
> > > The network traces and dtrace probes showed numerous READDIRPLUS3
> > > requests restarting  from cookie 0 which seemed to indicate the
> > > cached pages of the directory were invalidated causing the pages
> > > to be refilled starting from cookie 0 until the current requested
> > > cookie.  The cached page invalidation were tracked to
> > > nfs_force_use_readdirplus().  To verify, I made the below
> > > modification, ran the test for various kernel versions and
> > > captured the results shown below.
> > > 
> > > The modification is:
> > > 
> > > > diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> > > > index a73e2f8bd8ec..5d4a64555fa7 100644
> > > > --- a/fs/nfs/dir.c
> > > > +++ b/fs/nfs/dir.c
> > > > @@ -444,7 +444,7 @@ void nfs_force_use_readdirplus(struct inode
> > > > *dir)
> > > >      if (nfs_server_capable(dir, NFS_CAP_READDIRPLUS) &&
> > > >          !list_empty(&nfsi->open_files)) {
> > > >          set_bit(NFS_INO_ADVISE_RDPLUS, &nfsi->flags);
> > > > -        invalidate_mapping_pages(dir->i_mapping, 0, -1);
> > > > +        nfs_zap_mapping(dir, dir->i_mapping);
> > > >      }
> > > >  }

This change is only reverting part of commit 79f687a3de9e. My problem
with that is as follows:

RFC1813 states that NFSv3 READDIRPLUS cookies and verifiers must match
those returned by previous READDIRPLUS calls, and READDIR cookies and
verifiers must match those returned by previous READDIR calls. It says
nothing about being able to assume cookies from READDIR and READDIRPLUS
calls are interchangeable. So the only reason I can see for the
invalidate_mapping_pages() is to ensure that we do separate the two
cookie caches.

OTOH, for NFSv4, there is no separate READDIRPLUS function, so there
really does not appear to be any reason to clear the page cache at all
as we're switching between requesting attributes or not.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com



  reply	other threads:[~2020-01-15 18:54 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-12-18  0:34 'ls -lrt' performance issue on large dir while dir is being modified Dai Ngo
2019-12-20  4:01 ` Dai Ngo
2020-01-15 18:11   ` Dai Ngo
2020-01-15 18:54     ` Trond Myklebust [this message]
2020-01-15 19:06       ` Trond Myklebust
2020-01-15 19:28         ` Dai Ngo
2020-01-18  2:29         ` Dai Ngo
2020-01-18 15:58           ` Trond Myklebust
2020-01-18 17:26             ` Chuck Lever
2020-01-18 17:31               ` Trond Myklebust
2020-01-18 18:03                 ` Dai Ngo
2020-01-20 20:52                   ` Trond Myklebust

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9fdf37ffe4b3f7016a60e3a61c2087a825348b28.camel@hammerspace.com \
    --to=trondmy@hammerspace.com \
    --cc=Anna.Schumaker@netapp.com \
    --cc=dai.ngo@oracle.com \
    --cc=linux-nfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).