linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Chuck Lever III <chuck.lever@oracle.com>
To: Trond Myklebust <trondmy@hammerspace.com>, "jack@suse.cz" <jack@suse.cz>
Cc: Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
	"nfbrown@suse.com" <nfbrown@suse.com>,
	Bruce Fields <bfields@redhat.com>
Subject: Re: Performance regression with random IO pattern from the client
Date: Wed, 30 Mar 2022 15:38:38 +0000	[thread overview]
Message-ID: <FF014DFC-EC48-4CB7-A3F4-04FBB82E4A27@oracle.com> (raw)
In-Reply-To: <64a4832afd830d7c831ab687bc7a72cc791c2f0c.camel@hammerspace.com>



> On Mar 30, 2022, at 11:03 AM, Trond Myklebust <trondmy@hammerspace.com> wrote:
> 
> On Wed, 2022-03-30 at 12:34 +0200, Jan Kara wrote:
>> Hello,
>> 
>> during our performance testing we have noticed that commit
>> b6669305d35a
>> ("nfsd: Reduce the number of calls to nfsd_file_gc()") has introduced
>> a
>> performance regression when a client does random buffered writes. The
>> workload on NFS client is fio running 4 processed doing random
>> buffered writes to 4
>> different files and the files are large enough to hit dirty limits
>> and
>> force writeback from the client. In particular the invocation is
>> like:
>> 
>> fio --direct=0 --ioengine=sync --thread --directory=/mnt/mnt1 --
>> invalidate=1 --group_reporting=1 --runtime=300 --fallocate=posix --
>> ramp_time=10 --name=RandomReads-128000-4k-4 --new_group --
>> rw=randwrite --size=4000m --numjobs=4 --bs=4k --
>> filename_format=FioWorkloads.\$jobnum --end_fsync=1
>> 
>> The reason why commit b6669305d35a regresses performance is the
>> filemap_flush() call it adds into nfsd_file_put(). Before this commit
>> writeback on the server happened from nfsd_commit() code resulting in
>> rather long semisequential streams of 4k writes. After commit
>> b6669305d35a
>> all the writeback happens from filemap_flush() calls resulting in
>> much
>> longer average seek distance (IO from different files is more
>> interleaved)
>> and about 16-20% regression in the achieved writeback throughput when
>> the
>> backing store is rotational storage.
>> 
>> I think the filemap_flush() from nfsd_file_put() is indeed rather
>> aggressive and I think we'd be better off to just leave writeback to
>> either
>> nfsd_commit() or standard dirty page cleaning happening on the
>> system. I
>> assume the rationale for the filemap_flush() call was to make it more
>> likely the file can be evicted during the garbage collection run? Was
>> there
>> any particular problem leading to addition of this call or was it
>> just "it
>> seemed like a good idea" thing?
>> 
>> Thanks in advance for ideas.
>> 
>>                                                                 Honza
> 
> It was mainly introduced to reduce the amount of work that
> nfsd_file_free() needs to do. In particular when re-exporting NFS, the
> call to filp_close() can be expensive because it synchronously flushes
> out dirty pages. That again means that some of the calls to
> nfsd_file_dispose_list() can end up being very expensive (particularly
> the ones run by the garbage collector itself).

The "no regressions" rule suggests that some kind of action needs
to be taken. I don't have a sense of whether Jan's workload or NFS
re-export is the more common use case, however.

I can see that filp_close() on a file on an NFS mount could be
costly if that file has dirty pages, due to the NFS client's
"flush on close" semantic.

Trond, are there alternatives to flushing in the nfsd_file_put()
path? I'm willing to entertain bug fix patches rather than a
mechanical revert of b6669305d35a.


--
Chuck Lever




  reply	other threads:[~2022-03-30 15:39 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-30 10:34 Performance regression with random IO pattern from the client Jan Kara
2022-03-30 15:03 ` Trond Myklebust
2022-03-30 15:38   ` Chuck Lever III [this message]
2022-03-30 16:19     ` Trond Myklebust
2022-03-30 16:19     ` Jan Kara
2022-03-30 17:56       ` Chuck Lever III
2022-03-30 22:02         ` Trond Myklebust
2022-03-31 13:09           ` Jan Kara
2022-03-31 14:20             ` Chuck Lever III
2022-03-31 14:22               ` Chuck Lever III
2022-03-30 16:14   ` Jan Kara
2022-03-31  8:43 ` Thorsten Leemhuis

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=FF014DFC-EC48-4CB7-A3F4-04FBB82E4A27@oracle.com \
    --to=chuck.lever@oracle.com \
    --cc=bfields@redhat.com \
    --cc=jack@suse.cz \
    --cc=linux-nfs@vger.kernel.org \
    --cc=nfbrown@suse.com \
    --cc=trondmy@hammerspace.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).