linux-nfs.vger.kernel.org archive mirror
From: Mike Javorski <mike.javorski@gmail.com>
To: NeilBrown <neilb@suse.de>
Cc: Mel Gorman <mgorman@suse.com>,
	Chuck Lever III <chuck.lever@oracle.com>,
	Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: Re: NFS server regression in kernel 5.13 (tested w/ 5.13.9)
Date: Thu, 26 Aug 2021 23:11:12 -0700
Message-ID: <CAOv1SKDTcg5WDp5zf3ZGL0enJ7K693W-9TMYKcrgweyzp6Qjhg@mail.gmail.com>
In-Reply-To: <163004202961.7591.12633163545286005205@noble.neil.brown.name>

Neil:

I am compiling a 5.13.13 kernel right now with the patch that Chuck
suggested earlier. I am doing a full compile matching the distro
build, as I don't have a targeted kernel config ready to go (it's been
years) and I want to test like for like anyway. It should be ready to
install in the AM, my time, so I will test with that first tomorrow
and see if it resolves the issue. If not, I will report back and then
try your revert suggestion. On the issue of memory, though: my server
has 16GB of memory (free currently shows ~1GB unused and ~11GB in
buffers/caches), so this really shouldn't be an available-memory
issue, but I guess we'll find out.
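
For anyone comparing memory figures on their own server, here is a minimal
sketch of pulling the same numbers that free reports out of /proc/meminfo.
The function names are hypothetical, and Buffers + Cached is only a rough
stand-in for free's buff/cache column, which counts a little more.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def summarize(info):
    """Return (free, cache, available) in GiB; missing fields count as 0."""
    kib_per_gib = 1024 * 1024
    free = info.get("MemFree", 0) / kib_per_gib
    # Approximation of free's buff/cache column (assumption, see lead-in).
    cache = (info.get("Buffers", 0) + info.get("Cached", 0)) / kib_per_gib
    available = info.get("MemAvailable", 0) / kib_per_gib
    return free, cache, available


def read_meminfo(path="/proc/meminfo"):
    """Read and summarize the live values on a Linux host."""
    with open(path) as f:
        return summarize(parse_meminfo(f.read()))
```

On the NFS server, read_meminfo() returns the three values in GiB. A large
cache figure with a small free figure is normal on its own; it does not by
itself show whether page allocation is stalling.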

Thanks for the info.

- mike

On Thu, Aug 26, 2021 at 10:27 PM NeilBrown <neilb@suse.de> wrote:
>
>
> [[Mel: if you read through to the end you'll see why I cc:ed you on this]]
>
> On Fri, 27 Aug 2021, Mike Javorski wrote:
> > I just tried the same mount with 4 different nfsvers values: 3, 4.0, 4.1 and 4.2
> >
> > Initially I thought it might be "working" because I only got freezes
> > with 4.2 at first, but I went back and re-tested (to be sure) and got
> > freezes with all 4 versions. So the nfsvers setting doesn't seem to
> > have an impact. I did verify at each pass that the 'nfsvers=' value
> > was present and correct in the mount output.
> >
> > FYI: another user posted on the archlinux reddit with a similar issue,
> > I suggested they try with a 5.12 kernel and that "solved" the issue
> > for them as well.
>
> Well... I have good news and I have bad news.
>
> First the good.
> I reviewed all the symptoms again, and browsed the commits between
> working and not-working, and the only pattern that made any sense was
> that there was some issue with memory allocation.  The pauses - I
> reasoned - were most likely pauses while allocating memory.
>
> So instead of testing in a VM with 2G of RAM, I tried 512MB, and
> suddenly the problem was trivial to reproduce.  Specifically I created a
> (sparse) 1GB file on the test VM, exported it over NFS, and ran "md5sum"
> on the file from an NFS client.  With 5.12 this reliably takes about 90 seconds
> (as it does with 2G RAM).  On 5.13 and 512MB RAM, it usually takes a lot
> longer: 5, 6, 7, or 8 minutes (and assorted seconds).
>
> The most questionable nfsd/ memory related patch in 5.13 is
>
>  Commit f6e70aab9dfe ("SUNRPC: refresh rq_pages using a bulk page allocator")
>
> I reverted that and now the problem is no longer there.  Gone.  90 seconds
> every time.
>
> Now the bad news: I don't know why.  That patch should be a good patch,
> with a small performance improvement, particularly at very high loads.
> (maybe even a big improvement at very high loads).
> The problem must be in alloc_pages_bulk_array(), which is a new
> interface, so not possible to bisect.
>
> So I might have a look at the code next week, but I've cc:ed Mel Gorman
> in case he comes up with some ideas sooner.
>
> For now, you can just revert that patch.
>
> Thanks for all the testing you did!!  It certainly helped.
>
> NeilBrown
>
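
The reproduction described in the quoted message (a sparse 1GB file on the
server, checksummed from an NFS client) can be sketched roughly as below.
This is an illustration only, not the procedure Neil actually ran: he used
md5sum directly, and the function names and paths here are hypothetical.

```python
import hashlib
import os
import time


def make_sparse_file(path, size_bytes):
    """Create a file of the given size without writing any data blocks."""
    with open(path, "wb") as f:
        f.truncate(size_bytes)


def timed_md5(path, chunk_size=1024 * 1024):
    """Read the file in chunks; return (hex digest, elapsed seconds)."""
    digest = hashlib.md5()
    start = time.monotonic()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
    return digest.hexdigest(), time.monotonic() - start
```

On the server one would call make_sparse_file("/srv/export/sparse.img",
1 << 30), and on the client timed_md5("/mnt/nfs/sparse.img"); both paths
are placeholders. Comparing the elapsed time between a 5.12 and a 5.13
server with the same small amount of RAM is the point of the test.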

Thread overview: 46+ messages
2021-08-08 22:37 NFS server regression in kernel 5.13 (tested w/ 5.13.9) Mike Javorski
2021-08-08 22:47 ` Chuck Lever III
2021-08-08 23:23   ` Mike Javorski
2021-08-09  0:01 ` NeilBrown
2021-08-09  0:28   ` Mike Javorski
2021-08-10  0:50     ` Mike Javorski
2021-08-10  1:28       ` NeilBrown
2021-08-10 11:54         ` Daire Byrne
2021-08-13  1:51         ` Mike Javorski
2021-08-13  2:39           ` NeilBrown
2021-08-13  2:53             ` Mike Javorski
2021-08-15  1:23               ` Mike Javorski
2021-08-16  1:20                 ` NeilBrown
2021-08-16 13:21                   ` Chuck Lever III
2021-08-16 16:25                     ` Mike Javorski
2021-08-16 23:01                       ` NeilBrown
2021-08-20  0:31                         ` NeilBrown
2021-08-20  0:52                           ` Mike Javorski
2021-08-22  0:17                             ` Mike Javorski
2021-08-22  3:41                               ` NeilBrown
2021-08-22  4:05                                 ` Mike Javorski
2021-08-22 22:00                                   ` NeilBrown
2021-08-26 19:34                                     ` Mike Javorski
2021-08-26 21:44                                       ` NeilBrown
2021-08-27  0:07                                         ` Mike Javorski
2021-08-27  5:27                                           ` NeilBrown
2021-08-27  6:11                                             ` Mike Javorski [this message]
2021-08-27  7:14                                               ` NeilBrown
2021-08-27 14:13                                                 ` Chuck Lever III
2021-08-27 17:07                                                   ` Mike Javorski
2021-08-27 22:00                                                     ` Mike Javorski
2021-08-27 23:49                                                       ` Chuck Lever III
2021-08-28  3:22                                                         ` Mike Javorski
2021-08-28 18:23                                                           ` Chuck Lever III
2021-08-29 22:36                                                             ` [PATCH] SUNRPC: don't pause on incomplete allocation NeilBrown
2021-08-30  9:12                                                               ` Mel Gorman
2021-08-30 20:46                                                               ` J. Bruce Fields
     [not found]                                                             ` <163027609524.7591.4987241695872857175@noble.neil.brown.name>
2021-08-30  9:11                                                               ` [PATCH] MM: clarify effort used in alloc_pages_bulk_*() Mel Gorman
2021-09-04 17:41                                                             ` NFS server regression in kernel 5.13 (tested w/ 5.13.9) Mike Javorski
2021-09-05  2:02                                                               ` Chuck Lever III
2021-09-16  2:45                                                                 ` Mike Javorski
2021-09-16 18:58                                                                   ` Chuck Lever III
2021-09-16 19:21                                                                     ` Mike Javorski
2021-09-17 14:41                                                                       ` J. Bruce Fields
2021-08-16 16:09                   ` Mike Javorski
2021-08-16 23:04                     ` NeilBrown
