linux-nfs.vger.kernel.org archive mirror
From: "Benjamin Coddington" <bcodding@redhat.com>
To: "Chuck Lever" <chuck.lever@oracle.com>
Cc: "Murphy Zhou" <jencce.kernel@gmail.com>,
	"Trond Myklebust" <trondmy@hammerspace.com>,
	"Linux NFS Mailing List" <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH] NFSv4: fix stateid refreshing when CLOSE racing with OPEN
Date: Tue, 08 Sep 2020 08:43:47 -0400
Message-ID: <3ABDF891-C6AE-4EA8-BCB1-D8BD3015280C@redhat.com>
In-Reply-To: <BE205FBB-E5BC-40B2-8F3D-B7B6A7EBEB53@oracle.com>

On 4 Sep 2020, at 10:14, Chuck Lever wrote:

>> On Sep 4, 2020, at 6:55 AM, Benjamin Coddington <bcodding@redhat.com> wrote:
>>
>> On 3 Sep 2020, at 23:04, Murphy Zhou wrote:
>>
>>> Hi Benjamin,
>>>
>>> On Thu, Sep 03, 2020 at 01:54:26PM -0400, Benjamin Coddington wrote:
>>>>
>>>> On 11 Oct 2019, at 10:14, Trond Myklebust wrote:
>>>>> On Fri, 2019-10-11 at 16:49 +0800, Murphy Zhou wrote:
>>>>>> On Thu, Oct 10, 2019 at 02:46:40PM +0000, Trond Myklebust wrote:
>>>>>>> On Thu, 2019-10-10 at 15:40 +0800, Murphy Zhou wrote:
>>>> ...
>>>>>>>> @@ -3367,14 +3368,16 @@ static bool nfs4_refresh_open_old_stateid(nfs4_stateid *dst,
>>>>>>>> 			break;
>>>>>>>> 		}
>>>>>>>> 		seqid_open = state->open_stateid.seqid;
>>>>>>>> -		if (read_seqretry(&state->seqlock, seq))
>>>>>>>> -			continue;
>>>>>>>>
>>>>>>>> 		dst_seqid = be32_to_cpu(dst->seqid);
>>>>>>>> -		if ((s32)(dst_seqid - be32_to_cpu(seqid_open)) >= 0)
>>>>>>>> +		if ((s32)(dst_seqid - be32_to_cpu(seqid_open)) > 0)
>>>>>>>> 			dst->seqid = cpu_to_be32(dst_seqid + 1);
>>>>>>>
>>>>>>> This negates the whole intention of the patch you reference in the
>>>>>>> 'Fixes:', which was to allow us to CLOSE files even if seqid bumps
>>>>>>> have been lost due to interrupted RPC calls e.g. when using 'soft' or
>>>>>>> 'softerr' mounts.
>>>>>>> With the above change, the check could just be tossed out altogether,
>>>>>>> because dst_seqid will never become larger than seqid_open.
>>>>>>
>>>>>> Hmm.. I got it wrong. Thanks for the explanation.
>>>>>
>>>>> So to be clear: I'm not saying that what you describe is not a problem.
>>>>> I'm just saying that the fix you propose is really no better than
>>>>> reverting the entire patch. I'd prefer not to do that, and would rather
>>>>> see us look for ways to fix both problems, but if we can't find such a
>>>>> fix then that would be the better solution.
>>>>
>>>> Hi Trond and Murphy Zhou,
>>>>
>>>> Sorry to resurrect this old thread, but I'm wondering if any progress was
>>>> made on this front.
>>>
>>> This failure stopped showing up since the v5.6-rc1 release cycle
>>> in my records. Can you reproduce this on the latest upstream kernel?
>>
>> I'm seeing it on generic/168 on a v5.8 client against a v5.3 knfsd server.
>> When I test against v5.8 server, the test takes longer to complete and I
>> have yet to reproduce the livelock.
>>
>> - on v5.3 server takes ~50 iterations to produce, each test completes in
>>   ~40 seconds
>> - on v5.8 server my test has run ~750 iterations without getting into
>>   the lock, each test takes ~60 seconds.
>>
>> I suspect recent changes to the server have changed the timing of open
>> replies such that the problem isn't reproduced on the client.
>
> The Linux NFS server in v5.4 does behave differently than earlier
> kernels with NFSv4.0, and it is performance-related. The filecache
> went into v5.4, and that seems to change the frequency at which
> the server offers delegations.

Just a point of reference - finally reproduced it on a v5.8 server after
4900 runs.  This took several days, and helped to heat the basement.

Ben
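
For anyone reading the quoted hunk cold: the (s32) cast in that check is
a wrap-safe way to compare two 32-bit sequence counters, and the patch
under discussion only changes ">=" to ">". Below is a minimal userspace
sketch of the two comparisons. The helper names are hypothetical and are
not kernel functions; it also assumes the usual two's-complement
conversion from uint32_t to int32_t, which is what gcc and clang do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helpers for illustration only; not part of fs/nfs.
 * The unsigned subtraction wraps modulo 2^32, and reading the result
 * as a signed value tells us which counter is ahead, provided the two
 * are less than 2^31 apart.
 */

/* Comparison before the proposed change: ">= 0". */
static bool seqid_at_or_after(uint32_t a, uint32_t b)
{
	return (int32_t)(uint32_t)(a - b) >= 0;
}

/* Comparison after the proposed change: "> 0". */
static bool seqid_strictly_after(uint32_t a, uint32_t b)
{
	return (int32_t)(uint32_t)(a - b) > 0;
}

/* Small self-check showing where the two comparisons differ. */
static void seqid_self_check(void)
{
	/* Equal seqids: only the ">=" form takes the branch. */
	assert(seqid_at_or_after(5, 5));
	assert(!seqid_strictly_after(5, 5));

	/* Across the wrap: 2 is three steps ahead of 0xFFFFFFFF. */
	assert(seqid_at_or_after(2, UINT32_MAX));
	assert(!seqid_at_or_after(UINT32_MAX, 2));
}
```

The only input on which the two forms disagree is the equal-seqid case,
which is the case the ">=" to ">" change in the hunk affects.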



