All of lore.kernel.org
 help / color / mirror / Atom feed
From: Trond Myklebust <trondmy@hammerspace.com>
To: "aglo@umich.edu" <aglo@umich.edu>
Cc: "linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>,
	"Anna.Schumaker@netapp.com" <Anna.Schumaker@netapp.com>
Subject: Re: [PATCH v2 0/9] Various NFSv4 state error handling fixes
Date: Thu, 19 Sep 2019 23:42:09 +0000	[thread overview]
Message-ID: <8720be3295e3b0035396b9bec70231a628231c93.camel@hammerspace.com> (raw)
In-Reply-To: <CAN-5tyGX_Mb-wGTREtSWRFFSNK0qjgqLbm8SFPG=DPM7M2OWoQ@mail.gmail.com>

Hi Olga

On Thu, 2019-09-19 at 09:14 -0400, Olga Kornievskaia wrote:
> Hi Trond,
> 
> On Wed, Sep 18, 2019 at 9:49 PM Trond Myklebust <
> trondmy@hammerspace.com> wrote:
> > Hi Olga
> > 
> > On Wed, 2019-09-18 at 15:38 -0400, Olga Kornievskaia wrote:
> > > Hi Trond,
> > > 
> > > These set of patches do not address the locking problem. It's
> > > actually
> > > not the locking patch (which I thought it was as I reverted it
> > > and
> > > still had the issue). Without the whole patch series the unlock
> > > works
> > > fine so something in these new patches. Something is up with the
> > > 2
> > > patches:
> > > NFSv4: Handle NFS4ERR_OLD_STATEID in CLOSE/OPEN_DOWNGRADE
> > > NFSv4: Handle NFS4ERR_OLD_STATEID in LOCKU
> > > 
> > > If I remove either one separately, unlock fails but if I remove
> > > both
> > > unlock works.
> > 
> > Can you describe how you are testing this, and perhaps provide
> > wireshark traces that show how we're triggering these problems?
> 
> I'm triggering by running "nfstest_lock --nfsversion 4.1 --runtest
> btest01" against either linux or ontap servers (while the test
> doesn't
> fail but on the network trace you can see unlock failing with
> bad_stateid). Network trace attached.
> 
> But actually a simple test open, lock, unlock does the trick (network
> trace attached).
> fd1 = open(RDWR)
> fctl(fd1) (lock /unlock)


These traces really do not mesh with what I'm seeing using a simple
Connectathon lock test run. When I look at the wireshark output from
that, I see exadtly two cases where the stateid arguments are both
zero, and those are both SETATTR, so expected.

All the LOCKU are showing up as non-zero stateids, and so I'm seeing no
BAD_STATEID or OLD_STATEID errors at all.

Is there something special about how your test is running?

Cheers
  Trond

PS: Note: I do think I need a v3 of the LOCKU patch in order to add a
spinlock around the new copy of the lock stateid in
nfs4_alloc_unlockdata(). However I don't see how the missing spinlocks
could cause you to consistently be seeing an all-zero stateid argument.


> 
> > Thanks!
> >   Trond
> > 
> > > On Mon, Sep 16, 2019 at 4:46 PM Trond Myklebust <
> > > trondmy@gmail.com>
> > > wrote:
> > > > Various NFSv4 fixes to ensure we handle state errors correctly.
> > > > In
> > > > particular, we need to ensure that for COMPOUNDs like CLOSE and
> > > > DELEGRETURN, that may have an embedded LAYOUTRETURN, we handle
> > > > the
> > > > layout state errors so that a retry of either the LAYOUTRETURN,
> > > > or
> > > > the later CLOSE/DELEGRETURN does not corrupt the LAYOUTRETURN
> > > > reply.
> > > > 
> > > > Also ensure that if we get a NFS4ERR_OLD_STATEID, then we do
> > > > our
> > > > best to still try to destroy the state on the server, in order
> > > > to
> > > > avoid causing state leakage.
> > > > 
> > > > v2: Fix bug reports from Olga
> > > >  - Try to avoid sending old stateids on CLOSE/OPEN_DOWNGRADE
> > > > when
> > > >    doing fully serialised NFSv4.0.
> > > >  - Ensure LOCKU initialises the stateid correctly.
> > > > 
> > > > Trond Myklebust (9):
> > > >   pNFS: Ensure we do clear the return-on-close layout stateid
> > > > on
> > > > fatal
> > > >     errors
> > > >   NFSv4: Clean up pNFS return-on-close error handling
> > > >   NFSv4: Handle NFS4ERR_DELAY correctly in return-on-close
> > > >   NFSv4: Handle RPC level errors in LAYOUTRETURN
> > > >   NFSv4: Add a helper to increment stateid seqids
> > > >   pNFS: Handle NFS4ERR_OLD_STATEID on layoutreturn by bumping
> > > > the
> > > > state
> > > >     seqid
> > > >   NFSv4: Fix OPEN_DOWNGRADE error handling
> > > >   NFSv4: Handle NFS4ERR_OLD_STATEID in CLOSE/OPEN_DOWNGRADE
> > > >   NFSv4: Handle NFS4ERR_OLD_STATEID in LOCKU
> > > > 
> > > >  fs/nfs/nfs4_fs.h   |  11 ++-
> > > >  fs/nfs/nfs4proc.c  | 204 ++++++++++++++++++++++++++++++-------
> > > > ----
> > > > ----
> > > >  fs/nfs/nfs4state.c |  16 ----
> > > >  fs/nfs/pnfs.c      |  71 ++++++++++++++--
> > > >  fs/nfs/pnfs.h      |  17 +++-
> > > >  5 files changed, 229 insertions(+), 90 deletions(-)
> > > > 
> > > > --
> > > > 2.21.0
> > > > 
> > --
> > Trond Myklebust
> > Linux NFS client maintainer, Hammerspace
> > trond.myklebust@hammerspace.com
> > 
> > 
Trond Myklebust
CTO, Hammerspace Inc
4300 El Camino Real, Suite 105
Los Altos, CA 94022
www.hammer.space

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com



  parent reply	other threads:[~2019-09-19 23:42 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-16 20:44 [PATCH v2 0/9] Various NFSv4 state error handling fixes Trond Myklebust
2019-09-16 20:44 ` [PATCH v2 1/9] pNFS: Ensure we do clear the return-on-close layout stateid on fatal errors Trond Myklebust
2019-09-16 20:44   ` [PATCH v2 2/9] NFSv4: Clean up pNFS return-on-close error handling Trond Myklebust
2019-09-16 20:44     ` [PATCH v2 3/9] NFSv4: Handle NFS4ERR_DELAY correctly in return-on-close Trond Myklebust
2019-09-16 20:44       ` [PATCH v2 4/9] NFSv4: Handle RPC level errors in LAYOUTRETURN Trond Myklebust
2019-09-16 20:44         ` [PATCH v2 5/9] NFSv4: Add a helper to increment stateid seqids Trond Myklebust
2019-09-16 20:44           ` [PATCH v2 6/9] pNFS: Handle NFS4ERR_OLD_STATEID on layoutreturn by bumping the state seqid Trond Myklebust
2019-09-16 20:44             ` [PATCH v2 7/9] NFSv4: Fix OPEN_DOWNGRADE error handling Trond Myklebust
2019-09-16 20:44               ` [PATCH v2 8/9] NFSv4: Handle NFS4ERR_OLD_STATEID in CLOSE/OPEN_DOWNGRADE Trond Myklebust
2019-09-16 20:44                 ` [PATCH v2 9/9] NFSv4: Handle NFS4ERR_OLD_STATEID in LOCKU Trond Myklebust
2019-09-18 19:38 ` [PATCH v2 0/9] Various NFSv4 state error handling fixes Olga Kornievskaia
2019-09-19  1:49   ` Trond Myklebust
     [not found]     ` <CAN-5tyGX_Mb-wGTREtSWRFFSNK0qjgqLbm8SFPG=DPM7M2OWoQ@mail.gmail.com>
2019-09-19 23:42       ` Trond Myklebust [this message]
2019-09-20 14:25         ` Olga Kornievskaia
2019-09-20 14:54           ` Trond Myklebust

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8720be3295e3b0035396b9bec70231a628231c93.camel@hammerspace.com \
    --to=trondmy@hammerspace.com \
    --cc=Anna.Schumaker@netapp.com \
    --cc=aglo@umich.edu \
    --cc=linux-nfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.