Linux-NFS Archive on lore.kernel.org
 help / color / Atom feed
From: Olga Kornievskaia <aglo@umich.edu>
To: Trond Myklebust <trondmy@hammerspace.com>
Cc: "linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>,
	"Anna.Schumaker@netapp.com" <Anna.Schumaker@netapp.com>
Subject: Re: [PATCH v2 0/9] Various NFSv4 state error handling fixes
Date: Fri, 20 Sep 2019 10:25:37 -0400
Message-ID: <CAN-5tyHgP75cJL4SNb+-Q8iDaJOTPt6JUbjMQjJYvdZfBrdecA@mail.gmail.com> (raw)
In-Reply-To: <8720be3295e3b0035396b9bec70231a628231c93.camel@hammerspace.com>

Hi Trond,

On Thu, Sep 19, 2019 at 7:42 PM Trond Myklebust <trondmy@hammerspace.com> wrote:
>
> Hi Olga
>
> On Thu, 2019-09-19 at 09:14 -0400, Olga Kornievskaia wrote:
> > Hi Trond,
> >
> > On Wed, Sep 18, 2019 at 9:49 PM Trond Myklebust <
> > trondmy@hammerspace.com> wrote:
> > > Hi Olga
> > >
> > > On Wed, 2019-09-18 at 15:38 -0400, Olga Kornievskaia wrote:
> > > > Hi Trond,
> > > >
> > > > These set of patches do not address the locking problem. It's
> > > > actually
> > > > not the locking patch (which I thought it was as I reverted it
> > > > and
> > > > still had the issue). Without the whole patch series the unlock
> > > > works
> > > > fine so something in these new patches. Something is up with the
> > > > 2
> > > > patches:
> > > > NFSv4: Handle NFS4ERR_OLD_STATEID in CLOSE/OPEN_DOWNGRADE
> > > > NFSv4: Handle NFS4ERR_OLD_STATEID in LOCKU
> > > >
> > > > If I remove either one separately, unlock fails but if I remove
> > > > both
> > > > unlock works.
> > >
> > > Can you describe how you are testing this, and perhaps provide
> > > wireshark traces that show how we're triggering these problems?
> >
> > I'm triggering by running "nfstest_lock --nfsversion 4.1 --runtest
> > btest01" against either linux or ontap servers (while the test
> > doesn't
> > fail but on the network trace you can see unlock failing with
> > bad_stateid). Network trace attached.
> >
> > But actually a simple test open, lock, unlock does the trick (network
> > trace attached).
> > fd1 = open(RDWR)
> > fctl(fd1) (lock /unlock)
>
>
> These traces really do not mesh with what I'm seeing using a simple
> Connectathon lock test run. When I look at the wireshark output from
> that, I see exadtly two cases where the stateid arguments are both
> zero, and those are both SETATTR, so expected.
>
> All the LOCKU are showing up as non-zero stateids, and so I'm seeing no
> BAD_STATEID or OLD_STATEID errors at all.
>
> Is there something special about how your test is running?

There is nothing special that I can think of about my setup or how
test run. I pull from your testing branch, build it (no extra
patches). Run tests over 4.1 (default mount opts) against a linux
server (typically same kernel).

Is this patch series somewhere in your git branches? I've been testing
your testing branch (as I could see v2 changes were in the testing
branch). It's not obvious to me what was changed in v3 to see if the
testing branch has the right code.

>
> Cheers
>   Trond
>
> PS: Note: I do think I need a v3 of the LOCKU patch in order to add a
> spinlock around the new copy of the lock stateid in
> nfs4_alloc_unlockdata(). However I don't see how the missing spinlocks
> could cause you to consistently be seeing an all-zero stateid argument.

I'm not testing with v3 so I don't think that's it.

>
>
> >
> > > Thanks!
> > >   Trond
> > >
> > > > On Mon, Sep 16, 2019 at 4:46 PM Trond Myklebust <
> > > > trondmy@gmail.com>
> > > > wrote:
> > > > > Various NFSv4 fixes to ensure we handle state errors correctly.
> > > > > In
> > > > > particular, we need to ensure that for COMPOUNDs like CLOSE and
> > > > > DELEGRETURN, that may have an embedded LAYOUTRETURN, we handle
> > > > > the
> > > > > layout state errors so that a retry of either the LAYOUTRETURN,
> > > > > or
> > > > > the later CLOSE/DELEGRETURN does not corrupt the LAYOUTRETURN
> > > > > reply.
> > > > >
> > > > > Also ensure that if we get a NFS4ERR_OLD_STATEID, then we do
> > > > > our
> > > > > best to still try to destroy the state on the server, in order
> > > > > to
> > > > > avoid causing state leakage.
> > > > >
> > > > > v2: Fix bug reports from Olga
> > > > >  - Try to avoid sending old stateids on CLOSE/OPEN_DOWNGRADE
> > > > > when
> > > > >    doing fully serialised NFSv4.0.
> > > > >  - Ensure LOCKU initialises the stateid correctly.
> > > > >
> > > > > Trond Myklebust (9):
> > > > >   pNFS: Ensure we do clear the return-on-close layout stateid
> > > > > on
> > > > > fatal
> > > > >     errors
> > > > >   NFSv4: Clean up pNFS return-on-close error handling
> > > > >   NFSv4: Handle NFS4ERR_DELAY correctly in return-on-close
> > > > >   NFSv4: Handle RPC level errors in LAYOUTRETURN
> > > > >   NFSv4: Add a helper to increment stateid seqids
> > > > >   pNFS: Handle NFS4ERR_OLD_STATEID on layoutreturn by bumping
> > > > > the
> > > > > state
> > > > >     seqid
> > > > >   NFSv4: Fix OPEN_DOWNGRADE error handling
> > > > >   NFSv4: Handle NFS4ERR_OLD_STATEID in CLOSE/OPEN_DOWNGRADE
> > > > >   NFSv4: Handle NFS4ERR_OLD_STATEID in LOCKU
> > > > >
> > > > >  fs/nfs/nfs4_fs.h   |  11 ++-
> > > > >  fs/nfs/nfs4proc.c  | 204 ++++++++++++++++++++++++++++++-------
> > > > > ----
> > > > > ----
> > > > >  fs/nfs/nfs4state.c |  16 ----
> > > > >  fs/nfs/pnfs.c      |  71 ++++++++++++++--
> > > > >  fs/nfs/pnfs.h      |  17 +++-
> > > > >  5 files changed, 229 insertions(+), 90 deletions(-)
> > > > >
> > > > > --
> > > > > 2.21.0
> > > > >
> > > --
> > > Trond Myklebust
> > > Linux NFS client maintainer, Hammerspace
> > > trond.myklebust@hammerspace.com
> > >
> > >
> Trond Myklebust
> CTO, Hammerspace Inc
> 4300 El Camino Real, Suite 105
> Los Altos, CA 94022
> www.hammer.space
>
> --
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> trond.myklebust@hammerspace.com
>
>

  reply index

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-16 20:44 Trond Myklebust
2019-09-16 20:44 ` [PATCH v2 1/9] pNFS: Ensure we do clear the return-on-close layout stateid on fatal errors Trond Myklebust
2019-09-16 20:44   ` [PATCH v2 2/9] NFSv4: Clean up pNFS return-on-close error handling Trond Myklebust
2019-09-16 20:44     ` [PATCH v2 3/9] NFSv4: Handle NFS4ERR_DELAY correctly in return-on-close Trond Myklebust
2019-09-16 20:44       ` [PATCH v2 4/9] NFSv4: Handle RPC level errors in LAYOUTRETURN Trond Myklebust
2019-09-16 20:44         ` [PATCH v2 5/9] NFSv4: Add a helper to increment stateid seqids Trond Myklebust
2019-09-16 20:44           ` [PATCH v2 6/9] pNFS: Handle NFS4ERR_OLD_STATEID on layoutreturn by bumping the state seqid Trond Myklebust
2019-09-16 20:44             ` [PATCH v2 7/9] NFSv4: Fix OPEN_DOWNGRADE error handling Trond Myklebust
2019-09-16 20:44               ` [PATCH v2 8/9] NFSv4: Handle NFS4ERR_OLD_STATEID in CLOSE/OPEN_DOWNGRADE Trond Myklebust
2019-09-16 20:44                 ` [PATCH v2 9/9] NFSv4: Handle NFS4ERR_OLD_STATEID in LOCKU Trond Myklebust
2019-09-18 19:38 ` [PATCH v2 0/9] Various NFSv4 state error handling fixes Olga Kornievskaia
2019-09-19  1:49   ` Trond Myklebust
     [not found]     ` <CAN-5tyGX_Mb-wGTREtSWRFFSNK0qjgqLbm8SFPG=DPM7M2OWoQ@mail.gmail.com>
2019-09-19 23:42       ` Trond Myklebust
2019-09-20 14:25         ` Olga Kornievskaia [this message]
2019-09-20 14:54           ` Trond Myklebust

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAN-5tyHgP75cJL4SNb+-Q8iDaJOTPt6JUbjMQjJYvdZfBrdecA@mail.gmail.com \
    --to=aglo@umich.edu \
    --cc=Anna.Schumaker@netapp.com \
    --cc=linux-nfs@vger.kernel.org \
    --cc=trondmy@hammerspace.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-NFS Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-nfs/0 linux-nfs/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-nfs linux-nfs/ https://lore.kernel.org/linux-nfs \
		linux-nfs@vger.kernel.org
	public-inbox-index linux-nfs

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-nfs


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git