Linux-NFS Archive on lore.kernel.org
 help / color / Atom feed
From: Olga Kornievskaia <aglo@umich.edu>
To: Leon Kyneur <leonk@dug.com>
Cc: Benjamin Coddington <bcodding@redhat.com>,
	linux-nfs <linux-nfs@vger.kernel.org>
Subject: Re: troubleshooting LOCK FH and NFS4ERR_BAD_SEQID
Date: Tue, 24 Sep 2019 10:45:20 -0400
Message-ID: <CAN-5tyHnXiDnEspzMSpEXnx6+fOJe1a4V3V8PnYbY5B9LsfQYQ@mail.gmail.com> (raw)
In-Reply-To: <CAACwWuOs7SMq48C4=09afag0xmYH5U9Z9LOyVvxJfMyZzq4u9g@mail.gmail.com>

Hi Leon,

Given this trace, I don't see anything that the client is doing wrong.
I'd say this points to the server losing track of the seqid.

On Tue, Sep 24, 2019 at 12:18 AM Leon Kyneur <leonk@dug.com> wrote:
>
> On Thu, Sep 19, 2019 at 9:36 PM Olga Kornievskaia <aglo@umich.edu> wrote:
> >
> > Hi Leon,
> >
> > Given that you have a network trace with the BAD_SEQID error can you
> > filter on the file handle and trace the operations? It should be
> > possible to tell at that point if it's the client that's sending an
> > incorrect seq or the server. I'm assuming we can either trace the
> > locking stated to a previous (successful) use or to an open.  Provided
> > tshark output isn't enough to see what was inside the packets.
> >
> > If the LOCK that's failing used a seqid that wasn't bootstrapped from
> > the open seqid, then it can't be caused by a double CLOSE problem. But
> > if this LOCK was was bootstrapped from the open stateid we need to
> > traces open owner sequencing and a double close could have been the
> > cause.
>
>
> Hi Olga
>
> I don't ever see a double close in my packet captures. Here's a more
> verbose dump, I'm not quite sure what you mean re LOCK "wasn't
> bootstrapped from the open seqid" that does seem to explain it as I
> don't see the 0x00000001 seq id on the from the OPEN call later in the
> LOCK FH call. The working example (also below) does show this.
>
> So I'm still wondering if this is the double close issue or something else?
>
> # tshark -2 -n -r ../nfs.pcap   -z
> proto,colinfo,nfs.stateid4.other,nfs.stateid4.other
> ...
>  22          3 172.27.50.222 -> 172.27.255.28 NFS 370 V4 Call (Reply
> In 23) OPEN DH: 0xbfe01adc/m.db  nfs.seqid == 0x00000022
>  23          3 172.27.255.28 -> 172.27.50.222 NFS 394 V4 Reply (Call
> In 22) OPEN StateID: 0x7999  nfs.seqid == 0x00000001
> nfs.stateid4.other == 8f:73:7b:5d:30:ab:b5:d0:e3:02:00:00
>  24          3 172.27.50.222 -> 172.27.255.28 NFS 290 V4 Call (Reply
> In 25) OPEN_CONFIRM  nfs.seqid == 0x00000001  nfs.seqid == 0x00000023
> nfs.stateid4.other == 8f:73:7b:5d:30:ab:b5:d0:e3:02:00:00
>  25          3 172.27.255.28 -> 172.27.50.222 NFS 138 V4 Reply (Call
> In 24) OPEN_CONFIRM  nfs.seqid == 0x00000002  nfs.stateid4.other ==
> 8f:73:7b:5d:30:ab:b5:d0:e3:02:00:00
>  26          3 172.27.50.222 -> 172.27.255.28 NFS 370 V4 Call (Reply
> In 27) OPEN DH: 0x272b8d23/.sqliterc  nfs.seqid == 0x00000024
>  27          3 172.27.255.28 -> 172.27.50.222 NFS 122 V4 Reply (Call
> In 26) OPEN Status: NFS4ERR_NOENT
>  28          3 172.27.50.222 -> 172.27.255.28 NFS 374 V4 Call (Reply
> In 29) OPEN DH: 0x272b8d23/.sqlite_history  nfs.seqid == 0x00000025
>  29          3 172.27.255.28 -> 172.27.50.222 NFS 430 V4 Reply (Call
> In 28) OPEN StateID: 0x3203  nfs.seqid == 0x00000001  nfs.seqid ==
> 0x00000001  nfs.stateid4.other == 8f:73:7b:5d:30:ab:b5:d0:e5:02:00:00
> nfs.stateid4.other == 8f:73:7b:5d:30:ab:b5:d0:e6:02:00:00
>  30          3 172.27.50.222 -> 172.27.255.28 NFS 306 V4 Call (Reply
> In 31) CLOSE StateID: 0x3203  nfs.seqid == 0x00000026  nfs.seqid ==
> 0x00000001  nfs.stateid4.other == 8f:73:7b:5d:30:ab:b5:d0:e5:02:00:00
>  31          3 172.27.255.28 -> 172.27.50.222 NFS 202 V4 Reply (Call
> In 30) CLOSE  nfs.seqid == 0x00000002  nfs.stateid4.other ==
> 8f:73:7b:5d:30:ab:b5:d0:e5:02:00:00
>  32          3 172.27.50.222 -> 172.27.255.28 NFS 306 V4 Call (Reply
> In 33) LOOKUP DH: 0x272b8d23/.terminfo
>  33          3 172.27.255.28 -> 172.27.50.222 NFS 122 V4 Reply (Call
> In 32) LOOKUP Status: NFS4ERR_NOENT
>  34          3 172.27.50.222 -> 172.27.255.28 NFS 290 V4 Call (Reply
> In 35) ACCESS FH: 0x272b8d23, [Check: RD LU MD XT DL]
>  35          3 172.27.255.28 -> 172.27.50.222 NFS 194 V4 Reply (Call
> In 34) ACCESS, [Allowed: RD LU MD XT DL]
>  36          3 172.27.50.222 -> 172.27.255.28 NFS 282 V4 Call (Reply
> In 37) GETATTR FH: 0x272b8d23
>  37          3 172.27.255.28 -> 172.27.50.222 NFS 266 V4 Reply (Call
> In 36) GETATTR
>  38          3 172.27.50.222 -> 172.27.255.28 NFS 302 V4 Call (Reply
> In 39) LOOKUP DH: 0x272b8d23/.inputrc
>  39          3 172.27.255.28 -> 172.27.50.222 NFS 122 V4 Reply (Call
> In 38) LOOKUP Status: NFS4ERR_NOENT
>  40          3 172.27.50.222 -> 172.27.255.28 TCP 66 822 > 2049 [ACK]
> Seq=3950 Ack=2617 Win=3077 Len=0 TSval=177154789 TSecr=3474577216
>  41          5 172.27.50.222 -> 172.27.255.28 NFS 354 V4 Call (Reply
> In 42) LOCK FH: 0x80589398 Offset: 1073741824 Length: 1   nfs.seqid ==
> 0x00000027  nfs.seqid == 0x00000002  nfs.stateid4.other ==
> 8f:73:7b:5d:30:ab:b5:d0:e3:02:00:00
>  42          5 172.27.255.28 -> 172.27.50.222 NFS 122 V4 Reply (Call
> In 41) LOCK Status: NFS4ERR_BAD_SEQID
>  43          5 172.27.50.222 -> 172.27.255.28 TCP 66 822 > 2049 [ACK]
> Seq=4238 Ack=2673 Win=3077 Len=0 TSval=177156701 TSecr=3474579167
>
>
> working:
>
>   7          2 172.27.50.222 -> 172.27.255.28 NFS 370 V4 Call (Reply
> In 8) OPEN DH: 0xbfe01adc/m.db  nfs.seqid == 0x00000017
>   8          2 172.27.255.28 -> 172.27.50.222 NFS 394 V4 Reply (Call
> In 7) OPEN StateID: 0x08c1  nfs.seqid == 0x00000001
> nfs.stateid4.other == 8f:73:7b:5d:30:ab:b5:d0:fc:02:00:00
>   9          2 172.27.50.222 -> 172.27.255.28 NFS 374 V4 Call (Reply
> In 10) OPEN DH: 0x272b8d23/.sqlite_history  nfs.seqid == 0x00000018
>  10          2 172.27.255.28 -> 172.27.50.222 NFS 430 V4 Reply (Call
> In 9) OPEN StateID: 0x147a  nfs.seqid == 0x00000001  nfs.seqid ==
> 0x00000001  nfs.stateid4.other == 8f:73:7b:5d:30:ab:b5:d0:fd:02:00:00
> nfs.stateid4.other == 8f:73:7b:5d:30:ab:b5:d0:fe:02:00:00
>  11          2 172.27.50.222 -> 172.27.255.28 NFS 306 V4 Call (Reply
> In 12) CLOSE StateID: 0x147a  nfs.seqid == 0x00000019  nfs.seqid ==
> 0x00000001  nfs.stateid4.other == 8f:73:7b:5d:30:ab:b5:d0:fd:02:00:00
>  12          2 172.27.255.28 -> 172.27.50.222 NFS 202 V4 Reply (Call
> In 11) CLOSE  nfs.seqid == 0x00000002  nfs.stateid4.other ==
> 8f:73:7b:5d:30:ab:b5:d0:fd:02:00:00
>  13          2 172.27.50.222 -> 172.27.255.28 TCP 66 919 > 2049 [ACK]
> Seq=1293 Ack=1157 Win=40 Len=0 TSval=177379723 TSecr=3474802150
>  14          4 172.27.50.222 -> 172.27.255.28 NFS 354 V4 Call (Reply
> In 15) LOCK FH: 0x80589398 Offset: 1073741824 Length: 1   nfs.seqid ==
> 0x0000001a  nfs.seqid == 0x00000001  nfs.stateid4.other ==
> 8f:73:7b:5d:30:ab:b5:d0:fc:02:00:00
>  15          4 172.27.255.28 -> 172.27.50.222 NFS 138 V4 Reply (Call
> In 14) LOCK  nfs.seqid == 0x00000001  nfs.stateid4.other ==
> 8f:73:7b:5d:30:ab:b5:d0:ff:02:00:00
>  16          4 172.27.50.222 -> 172.27.255.28 TCP 66 919 > 2049 [ACK]
> Seq=1581 Ack=1229 Win=40 Len=0 TSval=177381305 TSecr=3474803772
>  17          4 172.27.50.222 -> 172.27.255.28 NFS 318 V4 Call (Reply
> In 18) LOCK FH: 0x80589398 Offset: 1073741826 Length: 510   nfs.seqid
> == 0x00000001  nfs.stateid4.other ==
> 8f:73:7b:5d:30:ab:b5:d0:ff:02:00:00
>  18          4 172.27.255.28 -> 172.27.50.222 NFS 138 V4 Reply (Call
> In 17) LOCK  nfs.seqid == 0x00000002  nfs.stateid4.other ==
> 8f:73:7b:5d:30:ab:b5:d0:ff:02:00:00
>  19          4 172.27.50.222 -> 172.27.255.28 NFS 310 V4 Call (Reply
> In 20) LOCKU FH: 0x80589398 Offset: 1073741824 Length: 1   nfs.seqid
> == 0x00000002  nfs.seqid == 0x00000002  nfs.stateid4.other ==
> 8f:73:7b:5d:30:ab:b5:d0:ff:02:00:00
>  20          4 172.27.255.28 -> 172.27.50.222 NFS 138 V4 Reply (Call
> In 19) LOCKU  nfs.seqid == 0x00000003  nfs.stateid4.other ==
> 8f:73:7b:5d:30:ab:b5:d0:ff:02:00:00
>  21          4 172.27.50.222 -> 172.27.255.28 NFS 298 V4 Call (Reply
> In 22) READ StateID: 0x2656 Offset: 0 Len: 6144  nfs.seqid ==
> 0x00000003  nfs.stateid4.other == 8f:73:7b:5d:30:ab:b5:d0:ff:02:00:00
>  22          4 172.27.255.28 -> 172.27.50.222 NFS 6274 V4 Reply (Call
> In 21) READ
>  23          4 172.27.50.222 -> 172.27.255.28 TCP 66 919 > 2049 [ACK]
> Seq=2309 Ack=7581 Win=40 Len=0 TSval=177381309 TSecr=3474803775
>  24          4 172.27.50.222 -> 172.27.255.28 NFS 310 V4 Call (Reply
> In 25) LOCKU FH: 0x80589398 Offset: 0 Length: <End of File>  nfs.seqid
> == 0x00000003  nfs.seqid == 0x00000003  nfs.stateid4.other ==
> 8f:73:7b:5d:30:ab:b5:d0:ff:02:00:00
>  25          4 172.27.255.28 -> 172.27.50.222 NFS 138 V4 Reply (Call
> In 24) LOCKU  nfs.seqid == 0x00000004  nfs.stateid4.other ==
> 8f:73:7b:5d:30:ab:b5:d0:ff:02:00:00
>  26          4 172.27.50.222 -> 172.27.255.28 NFS 190 V4 Call (Reply
> In 28) RELEASE_LOCKOWNER
>  27          4 172.27.50.222 -> 172.27.255.28 NFS 354 V4 Call (Reply
> In 29) LOCK FH: 0x80589398 Offset: 1073741824 Length: 1   nfs.seqid ==
> 0x0000001b  nfs.seqid == 0x00000001  nfs.stateid4.other ==
> 8f:73:7b:5d:30:ab:b5:d0:fc:02:00:00
>  28          4 172.27.255.28 -> 172.27.50.222 NFS 114 V4 Reply (Call
> In 26) RELEASE_LOCKOWNER

      reply index

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-12  8:27 Leon Kyneur
2019-09-17 11:28 ` Benjamin Coddington
2019-09-18  2:20   ` Leon Kyneur
2019-09-18 11:04     ` Benjamin Coddington
2019-09-18 14:32       ` Olga Kornievskaia
2019-09-19  4:22         ` Leon Kyneur
2019-09-19 13:36           ` Olga Kornievskaia
2019-09-24  4:17             ` Leon Kyneur
2019-09-24 14:45               ` Olga Kornievskaia [this message]

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAN-5tyHnXiDnEspzMSpEXnx6+fOJe1a4V3V8PnYbY5B9LsfQYQ@mail.gmail.com \
    --to=aglo@umich.edu \
    --cc=bcodding@redhat.com \
    --cc=leonk@dug.com \
    --cc=linux-nfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-NFS Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-nfs/0 linux-nfs/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-nfs linux-nfs/ https://lore.kernel.org/linux-nfs \
		linux-nfs@vger.kernel.org
	public-inbox-index linux-nfs

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-nfs


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git