Linux-NFS Archive on lore.kernel.org
 help / color / Atom feed
From: Leon Kyneur <leonk@dug.com>
To: Olga Kornievskaia <aglo@umich.edu>
Cc: Benjamin Coddington <bcodding@redhat.com>,
	linux-nfs <linux-nfs@vger.kernel.org>
Subject: Re: troubleshooting LOCK FH and NFS4ERR_BAD_SEQID
Date: Tue, 24 Sep 2019 12:17:58 +0800
Message-ID: <CAACwWuOs7SMq48C4=09afag0xmYH5U9Z9LOyVvxJfMyZzq4u9g@mail.gmail.com> (raw)
In-Reply-To: <CAN-5tyHqtQMJdrm1WUSjmiT7mrRd4HDNNmvZ=fdWjOoOu54ApA@mail.gmail.com>

On Thu, Sep 19, 2019 at 9:36 PM Olga Kornievskaia <aglo@umich.edu> wrote:
>
> Hi Leon,
>
> Given that you have a network trace with the BAD_SEQID error can you
> filter on the file handle and trace the operations? It should be
> possible to tell at that point if it's the client that's sending an
> incorrect seq or the server. I'm assuming we can either trace the
> locking stated to a previous (successful) use or to an open.  Provided
> tshark output isn't enough to see what was inside the packets.
>
> If the LOCK that's failing used a seqid that wasn't bootstrapped from
> the open seqid, then it can't be caused by a double CLOSE problem. But
> if this LOCK was was bootstrapped from the open stateid we need to
> traces open owner sequencing and a double close could have been the
> cause.


Hi Olga

I don't ever see a double close in my packet captures. Here's a more
verbose dump, I'm not quite sure what you mean re LOCK "wasn't
bootstrapped from the open seqid" that does seem to explain it as I
don't see the 0x00000001 seq id on the from the OPEN call later in the
LOCK FH call. The working example (also below) does show this.

So I'm still wondering if this is the double close issue or something else?

# tshark -2 -n -r ../nfs.pcap   -z
proto,colinfo,nfs.stateid4.other,nfs.stateid4.other
...
 22          3 172.27.50.222 -> 172.27.255.28 NFS 370 V4 Call (Reply
In 23) OPEN DH: 0xbfe01adc/m.db  nfs.seqid == 0x00000022
 23          3 172.27.255.28 -> 172.27.50.222 NFS 394 V4 Reply (Call
In 22) OPEN StateID: 0x7999  nfs.seqid == 0x00000001
nfs.stateid4.other == 8f:73:7b:5d:30:ab:b5:d0:e3:02:00:00
 24          3 172.27.50.222 -> 172.27.255.28 NFS 290 V4 Call (Reply
In 25) OPEN_CONFIRM  nfs.seqid == 0x00000001  nfs.seqid == 0x00000023
nfs.stateid4.other == 8f:73:7b:5d:30:ab:b5:d0:e3:02:00:00
 25          3 172.27.255.28 -> 172.27.50.222 NFS 138 V4 Reply (Call
In 24) OPEN_CONFIRM  nfs.seqid == 0x00000002  nfs.stateid4.other ==
8f:73:7b:5d:30:ab:b5:d0:e3:02:00:00
 26          3 172.27.50.222 -> 172.27.255.28 NFS 370 V4 Call (Reply
In 27) OPEN DH: 0x272b8d23/.sqliterc  nfs.seqid == 0x00000024
 27          3 172.27.255.28 -> 172.27.50.222 NFS 122 V4 Reply (Call
In 26) OPEN Status: NFS4ERR_NOENT
 28          3 172.27.50.222 -> 172.27.255.28 NFS 374 V4 Call (Reply
In 29) OPEN DH: 0x272b8d23/.sqlite_history  nfs.seqid == 0x00000025
 29          3 172.27.255.28 -> 172.27.50.222 NFS 430 V4 Reply (Call
In 28) OPEN StateID: 0x3203  nfs.seqid == 0x00000001  nfs.seqid ==
0x00000001  nfs.stateid4.other == 8f:73:7b:5d:30:ab:b5:d0:e5:02:00:00
nfs.stateid4.other == 8f:73:7b:5d:30:ab:b5:d0:e6:02:00:00
 30          3 172.27.50.222 -> 172.27.255.28 NFS 306 V4 Call (Reply
In 31) CLOSE StateID: 0x3203  nfs.seqid == 0x00000026  nfs.seqid ==
0x00000001  nfs.stateid4.other == 8f:73:7b:5d:30:ab:b5:d0:e5:02:00:00
 31          3 172.27.255.28 -> 172.27.50.222 NFS 202 V4 Reply (Call
In 30) CLOSE  nfs.seqid == 0x00000002  nfs.stateid4.other ==
8f:73:7b:5d:30:ab:b5:d0:e5:02:00:00
 32          3 172.27.50.222 -> 172.27.255.28 NFS 306 V4 Call (Reply
In 33) LOOKUP DH: 0x272b8d23/.terminfo
 33          3 172.27.255.28 -> 172.27.50.222 NFS 122 V4 Reply (Call
In 32) LOOKUP Status: NFS4ERR_NOENT
 34          3 172.27.50.222 -> 172.27.255.28 NFS 290 V4 Call (Reply
In 35) ACCESS FH: 0x272b8d23, [Check: RD LU MD XT DL]
 35          3 172.27.255.28 -> 172.27.50.222 NFS 194 V4 Reply (Call
In 34) ACCESS, [Allowed: RD LU MD XT DL]
 36          3 172.27.50.222 -> 172.27.255.28 NFS 282 V4 Call (Reply
In 37) GETATTR FH: 0x272b8d23
 37          3 172.27.255.28 -> 172.27.50.222 NFS 266 V4 Reply (Call
In 36) GETATTR
 38          3 172.27.50.222 -> 172.27.255.28 NFS 302 V4 Call (Reply
In 39) LOOKUP DH: 0x272b8d23/.inputrc
 39          3 172.27.255.28 -> 172.27.50.222 NFS 122 V4 Reply (Call
In 38) LOOKUP Status: NFS4ERR_NOENT
 40          3 172.27.50.222 -> 172.27.255.28 TCP 66 822 > 2049 [ACK]
Seq=3950 Ack=2617 Win=3077 Len=0 TSval=177154789 TSecr=3474577216
 41          5 172.27.50.222 -> 172.27.255.28 NFS 354 V4 Call (Reply
In 42) LOCK FH: 0x80589398 Offset: 1073741824 Length: 1   nfs.seqid ==
0x00000027  nfs.seqid == 0x00000002  nfs.stateid4.other ==
8f:73:7b:5d:30:ab:b5:d0:e3:02:00:00
 42          5 172.27.255.28 -> 172.27.50.222 NFS 122 V4 Reply (Call
In 41) LOCK Status: NFS4ERR_BAD_SEQID
 43          5 172.27.50.222 -> 172.27.255.28 TCP 66 822 > 2049 [ACK]
Seq=4238 Ack=2673 Win=3077 Len=0 TSval=177156701 TSecr=3474579167


working:

  7          2 172.27.50.222 -> 172.27.255.28 NFS 370 V4 Call (Reply
In 8) OPEN DH: 0xbfe01adc/m.db  nfs.seqid == 0x00000017
  8          2 172.27.255.28 -> 172.27.50.222 NFS 394 V4 Reply (Call
In 7) OPEN StateID: 0x08c1  nfs.seqid == 0x00000001
nfs.stateid4.other == 8f:73:7b:5d:30:ab:b5:d0:fc:02:00:00
  9          2 172.27.50.222 -> 172.27.255.28 NFS 374 V4 Call (Reply
In 10) OPEN DH: 0x272b8d23/.sqlite_history  nfs.seqid == 0x00000018
 10          2 172.27.255.28 -> 172.27.50.222 NFS 430 V4 Reply (Call
In 9) OPEN StateID: 0x147a  nfs.seqid == 0x00000001  nfs.seqid ==
0x00000001  nfs.stateid4.other == 8f:73:7b:5d:30:ab:b5:d0:fd:02:00:00
nfs.stateid4.other == 8f:73:7b:5d:30:ab:b5:d0:fe:02:00:00
 11          2 172.27.50.222 -> 172.27.255.28 NFS 306 V4 Call (Reply
In 12) CLOSE StateID: 0x147a  nfs.seqid == 0x00000019  nfs.seqid ==
0x00000001  nfs.stateid4.other == 8f:73:7b:5d:30:ab:b5:d0:fd:02:00:00
 12          2 172.27.255.28 -> 172.27.50.222 NFS 202 V4 Reply (Call
In 11) CLOSE  nfs.seqid == 0x00000002  nfs.stateid4.other ==
8f:73:7b:5d:30:ab:b5:d0:fd:02:00:00
 13          2 172.27.50.222 -> 172.27.255.28 TCP 66 919 > 2049 [ACK]
Seq=1293 Ack=1157 Win=40 Len=0 TSval=177379723 TSecr=3474802150
 14          4 172.27.50.222 -> 172.27.255.28 NFS 354 V4 Call (Reply
In 15) LOCK FH: 0x80589398 Offset: 1073741824 Length: 1   nfs.seqid ==
0x0000001a  nfs.seqid == 0x00000001  nfs.stateid4.other ==
8f:73:7b:5d:30:ab:b5:d0:fc:02:00:00
 15          4 172.27.255.28 -> 172.27.50.222 NFS 138 V4 Reply (Call
In 14) LOCK  nfs.seqid == 0x00000001  nfs.stateid4.other ==
8f:73:7b:5d:30:ab:b5:d0:ff:02:00:00
 16          4 172.27.50.222 -> 172.27.255.28 TCP 66 919 > 2049 [ACK]
Seq=1581 Ack=1229 Win=40 Len=0 TSval=177381305 TSecr=3474803772
 17          4 172.27.50.222 -> 172.27.255.28 NFS 318 V4 Call (Reply
In 18) LOCK FH: 0x80589398 Offset: 1073741826 Length: 510   nfs.seqid
== 0x00000001  nfs.stateid4.other ==
8f:73:7b:5d:30:ab:b5:d0:ff:02:00:00
 18          4 172.27.255.28 -> 172.27.50.222 NFS 138 V4 Reply (Call
In 17) LOCK  nfs.seqid == 0x00000002  nfs.stateid4.other ==
8f:73:7b:5d:30:ab:b5:d0:ff:02:00:00
 19          4 172.27.50.222 -> 172.27.255.28 NFS 310 V4 Call (Reply
In 20) LOCKU FH: 0x80589398 Offset: 1073741824 Length: 1   nfs.seqid
== 0x00000002  nfs.seqid == 0x00000002  nfs.stateid4.other ==
8f:73:7b:5d:30:ab:b5:d0:ff:02:00:00
 20          4 172.27.255.28 -> 172.27.50.222 NFS 138 V4 Reply (Call
In 19) LOCKU  nfs.seqid == 0x00000003  nfs.stateid4.other ==
8f:73:7b:5d:30:ab:b5:d0:ff:02:00:00
 21          4 172.27.50.222 -> 172.27.255.28 NFS 298 V4 Call (Reply
In 22) READ StateID: 0x2656 Offset: 0 Len: 6144  nfs.seqid ==
0x00000003  nfs.stateid4.other == 8f:73:7b:5d:30:ab:b5:d0:ff:02:00:00
 22          4 172.27.255.28 -> 172.27.50.222 NFS 6274 V4 Reply (Call
In 21) READ
 23          4 172.27.50.222 -> 172.27.255.28 TCP 66 919 > 2049 [ACK]
Seq=2309 Ack=7581 Win=40 Len=0 TSval=177381309 TSecr=3474803775
 24          4 172.27.50.222 -> 172.27.255.28 NFS 310 V4 Call (Reply
In 25) LOCKU FH: 0x80589398 Offset: 0 Length: <End of File>  nfs.seqid
== 0x00000003  nfs.seqid == 0x00000003  nfs.stateid4.other ==
8f:73:7b:5d:30:ab:b5:d0:ff:02:00:00
 25          4 172.27.255.28 -> 172.27.50.222 NFS 138 V4 Reply (Call
In 24) LOCKU  nfs.seqid == 0x00000004  nfs.stateid4.other ==
8f:73:7b:5d:30:ab:b5:d0:ff:02:00:00
 26          4 172.27.50.222 -> 172.27.255.28 NFS 190 V4 Call (Reply
In 28) RELEASE_LOCKOWNER
 27          4 172.27.50.222 -> 172.27.255.28 NFS 354 V4 Call (Reply
In 29) LOCK FH: 0x80589398 Offset: 1073741824 Length: 1   nfs.seqid ==
0x0000001b  nfs.seqid == 0x00000001  nfs.stateid4.other ==
8f:73:7b:5d:30:ab:b5:d0:fc:02:00:00
 28          4 172.27.255.28 -> 172.27.50.222 NFS 114 V4 Reply (Call
In 26) RELEASE_LOCKOWNER

  reply index

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-12  8:27 Leon Kyneur
2019-09-17 11:28 ` Benjamin Coddington
2019-09-18  2:20   ` Leon Kyneur
2019-09-18 11:04     ` Benjamin Coddington
2019-09-18 14:32       ` Olga Kornievskaia
2019-09-19  4:22         ` Leon Kyneur
2019-09-19 13:36           ` Olga Kornievskaia
2019-09-24  4:17             ` Leon Kyneur [this message]
2019-09-24 14:45               ` Olga Kornievskaia

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAACwWuOs7SMq48C4=09afag0xmYH5U9Z9LOyVvxJfMyZzq4u9g@mail.gmail.com' \
    --to=leonk@dug.com \
    --cc=aglo@umich.edu \
    --cc=bcodding@redhat.com \
    --cc=linux-nfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-NFS Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-nfs/0 linux-nfs/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-nfs linux-nfs/ https://lore.kernel.org/linux-nfs \
		linux-nfs@vger.kernel.org
	public-inbox-index linux-nfs

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-nfs


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git