* [bug report] task hang while testing xfstests generic/323
@ 2019-02-28 10:10 Jiufei Xue
  2019-02-28 22:26 ` Olga Kornievskaia
  0 siblings, 1 reply; 26+ messages in thread
From: Jiufei Xue @ 2019-02-28 10:10 UTC (permalink / raw)
  To: bfields; +Cc: Anna.Schumaker, trond.myklebust, linux-nfs, Joseph Qi

Hi,

when I tested xfstests/generic/323 with NFSv4.1 and v4.2, the task
occasionally turned into a zombie while one thread hung with the
following stack:

[<0>] rpc_wait_bit_killable+0x1e/0xa0 [sunrpc]
[<0>] nfs4_do_close+0x21b/0x2c0 [nfsv4]
[<0>] __put_nfs_open_context+0xa2/0x110 [nfs]
[<0>] nfs_file_release+0x35/0x50 [nfs]
[<0>] __fput+0xa2/0x1c0
[<0>] task_work_run+0x82/0xa0
[<0>] do_exit+0x2ac/0xc20
[<0>] do_group_exit+0x39/0xa0
[<0>] get_signal+0x1ce/0x5d0
[<0>] do_signal+0x36/0x620
[<0>] exit_to_usermode_loop+0x5e/0xc2
[<0>] do_syscall_64+0x16c/0x190
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[<0>] 0xffffffffffffffff

Since commit 12f275cdd163 ("NFSv4: Retry CLOSE and DELEGRETURN on
NFS4ERR_OLD_STATEID"), the client retries the CLOSE when the stateid
generation number on the client is lower than on the server.

The original intention of that commit was to retry the operation while
racing with an OPEN. In this case, however, the stateid generations
remain mismatched forever.

Any suggestions?

Thanks,
Jiufei

^ permalink raw reply	[flat|nested] 26+ messages in thread
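The "stateid generation" being compared here is the 32-bit seqid field of the NFSv4 stateid. As background, here is a minimal, self-contained sketch of the wraparound-safe comparison used for such seqids, modelled on the client's nfs4_stateid_is_newer(); the helper name below is illustrative, not code from this thread:

```c
#include <stdint.h>

/*
 * Sketch of the wraparound-safe seqid comparison used for NFSv4
 * stateids (cf. nfs4_stateid_is_newer() in the Linux client).
 * Illustrative only, not code quoted from this thread.
 */
static int seqid_is_newer(uint32_t a, uint32_t b)
{
	/* Interpreting the unsigned difference as signed makes
	 * "a is newer than b" well defined even across 2^32
	 * wraparound of the seqid counter. */
	return (int32_t)(a - b) > 0;
}
```

On NFS4ERR_OLD_STATEID the client concludes the server holds a newer seqid and retries the CLOSE with whatever seqid its open state carries by then; the hang reported above occurs when nothing ever advances that cached value.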
* Re: [bug report] task hang while testing xfstests generic/323 2019-02-28 10:10 [bug report] task hang while testing xfstests generic/323 Jiufei Xue @ 2019-02-28 22:26 ` Olga Kornievskaia 2019-02-28 23:56 ` Trond Myklebust 2019-03-01 5:08 ` Jiufei Xue 0 siblings, 2 replies; 26+ messages in thread From: Olga Kornievskaia @ 2019-02-28 22:26 UTC (permalink / raw) To: Jiufei Xue Cc: J. Bruce Fields, Anna Schumaker, Trond Myklebust, linux-nfs, Joseph Qi On Thu, Feb 28, 2019 at 5:11 AM Jiufei Xue <jiufei.xue@linux.alibaba.com> wrote: > > Hi, > > when I tested xfstests/generic/323 with NFSv4.1 and v4.2, the task > changed to zombie occasionally while a thread is hanging with the > following stack: > > [<0>] rpc_wait_bit_killable+0x1e/0xa0 [sunrpc] > [<0>] nfs4_do_close+0x21b/0x2c0 [nfsv4] > [<0>] __put_nfs_open_context+0xa2/0x110 [nfs] > [<0>] nfs_file_release+0x35/0x50 [nfs] > [<0>] __fput+0xa2/0x1c0 > [<0>] task_work_run+0x82/0xa0 > [<0>] do_exit+0x2ac/0xc20 > [<0>] do_group_exit+0x39/0xa0 > [<0>] get_signal+0x1ce/0x5d0 > [<0>] do_signal+0x36/0x620 > [<0>] exit_to_usermode_loop+0x5e/0xc2 > [<0>] do_syscall_64+0x16c/0x190 > [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > [<0>] 0xffffffffffffffff > > Since commit 12f275cdd163(NFSv4: Retry CLOSE and DELEGRETURN on > NFS4ERR_OLD_STATEID), the client will retry to close the file when > stateid generation number in client is lower than server. > > The original intention of this commit is retrying the operation while > racing with an OPEN. However, in this case the stateid generation remains > mismatch forever. > > Any suggestions? Can you include a network trace of the failure? Is it possible that the server has crashed on reply to the close and that's why the task is hung? What server are you testing against? 
I have seen traces where a CLOSE would get ERR_OLD_STATEID and would
keep retrying with the same open stateid until a reply to the racing
OPEN changed the state; once the client received that reply, it retried
the CLOSE with the updated stateid.

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: [bug report] task hang while testing xfstests generic/323 2019-02-28 22:26 ` Olga Kornievskaia @ 2019-02-28 23:56 ` Trond Myklebust 2019-03-01 5:19 ` Jiufei Xue 2019-03-01 5:08 ` Jiufei Xue 1 sibling, 1 reply; 26+ messages in thread From: Trond Myklebust @ 2019-02-28 23:56 UTC (permalink / raw) To: jiufei.xue, aglo; +Cc: bfields, Anna.Schumaker, linux-nfs, joseph.qi On Thu, 2019-02-28 at 17:26 -0500, Olga Kornievskaia wrote: > On Thu, Feb 28, 2019 at 5:11 AM Jiufei Xue < > jiufei.xue@linux.alibaba.com> wrote: > > Hi, > > > > when I tested xfstests/generic/323 with NFSv4.1 and v4.2, the task > > changed to zombie occasionally while a thread is hanging with the > > following stack: > > > > [<0>] rpc_wait_bit_killable+0x1e/0xa0 [sunrpc] > > [<0>] nfs4_do_close+0x21b/0x2c0 [nfsv4] > > [<0>] __put_nfs_open_context+0xa2/0x110 [nfs] > > [<0>] nfs_file_release+0x35/0x50 [nfs] > > [<0>] __fput+0xa2/0x1c0 > > [<0>] task_work_run+0x82/0xa0 > > [<0>] do_exit+0x2ac/0xc20 > > [<0>] do_group_exit+0x39/0xa0 > > [<0>] get_signal+0x1ce/0x5d0 > > [<0>] do_signal+0x36/0x620 > > [<0>] exit_to_usermode_loop+0x5e/0xc2 > > [<0>] do_syscall_64+0x16c/0x190 > > [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > [<0>] 0xffffffffffffffff > > > > Since commit 12f275cdd163(NFSv4: Retry CLOSE and DELEGRETURN on > > NFS4ERR_OLD_STATEID), the client will retry to close the file when > > stateid generation number in client is lower than server. > > > > The original intention of this commit is retrying the operation > > while > > racing with an OPEN. However, in this case the stateid generation > > remains > > mismatch forever. > > > > Any suggestions? > > Can you include a network trace of the failure? Is it possible that > the server has crashed on reply to the close and that's why the task > is hung? What server are you testing against? 
> 
> I have seen trace where close would get ERR_OLD_STATEID and would
> still retry with the same open state until it got a reply to the OPEN
> which changed the state and when the client received reply to that,
> it'll retry the CLOSE with the updated stateid.

I agree with Olga's assessment. The server is not allowed to randomly
change the values of the seqid, and the client should be taking pains
to replay any OPEN calls for which a reply is missed. The expectation
is therefore that NFS4ERR_OLD_STATEID should always be a temporary
state.

If it is not, then the bug report needs to explain why the server
bumped the seqid without informing the client.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: [bug report] task hang while testing xfstests generic/323
  2019-02-28 23:56 ` Trond Myklebust
@ 2019-03-01  5:19 ` Jiufei Xue
  0 siblings, 0 replies; 26+ messages in thread
From: Jiufei Xue @ 2019-03-01 5:19 UTC (permalink / raw)
  To: Trond Myklebust, aglo; +Cc: bfields, Anna.Schumaker, linux-nfs, joseph.qi

On 2019/3/1 7:56 AM, Trond Myklebust wrote:
> On Thu, 2019-02-28 at 17:26 -0500, Olga Kornievskaia wrote:
>> On Thu, Feb 28, 2019 at 5:11 AM Jiufei Xue <
>> jiufei.xue@linux.alibaba.com> wrote:
>>> Hi,
>>>
>>> when I tested xfstests/generic/323 with NFSv4.1 and v4.2, the task
>>> changed to zombie occasionally while a thread is hanging with the
>>> following stack:
>>>
>>> [<0>] rpc_wait_bit_killable+0x1e/0xa0 [sunrpc]
>>> [<0>] nfs4_do_close+0x21b/0x2c0 [nfsv4]
>>> [<0>] __put_nfs_open_context+0xa2/0x110 [nfs]
>>> [<0>] nfs_file_release+0x35/0x50 [nfs]
>>> [<0>] __fput+0xa2/0x1c0
>>> [<0>] task_work_run+0x82/0xa0
>>> [<0>] do_exit+0x2ac/0xc20
>>> [<0>] do_group_exit+0x39/0xa0
>>> [<0>] get_signal+0x1ce/0x5d0
>>> [<0>] do_signal+0x36/0x620
>>> [<0>] exit_to_usermode_loop+0x5e/0xc2
>>> [<0>] do_syscall_64+0x16c/0x190
>>> [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>>> [<0>] 0xffffffffffffffff
>>>
>>> Since commit 12f275cdd163(NFSv4: Retry CLOSE and DELEGRETURN on
>>> NFS4ERR_OLD_STATEID), the client will retry to close the file when
>>> stateid generation number in client is lower than server.
>>>
>>> The original intention of this commit is retrying the operation
>>> while
>>> racing with an OPEN. However, in this case the stateid generation
>>> remains
>>> mismatch forever.
>>>
>>> Any suggestions?
>>
>> Can you include a network trace of the failure? Is it possible that
>> the server has crashed on reply to the close and that's why the task
>> is hung? What server are you testing against?
>> 
>> I have seen trace where close would get ERR_OLD_STATEID and would
>> still retry with the same open state until it got a reply to the OPEN
>> which changed the state and when the client received reply to that,
>> it'll retry the CLOSE with the updated stateid.
> 
> I agree with Olga's assessment. The server is not allowed to randomly
> change the values of the seqid, and the client should be taking pains
> to replay any OPEN calls for which a reply is missed. The expectation
> is therefore that NFS4ERR_OLD_STATEID should always be a temporary
> state.
> 
The server bumped the seqid because of a new OPEN from another thread.
I suspect that the new OPEN task may have exited on receiving a signal
without updating the stateid.

> If it is not, then the bugreport needs to explain why the server bumped
> the seqid without informing the client.
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: [bug report] task hang while testing xfstests generic/323 2019-02-28 22:26 ` Olga Kornievskaia 2019-02-28 23:56 ` Trond Myklebust @ 2019-03-01 5:08 ` Jiufei Xue 2019-03-01 8:49 ` Jiufei Xue 1 sibling, 1 reply; 26+ messages in thread From: Jiufei Xue @ 2019-03-01 5:08 UTC (permalink / raw) To: Olga Kornievskaia Cc: J. Bruce Fields, Anna Schumaker, Trond Myklebust, linux-nfs, Joseph Qi Hi Olga, On 2019/3/1 上午6:26, Olga Kornievskaia wrote: > On Thu, Feb 28, 2019 at 5:11 AM Jiufei Xue <jiufei.xue@linux.alibaba.com> wrote: >> >> Hi, >> >> when I tested xfstests/generic/323 with NFSv4.1 and v4.2, the task >> changed to zombie occasionally while a thread is hanging with the >> following stack: >> >> [<0>] rpc_wait_bit_killable+0x1e/0xa0 [sunrpc] >> [<0>] nfs4_do_close+0x21b/0x2c0 [nfsv4] >> [<0>] __put_nfs_open_context+0xa2/0x110 [nfs] >> [<0>] nfs_file_release+0x35/0x50 [nfs] >> [<0>] __fput+0xa2/0x1c0 >> [<0>] task_work_run+0x82/0xa0 >> [<0>] do_exit+0x2ac/0xc20 >> [<0>] do_group_exit+0x39/0xa0 >> [<0>] get_signal+0x1ce/0x5d0 >> [<0>] do_signal+0x36/0x620 >> [<0>] exit_to_usermode_loop+0x5e/0xc2 >> [<0>] do_syscall_64+0x16c/0x190 >> [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 >> [<0>] 0xffffffffffffffff >> >> Since commit 12f275cdd163(NFSv4: Retry CLOSE and DELEGRETURN on >> NFS4ERR_OLD_STATEID), the client will retry to close the file when >> stateid generation number in client is lower than server. >> >> The original intention of this commit is retrying the operation while >> racing with an OPEN. However, in this case the stateid generation remains >> mismatch forever. >> >> Any suggestions? > > Can you include a network trace of the failure? Is it possible that > the server has crashed on reply to the close and that's why the task > is hung? What server are you testing against? > I am testing the local server, the testcase is attached below. 
# uname -r
5.0.0-rc8

# cat /etc/exports
/mnt/vdd *(rw,no_subtree_check,no_root_squash)

# cat test.sh
mount /dev/vdd /mnt/vdd
systemctl start nfs
mount -t nfs -o rw,relatime,vers=4.1 127.0.0.1:/mnt/vdd /mnt/nfsscratch

for ((i=0;i<10000;i++))
do
        echo "loop $i started"
        /usr/local/xfstests/src/aio-dio-regress/aio-last-ref-held-by-io 1 100 /mnt/nfsscratch/aio-testfile
        sleep 5
done

The command aio-last-ref-held-by-io from xfstests spawns 100 threads.
Each thread repeatedly does open/io_submit/close/io_getevents/io_destroy
until the parent exits because of a timeout.

Here is the log from when the task hung.

[10841.121523] nfs4_close_prepare: begin!
[10841.121526] --> nfs4_alloc_slot used_slots=0000 highest_used=4294967295 max_slots=10
[10841.121528] <-- nfs4_alloc_slot used_slots=0001 highest_used=0 slotid=0
[10841.121530] nfs4_close_prepare: done!
[10841.121536] encode_sequence: sessionid=1551405749:3822331530:1:0 seqid=59495845 slotid=0 max_slotid=0 cache_this=1
[10841.121583] nfsd_dispatch: vers 4 proc 1
[10841.121588] __find_in_sessionid_hashtbl: 1551405749:3822331530:1:0
[10841.121590] nfsd4_sequence: slotid 0
[10841.121592] check_slot_seqid enter. seqid 59495845 slot_seqid 59495844
[10841.121596] nfsd: fh_verify(28: 01060001 0c66c6db d74ea4c9 61e12d9e c03beba6 0000000b)
[10841.121607] nfsd: fh_verify(28: 01060001 0c66c6db d74ea4c9 61e12d9e c03beba6 0000000b)
[10841.121613] NFSD: nfsd4_close on file aio-testfile
[10841.121615] NFSD: nfs4_preprocess_seqid_op: seqid=0 stateid = (5c7892b5/e3d4268a/000ab905/0000746b)
[10841.121619] nfsv4 compound returned 10024
[10841.121621] --> nfsd4_store_cache_entry slot 00000000f540ddd9
[10841.121625] renewing client (clientid 5c7892b5/e3d4268a)
[10841.122526] decode_attr_type: type=00
......
[10841.122558] nfs4_close_done: begin!
[10841.122561] --> nfs4_alloc_slot used_slots=0001 highest_used=0 max_slots=10
[10841.122564] <-- nfs4_alloc_slot used_slots=0003 highest_used=1 slotid=1
[10841.122566] nfs4_free_slot: slotid 1 highest_used_slotid 0
[10841.122568] nfs41_sequence_process: Error 0 free the slot
[10841.122571] nfs4_free_slot: slotid 0 highest_used_slotid 4294967295
[10841.122575] nfs4_close_done: done, ret = 0!
[10841.122584] nfs4_close_prepare: begin!
......

The client keeps retrying the CLOSE.

> I have seen trace where close would get ERR_OLD_STATEID and would
> still retry with the same open state until it got a reply to the OPEN
> which changed the state and when the client received reply to that,
> it'll retry the CLOSE with the updated stateid.
> 

Yes, you are right. I have also noticed that. But what happens if the
OPEN task exits on receiving a signal? How can the client update the
stateid then?

Thanks,
Jiufei

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: [bug report] task hang while testing xfstests generic/323 2019-03-01 5:08 ` Jiufei Xue @ 2019-03-01 8:49 ` Jiufei Xue 2019-03-01 13:08 ` Trond Myklebust 0 siblings, 1 reply; 26+ messages in thread From: Jiufei Xue @ 2019-03-01 8:49 UTC (permalink / raw) To: Olga Kornievskaia Cc: J. Bruce Fields, Anna Schumaker, Trond Myklebust, linux-nfs, Joseph Qi On 2019/3/1 下午1:08, Jiufei Xue wrote: > Hi Olga, > > On 2019/3/1 上午6:26, Olga Kornievskaia wrote: >> On Thu, Feb 28, 2019 at 5:11 AM Jiufei Xue <jiufei.xue@linux.alibaba.com> wrote: >>> >>> Hi, >>> >>> when I tested xfstests/generic/323 with NFSv4.1 and v4.2, the task >>> changed to zombie occasionally while a thread is hanging with the >>> following stack: >>> >>> [<0>] rpc_wait_bit_killable+0x1e/0xa0 [sunrpc] >>> [<0>] nfs4_do_close+0x21b/0x2c0 [nfsv4] >>> [<0>] __put_nfs_open_context+0xa2/0x110 [nfs] >>> [<0>] nfs_file_release+0x35/0x50 [nfs] >>> [<0>] __fput+0xa2/0x1c0 >>> [<0>] task_work_run+0x82/0xa0 >>> [<0>] do_exit+0x2ac/0xc20 >>> [<0>] do_group_exit+0x39/0xa0 >>> [<0>] get_signal+0x1ce/0x5d0 >>> [<0>] do_signal+0x36/0x620 >>> [<0>] exit_to_usermode_loop+0x5e/0xc2 >>> [<0>] do_syscall_64+0x16c/0x190 >>> [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 >>> [<0>] 0xffffffffffffffff >>> >>> Since commit 12f275cdd163(NFSv4: Retry CLOSE and DELEGRETURN on >>> NFS4ERR_OLD_STATEID), the client will retry to close the file when >>> stateid generation number in client is lower than server. >>> >>> The original intention of this commit is retrying the operation while >>> racing with an OPEN. However, in this case the stateid generation remains >>> mismatch forever. >>> >>> Any suggestions? >> >> Can you include a network trace of the failure? Is it possible that >> the server has crashed on reply to the close and that's why the task >> is hung? What server are you testing against? >> > I am testing the local server, the testcase is attached below. 
> > # uname -r > 5.0.0-rc8 > > # cat /etc/exports > /mnt/vdd *(rw,no_subtree_check,no_root_squash) > > # cat test.sh > mount /dev/vdd /mnt/vdd > systemctl start nfs > mount -t nfs -o rw,relatime,vers=4.1 127.0.0.1:/mnt/vdd /mnt/nfsscratch > > for ((i=0;i<10000;i++)) > do > echo "loop $i started" > /usr/local/xfstests/src/aio-dio-regress/aio-last-ref-held-by-io 1 100 /mnt/nfsscratch/aio-testfile > sleep 5 > done > > The command aio-last-ref-held-by-io from xfstests will spawn 100 threads. Each thread > repeated doing open/io_submit/close/io_getevents/io_destroy until the parent exist > because of timeout. > > > Here is the log when the task hung. > > [10841.121523] nfs4_close_prepare: begin! > [10841.121526] --> nfs4_alloc_slot used_slots=0000 highest_used=4294967295 max_slots=10 > [10841.121528] <-- nfs4_alloc_slot used_slots=0001 highest_used=0 slotid=0 > [10841.121530] nfs4_close_prepare: done! > [10841.121536] encode_sequence: sessionid=1551405749:3822331530:1:0 seqid=59495845 slotid=0 max_slotid=0 cache_this=1 > [10841.121583] nfsd_dispatch: vers 4 proc 1 > [10841.121588] __find_in_sessionid_hashtbl: 1551405749:3822331530:1:0 > [10841.121590] nfsd4_sequence: slotid 0 > [10841.121592] check_slot_seqid enter. seqid 59495845 slot_seqid 59495844 > [10841.121596] nfsd: fh_verify(28: 01060001 0c66c6db d74ea4c9 61e12d9e c03beba6 0000000b) > [10841.121607] nfsd: fh_verify(28: 01060001 0c66c6db d74ea4c9 61e12d9e c03beba6 0000000b) > [10841.121613] NFSD: nfsd4_close on file aio-testfile > [10841.121615] NFSD: nfs4_preprocess_seqid_op: seqid=0 stateid = (5c7892b5/e3d4268a/000ab905/0000746b) > [10841.121619] nfsv4 compound returned 10024 > [10841.121621] --> nfsd4_store_cache_entry slot 00000000f540ddd9 > [10841.121625] renewing client (clientid 5c7892b5/e3d4268a) > [10841.122526] decode_attr_type: type=00 > ...... > [10841.122558] nfs4_close_done: begin! 
> [10841.122561] --> nfs4_alloc_slot used_slots=0001 highest_used=0 max_slots=10
> [10841.122564] <-- nfs4_alloc_slot used_slots=0003 highest_used=1 slotid=1
> [10841.122566] nfs4_free_slot: slotid 1 highest_used_slotid 0
> [10841.122568] nfs41_sequence_process: Error 0 free the slot
> [10841.122571] nfs4_free_slot: slotid 0 highest_used_slotid 4294967295
> [10841.122575] nfs4_close_done: done, ret = 0!
> [10841.122584] nfs4_close_prepare: begin!
> ......
> Client keep retry the CLOSE.
> 
>> I have seen trace where close would get ERR_OLD_STATEID and would
>> still retry with the same open state until it got a reply to the OPEN
>> which changed the state and when the client received reply to that,
>> it'll retry the CLOSE with the updated stateid.
>> 
> 
> Yes, you are right. I have also noticed that. But what happened if the
> OPEN task exit receiving a signal? How can a client update the stateid?
> 
> Thanks,
> Jiufei
> 

I have added a tracepoint, nfsd_inc_and_copy_stateid, and enabled the
following tracepoints: nfs4_open_file and nfs4_open_stateid_update on
the client, and nfsd_inc_and_copy_stateid on the server.

Here is some trace output from right before the problem. The generation
on the server is 31856 (0x7c70), which is higher than the client's
(31855), and the client never updates its stateid.
aio-last-ref-he-5224 [000] ...1 1105.173446: nfs4_open_file: error=0 (OK) flags=49152 (O_DIRECT|0x8000) fmode=READ|0x801c fileid=00:2b:11 fhandle=0x0ca0e657 name=00:2b:2/aio-testfile stateid=31854:0x2a82450b openstateid=31854:0x2a82450b
aio-last-ref-he-5222 [000] ...1 1105.173568: nfs4_open_file: error=-512 (ERESTARTSYS) flags=49152 (O_DIRECT|0x8000) fmode=READ fileid=00:2b:0 fhandle=0x00000000 name=00:2b:2/aio-testfile stateid=0:0x00000000 openstateid=0:0x00000000
nfsd-2991 [000] ...1 1105.173661: nfsd_inc_and_copy_stateid: client 5c78ec26:60bfa0c2 stateid 00095d13:00007c6f
aio-last-ref-he-5234 [000] ...1 1105.174000: nfs4_open_file: error=0 (OK) flags=49152 (O_DIRECT|0x8000) fmode=READ|0x801c fileid=00:2b:11 fhandle=0x0ca0e657 name=00:2b:2/aio-testfile stateid=31854:0x2a82450b openstateid=31854:0x2a82450b
aio-last-ref-he-5151 [001] ...1 1105.174226: nfs4_open_file: error=-512 (ERESTARTSYS) flags=49152 (O_DIRECT|0x8000) fmode=READ|0x801c fileid=00:2b:0 fhandle=0x00000000 name=00:2b:2/aio-testfile stateid=0:0x00000000 openstateid=0:0x00000000
aio-last-ref-he-5235 [001] ...1 1105.174544: nfs4_open_file: error=-512 (ERESTARTSYS) flags=49152 (O_DIRECT|0x8000) fmode=READ|0x801c fileid=00:2b:0 fhandle=0x00000000 name=00:2b:2/aio-testfile stateid=0:0x00000000 openstateid=0:0x00000000
aio-last-ref-he-5141 [001] ...4 1105.174752: nfs4_open_stateid_update: error=-4 (0x4) fileid=00:2b:11 fhandle=0x0ca0e657 stateid=25925:0xe0cb0a41
aio-last-ref-he-5151 [001] ...4 1105.178258: nfs4_open_stateid_update: error=-4 (0x4) fileid=00:2b:11 fhandle=0x0ca0e657 stateid=31855:0x2a82450b
<...>-5181 [001] .N.1 1105.180254: nfs4_open_file: error=-512 (ERESTARTSYS) flags=49152 (O_DIRECT|0x8000) fmode=READ|0x801c fileid=00:2b:0 fhandle=0x00000000 name=00:2b:2/aio-testfile stateid=0:0x00000000 openstateid=0:0x00000000
aio-last-ref-he-5151 [001] ...1 1105.180670: nfs4_open_file: error=-512 (ERESTARTSYS) flags=49152 (O_DIRECT|0x8000) fmode=READ fileid=00:2b:0 fhandle=0x00000000 name=00:2b:2/aio-testfile stateid=0:0x00000000 openstateid=0:0x00000000
aio-last-ref-he-5235 [001] ...1 1105.180929: nfs4_open_file: error=-512 (ERESTARTSYS) flags=49152 (O_DIRECT|0x8000) fmode=READ fileid=00:2b:0 fhandle=0x00000000 name=00:2b:2/aio-testfile stateid=0:0x00000000 openstateid=0:0x00000000
aio-last-ref-he-5181 [001] ...1 1105.181179: nfs4_open_file: error=-512 (ERESTARTSYS) flags=49152 (O_DIRECT|0x8000) fmode=READ fileid=00:2b:0 fhandle=0x00000000 name=00:2b:2/aio-testfile stateid=0:0x00000000 openstateid=0:0x00000000
aio-last-ref-he-5141 [001] ...1 1105.181816: nfs4_open_file: error=0 (OK) flags=49152 (O_DIRECT|0x8000) fmode=READ|0x801c fileid=00:2b:11 fhandle=0x0ca0e657 name=00:2b:2/aio-testfile stateid=31855:0x2a82450b openstateid=31855:0x2a82450b
nfsd-2990 [001] ...1 1105.182033: nfsd_inc_and_copy_stateid: client 5c78ec26:60bfa0c2 stateid 00095d13:00007c70

Thanks,
Jiufei

^ permalink raw reply	[flat|nested] 26+ messages in thread
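One thing that makes the trace above easy to misread: the server-side tracepoint prints the seqid in hex (00007c70) while the client-side tracepoints print it in decimal (31855). A small sketch confirming the off-by-one relationship; the values are copied from the trace, and the helper name is illustrative, not kernel code:

```c
#include <stdint.h>

/* Seqid values copied from the trace output above. */
#define SERVER_SEQID 0x7c70u  /* hex, from nfsd_inc_and_copy_stateid   */
#define CLIENT_SEQID 31855u   /* decimal, from nfs4_open_stateid_update */

/* How many generations ahead of the client's cached seqid the
 * server's seqid is (illustrative helper, not kernel code). */
static uint32_t seqid_gap(uint32_t server, uint32_t client)
{
	return server - client;
}
```

0x7c70 is 31856 in decimal, so the server is exactly one generation ahead, and nothing later in the trace ever delivers that newer seqid to the client, which is why the CLOSE retries forever.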
* Re: [bug report] task hang while testing xfstests generic/323 2019-03-01 8:49 ` Jiufei Xue @ 2019-03-01 13:08 ` Trond Myklebust 2019-03-02 16:34 ` Jiufei Xue 2019-03-04 15:20 ` Jiufei Xue 0 siblings, 2 replies; 26+ messages in thread From: Trond Myklebust @ 2019-03-01 13:08 UTC (permalink / raw) To: jiufei.xue, aglo Cc: bfields, Anna.Schumaker, Trond Myklebust, linux-nfs, joseph.qi On Fri, 2019-03-01 at 16:49 +0800, Jiufei Xue wrote: > > On 2019/3/1 下午1:08, Jiufei Xue wrote: > > Hi Olga, > > > > On 2019/3/1 上午6:26, Olga Kornievskaia wrote: > > > On Thu, Feb 28, 2019 at 5:11 AM Jiufei Xue < > > > jiufei.xue@linux.alibaba.com> wrote: > > > > Hi, > > > > > > > > when I tested xfstests/generic/323 with NFSv4.1 and v4.2, the > > > > task > > > > changed to zombie occasionally while a thread is hanging with > > > > the > > > > following stack: > > > > > > > > [<0>] rpc_wait_bit_killable+0x1e/0xa0 [sunrpc] > > > > [<0>] nfs4_do_close+0x21b/0x2c0 [nfsv4] > > > > [<0>] __put_nfs_open_context+0xa2/0x110 [nfs] > > > > [<0>] nfs_file_release+0x35/0x50 [nfs] > > > > [<0>] __fput+0xa2/0x1c0 > > > > [<0>] task_work_run+0x82/0xa0 > > > > [<0>] do_exit+0x2ac/0xc20 > > > > [<0>] do_group_exit+0x39/0xa0 > > > > [<0>] get_signal+0x1ce/0x5d0 > > > > [<0>] do_signal+0x36/0x620 > > > > [<0>] exit_to_usermode_loop+0x5e/0xc2 > > > > [<0>] do_syscall_64+0x16c/0x190 > > > > [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > > > [<0>] 0xffffffffffffffff > > > > > > > > Since commit 12f275cdd163(NFSv4: Retry CLOSE and DELEGRETURN on > > > > NFS4ERR_OLD_STATEID), the client will retry to close the file > > > > when > > > > stateid generation number in client is lower than server. > > > > > > > > The original intention of this commit is retrying the operation > > > > while > > > > racing with an OPEN. However, in this case the stateid > > > > generation remains > > > > mismatch forever. > > > > > > > > Any suggestions? > > > > > > Can you include a network trace of the failure? 
Is it possible > > > that > > > the server has crashed on reply to the close and that's why the > > > task > > > is hung? What server are you testing against? > > > > > I am testing the local server, the testcase is attached below. > > > > # uname -r > > 5.0.0-rc8 > > > > # cat /etc/exports > > /mnt/vdd *(rw,no_subtree_check,no_root_squash) > > > > # cat test.sh > > mount /dev/vdd /mnt/vdd > > systemctl start nfs > > mount -t nfs -o rw,relatime,vers=4.1 127.0.0.1:/mnt/vdd > > /mnt/nfsscratch > > > > for ((i=0;i<10000;i++)) > > do > > echo "loop $i started" > > /usr/local/xfstests/src/aio-dio-regress/aio-last-ref-held- > > by-io 1 100 /mnt/nfsscratch/aio-testfile > > sleep 5 > > done > > > > The command aio-last-ref-held-by-io from xfstests will spawn 100 > > threads. Each thread > > repeated doing open/io_submit/close/io_getevents/io_destroy until > > the parent exist > > because of timeout. > > > > > > Here is the log when the task hung. > > > > [10841.121523] nfs4_close_prepare: begin! > > [10841.121526] --> nfs4_alloc_slot used_slots=0000 > > highest_used=4294967295 max_slots=10 > > [10841.121528] <-- nfs4_alloc_slot used_slots=0001 highest_used=0 > > slotid=0 > > [10841.121530] nfs4_close_prepare: done! > > [10841.121536] encode_sequence: sessionid=1551405749:3822331530:1:0 > > seqid=59495845 slotid=0 max_slotid=0 cache_this=1 > > [10841.121583] nfsd_dispatch: vers 4 proc 1 > > [10841.121588] __find_in_sessionid_hashtbl: > > 1551405749:3822331530:1:0 > > [10841.121590] nfsd4_sequence: slotid 0 > > [10841.121592] check_slot_seqid enter. 
seqid 59495845 slot_seqid
> > 59495844
> > [10841.121596] nfsd: fh_verify(28: 01060001 0c66c6db d74ea4c9
> > 61e12d9e c03beba6 0000000b)
> > [10841.121607] nfsd: fh_verify(28: 01060001 0c66c6db d74ea4c9
> > 61e12d9e c03beba6 0000000b)
> > [10841.121613] NFSD: nfsd4_close on file aio-testfile
> > [10841.121615] NFSD: nfs4_preprocess_seqid_op: seqid=0 stateid =
> > (5c7892b5/e3d4268a/000ab905/0000746b)
> > [10841.121619] nfsv4 compound returned 10024
> > [10841.121621] --> nfsd4_store_cache_entry slot 00000000f540ddd9
> > [10841.121625] renewing client (clientid 5c7892b5/e3d4268a)
> > [10841.122526] decode_attr_type: type=00
> > ......
> > [10841.122558] nfs4_close_done: begin!
> > [10841.122561] --> nfs4_alloc_slot used_slots=0001 highest_used=0
> > max_slots=10
> > [10841.122564] <-- nfs4_alloc_slot used_slots=0003 highest_used=1
> > slotid=1
> > [10841.122566] nfs4_free_slot: slotid 1 highest_used_slotid 0
> > [10841.122568] nfs41_sequence_process: Error 0 free the slot
> > [10841.122571] nfs4_free_slot: slotid 0 highest_used_slotid
> > 4294967295
> > [10841.122575] nfs4_close_done: done, ret = 0!
> > [10841.122584] nfs4_close_prepare: begin!
> > ......
> > Client keep retry the CLOSE.
> > 
> > > I have seen trace where close would get ERR_OLD_STATEID and would
> > > still retry with the same open state until it got a reply to the
> > > OPEN
> > > which changed the state and when the client received reply to
> > > that,
> > > it'll retry the CLOSE with the updated stateid.
> > > 
> > 
> > Yes, you are right. I have also noticed that. But what happened if
> > the
> > OPEN task exit receiving a signal? How can a client update the
> > stateid?
> > 

static void nfs4_open_release(void *calldata)
{
	struct nfs4_opendata *data = calldata;
	struct nfs4_state *state = NULL;

	/* If this request hasn't been cancelled, do nothing */
	if (!data->cancelled)
		goto out_free;
	/* In case of error, no cleanup! */
	if (data->rpc_status != 0 || !data->rpc_done)
		goto out_free;
	/* In case we need an open_confirm, no cleanup! */
	if (data->o_res.rflags & NFS4_OPEN_RESULT_CONFIRM)
		goto out_free;
	state = nfs4_opendata_to_nfs4_state(data);
	if (!IS_ERR(state))
		nfs4_close_state(state, data->o_arg.fmode);
out_free:
	nfs4_opendata_put(data);
}

Note what happens when the 'data->cancelled' flag is set. This is why
we always ensure that stateful operations such as OPEN, and LOCK,...
are run as asynchronous RPC tasks.
OPEN_DOWNGRADE, CLOSE and LOCKU are also asynchronous, but they are
inherently safe against signals since they do not establish new state
that needs to be torn down.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com

^ permalink raw reply	[flat|nested] 26+ messages in thread
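Trond's point can be condensed into a toy model: a signalled waiter only sets a "cancelled" flag and returns; the asynchronous OPEN still runs to completion, and the release callback then tears down whatever state the completed-but-unwanted OPEN established, so the server's seqid bump cannot be silently lost. The struct and names below are illustrative simplifications of the nfs4_open_release() path quoted above, not kernel code:

```c
#include <stdbool.h>

/* Toy model of the async-OPEN cancellation pattern (illustrative,
 * not kernel code). */
struct opendata {
	bool cancelled;   /* waiter gave up, e.g. on ERESTARTSYS    */
	bool rpc_done;    /* the server actually processed the OPEN */
	bool state_open;  /* client-side open state was established */
};

/* Mirrors the shape of nfs4_open_release() quoted above. */
static void open_release(struct opendata *d)
{
	if (!d->cancelled)
		return;          /* result was consumed normally */
	if (!d->rpc_done)
		return;          /* error case: nothing to clean up */
	/* Cancelled yet completed on the server: issue the CLOSE
	 * ourselves (modelled here as dropping the state) so the
	 * server's seqid bump does not leak. */
	d->state_open = false;
}
```

If OPEN ran synchronously and were simply abandoned on a signal, nothing equivalent to this callback would run, the server's seqid bump would stay invisible to the client, and CLOSE would retry with a stale seqid forever, which is the failure mode under discussion.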
* Re: [bug report] task hang while testing xfstests generic/323 2019-03-01 13:08 ` Trond Myklebust @ 2019-03-02 16:34 ` Jiufei Xue 2019-03-04 15:20 ` Jiufei Xue 1 sibling, 0 replies; 26+ messages in thread From: Jiufei Xue @ 2019-03-02 16:34 UTC (permalink / raw) To: Trond Myklebust, aglo; +Cc: bfields, Anna.Schumaker, linux-nfs, joseph.qi Hi Trond, On 2019/3/1 下午9:08, Trond Myklebust wrote: > On Fri, 2019-03-01 at 16:49 +0800, Jiufei Xue wrote: >> >> On 2019/3/1 下午1:08, Jiufei Xue wrote: >>> Hi Olga, >>> >>> On 2019/3/1 上午6:26, Olga Kornievskaia wrote: >>>> On Thu, Feb 28, 2019 at 5:11 AM Jiufei Xue < >>>> jiufei.xue@linux.alibaba.com> wrote: >>>>> Hi, >>>>> >>>>> when I tested xfstests/generic/323 with NFSv4.1 and v4.2, the >>>>> task >>>>> changed to zombie occasionally while a thread is hanging with >>>>> the >>>>> following stack: >>>>> >>>>> [<0>] rpc_wait_bit_killable+0x1e/0xa0 [sunrpc] >>>>> [<0>] nfs4_do_close+0x21b/0x2c0 [nfsv4] >>>>> [<0>] __put_nfs_open_context+0xa2/0x110 [nfs] >>>>> [<0>] nfs_file_release+0x35/0x50 [nfs] >>>>> [<0>] __fput+0xa2/0x1c0 >>>>> [<0>] task_work_run+0x82/0xa0 >>>>> [<0>] do_exit+0x2ac/0xc20 >>>>> [<0>] do_group_exit+0x39/0xa0 >>>>> [<0>] get_signal+0x1ce/0x5d0 >>>>> [<0>] do_signal+0x36/0x620 >>>>> [<0>] exit_to_usermode_loop+0x5e/0xc2 >>>>> [<0>] do_syscall_64+0x16c/0x190 >>>>> [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 >>>>> [<0>] 0xffffffffffffffff >>>>> >>>>> Since commit 12f275cdd163(NFSv4: Retry CLOSE and DELEGRETURN on >>>>> NFS4ERR_OLD_STATEID), the client will retry to close the file >>>>> when >>>>> stateid generation number in client is lower than server. >>>>> >>>>> The original intention of this commit is retrying the operation >>>>> while >>>>> racing with an OPEN. However, in this case the stateid >>>>> generation remains >>>>> mismatch forever. >>>>> >>>>> Any suggestions? >>>> >>>> Can you include a network trace of the failure? 
Is it possible >>>> that >>>> the server has crashed on reply to the close and that's why the >>>> task >>>> is hung? What server are you testing against? >>>> >>> I am testing the local server, the testcase is attached below. >>> >>> # uname -r >>> 5.0.0-rc8 >>> >>> # cat /etc/exports >>> /mnt/vdd *(rw,no_subtree_check,no_root_squash) >>> >>> # cat test.sh >>> mount /dev/vdd /mnt/vdd >>> systemctl start nfs >>> mount -t nfs -o rw,relatime,vers=4.1 127.0.0.1:/mnt/vdd >>> /mnt/nfsscratch >>> >>> for ((i=0;i<10000;i++)) >>> do >>> echo "loop $i started" >>> /usr/local/xfstests/src/aio-dio-regress/aio-last-ref-held- >>> by-io 1 100 /mnt/nfsscratch/aio-testfile >>> sleep 5 >>> done >>> >>> The command aio-last-ref-held-by-io from xfstests will spawn 100 >>> threads. Each thread >>> repeated doing open/io_submit/close/io_getevents/io_destroy until >>> the parent exist >>> because of timeout. >>> >>> >>> Here is the log when the task hung. >>> >>> [10841.121523] nfs4_close_prepare: begin! >>> [10841.121526] --> nfs4_alloc_slot used_slots=0000 >>> highest_used=4294967295 max_slots=10 >>> [10841.121528] <-- nfs4_alloc_slot used_slots=0001 highest_used=0 >>> slotid=0 >>> [10841.121530] nfs4_close_prepare: done! >>> [10841.121536] encode_sequence: sessionid=1551405749:3822331530:1:0 >>> seqid=59495845 slotid=0 max_slotid=0 cache_this=1 >>> [10841.121583] nfsd_dispatch: vers 4 proc 1 >>> [10841.121588] __find_in_sessionid_hashtbl: >>> 1551405749:3822331530:1:0 >>> [10841.121590] nfsd4_sequence: slotid 0 >>> [10841.121592] check_slot_seqid enter. 
seqid 59495845 slot_seqid >>> 59495844 >>> [10841.121596] nfsd: fh_verify(28: 01060001 0c66c6db d74ea4c9 >>> 61e12d9e c03beba6 0000000b) >>> [10841.121607] nfsd: fh_verify(28: 01060001 0c66c6db d74ea4c9 >>> 61e12d9e c03beba6 0000000b) >>> [10841.121613] NFSD: nfsd4_close on file aio-testfile >>> [10841.121615] NFSD: nfs4_preprocess_seqid_op: seqid=0 stateid = >>> (5c7892b5/e3d4268a/000ab905/0000746b) >>> [10841.121619] nfsv4 compound returned 10024 >>> [10841.121621] --> nfsd4_store_cache_entry slot 00000000f540ddd9 >>> [10841.121625] renewing client (clientid 5c7892b5/e3d4268a) >>> [10841.122526] decode_attr_type: type=00 >>> ...... >>> [10841.122558] nfs4_close_done: begin! >>> [10841.122561] --> nfs4_alloc_slot used_slots=0001 highest_used=0 >>> max_slots=10 >>> [10841.122564] <-- nfs4_alloc_slot used_slots=0003 highest_used=1 >>> slotid=1 >>> [10841.122566] nfs4_free_slot: slotid 1 highest_used_slotid 0 >>> [10841.122568] nfs41_sequence_process: Error 0 free the slot >>> [10841.122571] nfs4_free_slot: slotid 0 highest_used_slotid >>> 4294967295 >>> [10841.122575] nfs4_close_done: done, ret = 0! >>> [10841.122584] nfs4_close_prepare: begin! >>> ...... >>> Client keep retry the CLOSE. >>> >>>> I have seen trace where close would get ERR_OLD_STATEID and would >>>> still retry with the same open state until it got a reply to the >>>> OPEN >>>> which changed the state and when the client received reply to >>>> that, >>>> it'll retry the CLOSE with the updated stateid. >>>> >>> >>> Yes, you are right. I have also noticed that. But what happened if >>> the >>> OPEN task exit receiving a signal? How can a client update the >>> stateid? >>> > > static void nfs4_open_release(void *calldata) > { > struct > nfs4_opendata *data = calldata; > struct nfs4_state *state = NULL; > > /* If this request hasn't been cancelled, do nothing */ > if (!data->cancelled) > goto out_free; > /* In case of error, no cleanup! 
*/
> 	if (data->rpc_status != 0 || !data->rpc_done)
> 		goto out_free;
> 	/* In case we need an open_confirm, no cleanup! */
> 	if (data->o_res.rflags & NFS4_OPEN_RESULT_CONFIRM)
> 		goto out_free;
> 	state = nfs4_opendata_to_nfs4_state(data);
> 	if (!IS_ERR(state))
> 		nfs4_close_state(state, data->o_arg.fmode);
> out_free:
> 	nfs4_opendata_put(data);
> }
>
> Note what happens when the 'data->cancelled' flag is set. This is why
> we always ensure that stateful operations such as OPEN, and LOCK,...
> are run as asynchronous RPC tasks.
> OPEN_DOWNGRADE, CLOSE and LOCKU are also asynchronous, but they are
> inherently safe against signals since they do not establish new state
> that needs to be torn down.

Thank you for your correction; I had not noticed that before. However, the problem can still be reproduced on 5.0.0-rc8. I found that the server bumped the seqid in nfsd4_process_open2(), but for some reason the client was never updated.

Thanks,
Jiufei

^ permalink raw reply	[flat|nested] 26+ messages in thread
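To make the failure mode above concrete, here is a minimal C model of the seqid mismatch (all names — toy_server, toy_client, etc. — are invented for this sketch; this is not the kernel code). The server bumps the stateid generation on every OPEN, but the client only records a new generation when it actually processes the reply; if that update is ever lost, the CLOSE retry loop introduced by commit 12f275cdd163 keeps seeing NFS4ERR_OLD_STATEID:

```c
#include <assert.h>

/* Illustrative model only -- not the kernel implementation. */
enum { NFS4_OK = 0, NFS4ERR_OLD_STATEID = 10024 };

struct toy_server { unsigned int seqid; };  /* authoritative generation */
struct toy_client { unsigned int seqid; };  /* last generation it processed */

/* Server side: each OPEN bumps the stateid generation. */
static void server_open(struct toy_server *s)
{
        s->seqid++;
}

/* Client side: an OPEN reply normally updates the cached seqid, but if
 * the reply is discarded (e.g. rpc_status != 0 on a cancelled task),
 * the update is lost and the client's view stays stale. */
static void client_open_done(struct toy_client *c, const struct toy_server *s,
                             int rpc_status)
{
        if (rpc_status == 0)
                c->seqid = s->seqid;
        /* otherwise: server state advanced, client never hears about it */
}

/* Server side check on CLOSE: a stale generation gets OLD_STATEID, so a
 * client that can never learn the newest seqid retries forever. */
static int server_close(const struct toy_server *s, unsigned int seqid)
{
        return seqid == s->seqid ? NFS4_OK : NFS4ERR_OLD_STATEID;
}
```

One missed reply is enough: after `client_open_done()` runs with a nonzero rpc_status, every subsequent CLOSE in this model returns NFS4ERR_OLD_STATEID.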
* Re: [bug report] task hang while testing xfstests generic/323 2019-03-01 13:08 ` Trond Myklebust 2019-03-02 16:34 ` Jiufei Xue @ 2019-03-04 15:20 ` Jiufei Xue 2019-03-04 15:50 ` Trond Myklebust 1 sibling, 1 reply; 26+ messages in thread From: Jiufei Xue @ 2019-03-04 15:20 UTC (permalink / raw) To: Trond Myklebust, aglo; +Cc: bfields, Anna.Schumaker, linux-nfs, joseph.qi Hi Trond, On 2019/3/1 下午9:08, Trond Myklebust wrote: > On Fri, 2019-03-01 at 16:49 +0800, Jiufei Xue wrote: >> >> On 2019/3/1 下午1:08, Jiufei Xue wrote: >>> Hi Olga, >>> >>> On 2019/3/1 上午6:26, Olga Kornievskaia wrote: >>>> On Thu, Feb 28, 2019 at 5:11 AM Jiufei Xue < >>>> jiufei.xue@linux.alibaba.com> wrote: >>>>> Hi, >>>>> >>>>> when I tested xfstests/generic/323 with NFSv4.1 and v4.2, the >>>>> task >>>>> changed to zombie occasionally while a thread is hanging with >>>>> the >>>>> following stack: >>>>> >>>>> [<0>] rpc_wait_bit_killable+0x1e/0xa0 [sunrpc] >>>>> [<0>] nfs4_do_close+0x21b/0x2c0 [nfsv4] >>>>> [<0>] __put_nfs_open_context+0xa2/0x110 [nfs] >>>>> [<0>] nfs_file_release+0x35/0x50 [nfs] >>>>> [<0>] __fput+0xa2/0x1c0 >>>>> [<0>] task_work_run+0x82/0xa0 >>>>> [<0>] do_exit+0x2ac/0xc20 >>>>> [<0>] do_group_exit+0x39/0xa0 >>>>> [<0>] get_signal+0x1ce/0x5d0 >>>>> [<0>] do_signal+0x36/0x620 >>>>> [<0>] exit_to_usermode_loop+0x5e/0xc2 >>>>> [<0>] do_syscall_64+0x16c/0x190 >>>>> [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 >>>>> [<0>] 0xffffffffffffffff >>>>> >>>>> Since commit 12f275cdd163(NFSv4: Retry CLOSE and DELEGRETURN on >>>>> NFS4ERR_OLD_STATEID), the client will retry to close the file >>>>> when >>>>> stateid generation number in client is lower than server. >>>>> >>>>> The original intention of this commit is retrying the operation >>>>> while >>>>> racing with an OPEN. However, in this case the stateid >>>>> generation remains >>>>> mismatch forever. >>>>> >>>>> Any suggestions? >>>> >>>> Can you include a network trace of the failure? 
Is it possible >>>> that >>>> the server has crashed on reply to the close and that's why the >>>> task >>>> is hung? What server are you testing against? >>>> >>> I am testing the local server, the testcase is attached below. >>> >>> # uname -r >>> 5.0.0-rc8 >>> >>> # cat /etc/exports >>> /mnt/vdd *(rw,no_subtree_check,no_root_squash) >>> >>> # cat test.sh >>> mount /dev/vdd /mnt/vdd >>> systemctl start nfs >>> mount -t nfs -o rw,relatime,vers=4.1 127.0.0.1:/mnt/vdd >>> /mnt/nfsscratch >>> >>> for ((i=0;i<10000;i++)) >>> do >>> echo "loop $i started" >>> /usr/local/xfstests/src/aio-dio-regress/aio-last-ref-held- >>> by-io 1 100 /mnt/nfsscratch/aio-testfile >>> sleep 5 >>> done >>> >>> The command aio-last-ref-held-by-io from xfstests will spawn 100 >>> threads. Each thread >>> repeated doing open/io_submit/close/io_getevents/io_destroy until >>> the parent exist >>> because of timeout. >>> >>> >>> Here is the log when the task hung. >>> >>> [10841.121523] nfs4_close_prepare: begin! >>> [10841.121526] --> nfs4_alloc_slot used_slots=0000 >>> highest_used=4294967295 max_slots=10 >>> [10841.121528] <-- nfs4_alloc_slot used_slots=0001 highest_used=0 >>> slotid=0 >>> [10841.121530] nfs4_close_prepare: done! >>> [10841.121536] encode_sequence: sessionid=1551405749:3822331530:1:0 >>> seqid=59495845 slotid=0 max_slotid=0 cache_this=1 >>> [10841.121583] nfsd_dispatch: vers 4 proc 1 >>> [10841.121588] __find_in_sessionid_hashtbl: >>> 1551405749:3822331530:1:0 >>> [10841.121590] nfsd4_sequence: slotid 0 >>> [10841.121592] check_slot_seqid enter. 
seqid 59495845 slot_seqid >>> 59495844 >>> [10841.121596] nfsd: fh_verify(28: 01060001 0c66c6db d74ea4c9 >>> 61e12d9e c03beba6 0000000b) >>> [10841.121607] nfsd: fh_verify(28: 01060001 0c66c6db d74ea4c9 >>> 61e12d9e c03beba6 0000000b) >>> [10841.121613] NFSD: nfsd4_close on file aio-testfile >>> [10841.121615] NFSD: nfs4_preprocess_seqid_op: seqid=0 stateid = >>> (5c7892b5/e3d4268a/000ab905/0000746b) >>> [10841.121619] nfsv4 compound returned 10024 >>> [10841.121621] --> nfsd4_store_cache_entry slot 00000000f540ddd9 >>> [10841.121625] renewing client (clientid 5c7892b5/e3d4268a) >>> [10841.122526] decode_attr_type: type=00 >>> ...... >>> [10841.122558] nfs4_close_done: begin! >>> [10841.122561] --> nfs4_alloc_slot used_slots=0001 highest_used=0 >>> max_slots=10 >>> [10841.122564] <-- nfs4_alloc_slot used_slots=0003 highest_used=1 >>> slotid=1 >>> [10841.122566] nfs4_free_slot: slotid 1 highest_used_slotid 0 >>> [10841.122568] nfs41_sequence_process: Error 0 free the slot >>> [10841.122571] nfs4_free_slot: slotid 0 highest_used_slotid >>> 4294967295 >>> [10841.122575] nfs4_close_done: done, ret = 0! >>> [10841.122584] nfs4_close_prepare: begin! >>> ...... >>> Client keep retry the CLOSE. >>> >>>> I have seen trace where close would get ERR_OLD_STATEID and would >>>> still retry with the same open state until it got a reply to the >>>> OPEN >>>> which changed the state and when the client received reply to >>>> that, >>>> it'll retry the CLOSE with the updated stateid. >>>> >>> >>> Yes, you are right. I have also noticed that. But what happened if >>> the >>> OPEN task exit receiving a signal? How can a client update the >>> stateid? >>> > > static void nfs4_open_release(void *calldata) > { > struct > nfs4_opendata *data = calldata; > struct nfs4_state *state = NULL; > > /* If this request hasn't been cancelled, do nothing */ > if (!data->cancelled) > goto out_free; > /* In case of error, no cleanup! 
*/ > if (data->rpc_status != 0 || !data->rpc_done) > goto out_free; > /* In case we need an open_confirm, no cleanup! */ > if (data->o_res.rflags & NFS4_OPEN_RESULT_CONFIRM) > goto out_free; > state = nfs4_opendata_to_nfs4_state(data); > if (!IS_ERR(state)) > nfs4_close_state(state, data->o_arg.fmode); > out_free: > nfs4_opendata_put(data); > } > > Note what happens when the 'data->cancelled' flag is set. This is why > we always ensure that stateful operations such as OPEN, and LOCK,... > are run as asynchronous RPC tasks. I noticed that when receiving a signal, the rpc_status will be set to -ERESTARTSYS in rpc_exit->rpc_exit_task->nfs4_open_done, which made stateid not updated in nfs4_open_release(). So is it possible that the stateid between the client and the server remain mismatch if the OPEN task received the signal right after transmitting the message or receiving the response, but before calling rpc_exit_task normally? Thanks, Jiufei > OPEN_DOWNGRADE, CLOSE and LOCKU are also asynchronous, but they are > inherently safe against signals since they do not establish new state > that needs to be torn down. > ^ permalink raw reply [flat|nested] 26+ messages in thread
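The decision logic of the quoted nfs4_open_release() can be reduced to a small pure function. This is only an illustrative restatement of the checks shown above (the function name is invented), but it makes the point in the question explicit: if rpc_status ends up nonzero (e.g. -ERESTARTSYS, which is 512), the cleanup path is skipped even when the server may already have established the new state:

```c
#include <assert.h>
#include <stdbool.h>

/* Sketch of the nfs4_open_release() checks quoted above, reduced to a
 * pure function for illustration.  Returns true when the client tears
 * down (closes) the state established by a cancelled OPEN. */
static bool cancelled_open_is_cleaned_up(bool cancelled, int rpc_status,
                                         bool rpc_done, bool need_confirm)
{
        if (!cancelled)
                return false;   /* request completed normally: nothing to do */
        if (rpc_status != 0 || !rpc_done)
                return false;   /* "In case of error, no cleanup!" */
        if (need_confirm)
                return false;   /* would need an OPEN_CONFIRM, no cleanup */
        return true;            /* nfs4_close_state() is called */
}
```

So a cancelled OPEN whose rpc_status was forced to -ERESTARTSYS takes the "no cleanup" branch, which is exactly the window being asked about.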
* Re: [bug report] task hang while testing xfstests generic/323 2019-03-04 15:20 ` Jiufei Xue @ 2019-03-04 15:50 ` Trond Myklebust 2019-03-05 5:09 ` Jiufei Xue 0 siblings, 1 reply; 26+ messages in thread From: Trond Myklebust @ 2019-03-04 15:50 UTC (permalink / raw) To: jiufei.xue, aglo; +Cc: bfields, Anna.Schumaker, linux-nfs, joseph.qi On Mon, 2019-03-04 at 23:20 +0800, Jiufei Xue wrote: > Hi Trond, > > On 2019/3/1 下午9:08, Trond Myklebust wrote: > > On Fri, 2019-03-01 at 16:49 +0800, Jiufei Xue wrote: > > > On 2019/3/1 下午1:08, Jiufei Xue wrote: > > > > Hi Olga, > > > > > > > > On 2019/3/1 上午6:26, Olga Kornievskaia wrote: > > > > > On Thu, Feb 28, 2019 at 5:11 AM Jiufei Xue < > > > > > jiufei.xue@linux.alibaba.com> wrote: > > > > > > Hi, > > > > > > > > > > > > when I tested xfstests/generic/323 with NFSv4.1 and v4.2, > > > > > > the > > > > > > task > > > > > > changed to zombie occasionally while a thread is hanging > > > > > > with > > > > > > the > > > > > > following stack: > > > > > > > > > > > > [<0>] rpc_wait_bit_killable+0x1e/0xa0 [sunrpc] > > > > > > [<0>] nfs4_do_close+0x21b/0x2c0 [nfsv4] > > > > > > [<0>] __put_nfs_open_context+0xa2/0x110 [nfs] > > > > > > [<0>] nfs_file_release+0x35/0x50 [nfs] > > > > > > [<0>] __fput+0xa2/0x1c0 > > > > > > [<0>] task_work_run+0x82/0xa0 > > > > > > [<0>] do_exit+0x2ac/0xc20 > > > > > > [<0>] do_group_exit+0x39/0xa0 > > > > > > [<0>] get_signal+0x1ce/0x5d0 > > > > > > [<0>] do_signal+0x36/0x620 > > > > > > [<0>] exit_to_usermode_loop+0x5e/0xc2 > > > > > > [<0>] do_syscall_64+0x16c/0x190 > > > > > > [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > > > > > [<0>] 0xffffffffffffffff > > > > > > > > > > > > Since commit 12f275cdd163(NFSv4: Retry CLOSE and > > > > > > DELEGRETURN on > > > > > > NFS4ERR_OLD_STATEID), the client will retry to close the > > > > > > file > > > > > > when > > > > > > stateid generation number in client is lower than server. 
> > > > > > > > > > > > The original intention of this commit is retrying the > > > > > > operation > > > > > > while > > > > > > racing with an OPEN. However, in this case the stateid > > > > > > generation remains > > > > > > mismatch forever. > > > > > > > > > > > > Any suggestions? > > > > > > > > > > Can you include a network trace of the failure? Is it > > > > > possible > > > > > that > > > > > the server has crashed on reply to the close and that's why > > > > > the > > > > > task > > > > > is hung? What server are you testing against? > > > > > > > > > I am testing the local server, the testcase is attached below. > > > > > > > > # uname -r > > > > 5.0.0-rc8 > > > > > > > > # cat /etc/exports > > > > /mnt/vdd *(rw,no_subtree_check,no_root_squash) > > > > > > > > # cat test.sh > > > > mount /dev/vdd /mnt/vdd > > > > systemctl start nfs > > > > mount -t nfs -o rw,relatime,vers=4.1 127.0.0.1:/mnt/vdd > > > > /mnt/nfsscratch > > > > > > > > for ((i=0;i<10000;i++)) > > > > do > > > > echo "loop $i started" > > > > /usr/local/xfstests/src/aio-dio-regress/aio-last-ref- > > > > held- > > > > by-io 1 100 /mnt/nfsscratch/aio-testfile > > > > sleep 5 > > > > done > > > > > > > > The command aio-last-ref-held-by-io from xfstests will spawn > > > > 100 > > > > threads. Each thread > > > > repeated doing open/io_submit/close/io_getevents/io_destroy > > > > until > > > > the parent exist > > > > because of timeout. > > > > > > > > > > > > Here is the log when the task hung. > > > > > > > > [10841.121523] nfs4_close_prepare: begin! > > > > [10841.121526] --> nfs4_alloc_slot used_slots=0000 > > > > highest_used=4294967295 max_slots=10 > > > > [10841.121528] <-- nfs4_alloc_slot used_slots=0001 > > > > highest_used=0 > > > > slotid=0 > > > > [10841.121530] nfs4_close_prepare: done! 
> > > > [10841.121536] encode_sequence: > > > > sessionid=1551405749:3822331530:1:0 > > > > seqid=59495845 slotid=0 max_slotid=0 cache_this=1 > > > > [10841.121583] nfsd_dispatch: vers 4 proc 1 > > > > [10841.121588] __find_in_sessionid_hashtbl: > > > > 1551405749:3822331530:1:0 > > > > [10841.121590] nfsd4_sequence: slotid 0 > > > > [10841.121592] check_slot_seqid enter. seqid 59495845 > > > > slot_seqid > > > > 59495844 > > > > [10841.121596] nfsd: fh_verify(28: 01060001 0c66c6db d74ea4c9 > > > > 61e12d9e c03beba6 0000000b) > > > > [10841.121607] nfsd: fh_verify(28: 01060001 0c66c6db d74ea4c9 > > > > 61e12d9e c03beba6 0000000b) > > > > [10841.121613] NFSD: nfsd4_close on file aio-testfile > > > > [10841.121615] NFSD: nfs4_preprocess_seqid_op: seqid=0 stateid > > > > = > > > > (5c7892b5/e3d4268a/000ab905/0000746b) > > > > [10841.121619] nfsv4 compound returned 10024 > > > > [10841.121621] --> nfsd4_store_cache_entry slot > > > > 00000000f540ddd9 > > > > [10841.121625] renewing client (clientid 5c7892b5/e3d4268a) > > > > [10841.122526] decode_attr_type: type=00 > > > > ...... > > > > [10841.122558] nfs4_close_done: begin! > > > > [10841.122561] --> nfs4_alloc_slot used_slots=0001 > > > > highest_used=0 > > > > max_slots=10 > > > > [10841.122564] <-- nfs4_alloc_slot used_slots=0003 > > > > highest_used=1 > > > > slotid=1 > > > > [10841.122566] nfs4_free_slot: slotid 1 highest_used_slotid 0 > > > > [10841.122568] nfs41_sequence_process: Error 0 free the slot > > > > [10841.122571] nfs4_free_slot: slotid 0 highest_used_slotid > > > > 4294967295 > > > > [10841.122575] nfs4_close_done: done, ret = 0! > > > > [10841.122584] nfs4_close_prepare: begin! > > > > ...... > > > > Client keep retry the CLOSE. 
> > > > > > > > > I have seen trace where close would get ERR_OLD_STATEID and > > > > > would > > > > > still retry with the same open state until it got a reply to > > > > > the > > > > > OPEN > > > > > which changed the state and when the client received reply to > > > > > that, > > > > > it'll retry the CLOSE with the updated stateid. > > > > > > > > > > > > > Yes, you are right. I have also noticed that. But what happened > > > > if > > > > the > > > > OPEN task exit receiving a signal? How can a client update the > > > > stateid? > > > > > > > > static void nfs4_open_release(void *calldata) > > { > > struct > > nfs4_opendata *data = calldata; > > struct nfs4_state *state = NULL; > > > > /* If this request hasn't been cancelled, do nothing */ > > if (!data->cancelled) > > goto out_free; > > /* In case of error, no cleanup! */ > > if (data->rpc_status != 0 || !data->rpc_done) > > goto out_free; > > /* In case we need an open_confirm, no cleanup! */ > > if (data->o_res.rflags & NFS4_OPEN_RESULT_CONFIRM) > > goto out_free; > > state = nfs4_opendata_to_nfs4_state(data); > > if (!IS_ERR(state)) > > nfs4_close_state(state, data->o_arg.fmode); > > out_free: > > nfs4_opendata_put(data); > > } > > > > Note what happens when the 'data->cancelled' flag is set. This is > > why > > we always ensure that stateful operations such as OPEN, and > > LOCK,... > > are run as asynchronous RPC tasks. > > I noticed that when receiving a signal, the rpc_status will be set to > -ERESTARTSYS in rpc_exit->rpc_exit_task->nfs4_open_done, which made > stateid > not updated in nfs4_open_release(). > > So is it possible that the stateid between the client and the server > remain > mismatch if the OPEN task received the signal right after > transmitting the > message or receiving the response, but before calling rpc_exit_task > normally? If the rpc_task could be signalled, yes. However I don't see how that can happen to an async RPC call. Have you found such a case? 
-- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@hammerspace.com ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [bug report] task hang while testing xfstests generic/323 2019-03-04 15:50 ` Trond Myklebust @ 2019-03-05 5:09 ` Jiufei Xue 2019-03-05 14:45 ` Trond Myklebust 0 siblings, 1 reply; 26+ messages in thread From: Jiufei Xue @ 2019-03-05 5:09 UTC (permalink / raw) To: Trond Myklebust, aglo; +Cc: bfields, Anna.Schumaker, linux-nfs, joseph.qi On 2019/3/4 下午11:50, Trond Myklebust wrote: > On Mon, 2019-03-04 at 23:20 +0800, Jiufei Xue wrote: >> Hi Trond, >> >> On 2019/3/1 下午9:08, Trond Myklebust wrote: >>> On Fri, 2019-03-01 at 16:49 +0800, Jiufei Xue wrote: >>>> On 2019/3/1 下午1:08, Jiufei Xue wrote: >>>>> Hi Olga, >>>>> >>>>> On 2019/3/1 上午6:26, Olga Kornievskaia wrote: >>>>>> On Thu, Feb 28, 2019 at 5:11 AM Jiufei Xue < >>>>>> jiufei.xue@linux.alibaba.com> wrote: >>>>>>> Hi, >>>>>>> >>>>>>> when I tested xfstests/generic/323 with NFSv4.1 and v4.2, >>>>>>> the >>>>>>> task >>>>>>> changed to zombie occasionally while a thread is hanging >>>>>>> with >>>>>>> the >>>>>>> following stack: >>>>>>> >>>>>>> [<0>] rpc_wait_bit_killable+0x1e/0xa0 [sunrpc] >>>>>>> [<0>] nfs4_do_close+0x21b/0x2c0 [nfsv4] >>>>>>> [<0>] __put_nfs_open_context+0xa2/0x110 [nfs] >>>>>>> [<0>] nfs_file_release+0x35/0x50 [nfs] >>>>>>> [<0>] __fput+0xa2/0x1c0 >>>>>>> [<0>] task_work_run+0x82/0xa0 >>>>>>> [<0>] do_exit+0x2ac/0xc20 >>>>>>> [<0>] do_group_exit+0x39/0xa0 >>>>>>> [<0>] get_signal+0x1ce/0x5d0 >>>>>>> [<0>] do_signal+0x36/0x620 >>>>>>> [<0>] exit_to_usermode_loop+0x5e/0xc2 >>>>>>> [<0>] do_syscall_64+0x16c/0x190 >>>>>>> [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 >>>>>>> [<0>] 0xffffffffffffffff >>>>>>> >>>>>>> Since commit 12f275cdd163(NFSv4: Retry CLOSE and >>>>>>> DELEGRETURN on >>>>>>> NFS4ERR_OLD_STATEID), the client will retry to close the >>>>>>> file >>>>>>> when >>>>>>> stateid generation number in client is lower than server. >>>>>>> >>>>>>> The original intention of this commit is retrying the >>>>>>> operation >>>>>>> while >>>>>>> racing with an OPEN. 
However, in this case the stateid >>>>>>> generation remains >>>>>>> mismatch forever. >>>>>>> >>>>>>> Any suggestions? >>>>>> >>>>>> Can you include a network trace of the failure? Is it >>>>>> possible >>>>>> that >>>>>> the server has crashed on reply to the close and that's why >>>>>> the >>>>>> task >>>>>> is hung? What server are you testing against? >>>>>> >>>>> I am testing the local server, the testcase is attached below. >>>>> >>>>> # uname -r >>>>> 5.0.0-rc8 >>>>> >>>>> # cat /etc/exports >>>>> /mnt/vdd *(rw,no_subtree_check,no_root_squash) >>>>> >>>>> # cat test.sh >>>>> mount /dev/vdd /mnt/vdd >>>>> systemctl start nfs >>>>> mount -t nfs -o rw,relatime,vers=4.1 127.0.0.1:/mnt/vdd >>>>> /mnt/nfsscratch >>>>> >>>>> for ((i=0;i<10000;i++)) >>>>> do >>>>> echo "loop $i started" >>>>> /usr/local/xfstests/src/aio-dio-regress/aio-last-ref- >>>>> held- >>>>> by-io 1 100 /mnt/nfsscratch/aio-testfile >>>>> sleep 5 >>>>> done >>>>> >>>>> The command aio-last-ref-held-by-io from xfstests will spawn >>>>> 100 >>>>> threads. Each thread >>>>> repeated doing open/io_submit/close/io_getevents/io_destroy >>>>> until >>>>> the parent exist >>>>> because of timeout. >>>>> >>>>> >>>>> Here is the log when the task hung. >>>>> >>>>> [10841.121523] nfs4_close_prepare: begin! >>>>> [10841.121526] --> nfs4_alloc_slot used_slots=0000 >>>>> highest_used=4294967295 max_slots=10 >>>>> [10841.121528] <-- nfs4_alloc_slot used_slots=0001 >>>>> highest_used=0 >>>>> slotid=0 >>>>> [10841.121530] nfs4_close_prepare: done! >>>>> [10841.121536] encode_sequence: >>>>> sessionid=1551405749:3822331530:1:0 >>>>> seqid=59495845 slotid=0 max_slotid=0 cache_this=1 >>>>> [10841.121583] nfsd_dispatch: vers 4 proc 1 >>>>> [10841.121588] __find_in_sessionid_hashtbl: >>>>> 1551405749:3822331530:1:0 >>>>> [10841.121590] nfsd4_sequence: slotid 0 >>>>> [10841.121592] check_slot_seqid enter. 
seqid 59495845 >>>>> slot_seqid >>>>> 59495844 >>>>> [10841.121596] nfsd: fh_verify(28: 01060001 0c66c6db d74ea4c9 >>>>> 61e12d9e c03beba6 0000000b) >>>>> [10841.121607] nfsd: fh_verify(28: 01060001 0c66c6db d74ea4c9 >>>>> 61e12d9e c03beba6 0000000b) >>>>> [10841.121613] NFSD: nfsd4_close on file aio-testfile >>>>> [10841.121615] NFSD: nfs4_preprocess_seqid_op: seqid=0 stateid >>>>> = >>>>> (5c7892b5/e3d4268a/000ab905/0000746b) >>>>> [10841.121619] nfsv4 compound returned 10024 >>>>> [10841.121621] --> nfsd4_store_cache_entry slot >>>>> 00000000f540ddd9 >>>>> [10841.121625] renewing client (clientid 5c7892b5/e3d4268a) >>>>> [10841.122526] decode_attr_type: type=00 >>>>> ...... >>>>> [10841.122558] nfs4_close_done: begin! >>>>> [10841.122561] --> nfs4_alloc_slot used_slots=0001 >>>>> highest_used=0 >>>>> max_slots=10 >>>>> [10841.122564] <-- nfs4_alloc_slot used_slots=0003 >>>>> highest_used=1 >>>>> slotid=1 >>>>> [10841.122566] nfs4_free_slot: slotid 1 highest_used_slotid 0 >>>>> [10841.122568] nfs41_sequence_process: Error 0 free the slot >>>>> [10841.122571] nfs4_free_slot: slotid 0 highest_used_slotid >>>>> 4294967295 >>>>> [10841.122575] nfs4_close_done: done, ret = 0! >>>>> [10841.122584] nfs4_close_prepare: begin! >>>>> ...... >>>>> Client keep retry the CLOSE. >>>>> >>>>>> I have seen trace where close would get ERR_OLD_STATEID and >>>>>> would >>>>>> still retry with the same open state until it got a reply to >>>>>> the >>>>>> OPEN >>>>>> which changed the state and when the client received reply to >>>>>> that, >>>>>> it'll retry the CLOSE with the updated stateid. >>>>>> >>>>> >>>>> Yes, you are right. I have also noticed that. But what happened >>>>> if >>>>> the >>>>> OPEN task exit receiving a signal? How can a client update the >>>>> stateid? 
>>>>> >>> >>> static void nfs4_open_release(void *calldata) >>> { >>> struct >>> nfs4_opendata *data = calldata; >>> struct nfs4_state *state = NULL; >>> >>> /* If this request hasn't been cancelled, do nothing */ >>> if (!data->cancelled) >>> goto out_free; >>> /* In case of error, no cleanup! */ >>> if (data->rpc_status != 0 || !data->rpc_done) >>> goto out_free; >>> /* In case we need an open_confirm, no cleanup! */ >>> if (data->o_res.rflags & NFS4_OPEN_RESULT_CONFIRM) >>> goto out_free; >>> state = nfs4_opendata_to_nfs4_state(data); >>> if (!IS_ERR(state)) >>> nfs4_close_state(state, data->o_arg.fmode); >>> out_free: >>> nfs4_opendata_put(data); >>> } >>> >>> Note what happens when the 'data->cancelled' flag is set. This is >>> why >>> we always ensure that stateful operations such as OPEN, and >>> LOCK,... >>> are run as asynchronous RPC tasks. >> >> I noticed that when receiving a signal, the rpc_status will be set to >> -ERESTARTSYS in rpc_exit->rpc_exit_task->nfs4_open_done, which made >> stateid >> not updated in nfs4_open_release(). >> >> So is it possible that the stateid between the client and the server >> remain >> mismatch if the OPEN task received the signal right after >> transmitting the >> message or receiving the response, but before calling rpc_exit_task >> normally? > > If the rpc_task could be signalled, yes. However I don't see how that > can happen to an async RPC call. Have you found such a case? > I haven't found such a case, just guess :) But now I have found something new that may helpful. When the problem happened, rpc_status was set to -121 which returned from nfs4_xdr_dec_open->decode_putfh. I think it is the cause that client can not update the stateid. kworker/u4:0-7 [001] .... 559.920542: nfs4_xdr_dec_open: status=0 openstateid=40289:0xd4bdfeb7 kworker/1:0-2929 [001] .... 559.920550: nfs4_open_stateid_update: error=0 (OK) fileid=00:2b:12 fhandle=0x1e4566cb stateid=40289:0xd4bdfeb7 kworker/1:0-2929 [001] .... 
559.920551: nfs4_open_release: cancel=1 rpc_status=0 rpc_done=1 openstateid=40289:0xd4bdfeb7

nfsd-2741  [001] ....   559.920571: nfsd_compound_status: op=1/6 OP_SEQUENCE status=0
nfsd-2741  [001] ....   559.920575: nfsd_compound_status: op=2/6 OP_PUTFH status=0
nfsd-2741  [001] ....   559.920581: nfsd_inc_and_copy_stateid: client 5c7df867:90134041 stateid 00031b7d:00009d62 // 40290
nfsd-2741  [001] ....   559.920584: nfsd_compound_status: op=3/6 OP_OPEN status=0
nfsd-2741  [001] ....   559.920584: nfsd_compound_status: op=4/6 OP_GETFH status=0
nfsd-2741  [001] ....   559.920585: nfsd_compound_status: op=5/6 OP_ACCESS status=0
nfsd-2741  [001] ....   559.920588: nfsd_compound_status: op=6/6 OP_GETATTR status=0
kworker/u4:0-7     [001] ....   559.920622: nfs4_xdr_dec_open: status=-121 openstateid=0:0x7bd5c66f // added for debug
kworker/1:0-2929   [001] ....   559.920628: nfs4_open_release: cancel=1 rpc_status=-121 rpc_done=1 openstateid=0:0x7bd5c66f // added for debug

And we got this error in dmesg:
[  559.834808] nfs: Server returned operation 53 but we issued a request for 22
[  559.835741] nfs: Server returned operation 53 but we issued a request for 22

However, I don't know why the client received two OP_SEQUENCE responses. Any ideas?

Thanks,
Jiufei

^ permalink raw reply	[flat|nested] 26+ messages in thread
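For context on the -121: that is -EREMOTEIO, and the dmesg line quoted above is what the client's XDR decoder emits when the op number in a COMPOUND reply does not match the op it encoded (operation 53 is OP_SEQUENCE, 22 is OP_PUTFH). A simplified sketch of that check, modeled loosely on the kernel's decode_op_hdr() — the names prefixed TOY_ and the function layout here are illustrative, not the kernel's exact code:

```c
#include <assert.h>
#include <stdio.h>

/* Toy constants matching the values in the trace: errno 121 is
 * EREMOTEIO on Linux; 22 and 53 are the NFSv4 op numbers for PUTFH and
 * SEQUENCE respectively. */
enum { TOY_EREMOTEIO = 121, TOY_OP_PUTFH = 22, TOY_OP_SEQUENCE = 53 };

/* Sanity check performed on each op header while decoding a COMPOUND
 * reply: the op number on the wire must match the op that was encoded
 * in the request, otherwise the whole decode fails with -EREMOTEIO. */
static int check_op_hdr(unsigned int expected, unsigned int received)
{
        if (received != expected) {
                printf("nfs: Server returned operation %u but we issued a request for %u\n",
                       received, expected);
                return -TOY_EREMOTEIO;
        }
        return 0;
}
```

A stray or duplicated SEQUENCE reply landing where a PUTFH result was expected produces exactly the logged message and the rpc_status=-121 seen in the trace, which is why the decode fails before the new open stateid can be processed.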
* Re: [bug report] task hang while testing xfstests generic/323 2019-03-05 5:09 ` Jiufei Xue @ 2019-03-05 14:45 ` Trond Myklebust 2019-03-06 9:59 ` Jiufei Xue 0 siblings, 1 reply; 26+ messages in thread From: Trond Myklebust @ 2019-03-05 14:45 UTC (permalink / raw) To: jiufei.xue, aglo; +Cc: bfields, Anna.Schumaker, linux-nfs, joseph.qi On Tue, 2019-03-05 at 13:09 +0800, Jiufei Xue wrote: > But now I have found something new that may helpful. When the problem > happened, rpc_status was set to -121 which returned from > nfs4_xdr_dec_open->decode_putfh. I think it is the cause that client > can > not update the stateid. > > kworker/u4:0-7 [001] .... 559.920542: nfs4_xdr_dec_open: > status=0 openstateid=40289:0xd4bdfeb7 > kworker/1:0-2929 [001] .... 559.920550: > nfs4_open_stateid_update: error=0 (OK) fileid=00:2b:12 > fhandle=0x1e4566cb stateid=40289:0xd4bdfeb7 > kworker/1:0-2929 [001] .... 559.920551: nfs4_open_release: > cancel=1 rpc_status=0 rpc_done=1 openstateid=40289:0xd4bdfeb7 > > nfsd-2741 [001] .... 559.920571: nfsd_compound_status: > op=1/6 OP_SEQUENCE status=0 > nfsd-2741 [001] .... 559.920575: nfsd_compound_status: > op=2/6 OP_PUTFH status=0 > nfsd-2741 [001] .... 559.920581: > nfsd_inc_and_copy_stateid: client 5c7df867:90134041 stateid > 00031b7d:00009d62 // 40290 > nfsd-2741 [001] .... 559.920584: nfsd_compound_status: > op=3/6 OP_OPEN status=0 > nfsd-2741 [001] .... 559.920584: nfsd_compound_status: > op=4/6 OP_GETFH status=0 > nfsd-2741 [001] .... 559.920585: nfsd_compound_status: > op=5/6 OP_ACCESS status=0 > nfsd-2741 [001] .... 559.920588: nfsd_compound_status: > op=6/6 OP_GETATTR status=0 > kworker/u4:0-7 [001] .... 559.920622: nfs4_xdr_dec_open: > status=-121 openstateid=0:0x7bd5c66f // added for debug > kworker/1:0-2929 [001] .... 
559.920628: nfs4_open_release: > cancel=1 rpc_status=-121 rpc_done=1 openstateid=0:0x7bd5c66f // added > for debug > > And we got this error in dmesg: > [ 559.834808] nfs: Server returned operation 53 but we issued a > request for 22 > [ 559.835741] nfs: Server returned operation 53 but we issued a > request for 22 > > However, I don't kown why the client received two OP_SEQUENCE > response. > Any ideas? > No, that's quite weird... Were you able to catch this behaviour on wireshark? It would be interesting to see if the server is actually sending 2 SEQUENCE ops or if that is a corruption that occurs on the client. -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@hammerspace.com ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [bug report] task hang while testing xfstests generic/323 2019-03-05 14:45 ` Trond Myklebust @ 2019-03-06 9:59 ` Jiufei Xue 2019-03-06 16:09 ` bfields 0 siblings, 1 reply; 26+ messages in thread From: Jiufei Xue @ 2019-03-06 9:59 UTC (permalink / raw) To: Trond Myklebust, aglo; +Cc: bfields, Anna.Schumaker, linux-nfs, joseph.qi

Hi Trond,

>
> No, that's quite weird... Were you able to catch this behaviour on
> wireshark? It would be interesting to see if the server is actually
> sending 2 SEQUENCE ops or if that is a corruption that occurs on the
> client.
>

Many packets are dropped by the kernel when I use tcpdump, so I am not sure whether the server is sending 2 SEQUENCE ops. However, according to the nfsd_compound_status tracepoint, the server processed the SEQUENCE op only once, so I think the client is more likely to blame.

Is there any other way to catch this behaviour?

Thanks,
Jiufei

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: [bug report] task hang while testing xfstests generic/323 2019-03-06 9:59 ` Jiufei Xue @ 2019-03-06 16:09 ` bfields 2019-03-10 22:20 ` Olga Kornievskaia 0 siblings, 1 reply; 26+ messages in thread From: bfields @ 2019-03-06 16:09 UTC (permalink / raw) To: Jiufei Xue; +Cc: Trond Myklebust, aglo, Anna.Schumaker, linux-nfs, joseph.qi

On Wed, Mar 06, 2019 at 05:59:39PM +0800, Jiufei Xue wrote:
> Hi Trond,
>
> > No, that's quite weird... Were you able to catch this behaviour on
> > wireshark? It would be interesting to see if the server is actually
> > sending 2 SEQUENCE ops or if that is a corruption that occurs on the
> > client.
>
> Many packages are dropped by kernel when I use tcpdump, so I am not sure
> whether the server is sending 2 SEQUENCE.

For what it's worth, tcpdump with -s200 or so should be enough to get this. Or maybe capturing from a different machine could help?

> However, according to tracepoint nfsd_compound_status, the server is
> processing SEQUENCE op only once. I think it is more likely blames to
> the client.

If there was a bug in nfsd's xdr decoding, that'd happen after that tracepoint. That'd be weird corruption, but perhaps all the possibilities are weird at this point.

--b.

> Is there any other way to catch the behaviour?
>
> Thanks,
> Jiufei

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: [bug report] task hang while testing xfstests generic/323 2019-03-06 16:09 ` bfields @ 2019-03-10 22:20 ` Olga Kornievskaia 2019-03-11 14:30 ` Trond Myklebust 0 siblings, 1 reply; 26+ messages in thread From: Olga Kornievskaia @ 2019-03-10 22:20 UTC (permalink / raw) To: bfields; +Cc: Jiufei Xue, Trond Myklebust, Anna.Schumaker, linux-nfs, joseph.qi On Wed, Mar 6, 2019 at 11:09 AM bfields@fieldses.org <bfields@fieldses.org> wrote: > > On Wed, Mar 06, 2019 at 05:59:39PM +0800, Jiufei Xue wrote: > > Hi Trond, > > > > > > > > No, that's quite weird... Were you able to catch this behaviour on > > > wireshark? It would be interesting to see if the server is actually > > > sending 2 SEQUENCE ops or if that is a corruption that occurs on the > > > client. > > > > > > > Many packages are dropped by kernel when I use tcpdump, so I am not sure > > whether the server is sending 2 SEQUENCE. > > For what it's worth, tcpdump with -s200 or so should be enough to get > this. Or maybe capturing from a different machine could help? > > > However, according to tracepoint nfsd_compound_status, the server is > > processing SEQUENCE op only once. I think it is more likely blames to > > the client. > > If there was a bug in nfsd's xdr decoding, that'd happen after that > tracepoint. That'd be weird corruption, but perhaps all the > possibilites are weird at this point. This is not a bug in the Linux nfsd. I was able to reproduce it against the Ontap server. I ran using the RHEL7.6 client (but it's upsteram since Jiufei hit it). (I wanted to attach a trace but it looks like it's too big as my first attempt at posting this was unsuccessful. I can send a trace to a personal email if interested). There are a bunch of cases where multiple operations are using the same seqid and slot. Example of such weirdness is (nfs.seqid == 0x000002f4) && (nfs.slotid == 0) and the one leading to the hang. 
In frame 415870, there is an OPEN using that seqid and slot for the first time (but this slot will be re-used a bunch of times before it gets a reply in frame 415908 with the open stateid seq=40). (Also in this packet there is an example of reuse of slot=1+seqid=0x000128f7 by both TEST_STATEID and OPEN, but let's set that aside.)

In frame 415874 (in the same packet), the client sends 5 opens on the SAME seqid and slot (all have distinct xids). In a way that ends up being all right, since the opens are for the same file and are therefore answered out of the reply cache, and the reply is ERR_DELAY. But in frame 415876, the client again uses the same seqid and slot, and this time it is used by 3 opens and a test_stateid.

The client in all this mess never processes the open stateid seq=40 and keeps on resending CLOSE with seq=37. (Also note the client "missed" processing seqid=38 and 39 as well; 39 probably because it was a reply on the same kind of "reused" slot=1 and seq=0x000128f7. I haven't tracked 38, but I'm assuming it's the same.) I don't know how many times, but after 5 mins I see a TEST_STATEID that again uses the same seqid+slot (which gets a reply from the cache matching OPEN). There is also an open + close (still with seq=37); the open is replied to, but after this the client goes into a soft lockup: the logs have "nfs4_schedule_state_manager: kthread_ruan: -4" over and over again, then a soft lockup.

Looking back on slot 0: nfs.seqid=0x000002f3 was used in frame 415866 by the TEST_STATEID. This is replied to in frame 415877 (with an ERR_DELAY). But before the client got a reply, it reused the slot and the seq in frame 415874. TEST_STATEID is a synchronous and interruptible operation. I suspect that somehow it was interrupted, and that's how the slot was able to be re-used by frame 415874. But how the several opens were able to get the same slot I don't know.

Regardless of how we got into the soft lockup, the client shouldn't be using the same slot+seqid on multiple operations.

>
> --b.
> > > Is there any other way to catch the behaviour? > > > > Thanks, > > Jiufei > > ^ permalink raw reply [flat|nested] 26+ messages in thread
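The sequencing rule the analysis above relies on can be sketched as a toy model (plain Python, not kernel code; all names here are invented for illustration): each session slot carries a seqid, a new request must use the last seqid + 1, and a slot must not carry a second request until the reply to the first has been processed.

```python
# Toy model of NFSv4.1 per-slot sequencing (not the sunrpc implementation).
# Invariant: a slot is "busy" from the moment a request is sent until its
# reply is processed; only then may it be reacquired, with seqid bumped by 1.

class Slot:
    def __init__(self, slot_id):
        self.slot_id = slot_id
        self.seqid = 0          # last seqid handed out on this slot
        self.busy = False       # an RPC is outstanding on this slot

class SlotTable:
    def __init__(self, nslots):
        self.slots = [Slot(i) for i in range(nslots)]

    def acquire(self):
        """Pick a free slot and bump its seqid for a new request."""
        for s in self.slots:
            if not s.busy:
                s.busy = True
                s.seqid += 1
                return (s.slot_id, s.seqid)
        return None  # no free slot: the caller must wait

    def complete(self, slot_id):
        """Reply processed: the slot may be reused (with seqid+1)."""
        self.slots[slot_id].busy = False

table = SlotTable(2)
assert table.acquire() == (0, 1)   # e.g. an OPEN on slot 0, seqid 1
assert table.acquire() == (1, 1)   # e.g. a TEST_STATEID on slot 1
assert table.acquire() is None     # both busy: nothing may reuse them
table.complete(0)
assert table.acquire() == (0, 2)   # only after the reply is slot 0 reusable
```

The bug described in the capture corresponds to handing out a slot that is still busy: frames 415874/415876 reuse slot 0 with the same seqid while the reply to the first OPEN is still outstanding, which this model simply refuses to allow.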
* Re: [bug report] task hang while testing xfstests generic/323 2019-03-10 22:20 ` Olga Kornievskaia @ 2019-03-11 14:30 ` Trond Myklebust 2019-03-11 15:07 ` Olga Kornievskaia 2019-03-11 15:12 ` Trond Myklebust 0 siblings, 2 replies; 26+ messages in thread From: Trond Myklebust @ 2019-03-11 14:30 UTC (permalink / raw) To: bfields, aglo; +Cc: jiufei.xue, Anna.Schumaker, linux-nfs, joseph.qi Hi Olga, On Sun, 2019-03-10 at 18:20 -0400, Olga Kornievskaia wrote: > There are a bunch of cases where multiple operations are using the > same seqid and slot. > > Example of such weirdness is (nfs.seqid == 0x000002f4) && (nfs.slotid > == 0) and the one leading to the hang. > > In frame 415870, there is an OPEN using that seqid and slot for the > first time (but this slot will be re-used a bunch of times before it > gets a reply in frame 415908 with the open stateid seq=40). (also in > this packet there is an example of reuse slot=1+seqid=0x000128f7 by > both TEST_STATEID and OPEN but let's set that aside). > > In frame 415874 (in the same packet), client sends 5 opens on the > SAME > seqid and slot (all have distinct xids). In a ways that's end up > being > alright since opens are for the same file and thus reply out of the > cache and the reply is ERR_DELAY. But in frame 415876, client sends > again uses the same seqid and slot and in this case it's used by > 3opens and a test_stateid. > > Client in all this mess never processes the open stateid seq=40 and > keeps on resending CLOSE with seq=37 (also to note client "missed" > processing seqid=38 and 39 as well. 39 probably because it was a > reply > on the same kind of "Reused" slot=1 and seq=0x000128f7. I haven't > tracked 38 but i'm assuming it's the same). I don't know how many > times but after 5mins, I see a TEST_STATEID that again uses the same > seqid+slot (which gets a reply from the cache matching OPEN). 
Also > open + close (still with seq=37) open is replied to but after this > client goes into a soft lockup logs have > "nfs4_schedule_state_manager: > kthread_ruan: -4" over and over again . then a soft lockup. > > Looking back on slot 0. nfs.seqid=0x000002f3 was used in frame=415866 > by the TEST_STATEID. This is replied to in frame 415877 (with an > ERR_DELAY). But before the client got a reply, it used the slot and > the seq by frame 415874. TEST_STATEID is a synchronous and > interruptible operation. I'm suspecting that somehow it was > interrupted and that's who the slot was able to be re-used by the > frame 415874. But how the several opens were able to get the same > slot > I don't know.. Is this still true with the current linux-next? I would expect this patch http://git.linux-nfs.org/?p=trondmy/linux-nfs.git;a=commitdiff;h=3453d5708b33efe76f40eca1c0ed60923094b971 to change the Linux client behaviour in the above regard. Cheers Trond -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@hammerspace.com ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [bug report] task hang while testing xfstests generic/323 2019-03-11 14:30 ` Trond Myklebust @ 2019-03-11 15:07 ` Olga Kornievskaia 2019-03-11 15:13 ` Olga Kornievskaia 2019-03-11 15:12 ` Trond Myklebust 1 sibling, 1 reply; 26+ messages in thread From: Olga Kornievskaia @ 2019-03-11 15:07 UTC (permalink / raw) To: Trond Myklebust; +Cc: bfields, jiufei.xue, Anna.Schumaker, linux-nfs, joseph.qi On Mon, Mar 11, 2019 at 10:30 AM Trond Myklebust <trondmy@hammerspace.com> wrote: > > Hi Olga, > > On Sun, 2019-03-10 at 18:20 -0400, Olga Kornievskaia wrote: > > There are a bunch of cases where multiple operations are using the > > same seqid and slot. > > > > Example of such weirdness is (nfs.seqid == 0x000002f4) && (nfs.slotid > > == 0) and the one leading to the hang. > > > > In frame 415870, there is an OPEN using that seqid and slot for the > > first time (but this slot will be re-used a bunch of times before it > > gets a reply in frame 415908 with the open stateid seq=40). (also in > > this packet there is an example of reuse slot=1+seqid=0x000128f7 by > > both TEST_STATEID and OPEN but let's set that aside). > > > > In frame 415874 (in the same packet), client sends 5 opens on the > > SAME > > seqid and slot (all have distinct xids). In a ways that's end up > > being > > alright since opens are for the same file and thus reply out of the > > cache and the reply is ERR_DELAY. But in frame 415876, client sends > > again uses the same seqid and slot and in this case it's used by > > 3opens and a test_stateid. > > > > Client in all this mess never processes the open stateid seq=40 and > > keeps on resending CLOSE with seq=37 (also to note client "missed" > > processing seqid=38 and 39 as well. 39 probably because it was a > > reply > > on the same kind of "Reused" slot=1 and seq=0x000128f7. I haven't > > tracked 38 but i'm assuming it's the same). 
I don't know how many > > times but after 5mins, I see a TEST_STATEID that again uses the same > > seqid+slot (which gets a reply from the cache matching OPEN). Also > > open + close (still with seq=37) open is replied to but after this > > client goes into a soft lockup logs have > > "nfs4_schedule_state_manager: > > kthread_ruan: -4" over and over again . then a soft lockup. > > > > Looking back on slot 0. nfs.seqid=0x000002f3 was used in frame=415866 > > by the TEST_STATEID. This is replied to in frame 415877 (with an > > ERR_DELAY). But before the client got a reply, it used the slot and > > the seq by frame 415874. TEST_STATEID is a synchronous and > > interruptible operation. I'm suspecting that somehow it was > > interrupted and that's who the slot was able to be re-used by the > > frame 415874. But how the several opens were able to get the same > > slot > > I don't know.. > > Is this still true with the current linux-next? I would expect this > patch > http://git.linux-nfs.org/?p=trondmy/linux-nfs.git;a=commitdiff;h=3453d5708b33efe76f40eca1c0ed60923094b971 > to change the Linux client behaviour in the above regard. Yes. I reproduced it against your origin/testing (5.0.0-rc7+) commit 0d1bf3407c4ae88 ("SUNRPC: allow dynamic allocation of back channel slots"). > > Cheers > Trond > > -- > Trond Myklebust > Linux NFS client maintainer, Hammerspace > trond.myklebust@hammerspace.com > > ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [bug report] task hang while testing xfstests generic/323 2019-03-11 15:07 ` Olga Kornievskaia @ 2019-03-11 15:13 ` Olga Kornievskaia 2019-03-15 6:30 ` Jiufei Xue 0 siblings, 1 reply; 26+ messages in thread From: Olga Kornievskaia @ 2019-03-11 15:13 UTC (permalink / raw) To: Trond Myklebust; +Cc: bfields, jiufei.xue, Anna.Schumaker, linux-nfs, joseph.qi On Mon, Mar 11, 2019 at 11:07 AM Olga Kornievskaia <aglo@umich.edu> wrote: > > On Mon, Mar 11, 2019 at 10:30 AM Trond Myklebust > <trondmy@hammerspace.com> wrote: > > > > Hi Olga, > > > > On Sun, 2019-03-10 at 18:20 -0400, Olga Kornievskaia wrote: > > > There are a bunch of cases where multiple operations are using the > > > same seqid and slot. > > > > > > Example of such weirdness is (nfs.seqid == 0x000002f4) && (nfs.slotid > > > == 0) and the one leading to the hang. > > > > > > In frame 415870, there is an OPEN using that seqid and slot for the > > > first time (but this slot will be re-used a bunch of times before it > > > gets a reply in frame 415908 with the open stateid seq=40). (also in > > > this packet there is an example of reuse slot=1+seqid=0x000128f7 by > > > both TEST_STATEID and OPEN but let's set that aside). > > > > > > In frame 415874 (in the same packet), client sends 5 opens on the > > > SAME > > > seqid and slot (all have distinct xids). In a ways that's end up > > > being > > > alright since opens are for the same file and thus reply out of the > > > cache and the reply is ERR_DELAY. But in frame 415876, client sends > > > again uses the same seqid and slot and in this case it's used by > > > 3opens and a test_stateid. > > > > > > Client in all this mess never processes the open stateid seq=40 and > > > keeps on resending CLOSE with seq=37 (also to note client "missed" > > > processing seqid=38 and 39 as well. 39 probably because it was a > > > reply > > > on the same kind of "Reused" slot=1 and seq=0x000128f7. I haven't > > > tracked 38 but i'm assuming it's the same). 
I don't know how many > > > times but after 5mins, I see a TEST_STATEID that again uses the same > > > seqid+slot (which gets a reply from the cache matching OPEN). Also > > > open + close (still with seq=37) open is replied to but after this > > > client goes into a soft lockup logs have > > > "nfs4_schedule_state_manager: > > > kthread_ruan: -4" over and over again . then a soft lockup. > > > > > > Looking back on slot 0. nfs.seqid=0x000002f3 was used in frame=415866 > > > by the TEST_STATEID. This is replied to in frame 415877 (with an > > > ERR_DELAY). But before the client got a reply, it used the slot and > > > the seq by frame 415874. TEST_STATEID is a synchronous and > > > interruptible operation. I'm suspecting that somehow it was > > > interrupted and that's who the slot was able to be re-used by the > > > frame 415874. But how the several opens were able to get the same > > > slot > > > I don't know.. > > > > Is this still true with the current linux-next? I would expect this > > patch > > http://git.linux-nfs.org/?p=trondmy/linux-nfs.git;a=commitdiff;h=3453d5708b33efe76f40eca1c0ed60923094b971 > > to change the Linux client behaviour in the above regard. > > Yes. I reproduced it against your origin/testing (5.0.0-rc7+) commit > 0d1bf3407c4ae88 ("SUNRPC: allow dynamic allocation of back channel > slots"). Let me double check that. I have reproduced the "infinite loop" or CLOSE on the upstream (I'm looking thru the trace points from friday). > > > > > > Cheers > > Trond > > > > -- > > Trond Myklebust > > Linux NFS client maintainer, Hammerspace > > trond.myklebust@hammerspace.com > > > > ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [bug report] task hang while testing xfstests generic/323 2019-03-11 15:13 ` Olga Kornievskaia @ 2019-03-15 6:30 ` Jiufei Xue 2019-03-15 20:33 ` Olga Kornievskaia 0 siblings, 1 reply; 26+ messages in thread From: Jiufei Xue @ 2019-03-15 6:30 UTC (permalink / raw) To: Olga Kornievskaia, Trond Myklebust Cc: bfields, Anna.Schumaker, linux-nfs, joseph.qi Hi Olga, On 2019/3/11 11:13 PM, Olga Kornievskaia wrote: > Let me double check that. I have reproduced the "infinite loop" or > CLOSE on the upstream (I'm looking thru the trace points from friday). Did you try to capture the packets when you reproduced this issue on upstream? I still have packets dropped by the kernel after some adjustment according to bfields' suggestion :( Thanks, Jiufei ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [bug report] task hang while testing xfstests generic/323 2019-03-15 6:30 ` Jiufei Xue @ 2019-03-15 20:33 ` Olga Kornievskaia 2019-03-15 20:55 ` Trond Myklebust 2019-03-16 14:11 ` Jiufei Xue 0 siblings, 2 replies; 26+ messages in thread From: Olga Kornievskaia @ 2019-03-15 20:33 UTC (permalink / raw) To: Jiufei Xue; +Cc: Trond Myklebust, bfields, Anna.Schumaker, linux-nfs, joseph.qi On Fri, Mar 15, 2019 at 2:31 AM Jiufei Xue <jiufei.xue@linux.alibaba.com> wrote: > > Hi Olga, > > On 2019/3/11 11:13 PM, Olga Kornievskaia wrote: > > Let me double check that. I have reproduced the "infinite loop" or > > CLOSE on the upstream (I'm looking thru the trace points from friday). > > Do you try to capture the packages when reproduced this issue on the > upstream. I still lost kernel packages after some adjustment according > to bfield's suggestion :( Hi Jiufei, Yes, I have network trace captures, but they are too big to post to the mailing list. I have reproduced the problem on the latest upstream origin/testing branch commit "SUNRPC: Take the transport send lock before binding+connecting". As you have noted before, the infinite loop is due to the client "losing" an update to the seqid. One packet would send out a (recovery) OPEN with slot=0 seqid=Y. The tracepoint (nfs4_open_file) would log status=ERESTARTSYS. The rpc task would be sent and would receive a reply, but there is nobody there to receive it... The open that got a reply has an updated stateid seqid which the client never records. When CLOSE is sent, it's sent with the "old" stateid, which puts the client in an infinite loop. Btw, CLOSE is sent on the interrupted slot, which should get FALSE_RETRY, which causes the client to terminate the session. But it would still keep sending the CLOSE with the old stateid. 
Something I've noticed is that the TEST_STATEID op (as part of nfs41_test_and_free_expired_stateid()) for some reason always has a signal set even before issuing the RPC task, so the task never completes (ever). I always thought that OPENs can't be interrupted, but I guess they are, since they call rpc_wait_for_completion_task() and that's a killable event. But I don't know how to find out what's sending a signal to the process. I'm rather stuck trying to figure out where to go from here. So I'm still trying to figure out what's causing the signal, and also how to recover from it so that the client doesn't lose that seqid. > > Thanks, > Jiufei ^ permalink raw reply [flat|nested] 26+ messages in thread
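The CLOSE loop described above can be reduced to a toy model (hypothetical Python, not the nfs4 code): the server's stateid seqid advanced to 40, the client's cached copy is stuck at 37 because the reply carrying the update was never processed, and the retry logic from commit 12f275cdd163 resends CLOSE without ever learning a newer seqid.

```python
# Toy illustration (invented names, not kernel code) of the observed hang:
# CLOSE carrying a stale stateid seqid gets NFS4ERR_OLD_STATEID and is
# retried, but the client has no newer seqid to retry with.

def close_loop(client_seq, server_seq, max_tries):
    """Return the attempt number on which CLOSE succeeds, or None."""
    for attempt in range(1, max_tries + 1):
        if client_seq == server_seq:
            return attempt            # CLOSE accepted by the server
        # retry path: resend CLOSE, but the update that would have bumped
        # client_seq was lost, so nothing ever changes
    return None                       # still looping: the reported hang

assert close_loop(37, 37, 5) == 1     # normal case: first CLOSE succeeds
assert close_loop(37, 40, 1000) is None   # lost update: retried forever
```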
* Re: [bug report] task hang while testing xfstests generic/323 2019-03-15 20:33 ` Olga Kornievskaia @ 2019-03-15 20:55 ` Trond Myklebust 2019-03-16 14:11 ` Jiufei Xue 1 sibling, 0 replies; 26+ messages in thread From: Trond Myklebust @ 2019-03-15 20:55 UTC (permalink / raw) To: jiufei.xue, aglo; +Cc: bfields, Anna.Schumaker, linux-nfs, joseph.qi Hi Olga, On Fri, 2019-03-15 at 16:33 -0400, Olga Kornievskaia wrote: > On Fri, Mar 15, 2019 at 2:31 AM Jiufei Xue < > jiufei.xue@linux.alibaba.com> wrote: > > Hi Olga, > > > > On 2019/3/11 下午11:13, Olga Kornievskaia wrote: > > > Let me double check that. I have reproduced the "infinite loop" > > > or > > > CLOSE on the upstream (I'm looking thru the trace points from > > > friday). > > > > Do you try to capture the packages when reproduced this issue on > > the > > upstream. I still lost kernel packages after some adjustment > > according > > to bfield's suggestion :( > > Hi Jiufei, > > Yes I have network trace captures but they are too big to post to the > mailing list. I have reproduced the problem on the latest upstream > origin/testing branch commit "SUNRPC: Take the transport send lock > before binding+connecting". As you have noted before infinite loops > is > due to client "losing" an update to the seqid. > > one packet would send out an (recovery) OPEN with slot=0 seqid=Y. > tracepoint (nfs4_open_file) would log that status=ERESTARTSYS. The > rpc > task would be sent and the rpc task would receive a reply but there > is > nobody there to receive it... This open that got a reply has an > updated stateid seqid which client never updates. When CLOSE is sent, > it's sent with the "old" stateid and puts the client in an infinite > loop. Btw, CLOSE is sent on the interrupted slot which should get > FALSE_RETRY which causes the client to terminate the session. But it > would still keep sending the CLOSE with the old stateid. 
> > Some things I've noticed is that TEST_STATE op (as a part of the > nfs41_test_and _free_expired_stateid()) for some reason always has a > signal set even before issuing and RPC task so the task never > completes (ever). > > I always thought that OPEN's can't be interrupted but I guess they > are > since they call rpc_wait_for_completion_task() and that's a killable > event. But I don't know how to find out what's sending a signal to > the > process. I'm rather stuck here trying to figure out where to go from > there. So I'm still trying to figure out what's causing the signal or > also how to recover from it that the client doesn't lose that seqid. > Sending a fatal signal to a process waiting in rpc_wait_for_completion_task() will interrupt that wait, but it will not interrupt the running asynchronous RPC call. All it should do is switch the call to taking the 'data->cancelled == true' path in the RPC call clean up. Cheers Trond -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@hammerspace.com ^ permalink raw reply [flat|nested] 26+ messages in thread
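Trond's point about the 'data->cancelled == true' path can be sketched with a small model (plain Python with invented names, not the actual fs/nfs or net/sunrpc code): a fatal signal interrupts only the waiter, while the asynchronous task runs to completion and its callback takes a cleanup path instead of handing the result back.

```python
# Toy model of an interruptible wait on an async RPC (not kernel code).
# The waiter can be killed; the RPC itself is not, and its completion
# callback notices the "cancelled" flag and cleans up on its own.

class OpenCall:
    def __init__(self):
        self.cancelled = False
        self.result = None
        self.released = False

    def wait_for_completion(self, interrupted):
        """Models rpc_wait_for_completion_task(): a killable wait."""
        if interrupted:
            self.cancelled = True      # waiter gave up; task keeps running
            return -512                # an ERESTARTSYS-like error to caller
        return 0

    def rpc_done(self, reply):
        """Async completion: runs even if the waiter was interrupted."""
        self.result = reply
        if self.cancelled:
            self.released = True       # cleanup path: nobody consumes result

call = OpenCall()
status = call.wait_for_completion(interrupted=True)
call.rpc_done(reply={"stateid_seq": 40})
assert status == -512
assert call.cancelled and call.released
# The reply (seq=40) did arrive, but the interrupted opener never saw it,
# which is how the client can end up issuing CLOSE with a stale seq=37.
```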
* Re: [bug report] task hang while testing xfstests generic/323 2019-03-15 20:33 ` Olga Kornievskaia 2019-03-15 20:55 ` Trond Myklebust @ 2019-03-16 14:11 ` Jiufei Xue 2019-03-19 15:33 ` Olga Kornievskaia 1 sibling, 1 reply; 26+ messages in thread From: Jiufei Xue @ 2019-03-16 14:11 UTC (permalink / raw) To: Olga Kornievskaia Cc: Trond Myklebust, bfields, Anna.Schumaker, linux-nfs, joseph.qi Hi Olga, On 2019/3/16 上午4:33, Olga Kornievskaia wrote: > On Fri, Mar 15, 2019 at 2:31 AM Jiufei Xue <jiufei.xue@linux.alibaba.com> wrote: >> >> Hi Olga, >> >> On 2019/3/11 下午11:13, Olga Kornievskaia wrote: >>> Let me double check that. I have reproduced the "infinite loop" or >>> CLOSE on the upstream (I'm looking thru the trace points from friday). >> >> Do you try to capture the packages when reproduced this issue on the >> upstream. I still lost kernel packages after some adjustment according >> to bfield's suggestion :( > > Hi Jiufei, > > Yes I have network trace captures but they are too big to post to the > mailing list. I have reproduced the problem on the latest upstream > origin/testing branch commit "SUNRPC: Take the transport send lock > before binding+connecting". As you have noted before infinite loops is > due to client "losing" an update to the seqid. > > one packet would send out an (recovery) OPEN with slot=0 seqid=Y. > tracepoint (nfs4_open_file) would log that status=ERESTARTSYS. The rpc > task would be sent and the rpc task would receive a reply but there is > nobody there to receive it... This open that got a reply has an > updated stateid seqid which client never updates. When CLOSE is sent, > it's sent with the "old" stateid and puts the client in an infinite > loop. Btw, CLOSE is sent on the interrupted slot which should get > FALSE_RETRY which causes the client to terminate the session. But it > would still keep sending the CLOSE with the old stateid. 
> > Some things I've noticed is that TEST_STATE op (as a part of the > nfs41_test_and _free_expired_stateid()) for some reason always has a > signal set even before issuing and RPC task so the task never > completes (ever). > > I always thought that OPEN's can't be interrupted but I guess they are > since they call rpc_wait_for_completion_task() and that's a killable > event. But I don't know how to find out what's sending a signal to the > process. I'm rather stuck here trying to figure out where to go from > there. So I'm still trying to figure out what's causing the signal or > also how to recover from it that the client doesn't lose that seqid. > >> Thank you for your quick reply. I have also noticed the ERESTARTSYS status for the OPEN op, but I think it is returned by the open process, which is woken in nfs4_run_open_task(). I found that the decode routine nfs4_xdr_dec_open returned -121, which I think is the root cause of this problem. Could you please post the content of the last OPEN message? Thanks, Jiufei. >> Thanks, >> Jiufei > ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [bug report] task hang while testing xfstests generic/323 2019-03-16 14:11 ` Jiufei Xue @ 2019-03-19 15:33 ` Olga Kornievskaia 0 siblings, 0 replies; 26+ messages in thread From: Olga Kornievskaia @ 2019-03-19 15:33 UTC (permalink / raw) To: Jiufei Xue; +Cc: Trond Myklebust, bfields, Anna.Schumaker, linux-nfs, joseph.qi On Sat, Mar 16, 2019 at 10:11 AM Jiufei Xue <jiufei.xue@linux.alibaba.com> wrote: > > > Hi Olga, > On 2019/3/16 上午4:33, Olga Kornievskaia wrote: > > On Fri, Mar 15, 2019 at 2:31 AM Jiufei Xue <jiufei.xue@linux.alibaba.com> wrote: > >> > >> Hi Olga, > >> > >> On 2019/3/11 下午11:13, Olga Kornievskaia wrote: > >>> Let me double check that. I have reproduced the "infinite loop" or > >>> CLOSE on the upstream (I'm looking thru the trace points from friday). > >> > >> Do you try to capture the packages when reproduced this issue on the > >> upstream. I still lost kernel packages after some adjustment according > >> to bfield's suggestion :( > > > > Hi Jiufei, > > > > Yes I have network trace captures but they are too big to post to the > > mailing list. I have reproduced the problem on the latest upstream > > origin/testing branch commit "SUNRPC: Take the transport send lock > > before binding+connecting". As you have noted before infinite loops is > > due to client "losing" an update to the seqid. > > > > one packet would send out an (recovery) OPEN with slot=0 seqid=Y. > > tracepoint (nfs4_open_file) would log that status=ERESTARTSYS. The rpc > > task would be sent and the rpc task would receive a reply but there is > > nobody there to receive it... This open that got a reply has an > > updated stateid seqid which client never updates. When CLOSE is sent, > > it's sent with the "old" stateid and puts the client in an infinite > > loop. Btw, CLOSE is sent on the interrupted slot which should get > > FALSE_RETRY which causes the client to terminate the session. But it > > would still keep sending the CLOSE with the old stateid. 
> > > > Some things I've noticed is that TEST_STATE op (as a part of the > > nfs41_test_and _free_expired_stateid()) for some reason always has a > > signal set even before issuing and RPC task so the task never > > completes (ever). > > > > I always thought that OPEN's can't be interrupted but I guess they are > > since they call rpc_wait_for_completion_task() and that's a killable > > event. But I don't know how to find out what's sending a signal to the > > process. I'm rather stuck here trying to figure out where to go from > > there. So I'm still trying to figure out what's causing the signal or > > also how to recover from it that the client doesn't lose that seqid. > > > >> > Thank you for you quick relpy. > > I have also noticed the ERESTARTSYS status for OPEN op, but I think it > is returned by the open process which is woken in nfs4_run_open_task(). > I found that the decode routine nfs4_xdr_dec_open returned -121, which > I thought is the root cause of this problem. Could you please post the > content of the last OPEN message? Hi Jiufei, Yes, I think that's why the update isn't happening: the rpc_status isn't 0. Trond, the rpc tasks that were interrupted but are finishing are not able to succeed, because when they try to decode_sequence the res->st_slot is NULL. The SEQUENCE op is not decoded, and then when the decoder tries to decode the PUTFH it throws an unexpected op (expecting PUTFH but SEQ is still there instead). res->st_slot is going away because, after the open(s) were interrupted and _nfs4_proc_open() returned an error (interrupted), it goes and frees the slot. Is it perhaps appropriate to only free the slot there if (!data->cancelled), and otherwise let the async RPC task continue and free the slot once it's done? How's this for a proposed fix: Subject: [PATCH 1/1] NFSv4.1 don't free interrupted slot on open Allow the async rpc task to finish and update the open state if needed, then free the slot. 
Otherwise, the async rpc task is unable to decode the reply. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> --- fs/nfs/nfs4proc.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index 4dbb0ee..96c2499 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -2933,7 +2933,8 @@ static int _nfs4_open_and_get_state(struct nfs4_opendata *opendata, } out: - nfs4_sequence_free_slot(&opendata->o_res.seq_res); + if (!opendata->cancelled) + nfs4_sequence_free_slot(&opendata->o_res.seq_res); return ret; } > > Thanks, > Jiufei. > > > > >> Thanks, > >> Jiufei > > ^ permalink raw reply related [flat|nested] 26+ messages in thread
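The failure mode the patch above targets can be modeled roughly as follows (toy Python, not the kernel code; function and field names are made up for illustration): the synchronous path frees the sequence slot on interruption, so when the still-outstanding async task's reply arrives, the SEQUENCE decoder has no slot and the rest of the compound cannot be decoded.

```python
# Toy model (invented names) of the interrupted-open decode failure.

def sync_open_path(opendata, fix_applied):
    """Models the synchronous open path returning after an interruption."""
    if not fix_applied or not opendata["cancelled"]:
        opendata["seq_slot"] = None   # models nfs4_sequence_free_slot()

def async_reply_decode(opendata):
    """Models xdr decode of the compound reply: SEQUENCE, PUTFH, OPEN."""
    if opendata["seq_slot"] is None:
        return "EIO"                  # SEQUENCE undecoded; PUTFH sees SEQ
    opendata["seq_slot"] = None       # with the fix: freed after decoding
    return "OK"

# Without the fix: the interrupted open loses its reply (and its seqid).
od = {"cancelled": True, "seq_slot": 0}
sync_open_path(od, fix_applied=False)
assert async_reply_decode(od) == "EIO"

# With the fix: the cancelled open keeps its slot until the async task
# finishes decoding, so the updated open stateid seqid can be recorded.
od = {"cancelled": True, "seq_slot": 0}
sync_open_path(od, fix_applied=True)
assert async_reply_decode(od) == "OK"
```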
* Re: [bug report] task hang while testing xfstests generic/323 2019-03-11 14:30 ` Trond Myklebust 2019-03-11 15:07 ` Olga Kornievskaia @ 2019-03-11 15:12 ` Trond Myklebust 2019-03-11 15:14 ` Olga Kornievskaia 1 sibling, 1 reply; 26+ messages in thread From: Trond Myklebust @ 2019-03-11 15:12 UTC (permalink / raw) To: bfields, aglo; +Cc: jiufei.xue, Anna.Schumaker, linux-nfs, joseph.qi On Mon, 2019-03-11 at 14:30 +0000, Trond Myklebust wrote: > Hi Olga, > > On Sun, 2019-03-10 at 18:20 -0400, Olga Kornievskaia wrote: > > There are a bunch of cases where multiple operations are using the > > same seqid and slot. > > > > Example of such weirdness is (nfs.seqid == 0x000002f4) && > > (nfs.slotid > > == 0) and the one leading to the hang. > > > > In frame 415870, there is an OPEN using that seqid and slot for the > > first time (but this slot will be re-used a bunch of times before > > it > > gets a reply in frame 415908 with the open stateid seq=40). (also > > in > > this packet there is an example of reuse slot=1+seqid=0x000128f7 by > > both TEST_STATEID and OPEN but let's set that aside). > > > > In frame 415874 (in the same packet), client sends 5 opens on the > > SAME > > seqid and slot (all have distinct xids). In a ways that's end up > > being > > alright since opens are for the same file and thus reply out of the > > cache and the reply is ERR_DELAY. But in frame 415876, client sends > > again uses the same seqid and slot and in this case it's used by > > 3opens and a test_stateid. This should result in exactly 1 bump of the stateid seqid. > > Client in all this mess never processes the open stateid seq=40 and > > keeps on resending CLOSE with seq=37 (also to note client "missed" > > processing seqid=38 and 39 as well. 39 probably because it was a > > reply > > on the same kind of "Reused" slot=1 and seq=0x000128f7. I haven't > > tracked 38 but i'm assuming it's the same). 
I don't know how many > > times but after 5mins, I see a TEST_STATEID that again uses the > > same > > seqid+slot (which gets a reply from the cache matching OPEN). Also > > open + close (still with seq=37) open is replied to but after this > > client goes into a soft lockup logs have > > "nfs4_schedule_state_manager: > > kthread_ruan: -4" over and over again . then a soft lockup. > > > > Looking back on slot 0. nfs.seqid=0x000002f3 was used in > > frame=415866 > > by the TEST_STATEID. This is replied to in frame 415877 (with an > > ERR_DELAY). But before the client got a reply, it used the slot and > > the seq by frame 415874. TEST_STATEID is a synchronous and > > interruptible operation. I'm suspecting that somehow it was > > interrupted and that's who the slot was able to be re-used by the > > frame 415874. But how the several opens were able to get the same > > slot > > I don't know.. > > Is this still true with the current linux-next? I would expect this > patch > http://git.linux-nfs.org/?p=trondmy/linux-nfs.git;a=commitdiff;h=3453d5708b33efe76f40eca1c0ed60923094b971 > to change the Linux client behaviour in the above regard. > Note also that what you describe would appear to indicate a serious server bug. If the client is reusing a slot+seqid, then the correct server behaviour is to either return one of the errors NFS4ERR_DELAY, NFS4ERR_SEQ_FALSE_RETRY, NFS4ERR_RETRY_UNCACHED_REP, NFS4ERR_SEQ_MISORDERED, or else it must replay the exact same reply that it had cached for the original request. It is a protocol violation for the server to execute new requests on a slot+seqid combination that has already been used. For that reason, it is also impossible for a replay to cause further state changes on the server; a replay means that the server belts out the exact same reply that it attempted to send earlier with no changes (with stateids that match bit for bit what would have been sent earlier). 
-- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@hammerspace.com ^ permalink raw reply [flat|nested] 26+ messages in thread
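The server-side rule Trond describes can be sketched as a toy replay cache (plain Python, not nfsd): for a given slot, seqid == last+1 executes a new request and caches its reply; seqid == last is a retry and must receive the cached reply byte for byte (or an error such as SEQ_FALSE_RETRY); anything else is SEQ_MISORDERED. A replay can therefore never cause new state changes.

```python
# Toy NFSv4.1 server slot with a one-deep reply cache (invented names).

class ServerSlot:
    def __init__(self):
        self.seqid = 0
        self.cached_reply = None

    def sequence(self, seqid, execute):
        if seqid == self.seqid + 1:          # new request: execute + cache
            self.seqid = seqid
            self.cached_reply = execute()
            return self.cached_reply
        if seqid == self.seqid:              # retry: replay from the cache
            return self.cached_reply
        return "NFS4ERR_SEQ_MISORDERED"      # anything else is a violation

state = {"open_seq": 0}

def do_open():
    state["open_seq"] += 1                   # state changes only on execute
    return f"OPEN ok, stateid seq={state['open_seq']}"

slot = ServerSlot()
first = slot.sequence(1, do_open)
replay = slot.sequence(1, do_open)           # same slot+seqid: replayed
assert first == replay == "OPEN ok, stateid seq=1"
assert state["open_seq"] == 1                # the replay bumped nothing
assert slot.sequence(5, do_open) == "NFS4ERR_SEQ_MISORDERED"
```

This is why identical opens replayed from the cache cannot explain a stateid seqid advancing from 37 to 40: only newly executed requests move server state.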
* Re: [bug report] task hang while testing xfstests generic/323 2019-03-11 15:12 ` Trond Myklebust @ 2019-03-11 15:14 ` Olga Kornievskaia 2019-03-11 15:28 ` Trond Myklebust 0 siblings, 1 reply; 26+ messages in thread From: Olga Kornievskaia @ 2019-03-11 15:14 UTC (permalink / raw) To: Trond Myklebust; +Cc: bfields, jiufei.xue, Anna.Schumaker, linux-nfs, joseph.qi On Mon, Mar 11, 2019 at 11:12 AM Trond Myklebust <trondmy@hammerspace.com> wrote: > > On Mon, 2019-03-11 at 14:30 +0000, Trond Myklebust wrote: > > Hi Olga, > > > > On Sun, 2019-03-10 at 18:20 -0400, Olga Kornievskaia wrote: > > > There are a bunch of cases where multiple operations are using the > > > same seqid and slot. > > > > > > Example of such weirdness is (nfs.seqid == 0x000002f4) && > > > (nfs.slotid > > > == 0) and the one leading to the hang. > > > > > > In frame 415870, there is an OPEN using that seqid and slot for the > > > first time (but this slot will be re-used a bunch of times before > > > it > > > gets a reply in frame 415908 with the open stateid seq=40). (also > > > in > > > this packet there is an example of reuse slot=1+seqid=0x000128f7 by > > > both TEST_STATEID and OPEN but let's set that aside). > > > > > > In frame 415874 (in the same packet), client sends 5 opens on the > > > SAME > > > seqid and slot (all have distinct xids). In a ways that's end up > > > being > > > alright since opens are for the same file and thus reply out of the > > > cache and the reply is ERR_DELAY. But in frame 415876, client sends > > > again uses the same seqid and slot and in this case it's used by > > > 3opens and a test_stateid. > > This should result in exactly 1 bump of the stateid seqid. > > > > Client in all this mess never processes the open stateid seq=40 and > > > keeps on resending CLOSE with seq=37 (also to note client "missed" > > > processing seqid=38 and 39 as well. 39 probably because it was a > > > reply > > > on the same kind of "Reused" slot=1 and seq=0x000128f7. 
I haven't > > > tracked 38 but i'm assuming it's the same). I don't know how many > > > times but after 5mins, I see a TEST_STATEID that again uses the > > > same > > > seqid+slot (which gets a reply from the cache matching OPEN). Also > > > open + close (still with seq=37) open is replied to but after this > > > client goes into a soft lockup logs have > > > "nfs4_schedule_state_manager: > > > kthread_ruan: -4" over and over again . then a soft lockup. > > > > > > Looking back on slot 0. nfs.seqid=0x000002f3 was used in > > > frame=415866 > > > by the TEST_STATEID. This is replied to in frame 415877 (with an > > > ERR_DELAY). But before the client got a reply, it used the slot and > > > the seq by frame 415874. TEST_STATEID is a synchronous and > > > interruptible operation. I'm suspecting that somehow it was > > > interrupted and that's who the slot was able to be re-used by the > > > frame 415874. But how the several opens were able to get the same > > > slot > > > I don't know.. > > > > Is this still true with the current linux-next? I would expect this > > patch > > http://git.linux-nfs.org/?p=trondmy/linux-nfs.git;a=commitdiff;h=3453d5708b33efe76f40eca1c0ed60923094b971 > > to change the Linux client behaviour in the above regard. > > > > > Note also that what you describe would appear to indicate a serious > server bug. If the client is reusing a slot+seqid, then the correct > server behaviour is to either return one of the errors NFS4ERR_DELAY, > NFS4ERR_SEQ_FALSE_RETRY, NFS4ERR_RETRY_UNCACHED_REP, > NFS4ERR_SEQ_MISORDERED, or else it must replay the exact same reply > that it had cached for the original request. > > It is protocol violation for the server to execute new requests on a > slot+seqid combination that has already been used. 
For that reason, it > is also impossible for a replay to cause further state changes on the > server; a replay means that the server belts out the exact same reply > that was attempted sent earlier with no changes (with stateids that > match bit for bit what would have been sent earlier). > But it is the same request, because all of them are opens on the same file, same everything. > -- > Trond Myklebust > Linux NFS client maintainer, Hammerspace > trond.myklebust@hammerspace.com > > ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [bug report] task hang while testing xfstests generic/323
  2019-03-11 15:14 ` Olga Kornievskaia
@ 2019-03-11 15:28 ` Trond Myklebust
  0 siblings, 0 replies; 26+ messages in thread
From: Trond Myklebust @ 2019-03-11 15:28 UTC (permalink / raw)
  To: aglo; +Cc: bfields, jiufei.xue, Anna.Schumaker, linux-nfs, joseph.qi

On Mon, 2019-03-11 at 11:14 -0400, Olga Kornievskaia wrote:
> On Mon, Mar 11, 2019 at 11:12 AM Trond Myklebust
> <trondmy@hammerspace.com> wrote:
> > On Mon, 2019-03-11 at 14:30 +0000, Trond Myklebust wrote:
> > > Hi Olga,
> > >
> > > On Sun, 2019-03-10 at 18:20 -0400, Olga Kornievskaia wrote:
> > > > There are a bunch of cases where multiple operations are using
> > > > the same seqid and slot.
> > > >
> > > > Example of such weirdness is (nfs.seqid == 0x000002f4) &&
> > > > (nfs.slotid == 0) and the one leading to the hang.
> > > >
> > > > In frame 415870, there is an OPEN using that seqid and slot for
> > > > the first time (but this slot will be re-used a bunch of times
> > > > before it gets a reply in frame 415908 with the open stateid
> > > > seq=40). (also in this packet there is an example of reuse of
> > > > slot=1+seqid=0x000128f7 by both TEST_STATEID and OPEN but let's
> > > > set that aside).
> > > >
> > > > In frame 415874 (in the same packet), the client sends 5 opens
> > > > on the SAME seqid and slot (all have distinct xids). In a way
> > > > that ends up being alright since the opens are for the same
> > > > file and are thus replied out of the cache, and the reply is
> > > > ERR_DELAY. But in frame 415876, the client again uses the same
> > > > seqid and slot, and in this case it's used by 3 opens and a
> > > > test_stateid.
> > > > This should result in exactly 1 bump of the stateid seqid.
> > > >
> > > > Client in all this mess never processes the open stateid seq=40
> > > > and keeps on resending CLOSE with seq=37 (also to note the
> > > > client "missed" processing seqid=38 and 39 as well. 39 probably
> > > > because it was a reply on the same kind of "reused" slot=1 and
> > > > seq=0x000128f7. I haven't tracked 38 but i'm assuming it's the
> > > > same). I don't know how many times but after 5mins, I see a
> > > > TEST_STATEID that again uses the same seqid+slot (which gets a
> > > > reply from the cache matching OPEN). Also open + close (still
> > > > with seq=37); the open is replied to but after this the client
> > > > goes into a soft lockup, logs have
> > > > "nfs4_schedule_state_manager: kthread_run: -4" over and over
> > > > again, then a soft lockup.
> > > >
> > > > Looking back on slot 0: nfs.seqid=0x000002f3 was used in
> > > > frame=415866 by the TEST_STATEID. This is replied to in frame
> > > > 415877 (with an ERR_DELAY). But before the client got a reply,
> > > > the slot and the seq were used by frame 415874. TEST_STATEID is
> > > > a synchronous and interruptible operation. I'm suspecting that
> > > > somehow it was interrupted and that's how the slot was able to
> > > > be re-used by frame 415874. But how the several opens were able
> > > > to get the same slot I don't know..
> > >
> > > Is this still true with the current linux-next? I would expect
> > > this patch
> > > http://git.linux-nfs.org/?p=trondmy/linux-nfs.git;a=commitdiff;h=3453d5708b33efe76f40eca1c0ed60923094b971
> > > to change the Linux client behaviour in the above regard.
> > >
> >
> > Note also that what you describe would appear to indicate a serious
> > server bug.
If the client is reusing a slot+seqid, then the correct
> > server behaviour is to either return one of the errors
> > NFS4ERR_DELAY, NFS4ERR_SEQ_FALSE_RETRY, NFS4ERR_RETRY_UNCACHED_REP,
> > NFS4ERR_SEQ_MISORDERED, or else it must replay the exact same reply
> > that it had cached for the original request.
> >
> > It is a protocol violation for the server to execute new requests
> > on a slot+seqid combination that has already been used. For that
> > reason, it is also impossible for a replay to cause further state
> > changes on the server; a replay means that the server belts out the
> > exact same reply that it attempted to send earlier, with no changes
> > (with stateids that match bit for bit what would have been sent
> > earlier).
> >
>
> But it is the same requests because all of them are opens on the same
> file, same everything.

That is irrelevant. The whole point of the session slot mechanism is to
provide reliable only-once semantics as defined in section 2.10.6 of
RFC 5661: https://tools.ietf.org/html/rfc5661#section-2.10.6

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com

^ permalink raw reply	[flat|nested] 26+ messages in thread
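[Editorial aside: the slot replay-cache rule described above can be illustrated with a small sketch. This is not the Linux client or nfsd code; the `Slot`/`SessionReplayCache` names and the `execute` callback are invented for the example, and only the NFS4ERR_SEQ_MISORDERED error name comes from the protocol. It shows why a retransmission on an already-used slot+seqid must be answered from the cache and can never bump a stateid seqid a second time.]

```python
# Illustrative sketch of the exactly-once slot semantics from
# RFC 5661, section 2.10.6 (not a real NFS implementation).

NFS4ERR_SEQ_MISORDERED = 10063  # real NFSv4.1 error code

class Slot:
    def __init__(self):
        self.seqid = 0            # highest seqid executed on this slot
        self.cached_reply = None  # reply cached for that seqid

class SessionReplayCache:
    def __init__(self, nslots=4):
        self.slots = [Slot() for _ in range(nslots)]

    def sequence(self, slotid, seqid, execute):
        """Dispatch one compound on (slotid, seqid).

        execute() runs the compound and returns its reply; it is only
        called for a *new* seqid, never for a retransmission.
        """
        slot = self.slots[slotid]
        if seqid == slot.seqid + 1:
            # New request: execute once, cache the reply, bump the seqid.
            reply = execute()
            slot.seqid = seqid
            slot.cached_reply = reply
            return reply
        if seqid == slot.seqid:
            # Retransmission: replay the cached reply bit for bit.
            # execute() is NOT run again, so a replayed OPEN cannot
            # cause any further state change on the server.
            return slot.cached_reply
        # Any other seqid is a misordered use of the slot.
        return NFS4ERR_SEQ_MISORDERED

calls = []
cache = SessionReplayCache()

def open_op():
    calls.append("OPEN")  # record each real execution of the compound
    return {"op": "OPEN", "stateid_seq": len(calls)}

first = cache.sequence(0, 1, open_op)
replay = cache.sequence(0, 1, open_op)  # client reuses slot 0, seqid 1
assert replay is first                  # identical cached reply
assert calls == ["OPEN"]                # OPEN executed exactly once
```

Under this model, the five identical OPENs seen on the same slot+seqid in the trace would all have to receive the one cached reply; a server that executed them as new requests would bump the open stateid seqid several times and leave the client holding a stale seqid, which is consistent with the hang being debugged here.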
end of thread, other threads:[~2019-03-19 15:33 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-02-28 10:10 [bug report] task hang while testing xfstests generic/323 Jiufei Xue
2019-02-28 22:26 ` Olga Kornievskaia
2019-02-28 23:56 ` Trond Myklebust
2019-03-01  5:19 ` Jiufei Xue
2019-03-01  5:08 ` Jiufei Xue
2019-03-01  8:49 ` Jiufei Xue
2019-03-01 13:08 ` Trond Myklebust
2019-03-02 16:34 ` Jiufei Xue
2019-03-04 15:20 ` Jiufei Xue
2019-03-04 15:50 ` Trond Myklebust
2019-03-05  5:09 ` Jiufei Xue
2019-03-05 14:45 ` Trond Myklebust
2019-03-06  9:59 ` Jiufei Xue
2019-03-06 16:09 ` bfields
2019-03-10 22:20 ` Olga Kornievskaia
2019-03-11 14:30 ` Trond Myklebust
2019-03-11 15:07 ` Olga Kornievskaia
2019-03-11 15:13 ` Olga Kornievskaia
2019-03-15  6:30 ` Jiufei Xue
2019-03-15 20:33 ` Olga Kornievskaia
2019-03-15 20:55 ` Trond Myklebust
2019-03-16 14:11 ` Jiufei Xue
2019-03-19 15:33 ` Olga Kornievskaia
2019-03-11 15:12 ` Trond Myklebust
2019-03-11 15:14 ` Olga Kornievskaia
2019-03-11 15:28 ` Trond Myklebust