linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Need help debugging NFS issues new to 4.20 kernel
@ 2019-01-24 17:32 Jason L Tibbitts III
  2019-01-24 19:28 ` Jason L Tibbitts III
  2019-01-24 19:58 ` Trond Myklebust
  0 siblings, 2 replies; 25+ messages in thread
From: Jason L Tibbitts III @ 2019-01-24 17:32 UTC (permalink / raw)
  To: linux-nfs

I could use some help figuring out the cause of some serious NFS client
issues I'm having with the 4.20.3 kernel which I did not see under
4.19.15.

I have a network of about 130 desktops (plus a bunch of other machines,
VMs and the like) running Fedora 29 connecting to six NFS servers
running CentOS 7.6 (with the heavily patched vendor kernel
3.10.0-957.1.3).  All machines involved are x86_64.  We use kerberized
NFS4 with generally sec=krb5i.  The exports are generally made with
"(rw,async,sec=krb5i:krb5p)".

Since I booted those clients into 4.20.3 I've started seeing processes
getting stuck in the D state.  The system itself will seem OK (except
for the high load average) as long as I don't touch the hung NFS mount.
Nothing was logged to dmesg or to the journal.  So far booting back into
the 4.19.15 kernel has cleared up the problem.  I cannot yet reproduce
this on demand; I've tried but it is probably related to some specific
usage pattern.

Has anyone else seen issues like this?  Can anyone help me to get more
useful information that might point to the problem?  I still haven't
learned how to debug NFS issues properly.  And if there's a stress test
tool I could easily run that might help to reproduce the issue, I'd be
happy to run it.

I note that 4.20.4 is out; I see one sunrpc fix which I guess could be
related (sunrpc: handle ENOMEM in rpcb_getport_async) but the systems
involved have plenty of free memory so I doubt that's it.  I'll
certainly try it anyway.

Various package versions:
kernel-4.20.3-200.fc29.x86_64 (the problematic kernel)
kernel-4.19.15-300.fc29.x86_64 (the functional kernel)
nfs-utils-2.3.3-1.rc2.fc29.x86_64
gssproxy-0.8.0-6.fc29.x86_64
krb5-libs-1.16.1-25.fc29.i686

Thanks in advance for any help or advice,

 - J<

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Need help debugging NFS issues new to 4.20 kernel
  2019-01-24 17:32 Need help debugging NFS issues new to 4.20 kernel Jason L Tibbitts III
@ 2019-01-24 19:28 ` Jason L Tibbitts III
  2019-01-24 19:58 ` Trond Myklebust
  1 sibling, 0 replies; 25+ messages in thread
From: Jason L Tibbitts III @ 2019-01-24 19:28 UTC (permalink / raw)
  To: linux-nfs

Just had another machine get into this state.  This time I thought to at
least do alt-sysrq-w.  The result is a big pile of backtraces but it
looks like basically every NFS operation is hung up.

I'm not sure how useful it will be, but I tried to categorize the
various backtraces.  Most of them are like the following, going through
nfs_file_fsync:

tcsh            D    0  9511      1 0x80000006
Call Trace:
 ? __schedule+0x253/0x850
 schedule+0x28/0x80
 io_schedule+0x12/0x40
 wait_on_page_bit_common+0x11a/0x270
 ? file_check_and_advance_wb_err+0xc0/0xc0
 __filemap_fdatawait_range+0xe2/0x130
 ? __filemap_fdatawrite_range+0xc8/0xf0
 filemap_write_and_wait_range+0x45/0x80
 nfs_file_fsync+0x44/0x1e0 [nfs]
 filp_close+0x31/0x60
 put_files_struct+0x6c/0xc0
 do_exit+0x2b4/0xb30
 ? __fput+0x151/0x220
 do_group_exit+0x3a/0xa0
 get_signal+0x276/0x590
 do_signal+0x36/0x610
 ? do_sys_open+0x128/0x210
 exit_to_usermode_loop+0x71/0xe0
 do_syscall_64+0x14d/0x160
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f4760d46ca2
Code: Bad RIP value.
RSP: 002b:00007ffc29c788d0 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
RAX: fffffffffffffe00 RBX: 00005652381bf990 RCX: 00007f4760d46ca2
RDX: 0000000000000000 RSI: 00005652381d4b40 RDI: 00000000ffffff9c
RBP: 0000000000000000 R08: 0000000000000000 R09: 00007ffc29c78920
R10: 0000000000000000 R11: 0000000000000246 R12: 00005652381d4b40
R13: 0000000000000000 R14: 00005652381dcd30 R15: 0000000000000000


There are a few others which are hung on locking operations:

QThread         D    0  9740      1 0x80000006
Call Trace:
 ? __schedule+0x253/0x850
 ? __rpc_wait_for_completion_task+0x30/0x30 [sunrpc]
 schedule+0x28/0x80
 rpc_wait_bit_killable+0x1e/0x90 [sunrpc]
 __wait_on_bit+0x6c/0x80
 out_of_line_wait_on_bit+0x91/0xb0
 ? init_wait_var_entry+0x40/0x40
 nfs4_proc_lock+0x1a9/0x320 [nfsv4]
 ? do_unlk+0x98/0xe0 [nfs]
 locks_remove_posix+0xb1/0x140
 ? __nfs_commit_inode+0xc4/0x1b0 [nfs]
 filp_close+0x50/0x60
 put_files_struct+0x6c/0xc0
 do_exit+0x2b4/0xb30
 ? mem_cgroup_try_charge+0xe6/0x180
 do_group_exit+0x3a/0xa0
 get_signal+0x276/0x590
 do_signal+0x36/0x610
 exit_to_usermode_loop+0x71/0xe0
 do_syscall_64+0x14d/0x160
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7ff3e9321421
Code: Bad RIP value.
RSP: 002b:00007ff3d7ffeac0 EFLAGS: 00000293 ORIG_RAX: 0000000000000007
RAX: fffffffffffffdfc RBX: 00007ff3c8004db0 RCX: 00007ff3e9321421
RDX: 00000000ffffffff RSI: 0000000000000001 RDI: 00007ff3c8004db0
RBP: 0000000000000001 R08: 0000000000000000 R09: 00007ff3c8000c78
R10: 00007ff3d0003480 R11: 0000000000000293 R12: 00000000ffffffff
R13: 00007ff3e6e48f10 R14: 00000000ffffffff R15: 0000000000000001

gvfsd-metadata  D    0 24382      1 0x80000006
Call Trace:
 ? __schedule+0x253/0x850
 ? __rpc_wait_for_completion_task+0x30/0x30 [sunrpc]
 schedule+0x28/0x80
 rpc_wait_bit_killable+0x1e/0x90 [sunrpc]
 __wait_on_bit+0x6c/0x80
 out_of_line_wait_on_bit+0x91/0xb0
 ? init_wait_var_entry+0x40/0x40
 nfs4_do_close+0x2bf/0x2f0 [nfsv4]
 __put_nfs_open_context+0xa9/0x100 [nfs]
 nfs_file_release+0x39/0x40 [nfs]
 __fput+0xb4/0x220
 task_work_run+0x84/0xa0
 do_exit+0x2d2/0xb30
 do_group_exit+0x3a/0xa0
 get_signal+0x276/0x590
 do_signal+0x36/0x610
 ? handle_mm_fault+0xda/0x200
 exit_to_usermode_loop+0x71/0xe0
 do_syscall_64+0x14d/0x160
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f3cf843d421
Code: Bad RIP value.
RSP: 002b:00007ffc224ea250 EFLAGS: 00000293 ORIG_RAX: 0000000000000007
RAX: fffffffffffffdfc RBX: 000055c9db8fa2a0 RCX: 00007f3cf843d421
RDX: 00000000ffffffff RSI: 0000000000000001 RDI: 000055c9db8fa2a0
RBP: 0000000000000001 R08: 0000000000000000 R09: 000055c9db8ea488
R10: 000055c9db8dbae0 R11: 0000000000000293 R12: 00000000ffffffff
R13: 00007f3cf85bff10 R14: 00000000ffffffff R15: 0000000000000001


And some others waiting on metadata operations:

localStorage DB D    0 24794   9621 0x00000004
Call Trace:
 ? __schedule+0x253/0x850
 schedule+0x28/0x80
 io_schedule+0x12/0x40
 wait_on_page_bit_common+0x11a/0x270
 ? file_check_and_advance_wb_err+0xc0/0xc0
 __filemap_fdatawait_range+0xe2/0x130
 filemap_write_and_wait+0x47/0x70
 nfs_wb_all+0x1a/0x120 [nfs]
 nfs_setattr+0x1cb/0x1d0 [nfs]
 notify_change+0x2da/0x440
 do_truncate+0x74/0xc0
 do_sys_ftruncate+0xb6/0x100
 do_syscall_64+0x5b/0x160
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7eff1cd1382b
Code: Bad RIP value.
RSP: 002b:00007eff0c3787c8 EFLAGS: 00000246 ORIG_RAX: 000000000000004d
RAX: ffffffffffffffda RBX: 0000000000bd0000 RCX: 00007eff1cd1382b
RDX: 0002000100000000 RSI: 0000000000bd0000 RDI: 000000000000005c
RBP: 00007eff034d61a0 R08: 00007ffc7d3b40b0 R09: 00007ffc7d3b4080
R10: 000000000274ded2 R11: 0000000000000246 R12: 000000000000005c
R13: 00007eff02f081c0 R14: 00007eff025fd8e0 R15: 00007eff034832e0

df              D    0 25187  25186 0x00000000
Call Trace:
 ? __schedule+0x253/0x850
 ? __rpc_wait_for_completion_task+0x30/0x30 [sunrpc]
 ? __rpc_atrun+0x20/0x20 [sunrpc]
 schedule+0x28/0x80
 rpc_wait_bit_killable+0x1e/0x90 [sunrpc]
 __wait_on_bit+0x6c/0x80
 ? trace_event_raw_event_rpc_stats_latency+0x240/0x240 [sunrpc]
 out_of_line_wait_on_bit+0x91/0xb0
 ? init_wait_var_entry+0x40/0x40
 __rpc_execute+0xe0/0x350 [sunrpc]
 ? recalibrate_cpu_khz+0x10/0x10
 ? ktime_get+0x36/0xa0
 rpc_run_task+0x121/0x180 [sunrpc]
 nfs4_call_sync_sequence+0x64/0xa0 [nfsv4]
 _nfs4_proc_getattr+0xf8/0x120 [nfsv4]
 nfs4_proc_getattr+0x75/0x100 [nfsv4]
 __nfs_revalidate_inode+0x1b7/0x2a0 [nfs]
 nfs_getattr+0x118/0x2b0 [nfs]
 ? security_inode_getattr+0x30/0x50
 vfs_statx+0x89/0xe0
 __do_sys_newstat+0x39/0x70
 do_syscall_64+0x5b/0x160
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f7dc6195579
Code: Bad RIP value.
RSP: 002b:00007ffd44e725a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000004
RAX: ffffffffffffffda RBX: 000055ad9832ef80 RCX: 00007f7dc6195579
RDX: 00007ffd44e72650 RSI: 00007ffd44e72650 RDI: 000055ad9832e380
RBP: 000055ad9832e310 R08: 0000000000000003 R09: aaaaaaaaaaaaaaab
R10: 000055ad98326010 R11: 0000000000000246 R12: 000055ad9832e3c0
R13: 0000000000000000 R14: 000055ad9832b9b0 R15: 0000000000000000


And others on writes:

mozStorage #1   D    0 24816   9621 0x00000004
Call Trace:
 ? __schedule+0x253/0x850
 ? __switch_to_asm+0x40/0x70
 schedule+0x28/0x80
 io_schedule+0x12/0x40
 __lock_page+0x13c/0x230
 ? file_check_and_advance_wb_err+0xc0/0xc0
 pagecache_get_page+0x1ae/0x340
 grab_cache_page_write_begin+0x1c/0x40
 nfs_write_begin+0x63/0x340 [nfs]
 generic_perform_write+0xf4/0x1b0
 ? __handle_mm_fault+0xb9d/0x1380
 nfs_file_write+0xdc/0x200 [nfs]
 __vfs_write+0x136/0x1a0
 vfs_write+0xa5/0x1a0
 ksys_write+0x4f/0xb0
 do_syscall_64+0x5b/0x160
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7eff1d131d57
Code: Bad RIP value.
RSP: 002b:00007eff00546380 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 000000000000002d RCX: 00007eff1d131d57
RDX: 0000000000008000 RSI: 00007efef8085000 RDI: 000000000000002d
RBP: 00007efef8085000 R08: 0000000000000000 R09: 00007ffc7d3b4080
R10: 000000000274d526 R11: 0000000000000293 R12: 0000000000008000
R13: 00007eff08604868 R14: 00000000000300c8 R15: 00007efef8085000


Not sure what this one is waiting on:

kdm-uh          D    0 25357  25345 0x00000000
Call Trace:
 ? __schedule+0x253/0x850
 ? __switch_to_asm+0x34/0x70
 ? __rpc_wait_for_completion_task+0x30/0x30 [sunrpc]
 ? __rpc_atrun+0x20/0x20 [sunrpc]
 schedule+0x28/0x80
 rpc_wait_bit_killable+0x1e/0x90 [sunrpc]
 __wait_on_bit+0x6c/0x80
 ? trace_event_raw_event_rpc_stats_latency+0x240/0x240 [sunrpc]
 out_of_line_wait_on_bit+0x91/0xb0
 ? init_wait_var_entry+0x40/0x40
 __rpc_execute+0xe0/0x350 [sunrpc]
 ? recalibrate_cpu_khz+0x10/0x10
 ? ktime_get+0x36/0xa0
 rpc_run_task+0x121/0x180 [sunrpc]
 nfs4_call_sync_sequence+0x64/0xa0 [nfsv4]
 _nfs4_proc_getattr+0xf8/0x120 [nfsv4]
 nfs4_proc_getattr+0x75/0x100 [nfsv4]
 __nfs_revalidate_inode+0x1b7/0x2a0 [nfs]
 nfs_do_access+0x2cd/0x400 [nfs]
 ? rpcauth_lookupcred+0x99/0xc0 [sunrpc]
 nfs_permission+0x1b9/0x1e0 [nfs]
 inode_permission+0xbe/0x180
 link_path_walk.part.49+0x3f2/0x520
 path_openat+0x9f/0x1610
 ? page_counter_try_charge+0x57/0xc0
 ? memcg_kmem_charge_memcg+0x76/0xa0
 ? get_random_u32+0x3e/0xd0
 do_filp_open+0x93/0x100
 ? __check_object_size+0x15d/0x189
 do_sys_open+0x186/0x210
 do_syscall_64+0x5b/0x160
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f3f0fc5dca2
Code: Bad RIP value.
RSP: 002b:00007ffc18c62840 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
RAX: ffffffffffffffda RBX: 000055a70650bfa0 RCX: 00007f3f0fc5dca2
RDX: 0000000000000800 RSI: 000055a70650bcf0 RDI: 00000000ffffff9c
RBP: 0000000000001e34 R08: 000055a70650c3b0 R09: 0000000000000002
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
R13: 00007ffc18c62a30 R14: 000055a70650bcf0 R15: 00007ffc18c62afc

And there's one kernel thread, I assume for the mount since that's the
IP of the NFS server.

172.21.86.74-ma D    0 25099      2 0x80000000
Call Trace:
 ? __schedule+0x253/0x850
 ? __rpc_wait_for_completion_task+0x30/0x30 [sunrpc]
 schedule+0x28/0x80
 rpc_wait_bit_killable+0x1e/0x90 [sunrpc]
 __wait_on_bit+0x6c/0x80
 out_of_line_wait_on_bit+0x91/0xb0
 ? init_wait_var_entry+0x40/0x40
 nfs4_proc_sequence+0x3c/0x50 [nfsv4]
 nfs4_run_state_manager+0x257/0x760 [nfsv4]
 ? nfs4_do_reclaim+0x680/0x680 [nfsv4]
 kthread+0x112/0x130
 ? kthread_create_on_node+0x60/0x60
 ret_from_fork+0x35/0x40


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Need help debugging NFS issues new to 4.20 kernel
  2019-01-24 17:32 Need help debugging NFS issues new to 4.20 kernel Jason L Tibbitts III
  2019-01-24 19:28 ` Jason L Tibbitts III
@ 2019-01-24 19:58 ` Trond Myklebust
  2019-01-25 19:13   ` Schumaker, Anna
  2019-01-25 19:51   ` Jason L Tibbitts III
  1 sibling, 2 replies; 25+ messages in thread
From: Trond Myklebust @ 2019-01-24 19:58 UTC (permalink / raw)
  To: tibbs, Anna.Schumaker, linux-nfs, Chuck.Lever

On Thu, 2019-01-24 at 11:32 -0600, Jason L Tibbitts III wrote:
> I could use some help figuring out the cause of some serious NFS
> client
> issues I'm having with the 4.20.3 kernel which I did not see under
> 4.19.15.
> 
> I have a network of about 130 desktops (plus a bunch of other
> machines,
> VMs and the like) running Fedora 29 connecting to six NFS servers
> running CentOS 7.6 (with the heavily patched vendor kernel
> 3.10.0-957.1.3).  All machines involved are x86_64.  We use
> kerberized
> NFS4 with generally sec=krb5i.  The exports are generally made with
> "(rw,async,sec=krb5i:krb5p)".
> 
> Since I booted those clients into 4.20.3 I've started seeing
> processes
> getting stuck in the D state.  The system itself will seem OK (except
> for the high load average) as long as I don't touch the hung NFS
> mount.
> Nothing was logged to dmesg or to the journal.  So far booting back
> into
> the 4.19.15 kernel has cleared up the problem.  I cannot yet
> reproduce
> this on demand; I've tried but it is probably related to some
> specific
> usage pattern.
> 
> Has anyone else seen issues like this?  Can anyone help me to get
> more
> useful information that might point to the problem?  I still haven't
> learned how to debug NFS issues properly.  And if there's a stress
> test
> tool I could easily run that might help to reproduce the issue, I'd
> be
> happy to run it.
> 
> I note that 4.20.4 is out; I see one sunrpc fix which I guess could
> be
> related (sunrpc: handle ENOMEM in rpcb_getport_async) but the systems
> involved have plenty of free memory so I doubt that's it.  I'll
> certainly try it anyway.
> 
> Various package versions:
> kernel-4.20.3-200.fc29.x86_64 (the problematic kernel)
> kernel-4.19.15-300.fc29.x86_64 (the functional kernel)
> nfs-utils-2.3.3-1.rc2.fc29.x86_64
> gssproxy-0.8.0-6.fc29.x86_64
> krb5-libs-1.16.1-25.fc29.i686
> 
> Thanks in advance for any help or advice,
> 
>  - J<

Commit deaa5c96c2f7 ("SUNRPC: Address Kerberos performance/behavior
regression") was supposed to be marked for stable as a fix. Chuck &
Anna?
-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Need help debugging NFS issues new to 4.20 kernel
  2019-01-24 19:58 ` Trond Myklebust
@ 2019-01-25 19:13   ` Schumaker, Anna
  2019-01-26 17:59     ` Sasha Levin
  2019-01-25 19:51   ` Jason L Tibbitts III
  1 sibling, 1 reply; 25+ messages in thread
From: Schumaker, Anna @ 2019-01-25 19:13 UTC (permalink / raw)
  To: tibbs, stable, trondmy, Chuck.Lever, linux-nfs

On Thu, 2019-01-24 at 19:58 +0000, Trond Myklebust wrote:
> On Thu, 2019-01-24 at 11:32 -0600, Jason L Tibbitts III wrote:
> > I could use some help figuring out the cause of some serious NFS
> > client
> > issues I'm having with the 4.20.3 kernel which I did not see under
> > 4.19.15.
> > 
> > I have a network of about 130 desktops (plus a bunch of other
> > machines,
> > VMs and the like) running Fedora 29 connecting to six NFS servers
> > running CentOS 7.6 (with the heavily patched vendor kernel
> > 3.10.0-957.1.3).  All machines involved are x86_64.  We use
> > kerberized
> > NFS4 with generally sec=krb5i.  The exports are generally made with
> > "(rw,async,sec=krb5i:krb5p)".
> > 
> > Since I booted those clients into 4.20.3 I've started seeing
> > processes
> > getting stuck in the D state.  The system itself will seem OK (except
> > for the high load average) as long as I don't touch the hung NFS
> > mount.
> > Nothing was logged to dmesg or to the journal.  So far booting back
> > into
> > the 4.19.15 kernel has cleared up the problem.  I cannot yet
> > reproduce
> > this on demand; I've tried but it is probably related to some
> > specific
> > usage pattern.
> > 
> > Has anyone else seen issues like this?  Can anyone help me to get
> > more
> > useful information that might point to the problem?  I still haven't
> > learned how to debug NFS issues properly.  And if there's a stress
> > test
> > tool I could easily run that might help to reproduce the issue, I'd
> > be
> > happy to run it.
> > 
> > I note that 4.20.4 is out; I see one sunrpc fix which I guess could
> > be
> > related (sunrpc: handle ENOMEM in rpcb_getport_async) but the systems
> > involved have plenty of free memory so I doubt that's it.  I'll
> > certainly try it anyway.
> > 
> > Various package versions:
> > kernel-4.20.3-200.fc29.x86_64 (the problematic kernel)
> > kernel-4.19.15-300.fc29.x86_64 (the functional kernel)
> > nfs-utils-2.3.3-1.rc2.fc29.x86_64
> > gssproxy-0.8.0-6.fc29.x86_64
> > krb5-libs-1.16.1-25.fc29.i686
> > 
> > Thanks in advance for any help or advice,
> > 
> >  - J<
> 
> Commit deaa5c96c2f7 ("SUNRPC: Address Kerberos performance/behavior
> regression") was supposed to be marked for stable as a fix. Chuck &
> Anna?

Looks like I missed that, sorry!

Stable folks, can you please backport deaa5c96c2f7 ("SUNRPC: Address Kerberos
performance/behavior regression") to v4.20?

Thanks,
Anna
> --
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> trond.myklebust@hammerspace.com
> 
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Need help debugging NFS issues new to 4.20 kernel
  2019-01-24 19:58 ` Trond Myklebust
  2019-01-25 19:13   ` Schumaker, Anna
@ 2019-01-25 19:51   ` Jason L Tibbitts III
  2019-02-05 18:12     ` Jason Tibbitts
  1 sibling, 1 reply; 25+ messages in thread
From: Jason L Tibbitts III @ 2019-01-25 19:51 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: Anna.Schumaker, linux-nfs, Chuck.Lever

>>>>> "TM" == Trond Myklebust <trondmy@hammerspace.com> writes:

TM> Commit deaa5c96c2f7 ("SUNRPC: Address Kerberos performance/behavior
TM> regression") was supposed to be marked for stable as a fix.

I wonder, though; is that likely to be the root of the problem I'm
seeing?  The commit description talks about this as a performance
regression, but I'm seeing a complete loss of NFS functionality.

Sadly I still don't have a reproducer, so outside of just deploying the
patch and hoping, I have no way to actually test this.  So far I've been
running things like:

  stress-ng --all 0 --class 'filesystem' -t 10m --times

in an NFS4-krb5p mounted directory without being able to reproduce the
problem.  That drives the load up close to 200 but everything seems to make
progress.  So it must be some specific sequence that causes it; I just
don't know which.

I did get this to show up in the kernel log, though, when I typed "df"
while running that stress-ng command was running:

[94547.656419] NFS: server nas00 error: fileid changed
               fsid 0:57: expected fileid 0x6bc27eb, got 0x9996c6

I've never seen that one before, but it doesn't seem to hurt anything.
(This is still with 4.20.3.)

 - J<

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Need help debugging NFS issues new to 4.20 kernel
  2019-01-25 19:13   ` Schumaker, Anna
@ 2019-01-26 17:59     ` Sasha Levin
  0 siblings, 0 replies; 25+ messages in thread
From: Sasha Levin @ 2019-01-26 17:59 UTC (permalink / raw)
  To: Schumaker, Anna; +Cc: tibbs, stable, trondmy, Chuck.Lever, linux-nfs

On Fri, Jan 25, 2019 at 07:13:27PM +0000, Schumaker, Anna wrote:
>On Thu, 2019-01-24 at 19:58 +0000, Trond Myklebust wrote:
>> On Thu, 2019-01-24 at 11:32 -0600, Jason L Tibbitts III wrote:
>> > I could use some help figuring out the cause of some serious NFS
>> > client
>> > issues I'm having with the 4.20.3 kernel which I did not see under
>> > 4.19.15.
>> >
>> > I have a network of about 130 desktops (plus a bunch of other
>> > machines,
>> > VMs and the like) running Fedora 29 connecting to six NFS servers
>> > running CentOS 7.6 (with the heavily patched vendor kernel
>> > 3.10.0-957.1.3).  All machines involved are x86_64.  We use
>> > kerberized
>> > NFS4 with generally sec=krb5i.  The exports are generally made with
>> > "(rw,async,sec=krb5i:krb5p)".
>> >
>> > Since I booted those clients into 4.20.3 I've started seeing
>> > processes
>> > getting stuck in the D state.  The system itself will seem OK (except
>> > for the high load average) as long as I don't touch the hung NFS
>> > mount.
>> > Nothing was logged to dmesg or to the journal.  So far booting back
>> > into
>> > the 4.19.15 kernel has cleared up the problem.  I cannot yet
>> > reproduce
>> > this on demand; I've tried but it is probably related to some
>> > specific
>> > usage pattern.
>> >
>> > Has anyone else seen issues like this?  Can anyone help me to get
>> > more
>> > useful information that might point to the problem?  I still haven't
>> > learned how to debug NFS issues properly.  And if there's a stress
>> > test
>> > tool I could easily run that might help to reproduce the issue, I'd
>> > be
>> > happy to run it.
>> >
>> > I note that 4.20.4 is out; I see one sunrpc fix which I guess could
>> > be
>> > related (sunrpc: handle ENOMEM in rpcb_getport_async) but the systems
>> > involved have plenty of free memory so I doubt that's it.  I'll
>> > certainly try it anyway.
>> >
>> > Various package versions:
>> > kernel-4.20.3-200.fc29.x86_64 (the problematic kernel)
>> > kernel-4.19.15-300.fc29.x86_64 (the functional kernel)
>> > nfs-utils-2.3.3-1.rc2.fc29.x86_64
>> > gssproxy-0.8.0-6.fc29.x86_64
>> > krb5-libs-1.16.1-25.fc29.i686
>> >
>> > Thanks in advance for any help or advice,
>> >
>> >  - J<
>>
>> Commit deaa5c96c2f7 ("SUNRPC: Address Kerberos performance/behavior
>> regression") was supposed to be marked for stable as a fix. Chuck &
>> Anna?
>
>Looks like I missed that, sorry!
>
>Stable folks, can you please backport deaa5c96c2f7 ("SUNRPC: Address Kerberos
>performance/behavior regression") to v4.20?

Queued for 4.20, thank you.

--
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Need help debugging NFS issues new to 4.20 kernel
  2019-01-25 19:51   ` Jason L Tibbitts III
@ 2019-02-05 18:12     ` Jason Tibbitts
  2019-02-06 12:05       ` Benjamin Coddington
  0 siblings, 1 reply; 25+ messages in thread
From: Jason Tibbitts @ 2019-02-05 18:12 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: Anna.Schumaker, linux-nfs, Chuck.Lever

>>>>> "JLT" == Jason L Tibbitts <tibbs@math.uh.edu> writes:

JLT> I wonder, though; is that likely to be the root of the problem I'm
JLT> seeing?  The commit description talks about this as a performance
JLT> regression, but I'm seeing a complete loss of NFS functionality.

It turns out that 4.20.6 (which has the patch in question) does not
appear to solve my issue.  At least, I've still had a few complete NFS
hangups on machines running 4.20.6 though I haven't had a chance to
closely inspect one yet.

As before, any hints about how I could best debug this issue would be
appreciated.

 - J<

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Need help debugging NFS issues new to 4.20 kernel
  2019-02-05 18:12     ` Jason Tibbitts
@ 2019-02-06 12:05       ` Benjamin Coddington
       [not found]         ` <87imxwab12.fsf@hippogriff.math.uh.edu>
  0 siblings, 1 reply; 25+ messages in thread
From: Benjamin Coddington @ 2019-02-06 12:05 UTC (permalink / raw)
  To: Jason Tibbitts; +Cc: Trond Myklebust, Anna.Schumaker, linux-nfs, Chuck.Lever

On 5 Feb 2019, at 13:12, Jason Tibbitts wrote:

>>>>>> "JLT" == Jason L Tibbitts <tibbs@math.uh.edu> writes:
>
> JLT> I wonder, though; is that likely to be the root of the problem I'm
> JLT> seeing?  The commit description talks about this as a performance
> JLT> regression, but I'm seeing a complete loss of NFS functionality.
>
> It turns out that 4.20.6 (which has the patch in question) does not
> appear to solve my issue.  At least, I've still had a few complete NFS
> hangups on machines running 4.20.6 though I haven't had a chance to
> closely inspect one yet.
>
> As before, any hints about how I could best debug this issue would be
> appreciated.

Hi Jason, you can enable the NFS and SUNRPC tracepoints to get a better idea
of what the client was doing right before the hang or what the client is
trying to do during the hang:

echo "sunrpc:*" >>  /sys/kernel/debug/tracing/set_event
echo "nfs:*" >> /sys/kernel/debug/tracing/set_event

Examine or provide the contents of /sys/kernel/debug/tracing/trace.

Ben

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Need help debugging NFS issues new to 4.20 kernel
       [not found]         ` <87imxwab12.fsf@hippogriff.math.uh.edu>
@ 2019-02-07 11:13           ` Benjamin Coddington
       [not found]             ` <87d0o3aadg.fsf@hippogriff.math.uh.edu>
  0 siblings, 1 reply; 25+ messages in thread
From: Benjamin Coddington @ 2019-02-07 11:13 UTC (permalink / raw)
  To: Jason Tibbitts; +Cc: Trond Myklebust, Anna.Schumaker, linux-nfs, Chuck.Lever

On 6 Feb 2019, at 15:59, Jason Tibbitts wrote:

> This machine has a load of 93; a bunch of those are just df commands
> started by a monitoring process.  So far the user hasn't complained, so
> I can still poke at the machine for a while.

-10063 is -NFS4ERR_SEQ_MISORDERED.. I wonder why the
trace_nfs4_sequence_done() in nfs41_sequence_process() isn't showing up in
the trace?  Ah.. my fault - add "nfs4:*" to the set_events.

Ben

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Need help debugging NFS issues new to 4.20 kernel
       [not found]             ` <87d0o3aadg.fsf@hippogriff.math.uh.edu>
@ 2019-02-08 12:01               ` Benjamin Coddington
  2019-02-08 15:19                 ` Chuck Lever
  2019-02-15 20:33                 ` Jason L Tibbitts III
  0 siblings, 2 replies; 25+ messages in thread
From: Benjamin Coddington @ 2019-02-08 12:01 UTC (permalink / raw)
  To: Jason Tibbitts; +Cc: Trond Myklebust, Anna.Schumaker, linux-nfs, Chuck.Lever


On 7 Feb 2019, at 10:25, Jason Tibbitts wrote:

>>>>>> "BC" == Benjamin Coddington <bcodding@redhat.com> writes:
>
> BC> -10063 is -NFS4ERR_SEQ_MISORDERED.. I wonder why the
> BC> trace_nfs4_sequence_done() in nfs41_sequence_process() isn't showing
> BC> up in the trace?  Ah.. my fault - add "nfs4:*" to the set_events.
>
> OK, attached is another trace.  Here's the same sequence I snipped
> previously:

So the client is calling SEQ over and over.. xs_stream_read_data sees
-EAGAIN.. I'm not an expert here, and not seeing what's going wrong.

Hmm.. commit c443305529d1d3d3bee0d68fdd14ae89835e091f changed
xs_read_stream_reply() to return recv.copied instead of "ret" to
xprt_complete_rqst()..

You could try reverting that commit and see if the problem goes away..

Ben

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Need help debugging NFS issues new to 4.20 kernel
  2019-02-08 12:01               ` Benjamin Coddington
@ 2019-02-08 15:19                 ` Chuck Lever
  2019-02-08 17:17                   ` Jason L Tibbitts III
  2019-02-15 20:33                 ` Jason L Tibbitts III
  1 sibling, 1 reply; 25+ messages in thread
From: Chuck Lever @ 2019-02-08 15:19 UTC (permalink / raw)
  To: Benjamin Coddington
  Cc: Jason Tibbitts, Trond Myklebust, Anna Schumaker, Linux NFS Mailing List



> On Feb 8, 2019, at 7:01 AM, Benjamin Coddington <bcodding@redhat.com> wrote:
> 
> 
> On 7 Feb 2019, at 10:25, Jason Tibbitts wrote:
> 
>>>>>>> "BC" == Benjamin Coddington <bcodding@redhat.com> writes:
>> 
>> BC> -10063 is -NFS4ERR_SEQ_MISORDERED.. I wonder why the
>> BC> trace_nfs4_sequence_done() in nfs41_sequence_process() isn't showing
>> BC> up in the trace?  Ah.. my fault - add "nfs4:*" to the set_events.
>> 
>> OK, attached is another trace.  Here's the same sequence I snipped
>> previously:
> 
> So the client is calling SEQ over and over.. xs_stream_read_data sees
> -EAGAIN.. I'm not an expert here, and not seeing what's going wrong.

<wag>

The server is returning SEQ_MISORDERED to a singleton SEQUENCE.
That suggests the client is trying to re-establish its lease but is
sending a slot nr the server doesn't recognize for the virtual slot
used for this purpose. Could be a problem on either side, and I
don't know enough to say how this loop could have started.

</wag>


> Hmm.. commit c443305529d1d3d3bee0d68fdd14ae89835e091f changed
> xs_read_stream_reply() to return recv.copied instead of "ret" to
> xprt_complete_rqst()..
> 
> You could try reverting that commit and see if the problem goes away..
> 
> Ben

--
Chuck Lever




^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Need help debugging NFS issues new to 4.20 kernel
  2019-02-08 15:19                 ` Chuck Lever
@ 2019-02-08 17:17                   ` Jason L Tibbitts III
  0 siblings, 0 replies; 25+ messages in thread
From: Jason L Tibbitts III @ 2019-02-08 17:17 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Benjamin Coddington, Trond Myklebust, Anna Schumaker,
	Linux NFS Mailing List

>>>>> "CL" == Chuck Lever <chuck.lever@oracle.com> writes:

CL> The server is returning SEQ_MISORDERED to a singleton SEQUENCE.
CL> That suggests the client is trying to re-establish its lease but is
CL> sending a slot nr the server doesn't recognize for the virtual slot
CL> used for this purpose. Could be a problem on either side, and I
CL> don't know enough to say how this loop could have started.

In case I didn't say so earlier, the server is just Centos 7.6 running
that heavily modified 3.10.0-957.1.3.el7.x86_64.  I was not able to
reproduce anything like this throughout the 4.19 cycle but it started
cropping up pretty quickly on user desktops after booting into 4.20.3
(the first of the 4.20 series I tried).  And it does not reoccur after
rebooting back to 4.19.15.  Of course I know that doesn't absolve the
server.

It is possible for me to try to push a mainline kernel to all of the
relevant file servers, but that obviously carries other overhead and
risks that I would certainly prefer to avoid.

I just wish I knew how to reproduce this other than sticking it on a
bunch of machines and waiting until people complain.

 - J<

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Need help debugging NFS issues new to 4.20 kernel
  2019-02-08 12:01               ` Benjamin Coddington
  2019-02-08 15:19                 ` Chuck Lever
@ 2019-02-15 20:33                 ` Jason L Tibbitts III
  2019-02-16 14:46                   ` Trond Myklebust
  1 sibling, 1 reply; 25+ messages in thread
From: Jason L Tibbitts III @ 2019-02-15 20:33 UTC (permalink / raw)
  To: Benjamin Coddington
  Cc: Trond Myklebust, Anna.Schumaker, linux-nfs, Chuck.Lever

>>>>> "BC" == Benjamin Coddington <bcodding@redhat.com> writes:

BC> Hmm.. commit c443305529d1d3d3bee0d68fdd14ae89835e091f changed
BC> xs_read_stream_reply() to return recv.copied instead of "ret" to
BC> xprt_complete_rqst()..

BC> You could try reverting that commit and see if the problem goes
BC> away..

So patching a revert of that into 4.20.7 was beyond me, but I received
some help from Jeremy Cline on IRC (in #fedora-kernel) and ended up with
a patch I'll include at the end.  So far it does seem to be better, but
because of secure boot annoyances I haven't been able to roll it out
more generally.  However, it has been stable for a week on a few hosts
which have been problematic with stock 4.20.6.

I will continue to test, but hopefully this helps folks to understand
what's happening.

 - J<

From 322f581f514ffedb8884656f136bd6a812a53714 Mon Sep 17 00:00:00 2001
From: Jeremy Cline <jcline@redhat.com>
Date: Fri, 8 Feb 2019 13:09:41 -0500
Subject: [PATCH] Revert "SUNRPC: Fix RPC receive hangs"

This reverts commit c443305529d1d3d3bee0d68fdd14ae89835e091f.

Signed-off-by: Jeremy Cline <jcline@redhat.com>
---
 net/sunrpc/xprtsock.c | 39 ++++++++++++++++++++-------------------
 1 file changed, 20 insertions(+), 19 deletions(-)

diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 9cdbb6d6e7f5..2d9f0326d55b 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -417,7 +417,7 @@ xs_read_xdr_buf(struct socket *sock, struct msghdr *msg, int flags,
 		if (offset == count || msg->msg_flags & (MSG_EOR|MSG_TRUNC))
 			goto out;
 		if (ret != want)
-			goto out;
+			goto eagain;
 		seek = 0;
 	} else {
 		seek -= buf->head[0].iov_len;
@@ -439,7 +439,7 @@ xs_read_xdr_buf(struct socket *sock, struct msghdr *msg, int flags,
 		if (offset == count || msg->msg_flags & (MSG_EOR|MSG_TRUNC))
 			goto out;
 		if (ret != want)
-			goto out;
+			goto eagain;
 		seek = 0;
 	} else {
 		seek -= want;
@@ -455,13 +455,16 @@ xs_read_xdr_buf(struct socket *sock, struct msghdr *msg, int flags,
 		if (offset == count || msg->msg_flags & (MSG_EOR|MSG_TRUNC))
 			goto out;
 		if (ret != want)
-			goto out;
+			goto eagain;
 	} else
 		offset += buf->tail[0].iov_len;
 	ret = -EMSGSIZE;
 out:
 	*read = offset - seek_init;
 	return ret;
+eagain:
+	ret = -EAGAIN;
+	goto out;
 sock_err:
 	offset += seek;
 	goto out;
@@ -504,20 +507,21 @@ xs_read_stream_request(struct sock_xprt *transport, struct msghdr *msg,
 	if (transport->recv.offset == transport->recv.len) {
 		if (xs_read_stream_request_done(transport))
 			msg->msg_flags |= MSG_EOR;
-		return read;
+		return transport->recv.copied;
 	}
 
 	switch (ret) {
-	default:
-		break;
 	case -EFAULT:
 	case -EMSGSIZE:
 		msg->msg_flags |= MSG_TRUNC;
-		return read;
+		return transport->recv.copied;
 	case 0:
 		return -ESHUTDOWN;
+	default:
+		if (ret < 0)
+			return ret;
 	}
-	return ret < 0 ? ret : read;
+	return -EAGAIN;
 }
 
 static size_t
@@ -556,7 +560,7 @@ xs_read_stream_call(struct sock_xprt *transport, struct msghdr *msg, int flags)
 
 	ret = xs_read_stream_request(transport, msg, flags, req);
 	if (msg->msg_flags & (MSG_EOR|MSG_TRUNC))
-		xprt_complete_bc_request(req, transport->recv.copied);
+		xprt_complete_bc_request(req, ret);
 
 	return ret;
 }
@@ -589,7 +593,7 @@ xs_read_stream_reply(struct sock_xprt *transport, struct msghdr *msg, int flags)
 
 	spin_lock(&xprt->queue_lock);
 	if (msg->msg_flags & (MSG_EOR|MSG_TRUNC))
-		xprt_complete_rqst(req->rq_task, transport->recv.copied);
+		xprt_complete_rqst(req->rq_task, ret);
 	xprt_unpin_rqst(req);
 out:
 	spin_unlock(&xprt->queue_lock);
@@ -610,8 +614,10 @@ xs_read_stream(struct sock_xprt *transport, int flags)
 		if (ret <= 0)
 			goto out_err;
 		transport->recv.offset = ret;
-		if (transport->recv.offset != want)
-			return transport->recv.offset;
+		if (ret != want) {
+			ret = -EAGAIN;
+			goto out_err;
+		}
 		transport->recv.len = be32_to_cpu(transport->recv.fraghdr) &
 			RPC_FRAGMENT_SIZE_MASK;
 		transport->recv.offset -= sizeof(transport->recv.fraghdr);
@@ -619,9 +625,6 @@ xs_read_stream(struct sock_xprt *transport, int flags)
 	}
 
 	switch (be32_to_cpu(transport->recv.calldir)) {
-	default:
-		msg.msg_flags |= MSG_TRUNC;
-		break;
 	case RPC_CALL:
 		ret = xs_read_stream_call(transport, &msg, flags);
 		break;
@@ -636,8 +639,6 @@ xs_read_stream(struct sock_xprt *transport, int flags)
 		goto out_err;
 	read += ret;
 	if (transport->recv.offset < transport->recv.len) {
-		if (!(msg.msg_flags & MSG_TRUNC))
-			return read;
 		msg.msg_flags = 0;
 		ret = xs_read_discard(transport->sock, &msg, flags,
 				transport->recv.len - transport->recv.offset);
@@ -646,7 +647,7 @@ xs_read_stream(struct sock_xprt *transport, int flags)
 		transport->recv.offset += ret;
 		read += ret;
 		if (transport->recv.offset != transport->recv.len)
-			return read;
+			return -EAGAIN;
 	}
 	if (xs_read_stream_request_done(transport)) {
 		trace_xs_stream_read_request(transport);
@@ -670,7 +671,7 @@ static void xs_stream_data_receive(struct sock_xprt *transport)
 		goto out;
 	for (;;) {
 		ret = xs_read_stream(transport, MSG_DONTWAIT);
-		if (ret < 0)
+		if (ret <= 0)
 			break;
 		read += ret;
 		cond_resched();
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: Need help debugging NFS issues new to 4.20 kernel
  2019-02-15 20:33                 ` Jason L Tibbitts III
@ 2019-02-16 14:46                   ` Trond Myklebust
  2019-02-20  2:13                     ` Jason L Tibbitts III
  2019-02-20 15:25                     ` Jason L Tibbitts III
  0 siblings, 2 replies; 25+ messages in thread
From: Trond Myklebust @ 2019-02-16 14:46 UTC (permalink / raw)
  To: tibbs, bcodding; +Cc: Anna.Schumaker, linux-nfs, Chuck.Lever

On Fri, 2019-02-15 at 14:33 -0600, Jason L Tibbitts III wrote:
> > > > > > "BC" == Benjamin Coddington <bcodding@redhat.com> writes:
> 
> BC> Hmm.. commit c443305529d1d3d3bee0d68fdd14ae89835e091f changed
> BC> xs_read_stream_reply() to return recv.copied instead of "ret" to
> BC> xprt_complete_rqst()..

That's a good thing. xprt_complete_rqst() really dislikes negative
error values.

> BC> You could try reverting that commit and see if the problem goes
> BC> away..
> 
> So patching a revert of that into 4.20.7 was beyond me, but I
> received
> some help from Jeremy Cline on IRC (in #fedora-kernel) and ended up
> with
> a patch I'll include at the end.  So far it does seem to be better,
> but
> because of secure boot annoyances I haven't been able to roll it out
> more generally.  However, it has been stable for a week on a few
> hosts
> which have been problematic with stock 4.20.6.
> 
> I will continue to test, but hopefully this helps folks to understand
> what's happening.
> 

Hmm... Does the following patch help at all?
---
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 7754aa3e434f..a1a8903ae5d0 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -404,8 +404,8 @@ xs_read_xdr_buf(struct socket *sock, struct msghdr
*msg, int flags,
 	size_t want, seek_init = seek, offset = 0;
 	ssize_t ret;
 
-	if (seek < buf->head[0].iov_len) {
-		want = min_t(size_t, count, buf->head[0].iov_len);
+	want = min_t(size_t, count, buf->head[0].iov_len);
+	if (seek < want) {
 		ret = xs_read_kvec(sock, msg, flags, &buf->head[0],
want, seek);
 		if (ret <= 0)
 			goto sock_err;
@@ -416,8 +416,8 @@ xs_read_xdr_buf(struct socket *sock, struct msghdr
*msg, int flags,
 			goto out;
 		seek = 0;
 	} else {
-		seek -= buf->head[0].iov_len;
-		offset += buf->head[0].iov_len;
+		seek -= want;
+		offset += want;
 	}
 
 	want = xs_alloc_sparse_pages(buf,
@@ -442,8 +442,8 @@ xs_read_xdr_buf(struct socket *sock, struct msghdr
*msg, int flags,
 		offset += want;
 	}
 
-	if (seek < buf->tail[0].iov_len) {
-		want = min_t(size_t, count - offset, buf-
>tail[0].iov_len);
+	want = min_t(size_t, count - offset, buf->tail[0].iov_len);
+	if (seek < want) {
 		ret = xs_read_kvec(sock, msg, flags, &buf->tail[0],
want, seek);
 		if (ret <= 0)
 			goto sock_err;
@@ -453,7 +453,7 @@ xs_read_xdr_buf(struct socket *sock, struct msghdr
*msg, int flags,
 		if (ret != want)
 			goto out;
 	} else
-		offset += buf->tail[0].iov_len;
+		offset = seek_init;
 	ret = -EMSGSIZE;
 out:
 	*read = offset - seek_init;
@@ -497,11 +497,9 @@ xs_read_stream_request(struct sock_xprt
*transport, struct msghdr *msg,
 			&read);
 	transport->recv.offset += read;
 	transport->recv.copied += read;
-	if (transport->recv.offset == transport->recv.len) {
-		if (xs_read_stream_request_done(transport))
-			msg->msg_flags |= MSG_EOR;
-		return read;
-	}
+	if (transport->recv.offset == transport->recv.len &&
+	    xs_read_stream_request_done(transport))
+		msg->msg_flags |= MSG_EOR;
 
 	switch (ret) {
 	default:
@@ -671,6 +669,8 @@ static void xs_stream_data_receive(struct sock_xprt
*transport)
 		read += ret;
 		cond_resched();
 	}
+	if (ret == -ESHUTDOWN)
+		kernel_sock_shutdown(transport->sock, SHUT_RDWR);
 out:
 	mutex_unlock(&transport->recv_mutex);
 	trace_xs_stream_read_data(&transport->xprt, ret, read);
-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: Need help debugging NFS issues new to 4.20 kernel
  2019-02-16 14:46                   ` Trond Myklebust
@ 2019-02-20  2:13                     ` Jason L Tibbitts III
  2019-02-20 15:25                     ` Jason L Tibbitts III
  1 sibling, 0 replies; 25+ messages in thread
From: Jason L Tibbitts III @ 2019-02-20  2:13 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: bcodding, Anna.Schumaker, linux-nfs, Chuck.Lever

>>>>> "TM" == Trond Myklebust <trondmy@hammerspace.com> writes:

TM> Hmm... Does the following patch help at all?

I un-wrapped it and applied it over 4.20.10 and pushed it out to a
couple of problematic machines.  So far it's been OK there, so I've
rolled it out to 50 desktops (because they happen to have secure boot
disabled and thus allow me to boot custom kernels) and I guess we'll see
how that goes.

 - J<

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Need help debugging NFS issues new to 4.20 kernel
  2019-02-16 14:46                   ` Trond Myklebust
  2019-02-20  2:13                     ` Jason L Tibbitts III
@ 2019-02-20 15:25                     ` Jason L Tibbitts III
  2019-02-20 15:37                       ` Trond Myklebust
  1 sibling, 1 reply; 25+ messages in thread
From: Jason L Tibbitts III @ 2019-02-20 15:25 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: bcodding, Anna.Schumaker, linux-nfs, Chuck.Lever

Sadly, after a bit more testing I find that 4.20.10 with that patch
applied still exhibits the problem.  Just found a machine where NFS hung
up just six minutes after the user logged in.  As always: nothing is
logged to the client or the server.  I turned on the tracing as
requested previously and see what I think it just the same sequence as
before.  Please let me know if there is any other information I can
provide.  Right now I'm going to build 4.20.10 with my previously posted
revert patch applied and roll that out to the same 50 hosts and see how
it fares.

 - J<

    kworker/u8:0-12608 [001] .... 47884.661073: rpc_request: task:20469@1 nfsv4 SEQUENCE (async)
    kworker/u8:0-12608 [001] .... 47884.661073: rpc_task_run_action: task:20469@1 flags=5281 state=0005 status=0 action=call_reserve [sunrpc]
    kworker/u8:0-12608 [001] .... 47884.661074: rpc_task_run_action: task:20469@1 flags=5281 state=0005 status=0 action=call_reserveresult [sunrpc]
    kworker/u8:0-12608 [001] .... 47884.661074: rpc_task_run_action: task:20469@1 flags=5281 state=0005 status=0 action=call_refresh [sunrpc]
    kworker/u8:0-12608 [001] .... 47884.661075: rpc_task_run_action: task:20469@1 flags=5281 state=0005 status=0 action=call_refreshresult [sunrpc]
    kworker/u8:0-12608 [001] .... 47884.661076: rpc_task_run_action: task:20469@1 flags=5281 state=0005 status=0 action=call_allocate [sunrpc]
    kworker/u8:0-12608 [001] .... 47884.661076: rpc_task_run_action: task:20469@1 flags=5281 state=0005 status=0 action=call_encode [sunrpc]
    kworker/u8:0-12608 [001] .... 47884.661091: rpc_task_run_action: task:20469@1 flags=5281 state=001d status=0 action=call_bind [sunrpc]
    kworker/u8:0-12608 [001] .... 47884.661091: rpc_task_run_action: task:20469@1 flags=5281 state=001d status=0 action=call_connect [sunrpc]
    kworker/u8:0-12608 [001] .... 47884.661092: rpc_task_run_action: task:20469@1 flags=5281 state=001d status=0 action=call_transmit [sunrpc]
    kworker/u8:0-12608 [001] .... 47884.661106: xprt_transmit: peer=[172.21.86.74]:2049 xid=0x0d7ca068 status=0
    kworker/u8:0-12608 [001] .... 47884.661107: rpc_task_run_action: task:20469@1 flags=5a81 state=0015 status=0 action=call_transmit_status [sunrpc]
    kworker/u8:0-12608 [001] .... 47884.661107: rpc_task_sleep: task:20469@1 flags=5a81 state=0015 status=0 timeout=60000 queue=xprt_pending
    kworker/u9:0-12884 [002] .... 47884.661371: xprt_lookup_rqst: peer=[172.21.86.74]:2049 xid=0x0d7ca068 status=0
    kworker/u9:0-12884 [002] .... 47884.661372: xprt_complete_rqst: peer=[172.21.86.74]:2049 xid=0x0d7ca068 status=112
    kworker/u9:0-12884 [002] .... 47884.661373: rpc_task_wakeup: task:20469@1 flags=5a81 state=0006 status=0 timeout=60000 queue=xprt_pending
    kworker/u9:0-12884 [002] .... 47884.661375: xs_stream_read_request: peer=[172.21.86.74]:2049 xid=0x0d7ca068 copied=112 reclen=112 offset=112
    kworker/u9:0-12884 [002] .... 47884.661376: xs_stream_read_data: peer=[172.21.86.74]:2049 err=-11 total=116
    kworker/u8:0-12608 [001] .... 47884.661388: rpc_task_run_action: task:20469@1 flags=5a81 state=0005 status=0 action=xprt_timer [sunrpc]
    kworker/u8:0-12608 [001] .... 47884.661389: rpc_task_run_action: task:20469@1 flags=5a81 state=0005 status=0 action=call_status [sunrpc]
    kworker/u8:0-12608 [001] .... 47884.661389: rpc_task_run_action: task:20469@1 flags=5a81 state=0005 status=0 action=call_decode [sunrpc]
    kworker/u8:0-12608 [001] .... 47884.661408: rpc_task_run_action: task:20469@1 flags=5a81 state=0005 status=-10063 action=rpc_exit_task [sunrpc]
    kworker/u8:0-12608 [001] .... 47884.661409: nfs4_sequence_done: error=-10063 (SEQ_MISORDERED) session=0x4480e725 slot_nr=0 seq_nr=1 highest_slotid=0 target_highest_slotid=0 status_flags=0 ()
    kworker/u8:0-12608 [001] .... 47884.661410: rpc_stats_latency: task:20469@1 xid=0x0d7ca068 nfsv4 SEQUENCE backlog=21 rtt=278 execute=339
    kworker/u8:0-12608 [001] .... 47884.661412: rpc_task_run_action: task:20469@1 flags=5281 state=0005 status=0 action=rpc_prepare_task [sunrpc]
    kworker/u8:0-12608 [001] .... 47884.661413: nfs4_setup_sequence: session=0x4480e725 slot_nr=0 seq_nr=1 highest_used_slotid=0
    kworker/u8:0-12608 [001] .... 47884.661413: rpc_task_run_action: task:20469@1 flags=5281 state=0005 status=0 action=call_start [sunrpc]
    kworker/u8:0-12608 [001] .... 47884.661414: rpc_request: task:20469@1 nfsv4 SEQUENCE (async)
   

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Need help debugging NFS issues new to 4.20 kernel
  2019-02-20 15:25                     ` Jason L Tibbitts III
@ 2019-02-20 15:37                       ` Trond Myklebust
  2019-02-20 15:39                         ` Chuck Lever
                                           ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: Trond Myklebust @ 2019-02-20 15:37 UTC (permalink / raw)
  To: tibbs; +Cc: bcodding, Anna.Schumaker, linux-nfs, Chuck.Lever

On Wed, 2019-02-20 at 09:25 -0600, Jason L Tibbitts III wrote:
> Sadly, after a bit more testing I find that 4.20.10 with that patch
> applied still exhibits the problem.  Just found a machine where NFS
> hung
> up just six minutes after the user logged in.  As always: nothing is
> logged to the client or the server.  I turned on the tracing as
> requested previously and see what I think it just the same sequence
> as
> before.  Please let me know if there is any other information I can
> provide.  Right now I'm going to build 4.20.10 with my previously
> posted
> revert patch applied and roll that out to the same 50 hosts and see
> how
> it fares.
> 
>  - J<
> 
>     kworker/u8:0-12608 [001] .... 47884.661073: rpc_request: task:
> 20469@1 nfsv4 SEQUENCE (async)
>     kworker/u8:0-12608 [001] .... 47884.661073: rpc_task_run_action:
> task:20469@1 flags=5281 state=0005 status=0 action=call_reserve
> [sunrpc]
>     kworker/u8:0-12608 [001] .... 47884.661074: rpc_task_run_action:
> task:20469@1 flags=5281 state=0005 status=0 action=call_reserveresult
> [sunrpc]
>     kworker/u8:0-12608 [001] .... 47884.661074: rpc_task_run_action:
> task:20469@1 flags=5281 state=0005 status=0 action=call_refresh
> [sunrpc]
>     kworker/u8:0-12608 [001] .... 47884.661075: rpc_task_run_action:
> task:20469@1 flags=5281 state=0005 status=0 action=call_refreshresult
> [sunrpc]
>     kworker/u8:0-12608 [001] .... 47884.661076: rpc_task_run_action:
> task:20469@1 flags=5281 state=0005 status=0 action=call_allocate
> [sunrpc]
>     kworker/u8:0-12608 [001] .... 47884.661076: rpc_task_run_action:
> task:20469@1 flags=5281 state=0005 status=0 action=call_encode
> [sunrpc]
>     kworker/u8:0-12608 [001] .... 47884.661091: rpc_task_run_action:
> task:20469@1 flags=5281 state=001d status=0 action=call_bind [sunrpc]
>     kworker/u8:0-12608 [001] .... 47884.661091: rpc_task_run_action:
> task:20469@1 flags=5281 state=001d status=0 action=call_connect
> [sunrpc]
>     kworker/u8:0-12608 [001] .... 47884.661092: rpc_task_run_action:
> task:20469@1 flags=5281 state=001d status=0 action=call_transmit
> [sunrpc]
>     kworker/u8:0-12608 [001] .... 47884.661106: xprt_transmit:
> peer=[172.21.86.74]:2049 xid=0x0d7ca068 status=0
>     kworker/u8:0-12608 [001] .... 47884.661107: rpc_task_run_action:
> task:20469@1 flags=5a81 state=0015 status=0
> action=call_transmit_status [sunrpc]
>     kworker/u8:0-12608 [001] .... 47884.661107: rpc_task_sleep: task:
> 20469@1 flags=5a81 state=0015 status=0 timeout=60000
> queue=xprt_pending
>     kworker/u9:0-12884 [002] .... 47884.661371: xprt_lookup_rqst:
> peer=[172.21.86.74]:2049 xid=0x0d7ca068 status=0
>     kworker/u9:0-12884 [002] .... 47884.661372: xprt_complete_rqst:
> peer=[172.21.86.74]:2049 xid=0x0d7ca068 status=112
>     kworker/u9:0-12884 [002] .... 47884.661373: rpc_task_wakeup:
> task:20469@1 flags=5a81 state=0006 status=0 timeout=60000
> queue=xprt_pending
>     kworker/u9:0-12884 [002] .... 47884.661375:
> xs_stream_read_request: peer=[172.21.86.74]:2049 xid=0x0d7ca068
> copied=112 reclen=112 offset=112
>     kworker/u9:0-12884 [002] .... 47884.661376: xs_stream_read_data:
> peer=[172.21.86.74]:2049 err=-11 total=116
>     kworker/u8:0-12608 [001] .... 47884.661388: rpc_task_run_action:
> task:20469@1 flags=5a81 state=0005 status=0 action=xprt_timer
> [sunrpc]
>     kworker/u8:0-12608 [001] .... 47884.661389: rpc_task_run_action:
> task:20469@1 flags=5a81 state=0005 status=0 action=call_status
> [sunrpc]
>     kworker/u8:0-12608 [001] .... 47884.661389: rpc_task_run_action:
> task:20469@1 flags=5a81 state=0005 status=0 action=call_decode
> [sunrpc]
>     kworker/u8:0-12608 [001] .... 47884.661408: rpc_task_run_action:
> task:20469@1 flags=5a81 state=0005 status=-10063 action=rpc_exit_task
> [sunrpc]
>     kworker/u8:0-12608 [001] .... 47884.661409: nfs4_sequence_done:
> error=-10063 (SEQ_MISORDERED) session=0x4480e725 slot_nr=0 seq_nr=1
> highest_slotid=0 target_highest_slotid=0 status_flags=0 ()
>     kworker/u8:0-12608 [001] .... 47884.661410: rpc_stats_latency:
> task:20469@1 xid=0x0d7ca068 nfsv4 SEQUENCE backlog=21 rtt=278
> execute=339
>     kworker/u8:0-12608 [001] .... 47884.661412: rpc_task_run_action:
> task:20469@1 flags=5281 state=0005 status=0 action=rpc_prepare_task
> [sunrpc]
>     kworker/u8:0-12608 [001] .... 47884.661413: nfs4_setup_sequence:
> session=0x4480e725 slot_nr=0 seq_nr=1 highest_used_slotid=0
>     kworker/u8:0-12608 [001] .... 47884.661413: rpc_task_run_action:
> task:20469@1 flags=5281 state=0005 status=0 action=call_start
> [sunrpc]
>     kworker/u8:0-12608 [001] .... 47884.661414: rpc_request: task:
> 20469@1 nfsv4 SEQUENCE (async)

This is not an RPC layer issue. It is a SEQ_MISORDERED error on slot 0.
If the client can't recover then that will hang your NFSv4.1 session
against that server.

Are you sure this isn't what was happening previously?

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Need help debugging NFS issues new to 4.20 kernel
  2019-02-20 15:37                       ` Trond Myklebust
@ 2019-02-20 15:39                         ` Chuck Lever
  2019-02-20 15:41                         ` Trond Myklebust
  2019-02-20 16:25                         ` Jason L Tibbitts III
  2 siblings, 0 replies; 25+ messages in thread
From: Chuck Lever @ 2019-02-20 15:39 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: tibbs, bcodding, Anna Schumaker, Linux NFS Mailing List



> On Feb 20, 2019, at 10:37 AM, Trond Myklebust <trondmy@hammerspace.com> wrote:
> 
> On Wed, 2019-02-20 at 09:25 -0600, Jason L Tibbitts III wrote:
>> Sadly, after a bit more testing I find that 4.20.10 with that patch
>> applied still exhibits the problem.  Just found a machine where NFS
>> hung
>> up just six minutes after the user logged in.  As always: nothing is
>> logged to the client or the server.  I turned on the tracing as
>> requested previously and see what I think it just the same sequence
>> as
>> before.  Please let me know if there is any other information I can
>> provide.  Right now I'm going to build 4.20.10 with my previously
>> posted
>> revert patch applied and roll that out to the same 50 hosts and see
>> how
>> it fares.
>> 
>> - J<
>> 
>>    kworker/u8:0-12608 [001] .... 47884.661073: rpc_request: task:
>> 20469@1 nfsv4 SEQUENCE (async)
>>    kworker/u8:0-12608 [001] .... 47884.661073: rpc_task_run_action:
>> task:20469@1 flags=5281 state=0005 status=0 action=call_reserve
>> [sunrpc]
>>    kworker/u8:0-12608 [001] .... 47884.661074: rpc_task_run_action:
>> task:20469@1 flags=5281 state=0005 status=0 action=call_reserveresult
>> [sunrpc]
>>    kworker/u8:0-12608 [001] .... 47884.661074: rpc_task_run_action:
>> task:20469@1 flags=5281 state=0005 status=0 action=call_refresh
>> [sunrpc]
>>    kworker/u8:0-12608 [001] .... 47884.661075: rpc_task_run_action:
>> task:20469@1 flags=5281 state=0005 status=0 action=call_refreshresult
>> [sunrpc]
>>    kworker/u8:0-12608 [001] .... 47884.661076: rpc_task_run_action:
>> task:20469@1 flags=5281 state=0005 status=0 action=call_allocate
>> [sunrpc]
>>    kworker/u8:0-12608 [001] .... 47884.661076: rpc_task_run_action:
>> task:20469@1 flags=5281 state=0005 status=0 action=call_encode
>> [sunrpc]
>>    kworker/u8:0-12608 [001] .... 47884.661091: rpc_task_run_action:
>> task:20469@1 flags=5281 state=001d status=0 action=call_bind [sunrpc]
>>    kworker/u8:0-12608 [001] .... 47884.661091: rpc_task_run_action:
>> task:20469@1 flags=5281 state=001d status=0 action=call_connect
>> [sunrpc]
>>    kworker/u8:0-12608 [001] .... 47884.661092: rpc_task_run_action:
>> task:20469@1 flags=5281 state=001d status=0 action=call_transmit
>> [sunrpc]
>>    kworker/u8:0-12608 [001] .... 47884.661106: xprt_transmit:
>> peer=[172.21.86.74]:2049 xid=0x0d7ca068 status=0
>>    kworker/u8:0-12608 [001] .... 47884.661107: rpc_task_run_action:
>> task:20469@1 flags=5a81 state=0015 status=0
>> action=call_transmit_status [sunrpc]
>>    kworker/u8:0-12608 [001] .... 47884.661107: rpc_task_sleep: task:
>> 20469@1 flags=5a81 state=0015 status=0 timeout=60000
>> queue=xprt_pending
>>    kworker/u9:0-12884 [002] .... 47884.661371: xprt_lookup_rqst:
>> peer=[172.21.86.74]:2049 xid=0x0d7ca068 status=0
>>    kworker/u9:0-12884 [002] .... 47884.661372: xprt_complete_rqst:
>> peer=[172.21.86.74]:2049 xid=0x0d7ca068 status=112
>>    kworker/u9:0-12884 [002] .... 47884.661373: rpc_task_wakeup:
>> task:20469@1 flags=5a81 state=0006 status=0 timeout=60000
>> queue=xprt_pending
>>    kworker/u9:0-12884 [002] .... 47884.661375:
>> xs_stream_read_request: peer=[172.21.86.74]:2049 xid=0x0d7ca068
>> copied=112 reclen=112 offset=112
>>    kworker/u9:0-12884 [002] .... 47884.661376: xs_stream_read_data:
>> peer=[172.21.86.74]:2049 err=-11 total=116
>>    kworker/u8:0-12608 [001] .... 47884.661388: rpc_task_run_action:
>> task:20469@1 flags=5a81 state=0005 status=0 action=xprt_timer
>> [sunrpc]
>>    kworker/u8:0-12608 [001] .... 47884.661389: rpc_task_run_action:
>> task:20469@1 flags=5a81 state=0005 status=0 action=call_status
>> [sunrpc]
>>    kworker/u8:0-12608 [001] .... 47884.661389: rpc_task_run_action:
>> task:20469@1 flags=5a81 state=0005 status=0 action=call_decode
>> [sunrpc]
>>    kworker/u8:0-12608 [001] .... 47884.661408: rpc_task_run_action:
>> task:20469@1 flags=5a81 state=0005 status=-10063 action=rpc_exit_task
>> [sunrpc]
>>    kworker/u8:0-12608 [001] .... 47884.661409: nfs4_sequence_done:
>> error=-10063 (SEQ_MISORDERED) session=0x4480e725 slot_nr=0 seq_nr=1
>> highest_slotid=0 target_highest_slotid=0 status_flags=0 ()
>>    kworker/u8:0-12608 [001] .... 47884.661410: rpc_stats_latency:
>> task:20469@1 xid=0x0d7ca068 nfsv4 SEQUENCE backlog=21 rtt=278
>> execute=339
>>    kworker/u8:0-12608 [001] .... 47884.661412: rpc_task_run_action:
>> task:20469@1 flags=5281 state=0005 status=0 action=rpc_prepare_task
>> [sunrpc]
>>    kworker/u8:0-12608 [001] .... 47884.661413: nfs4_setup_sequence:
>> session=0x4480e725 slot_nr=0 seq_nr=1 highest_used_slotid=0
>>    kworker/u8:0-12608 [001] .... 47884.661413: rpc_task_run_action:
>> task:20469@1 flags=5281 state=0005 status=0 action=call_start
>> [sunrpc]
>>    kworker/u8:0-12608 [001] .... 47884.661414: rpc_request: task:
>> 20469@1 nfsv4 SEQUENCE (async)
> 
> This is not an RPC layer issue. It is a SEQ_MISORDERED error on slot 0.
> If the client can't recover then that will hang your NFSv4.1 session
> against that server.
> 
> Are you sure this isn't what was happening previously?

This is exactly what was happening earlier. See my e-mail to the list
on Feb 8.


--
Chuck Lever




^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Need help debugging NFS issues new to 4.20 kernel
  2019-02-20 15:37                       ` Trond Myklebust
  2019-02-20 15:39                         ` Chuck Lever
@ 2019-02-20 15:41                         ` Trond Myklebust
  2019-02-21 18:19                           ` Jason L Tibbitts III
  2019-02-20 16:25                         ` Jason L Tibbitts III
  2 siblings, 1 reply; 25+ messages in thread
From: Trond Myklebust @ 2019-02-20 15:41 UTC (permalink / raw)
  To: tibbs; +Cc: bcodding, Anna.Schumaker, linux-nfs, Chuck.Lever

On Wed, 2019-02-20 at 15:37 +0000, Trond Myklebust wrote:
> 
> This is not an RPC layer issue. It is a SEQ_MISORDERED error on slot
> 0.
> If the client can't recover then that will hang your NFSv4.1 session
> against that server.

Can you please try the following patch? It uses a different approach to
dealing with interrupted NFSv4.1 RPC calls.

8<-----------------------------------
From d72ad5fe4c2206458d127cc31e690318cf2e2731 Mon Sep 17 00:00:00 2001
From: Trond Myklebust <trond.myklebust@hammerspace.com>
Date: Wed, 20 Jun 2018 17:53:34 -0400
Subject: [PATCH] NFSv4.1: Avoid false retries when RPC calls are interrupted

A 'false retry' in NFSv4.1 occurs when the client attempts to transmit a
new RPC call using a slot+sequence number combination that references an
already cached one. Currently, the Linux NFS client will do this if a
user process interrupts an RPC call that is in progress.
The problem with doing so is that we defeat the main mechanism used by
the server to differentiate between a new call and a replayed one. Even
if the server is able to perfectly cache the arguments of the old call,
it cannot know if the client intended to replay or send a new call.

The obvious fix is to bump the sequence number pre-emptively if an
RPC call is interrupted, but in order to deal with the corner cases
where the interrupted call is not actually received and processed by
the server, we need to interpret the error NFS4ERR_SEQ_MISORDERED
as a sign that we need to either wait or locate a correct sequence
number that lies between the value we sent, and the last value that
was acked by a SEQUENCE call on that slot.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
---
 fs/nfs/nfs4proc.c    | 105 ++++++++++++++++++++-----------------------
 fs/nfs/nfs4session.c |   5 ++-
 fs/nfs/nfs4session.h |   5 ++-
 3 files changed, 55 insertions(+), 60 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 1dbdc85e6885..77c6e2d3f3fc 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -730,13 +730,25 @@ static void nfs41_sequence_free_slot(struct nfs4_sequence_res *res)
 	res->sr_slot = NULL;
 }
 
+static void nfs4_slot_sequence_record_sent(struct nfs4_slot *slot,
+		u32 seqnr)
+{
+	if ((s32)(seqnr - slot->seq_nr_highest_sent) > 0)
+		slot->seq_nr_highest_sent = seqnr;
+}
+static void nfs4_slot_sequence_acked(struct nfs4_slot *slot,
+		u32 seqnr)
+{
+	slot->seq_nr_highest_sent = seqnr;
+	slot->seq_nr_last_acked = seqnr;
+}
+
 static int nfs41_sequence_process(struct rpc_task *task,
 		struct nfs4_sequence_res *res)
 {
 	struct nfs4_session *session;
 	struct nfs4_slot *slot = res->sr_slot;
 	struct nfs_client *clp;
-	bool interrupted = false;
 	int ret = 1;
 
 	if (slot == NULL)
@@ -747,16 +759,12 @@ static int nfs41_sequence_process(struct rpc_task *task,
 
 	session = slot->table->session;
 
-	if (slot->interrupted) {
-		if (res->sr_status != -NFS4ERR_DELAY)
-			slot->interrupted = 0;
-		interrupted = true;
-	}
-
 	trace_nfs4_sequence_done(session, res);
 	/* Check the SEQUENCE operation status */
 	switch (res->sr_status) {
 	case 0:
+		/* Mark this sequence number as having been acked */
+		nfs4_slot_sequence_acked(slot, slot->seq_nr);
 		/* Update the slot's sequence and clientid lease timer */
 		slot->seq_done = 1;
 		clp = session->clp;
@@ -771,9 +779,9 @@ static int nfs41_sequence_process(struct rpc_task *task,
 		 * sr_status remains 1 if an RPC level error occurred.
 		 * The server may or may not have processed the sequence
 		 * operation..
-		 * Mark the slot as having hosted an interrupted RPC call.
 		 */
-		slot->interrupted = 1;
+		nfs4_slot_sequence_record_sent(slot, slot->seq_nr);
+		slot->seq_done = 1;
 		goto out;
 	case -NFS4ERR_DELAY:
 		/* The server detected a resend of the RPC call and
@@ -784,6 +792,7 @@ static int nfs41_sequence_process(struct rpc_task *task,
 			__func__,
 			slot->slot_nr,
 			slot->seq_nr);
+		nfs4_slot_sequence_acked(slot, slot->seq_nr);
 		goto out_retry;
 	case -NFS4ERR_RETRY_UNCACHED_REP:
 	case -NFS4ERR_SEQ_FALSE_RETRY:
@@ -791,6 +800,7 @@ static int nfs41_sequence_process(struct rpc_task *task,
 		 * The server thinks we tried to replay a request.
 		 * Retry the call after bumping the sequence ID.
 		 */
+		nfs4_slot_sequence_acked(slot, slot->seq_nr);
 		goto retry_new_seq;
 	case -NFS4ERR_BADSLOT:
 		/*
@@ -801,21 +811,28 @@ static int nfs41_sequence_process(struct rpc_task *task,
 			goto session_recover;
 		goto retry_nowait;
 	case -NFS4ERR_SEQ_MISORDERED:
+		nfs4_slot_sequence_record_sent(slot, slot->seq_nr);
 		/*
-		 * Was the last operation on this sequence interrupted?
-		 * If so, retry after bumping the sequence number.
+		 * Were one or more calls using this slot interrupted?
+		 * If the server never received the request, then our
+		 * transmitted slot sequence number may be too high.
 		 */
-		if (interrupted)
-			goto retry_new_seq;
-		/*
-		 * Could this slot have been previously retired?
-		 * If so, then the server may be expecting seq_nr = 1!
-		 */
-		if (slot->seq_nr != 1) {
-			slot->seq_nr = 1;
+		if ((s32)(slot->seq_nr - slot->seq_nr_last_acked) > 1) {
+			slot->seq_nr--;
 			goto retry_nowait;
 		}
-		goto session_recover;
+		/*
+		 * RFC5661:
+		 * A retry might be sent while the original request is
+		 * still in progress on the replier. The replier SHOULD
+		 * deal with the issue by returning NFS4ERR_DELAY as the
+		 * reply to SEQUENCE or CB_SEQUENCE operation, but
+		 * implementations MAY return NFS4ERR_SEQ_MISORDERED.
+		 *
+		 * Restart the search after a delay.
+		 */
+		slot->seq_nr = slot->seq_nr_highest_sent;
+		goto out_retry;
 	default:
 		/* Just update the slot sequence no. */
 		slot->seq_done = 1;
@@ -906,17 +923,6 @@ static const struct rpc_call_ops nfs41_call_sync_ops = {
 	.rpc_call_done = nfs41_call_sync_done,
 };
 
-static void
-nfs4_sequence_process_interrupted(struct nfs_client *client,
-		struct nfs4_slot *slot, const struct cred *cred)
-{
-	struct rpc_task *task;
-
-	task = _nfs41_proc_sequence(client, cred, slot, true);
-	if (!IS_ERR(task))
-		rpc_put_task_async(task);
-}
-
 #else	/* !CONFIG_NFS_V4_1 */
 
 static int nfs4_sequence_process(struct rpc_task *task, struct nfs4_sequence_res *res)
@@ -937,14 +943,6 @@ int nfs4_sequence_done(struct rpc_task *task,
 }
 EXPORT_SYMBOL_GPL(nfs4_sequence_done);
 
-static void
-nfs4_sequence_process_interrupted(struct nfs_client *client,
-		struct nfs4_slot *slot, const struct cred *cred)
-{
-	WARN_ON_ONCE(1);
-	slot->interrupted = 0;
-}
-
 #endif	/* !CONFIG_NFS_V4_1 */
 
 static
@@ -982,26 +980,19 @@ int nfs4_setup_sequence(struct nfs_client *client,
 		task->tk_timeout = 0;
 	}
 
-	for (;;) {
-		spin_lock(&tbl->slot_tbl_lock);
-		/* The state manager will wait until the slot table is empty */
-		if (nfs4_slot_tbl_draining(tbl) && !args->sa_privileged)
-			goto out_sleep;
-
-		slot = nfs4_alloc_slot(tbl);
-		if (IS_ERR(slot)) {
-			/* Try again in 1/4 second */
-			if (slot == ERR_PTR(-ENOMEM))
-				task->tk_timeout = HZ >> 2;
-			goto out_sleep;
-		}
-		spin_unlock(&tbl->slot_tbl_lock);
+	spin_lock(&tbl->slot_tbl_lock);
+	/* The state manager will wait until the slot table is empty */
+	if (nfs4_slot_tbl_draining(tbl) && !args->sa_privileged)
+		goto out_sleep;
 
-		if (likely(!slot->interrupted))
-			break;
-		nfs4_sequence_process_interrupted(client,
-				slot, task->tk_msg.rpc_cred);
+	slot = nfs4_alloc_slot(tbl);
+	if (IS_ERR(slot)) {
+		/* Try again in 1/4 second */
+		if (slot == ERR_PTR(-ENOMEM))
+			task->tk_timeout = HZ >> 2;
+		goto out_sleep;
 	}
+	spin_unlock(&tbl->slot_tbl_lock);
 
 	nfs4_sequence_attach_slot(args, res, slot);
 
diff --git a/fs/nfs/nfs4session.c b/fs/nfs/nfs4session.c
index 25c5255a395c..bcb532def9e2 100644
--- a/fs/nfs/nfs4session.c
+++ b/fs/nfs/nfs4session.c
@@ -110,6 +110,8 @@ static struct nfs4_slot *nfs4_new_slot(struct nfs4_slot_table  *tbl,
 		slot->table = tbl;
 		slot->slot_nr = slotid;
 		slot->seq_nr = seq_init;
+		slot->seq_nr_highest_sent = seq_init;
+		slot->seq_nr_last_acked = seq_init - 1;
 	}
 	return slot;
 }
@@ -276,7 +278,8 @@ static void nfs4_reset_slot_table(struct nfs4_slot_table *tbl,
 	p = &tbl->slots;
 	while (*p) {
 		(*p)->seq_nr = ivalue;
-		(*p)->interrupted = 0;
+		(*p)->seq_nr_highest_sent = ivalue;
+		(*p)->seq_nr_last_acked = ivalue - 1;
 		p = &(*p)->next;
 	}
 	tbl->highest_used_slotid = NFS4_NO_SLOT;
diff --git a/fs/nfs/nfs4session.h b/fs/nfs/nfs4session.h
index 3c550f297561..230509b77121 100644
--- a/fs/nfs/nfs4session.h
+++ b/fs/nfs/nfs4session.h
@@ -23,8 +23,9 @@ struct nfs4_slot {
 	unsigned long		generation;
 	u32			slot_nr;
 	u32		 	seq_nr;
-	unsigned int		interrupted : 1,
-				privileged : 1,
+	u32		 	seq_nr_last_acked;
+	u32		 	seq_nr_highest_sent;
+	unsigned int		privileged : 1,
 				seq_done : 1;
 };
 
-- 
2.20.1

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: Need help debugging NFS issues new to 4.20 kernel
  2019-02-20 15:37                       ` Trond Myklebust
  2019-02-20 15:39                         ` Chuck Lever
  2019-02-20 15:41                         ` Trond Myklebust
@ 2019-02-20 16:25                         ` Jason L Tibbitts III
  2019-02-20 16:45                           ` Trond Myklebust
  2 siblings, 1 reply; 25+ messages in thread
From: Jason L Tibbitts III @ 2019-02-20 16:25 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: bcodding, Anna.Schumaker, linux-nfs, Chuck.Lever

>>>>> "TM" == Trond Myklebust <trondmy@hammerspace.com> writes:

TM> This is not an RPC layer issue. It is a SEQ_MISORDERED error on slot
TM> 0.  If the client can't recover then that will hang your NFSv4.1
TM> session against that server.

TM> Are you sure this isn't what was happening previously?

I'm sorry if I wasn't clear.  I applied your patch and it doesn't seem
to make any difference.  The behavior appears to be exactly what was
happening previously, though I posted the trace again just so that those
who know more than me could verify that.

So currently I'm falling back to the previous patch I was testing, which
reverts c443305529d1d3d3bee0d68fdd14ae89835e091f on top of the stable
stream.

 - J<

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Need help debugging NFS issues new to 4.20 kernel
  2019-02-20 16:25                         ` Jason L Tibbitts III
@ 2019-02-20 16:45                           ` Trond Myklebust
  2019-02-20 16:49                             ` Jason L Tibbitts III
  0 siblings, 1 reply; 25+ messages in thread
From: Trond Myklebust @ 2019-02-20 16:45 UTC (permalink / raw)
  To: tibbs; +Cc: bcodding, Anna.Schumaker, linux-nfs, Chuck.Lever

On Wed, 2019-02-20 at 10:25 -0600, Jason L Tibbitts III wrote:
> > > > > > "TM" == Trond Myklebust <trondmy@hammerspace.com> writes:
> 
> TM> This is not an RPC layer issue. It is a SEQ_MISORDERED error on
> slot
> TM> 0.  If the client can't recover then that will hang your NFSv4.1
> TM> session against that server.
> 
> TM> Are you sure this isn't what was happening previously?
> 
> I'm sorry if I wasn't clear.  I applied your patch and it doesn't
> seem
> to make any difference.  The behavior appears to be exactly what was
> happening previously, though I posted the trace again just so that
> those
> who know more than me could verify that.
> 

That's why I sent you a patch that addresses the SEQ_MISORDERED issue.
Are you saying that failed too?

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Need help debugging NFS issues new to 4.20 kernel
  2019-02-20 16:45                           ` Trond Myklebust
@ 2019-02-20 16:49                             ` Jason L Tibbitts III
  0 siblings, 0 replies; 25+ messages in thread
From: Jason L Tibbitts III @ 2019-02-20 16:49 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: bcodding, Anna.Schumaker, linux-nfs, Chuck.Lever

>>>>> "TM" == Trond Myklebust <trondmy@hammerspace.com> writes:

TM> That's why I sent you a patch that addresses the SEQ_MISORDERED
TM> issue.  Are you saying that failed too?

If you're referring to d72ad5fe4c2206458d127cc31e690318cf2e2731
"NFSv4.1: Avoid false retries when RPC calls are interrupted" which you
just sent a few minutes ago, I'm building it now.  It will take me a
while to roll this out to user desktops and then see if anyone hits it.

 - J<

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Need help debugging NFS issues new to 4.20 kernel
  2019-02-20 15:41                         ` Trond Myklebust
@ 2019-02-21 18:19                           ` Jason L Tibbitts III
  2019-02-25 19:24                             ` Jason L Tibbitts III
  0 siblings, 1 reply; 25+ messages in thread
From: Jason L Tibbitts III @ 2019-02-21 18:19 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: bcodding, Anna.Schumaker, linux-nfs, Chuck.Lever

>>>>> "TM" == Trond Myklebust <trondmy@hammerspace.com> writes:

TM> Can you please try the following patch? It uses a different approach
TM> to dealing with interrupted NFSv4.1 RPC calls.

I can say that so far, after applying this last night to 50 machines and
users having logged for a bit over five hours now, things appear to be
working properly.  I haven't seen any issues this morning whereas with
an unpatched kernel I would have expected to see several hangs (or at
least user-initiated reboots) by now.

I will of course continue to monitor things, but at this point, feel
free to add

Tested-by: Jason Tibbitts <tibbs@math.uh.edu>

 - J<

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Need help debugging NFS issues new to 4.20 kernel
  2019-02-21 18:19                           ` Jason L Tibbitts III
@ 2019-02-25 19:24                             ` Jason L Tibbitts III
  2019-02-25 23:15                               ` Benjamin Coddington
  0 siblings, 1 reply; 25+ messages in thread
From: Jason L Tibbitts III @ 2019-02-25 19:24 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: bcodding, Anna.Schumaker, linux-nfs, Chuck.Lever

So I've now running this patch ("NFSv4.1: Avoid false retries when RPC
calls are interrupted") for several days with no hangs at all and no
other regressions noted.  What would the next step be?  Will this be
sent upstream?  I'm not sure how to check if this is queued for
submission in someone's tree.

I doubt there is anything I can do to help the process but please let me
know if there is.

 - J<

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Need help debugging NFS issues new to 4.20 kernel
  2019-02-25 19:24                             ` Jason L Tibbitts III
@ 2019-02-25 23:15                               ` Benjamin Coddington
  0 siblings, 0 replies; 25+ messages in thread
From: Benjamin Coddington @ 2019-02-25 23:15 UTC (permalink / raw)
  To: Jason L Tibbitts III
  Cc: Trond Myklebust, Anna.Schumaker, linux-nfs, Chuck.Lever

Hi Jason,

It looks like Trond has this patch on his "linux-next" and on his 
"testing" branch:

http://git.linux-nfs.org/?p=trondmy/linux-nfs.git;a=shortlog;h=refs/heads/linux-next

commit 3453d5708b33efe76f40eca1c0ed60923094b971
Author: Trond Myklebust <trond.myklebust@hammerspace.com>
Date:   Wed Jun 20 17:53:34 2018 -0400

     NFSv4.1: Avoid false retries when RPC calls are interrupted

     A 'false retry' in NFSv4.1 occurs when the client attempts to 
transmit a
     new RPC call using a slot+sequence number combination that 
references an
     already cached one. Currently, the Linux NFS client will do this if 
a
     user process interrupts an RPC call that is in progress.
     The problem with doing so is that we defeat the main mechanism used 
by
     the server to differentiate between a new call and a replayed one. 
Even
     if the server is able to perfectly cache the arguments of the old 
call,
     it cannot know if the client intended to replay or send a new call.

     The obvious fix is to bump the sequence number pre-emptively if an
     RPC call is interrupted, but in order to deal with the corner cases
     where the interrupted call is not actually received and processed 
by
     the server, we need to interpret the error NFS4ERR_SEQ_MISORDERED
     as a sign that we need to either wait or locate a correct sequence
     number that lies between the value we sent, and the last value that
     was acked by a SEQUENCE call on that slot.

     Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
     Tested-by: Jason Tibbitts <tibbs@math.uh.edu>

That's usually a good sign that he will include it in a pull request 
when the 5.1 merge window opens.

Anna and Trond trade off sending patches for each linux release, this 
time around Trond will send them.

Without having to take any further steps I expect this will land in 
mainline for 5.1, and fedora will eventually pick it up.

Ben

On 25 Feb 2019, at 14:24, Jason L Tibbitts III wrote:

> So I've now running this patch ("NFSv4.1: Avoid false retries when RPC
> calls are interrupted") for several days with no hangs at all and no
> other regressions noted.  What would the next step be?  Will this be
> sent upstream?  I'm not sure how to check if this is queued for
> submission in someone's tree.
>
> I doubt there is anything I can do to help the process but please let 
> me
> know if there is.
>
>  - J<

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2019-02-25 23:15 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-24 17:32 Need help debugging NFS issues new to 4.20 kernel Jason L Tibbitts III
2019-01-24 19:28 ` Jason L Tibbitts III
2019-01-24 19:58 ` Trond Myklebust
2019-01-25 19:13   ` Schumaker, Anna
2019-01-26 17:59     ` Sasha Levin
2019-01-25 19:51   ` Jason L Tibbitts III
2019-02-05 18:12     ` Jason Tibbitts
2019-02-06 12:05       ` Benjamin Coddington
     [not found]         ` <87imxwab12.fsf@hippogriff.math.uh.edu>
2019-02-07 11:13           ` Benjamin Coddington
     [not found]             ` <87d0o3aadg.fsf@hippogriff.math.uh.edu>
2019-02-08 12:01               ` Benjamin Coddington
2019-02-08 15:19                 ` Chuck Lever
2019-02-08 17:17                   ` Jason L Tibbitts III
2019-02-15 20:33                 ` Jason L Tibbitts III
2019-02-16 14:46                   ` Trond Myklebust
2019-02-20  2:13                     ` Jason L Tibbitts III
2019-02-20 15:25                     ` Jason L Tibbitts III
2019-02-20 15:37                       ` Trond Myklebust
2019-02-20 15:39                         ` Chuck Lever
2019-02-20 15:41                         ` Trond Myklebust
2019-02-21 18:19                           ` Jason L Tibbitts III
2019-02-25 19:24                             ` Jason L Tibbitts III
2019-02-25 23:15                               ` Benjamin Coddington
2019-02-20 16:25                         ` Jason L Tibbitts III
2019-02-20 16:45                           ` Trond Myklebust
2019-02-20 16:49                             ` Jason L Tibbitts III

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).