Linux-NFS Archive on lore.kernel.org
* Need help debugging NFS issues new to 4.20 kernel
@ 2019-01-24 17:32 Jason L Tibbitts III
  2019-01-24 19:28 ` Jason L Tibbitts III
  2019-01-24 19:58 ` Trond Myklebust
  0 siblings, 2 replies; 13+ messages in thread
From: Jason L Tibbitts III @ 2019-01-24 17:32 UTC (permalink / raw)
  To: linux-nfs

I could use some help figuring out the cause of some serious NFS client
issues I'm having with the 4.20.3 kernel which I did not see under
4.19.15.

I have a network of about 130 desktops (plus a bunch of other machines,
VMs and the like) running Fedora 29 connecting to six NFS servers
running CentOS 7.6 (with the heavily patched vendor kernel
3.10.0-957.1.3).  All machines involved are x86_64.  We use kerberized
NFS4 with generally sec=krb5i.  The exports are generally made with
"(rw,async,sec=krb5i:krb5p)".
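
Concretely, a typical export and client mount look like this (a sketch;
the path and hostnames here are placeholders, not our real ones):

```shell
# /etc/exports on the CentOS servers (hypothetical path/host pattern):
# read-write, async, Kerberos integrity or privacy
/export/home  *.example.edu(rw,async,sec=krb5i:krb5p)

# Corresponding client-side mount (server name is also a placeholder):
# mount -t nfs4 -o sec=krb5i nas00:/export/home /home
```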

Since I booted those clients into 4.20.3 I've started seeing processes
getting stuck in the D state.  The system itself will seem OK (except
for the high load average) as long as I don't touch the hung NFS mount.
Nothing was logged to dmesg or to the journal.  So far booting back into
the 4.19.15 kernel has cleared up the problem.  I cannot yet reproduce
this on demand; I've tried but it is probably related to some specific
usage pattern.
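
For reference, the quick way I've been spotting the stuck tasks (a rough
sketch; assumes a procps-style ps):

```shell
# List tasks in uninterruptible sleep (state D), i.e. the hung ones;
# the "=" after each column name suppresses the header line
list_dstate() {
    ps -eo pid=,stat=,comm= | awk '$2 ~ /^D/ { print $1, $3 }'
}

# With root, "echo w > /proc/sysrq-trigger" then dumps their kernel
# backtraces to dmesg (the same thing as alt-sysrq-w on the console).
list_dstate
```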

Has anyone else seen issues like this?  Can anyone help me to get more
useful information that might point to the problem?  I still haven't
learned how to debug NFS issues properly.  And if there's a stress test
tool I could easily run that might help to reproduce the issue, I'd be
happy to run it.

I note that 4.20.4 is out; I see one sunrpc fix which I guess could be
related (sunrpc: handle ENOMEM in rpcb_getport_async) but the systems
involved have plenty of free memory so I doubt that's it.  I'll
certainly try it anyway.

Various package versions:
kernel-4.20.3-200.fc29.x86_64 (the problematic kernel)
kernel-4.19.15-300.fc29.x86_64 (the functional kernel)
nfs-utils-2.3.3-1.rc2.fc29.x86_64
gssproxy-0.8.0-6.fc29.x86_64
krb5-libs-1.16.1-25.fc29.i686

Thanks in advance for any help or advice,

 - J<

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Need help debugging NFS issues new to 4.20 kernel
  2019-01-24 17:32 Need help debugging NFS issues new to 4.20 kernel Jason L Tibbitts III
@ 2019-01-24 19:28 ` Jason L Tibbitts III
  2019-01-24 19:58 ` Trond Myklebust
  1 sibling, 0 replies; 13+ messages in thread
From: Jason L Tibbitts III @ 2019-01-24 19:28 UTC (permalink / raw)
  To: linux-nfs

Just had another machine get into this state.  This time I thought to at
least do alt-sysrq-w.  The result is a big pile of backtraces but it
looks like basically every NFS operation is hung up.

I'm not sure how useful it will be, but I tried to categorize the
various backtraces.  Most of them are like the following, going through
nfs_file_fsync:

tcsh            D    0  9511      1 0x80000006
Call Trace:
 ? __schedule+0x253/0x850
 schedule+0x28/0x80
 io_schedule+0x12/0x40
 wait_on_page_bit_common+0x11a/0x270
 ? file_check_and_advance_wb_err+0xc0/0xc0
 __filemap_fdatawait_range+0xe2/0x130
 ? __filemap_fdatawrite_range+0xc8/0xf0
 filemap_write_and_wait_range+0x45/0x80
 nfs_file_fsync+0x44/0x1e0 [nfs]
 filp_close+0x31/0x60
 put_files_struct+0x6c/0xc0
 do_exit+0x2b4/0xb30
 ? __fput+0x151/0x220
 do_group_exit+0x3a/0xa0
 get_signal+0x276/0x590
 do_signal+0x36/0x610
 ? do_sys_open+0x128/0x210
 exit_to_usermode_loop+0x71/0xe0
 do_syscall_64+0x14d/0x160
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f4760d46ca2
Code: Bad RIP value.
RSP: 002b:00007ffc29c788d0 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
RAX: fffffffffffffe00 RBX: 00005652381bf990 RCX: 00007f4760d46ca2
RDX: 0000000000000000 RSI: 00005652381d4b40 RDI: 00000000ffffff9c
RBP: 0000000000000000 R08: 0000000000000000 R09: 00007ffc29c78920
R10: 0000000000000000 R11: 0000000000000246 R12: 00005652381d4b40
R13: 0000000000000000 R14: 00005652381dcd30 R15: 0000000000000000


There are a few others which are hung on locking operations:

QThread         D    0  9740      1 0x80000006
Call Trace:
 ? __schedule+0x253/0x850
 ? __rpc_wait_for_completion_task+0x30/0x30 [sunrpc]
 schedule+0x28/0x80
 rpc_wait_bit_killable+0x1e/0x90 [sunrpc]
 __wait_on_bit+0x6c/0x80
 out_of_line_wait_on_bit+0x91/0xb0
 ? init_wait_var_entry+0x40/0x40
 nfs4_proc_lock+0x1a9/0x320 [nfsv4]
 ? do_unlk+0x98/0xe0 [nfs]
 locks_remove_posix+0xb1/0x140
 ? __nfs_commit_inode+0xc4/0x1b0 [nfs]
 filp_close+0x50/0x60
 put_files_struct+0x6c/0xc0
 do_exit+0x2b4/0xb30
 ? mem_cgroup_try_charge+0xe6/0x180
 do_group_exit+0x3a/0xa0
 get_signal+0x276/0x590
 do_signal+0x36/0x610
 exit_to_usermode_loop+0x71/0xe0
 do_syscall_64+0x14d/0x160
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7ff3e9321421
Code: Bad RIP value.
RSP: 002b:00007ff3d7ffeac0 EFLAGS: 00000293 ORIG_RAX: 0000000000000007
RAX: fffffffffffffdfc RBX: 00007ff3c8004db0 RCX: 00007ff3e9321421
RDX: 00000000ffffffff RSI: 0000000000000001 RDI: 00007ff3c8004db0
RBP: 0000000000000001 R08: 0000000000000000 R09: 00007ff3c8000c78
R10: 00007ff3d0003480 R11: 0000000000000293 R12: 00000000ffffffff
R13: 00007ff3e6e48f10 R14: 00000000ffffffff R15: 0000000000000001

gvfsd-metadata  D    0 24382      1 0x80000006
Call Trace:
 ? __schedule+0x253/0x850
 ? __rpc_wait_for_completion_task+0x30/0x30 [sunrpc]
 schedule+0x28/0x80
 rpc_wait_bit_killable+0x1e/0x90 [sunrpc]
 __wait_on_bit+0x6c/0x80
 out_of_line_wait_on_bit+0x91/0xb0
 ? init_wait_var_entry+0x40/0x40
 nfs4_do_close+0x2bf/0x2f0 [nfsv4]
 __put_nfs_open_context+0xa9/0x100 [nfs]
 nfs_file_release+0x39/0x40 [nfs]
 __fput+0xb4/0x220
 task_work_run+0x84/0xa0
 do_exit+0x2d2/0xb30
 do_group_exit+0x3a/0xa0
 get_signal+0x276/0x590
 do_signal+0x36/0x610
 ? handle_mm_fault+0xda/0x200
 exit_to_usermode_loop+0x71/0xe0
 do_syscall_64+0x14d/0x160
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f3cf843d421
Code: Bad RIP value.
RSP: 002b:00007ffc224ea250 EFLAGS: 00000293 ORIG_RAX: 0000000000000007
RAX: fffffffffffffdfc RBX: 000055c9db8fa2a0 RCX: 00007f3cf843d421
RDX: 00000000ffffffff RSI: 0000000000000001 RDI: 000055c9db8fa2a0
RBP: 0000000000000001 R08: 0000000000000000 R09: 000055c9db8ea488
R10: 000055c9db8dbae0 R11: 0000000000000293 R12: 00000000ffffffff
R13: 00007f3cf85bff10 R14: 00000000ffffffff R15: 0000000000000001


And some others waiting on metadata operations:

localStorage DB D    0 24794   9621 0x00000004
Call Trace:
 ? __schedule+0x253/0x850
 schedule+0x28/0x80
 io_schedule+0x12/0x40
 wait_on_page_bit_common+0x11a/0x270
 ? file_check_and_advance_wb_err+0xc0/0xc0
 __filemap_fdatawait_range+0xe2/0x130
 filemap_write_and_wait+0x47/0x70
 nfs_wb_all+0x1a/0x120 [nfs]
 nfs_setattr+0x1cb/0x1d0 [nfs]
 notify_change+0x2da/0x440
 do_truncate+0x74/0xc0
 do_sys_ftruncate+0xb6/0x100
 do_syscall_64+0x5b/0x160
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7eff1cd1382b
Code: Bad RIP value.
RSP: 002b:00007eff0c3787c8 EFLAGS: 00000246 ORIG_RAX: 000000000000004d
RAX: ffffffffffffffda RBX: 0000000000bd0000 RCX: 00007eff1cd1382b
RDX: 0002000100000000 RSI: 0000000000bd0000 RDI: 000000000000005c
RBP: 00007eff034d61a0 R08: 00007ffc7d3b40b0 R09: 00007ffc7d3b4080
R10: 000000000274ded2 R11: 0000000000000246 R12: 000000000000005c
R13: 00007eff02f081c0 R14: 00007eff025fd8e0 R15: 00007eff034832e0

df              D    0 25187  25186 0x00000000
Call Trace:
 ? __schedule+0x253/0x850
 ? __rpc_wait_for_completion_task+0x30/0x30 [sunrpc]
 ? __rpc_atrun+0x20/0x20 [sunrpc]
 schedule+0x28/0x80
 rpc_wait_bit_killable+0x1e/0x90 [sunrpc]
 __wait_on_bit+0x6c/0x80
 ? trace_event_raw_event_rpc_stats_latency+0x240/0x240 [sunrpc]
 out_of_line_wait_on_bit+0x91/0xb0
 ? init_wait_var_entry+0x40/0x40
 __rpc_execute+0xe0/0x350 [sunrpc]
 ? recalibrate_cpu_khz+0x10/0x10
 ? ktime_get+0x36/0xa0
 rpc_run_task+0x121/0x180 [sunrpc]
 nfs4_call_sync_sequence+0x64/0xa0 [nfsv4]
 _nfs4_proc_getattr+0xf8/0x120 [nfsv4]
 nfs4_proc_getattr+0x75/0x100 [nfsv4]
 __nfs_revalidate_inode+0x1b7/0x2a0 [nfs]
 nfs_getattr+0x118/0x2b0 [nfs]
 ? security_inode_getattr+0x30/0x50
 vfs_statx+0x89/0xe0
 __do_sys_newstat+0x39/0x70
 do_syscall_64+0x5b/0x160
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f7dc6195579
Code: Bad RIP value.
RSP: 002b:00007ffd44e725a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000004
RAX: ffffffffffffffda RBX: 000055ad9832ef80 RCX: 00007f7dc6195579
RDX: 00007ffd44e72650 RSI: 00007ffd44e72650 RDI: 000055ad9832e380
RBP: 000055ad9832e310 R08: 0000000000000003 R09: aaaaaaaaaaaaaaab
R10: 000055ad98326010 R11: 0000000000000246 R12: 000055ad9832e3c0
R13: 0000000000000000 R14: 000055ad9832b9b0 R15: 0000000000000000


And others on writes:

mozStorage #1   D    0 24816   9621 0x00000004
Call Trace:
 ? __schedule+0x253/0x850
 ? __switch_to_asm+0x40/0x70
 schedule+0x28/0x80
 io_schedule+0x12/0x40
 __lock_page+0x13c/0x230
 ? file_check_and_advance_wb_err+0xc0/0xc0
 pagecache_get_page+0x1ae/0x340
 grab_cache_page_write_begin+0x1c/0x40
 nfs_write_begin+0x63/0x340 [nfs]
 generic_perform_write+0xf4/0x1b0
 ? __handle_mm_fault+0xb9d/0x1380
 nfs_file_write+0xdc/0x200 [nfs]
 __vfs_write+0x136/0x1a0
 vfs_write+0xa5/0x1a0
 ksys_write+0x4f/0xb0
 do_syscall_64+0x5b/0x160
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7eff1d131d57
Code: Bad RIP value.
RSP: 002b:00007eff00546380 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 000000000000002d RCX: 00007eff1d131d57
RDX: 0000000000008000 RSI: 00007efef8085000 RDI: 000000000000002d
RBP: 00007efef8085000 R08: 0000000000000000 R09: 00007ffc7d3b4080
R10: 000000000274d526 R11: 0000000000000293 R12: 0000000000008000
R13: 00007eff08604868 R14: 00000000000300c8 R15: 00007efef8085000


Not sure what this one is waiting on:

kdm-uh          D    0 25357  25345 0x00000000
Call Trace:
 ? __schedule+0x253/0x850
 ? __switch_to_asm+0x34/0x70
 ? __rpc_wait_for_completion_task+0x30/0x30 [sunrpc]
 ? __rpc_atrun+0x20/0x20 [sunrpc]
 schedule+0x28/0x80
 rpc_wait_bit_killable+0x1e/0x90 [sunrpc]
 __wait_on_bit+0x6c/0x80
 ? trace_event_raw_event_rpc_stats_latency+0x240/0x240 [sunrpc]
 out_of_line_wait_on_bit+0x91/0xb0
 ? init_wait_var_entry+0x40/0x40
 __rpc_execute+0xe0/0x350 [sunrpc]
 ? recalibrate_cpu_khz+0x10/0x10
 ? ktime_get+0x36/0xa0
 rpc_run_task+0x121/0x180 [sunrpc]
 nfs4_call_sync_sequence+0x64/0xa0 [nfsv4]
 _nfs4_proc_getattr+0xf8/0x120 [nfsv4]
 nfs4_proc_getattr+0x75/0x100 [nfsv4]
 __nfs_revalidate_inode+0x1b7/0x2a0 [nfs]
 nfs_do_access+0x2cd/0x400 [nfs]
 ? rpcauth_lookupcred+0x99/0xc0 [sunrpc]
 nfs_permission+0x1b9/0x1e0 [nfs]
 inode_permission+0xbe/0x180
 link_path_walk.part.49+0x3f2/0x520
 path_openat+0x9f/0x1610
 ? page_counter_try_charge+0x57/0xc0
 ? memcg_kmem_charge_memcg+0x76/0xa0
 ? get_random_u32+0x3e/0xd0
 do_filp_open+0x93/0x100
 ? __check_object_size+0x15d/0x189
 do_sys_open+0x186/0x210
 do_syscall_64+0x5b/0x160
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f3f0fc5dca2
Code: Bad RIP value.
RSP: 002b:00007ffc18c62840 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
RAX: ffffffffffffffda RBX: 000055a70650bfa0 RCX: 00007f3f0fc5dca2
RDX: 0000000000000800 RSI: 000055a70650bcf0 RDI: 00000000ffffff9c
RBP: 0000000000001e34 R08: 000055a70650c3b0 R09: 0000000000000002
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
R13: 00007ffc18c62a30 R14: 000055a70650bcf0 R15: 00007ffc18c62afc

And there's one kernel thread, I assume for the mount since that's the
IP of the NFS server.

172.21.86.74-ma D    0 25099      2 0x80000000
Call Trace:
 ? __schedule+0x253/0x850
 ? __rpc_wait_for_completion_task+0x30/0x30 [sunrpc]
 schedule+0x28/0x80
 rpc_wait_bit_killable+0x1e/0x90 [sunrpc]
 __wait_on_bit+0x6c/0x80
 out_of_line_wait_on_bit+0x91/0xb0
 ? init_wait_var_entry+0x40/0x40
 nfs4_proc_sequence+0x3c/0x50 [nfsv4]
 nfs4_run_state_manager+0x257/0x760 [nfsv4]
 ? nfs4_do_reclaim+0x680/0x680 [nfsv4]
 kthread+0x112/0x130
 ? kthread_create_on_node+0x60/0x60
 ret_from_fork+0x35/0x40



* Re: Need help debugging NFS issues new to 4.20 kernel
  2019-01-24 17:32 Need help debugging NFS issues new to 4.20 kernel Jason L Tibbitts III
  2019-01-24 19:28 ` Jason L Tibbitts III
@ 2019-01-24 19:58 ` Trond Myklebust
  2019-01-25 19:13   ` Schumaker, Anna
  2019-01-25 19:51   ` Jason L Tibbitts III
  1 sibling, 2 replies; 13+ messages in thread
From: Trond Myklebust @ 2019-01-24 19:58 UTC (permalink / raw)
  To: tibbs, Anna.Schumaker, linux-nfs, Chuck.Lever

On Thu, 2019-01-24 at 11:32 -0600, Jason L Tibbitts III wrote:
> I could use some help figuring out the cause of some serious NFS
> client
> issues I'm having with the 4.20.3 kernel which I did not see under
> 4.19.15.
> 
> I have a network of about 130 desktops (plus a bunch of other
> machines,
> VMs and the like) running Fedora 29 connecting to six NFS servers
> running CentOS 7.6 (with the heavily patched vendor kernel
> 3.10.0-957.1.3).  All machines involved are x86_64.  We use
> kerberized
> NFS4 with generally sec=krb5i.  The exports are generally made with
> "(rw,async,sec=krb5i:krb5p)".
> 
> Since I booted those clients into 4.20.3 I've started seeing
> processes
> getting stuck in the D state.  The system itself will seem OK (except
> for the high load average) as long as I don't touch the hung NFS
> mount.
> Nothing was logged to dmesg or to the journal.  So far booting back
> into
> the 4.19.15 kernel has cleared up the problem.  I cannot yet
> reproduce
> this on demand; I've tried but it is probably related to some
> specific
> usage pattern.
> 
> Has anyone else seen issues like this?  Can anyone help me to get
> more
> useful information that might point to the problem?  I still haven't
> learned how to debug NFS issues properly.  And if there's a stress
> test
> tool I could easily run that might help to reproduce the issue, I'd
> be
> happy to run it.
> 
> I note that 4.20.4 is out; I see one sunrpc fix which I guess could
> be
> related (sunrpc: handle ENOMEM in rpcb_getport_async) but the systems
> involved have plenty of free memory so I doubt that's it.  I'll
> certainly try it anyway.
> 
> Various package versions:
> kernel-4.20.3-200.fc29.x86_64 (the problematic kernel)
> kernel-4.19.15-300.fc29.x86_64 (the functional kernel)
> nfs-utils-2.3.3-1.rc2.fc29.x86_64
> gssproxy-0.8.0-6.fc29.x86_64
> krb5-libs-1.16.1-25.fc29.i686
> 
> Thanks in advance for any help or advice,
> 
>  - J<

Commit deaa5c96c2f7 ("SUNRPC: Address Kerberos performance/behavior
regression") was supposed to be marked for stable as a fix. Chuck &
Anna?
-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com




* Re: Need help debugging NFS issues new to 4.20 kernel
  2019-01-24 19:58 ` Trond Myklebust
@ 2019-01-25 19:13   ` Schumaker, Anna
  2019-01-26 17:59     ` Sasha Levin
  2019-01-25 19:51   ` Jason L Tibbitts III
  1 sibling, 1 reply; 13+ messages in thread
From: Schumaker, Anna @ 2019-01-25 19:13 UTC (permalink / raw)
  To: tibbs, stable, trondmy, Chuck.Lever, linux-nfs

On Thu, 2019-01-24 at 19:58 +0000, Trond Myklebust wrote:
> On Thu, 2019-01-24 at 11:32 -0600, Jason L Tibbitts III wrote:
> > [...]
> 
> Commit deaa5c96c2f7 ("SUNRPC: Address Kerberos performance/behavior
> regression") was supposed to be marked for stable as a fix. Chuck &
> Anna?

Looks like I missed that, sorry!

Stable folks, can you please backport deaa5c96c2f7 ("SUNRPC: Address Kerberos
performance/behavior regression") to v4.20?

Thanks,
Anna
> --
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> trond.myklebust@hammerspace.com
> 
> 


* Re: Need help debugging NFS issues new to 4.20 kernel
  2019-01-24 19:58 ` Trond Myklebust
  2019-01-25 19:13   ` Schumaker, Anna
@ 2019-01-25 19:51   ` Jason L Tibbitts III
  2019-02-05 18:12     ` Jason Tibbitts
  1 sibling, 1 reply; 13+ messages in thread
From: Jason L Tibbitts III @ 2019-01-25 19:51 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: Anna.Schumaker, linux-nfs, Chuck.Lever

>>>>> "TM" == Trond Myklebust <trondmy@hammerspace.com> writes:

TM> Commit deaa5c96c2f7 ("SUNRPC: Address Kerberos performance/behavior
TM> regression") was supposed to be marked for stable as a fix.

I wonder, though; is that likely to be the root of the problem I'm
seeing?  The commit description talks about this as a performance
regression, but I'm seeing a complete loss of NFS functionality.

Sadly I still don't have a reproducer, so outside of just deploying the
patch and hoping, I have no way to actually test this.  So far I've been
running things like:

  stress-ng --all 0 --class 'filesystem' -t 10m --times

in an NFS4-krb5p mounted directory without being able to reproduce the
problem.  That drives the load up close to 200 but everything seems to make
progress.  So it must be some specific sequence that causes it; I just
don't know which.
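
For the record, here's the sort of hand-rolled churn loop I've also been
trying (a guess at the trigger, exercising the lock/truncate/close paths
from the backtraces I posted; note that flock(1) takes BSD locks, so it
only approximates the POSIX-lock traffic):

```shell
# Hypothetical reproducer: in the directory given as $1 (an NFS mount
# when used for real), repeatedly open+lock, write, truncate, and
# unlink files -- the operations seen hung in the sysrq-w output
nfs_churn() {
    dir=$1
    i=1
    while [ "$i" -le 100 ]; do
        f="$dir/churn.$i"
        ( flock -x 9 && echo data >&9 ) 9>"$f"
        truncate -s 4096 "$f"
        rm -f "$f"
        i=$((i + 1))
    done
}
```

Run as e.g. `nfs_churn /mnt/nfs/tmp`; no luck reproducing the hang with
it so far either.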

I did get this to show up in the kernel log, though, when I typed "df"
while that stress-ng command was running:

[94547.656419] NFS: server nas00 error: fileid changed
               fsid 0:57: expected fileid 0x6bc27eb, got 0x9996c6

I've never seen that one before, but it doesn't seem to hurt anything.
(This is still with 4.20.3.)

 - J<


* Re: Need help debugging NFS issues new to 4.20 kernel
  2019-01-25 19:13   ` Schumaker, Anna
@ 2019-01-26 17:59     ` Sasha Levin
  0 siblings, 0 replies; 13+ messages in thread
From: Sasha Levin @ 2019-01-26 17:59 UTC (permalink / raw)
  To: Schumaker, Anna; +Cc: tibbs, stable, trondmy, Chuck.Lever, linux-nfs

On Fri, Jan 25, 2019 at 07:13:27PM +0000, Schumaker, Anna wrote:
>On Thu, 2019-01-24 at 19:58 +0000, Trond Myklebust wrote:
>> On Thu, 2019-01-24 at 11:32 -0600, Jason L Tibbitts III wrote:
>> > [...]
>>
>> Commit deaa5c96c2f7 ("SUNRPC: Address Kerberos performance/behavior
>> regression") was supposed to be marked for stable as a fix. Chuck &
>> Anna?
>
>Looks like I missed that, sorry!
>
>Stable folks, can you please backport deaa5c96c2f7 ("SUNRPC: Address Kerberos
>performance/behavior regression") to v4.20?

Queued for 4.20, thank you.

--
Thanks,
Sasha


* Re: Need help debugging NFS issues new to 4.20 kernel
  2019-01-25 19:51   ` Jason L Tibbitts III
@ 2019-02-05 18:12     ` Jason Tibbitts
  2019-02-06 12:05       ` Benjamin Coddington
  0 siblings, 1 reply; 13+ messages in thread
From: Jason Tibbitts @ 2019-02-05 18:12 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: Anna.Schumaker, linux-nfs, Chuck.Lever

>>>>> "JLT" == Jason L Tibbitts <tibbs@math.uh.edu> writes:

JLT> I wonder, though; is that likely to be the root of the problem I'm
JLT> seeing?  The commit description talks about this as a performance
JLT> regression, but I'm seeing a complete loss of NFS functionality.

It turns out that 4.20.6 (which has the patch in question) does not
appear to solve my issue.  At least, I've still had a few complete NFS
hangups on machines running 4.20.6 though I haven't had a chance to
closely inspect one yet.

As before, any hints about how I could best debug this issue would be
appreciated.

 - J<


* Re: Need help debugging NFS issues new to 4.20 kernel
  2019-02-05 18:12     ` Jason Tibbitts
@ 2019-02-06 12:05       ` Benjamin Coddington
       [not found]         ` <87imxwab12.fsf@hippogriff.math.uh.edu>
  0 siblings, 1 reply; 13+ messages in thread
From: Benjamin Coddington @ 2019-02-06 12:05 UTC (permalink / raw)
  To: Jason Tibbitts; +Cc: Trond Myklebust, Anna.Schumaker, linux-nfs, Chuck.Lever

On 5 Feb 2019, at 13:12, Jason Tibbitts wrote:

>>>>>> "JLT" == Jason L Tibbitts <tibbs@math.uh.edu> writes:
>
> JLT> I wonder, though; is that likely to be the root of the problem I'm
> JLT> seeing?  The commit description talks about this as a performance
> JLT> regression, but I'm seeing a complete loss of NFS functionality.
>
> It turns out that 4.20.6 (which has the patch in question) does not
> appear to solve my issue.  At least, I've still had a few complete NFS
> hangups on machines running 4.20.6 though I haven't had a chance to
> closely inspect one yet.
>
> As before, any hints about how I could best debug this issue would be
> appreciated.

Hi Jason, you can enable the NFS and SUNRPC tracepoints to get a better idea
of what the client was doing right before the hang or what the client is
trying to do during the hang:

echo "sunrpc:*" >>  /sys/kernel/debug/tracing/set_event
echo "nfs:*" >> /sys/kernel/debug/tracing/set_event

Examine or provide the contents of /sys/kernel/debug/tracing/trace.
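
The same thing in reusable form, in case it helps (the tracefs path is
assumed to be the usual debugfs location; override TRACEDIR if your
system differs):

```shell
# Enable the NFS and SUNRPC tracepoints, then snapshot the ring buffer;
# needs root on a real system
TRACEDIR=${TRACEDIR:-/sys/kernel/debug/tracing}

enable_nfs_tracing() {
    for ev in 'sunrpc:*' 'nfs:*'; do
        echo "$ev" >> "$TRACEDIR/set_event"
    done
}

# Usage (as root):
#   enable_nfs_tracing && cat "$TRACEDIR/trace" > /tmp/nfs-trace.txt
```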

Ben


* Re: Need help debugging NFS issues new to 4.20 kernel
       [not found]         ` <87imxwab12.fsf@hippogriff.math.uh.edu>
@ 2019-02-07 11:13           ` Benjamin Coddington
       [not found]             ` <87d0o3aadg.fsf@hippogriff.math.uh.edu>
  0 siblings, 1 reply; 13+ messages in thread
From: Benjamin Coddington @ 2019-02-07 11:13 UTC (permalink / raw)
  To: Jason Tibbitts; +Cc: Trond Myklebust, Anna.Schumaker, linux-nfs, Chuck.Lever

On 6 Feb 2019, at 15:59, Jason Tibbitts wrote:

> This machine has a load of 93; a bunch of those are just df commands
> started by a monitoring process.  So far the user hasn't complained, so
> I can still poke at the machine for a while.

-10063 is -NFS4ERR_SEQ_MISORDERED.. I wonder why the
trace_nfs4_sequence_done() in nfs41_sequence_process() isn't showing up in
the trace?  Ah.. my fault - add "nfs4:*" to the set_event file.
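
For anyone else decoding these by eye, the session-related NFSv4.1
status codes come from RFC 5661 (just a hand-picked few here; the
client traces print them negated):

```shell
# Map a (possibly negated) NFSv4.1 status number to its RFC 5661 name;
# only a few codes relevant to session/slot trouble are listed
decode_nfs4_err() {
    case "${1#-}" in
        10008) echo NFS4ERR_DELAY ;;
        10052) echo NFS4ERR_BADSESSION ;;
        10053) echo NFS4ERR_BADSLOT ;;
        10063) echo NFS4ERR_SEQ_MISORDERED ;;
        *)     echo "unknown ($1)" ;;
    esac
}
```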

Ben


* Re: Need help debugging NFS issues new to 4.20 kernel
       [not found]             ` <87d0o3aadg.fsf@hippogriff.math.uh.edu>
@ 2019-02-08 12:01               ` Benjamin Coddington
  2019-02-08 15:19                 ` Chuck Lever
  2019-02-15 20:33                 ` Jason L Tibbitts III
  0 siblings, 2 replies; 13+ messages in thread
From: Benjamin Coddington @ 2019-02-08 12:01 UTC (permalink / raw)
  To: Jason Tibbitts; +Cc: Trond Myklebust, Anna.Schumaker, linux-nfs, Chuck.Lever


On 7 Feb 2019, at 10:25, Jason Tibbitts wrote:

>>>>>> "BC" == Benjamin Coddington <bcodding@redhat.com> writes:
>
> BC> -10063 is -NFS4ERR_SEQ_MISORDERED.. I wonder why the
> BC> trace_nfs4_sequence_done() in nfs41_sequence_process() isn't showing
> BC> up in the trace?  Ah.. my fault - add "nfs4:*" to the set_events.
>
> OK, attached is another trace.  Here's the same sequence I snipped
> previously:

So the client is calling SEQ over and over.. xs_stream_read_data sees
-EAGAIN.. I'm not an expert here, and not seeing what's going wrong.

Hmm.. commit c443305529d1d3d3bee0d68fdd14ae89835e091f changed
xs_read_stream_reply() to return recv.copied instead of "ret" to
xprt_complete_rqst()..

You could try reverting that commit and see if the problem goes away..
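
The mechanical part of that, if it helps (a sketch; assumes a kernel
git checkout that contains the commit):

```shell
# Revert the given commit on the current branch and emit a standalone
# patch file you can feed to a distro kernel build (commit id from the
# message above)
make_revert_patch() {
    commit=$1
    out=$2
    git revert --no-edit "$commit" &&
    git format-patch -1 --stdout > "$out"
}

# e.g.:  make_revert_patch c443305529d1 revert-rpc-receive-fix.patch
```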

Ben


* Re: Need help debugging NFS issues new to 4.20 kernel
  2019-02-08 12:01               ` Benjamin Coddington
@ 2019-02-08 15:19                 ` Chuck Lever
  2019-02-08 17:17                   ` Jason L Tibbitts III
  2019-02-15 20:33                 ` Jason L Tibbitts III
  1 sibling, 1 reply; 13+ messages in thread
From: Chuck Lever @ 2019-02-08 15:19 UTC (permalink / raw)
  To: Benjamin Coddington
  Cc: Jason Tibbitts, Trond Myklebust, Anna Schumaker, Linux NFS Mailing List



> On Feb 8, 2019, at 7:01 AM, Benjamin Coddington <bcodding@redhat.com> wrote:
> 
> 
> On 7 Feb 2019, at 10:25, Jason Tibbitts wrote:
> 
>>>>>>> "BC" == Benjamin Coddington <bcodding@redhat.com> writes:
>> 
>> BC> -10063 is -NFS4ERR_SEQ_MISORDERED.. I wonder why the
>> BC> trace_nfs4_sequence_done() in nfs41_sequence_process() isn't showing
>> BC> up in the trace?  Ah.. my fault - add "nfs4:*" to the set_events.
>> 
>> OK, attached is another trace.  Here's the same sequence I snipped
>> previously:
> 
> So the client is calling SEQ over and over.. xs_stream_read_data sees
> -EAGAIN.. I'm not an expert here, and not seeing what's going wrong.

<wag>

The server is returning SEQ_MISORDERED to a singleton SEQUENCE.
That suggests the client is trying to re-establish its lease but is
sending a slot nr the server doesn't recognize for the virtual slot
used for this purpose. Could be a problem on either side, and I
don't know enough to say how this loop could have started.

</wag>


> Hmm.. commit c443305529d1d3d3bee0d68fdd14ae89835e091f changed
> xs_read_stream_reply() to return recv.copied instead of "ret" to
> xprt_complete_rqst()..
> 
> You could try reverting that commit and see if the problem goes away..
> 
> Ben

--
Chuck Lever





* Re: Need help debugging NFS issues new to 4.20 kernel
  2019-02-08 15:19                 ` Chuck Lever
@ 2019-02-08 17:17                   ` Jason L Tibbitts III
  0 siblings, 0 replies; 13+ messages in thread
From: Jason L Tibbitts III @ 2019-02-08 17:17 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Benjamin Coddington, Trond Myklebust, Anna Schumaker,
	Linux NFS Mailing List

>>>>> "CL" == Chuck Lever <chuck.lever@oracle.com> writes:

CL> The server is returning SEQ_MISORDERED to a singleton SEQUENCE.
CL> That suggests the client is trying to re-establish its lease but is
CL> sending a slot nr the server doesn't recognize for the virtual slot
CL> used for this purpose. Could be a problem on either side, and I
CL> don't know enough to say how this loop could have started.

In case I didn't say so earlier, the server is just CentOS 7.6 running
that heavily modified 3.10.0-957.1.3.el7.x86_64.  I was not able to
reproduce anything like this throughout the 4.19 cycle but it started
cropping up pretty quickly on user desktops after booting into 4.20.3
(the first of the 4.20 series I tried).  And it does not reoccur after
rebooting back to 4.19.15.  Of course I know that doesn't absolve the
server.

It is possible for me to try to push a mainline kernel to all of the
relevant file servers, but that obviously carries other overhead and
risks that I would certainly prefer to avoid.

I just wish I knew how to reproduce this other than sticking it on a
bunch of machines and waiting until people complain.

 - J<


* Re: Need help debugging NFS issues new to 4.20 kernel
  2019-02-08 12:01               ` Benjamin Coddington
  2019-02-08 15:19                 ` Chuck Lever
@ 2019-02-15 20:33                 ` Jason L Tibbitts III
  1 sibling, 0 replies; 13+ messages in thread
From: Jason L Tibbitts III @ 2019-02-15 20:33 UTC (permalink / raw)
  To: Benjamin Coddington
  Cc: Trond Myklebust, Anna.Schumaker, linux-nfs, Chuck.Lever

>>>>> "BC" == Benjamin Coddington <bcodding@redhat.com> writes:

BC> Hmm.. commit c443305529d1d3d3bee0d68fdd14ae89835e091f changed
BC> xs_read_stream_reply() to return recv.copied instead of "ret" to
BC> xprt_complete_rqst()..

BC> You could try reverting that commit and see if the problem goes
BC> away..

So patching a revert of that into 4.20.7 was beyond me, but I received
some help from Jeremy Cline on IRC (in #fedora-kernel) and ended up with
a patch I'll include at the end.  So far it does seem to be better, but
because of secure boot annoyances I haven't been able to roll it out
more generally.  However, it has been stable for a week on a few hosts
which have been problematic with stock 4.20.6.

I will continue to test, but hopefully this helps folks to understand
what's happening.

 - J<

From 322f581f514ffedb8884656f136bd6a812a53714 Mon Sep 17 00:00:00 2001
From: Jeremy Cline <jcline@redhat.com>
Date: Fri, 8 Feb 2019 13:09:41 -0500
Subject: [PATCH] Revert "SUNRPC: Fix RPC receive hangs"

This reverts commit c443305529d1d3d3bee0d68fdd14ae89835e091f.

Signed-off-by: Jeremy Cline <jcline@redhat.com>
---
 net/sunrpc/xprtsock.c | 39 ++++++++++++++++++++-------------------
 1 file changed, 20 insertions(+), 19 deletions(-)

diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 9cdbb6d6e7f5..2d9f0326d55b 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -417,7 +417,7 @@ xs_read_xdr_buf(struct socket *sock, struct msghdr *msg, int flags,
 		if (offset == count || msg->msg_flags & (MSG_EOR|MSG_TRUNC))
 			goto out;
 		if (ret != want)
-			goto out;
+			goto eagain;
 		seek = 0;
 	} else {
 		seek -= buf->head[0].iov_len;
@@ -439,7 +439,7 @@ xs_read_xdr_buf(struct socket *sock, struct msghdr *msg, int flags,
 		if (offset == count || msg->msg_flags & (MSG_EOR|MSG_TRUNC))
 			goto out;
 		if (ret != want)
-			goto out;
+			goto eagain;
 		seek = 0;
 	} else {
 		seek -= want;
@@ -455,13 +455,16 @@ xs_read_xdr_buf(struct socket *sock, struct msghdr *msg, int flags,
 		if (offset == count || msg->msg_flags & (MSG_EOR|MSG_TRUNC))
 			goto out;
 		if (ret != want)
-			goto out;
+			goto eagain;
 	} else
 		offset += buf->tail[0].iov_len;
 	ret = -EMSGSIZE;
 out:
 	*read = offset - seek_init;
 	return ret;
+eagain:
+	ret = -EAGAIN;
+	goto out;
 sock_err:
 	offset += seek;
 	goto out;
@@ -504,20 +507,21 @@ xs_read_stream_request(struct sock_xprt *transport, struct msghdr *msg,
 	if (transport->recv.offset == transport->recv.len) {
 		if (xs_read_stream_request_done(transport))
 			msg->msg_flags |= MSG_EOR;
-		return read;
+		return transport->recv.copied;
 	}
 
 	switch (ret) {
-	default:
-		break;
 	case -EFAULT:
 	case -EMSGSIZE:
 		msg->msg_flags |= MSG_TRUNC;
-		return read;
+		return transport->recv.copied;
 	case 0:
 		return -ESHUTDOWN;
+	default:
+		if (ret < 0)
+			return ret;
 	}
-	return ret < 0 ? ret : read;
+	return -EAGAIN;
 }
 
 static size_t
@@ -556,7 +560,7 @@ xs_read_stream_call(struct sock_xprt *transport, struct msghdr *msg, int flags)
 
 	ret = xs_read_stream_request(transport, msg, flags, req);
 	if (msg->msg_flags & (MSG_EOR|MSG_TRUNC))
-		xprt_complete_bc_request(req, transport->recv.copied);
+		xprt_complete_bc_request(req, ret);
 
 	return ret;
 }
@@ -589,7 +593,7 @@ xs_read_stream_reply(struct sock_xprt *transport, struct msghdr *msg, int flags)
 
 	spin_lock(&xprt->queue_lock);
 	if (msg->msg_flags & (MSG_EOR|MSG_TRUNC))
-		xprt_complete_rqst(req->rq_task, transport->recv.copied);
+		xprt_complete_rqst(req->rq_task, ret);
 	xprt_unpin_rqst(req);
 out:
 	spin_unlock(&xprt->queue_lock);
@@ -610,8 +614,10 @@ xs_read_stream(struct sock_xprt *transport, int flags)
 		if (ret <= 0)
 			goto out_err;
 		transport->recv.offset = ret;
-		if (transport->recv.offset != want)
-			return transport->recv.offset;
+		if (ret != want) {
+			ret = -EAGAIN;
+			goto out_err;
+		}
 		transport->recv.len = be32_to_cpu(transport->recv.fraghdr) &
 			RPC_FRAGMENT_SIZE_MASK;
 		transport->recv.offset -= sizeof(transport->recv.fraghdr);
@@ -619,9 +625,6 @@ xs_read_stream(struct sock_xprt *transport, int flags)
 	}
 
 	switch (be32_to_cpu(transport->recv.calldir)) {
-	default:
-		msg.msg_flags |= MSG_TRUNC;
-		break;
 	case RPC_CALL:
 		ret = xs_read_stream_call(transport, &msg, flags);
 		break;
@@ -636,8 +639,6 @@ xs_read_stream(struct sock_xprt *transport, int flags)
 		goto out_err;
 	read += ret;
 	if (transport->recv.offset < transport->recv.len) {
-		if (!(msg.msg_flags & MSG_TRUNC))
-			return read;
 		msg.msg_flags = 0;
 		ret = xs_read_discard(transport->sock, &msg, flags,
 				transport->recv.len - transport->recv.offset);
@@ -646,7 +647,7 @@ xs_read_stream(struct sock_xprt *transport, int flags)
 		transport->recv.offset += ret;
 		read += ret;
 		if (transport->recv.offset != transport->recv.len)
-			return read;
+			return -EAGAIN;
 	}
 	if (xs_read_stream_request_done(transport)) {
 		trace_xs_stream_read_request(transport);
@@ -670,7 +671,7 @@ static void xs_stream_data_receive(struct sock_xprt *transport)
 		goto out;
 	for (;;) {
 		ret = xs_read_stream(transport, MSG_DONTWAIT);
-		if (ret < 0)
+		if (ret <= 0)
 			break;
 		read += ret;
 		cond_resched();
-- 
2.20.1



Thread overview: 13+ messages
2019-01-24 17:32 Need help debugging NFS issues new to 4.20 kernel Jason L Tibbitts III
2019-01-24 19:28 ` Jason L Tibbitts III
2019-01-24 19:58 ` Trond Myklebust
2019-01-25 19:13   ` Schumaker, Anna
2019-01-26 17:59     ` Sasha Levin
2019-01-25 19:51   ` Jason L Tibbitts III
2019-02-05 18:12     ` Jason Tibbitts
2019-02-06 12:05       ` Benjamin Coddington
     [not found]         ` <87imxwab12.fsf@hippogriff.math.uh.edu>
2019-02-07 11:13           ` Benjamin Coddington
     [not found]             ` <87d0o3aadg.fsf@hippogriff.math.uh.edu>
2019-02-08 12:01               ` Benjamin Coddington
2019-02-08 15:19                 ` Chuck Lever
2019-02-08 17:17                   ` Jason L Tibbitts III
2019-02-15 20:33                 ` Jason L Tibbitts III
