* [PATCH -next] SUNRPC: Clean XPRT_CONGESTED of xprt->state when rpc task is killed @ 2021-12-14 13:53 Bixuan Cui 2021-12-14 13:53 ` Bixuan Cui 0 siblings, 1 reply; 6+ messages in thread From: Bixuan Cui @ 2021-12-14 13:53 UTC (permalink / raw) To: linux-kernel, netdev, linux-nfs Cc: cuibixuan, bfields, chuck.lever, trond.myklebust, anna.schumaker, davem, kuba, pete.wl, wenan.mwa, xiaoh.peixh, weipu.zy There is a problem with nfs hang on my machine, Steps to reproduce: 1. echo "options sunrpc tcp_slot_table_entries=2" >> /etc/modprobe.d/sunrpc.conf echo "options sunrpc tcp_max_slot_table_entries=2" >> /etc/modprobe.d/sunrpc.conf and remount nfsv3: sudo mount -t nfs -o vers=3,nolock,proto=tcp 127.0.0.1:/root/nfs_server/ /root/nfs_client/ 2.Run case # cat case.sh while true do ls & done & while true do killall ls done & [root@cuibixuan nfs_client]# cd /root/nfs_client/; sh case.sh 3. Run 'killall sh' to kill case after waiting 10 seconds 4. The nfs hang and xprt->state is 0x212(XPRT_CONGESTED): [root@cuibixuan sunrpc]# cat /sys/kernel/debug/sunrpc/rpc_xprt/4/info netid: tcp addr: 127.0.0.1 port: 2049 state: 0x212 [root@cuibixuan sunrpc]# cat /sys/kernel/debug/sunrpc/rpc_clnt/4/tasks 155 0080 -11 0x4 0x0 0 rpc_default_ops [sunrpc] nfsv3 GETATTR a:call_reserveresult [sunrpc] q:xprt_backlog 15418 0080 -11 0x4 0x0 0 rpc_default_ops [sunrpc] nfsv3 GETATTR a:call_reserveresult [sunrpc] q:xprt_backlog 5. Enable ftrace # echo 1 > /sys/kernel/debug/tracing/events/sunrpc/enable # cat /sys/kernel/debug/tracing/trace ... ls-18792 [001] .... 668.541062: rpc_task_run_action: task:28885@4 flags=0080 state=0005 status=0 action=call_start [sunrpc] ls-18792 [001] .... 668.541063: rpc_request: task:28885@4 nfsv3 GETATTR (sync) ls-18792 [001] .... 668.541063: rpc_task_run_action: task:28885@4 flags=0080 state=0005 status=0 action=call_reserve [sunrpc] ls-18792 [001] .... 668.541065: rpc_task_sleep: task:28885@4 flags=0080 state=0005 status=-11 timeout=0 queue=xprt_backlog ls-18792 [000] .... 670.815066: rpc_task_wakeup: task:28885@4 flags=0180 state=0006 status=-512 timeout=0 queue=xprt_backlog ls-18792 [000] .... 670.815069: rpc_task_run_action: task:28885@4 flags=0180 state=0005 status=-512 action=rpc_exit_task [sunrpc] ls-18846 [001] .... 682.302970: rpc_task_begin: task:28886@4 flags=0080 state=0004 status=0 action= (null) ls-18846 [001] .... 682.302972: rpc_task_run_action: task:28886@4 flags=0080 state=0005 status=0 action=call_start [sunrpc] ls-18846 [001] .... 682.302973: rpc_request: task:28886@4 nfsv3 GETATTR (sync) ls-18846 [001] .... 682.302973: rpc_task_run_action: task:28886@4 flags=0080 state=0005 status=0 action=call_reserve [sunrpc] ls-18846 [001] .... 682.302974: rpc_task_sleep: task:28886@4 flags=0080 state=0005 status=-11 timeout=0 queue=xprt_backlog ls-18846 [000] .... 684.690924: rpc_task_wakeup: task:28886@4 flags=0180 state=0006 status=-512 timeout=0 queue=xprt_backlog ls-18846 [000] .... 684.690928: rpc_task_run_action: task:28886@4 flags=0180 state=0005 status=-512 action=rpc_exit_task [sunrpc] The kernel version of my machine is 4.19.91, and it can be fixed with the following patch (in v4.19): diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c index 793f595..2656048 100644 --- a/net/sunrpc/xprt.c +++ b/net/sunrpc/xprt.c @@ -1358,6 +1358,7 @@ void xprt_release(struct rpc_task *task) if (req == NULL) { if (task->tk_client) { xprt = task->tk_xprt; + xprt_wake_up_backlog(xprt); if (xprt->snd_task == task) xprt_release_write(xprt, task); } I checked the -next code, there should be the same problem. Bixuan Cui (1): SUNRPC: Clean XPRT_CONGESTED of xprt->state when rpc task is killed net/sunrpc/xprt.c | 1 + 1 file changed, 1 insertion(+) -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH -next] SUNRPC: Clean XPRT_CONGESTED of xprt->state when rpc task is killed 2021-12-14 13:53 [PATCH -next] SUNRPC: Clean XPRT_CONGESTED of xprt->state when rpc task is killed Bixuan Cui @ 2021-12-14 13:53 ` Bixuan Cui 2021-12-20 3:39 ` Bixuan Cui 0 siblings, 1 reply; 6+ messages in thread From: Bixuan Cui @ 2021-12-14 13:53 UTC (permalink / raw) To: linux-kernel, netdev, linux-nfs Cc: cuibixuan, bfields, chuck.lever, trond.myklebust, anna.schumaker, davem, kuba, pete.wl, wenan.mwa, xiaoh.peixh, weipu.zy When the values of tcp_max_slot_table_entries and sunrpc.tcp_slot_table_entries are lower than the number of rpc tasks, xprt_dynamic_alloc_slot() in xprt_alloc_slot() will return -EAGAIN, and then set xprt->state to XPRT_CONGESTED: xprt_retry_reserve ->xprt_do_reserve ->xprt_alloc_slot ->xprt_dynamic_alloc_slot // return -EAGAIN and task->tk_rqstp is NULL ->xprt_add_backlog // set_bit(XPRT_CONGESTED, &xprt->state); When rpc task is killed, XPRT_CONGESTED bit of xprt->state will not be cleaned up and nfs hangs: rpc_exit_task ->xprt_release // if (req == NULL) is true, then XPRT_CONGESTED // bit not clean Add xprt_wake_up_backlog(xprt) to clean XPRT_CONGESTED bit in xprt_release(). Signed-off-by: Bixuan Cui <cuibixuan@linux.alibaba.com> Signed-off-by: Xiaohui Pei <xiaoh.peixh@alibaba-inc.com> --- net/sunrpc/xprt.c | 1 + 1 file changed, 1 insertion(+) diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c index a02de2b..70d11ae 100644 --- a/net/sunrpc/xprt.c +++ b/net/sunrpc/xprt.c @@ -1952,6 +1952,7 @@ void xprt_release(struct rpc_task *task) if (req == NULL) { if (task->tk_client) { xprt = task->tk_xprt; + xprt_wake_up_backlog(xprt); xprt_release_write(xprt, task); } return; -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH -next] SUNRPC: Clean XPRT_CONGESTED of xprt->state when rpc task is killed 2021-12-14 13:53 ` Bixuan Cui @ 2021-12-20 3:39 ` Bixuan Cui 2021-12-20 18:22 ` Trond Myklebust 0 siblings, 1 reply; 6+ messages in thread From: Bixuan Cui @ 2021-12-20 3:39 UTC (permalink / raw) To: linux-kernel, netdev, linux-nfs Cc: bfields, chuck.lever, trond.myklebust, anna.schumaker, davem, kuba, pete.wl, wenan.mwa, xiaoh.peixh, weipu.zy ping~ 在 2021/12/14 下午9:53, Bixuan Cui 写道: > When the values of tcp_max_slot_table_entries and > sunrpc.tcp_slot_table_entries are lower than the number of rpc tasks, > xprt_dynamic_alloc_slot() in xprt_alloc_slot() will return -EAGAIN, and > then set xprt->state to XPRT_CONGESTED: > xprt_retry_reserve > ->xprt_do_reserve > ->xprt_alloc_slot > ->xprt_dynamic_alloc_slot // return -EAGAIN and task->tk_rqstp is NULL > ->xprt_add_backlog // set_bit(XPRT_CONGESTED, &xprt->state); > > When rpc task is killed, XPRT_CONGESTED bit of xprt->state will not be > cleaned up and nfs hangs: > rpc_exit_task > ->xprt_release // if (req == NULL) is true, then XPRT_CONGESTED > // bit not clean > > Add xprt_wake_up_backlog(xprt) to clean XPRT_CONGESTED bit in > xprt_release(). > > Signed-off-by: Bixuan Cui <cuibixuan@linux.alibaba.com> > Signed-off-by: Xiaohui Pei <xiaoh.peixh@alibaba-inc.com> > --- > net/sunrpc/xprt.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c > index a02de2b..70d11ae 100644 > --- a/net/sunrpc/xprt.c > +++ b/net/sunrpc/xprt.c > @@ -1952,6 +1952,7 @@ void xprt_release(struct rpc_task *task) > if (req == NULL) { > if (task->tk_client) { > xprt = task->tk_xprt; > + xprt_wake_up_backlog(xprt); > xprt_release_write(xprt, task); > } > return; ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH -next] SUNRPC: Clean XPRT_CONGESTED of xprt->state when rpc task is killed 2021-12-20 3:39 ` Bixuan Cui @ 2021-12-20 18:22 ` Trond Myklebust [not found] ` <efbf73f3-c6cd-90f6-ef22-bde14be708cc@linux.alibaba.com> 0 siblings, 1 reply; 6+ messages in thread From: Trond Myklebust @ 2021-12-20 18:22 UTC (permalink / raw) To: cuibixuan, linux-nfs, linux-kernel, netdev Cc: anna.schumaker, xiaoh.peixh, davem, chuck.lever, kuba, bfields, wenan.mwa, weipu.zy, pete.wl On Mon, 2021-12-20 at 11:39 +0800, Bixuan Cui wrote: > ping~ > > 在 2021/12/14 下午9:53, Bixuan Cui 写道: > > When the values of tcp_max_slot_table_entries and > > sunrpc.tcp_slot_table_entries are lower than the number of rpc > > tasks, > > xprt_dynamic_alloc_slot() in xprt_alloc_slot() will return -EAGAIN, > > and > > then set xprt->state to XPRT_CONGESTED: > > xprt_retry_reserve > > ->xprt_do_reserve > > ->xprt_alloc_slot > > ->xprt_dynamic_alloc_slot // return -EAGAIN and task- > > >tk_rqstp is NULL > > ->xprt_add_backlog // set_bit(XPRT_CONGESTED, &xprt- > > >state); > > > > When rpc task is killed, XPRT_CONGESTED bit of xprt->state will not > > be > > cleaned up and nfs hangs: > > rpc_exit_task > > ->xprt_release // if (req == NULL) is true, then > > XPRT_CONGESTED > > // bit not clean > > > > Add xprt_wake_up_backlog(xprt) to clean XPRT_CONGESTED bit in > > xprt_release(). I'm not seeing how this explanation makes sense. If the task doesn't hold a slot, then freeing that task isn't going to clear the congestion caused by all the slots being in use. -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@hammerspace.com ^ permalink raw reply [flat|nested] 6+ messages in thread
[parent not found: <efbf73f3-c6cd-90f6-ef22-bde14be708cc@linux.alibaba.com>]
* Re: [PATCH -next] SUNRPC: Clean XPRT_CONGESTED of xprt->state when rpc task is killed [not found] ` <efbf73f3-c6cd-90f6-ef22-bde14be708cc@linux.alibaba.com> @ 2021-12-22 15:02 ` Trond Myklebust 2021-12-31 2:00 ` Bixuan Cui 0 siblings, 1 reply; 6+ messages in thread From: Trond Myklebust @ 2021-12-22 15:02 UTC (permalink / raw) To: cuibixuan, linux-nfs, linux-kernel, netdev Cc: pete.wl, xiaoh.peixh, davem, chuck.lever, kuba, bfields, weipu.zy, wenan.mwa, anna.schumaker On Wed, 2021-12-22 at 10:55 +0800, Bixuan Cui wrote: > > 在 2021/12/21 上午2:22, Trond Myklebust 写道: > > > On Mon, 2021-12-20 at 11:39 +0800, Bixuan Cui wrote: > > > > > ping~ > > > > > > 在 2021/12/14 下午9:53, Bixuan Cui 写道: > > > > > > > When the values of tcp_max_slot_table_entries and > > > > sunrpc.tcp_slot_table_entries are lower than the number of rpc > > > > tasks, > > > > xprt_dynamic_alloc_slot() in xprt_alloc_slot() will return - > > > > EAGAIN, > > > > and > > > > then set xprt->state to XPRT_CONGESTED: > > > > xprt_retry_reserve > > > > ->xprt_do_reserve > > > > ->xprt_alloc_slot > > > > ->xprt_dynamic_alloc_slot // return -EAGAIN and task- > > > > > > > > > tk_rqstp is NULL > > > > ->xprt_add_backlog // set_bit(XPRT_CONGESTED, &xprt- > > > > > > > > > state); > > > > When rpc task is killed, XPRT_CONGESTED bit of xprt->state will > > > > not > > > > be > > > > cleaned up and nfs hangs: > > > > rpc_exit_task > > > > ->xprt_release // if (req == NULL) is true, then > > > > XPRT_CONGESTED > > > > // bit not clean > > > > > > > > Add xprt_wake_up_backlog(xprt) to clean XPRT_CONGESTED bit in > > > > xprt_release(). > > I'm not seeing how this explanation makes sense. If the task > > doesn't > > hold a slot, then freeing that task isn't going to clear the > > congestion > > caused by all the slots being in use. > Hi, > If the rpc task is free, call xprt_release() : > void xprt_release(struct rpc_task *task) > { > if (req == NULL) { > if (task->tk_client) { > xprt = task->tk_xprt; > xprt_release_write(xprt, task); // 1. > release xprt > } > return; > } > .... > if (likely(!bc_prealloc(req))) > xprt->ops->free_slot(xprt, req); // 2. release slot > and call xprt_wake_up_backlog(xprt, req) to wakeup next task(clear > XPRT_CONGESTED bit if next is NULL) in xprt_free_slot() > else > xprt_free_bc_request(req); > } > I mean that in step 1, xprt was only released, but > xprt_wake_up_backlog was not called (I don’t know if it is necessary, > but xprt->state may still be XPRT_CONGESTED), which causes xprt to > hold up. I think it happens when the task that does not hold a slot > is the last released task,xprt_wake_up_backlog(clear XPRT_CONGESTED) > will not be executed. :-) > Thanks, > Bixuan Cui > > My point is that in that case 1, there is no slot to free, so there is no change to the congestion state. IOW: your patch is incorrect because it is trying to assign a slot in a case where there is no slot to assign. -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@hammerspace.com ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH -next] SUNRPC: Clean XPRT_CONGESTED of xprt->state when rpc task is killed 2021-12-22 15:02 ` Trond Myklebust @ 2021-12-31 2:00 ` Bixuan Cui 0 siblings, 0 replies; 6+ messages in thread From: Bixuan Cui @ 2021-12-31 2:00 UTC (permalink / raw) To: Trond Myklebust, linux-nfs, linux-kernel, netdev Cc: pete.wl, xiaoh.peixh, davem, chuck.lever, kuba, bfields, weipu.zy, wenan.mwa, anna.schumaker 在 2021/12/22 下午11:02, Trond Myklebust 写道: > On Wed, 2021-12-22 at 10:55 +0800, Bixuan Cui wrote: >> 在 2021/12/21 上午2:22, Trond Myklebust 写道: >> >>> On Mon, 2021-12-20 at 11:39 +0800, Bixuan Cui wrote: >>> >>>> ping~ >>>> >>>> 在 2021/12/14 下午9:53, Bixuan Cui 写道: >>>> >>>>> When the values of tcp_max_slot_table_entries and >>>>> sunrpc.tcp_slot_table_entries are lower than the number of rpc >>>>> tasks, >>>>> xprt_dynamic_alloc_slot() in xprt_alloc_slot() will return - >>>>> EAGAIN, >>>>> and >>>>> then set xprt->state to XPRT_CONGESTED: >>>>> xprt_retry_reserve >>>>> ->xprt_do_reserve >>>>> ->xprt_alloc_slot >>>>> ->xprt_dynamic_alloc_slot // return -EAGAIN and task- >>>>> >>>>>> tk_rqstp is NULL >>>>> ->xprt_add_backlog // set_bit(XPRT_CONGESTED, &xprt- >>>>> >>>>>> state); >>>>> When rpc task is killed, XPRT_CONGESTED bit of xprt->state will >>>>> not >>>>> be >>>>> cleaned up and nfs hangs: >>>>> rpc_exit_task >>>>> ->xprt_release // if (req == NULL) is true, then >>>>> XPRT_CONGESTED >>>>> // bit not clean >>>>> >>>>> Add xprt_wake_up_backlog(xprt) to clean XPRT_CONGESTED bit in >>>>> xprt_release(). >>> I'm not seeing how this explanation makes sense. If the task >>> doesn't >>> hold a slot, then freeing that task isn't going to clear the >>> congestion >>> caused by all the slots being in use. >> Hi, >> If the rpc task is free, call xprt_release() : >> void xprt_release(struct rpc_task *task) >> { >> if (req == NULL) { >> if (task->tk_client) { >> xprt = task->tk_xprt; >> xprt_release_write(xprt, task); // 1. >> release xprt >> } >> return; >> } >> .... >> if (likely(!bc_prealloc(req))) >> xprt->ops->free_slot(xprt, req); // 2. release slot >> and call xprt_wake_up_backlog(xprt, req) to wakeup next task(clear >> XPRT_CONGESTED bit if next is NULL) in xprt_free_slot() >> else >> xprt_free_bc_request(req); >> } >> I mean that in step 1, xprt was only released, but >> xprt_wake_up_backlog was not called (I don’t know if it is necessary, >> but xprt->state may still be XPRT_CONGESTED), which causes xprt to >> hold up. I think it happens when the task that does not hold a slot >> is the last released task,xprt_wake_up_backlog(clear XPRT_CONGESTED) >> will not be executed. :-) >> Thanks, >> Bixuan Cui >> >> > My point is that in that case 1, there is no slot to free, so there is > no change to the congestion state. > > IOW: your patch is incorrect because it is trying to assign a slot in a > case where there is no slot to assign. Hi, I found the correct way to fix it, that is, do not free the request when there are tasks in the xprt->backlog :-) And it has been fixed by e877a88d1f06 (SUNRPC in case of backlog, hand free slots directly to waiting task) commit e877a88d1f069edced4160792f42c2a8e2dba942 Author: NeilBrown <neilb@suse.de> Date: Mon May 17 09:59:10 2021 +1000 SUNRPC in case of backlog, hand free slots directly to waiting task If sunrpc.tcp_max_slot_table_entries is small and there are tasks on the backlog queue, then when a request completes it is freed and the first task on the queue is woken. The expectation is that it will wake and claim that request. However if it was a sync task and the waiting process was killed at just that moment, it will wake and NOT claim the request. Thanks for your advice. Thanks, Bixuan Cui ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2021-12-31 2:00 UTC | newest] Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-12-14 13:53 [PATCH -next] SUNRPC: Clean XPRT_CONGESTED of xprt->state when rpc task is killed Bixuan Cui 2021-12-14 13:53 ` Bixuan Cui 2021-12-20 3:39 ` Bixuan Cui 2021-12-20 18:22 ` Trond Myklebust [not found] ` <efbf73f3-c6cd-90f6-ef22-bde14be708cc@linux.alibaba.com> 2021-12-22 15:02 ` Trond Myklebust 2021-12-31 2:00 ` Bixuan Cui
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).