* Task hung while using RTRS with rxe and using "ip" utility to down the interface
@ 2021-11-11 16:25 Haris Iqbal
  2021-11-14  6:12 ` Leon Romanovsky
  0 siblings, 1 reply; 4+ messages in thread
From: Haris Iqbal @ 2021-11-11 16:25 UTC (permalink / raw)
  To: RDMA mailing list, Jason Gunthorpe, Leon Romanovsky
  Cc: Jinpu Wang, Danil Kipnis, Aleksei Marov

Hi,

We are experiencing a hang while using RTRS with softROCE and
taking the network interface down with the "ifconfig" or "ip" command.

Steps:
1) Map an RNBD/RTRS device through a softROCE port.
2) Once mapped, bring the eth interface (on which the softROCE device
was created) down using "ifconfig <ethx> down" or
"ip link set <ethx> down".
3) The device errors out, and one can see RTRS connection errors in
dmesg. So far so good.
4) After a while, we see task hung traces in dmesg.

[  550.866462] INFO: task kworker/1:2:170 blocked for more than 184 seconds.
[  550.868820]       Tainted: G           O      5.10.42-pserver+ #84
[  550.869337] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[  550.869963] task:kworker/1:2     state:D stack:    0 pid:  170
ppid:     2 flags:0x00004000
[  550.870619] Workqueue: rtrs_server_wq rtrs_srv_close_work [rtrs_server]
[  550.871134] Call Trace:
[  550.871375]  __schedule+0x421/0x810
[  550.871683]  schedule+0x46/0xb0
[  550.871964]  schedule_timeout+0x20e/0x2a0
[  550.872300]  ? internal_add_timer+0x44/0x70
[  550.872650]  wait_for_completion+0x86/0xe0
[  550.872994]  cm_destroy_id+0x18c/0x5a0 [ib_cm]
[  550.873357]  ? _cond_resched+0x15/0x30
[  550.873680]  ? wait_for_completion+0x33/0xe0
[  550.874036]  _destroy_id+0x57/0x210 [rdma_cm]
[  550.874395]  rtrs_srv_close_work+0xcc/0x250 [rtrs_server]
[  550.874819]  process_one_work+0x1d4/0x370
[  550.875156]  worker_thread+0x4a/0x3b0
[  550.875471]  ? process_one_work+0x370/0x370
[  550.875817]  kthread+0xfe/0x140
[  550.876098]  ? kthread_park+0x90/0x90
[  550.876453]  ret_from_fork+0x1f/0x30


Our observations so far:

1) The hang does not occur if we use "ifdown <ethx>" instead. There is
a difference between the commands, but we are not sure whether the
commands above should lead to a task hang.
https://access.redhat.com/solutions/27166
2) We have verified the v5.10 and v5.15.1 kernels, and both have this issue.
3) We tried the same test with an NVMe-oF target and host over
softROCE. We get the same task hang after doing "ifconfig .. down".

[Tue Nov  9 14:28:51 2021] INFO: task kworker/1:1:34 blocked for more
than 184 seconds.
[Tue Nov  9 14:28:51 2021]       Tainted: G           O
5.10.42-pserver+ #84
[Tue Nov  9 14:28:51 2021] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Tue Nov  9 14:28:51 2021] task:kworker/1:1     state:D stack:    0
pid:   34 ppid:     2 flags:0x00004000
[Tue Nov  9 14:28:51 2021] Workqueue: events
nvmet_rdma_release_queue_work [nvmet_rdma]
[Tue Nov  9 14:28:51 2021] Call Trace:
[Tue Nov  9 14:28:51 2021]  __schedule+0x421/0x810
[Tue Nov  9 14:28:51 2021]  schedule+0x46/0xb0
[Tue Nov  9 14:28:51 2021]  schedule_timeout+0x20e/0x2a0
[Tue Nov  9 14:28:51 2021]  ? internal_add_timer+0x44/0x70
[Tue Nov  9 14:28:51 2021]  wait_for_completion+0x86/0xe0
[Tue Nov  9 14:28:51 2021]  cm_destroy_id+0x18c/0x5a0 [ib_cm]
[Tue Nov  9 14:28:51 2021]  ? _cond_resched+0x15/0x30
[Tue Nov  9 14:28:51 2021]  ? wait_for_completion+0x33/0xe0
[Tue Nov  9 14:28:51 2021]  _destroy_id+0x57/0x210 [rdma_cm]
[Tue Nov  9 14:28:51 2021]  nvmet_rdma_free_queue+0x2e/0xc0 [nvmet_rdma]
[Tue Nov  9 14:28:51 2021]  nvmet_rdma_release_queue_work+0x19/0x50 [nvmet_rdma]
[Tue Nov  9 14:28:51 2021]  process_one_work+0x1d4/0x370
[Tue Nov  9 14:28:51 2021]  worker_thread+0x4a/0x3b0
[Tue Nov  9 14:28:51 2021]  ? process_one_work+0x370/0x370
[Tue Nov  9 14:28:51 2021]  kthread+0xfe/0x140
[Tue Nov  9 14:28:51 2021]  ? kthread_park+0x90/0x90
[Tue Nov  9 14:28:51 2021]  ret_from_fork+0x1f/0x30

Is this a known issue with ifconfig or the rxe driver? Thoughts?

Regards
-Haris


* Re: Task hung while using RTRS with rxe and using "ip" utility to down the interface
  2021-11-11 16:25 Task hung while using RTRS with rxe and using "ip" utility to down the interface Haris Iqbal
@ 2021-11-14  6:12 ` Leon Romanovsky
  2021-11-17 11:21   ` Haris Iqbal
  0 siblings, 1 reply; 4+ messages in thread
From: Leon Romanovsky @ 2021-11-14  6:12 UTC (permalink / raw)
  To: Haris Iqbal
  Cc: RDMA mailing list, Jason Gunthorpe, Jinpu Wang, Danil Kipnis,
	Aleksei Marov

On Thu, Nov 11, 2021 at 05:25:11PM +0100, Haris Iqbal wrote:
> Hi,
> 
> We are experiencing a hang while using RTRS with softROCE and
> transitioning the network interface down using ifdown command.
> 
> Steps.
> 1) Map an RNBD/RTRS device through softROCE port.
> 2) Once mapped, transition the eth interface to down (on which the
> softROCE interface was created) using command "ifconfig <ethx> down",
> or "ip link set <ethx> down".
> 3) The device errors out, and one can see RTRS connection errors in
> dmesg. So far so good.
> 4) After a while, we see task hung traces in dmesg.
> 
> [  550.866462] INFO: task kworker/1:2:170 blocked for more than 184 seconds.
> [  550.868820]       Tainted: G           O      5.10.42-pserver+ #84
> [  550.869337] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [  550.869963] task:kworker/1:2     state:D stack:    0 pid:  170
> ppid:     2 flags:0x00004000
> [  550.870619] Workqueue: rtrs_server_wq rtrs_srv_close_work [rtrs_server]
> [  550.871134] Call Trace:
> [  550.871375]  __schedule+0x421/0x810
> [  550.871683]  schedule+0x46/0xb0
> [  550.871964]  schedule_timeout+0x20e/0x2a0
> [  550.872300]  ? internal_add_timer+0x44/0x70
> [  550.872650]  wait_for_completion+0x86/0xe0
> [  550.872994]  cm_destroy_id+0x18c/0x5a0 [ib_cm]
> [  550.873357]  ? _cond_resched+0x15/0x30
> [  550.873680]  ? wait_for_completion+0x33/0xe0
> [  550.874036]  _destroy_id+0x57/0x210 [rdma_cm]
> [  550.874395]  rtrs_srv_close_work+0xcc/0x250 [rtrs_server]
> [  550.874819]  process_one_work+0x1d4/0x370
> [  550.875156]  worker_thread+0x4a/0x3b0
> [  550.875471]  ? process_one_work+0x370/0x370
> [  550.875817]  kthread+0xfe/0x140
> [  550.876098]  ? kthread_park+0x90/0x90
> [  550.876453]  ret_from_fork+0x1f/0x30
> 
> 
> Our observations till now.
> 
> 1) Does not occur if we use "ifdown <ethx>" instead. There is a
> difference between the commands, but we are not sure if the above one
> should lead to a task hang.
> https://access.redhat.com/solutions/27166
> 2) We have verified v5.10 and v.15.1 kernels, and both have this issue.
> 3) We tried the same test with NvmeOf target and host over softROCE.
> We get the same task hang after doing "ifconfig .. down".
> 
> [Tue Nov  9 14:28:51 2021] INFO: task kworker/1:1:34 blocked for more
> than 184 seconds.
> [Tue Nov  9 14:28:51 2021]       Tainted: G           O
> 5.10.42-pserver+ #84
> [Tue Nov  9 14:28:51 2021] "echo 0 >
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [Tue Nov  9 14:28:51 2021] task:kworker/1:1     state:D stack:    0
> pid:   34 ppid:     2 flags:0x00004000
> [Tue Nov  9 14:28:51 2021] Workqueue: events
> nvmet_rdma_release_queue_work [nvmet_rdma]
> [Tue Nov  9 14:28:51 2021] Call Trace:
> [Tue Nov  9 14:28:51 2021]  __schedule+0x421/0x810
> [Tue Nov  9 14:28:51 2021]  schedule+0x46/0xb0
> [Tue Nov  9 14:28:51 2021]  schedule_timeout+0x20e/0x2a0
> [Tue Nov  9 14:28:51 2021]  ? internal_add_timer+0x44/0x70
> [Tue Nov  9 14:28:51 2021]  wait_for_completion+0x86/0xe0
> [Tue Nov  9 14:28:51 2021]  cm_destroy_id+0x18c/0x5a0 [ib_cm]
> [Tue Nov  9 14:28:51 2021]  ? _cond_resched+0x15/0x30
> [Tue Nov  9 14:28:51 2021]  ? wait_for_completion+0x33/0xe0
> [Tue Nov  9 14:28:51 2021]  _destroy_id+0x57/0x210 [rdma_cm]
> [Tue Nov  9 14:28:51 2021]  nvmet_rdma_free_queue+0x2e/0xc0 [nvmet_rdma]
> [Tue Nov  9 14:28:51 2021]  nvmet_rdma_release_queue_work+0x19/0x50 [nvmet_rdma]
> [Tue Nov  9 14:28:51 2021]  process_one_work+0x1d4/0x370
> [Tue Nov  9 14:28:51 2021]  worker_thread+0x4a/0x3b0
> [Tue Nov  9 14:28:51 2021]  ? process_one_work+0x370/0x370
> [Tue Nov  9 14:28:51 2021]  kthread+0xfe/0x140
> [Tue Nov  9 14:28:51 2021]  ? kthread_park+0x90/0x90
> [Tue Nov  9 14:28:51 2021]  ret_from_fork+0x1f/0x30
> 
> Is this an known issue with ifconfig or the rxe driver? Thoughts?

It doesn't look related to RXE. Judging by the call trace, the issue
is one of two things: a missing call to cm_deref_id() or an extra call
to refcount_inc(&cm_id_priv->refcount). Both can be caused by the
callers of the rdma-cm interface, for example by not releasing the
cm_id properly.
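
For reference, the refcount/completion pairing the trace is stuck on
looks roughly like this (a heavily trimmed sketch of our reading of
drivers/infiniband/core/cm.c, not verbatim code):

static void cm_deref_id(struct cm_id_private *cm_id_priv)
{
    if (refcount_dec_and_test(&cm_id_priv->refcount))
        complete(&cm_id_priv->comp);
}

static void cm_destroy_id(struct ib_cm_id *cm_id, int err)
{
    struct cm_id_private *cm_id_priv =
        container_of(cm_id, struct cm_id_private, id);

    /* state handling and DREQ/REJ sending omitted */
    cm_deref_id(cm_id_priv);                /* drop the cm_id's own reference */
    wait_for_completion(&cm_id_priv->comp); /* <-- the hung task sleeps here */
    /* work cleanup and freeing of cm_id_priv omitted */
}

Any reference that is taken but never dropped keeps the completion
from firing and leaves cm_destroy_id() stuck.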

Thanks

> 
> Regards
> -Haris


* Re: Task hung while using RTRS with rxe and using "ip" utility to down the interface
  2021-11-14  6:12 ` Leon Romanovsky
@ 2021-11-17 11:21   ` Haris Iqbal
  2021-11-23  9:06     ` Leon Romanovsky
  0 siblings, 1 reply; 4+ messages in thread
From: Haris Iqbal @ 2021-11-17 11:21 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: RDMA mailing list, Jason Gunthorpe, Jinpu Wang, Danil Kipnis,
	Aleksei Marov

On Sun, Nov 14, 2021 at 7:12 AM Leon Romanovsky <leon@kernel.org> wrote:
>
> On Thu, Nov 11, 2021 at 05:25:11PM +0100, Haris Iqbal wrote:
> > Hi,
> >
> > We are experiencing a hang while using RTRS with softROCE and
> > transitioning the network interface down using ifdown command.
> >
> > Steps.
> > 1) Map an RNBD/RTRS device through softROCE port.
> > 2) Once mapped, transition the eth interface to down (on which the
> > softROCE interface was created) using command "ifconfig <ethx> down",
> > or "ip link set <ethx> down".
> > 3) The device errors out, and one can see RTRS connection errors in
> > dmesg. So far so good.
> > 4) After a while, we see task hung traces in dmesg.
> >
> > [  550.866462] INFO: task kworker/1:2:170 blocked for more than 184 seconds.
> > [  550.868820]       Tainted: G           O      5.10.42-pserver+ #84
> > [  550.869337] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> > [  550.869963] task:kworker/1:2     state:D stack:    0 pid:  170
> > ppid:     2 flags:0x00004000
> > [  550.870619] Workqueue: rtrs_server_wq rtrs_srv_close_work [rtrs_server]
> > [  550.871134] Call Trace:
> > [  550.871375]  __schedule+0x421/0x810
> > [  550.871683]  schedule+0x46/0xb0
> > [  550.871964]  schedule_timeout+0x20e/0x2a0
> > [  550.872300]  ? internal_add_timer+0x44/0x70
> > [  550.872650]  wait_for_completion+0x86/0xe0
> > [  550.872994]  cm_destroy_id+0x18c/0x5a0 [ib_cm]
> > [  550.873357]  ? _cond_resched+0x15/0x30
> > [  550.873680]  ? wait_for_completion+0x33/0xe0
> > [  550.874036]  _destroy_id+0x57/0x210 [rdma_cm]
> > [  550.874395]  rtrs_srv_close_work+0xcc/0x250 [rtrs_server]
> > [  550.874819]  process_one_work+0x1d4/0x370
> > [  550.875156]  worker_thread+0x4a/0x3b0
> > [  550.875471]  ? process_one_work+0x370/0x370
> > [  550.875817]  kthread+0xfe/0x140
> > [  550.876098]  ? kthread_park+0x90/0x90
> > [  550.876453]  ret_from_fork+0x1f/0x30
> >
> >
> > Our observations till now.
> >
> > 1) Does not occur if we use "ifdown <ethx>" instead. There is a
> > difference between the commands, but we are not sure if the above one
> > should lead to a task hang.
> > https://access.redhat.com/solutions/27166
> > 2) We have verified v5.10 and v.15.1 kernels, and both have this issue.
> > 3) We tried the same test with NvmeOf target and host over softROCE.
> > We get the same task hang after doing "ifconfig .. down".
> >
> > [Tue Nov  9 14:28:51 2021] INFO: task kworker/1:1:34 blocked for more
> > than 184 seconds.
> > [Tue Nov  9 14:28:51 2021]       Tainted: G           O
> > 5.10.42-pserver+ #84
> > [Tue Nov  9 14:28:51 2021] "echo 0 >
> > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > [Tue Nov  9 14:28:51 2021] task:kworker/1:1     state:D stack:    0
> > pid:   34 ppid:     2 flags:0x00004000
> > [Tue Nov  9 14:28:51 2021] Workqueue: events
> > nvmet_rdma_release_queue_work [nvmet_rdma]
> > [Tue Nov  9 14:28:51 2021] Call Trace:
> > [Tue Nov  9 14:28:51 2021]  __schedule+0x421/0x810
> > [Tue Nov  9 14:28:51 2021]  schedule+0x46/0xb0
> > [Tue Nov  9 14:28:51 2021]  schedule_timeout+0x20e/0x2a0
> > [Tue Nov  9 14:28:51 2021]  ? internal_add_timer+0x44/0x70
> > [Tue Nov  9 14:28:51 2021]  wait_for_completion+0x86/0xe0
> > [Tue Nov  9 14:28:51 2021]  cm_destroy_id+0x18c/0x5a0 [ib_cm]
> > [Tue Nov  9 14:28:51 2021]  ? _cond_resched+0x15/0x30
> > [Tue Nov  9 14:28:51 2021]  ? wait_for_completion+0x33/0xe0
> > [Tue Nov  9 14:28:51 2021]  _destroy_id+0x57/0x210 [rdma_cm]
> > [Tue Nov  9 14:28:51 2021]  nvmet_rdma_free_queue+0x2e/0xc0 [nvmet_rdma]
> > [Tue Nov  9 14:28:51 2021]  nvmet_rdma_release_queue_work+0x19/0x50 [nvmet_rdma]
> > [Tue Nov  9 14:28:51 2021]  process_one_work+0x1d4/0x370
> > [Tue Nov  9 14:28:51 2021]  worker_thread+0x4a/0x3b0
> > [Tue Nov  9 14:28:51 2021]  ? process_one_work+0x370/0x370
> > [Tue Nov  9 14:28:51 2021]  kthread+0xfe/0x140
> > [Tue Nov  9 14:28:51 2021]  ? kthread_park+0x90/0x90
> > [Tue Nov  9 14:28:51 2021]  ret_from_fork+0x1f/0x30
> >
> > Is this an known issue with ifconfig or the rxe driver? Thoughts?
>
> It doesn't look related to RXE and if to judge by the call trace the
> issue is one of two: missing call to cm_deref_id() or extra call
> to refcount_inc(&cm_id_priv->refcount). Both can be related to the
> callers of rdma-cm interface. For example by not releasing cm_id
> properly.


Hi,

@Aleksei did some more investigation and found the cause of the hang.

1. We enabled tracing for rdma_cma and ib_cma events.
2. When we call ifdown (the case that does not hang), the following
trace can be observed:

kworker/1:2-100     [001] ....    98.191532: cm_disconnect: cm.id=2
src=10.154.0.1:1234 dst=10.154.0.3:53749 tos=0
kworker/1:2-100     [001] dN..    98.208935: icm_send_drep_err:
local_id=3791663578 remote_id=2340456478 state=TIMEWAIT
lap_state=LAP_UNINIT
kworker/1:2-100     [001] .N..    98.213924: cm_disconnect: cm.id=3
src=10.154.0.1:1234 dst=10.154.0.3:53749 tos=0
kworker/1:2-100     [001] dN..    98.218595: icm_send_drep_err:
local_id=3808440794 remote_id=2323679262 state=TIMEWAIT
lap_state=LAP_UNINIT
kworker/1:2-100     [001] .N..    98.219583: cm_disconnect: cm.id=4
src=10.154.0.1:1234 dst=10.154.0.3:53749 tos=0
kworker/1:2-100     [001] dN..    98.228200: icm_send_drep_err:
local_id=3758109146 remote_id=2306902046 state=TIMEWAIT
lap_state=LAP_UNINIT
kworker/1:2-100     [001] ....    98.322883: cm_qp_destroy: cm.id=2
src=10.154.0.1:1234 dst=10.154.0.3:53749 tos=0 qp_num=17
kworker/1:2-100     [001] ....    98.322913: cm_id_destroy: cm.id=2
src=10.154.0.1:1234 dst=10.154.0.3:53749 tos=0
kworker/1:2-100     [001] ....    98.332065: cm_qp_destroy: cm.id=3
src=10.154.0.1:1234 dst=10.154.0.3:53749 tos=0 qp_num=18
kworker/1:2-100     [001] ....    98.332115: cm_id_destroy: cm.id=3
src=10.154.0.1:1234 dst=10.154.0.3:53749 tos=0
kworker/1:2-100     [001] ....    98.340362: cm_qp_destroy: cm.id=4
src=10.154.0.1:1234 dst=10.154.0.3:53749 tos=0 qp_num=19
kworker/1:2-100     [001] ....    98.340400: cm_id_destroy: cm.id=4
src=10.154.0.1:1234 dst=10.154.0.3:53749 tos=0


Explanation:
In this example we have three cm.ids (2, 3, 4). For each of them we
call rdma_disconnect(), which calls ib_send_cm_dreq(), which calls
cm_send_dreq_locked(), where we hit

ret = cm_alloc_msg(cm_id_priv, &msg);
if (ret) {
    cm_enter_timewait(cm_id_priv);
    return ret;
}

which returns with an error before refcount_inc(&cm_id_priv->refcount)
is reached inside cm_alloc_msg, since the error comes from
rdma_create_ah. So the refcount of each cm.id is still 1 after calling
rdma_disconnect. Hence in cm_destroy_id we call cm_deref_id once more
and do NOT hang on

wait_for_completion(&cm_id_priv->comp);

because refcount is 0.
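
For context, a condensed sketch of cm_alloc_msg() as we read it in the
5.10-era cm.c (locking and the port-ready checks trimmed; not verbatim).
The key point is the ordering: the AH is created before the extra
reference is taken, so an rdma_create_ah() failure leaves the refcount
untouched:

static int cm_alloc_msg(struct cm_id_private *cm_id_priv,
                        struct ib_mad_send_buf **msg)
{
    struct ib_mad_agent *mad_agent = cm_id_priv->av.port->mad_agent;
    struct ib_mad_send_buf *m;
    struct ib_ah *ah;

    ah = rdma_create_ah(mad_agent->qp->pd, &cm_id_priv->av.ah_attr, 0);
    if (IS_ERR(ah))
        return PTR_ERR(ah);   /* the ifdown case returns here, no ref taken */

    m = ib_create_send_mad(mad_agent, cm_id_priv->id.remote_cm_qpn,
                           cm_id_priv->av.pkey_index, 0,
                           IB_MGMT_MAD_HDR, IB_MGMT_MAD_DATA,
                           GFP_ATOMIC, IB_MGMT_BASE_VERSION);
    if (IS_ERR(m)) {
        rdma_destroy_ah(ah, 0);
        return PTR_ERR(m);
    }

    m->ah = ah;
    m->retries = cm_id_priv->max_cm_retries;
    refcount_inc(&cm_id_priv->refcount);  /* extra ref held by the in-flight MAD */
    m->context[0] = cm_id_priv;
    *msg = m;
    return 0;
}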

3. In the case of ifconfig down (the case that hangs), the trace is
the following:

kworker/1:0-19      [001] ....   289.211230: cm_disconnect: cm.id=5
src=10.154.0.1:1234 dst=10.154.0.3:60177 tos=0
kworker/1:0-19      [001] d...   289.222918: icm_send_dreq:
local_id=3636223036 remote_id=2994969923 state=ESTABLISHED
lap_state=LAP_UNINIT
kworker/1:0-19      [001] .N..   289.224789: cm_sent_dreq: cm.id=5
src=10.154.0.1:1234 dst=10.154.0.3:60177 tos=0
kworker/u5:0-63      [001] d...   289.224820: icm_mad_send_err:
state=DREQ_SENT completion status=LOC_PROT_ERR
kworker/u5:0-63      [001] ....   289.224821: cm_event_handler:
cm.id=5 src=10.154.0.1:1234 dst=10.154.0.3:60177 tos=0 DISCONNECTED
(10/-110)
kworker/u5:0-63      [001] ....   289.224822: cm_event_done: cm.id=5
src=10.154.0.1:1234 dst=10.154.0.3:60177 tos=0 DISCONNECTED consumer
returns 0
kworker/1:0-19      [001] .N..   289.226797: cm_disconnect: cm.id=6
src=10.154.0.1:1234 dst=10.154.0.3:60177 tos=0
kworker/1:0-19      [001] dN..   289.238556: icm_send_dreq:
local_id=3619445820 remote_id=3045301571 state=ESTABLISHED
lap_state=LAP_UNINIT
kworker/1:0-19      [001] .N..   289.240126: cm_sent_dreq: cm.id=6
src=10.154.0.1:1234 dst=10.154.0.3:60177 tos=0
kworker/1:0-19      [001] .N..   289.241162: cm_disconnect: cm.id=7
src=10.154.0.1:1234 dst=10.154.0.3:60177 tos=0
kworker/1:0-19      [001] dN..   289.251148: icm_send_dreq:
local_id=3602668604 remote_id=3028524355 state=ESTABLISHED
lap_state=LAP_UNINIT
kworker/1:0-19      [001] .N..   289.252624: cm_sent_dreq: cm.id=7
src=10.154.0.1:1234 dst=10.154.0.3:60177 tos=0
kworker/1:0-19      [001] ....   289.340055: cm_qp_destroy: cm.id=5
src=10.154.0.1:1234 dst=10.154.0.3:60177 tos=0 qp_num=19
kworker/1:0-19      [001] ....   289.340159: cm_id_destroy: cm.id=5
src=10.154.0.1:1234 dst=10.154.0.3:60177 tos=0
kworker/1:0-19      [001] ....   289.367393: cm_qp_destroy: cm.id=6
src=10.154.0.1:1234 dst=10.154.0.3:60177 tos=0 qp_num=20
kworker/1:0-19      [001] ....   289.367445: cm_id_destroy: cm.id=6
src=10.154.0.1:1234 dst=10.154.0.3:60177 tos=0

Explanation:
In this example we have three cm.ids (5, 6, 7). For each of them we
call rdma_disconnect(), which calls ib_send_cm_dreq(), which calls
cm_send_dreq_locked(), which calls cm_alloc_msg(). In this case
cm_alloc_msg() does not return an error; it increases the refcount of
each cm.id to 2, and we do send the DREQ. But later, only for cm.id=5
is the send handler called with LOC_PROT_ERR:

icm_mad_send_err: state=DREQ_SENT completion status=LOC_PROT_ERR

which triggers cm_process_send_error -> cm_free_msg -> cm_deref_id,
which decreases the refcount of cm.id=5 to 1.
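
A minimal sketch of the send-completion handler that dispatches into
this path, simplified from our reading of cm.c (the MAD counters and
other branches are omitted; not verbatim):

static void cm_send_handler(struct ib_mad_agent *mad_agent,
                            struct ib_mad_send_wc *mad_send_wc)
{
    struct ib_mad_send_buf *msg = mad_send_wc->send_buf;

    if (mad_send_wc->status == IB_WC_SUCCESS)
        cm_free_msg(msg);   /* ends in cm_deref_id() */
    else
        /* e.g. the LOC_PROT_ERR seen for cm.id=5 above */
        cm_process_send_error(msg, mad_send_wc->status);
}

Either way, the extra reference taken in cm_alloc_msg() is only
dropped once this handler runs for the DREQ, which for cm.id=6,7
apparently never happens.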


For the other cm.ids (6, 7) the refcounts stay at 2, since their DREQs
were sent but neither completed nor timed out; nothing ever happens
with the DREQs sent for cm.id=6,7.
So later, when cm_id_destroy is called for cm.id=5 (refcount=1), we
call cm_deref_id once more and do not hang on

wait_for_completion(&cm_id_priv->comp);

because the refcount reaches 0.
But for cm.id=6,7 we do call cm_deref_id, yet the refcount only goes
from 2 to 1, and we wait_for_completion forever, because those sent
DREQs are still outstanding somewhere.
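
To summarize the reference accounting we infer from the traces above
(each cm.id holds a single reference after creation, consistent with
the ifdown case described earlier):

    cm.id=5: 1 (creation) + 1 (DREQ in flight)
             - 1 (LOC_PROT_ERR completion) - 1 (cm_id_destroy) = 0 -> completes
    cm.id=6: 1 (creation) + 1 (DREQ in flight) - 1 (cm_id_destroy) = 1 -> hangs
    cm.id=7: same as cm.id=6                                         -> hangs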

4. The reason why cm_alloc_msg returns an error without a refcount
increase in one case (ifdown <ethx>) but increases the refcount and
returns 0 in the other case (ifconfig <ethx> down) is the GID entry
state in the ib_gid_table.

Explanation:
cm_alloc_msg goes through the following call chain:
cm_alloc_msg -> rdma_create_ah -> rdma_lag_get_ah_roce_slave ->
rdma_read_gid_attr_ndev_rcu -> is_gid_entry_valid
which does

return entry && entry->state == GID_TABLE_ENTRY_VALID;

In the ifdown case, the entry is either NULL or in the state
GID_TABLE_ENTRY_PENDING_DEL, so the function returns false and
cm_alloc_msg fails without a refcount increase.
In the ifconfig down case, the entry is in the state
GID_TABLE_ENTRY_VALID, so the function returns true and cm_alloc_msg
increases the refcount of the cm.id.
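
For reference, the GID table entry states in
drivers/infiniband/core/cache.c look roughly like this (a paraphrased
sketch, not verbatim):

enum gid_table_entry_state {
    GID_TABLE_ENTRY_INVALID     = 1,
    GID_TABLE_ENTRY_VALID       = 2,
    /*
     * Entry is pending removal; there may still be active users.
     * Once the last user drops its reference, the entry is freed.
     */
    GID_TABLE_ENTRY_PENDING_DEL = 3,
};

So ifdown apparently moves the GID entries to (or past) PENDING_DEL
before rdma_disconnect() runs, while "ifconfig down" / "ip link set
down" leaves them VALID at that point.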


We think one of the questions below points at the problem:

1. Why was cm_process_send_error triggered for only 1 of the 3 sent
DREQs? Why is cm_send_handler not called for the others?
2. When RTRS receives an error and starts tearing down the connection,
during which rdma_disconnect is called, shouldn't cm_alloc_msg fail in
such a case?

The same behavior is observed for RTRS and for NVMe-oF targets.



>
> Thanks
>
> >
> > Regards
> > -Haris


* Re: Task hung while using RTRS with rxe and using "ip" utility to down the interface
  2021-11-17 11:21   ` Haris Iqbal
@ 2021-11-23  9:06     ` Leon Romanovsky
  0 siblings, 0 replies; 4+ messages in thread
From: Leon Romanovsky @ 2021-11-23  9:06 UTC (permalink / raw)
  To: Haris Iqbal
  Cc: RDMA mailing list, Jason Gunthorpe, Jinpu Wang, Danil Kipnis,
	Aleksei Marov

On Wed, Nov 17, 2021 at 12:21:14PM +0100, Haris Iqbal wrote:
> On Sun, Nov 14, 2021 at 7:12 AM Leon Romanovsky <leon@kernel.org> wrote:
> >
> > On Thu, Nov 11, 2021 at 05:25:11PM +0100, Haris Iqbal wrote:
> > > Hi,
> > >
> > > We are experiencing a hang while using RTRS with softROCE and
> > > transitioning the network interface down using ifdown command.
> > >
> > > Steps.
> > > 1) Map an RNBD/RTRS device through softROCE port.
> > > 2) Once mapped, transition the eth interface to down (on which the
> > > softROCE interface was created) using command "ifconfig <ethx> down",
> > > or "ip link set <ethx> down".
> > > 3) The device errors out, and one can see RTRS connection errors in
> > > dmesg. So far so good.
> > > 4) After a while, we see task hung traces in dmesg.

<...>

> We think one of the below questions is the problem,
> 
> 1. why cm_process_send_error was triggered only for 1 of 3 sent dreq?
> Why the cm_send_handler is not called for others?
> 2. When RTRS receives an error and starts closing the network, during
> which rdma_disconnect is called, shouldnt cm_alloc_msg fail in such a
> case?

I don't know the answer to the first question, but for the second one
the answer is no. A properly performed disconnect involves DREQ and
DREP messages, which need to be sent with cm_alloc_msg().

IBTA 12.10.8 COMMUNICATION RELEASE

Thanks


Thread overview: 4 messages
2021-11-11 16:25 Task hung while using RTRS with rxe and using "ip" utility to down the interface Haris Iqbal
2021-11-14  6:12 ` Leon Romanovsky
2021-11-17 11:21   ` Haris Iqbal
2021-11-23  9:06     ` Leon Romanovsky
