linux-nvme.lists.infradead.org archive mirror
* [PATCH 1/2] nvme-tcp: block BH in sk state_change sk callback
@ 2021-03-21  7:08 Sagi Grimberg
  2021-03-21  7:08 ` [PATCH 2/2] nvmet-tcp: Fix incorrect locking in " Sagi Grimberg
  2021-04-02 16:49 ` [PATCH 1/2] nvme-tcp: block BH in sk " Christoph Hellwig
  0 siblings, 2 replies; 5+ messages in thread
From: Sagi Grimberg @ 2021-03-21  7:08 UTC (permalink / raw)
  To: linux-nvme, Christoph Hellwig, Keith Busch, Chaitanya Kulkarni; +Cc: Yi Zhang

The TCP stack can run these callbacks from process context for a long
time, so we should disable BH while taking sk_callback_lock here.
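
To illustrate (a minimal sketch, not part of the patch): the same lock
class is also taken for write from softirq context (nvmet_tcp_state_change()
via tcp_fin(), see patch 2/2), so lockdep flags a plain read_lock() taken
with softirqs enabled:

  /* Illustrative interleaving on one CPU, not taken from the patch: */
  read_lock(&sk->sk_callback_lock);          /* process context, BH on  */
      /* softirq fires here and takes the lock class for write */
      write_lock_bh(&sk->sk_callback_lock);   /* spins behind the reader */
  read_unlock(&sk->sk_callback_lock);         /* never reached: deadlock */

  /* read_lock_bh() keeps softirqs disabled while the lock is held, so
   * this interleaving cannot occur.
   */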

Fixes: 3f2304f8c6d6 ("nvme-tcp: add NVMe over TCP host driver")
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/tcp.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index b81649d0c12c..480e229f627d 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -874,7 +874,7 @@ static void nvme_tcp_state_change(struct sock *sk)
 {
 	struct nvme_tcp_queue *queue;
 
-	read_lock(&sk->sk_callback_lock);
+	read_lock_bh(&sk->sk_callback_lock);
 	queue = sk->sk_user_data;
 	if (!queue)
 		goto done;
@@ -895,7 +895,7 @@ static void nvme_tcp_state_change(struct sock *sk)
 
 	queue->state_change(sk);
 done:
-	read_unlock(&sk->sk_callback_lock);
+	read_unlock_bh(&sk->sk_callback_lock);
 }
 
 static inline bool nvme_tcp_queue_more(struct nvme_tcp_queue *queue)
-- 
2.27.0




* [PATCH 2/2] nvmet-tcp: Fix incorrect locking in state_change sk callback
  2021-03-21  7:08 [PATCH 1/2] nvme-tcp: block BH in sk state_change sk callback Sagi Grimberg
@ 2021-03-21  7:08 ` Sagi Grimberg
  2021-03-24  2:06   ` Yi Zhang
  2021-04-02 16:49 ` [PATCH 1/2] nvme-tcp: block BH in sk " Christoph Hellwig
  1 sibling, 1 reply; 5+ messages in thread
From: Sagi Grimberg @ 2021-03-21  7:08 UTC (permalink / raw)
  To: linux-nvme, Christoph Hellwig, Keith Busch, Chaitanya Kulkarni; +Cc: Yi Zhang

We are not changing anything in the TCP connection state, so we should
not take a write lock but rather a read lock.
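
For reference, a rough sketch of the convention this relies on
(illustrative only, not part of the patch): the write side of
sk_callback_lock protects installing and clearing sk_user_data and the
saved callbacks, which only the queue setup/teardown paths do; the
callback itself only dereferences them:

  /* Setup path publishes the pointers -> write side (sketch): */
  write_lock_bh(&sk->sk_callback_lock);
  sk->sk_user_data = queue;
  queue->state_change = sk->sk_state_change;
  sk->sk_state_change = nvmet_tcp_state_change;
  write_unlock_bh(&sk->sk_callback_lock);

  /* The callback only reads them -> read side is sufficient (sketch): */
  read_lock_bh(&sk->sk_callback_lock);
  queue = sk->sk_user_data;
  if (queue)
          /* inspect sk->sk_state, schedule release work if needed */;
  read_unlock_bh(&sk->sk_callback_lock);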

This caused a deadlock when running nvmet-tcp and nvme-tcp on the same
system, where the state_change callbacks on the host and on the
controller side have a causal relationship, and lockdep reported the
following when running blktests:

================================
WARNING: inconsistent lock state
5.12.0-rc3 #1 Tainted: G          I
--------------------------------
inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-R} usage.
nvme/1324 [HC0[0]:SC0[0]:HE1:SE1] takes:
ffff888363151000 (clock-AF_INET){++-?}-{2:2}, at: nvme_tcp_state_change+0x21/0x150 [nvme_tcp]
{IN-SOFTIRQ-W} state was registered at:
  __lock_acquire+0x79b/0x18d0
  lock_acquire+0x1ca/0x480
  _raw_write_lock_bh+0x39/0x80
  nvmet_tcp_state_change+0x21/0x170 [nvmet_tcp]
  tcp_fin+0x2a8/0x780
  tcp_data_queue+0xf94/0x1f20
  tcp_rcv_established+0x6ba/0x1f00
  tcp_v4_do_rcv+0x502/0x760
  tcp_v4_rcv+0x257e/0x3430
  ip_protocol_deliver_rcu+0x69/0x6a0
  ip_local_deliver_finish+0x1e2/0x2f0
  ip_local_deliver+0x1a2/0x420
  ip_rcv+0x4fb/0x6b0
  __netif_receive_skb_one_core+0x162/0x1b0
  process_backlog+0x1ff/0x770
  __napi_poll.constprop.0+0xa9/0x5c0
  net_rx_action+0x7b3/0xb30
  __do_softirq+0x1f0/0x940
  do_softirq+0xa1/0xd0
  __local_bh_enable_ip+0xd8/0x100
  ip_finish_output2+0x6b7/0x18a0
  __ip_queue_xmit+0x706/0x1aa0
  __tcp_transmit_skb+0x2068/0x2e20
  tcp_write_xmit+0xc9e/0x2bb0
  __tcp_push_pending_frames+0x92/0x310
  inet_shutdown+0x158/0x300
  __nvme_tcp_stop_queue+0x36/0x270 [nvme_tcp]
  nvme_tcp_stop_queue+0x87/0xb0 [nvme_tcp]
  nvme_tcp_teardown_admin_queue+0x69/0xe0 [nvme_tcp]
  nvme_do_delete_ctrl+0x100/0x10c [nvme_core]
  nvme_sysfs_delete.cold+0x8/0xd [nvme_core]
  kernfs_fop_write_iter+0x2c7/0x460
  new_sync_write+0x36c/0x610
  vfs_write+0x5c0/0x870
  ksys_write+0xf9/0x1d0
  do_syscall_64+0x33/0x40
  entry_SYSCALL_64_after_hwframe+0x44/0xae
irq event stamp: 10687
hardirqs last  enabled at (10687): [<ffffffff9ec376bd>] _raw_spin_unlock_irqrestore+0x2d/0x40
hardirqs last disabled at (10686): [<ffffffff9ec374d8>] _raw_spin_lock_irqsave+0x68/0x90
softirqs last  enabled at (10684): [<ffffffff9f000608>] __do_softirq+0x608/0x940
softirqs last disabled at (10649): [<ffffffff9cdedd31>] do_softirq+0xa1/0xd0

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(clock-AF_INET);
  <Interrupt>
    lock(clock-AF_INET);

 *** DEADLOCK ***

5 locks held by nvme/1324:
 #0: ffff8884a01fe470 (sb_writers#4){.+.+}-{0:0}, at: ksys_write+0xf9/0x1d0
 #1: ffff8886e435c090 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x216/0x460
 #2: ffff888104d90c38 (kn->active#255){++++}-{0:0}, at: kernfs_remove_self+0x22d/0x330
 #3: ffff8884634538d0 (&queue->queue_lock){+.+.}-{3:3}, at: nvme_tcp_stop_queue+0x52/0xb0 [nvme_tcp]
 #4: ffff888363150d30 (sk_lock-AF_INET){+.+.}-{0:0}, at: inet_shutdown+0x59/0x300

stack backtrace:
CPU: 26 PID: 1324 Comm: nvme Tainted: G          I       5.12.0-rc3 #1
Hardware name: Dell Inc. PowerEdge R640/06NR82, BIOS 2.10.0 11/12/2020
Call Trace:
 dump_stack+0x93/0xc2
 mark_lock_irq.cold+0x2c/0xb3
 ? verify_lock_unused+0x390/0x390
 ? stack_trace_consume_entry+0x160/0x160
 ? lock_downgrade+0x100/0x100
 ? save_trace+0x88/0x5e0
 ? _raw_spin_unlock_irqrestore+0x2d/0x40
 mark_lock+0x530/0x1470
 ? mark_lock_irq+0x1d10/0x1d10
 ? enqueue_timer+0x660/0x660
 mark_usage+0x215/0x2a0
 __lock_acquire+0x79b/0x18d0
 ? tcp_schedule_loss_probe.part.0+0x38c/0x520
 lock_acquire+0x1ca/0x480
 ? nvme_tcp_state_change+0x21/0x150 [nvme_tcp]
 ? rcu_read_unlock+0x40/0x40
 ? tcp_mtu_probe+0x1ae0/0x1ae0
 ? kmalloc_reserve+0xa0/0xa0
 ? sysfs_file_ops+0x170/0x170
 _raw_read_lock+0x3d/0xa0
 ? nvme_tcp_state_change+0x21/0x150 [nvme_tcp]
 nvme_tcp_state_change+0x21/0x150 [nvme_tcp]
 ? sysfs_file_ops+0x170/0x170
 inet_shutdown+0x189/0x300
 __nvme_tcp_stop_queue+0x36/0x270 [nvme_tcp]
 nvme_tcp_stop_queue+0x87/0xb0 [nvme_tcp]
 nvme_tcp_teardown_admin_queue+0x69/0xe0 [nvme_tcp]
 nvme_do_delete_ctrl+0x100/0x10c [nvme_core]
 nvme_sysfs_delete.cold+0x8/0xd [nvme_core]
 kernfs_fop_write_iter+0x2c7/0x460
 new_sync_write+0x36c/0x610
 ? new_sync_read+0x600/0x600
 ? lock_acquire+0x1ca/0x480
 ? rcu_read_unlock+0x40/0x40
 ? lock_is_held_type+0x9a/0x110
 vfs_write+0x5c0/0x870
 ksys_write+0xf9/0x1d0
 ? __ia32_sys_read+0xa0/0xa0
 ? lockdep_hardirqs_on_prepare.part.0+0x198/0x340
 ? syscall_enter_from_user_mode+0x27/0x70
 do_syscall_64+0x33/0x40
 entry_SYSCALL_64_after_hwframe+0x44/0xae

Fixes: 872d26a391da ("nvmet-tcp: add NVMe over TCP target driver")
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/target/tcp.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
index 5a62590e8804..902831eadcc2 100644
--- a/drivers/nvme/target/tcp.c
+++ b/drivers/nvme/target/tcp.c
@@ -1472,7 +1472,7 @@ static void nvmet_tcp_state_change(struct sock *sk)
 {
 	struct nvmet_tcp_queue *queue;
 
-	write_lock_bh(&sk->sk_callback_lock);
+	read_lock_bh(&sk->sk_callback_lock);
 	queue = sk->sk_user_data;
 	if (!queue)
 		goto done;
@@ -1490,7 +1490,7 @@ static void nvmet_tcp_state_change(struct sock *sk)
 			queue->idx, sk->sk_state);
 	}
 done:
-	write_unlock_bh(&sk->sk_callback_lock);
+	read_unlock_bh(&sk->sk_callback_lock);
 }
 
 static int nvmet_tcp_set_queue_sock(struct nvmet_tcp_queue *queue)
-- 
2.27.0




* Re: [PATCH 2/2] nvmet-tcp: Fix incorrect locking in state_change sk callback
  2021-03-21  7:08 ` [PATCH 2/2] nvmet-tcp: Fix incorrect locking in " Sagi Grimberg
@ 2021-03-24  2:06   ` Yi Zhang
  2021-03-25 22:39     ` Sagi Grimberg
  0 siblings, 1 reply; 5+ messages in thread
From: Yi Zhang @ 2021-03-24  2:06 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: linux-nvme, Christoph Hellwig, Keith Busch, Chaitanya Kulkarni

Hi Sagi,
With the two patches applied, I reproduced another lock dependency
issue; here is the full log:

[  143.310362] run blktests nvme/003 at 2021-03-23 21:52:15
[  143.927284] loop: module loaded
[  144.027532] nvmet: adding nsid 1 to subsystem blktests-subsystem-1
[  144.059070] nvmet_tcp: enabling port 0 (127.0.0.1:4420)
[  144.201559] nvmet: creating controller 1 for subsystem nqn.2014-08.org.nvmexpress.discovery for NQN nqn.2014-08.org.nvmexpress:uuid:e25db33098f14032b70b755db1976647.
[  144.211644] nvme nvme1: new ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery", addr 127.0.0.1:4420
[  154.400575] nvme nvme1: Removing ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery"

[  154.407970] ======================================================
[  154.414871] WARNING: possible circular locking dependency detected
[  154.421765] 5.12.0-rc3.fix+ #2 Not tainted
[  154.426340] ------------------------------------------------------
[  154.433232] kworker/7:2/260 is trying to acquire lock:
[  154.438972] ffff888288e92030 ((work_completion)(&queue->io_work)){+.+.}-{0:0}, at: __flush_work+0x118/0x1a0
[  154.449882]
                but task is already holding lock:
[  154.456395] ffffc90002b57db0 ((work_completion)(&queue->release_work)){+.+.}-{0:0}, at: process_one_work+0x7c1/0x1480
[  154.468263]
                which lock already depends on the new lock.

[  154.477393]
                the existing dependency chain (in reverse order) is:
[  154.485739]
                -> #2 ((work_completion)(&queue->release_work)){+.+.}-{0:0}:
[  154.494884]        __lock_acquire+0xb77/0x18d0
[  154.499853]        lock_acquire+0x1ca/0x480
[  154.504528]        process_one_work+0x813/0x1480
[  154.509688]        worker_thread+0x590/0xf80
[  154.514458]        kthread+0x368/0x440
[  154.518650]        ret_from_fork+0x22/0x30
[  154.523232]
                -> #1 ((wq_completion)events){+.+.}-{0:0}:
[  154.530633]        __lock_acquire+0xb77/0x18d0
[  154.535597]        lock_acquire+0x1ca/0x480
[  154.540272]        flush_workqueue+0x101/0x1250
[  154.545334]        nvmet_tcp_install_queue+0x22c/0x2a0 [nvmet_tcp]
[  154.552242]        nvmet_install_queue+0x2a3/0x360 [nvmet]
[  154.558387]        nvmet_execute_admin_connect+0x321/0x420 [nvmet]
[  154.565305]        nvmet_tcp_io_work+0xa04/0xcfb [nvmet_tcp]
[  154.571629]        process_one_work+0x8b2/0x1480
[  154.576787]        worker_thread+0x590/0xf80
[  154.581560]        kthread+0x368/0x440
[  154.585749]        ret_from_fork+0x22/0x30
[  154.590328]
                -> #0 ((work_completion)(&queue->io_work)){+.+.}-{0:0}:
[  154.598989]        check_prev_add+0x15e/0x20f0
[  154.603953]        validate_chain+0xec9/0x19c0
[  154.608918]        __lock_acquire+0xb77/0x18d0
[  154.613883]        lock_acquire+0x1ca/0x480
[  154.618556]        __flush_work+0x139/0x1a0
[  154.623229]        nvmet_tcp_release_queue_work+0x2e5/0xcb0 [nvmet_tcp]
[  154.630621]        process_one_work+0x8b2/0x1480
[  154.635780]        worker_thread+0x590/0xf80
[  154.640549]        kthread+0x368/0x440
[  154.644741]        ret_from_fork+0x22/0x30
[  154.649321]
                other info that might help us debug this:

[  154.658257] Chain exists of:
                  (work_completion)(&queue->io_work) --> (wq_completion)events --> (work_completion)(&queue->release_work)

[  154.675070]  Possible unsafe locking scenario:

[  154.681679]        CPU0                    CPU1
[  154.686728]        ----                    ----
[  154.691776]   lock((work_completion)(&queue->release_work));
[  154.698102]                                lock((wq_completion)events);
[  154.705493]                                lock((work_completion)(&queue->release_work));
[  154.714631]   lock((work_completion)(&queue->io_work));
[  154.720470]
                 *** DEADLOCK ***

[  154.727080] 2 locks held by kworker/7:2/260:
[  154.731849]  #0: ffff888100053148 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x792/0x1480
[  154.742458]  #1: ffffc90002b57db0 ((work_completion)(&queue->release_work)){+.+.}-{0:0}, at: process_one_work+0x7c1/0x1480
[  154.754809]
                stack backtrace:
[  154.759674] CPU: 7 PID: 260 Comm: kworker/7:2 Not tainted 5.12.0-rc3.fix+ #2
[  154.767549] Hardware name: Dell Inc. PowerEdge R730xd/\xc9\xb2\xdePow, BIOS 2.12.1 12/04/2020
[  154.776197] Workqueue: events nvmet_tcp_release_queue_work [nvmet_tcp]
[  154.783497] Call Trace:
[  154.786231]  dump_stack+0x93/0xc2
[  154.789942]  check_noncircular+0x26a/0x310
[  154.794521]  ? print_circular_bug+0x460/0x460
[  154.799391]  ? deref_stack_reg+0x170/0x170
[  154.803967]  ? alloc_chain_hlocks+0x1de/0x520
[  154.808843]  check_prev_add+0x15e/0x20f0
[  154.813231]  validate_chain+0xec9/0x19c0
[  154.817611]  ? check_prev_add+0x20f0/0x20f0
[  154.822286]  ? save_trace+0x88/0x5e0
[  154.826290]  __lock_acquire+0xb77/0x18d0
[  154.830682]  lock_acquire+0x1ca/0x480
[  154.834775]  ? __flush_work+0x118/0x1a0
[  154.839066]  ? rcu_read_unlock+0x40/0x40
[  154.843455]  ? __lock_acquire+0xb77/0x18d0
[  154.848036]  __flush_work+0x139/0x1a0
[  154.852120]  ? __flush_work+0x118/0x1a0
[  154.856409]  ? start_flush_work+0x810/0x810
[  154.861084]  ? mark_lock+0xd3/0x1470
[  154.865082]  ? mark_lock_irq+0x1d10/0x1d10
[  154.869662]  ? lock_downgrade+0x100/0x100
[  154.874147]  ? mark_held_locks+0xa5/0xe0
[  154.878522]  ? sk_stream_wait_memory+0xe40/0xe40
[  154.883686]  ? lockdep_hardirqs_on_prepare.part.0+0x198/0x340
[  154.890394]  ? __local_bh_enable_ip+0xa2/0x100
[  154.895358]  ? trace_hardirqs_on+0x1c/0x160
[  154.900034]  ? sk_stream_wait_memory+0xe40/0xe40
[  154.905192]  nvmet_tcp_release_queue_work+0x2e5/0xcb0 [nvmet_tcp]
[  154.911999]  ? lock_is_held_type+0x9a/0x110
[  154.916676]  process_one_work+0x8b2/0x1480
[  154.921255]  ? pwq_dec_nr_in_flight+0x260/0x260
[  154.926315]  ? __lock_contended+0x910/0x910
[  154.930990]  ? worker_thread+0x150/0xf80
[  154.935374]  worker_thread+0x590/0xf80
[  154.939564]  ? __kthread_parkme+0xcb/0x1b0
[  154.944140]  ? process_one_work+0x1480/0x1480
[  154.949007]  kthread+0x368/0x440
[  154.952615]  ? _raw_spin_unlock_irq+0x24/0x30
[  154.957482]  ? __kthread_bind_mask+0x90/0x90
[  154.962255]  ret_from_fork+0x22/0x30


On 3/21/21 3:08 PM, Sagi Grimberg wrote:
> We are not changing anything in the TCP connection state, so we should
> not take a write lock but rather a read lock.
>
> This caused a deadlock when running nvmet-tcp and nvme-tcp on the same
> system, where the state_change callbacks on the host and on the
> controller side have a causal relationship, and lockdep reported the
> following when running blktests:




* Re: [PATCH 2/2] nvmet-tcp: Fix incorrect locking in state_change sk callback
  2021-03-24  2:06   ` Yi Zhang
@ 2021-03-25 22:39     ` Sagi Grimberg
  0 siblings, 0 replies; 5+ messages in thread
From: Sagi Grimberg @ 2021-03-25 22:39 UTC (permalink / raw)
  To: Yi Zhang; +Cc: linux-nvme, Christoph Hellwig, Keith Busch, Chaitanya Kulkarni


> Hi Sagi,
> With the two patches applied, I reproduced another lock dependency
> issue; here is the full log:

Hey Yi,

This one is different; the fixes for the other issue are still valid...

Will look into this one too...
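
Roughly, the cycle the new report points at looks like this (a sketch; it
assumes release_work is queued on the system workqueue and that
nvmet_tcp_install_queue() flushes that same workqueue to let controller
teardown finish, neither of which is spelled out in this thread):

  /* release_work runs on the system "events" workqueue and waits for
   * io_work to finish (sketch):
   */
  static void nvmet_tcp_release_queue_work(struct work_struct *w)
  {
          struct nvmet_tcp_queue *queue =
                  container_of(w, struct nvmet_tcp_queue, release_work);

          flush_work(&queue->io_work);    /* -> __flush_work() */
          /* ... */
  }

  /* io_work, while handling the admin connect, flushes the whole
   * "events" workqueue, which includes any pending release_work (sketch):
   */
  static u16 nvmet_tcp_install_queue(struct nvmet_sq *sq)
  {
          if (sq->qid == 0)
                  flush_scheduled_work(); /* waits on "events" */
          /* ... */
          return 0;
  }

  /* io_work -> (wq_completion)events -> release_work -> io_work */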




* Re: [PATCH 1/2] nvme-tcp: block BH in sk state_change sk callback
  2021-03-21  7:08 [PATCH 1/2] nvme-tcp: block BH in sk state_change sk callback Sagi Grimberg
  2021-03-21  7:08 ` [PATCH 2/2] nvmet-tcp: Fix incorrect locking in " Sagi Grimberg
@ 2021-04-02 16:49 ` Christoph Hellwig
  1 sibling, 0 replies; 5+ messages in thread
From: Christoph Hellwig @ 2021-04-02 16:49 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: linux-nvme, Christoph Hellwig, Keith Busch, Chaitanya Kulkarni, Yi Zhang

Thanks,

applied both patches to nvme-5.13.


