NVMe Over Fabrics Disconnect Kernel error

* NVMe Over Fabrics Disconnect Kernel error
@ 2020-03-28  6:12 Anton Brekhov
  2020-03-29  4:14 ` Sagi Grimberg
  2020-03-31 13:26 ` Christoph Hellwig
  0 siblings, 2 replies; 9+ messages in thread
From: Anton Brekhov @ 2020-03-28  6:12 UTC (permalink / raw)
  To: linux-nvme

Greetings!

We're using nvme-cli technology with ZFS and Lustre Filesystem on top of it.
But we constantly come across a kernel error while disconnecting
remote disks from switched off nodes:
```
[  +0,000089] INFO: task kworker/u593:0:82293 blocked for more than 120 seconds.
[  +0,001959] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[  +0,001941] kworker/u593:0  D ffff90e8493fe2a0     0 82293      2 0x00000080
[  +0,000031] Workqueue: nvme-delete-wq nvme_delete_ctrl_work [nvme_core]
[  +0,000003] Call Trace:
[  +0,000008]  [<ffffffff8177f229>] schedule+0x29/0x70
[  +0,000010]  [<ffffffff81358e85>] blk_mq_freeze_queue_wait+0x75/0xe0
[  +0,000007]  [<ffffffff810c61c0>] ? wake_up_atomic_t+0x30/0x30
[  +0,000006]  [<ffffffff81359cb4>] blk_freeze_queue+0x24/0x50
[  +0,000009]  [<ffffffff8134e0ef>] blk_cleanup_queue+0x7f/0x1b0
[  +0,000012]  [<ffffffffc031158e>] nvme_ns_remove+0x8e/0xb0 [nvme_core]
[  +0,000011]  [<ffffffffc031174b>] nvme_remove_namespaces+0xab/0xf0 [nvme_core]
[  +0,000012]  [<ffffffffc03117e2>] nvme_delete_ctrl_work+0x52/0x80 [nvme_core]
[  +0,000008]  [<ffffffff810bd0ff>] process_one_work+0x17f/0x440
[  +0,000006]  [<ffffffff810be368>] worker_thread+0x278/0x3c0
[  +0,000006]  [<ffffffff810be0f0>] ? manage_workers.isra.26+0x2a0/0x2a0
[  +0,000005]  [<ffffffff810c50d1>] kthread+0xd1/0xe0
[  +0,000006]  [<ffffffff810c5000>] ? insert_kthread_work+0x40/0x40
[  +0,000006]  [<ffffffff8178cd1d>] ret_from_fork_nospec_begin+0x7/0x21
[  +0,000006]  [<ffffffff810c5000>] ? insert_kthread_work+0x40/0x40
```
Nodes characteristics:
[root@s02p005 ~]# uname -srm
Linux 3.10.0-1062.1.1.el7.x86_64 x86_64
[root@s02p005 ~]# cat /etc/redhat-release
CentOS Linux release 7.7.1908 (Core)

Where're using nvmet_rdma.
Is there any workaround for such error?

Best Regards,
Anton Brekhov.

_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 9+ messages in thread