From: Shiraz Saleem <shiraz.saleem-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
To: hch-jcswGhMUV9g@public.gmane.org,
	sagi-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Deadlock on device removal event for NVMeF target
Date: Mon, 26 Jun 2017 17:59:20 -0500	[thread overview]
Message-ID: <20170626225920.GA11700@ssaleem-MOBL4.amr.corp.intel.com> (raw)

Hi Sagi/Christoph,

I am seeing a deadlock for a device removal event on NVMeF target.

The sequence of events leading to the deadlock is as follows:

1. i40iw posts IW_CM_EVENT_CLOSE events for all QPs, causing the corresponding
NVMet RDMA queues to disconnect and schedule release work on the global WQ.
2. i40iw triggers device removal:
	ib_unregister_device
	[..]
	cma_remove_id_dev (takes a handler lock before calling the event handler)
	nvmet_rdma_cm_handler
	nvmet_rdma_device_removal (queue->state == NVMET_RDMA_Q_DISCONNECTING due to 1.)
	flush_scheduled_work (blocks until all scheduled work is drained from the WQ)
	nvmet_rdma_release_queue_work (queue->state == NVMET_RDMA_Q_DISCONNECTING)
	rdma_destroy_id (waits on the same handler lock as cma_remove_id_dev, causing the deadlock)
     
So this problem can occur when a device removal event arrives while the queue is in
the disconnecting state and some outstanding work hasn't yet been drained from the WQ at the
time flush_scheduled_work is called.

Here is a call trace with some debug logs. The problematic cm_id is ffff8806e9924d18.

[371.577371] nvmet_rdma: nvmet_rdma_cm_handler: disconnected (10): cm_id ffff8806de8acd18
[371.577378] nvmet_rdma: __nvmet_rdma_queue_disconnect: cm_id= ffff8806de8acd18 queue->state= 1 queue->idx = 1
[371.577381] nvmet_rdma: __nvmet_rdma_queue_disconnect: rdma_disconnect cm_id= ffff8806de8acd18
[371.577501] nvmet_rdma: nvmet_rdma_cm_handler: disconnected (10): cm_id ffff8806e9924d18
[371.577502] nvmet_rdma: nvmet_rdma_cm_handler: device removal (11): cm_id ffff8806e6b6d5a8
[371.577504] nvmet_rdma: nvmet_rdma_device_removal: queue           (null)
[371.577506] nvmet_rdma: __nvmet_rdma_queue_disconnect: cm_id= ffff8806e9924d18 queue->state= 1 queue->idx = 0
[371.577507] rdma_destroy_id id ffff8806e6b6d5a8
[371.577509] nvmet_rdma: __nvmet_rdma_queue_disconnect: rdma_disconnect cm_id= ffff8806e9924d18
[371.577870] nvmet_rdma: nvmet_rdma_cm_handler: device removal (11): cm_id ffff8806e9924d18
[371.577877] nvmet_rdma: nvmet_rdma_device_removal: calling flush_scheduled_work queue state 2 idx 0 cm_id ffff8806e9924d18
[371.580174] nvmet_rdma: nvmet_rdma_queue_response_post_send: 2 opcode 2 queue_state 2 queue_idx 0
[371.580198] nvmet_rdma: sending cmd response failed
[371.596356] nvmet_rdma: nvmet_rdma_release_queue_work: calling rdma_destroy_id cm_id ffff8806de8acd18 queue_idx 1 state 2
[371.596358] rdma_destroy_id id ffff8806de8acd18
[371.596512] nvmet_rdma: nvmet_rdma_release_queue_work: calling rdma_destroy_id cm_id ffff8806e9924d18 queue_idx 0 state 2
[371.596514] rdma_destroy_id id ffff8806e9924d18                    ------------------------------------------------> Hang                                                             
[614.390162] INFO: task kworker/6:1:4429 blocked for more than 120 seconds.
[614.390165]       Tainted: G           O    4.11.0-rc8+ #11
[614.390166] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[614.390167] kworker/6:1     D30016  4429      2 0x00000000
[614.390176] Workqueue: events nvmet_rdma_release_queue_work [nvmet_rdma]
[614.390178] Call Trace:
[614.390182]  __schedule+0x378/0xbf0
[614.390185]  schedule+0x38/0x90
[614.390186]  schedule_preempt_disabled+0x10/0x20
[614.390188]  __mutex_lock+0x2ac/0x960
[614.390190]  ? __mutex_lock+0xb6/0x960
[614.390193]  ? rdma_destroy_id+0x6a/0x2e0 [rdma_cm]
[614.390196]  mutex_lock_nested+0x16/0x20
[614.390198]  rdma_destroy_id+0x6a/0x2e0 [rdma_cm]
[614.390201]  nvmet_rdma_release_queue_work+0x78/0x80 [nvmet_rdma]
[614.390203]  process_one_work+0x203/0x710
[614.390204]  ? process_one_work+0x16f/0x710
[614.390207]  worker_thread+0x126/0x4a0
[614.390210]  kthread+0x112/0x150
[614.390212]  ? process_one_work+0x710/0x710
[614.390213]  ? kthread_create_on_node+0x40/0x40
[614.390216]  ret_from_fork+0x2e/0x40
[614.390219] INFO: lockdep is turned off.
[614.390239] INFO: task rmmod:8443 blocked for more than 120 seconds.
[614.390241]       Tainted: G           O    4.11.0-rc8+ #11
[614.390242] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[614.390243] rmmod           D29880  8443   7036 0x00000080
[614.390248] Call Trace:
[614.390250]  __schedule+0x378/0xbf0
[614.390252]  schedule+0x38/0x90
[614.390254]  schedule_timeout+0x235/0x4f0
[614.390257]  ? mark_held_locks+0x71/0x90
[614.390259]  ? _raw_spin_unlock_irq+0x27/0x40
[614.390261]  wait_for_completion+0xb4/0x120
[614.390264]  ? wake_up_q+0x70/0x70
[614.390267]  flush_workqueue+0x211/0x6c0
[614.390269]  ? flush_workqueue+0x100/0x6c0
[614.390271]  ? vprintk_emit+0x211/0x2e0
[614.390275]  ? printk+0x43/0x45
[614.390277]  ? wait_for_completion+0x36/0x120
[614.390280]  nvmet_rdma_cm_handler+0x34f/0xf50 [nvmet_rdma]
[614.390282]  ? cma_comp+0x39/0x50 [rdma_cm]
[614.390284]  ? mark_held_locks+0x71/0x90
[614.390286]  ? _raw_spin_unlock_irqrestore+0x31/0x50
[614.390289]  ? trace_hardirqs_on+0xd/0x10
[614.390292]  cma_remove_one+0x1fc/0x220 [rdma_cm]
[614.390299]  ib_unregister_device+0xd8/0x170 [ib_core]
[614.390303]  i40iw_destroy_rdma_device+0x55/0x160 [i40iw]
[614.390307]  i40iw_deinit_device+0x96/0x460 [i40iw]
[614.390311]  i40iw_close+0x54/0xc0 [i40iw]
[614.390316]  i40e_unregister_client+0xaf/0x1b0 [i40e]
[614.390320]  i40iw_exit_module+0x10/0xaa4 [i40iw]
[614.390322]  SyS_delete_module+0x16a/0x230
[614.390325]  do_syscall_64+0x59/0x1a0
[614.390327]  entry_SYSCALL64_slow_path+0x25/0x25
[614.390329] RIP: 0033:0x7f955841d697
[614.390339] INFO: lockdep is turned off

This patch fixed the problem, but I am not sure it is the correct fix.

diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c
index 9e45cde..d0fb307 100644
--- a/drivers/nvme/target/rdma.c
+++ b/drivers/nvme/target/rdma.c
@@ -1349,6 +1349,12 @@ static int nvmet_rdma_device_removal(struct rdma_cm_id *cm_id,
                spin_lock_irqsave(&queue->state_lock, flags);
                if (queue->state != NVMET_RDMA_Q_DISCONNECTING)
                        queue->state = NVMET_RDMA_IN_DEVICE_REMOVAL;
+               else {
+                       /* queue is disconnecting; the cm_id and queue
+                        * will be destroyed by the queued release work */
+                       spin_unlock_irqrestore(&queue->state_lock, flags);
+                       return 0;
+               }
                spin_unlock_irqrestore(&queue->state_lock, flags);
                nvmet_rdma_queue_disconnect(queue);
                flush_scheduled_work();


Shiraz
