* crash on device removal
@ 2016-07-12 16:34 ` Steve Wise
  0 siblings, 0 replies; 20+ messages in thread
From: Steve Wise @ 2016-07-12 16:34 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: sagig, linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

Hey Christoph, 

I see a crash when shutting down an nvme host node via 'reboot' that has one
target device attached.  The shutdown causes iw_cxgb4 to be removed, which
triggers the device removal logic in the nvmf rdma transport.  The crash is here:

(gdb) list *nvme_rdma_free_qe+0x18
0x1e8 is in nvme_rdma_free_qe (drivers/nvme/host/rdma.c:196).
191     }
192
193     static void nvme_rdma_free_qe(struct ib_device *ibdev, struct
nvme_rdma_qe *qe,
194                     size_t capsule_size, enum dma_data_direction dir)
195     {
196             ib_dma_unmap_single(ibdev, qe->dma, capsule_size, dir);
197             kfree(qe->data);
198     }
199
200     static int nvme_rdma_alloc_qe(struct ib_device *ibdev, struct
nvme_rdma_qe *qe,

Apparently qe is NULL.
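
(As a quick sanity check, a throwaway guard along these lines would confirm that
and keep the box alive for further poking; this is purely a debugging sketch on
top of the existing function, not a proposed fix:)

static void nvme_rdma_free_qe(struct ib_device *ibdev, struct nvme_rdma_qe *qe,
		size_t capsule_size, enum dma_data_direction dir)
{
	/* debug guard: warn and bail out if the entry was never set up */
	if (WARN_ON_ONCE(!qe || !qe->data))
		return;
	ib_dma_unmap_single(ibdev, qe->dma, capsule_size, dir);
	kfree(qe->data);
}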

Looking at the device removal path, the logic appears correct (see
nvme_rdma_device_unplug() and the nice function comment :) ).  I'm wondering if,
concurrently with the host device removal path cleaning up queues, the target is
disconnecting all of its queues due to the first disconnect event from the host,
causing some cleanup race on the host side?  Although since the removal path
executes in the cma event handler upcall, I don't think another thread would be
handling a disconnect event.  Maybe the qp async event handler flow?

Thoughts?

Here is the Oops:

[  710.929451] iw_cxgb4:0000:83:00.4: Detach
[  711.242989] iw_cxgb4:0000:82:00.4: Detach
[  711.247039] nvme nvme1: Got rdma device removal event, deleting ctrl
[  711.298244] BUG: unable to handle kernel NULL pointer dereference at
0000000000000010
[  711.306162] IP: [<ffffffffa039a1e8>] nvme_rdma_free_qe+0x18/0x80 [nvme_rdma]
[  711.313286] PGD 0
[  711.315348] Oops: 0000 [#1] SMP
[  711.318519] Modules linked in: nvme_rdma nvme_fabrics brd iw_cxgb4 cxgb4
ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE
nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4
nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT nf_reject_ipv4 xt_CHECKSUM
iptable_mangle iptable_filter ip_tables bridge 8021q mrp garp stp llc cachefiles
fscache rdma_ucm rdma_cm iw_cm ib_ipoib ib_cm ib_uverbs ib_umad ocrdma be2net
iw_nes libcrc32c iw_cxgb3 cxgb3 mdio ib_qib rdmavt mlx5_ib mlx5_core mlx4_en
ib_mthca binfmt_misc dm_mirror dm_region_hash dm_log vhost_net macvtap macvlan
vhost tun kvm irqbypass uinput iTCO_wdt iTCO_vendor_support mxm_wmi pcspkr
mlx4_ib ib_core mlx4_core dm_mod i2c_i801 sg ipmi_ssif ipmi_si ipmi_msghandler
nvme nvme_core lpc_ich mfd_core mei_me mei igb dca ptp pps_core wmi ext4(E)
mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) libata(E) mgag200(E) ttm(E)
drm_kms_helper(E) drm(E) fb_sys_fops(E) sysimgblt(E) sysfillrect(E)
syscopyarea(E) i2c_algo_bit(E) i2c_core(E) [last unloaded: cxgb4]
[  711.412158] CPU: 0 PID: 4213 Comm: reboot Tainted: G            E
4.7.0-rc2-block-for-next+ #77
[  711.421064] Hardware name: Supermicro X9DR3-F/X9DR3-F, BIOS 3.2a 07/09/2015
[  711.428058] task: ffff881033b495c0 ti: ffff88100fc24000 task.ti:
ffff88100fc24000
[  711.435563] RIP: 0010:[<ffffffffa039a1e8>]  [<ffffffffa039a1e8>]
nvme_rdma_free_qe+0x18/0x80 [nvme_rdma]
[  711.445104] RSP: 0018:ffff88100fc279a8  EFLAGS: 00010292
[  711.450442] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000002
[  711.457608] RDX: 0000000000000010 RSI: 0000000000000000 RDI: ffff881034168000
[  711.464775] RBP: ffff88100fc279b8 R08: 0000000000000001 R09: ffffea0001e51d10
[  711.471943] R10: ffffea0001e51d18 R11: 0000000000000000 R12: 0000000000000000
[  711.479112] R13: 0000000000000020 R14: ffff881034168000 R15: ffff8810345b8140
[  711.486285] FS:  00007feac7042700(0000) GS:ffff88103ee00000(0000)
knlGS:0000000000000000
[  711.494405] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  711.500175] CR2: 0000000000000010 CR3: 00000010229d7000 CR4: 00000000000406f0
[  711.507341] Stack:
[  711.509367]  ffff881034285000 0000000000000001 ffff88100fc279f8
ffffffffa039adcf
[  711.516868]  ffff88100fc279d8 ffff881034285000 ffff881037f9f000
ffff881034272c00
[  711.524384]  ffff88100fc27b18 ffff881034272dd8 ffff88100fc27a88
ffffffffa039c8f5
[  711.531897] Call Trace:
[  711.534371]  [<ffffffffa039adcf>] nvme_rdma_destroy_queue_ib+0x5f/0x90
[nvme_rdma]
[  711.541972]  [<ffffffffa039c8f5>] nvme_rdma_cm_handler+0x2c5/0x340
[nvme_rdma]
[  711.549228]  [<ffffffff811ff71d>] ? kmem_cache_free+0x1dd/0x200
[  711.555177]  [<ffffffffa070e669>] ? cma_comp+0x49/0x60 [rdma_cm]
[  711.561217]  [<ffffffffa071310f>] cma_remove_id_dev+0x8f/0xa0 [rdma_cm]
[  711.567860]  [<ffffffffa07131d7>] cma_process_remove+0xb7/0x100 [rdma_cm]
[  711.574678]  [<ffffffff812a4de4>] ? __kernfs_remove+0x114/0x1d0
[  711.580626]  [<ffffffffa071325e>] cma_remove_one+0x3e/0x60 [rdma_cm]
[  711.587015]  [<ffffffffa03b8ca0>] ib_unregister_device+0xb0/0x150 [ib_core]
[  711.595252]  [<ffffffffa0816034>] c4iw_unregister_device+0x64/0x90 [iw_cxgb4]
[  711.603648]  [<ffffffffa0809357>] c4iw_remove+0x27/0x60 [iw_cxgb4]
[  711.611069]  [<ffffffffa080a061>] c4iw_uld_state_change+0x111/0x250
[iw_cxgb4]
[  711.619532]  [<ffffffff816da18d>] ? _cond_resched+0x1d/0x30
[  711.626317]  [<ffffffff81371971>] ? list_del+0x11/0x40
[  711.632678]  [<ffffffffa07ce71a>] detach_ulds+0x4a/0xf0 [cxgb4]
[  711.639822]  [<ffffffffa07ce94d>] remove_one+0x18d/0x1b0 [cxgb4]
[  711.647060]  [<ffffffff81397c21>] pci_device_shutdown+0x41/0x90
[  711.654189]  [<ffffffff814861f5>] device_shutdown+0x45/0x1b0
[  711.661051]  [<ffffffff810ac746>] kernel_restart_prepare+0x36/0x40
[  711.668414]  [<ffffffff810ac8c6>] kernel_restart+0x16/0x60
[  711.675084]  [<ffffffff810acb15>] SYSC_reboot+0x1a5/0x230
[  711.681645]  [<ffffffff81245ad1>] ? mntput+0x21/0x30
[  711.687738]  [<ffffffff812267a7>] ? __fput+0x177/0x240
[  711.693964]  [<ffffffff8122691e>] ? ____fput+0xe/0x10
[  711.700097]  [<ffffffff81003476>] ? do_audit_syscall_entry+0x66/0x70
[  711.707481]  [<ffffffff81003578>] ? syscall_trace_enter_phase1+0xf8/0x120
[  711.715273]  [<ffffffff81003344>] ? exit_to_usermode_loop+0x74/0xf0
[  711.722514]  [<ffffffff810acbae>] SyS_reboot+0xe/0x10
[  711.728517]  [<ffffffff81003f08>] do_syscall_64+0x78/0x1d0
[  711.734931]  [<ffffffff8106e327>] ? do_page_fault+0x37/0x90
[  711.741410]  [<ffffffff816ddee1>] entry_SYSCALL64_slow_path+0x25/0x25
[  711.748731] Code: 01 00 00 c9 c3 0f 0b eb fe 66 2e 0f 1f 84 00 00 00 00 00 55
48 89 e5 53 48 83 ec 08 66 66 66 66 90 48 8b 87 f0 02 00 00 48 89 f3 <48> 8b 76
10 48 85 c0 74 13 ff 50 10 48 8b 7b 08 e8 93 4d e6 e0
[  711.770832] RIP  [<ffffffffa039a1e8>] nvme_rdma_free_qe+0x18/0x80 [nvme_rdma]
[  711.778904]  RSP <ffff88100fc279a8>
[  711.783290] CR2: 0000000000000010


* Re: crash on device removal
  2016-07-12 16:34 ` Steve Wise
@ 2016-07-12 20:40   ` Ming Lin
  -1 siblings, 0 replies; 20+ messages in thread
From: Ming Lin @ 2016-07-12 20:40 UTC (permalink / raw)
  To: Steve Wise
  Cc: Christoph Hellwig, linux-rdma-u79uwXL29TY76Z2rM5mHXA, sagig,
	linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

On Tue, 2016-07-12 at 11:34 -0500, Steve Wise wrote:
> Hey Christoph, 
> 
> I see a crash when shutting down a nvme host node via 'reboot' that has 1 target
> device attached.  The shutdown causes iw_cxgb4 to be removed which triggers the
> device removal logic in the nvmf rdma transport.  The crash is here:
> 
> (gdb) list *nvme_rdma_free_qe+0x18
> 0x1e8 is in nvme_rdma_free_qe (drivers/nvme/host/rdma.c:196).
> 191     }
> 192
> 193     static void nvme_rdma_free_qe(struct ib_device *ibdev, struct
> nvme_rdma_qe *qe,
> 194                     size_t capsule_size, enum dma_data_direction dir)
> 195     {
> 196             ib_dma_unmap_single(ibdev, qe->dma, capsule_size, dir);
> 197             kfree(qe->data);
> 198     }
> 199
> 200     static int nvme_rdma_alloc_qe(struct ib_device *ibdev, struct
> nvme_rdma_qe *qe,
> 
> Apparently qe is NULL.
> 
> Looking at the device removal path, the logic appears correct (see
> nvme_rdma_device_unplug() and the nice function comment :) ).  I'm wondering if
> concurrently to the host device removal path cleaning up queues, the target is
> disconnecting all of its queues due to the first disconnect event from the host
> causing some cleanup race on the host side?  Although since the removal path
> executing in the cma event handler upcall, I don't think another thread would be
> handling a disconnect event.  Maybe the qp async event handler flow?
> 
> Thoughts?

We actually missed a kref_get in nvme_get_ns_from_disk().

This should fix it. Could you help to verify?

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 4babdf0..b146f52 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -183,6 +183,8 @@ static struct nvme_ns *nvme_get_ns_from_disk(struct gendisk *disk)
 	}
 	spin_unlock(&dev_list_lock);
 
+	kref_get(&ns->ctrl->kref);
+
 	return ns;
 
 fail_put_ns:


* RE: crash on device removal
  2016-07-12 20:40   ` Ming Lin
@ 2016-07-12 21:09     ` Steve Wise
  -1 siblings, 0 replies; 20+ messages in thread
From: Steve Wise @ 2016-07-12 21:09 UTC (permalink / raw)
  To: 'Ming Lin'
  Cc: 'Christoph Hellwig',
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, 'sagig',
	linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

> On Tue, 2016-07-12 at 11:34 -0500, Steve Wise wrote:
> > Hey Christoph,
> >
> > I see a crash when shutting down a nvme host node via 'reboot' that has 1 target
> > device attached.  The shutdown causes iw_cxgb4 to be removed which triggers
> the
> > device removal logic in the nvmf rdma transport.  The crash is here:
> >
> > (gdb) list *nvme_rdma_free_qe+0x18
> > 0x1e8 is in nvme_rdma_free_qe (drivers/nvme/host/rdma.c:196).
> > 191     }
> > 192
> > 193     static void nvme_rdma_free_qe(struct ib_device *ibdev, struct
> > nvme_rdma_qe *qe,
> > 194                     size_t capsule_size, enum dma_data_direction dir)
> > 195     {
> > 196             ib_dma_unmap_single(ibdev, qe->dma, capsule_size, dir);
> > 197             kfree(qe->data);
> > 198     }
> > 199
> > 200     static int nvme_rdma_alloc_qe(struct ib_device *ibdev, struct
> > nvme_rdma_qe *qe,
> >
> > Apparently qe is NULL.
> >
> > Looking at the device removal path, the logic appears correct (see
> > nvme_rdma_device_unplug() and the nice function comment :) ).  I'm wondering
> if
> > concurrently to the host device removal path cleaning up queues, the target is
> > disconnecting all of its queues due to the first disconnect event from the host
> > causing some cleanup race on the host side?  Although since the removal path
> > executing in the cma event handler upcall, I don't think another thread would be
> > handling a disconnect event.  Maybe the qp async event handler flow?
> >
> > Thoughts?
> 
> We actually missed a kref_get in nvme_get_ns_from_disk().
> 
> This should fix it. Could you help to verify?
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 4babdf0..b146f52 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -183,6 +183,8 @@ static struct nvme_ns *nvme_get_ns_from_disk(struct
> gendisk *disk)
>  	}
>  	spin_unlock(&dev_list_lock);
> 
> +	kref_get(&ns->ctrl->kref);
> +
>  	return ns;
> 
>  fail_put_ns:

Hey Ming.  This avoids the crash in nvme_rdma_free_qe(), but now I see another crash:

[  975.633436] nvme nvme0: new ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery", addr 10.0.1.14:4420
[  978.463636] nvme nvme0: creating 32 I/O queues.
[  979.187826] nvme nvme0: new ctrl: NQN "testnqn", addr 10.0.1.14:4420
[  987.778287] nvme nvme0: Got rdma device removal event, deleting ctrl
[  987.882202] BUG: unable to handle kernel paging request at ffff880e770e01f8
[  987.890024] IP: [<ffffffffa03a1a46>] __ib_process_cq+0x46/0xc0 [ib_core]

This looks like another problem with freeing the tag sets before stopping the QP.  I thought we fixed that once and for all, but perhaps there is some other path we missed. :(
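
If it is that ordering, what I'd expect the removal path to guarantee is roughly
the sketch below (the RDMA and blk-mq calls are real APIs; the field names on
"queue" are only illustrative of the driver's naming, so treat this as a sketch
rather than a patch):

	/* nothing may generate completions once the tag set (and the
	 * request memory behind it) is gone */
	rdma_disconnect(queue->cm_id);	/* stop the peer */
	ib_drain_qp(queue->qp);		/* flush and reap outstanding WRs */
	rdma_destroy_qp(queue->cm_id);
	ib_free_cq(queue->ib_cq);	/* no more .done callbacks after this */
	blk_mq_free_tag_set(tag_set);	/* only now is this safe */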

Steve.


* Re: crash on device removal
  2016-07-12 21:09     ` Steve Wise
@ 2016-07-12 21:47       ` Ming Lin
  -1 siblings, 0 replies; 20+ messages in thread
From: Ming Lin @ 2016-07-12 21:47 UTC (permalink / raw)
  To: Steve Wise
  Cc: 'Christoph Hellwig',
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, 'sagig',
	linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

On Tue, 2016-07-12 at 16:09 -0500, Steve Wise wrote:
> > On Tue, 2016-07-12 at 11:34 -0500, Steve Wise wrote:
> > > Hey Christoph,
> > >
> > > I see a crash when shutting down a nvme host node via 'reboot' that has 1 target
> > > device attached.  The shutdown causes iw_cxgb4 to be removed which triggers
> > the
> > > device removal logic in the nvmf rdma transport.  The crash is here:
> > >
> > > (gdb) list *nvme_rdma_free_qe+0x18
> > > 0x1e8 is in nvme_rdma_free_qe (drivers/nvme/host/rdma.c:196).
> > > 191     }
> > > 192
> > > 193     static void nvme_rdma_free_qe(struct ib_device *ibdev, struct
> > > nvme_rdma_qe *qe,
> > > 194                     size_t capsule_size, enum dma_data_direction dir)
> > > 195     {
> > > 196             ib_dma_unmap_single(ibdev, qe->dma, capsule_size, dir);
> > > 197             kfree(qe->data);
> > > 198     }
> > > 199
> > > 200     static int nvme_rdma_alloc_qe(struct ib_device *ibdev, struct
> > > nvme_rdma_qe *qe,
> > >
> > > Apparently qe is NULL.
> > >
> > > Looking at the device removal path, the logic appears correct (see
> > > nvme_rdma_device_unplug() and the nice function comment :) ).  I'm wondering
> > if
> > > concurrently to the host device removal path cleaning up queues, the target is
> > > disconnecting all of its queues due to the first disconnect event from the host
> > > causing some cleanup race on the host side?  Although since the removal path
> > > executing in the cma event handler upcall, I don't think another thread would be
> > > handling a disconnect event.  Maybe the qp async event handler flow?
> > >
> > > Thoughts?
> > 
> > We actually missed a kref_get in nvme_get_ns_from_disk().
> > 
> > This should fix it. Could you help to verify?
> > 
> > diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> > index 4babdf0..b146f52 100644
> > --- a/drivers/nvme/host/core.c
> > +++ b/drivers/nvme/host/core.c
> > @@ -183,6 +183,8 @@ static struct nvme_ns *nvme_get_ns_from_disk(struct
> > gendisk *disk)
> >  	}
> >  	spin_unlock(&dev_list_lock);
> > 
> > +	kref_get(&ns->ctrl->kref);
> > +
> >  	return ns;
> > 
> >  fail_put_ns:
> 
> Hey Ming.  This avoids the crash in nvme_rdma_free_qe(), but now I see another crash:
> 
> [  975.633436] nvme nvme0: new ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery", addr 10.0.1.14:4420
> [  978.463636] nvme nvme0: creating 32 I/O queues.
> [  979.187826] nvme nvme0: new ctrl: NQN "testnqn", addr 10.0.1.14:4420
> [  987.778287] nvme nvme0: Got rdma device removal event, deleting ctrl
> [  987.882202] BUG: unable to handle kernel paging request at ffff880e770e01f8
> [  987.890024] IP: [<ffffffffa03a1a46>] __ib_process_cq+0x46/0xc0 [ib_core]
> 
> This looks like another problem with freeing the tag sets before stopping the QP.  I thought we fixed that once and for all, but perhaps there is some other path we missed. :(

Sorry, the previous patch was wrong.
Here is the right one.

diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
index 1ad47c5..f13e3a6 100644
--- a/drivers/nvme/host/fabrics.c
+++ b/drivers/nvme/host/fabrics.c
@@ -845,6 +845,7 @@ static ssize_t nvmf_dev_write(struct file *file, const char __user *ubuf,
 		goto out_unlock;
 	}
 
+	kref_get(&ctrl->kref);
 	seq_file->private = ctrl;
 
 out_unlock:



* RE: crash on device removal
  2016-07-12 21:47       ` Ming Lin
@ 2016-07-12 22:17         ` Steve Wise
  -1 siblings, 0 replies; 20+ messages in thread
From: Steve Wise @ 2016-07-12 22:17 UTC (permalink / raw)
  To: 'Ming Lin'
  Cc: 'Christoph Hellwig',
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, 'sagig',
	linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

> On Tue, 2016-07-12 at 16:09 -0500, Steve Wise wrote:
> > > On Tue, 2016-07-12 at 11:34 -0500, Steve Wise wrote:
> > > > Hey Christoph,
> > > >
> > > > I see a crash when shutting down a nvme host node via 'reboot' that has 1
> target
> > > > device attached.  The shutdown causes iw_cxgb4 to be removed which
> triggers
> > > the
> > > > device removal logic in the nvmf rdma transport.  The crash is here:
> > > >
> > > > (gdb) list *nvme_rdma_free_qe+0x18
> > > > 0x1e8 is in nvme_rdma_free_qe (drivers/nvme/host/rdma.c:196).
> > > > 191     }
> > > > 192
> > > > 193     static void nvme_rdma_free_qe(struct ib_device *ibdev, struct
> > > > nvme_rdma_qe *qe,
> > > > 194                     size_t capsule_size, enum dma_data_direction dir)
> > > > 195     {
> > > > 196             ib_dma_unmap_single(ibdev, qe->dma, capsule_size, dir);
> > > > 197             kfree(qe->data);
> > > > 198     }
> > > > 199
> > > > 200     static int nvme_rdma_alloc_qe(struct ib_device *ibdev, struct
> > > > nvme_rdma_qe *qe,
> > > >
> > > > Apparently qe is NULL.
> > > >
> > > > Looking at the device removal path, the logic appears correct (see
> > > > nvme_rdma_device_unplug() and the nice function comment :) ).  I'm
> wondering
> > > if
> > > > concurrently to the host device removal path cleaning up queues, the target is
> > > > disconnecting all of its queues due to the first disconnect event from the host
> > > > causing some cleanup race on the host side?  Although since the removal path
> > > > executing in the cma event handler upcall, I don't think another thread would
> be
> > > > handling a disconnect event.  Maybe the qp async event handler flow?
> > > >
> > > > Thoughts?
> > >
> > > We actually missed a kref_get in nvme_get_ns_from_disk().
> > >
> > > This should fix it. Could you help to verify?
> > >
> > > diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> > > index 4babdf0..b146f52 100644
> > > --- a/drivers/nvme/host/core.c
> > > +++ b/drivers/nvme/host/core.c
> > > @@ -183,6 +183,8 @@ static struct nvme_ns *nvme_get_ns_from_disk(struct
> > > gendisk *disk)
> > >  	}
> > >  	spin_unlock(&dev_list_lock);
> > >
> > > +	kref_get(&ns->ctrl->kref);
> > > +
> > >  	return ns;
> > >
> > >  fail_put_ns:
> >
> > Hey Ming.  This avoids the crash in nvme_rdma_free_qe(), but now I see another
> crash:
> >
> > [  975.633436] nvme nvme0: new ctrl: NQN "nqn.2014-
> 08.org.nvmexpress.discovery", addr 10.0.1.14:4420
> > [  978.463636] nvme nvme0: creating 32 I/O queues.
> > [  979.187826] nvme nvme0: new ctrl: NQN "testnqn", addr 10.0.1.14:4420
> > [  987.778287] nvme nvme0: Got rdma device removal event, deleting ctrl
> > [  987.882202] BUG: unable to handle kernel paging request at ffff880e770e01f8
> > [  987.890024] IP: [<ffffffffa03a1a46>] __ib_process_cq+0x46/0xc0 [ib_core]
> >
> > This looks like another problem with freeing the tag sets before stopping the QP.
> I thought we fixed that once and for all, but perhaps there is some other path we
> missed. :(
> 
> Sorry, the previous patch was wrong.
> Here is the right one.
> 
> diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
> index 1ad47c5..f13e3a6 100644
> --- a/drivers/nvme/host/fabrics.c
> +++ b/drivers/nvme/host/fabrics.c
> @@ -845,6 +845,7 @@ static ssize_t nvmf_dev_write(struct file *file, const char
> __user *ubuf,
>  		goto out_unlock;
>  	}
> 
> +	kref_get(&ctrl->kref);
>  	seq_file->private = ctrl;
> 
>  out_unlock:

I still see the ib_poll_cq crash with this patch.  I'm adding new debug to figure out the flush issue...



* Re: crash on device removal
  2016-07-12 16:34 ` Steve Wise
@ 2016-07-13 10:02   ` Sagi Grimberg
  -1 siblings, 0 replies; 20+ messages in thread
From: Sagi Grimberg @ 2016-07-13 10:02 UTC (permalink / raw)
  To: Steve Wise, Christoph Hellwig
  Cc: linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA



On 12/07/16 19:34, Steve Wise wrote:
> Hey Christoph,
>
> I see a crash when shutting down a nvme host node via 'reboot' that has 1 target
> device attached.  The shutdown causes iw_cxgb4 to be removed which triggers the
> device removal logic in the nvmf rdma transport.  The crash is here:
>
> (gdb) list *nvme_rdma_free_qe+0x18
> 0x1e8 is in nvme_rdma_free_qe (drivers/nvme/host/rdma.c:196).
> 191     }
> 192
> 193     static void nvme_rdma_free_qe(struct ib_device *ibdev, struct
> nvme_rdma_qe *qe,
> 194                     size_t capsule_size, enum dma_data_direction dir)
> 195     {
> 196             ib_dma_unmap_single(ibdev, qe->dma, capsule_size, dir);
> 197             kfree(qe->data);
> 198     }
> 199
> 200     static int nvme_rdma_alloc_qe(struct ib_device *ibdev, struct
> nvme_rdma_qe *qe,
>
> Apparently qe is NULL.
>
> Looking at the device removal path, the logic appears correct (see
> nvme_rdma_device_unplug() and the nice function comment :) ).  I'm wondering if
> concurrently to the host device removal path cleaning up queues, the target is
> disconnecting all of its queues due to the first disconnect event from the host
> causing some cleanup race on the host side?  Although since the removal path
> executing in the cma event handler upcall, I don't think another thread would be
> handling a disconnect event.  Maybe the qp async event handler flow?

Hey Steve,

I never got this error (but I didn't test with cxgb4; did this happen
with mlx4/5?).  Can you track which qe it is?  Is it a request qe, a
rsp qe, or the async qe?

Also, it would be helpful to know which queue handled the event
(admin or io).
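
Something like a one-off print in the unplug path would answer both at once;
this assumes the driver's nvme_rdma_queue_idx() helper and the rsp_ring field,
and the print itself is just for this experiment:

	/* admin queue is index 0, I/O queues are 1..n */
	pr_info("nvme_rdma: unplug on queue %d, rsp_ring=%p\n",
		nvme_rdma_queue_idx(queue), queue->rsp_ring);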

* Re: crash on device removal
  2016-07-12 21:09     ` Steve Wise
@ 2016-07-13 10:06       ` Sagi Grimberg
  -1 siblings, 0 replies; 20+ messages in thread
From: Sagi Grimberg @ 2016-07-13 10:06 UTC (permalink / raw)
  To: Steve Wise, 'Ming Lin'
  Cc: 'Christoph Hellwig',
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r


>> We actually missed a kref_get in nvme_get_ns_from_disk().
>>
>> This should fix it. Could you help to verify?
>>
>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>> index 4babdf0..b146f52 100644
>> --- a/drivers/nvme/host/core.c
>> +++ b/drivers/nvme/host/core.c
>> @@ -183,6 +183,8 @@ static struct nvme_ns *nvme_get_ns_from_disk(struct
>> gendisk *disk)
>>   	}
>>   	spin_unlock(&dev_list_lock);
>>
>> +	kref_get(&ns->ctrl->kref);
>> +
>>   	return ns;
>>
>>   fail_put_ns:
>
> Hey Ming.  This avoids the crash in nvme_rdma_free_qe(), but now I see another crash:
>
> [  975.633436] nvme nvme0: new ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery", addr 10.0.1.14:4420
> [  978.463636] nvme nvme0: creating 32 I/O queues.
> [  979.187826] nvme nvme0: new ctrl: NQN "testnqn", addr 10.0.1.14:4420
> [  987.778287] nvme nvme0: Got rdma device removal event, deleting ctrl
> [  987.882202] BUG: unable to handle kernel paging request at ffff880e770e01f8
> [  987.890024] IP: [<ffffffffa03a1a46>] __ib_process_cq+0x46/0xc0 [ib_core]
>
> This looks like another problem with freeing the tag sets before stopping the QP.  I thought we fixed that once and for all, but perhaps there is some other path we missed. :(

The fix doesn't look right to me.  But I wonder how you got this crash
now?  If anything, this would delay the controller removal...

* RE: crash on device removal
  2016-07-13 10:02   ` Sagi Grimberg
@ 2016-07-13 15:05       ` Steve Wise
  -1 siblings, 0 replies; 20+ messages in thread
From: Steve Wise @ 2016-07-13 15:05 UTC (permalink / raw)
  To: 'Sagi Grimberg', 'Christoph Hellwig'
  Cc: linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

> On 12/07/16 19:34, Steve Wise wrote:
> > Hey Christoph,
> >
> > I see a crash when shutting down a nvme host node via 'reboot' that has 1
target
> > device attached.  The shutdown causes iw_cxgb4 to be removed which triggers
> the
> > device removal logic in the nvmf rdma transport.  The crash is here:
> >
> > (gdb) list *nvme_rdma_free_qe+0x18
> > 0x1e8 is in nvme_rdma_free_qe (drivers/nvme/host/rdma.c:196).
> > 191     }
> > 192
> > 193     static void nvme_rdma_free_qe(struct ib_device *ibdev, struct
> > nvme_rdma_qe *qe,
> > 194                     size_t capsule_size, enum dma_data_direction dir)
> > 195     {
> > 196             ib_dma_unmap_single(ibdev, qe->dma, capsule_size, dir);
> > 197             kfree(qe->data);
> > 198     }
> > 199
> > 200     static int nvme_rdma_alloc_qe(struct ib_device *ibdev, struct
> > nvme_rdma_qe *qe,
> >
> > Apparently qe is NULL.
> >
> > Looking at the device removal path, the logic appears correct (see
> > nvme_rdma_device_unplug() and the nice function comment :) ).  I'm wondering
> if
> > concurrently to the host device removal path cleaning up queues, the target
is
> > disconnecting all of its queues due to the first disconnect event from the
host
> > causing some cleanup race on the host side?  Although since the removal path
> > executing in the cma event handler upcall, I don't think another thread
would be
> > handling a disconnect event.  Maybe the qp async event handler flow?
> 
> Hey Steve,
> 
> I never got this error (but didn't test with cxgb4, did this happen
> with mlx4/5?).

Hey Sagi, I don't see this with mlx4. 

> Can you track which qe is it? is it a request qe? is
> it a rsp qe? is it the async qe?
> 

It happens due to the call to nvme_rdma_destroy_queue_ib() in
nvme_rdma_device_unplug().  So it is queue->rsp_ring.
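
Which would make sense if rsp_ring itself is already NULL at that point: the
ring free loop (roughly the shape below, from memory) would then hand
&ring[0] == NULL to nvme_rdma_free_qe(), and the faulting address of 0x10 is
consistent with the offset of qe->dma:

static void nvme_rdma_free_ring(struct ib_device *ibdev, struct nvme_rdma_qe *ring,
		size_t ib_queue_size, size_t capsule_size,
		enum dma_data_direction dir)
{
	int i;

	/* if ring is NULL, &ring[i] is a small offset from NULL and
	 * nvme_rdma_free_qe() oopses dereferencing qe->dma */
	for (i = 0; i < ib_queue_size; i++)
		nvme_rdma_free_qe(ibdev, &ring[i], capsule_size, dir);
	kfree(ring);
}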


> Also, it would be beneficial to know which queue handled the event
> (admin/io)?

How do I know this?

Steve.


* RE: crash on device removal
  2016-07-13 15:05       ` Steve Wise
@ 2016-07-13 16:17         ` Steve Wise
  -1 siblings, 0 replies; 20+ messages in thread
From: Steve Wise @ 2016-07-13 16:17 UTC (permalink / raw)
  To: 'Sagi Grimberg', 'Christoph Hellwig'
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

> > Hey Steve,
> >
> > I never got this error (but didn't test with cxgb4, did this happen
> > with mlx4/5?).
> 
> Hey Sagi, I don't see this with mlx4.
> 

Correction: I ran the mlx4_ib rmmod test again and hit it.



* RE: crash on device removal
       [not found] <00cc01d1dc5b$51c7fa90$f557efb0$@opengridcomputing.com>
@ 2016-07-12 16:38   ` Steve Wise
  0 siblings, 0 replies; 20+ messages in thread
From: Steve Wise @ 2016-07-12 16:38 UTC (permalink / raw)
  To: 'Christoph Hellwig'
  Cc: 'sagig',
	linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

> 
> Hey Christoph,
> 
> I see a crash when shutting down a nvme host node via 'reboot' that has 1
> target device attached.  The shutdown causes iw_cxgb4 to be removed which
> triggers the device removal logic in the nvmf rdma transport.  The crash
> is here:
> 
> (gdb) list *nvme_rdma_free_qe+0x18
> 0x1e8 is in nvme_rdma_free_qe (drivers/nvme/host/rdma.c:196).
> 191     }
> 192
> 193     static void nvme_rdma_free_qe(struct ib_device *ibdev, struct
> nvme_rdma_qe *qe,
> 194                     size_t capsule_size, enum dma_data_direction dir)
> 195     {
> 196             ib_dma_unmap_single(ibdev, qe->dma, capsule_size, dir);
> 197             kfree(qe->data);
> 198     }
> 199
> 200     static int nvme_rdma_alloc_qe(struct ib_device *ibdev, struct
> nvme_rdma_qe *qe,
> 
> Apparently qe is NULL.
> 
> Looking at the device removal path, the logic appears correct (see
> nvme_rdma_device_unplug() and the nice function comment :) ).  I'm
> wondering if concurrently to the host device removal path cleaning up
> queues, the target is disconnecting all of its queues due to the first
> disconnect event from the host causing some cleanup race on the host side?
> Although since the removal path executing in the cma event handler upcall,
> I don't think another thread would be handling a disconnect event.  Maybe
> the qp async event handler flow?
>

I see that the async event handler, nvme_rdma_qp_event(), does nothing but a
pr_debug(), so there is no race between the cm event handler thread and the
async event handler thread...
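
For reference, the handler is basically just this (paraphrasing from the
driver; the exact message may differ):

static void nvme_rdma_qp_event(struct ib_event *event, void *context)
{
	pr_debug("QP event %d\n", event->event);
}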

 
> Thoughts?
> 
> Here is the Oops:
> 
> [  710.929451] iw_cxgb4:0000:83:00.4: Detach
> [  711.242989] iw_cxgb4:0000:82:00.4: Detach
> [  711.247039] nvme nvme1: Got rdma device removal event, deleting ctrl
> [  711.298244] BUG: unable to handle kernel NULL pointer dereference at
> 0000000000000010
> [  711.306162] IP: [<ffffffffa039a1e8>] nvme_rdma_free_qe+0x18/0x80
> [nvme_rdma]
> [  711.313286] PGD 0
> [  711.315348] Oops: 0000 [#1] SMP
> [  711.318519] Modules linked in: nvme_rdma nvme_fabrics brd iw_cxgb4
> cxgb4 ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE
> nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4
> nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT nf_reject_ipv4 xt_CHECKSUM
> iptable_mangle iptable_filter ip_tables bridge 8021q mrp garp stp llc
> cachefiles fscache rdma_ucm rdma_cm iw_cm ib_ipoib ib_cm ib_uverbs ib_umad
> ocrdma be2net iw_nes libcrc32c iw_cxgb3 cxgb3 mdio ib_qib rdmavt mlx5_ib
> mlx5_core mlx4_en ib_mthca binfmt_misc dm_mirror dm_region_hash dm_log
> vhost_net macvtap macvlan vhost tun kvm irqbypass uinput iTCO_wdt
> iTCO_vendor_support mxm_wmi pcspkr mlx4_ib ib_core mlx4_core dm_mod
> i2c_i801 sg ipmi_ssif ipmi_si ipmi_msghandler nvme nvme_core lpc_ich
> mfd_core mei_me mei igb dca ptp pps_core wmi ext4(E) mbcache(E) jbd2(E)
> sd_mod(E) ahci(E) libahci(E) libata(E) mgag200(E) ttm(E) drm_kms_helper(E)
> drm(E) fb_sys_fops(E) sysimgblt(E) sysfillrect(E) syscopyarea(E)
> i2c_algo_bit(E) i2c_core(E) [last unloaded: cxgb4]
> [  711.412158] CPU: 0 PID: 4213 Comm: reboot Tainted: G            E
> 4.7.0-rc2-block-for-next+ #77
> [  711.421064] Hardware name: Supermicro X9DR3-F/X9DR3-F, BIOS 3.2a
> 07/09/2015
> [  711.428058] task: ffff881033b495c0 ti: ffff88100fc24000 task.ti:
> ffff88100fc24000
> [  711.435563] RIP: 0010:[<ffffffffa039a1e8>]  [<ffffffffa039a1e8>]
> nvme_rdma_free_qe+0x18/0x80 [nvme_rdma]
> [  711.445104] RSP: 0018:ffff88100fc279a8  EFLAGS: 00010292
> [  711.450442] RAX: 0000000000000000 RBX: 0000000000000000 RCX:
> 0000000000000002
> [  711.457608] RDX: 0000000000000010 RSI: 0000000000000000 RDI:
> ffff881034168000
> [  711.464775] RBP: ffff88100fc279b8 R08: 0000000000000001 R09:
> ffffea0001e51d10
> [  711.471943] R10: ffffea0001e51d18 R11: 0000000000000000 R12:
> 0000000000000000
> [  711.479112] R13: 0000000000000020 R14: ffff881034168000 R15:
> ffff8810345b8140
> [  711.486285] FS:  00007feac7042700(0000) GS:ffff88103ee00000(0000)
> knlGS:0000000000000000
> [  711.494405] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  711.500175] CR2: 0000000000000010 CR3: 00000010229d7000 CR4:
> 00000000000406f0
> [  711.507341] Stack:
> [  711.509367]  ffff881034285000 0000000000000001 ffff88100fc279f8
> ffffffffa039adcf
> [  711.516868]  ffff88100fc279d8 ffff881034285000 ffff881037f9f000
> ffff881034272c00
> [  711.524384]  ffff88100fc27b18 ffff881034272dd8 ffff88100fc27a88
> ffffffffa039c8f5
> [  711.531897] Call Trace:
> [  711.534371]  [<ffffffffa039adcf>] nvme_rdma_destroy_queue_ib+0x5f/0x90
> [nvme_rdma]
> [  711.541972]  [<ffffffffa039c8f5>] nvme_rdma_cm_handler+0x2c5/0x340
> [nvme_rdma]
> [  711.549228]  [<ffffffff811ff71d>] ? kmem_cache_free+0x1dd/0x200
> [  711.555177]  [<ffffffffa070e669>] ? cma_comp+0x49/0x60 [rdma_cm]
> [  711.561217]  [<ffffffffa071310f>] cma_remove_id_dev+0x8f/0xa0 [rdma_cm]
> [  711.567860]  [<ffffffffa07131d7>] cma_process_remove+0xb7/0x100
> [rdma_cm]
> [  711.574678]  [<ffffffff812a4de4>] ? __kernfs_remove+0x114/0x1d0
> [  711.580626]  [<ffffffffa071325e>] cma_remove_one+0x3e/0x60 [rdma_cm]
> [  711.587015]  [<ffffffffa03b8ca0>] ib_unregister_device+0xb0/0x150
> [ib_core]
> [  711.595252]  [<ffffffffa0816034>] c4iw_unregister_device+0x64/0x90
> [iw_cxgb4]
> [  711.603648]  [<ffffffffa0809357>] c4iw_remove+0x27/0x60 [iw_cxgb4]
> [  711.611069]  [<ffffffffa080a061>] c4iw_uld_state_change+0x111/0x250
> [iw_cxgb4]
> [  711.619532]  [<ffffffff816da18d>] ? _cond_resched+0x1d/0x30
> [  711.626317]  [<ffffffff81371971>] ? list_del+0x11/0x40
> [  711.632678]  [<ffffffffa07ce71a>] detach_ulds+0x4a/0xf0 [cxgb4]
> [  711.639822]  [<ffffffffa07ce94d>] remove_one+0x18d/0x1b0 [cxgb4]
> [  711.647060]  [<ffffffff81397c21>] pci_device_shutdown+0x41/0x90
> [  711.654189]  [<ffffffff814861f5>] device_shutdown+0x45/0x1b0
> [  711.661051]  [<ffffffff810ac746>] kernel_restart_prepare+0x36/0x40
> [  711.668414]  [<ffffffff810ac8c6>] kernel_restart+0x16/0x60
> [  711.675084]  [<ffffffff810acb15>] SYSC_reboot+0x1a5/0x230
> [  711.681645]  [<ffffffff81245ad1>] ? mntput+0x21/0x30
> [  711.687738]  [<ffffffff812267a7>] ? __fput+0x177/0x240
> [  711.693964]  [<ffffffff8122691e>] ? ____fput+0xe/0x10
> [  711.700097]  [<ffffffff81003476>] ? do_audit_syscall_entry+0x66/0x70
> [  711.707481]  [<ffffffff81003578>] ?
> syscall_trace_enter_phase1+0xf8/0x120
> [  711.715273]  [<ffffffff81003344>] ? exit_to_usermode_loop+0x74/0xf0
> [  711.722514]  [<ffffffff810acbae>] SyS_reboot+0xe/0x10
> [  711.728517]  [<ffffffff81003f08>] do_syscall_64+0x78/0x1d0
> [  711.734931]  [<ffffffff8106e327>] ? do_page_fault+0x37/0x90
> [  711.741410]  [<ffffffff816ddee1>] entry_SYSCALL64_slow_path+0x25/0x25
> [  711.748731] Code: 01 00 00 c9 c3 0f 0b eb fe 66 2e 0f 1f 84 00 00 00 00
> 00 55 48 89 e5 53 48 83 ec 08 66 66 66 66 90 48 8b 87 f0 02 00 00 48 89 f3
> <48> 8b 76 10 48 85 c0 74 13 ff 50 10 48 8b 7b 08 e8 93 4d e6 e0
> [  711.770832] RIP  [<ffffffffa039a1e8>] nvme_rdma_free_qe+0x18/0x80
> [nvme_rdma]
> [  711.778904]  RSP <ffff88100fc279a8>
> [  711.783290] CR2: 0000000000000010



end of thread

Thread overview: 20+ messages
2016-07-12 16:34 crash on device removal Steve Wise
2016-07-12 20:40 ` Ming Lin
2016-07-12 21:09   ` Steve Wise
2016-07-12 21:47     ` Ming Lin
2016-07-12 22:17       ` Steve Wise
2016-07-13 10:06     ` Sagi Grimberg
2016-07-13 10:02 ` Sagi Grimberg
     [not found]   ` <578611AA.4000103-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
2016-07-13 15:05     ` Steve Wise
2016-07-13 16:17       ` Steve Wise
     [not found] <00cc01d1dc5b$51c7fa90$f557efb0$@opengridcomputing.com>
2016-07-12 16:38 ` Steve Wise
