* [PATCH] nvme-multipath: fix crash in nvme_mpath_clear_ctrl_paths
@ 2019-11-02 0:27 Anton Eidelman
2019-11-03 14:31 ` Sagi Grimberg
2019-11-04 23:55 ` Christoph Hellwig
0 siblings, 2 replies; 3+ messages in thread
From: Anton Eidelman @ 2019-11-02 0:27 UTC (permalink / raw)
To: linux-nvme, hch, keith.busch, sagi, hare; +Cc: Anton Eidelman
nvme_mpath_clear_ctrl_paths() iterates through
the ctrl->namespaces list while holding ctrl->scan_lock.
This does not seem to be the correct way of protecting
from concurrent list modification.
Specifically, nvme_scan_work() sorts ctrl->namespaces
AFTER unlocking scan_lock.
This may result in the following (rare) crash in ctrl disconnect
during scan_work:
BUG: kernel NULL pointer dereference, address: 0000000000000050
Oops: 0000 [#1] SMP PTI
CPU: 0 PID: 3995 Comm: nvme 5.3.5-050305-generic
RIP: 0010:nvme_mpath_clear_current_path+0xe/0x90 [nvme_core]
...
Call Trace:
nvme_mpath_clear_ctrl_paths+0x3c/0x70 [nvme_core]
nvme_remove_namespaces+0x35/0xe0 [nvme_core]
nvme_do_delete_ctrl+0x47/0x90 [nvme_core]
nvme_sysfs_delete+0x49/0x60 [nvme_core]
dev_attr_store+0x17/0x30
sysfs_kf_write+0x3e/0x50
kernfs_fop_write+0x11e/0x1a0
__vfs_write+0x1b/0x40
vfs_write+0xb9/0x1a0
ksys_write+0x67/0xe0
__x64_sys_write+0x1a/0x20
do_syscall_64+0x5a/0x130
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f8d02bfb154
Fix:
After taking scan_lock in nvme_mpath_clear_ctrl_paths()
down_read(&ctrl->namespaces_rwsem) as well to make list traversal safe.
This will not cause deadlocks because taking scan_lock never happens
while holding the namespaces_rwsem.
Moreover, scan work downs namespaces_rwsem in the same order.
Alternative: sort ctrl->namespaces in nvme_scan_work()
while still holding the scan_lock.
This would leave nvme_mpath_clear_ctrl_paths() without correct protection
against ctrl->namespaces modification by anyone other than scan_work.
Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>
---
drivers/nvme/host/multipath.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 682be6195a95..cb00afac516d 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -159,9 +159,11 @@ void nvme_mpath_clear_ctrl_paths(struct nvme_ctrl *ctrl)
struct nvme_ns *ns;
mutex_lock(&ctrl->scan_lock);
+ down_read(&ctrl->namespaces_rwsem);
list_for_each_entry(ns, &ctrl->namespaces, list)
if (nvme_mpath_clear_current_path(ns))
kblockd_schedule_work(&ns->head->requeue_work);
+ up_read(&ctrl->namespaces_rwsem);
mutex_unlock(&ctrl->scan_lock);
}
--
2.14.1
_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH] nvme-multipath: fix crash in nvme_mpath_clear_ctrl_paths
2019-11-02 0:27 [PATCH] nvme-multipath: fix crash in nvme_mpath_clear_ctrl_paths Anton Eidelman
@ 2019-11-03 14:31 ` Sagi Grimberg
2019-11-04 23:55 ` Christoph Hellwig
1 sibling, 0 replies; 3+ messages in thread
From: Sagi Grimberg @ 2019-11-03 14:31 UTC (permalink / raw)
To: Anton Eidelman, linux-nvme, hch, keith.busch, hare
> nvme_mpath_clear_ctrl_paths() iterates through
> the ctrl->namespaces list while holding ctrl->scan_lock.
> This does not seem to be the correct way of protecting
> from concurrent list modification.
>
> Specifically, nvme_scan_work() sorts ctrl->namespaces
> AFTER unlocking scan_lock.
>
> This may result in the following (rare) crash in ctrl disconnect
> during scan_work:
>
> BUG: kernel NULL pointer dereference, address: 0000000000000050
> Oops: 0000 [#1] SMP PTI
> CPU: 0 PID: 3995 Comm: nvme 5.3.5-050305-generic
> RIP: 0010:nvme_mpath_clear_current_path+0xe/0x90 [nvme_core]
> ...
> Call Trace:
> nvme_mpath_clear_ctrl_paths+0x3c/0x70 [nvme_core]
> nvme_remove_namespaces+0x35/0xe0 [nvme_core]
> nvme_do_delete_ctrl+0x47/0x90 [nvme_core]
> nvme_sysfs_delete+0x49/0x60 [nvme_core]
> dev_attr_store+0x17/0x30
> sysfs_kf_write+0x3e/0x50
> kernfs_fop_write+0x11e/0x1a0
> __vfs_write+0x1b/0x40
> vfs_write+0xb9/0x1a0
> ksys_write+0x67/0xe0
> __x64_sys_write+0x1a/0x20
> do_syscall_64+0x5a/0x130
> entry_SYSCALL_64_after_hwframe+0x44/0xa9
> RIP: 0033:0x7f8d02bfb154
>
> Fix:
> After taking scan_lock in nvme_mpath_clear_ctrl_paths()
> down_read(&ctrl->namespaces_rwsem) as well to make list traversal safe.
> This will not cause deadlocks because taking scan_lock never happens
> while holding the namespaces_rwsem.
> Moreover, scan work downs namespaces_rwsem in the same order.
Thanks Anton, this looks good to me,
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] nvme-multipath: fix crash in nvme_mpath_clear_ctrl_paths
2019-11-02 0:27 [PATCH] nvme-multipath: fix crash in nvme_mpath_clear_ctrl_paths Anton Eidelman
2019-11-03 14:31 ` Sagi Grimberg
@ 2019-11-04 23:55 ` Christoph Hellwig
1 sibling, 0 replies; 3+ messages in thread
From: Christoph Hellwig @ 2019-11-04 23:55 UTC (permalink / raw)
To: Anton Eidelman; +Cc: keith.busch, hare, hch, linux-nvme, sagi
Looks good,
Reviewed-by: Christoph Hellwig <hch@lst.de>
_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2019-11-04 23:55 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-02 0:27 [PATCH] nvme-multipath: fix crash in nvme_mpath_clear_ctrl_paths Anton Eidelman
2019-11-03 14:31 ` Sagi Grimberg
2019-11-04 23:55 ` Christoph Hellwig
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).