Linux-NVME Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH 0/7 V3] nvme: Fixes for deleting a ctrl before it was created
@ 2020-03-24 15:29 Israel Rukshin
  2020-03-24 15:29 ` [PATCH 1/7] nvme: Remove unused return code from nvme_delete_ctrl_sync Israel Rukshin
                   ` (7 more replies)
  0 siblings, 8 replies; 14+ messages in thread
From: Israel Rukshin @ 2020-03-24 15:29 UTC (permalink / raw)
  To: Linux-nvme, Sagi Grimberg, Christoph Hellwig
  Cc: Shlomi Nimrodi, Israel Rukshin, Max Gurtovoy

The patchset starts with small cleanups, continue with the fixs and ends
with adding new warnings.
Calling nvme_sysfs_delete() when the controller is in the middle of
creation may cause several bugs. If the controller is in NEW state we
remove delete_controller file and don't delete the controller. The user
will not be able to use nvme disconnect command on that controller again,
although the controller may be active. Other bugs may happen if the
controller is in the middle of create_ctrl callback and
nvme_do_delete_ctrl() starts. For example, freeing I/O tagset at
nvme_do_delete_ctrl() before it was created or controller use after free
at create_ctrl callback.

Changes from v2:
 - Added Reviewed-by signatures
 - Patches reordering
 - Don't move the calls from the ->remove PCI driver method to the
   nvme_ctrl_ops ->free_ctrl method ((Patch 2/7))
 - Add more details to commit message (Patch 3/7)
 - Split patch "Fix ctrl use-after-free during sysfs deletion"
   (Patches 3/7 and 4/7)

Changes from v1:
 - Added Reviewed-by signatures
 - Patches reordering
 - New patches (2/6 and 3/6)
 - Split to 2 WARN_ON_ONCE (Patches 5/6 and 6/6)

Israel Rukshin (7):
  nvme: Remove unused return code from nvme_delete_ctrl_sync
  nvme-pci: Re-order nvme_pci_free_ctrl
  nvme: Fix ctrl use-after-free during sysfs deletion
  nvme: Make nvme_uninit_ctrl symmetric to nvme_init_ctrl
  nvme: Fix controller creation races with teardown flow
  nvme-rdma: Add warning on state change failure at nvme_rdma_setup_ctrl
  nvme-tcp: Add warning on state change failure at nvme_tcp_setup_ctrl

 drivers/nvme/host/core.c   | 18 ++++++++++--------
 drivers/nvme/host/fc.c     |  3 ---
 drivers/nvme/host/nvme.h   |  1 +
 drivers/nvme/host/pci.c    |  8 +++-----
 drivers/nvme/host/rdma.c   |  9 ++++++---
 drivers/nvme/host/tcp.c    |  9 ++++++---
 drivers/nvme/target/loop.c |  3 ---
 7 files changed, 26 insertions(+), 25 deletions(-)

-- 
1.8.3.1


_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/7] nvme: Remove unused return code from nvme_delete_ctrl_sync
  2020-03-24 15:29 [PATCH 0/7 V3] nvme: Fixes for deleting a ctrl before it was created Israel Rukshin
@ 2020-03-24 15:29 ` Israel Rukshin
  2020-03-24 15:29 ` [PATCH 2/7] nvme-pci: Re-order nvme_pci_free_ctrl Israel Rukshin
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Israel Rukshin @ 2020-03-24 15:29 UTC (permalink / raw)
  To: Linux-nvme, Sagi Grimberg, Christoph Hellwig
  Cc: Shlomi Nimrodi, Israel Rukshin, Max Gurtovoy

The return code of nvme_delete_ctrl_sync is never used, so change it to
void.

Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/nvme/host/core.c | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 68e7c75..a461220 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -192,21 +192,16 @@ int nvme_delete_ctrl(struct nvme_ctrl *ctrl)
 }
 EXPORT_SYMBOL_GPL(nvme_delete_ctrl);
 
-static int nvme_delete_ctrl_sync(struct nvme_ctrl *ctrl)
+static void nvme_delete_ctrl_sync(struct nvme_ctrl *ctrl)
 {
-	int ret = 0;
-
 	/*
 	 * Keep a reference until nvme_do_delete_ctrl() complete,
 	 * since ->delete_ctrl can free the controller.
 	 */
 	nvme_get_ctrl(ctrl);
-	if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_DELETING))
-		ret = -EBUSY;
-	if (!ret)
+	if (nvme_change_ctrl_state(ctrl, NVME_CTRL_DELETING))
 		nvme_do_delete_ctrl(ctrl);
 	nvme_put_ctrl(ctrl);
-	return ret;
 }
 
 static inline bool nvme_ns_has_pi(struct nvme_ns *ns)
-- 
1.8.3.1


_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 2/7] nvme-pci: Re-order nvme_pci_free_ctrl
  2020-03-24 15:29 [PATCH 0/7 V3] nvme: Fixes for deleting a ctrl before it was created Israel Rukshin
  2020-03-24 15:29 ` [PATCH 1/7] nvme: Remove unused return code from nvme_delete_ctrl_sync Israel Rukshin
@ 2020-03-24 15:29 ` Israel Rukshin
  2020-03-24 15:42   ` Christoph Hellwig
  2020-03-24 15:29 ` [PATCH 3/7] nvme: Fix ctrl use-after-free during sysfs deletion Israel Rukshin
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 14+ messages in thread
From: Israel Rukshin @ 2020-03-24 15:29 UTC (permalink / raw)
  To: Linux-nvme, Sagi Grimberg, Christoph Hellwig
  Cc: Shlomi Nimrodi, Israel Rukshin, Max Gurtovoy

Destroy the resources in the same order like in nvme_probe error flow to
improve code readability.

Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
---
 drivers/nvme/host/pci.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index e6fa0c7..ff0bd2d 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2470,13 +2470,13 @@ static void nvme_pci_free_ctrl(struct nvme_ctrl *ctrl)
 	struct nvme_dev *dev = to_nvme_dev(ctrl);
 
 	nvme_dbbuf_dma_free(dev);
-	put_device(dev->dev);
 	nvme_free_tagset(dev);
 	if (dev->ctrl.admin_q)
 		blk_put_queue(dev->ctrl.admin_q);
-	kfree(dev->queues);
 	free_opal_dev(dev->ctrl.opal_dev);
 	mempool_destroy(dev->iod_mempool);
+	put_device(dev->dev);
+	kfree(dev->queues);
 	kfree(dev);
 }
 
-- 
1.8.3.1


_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 3/7] nvme: Fix ctrl use-after-free during sysfs deletion
  2020-03-24 15:29 [PATCH 0/7 V3] nvme: Fixes for deleting a ctrl before it was created Israel Rukshin
  2020-03-24 15:29 ` [PATCH 1/7] nvme: Remove unused return code from nvme_delete_ctrl_sync Israel Rukshin
  2020-03-24 15:29 ` [PATCH 2/7] nvme-pci: Re-order nvme_pci_free_ctrl Israel Rukshin
@ 2020-03-24 15:29 ` Israel Rukshin
  2020-03-24 15:42   ` Christoph Hellwig
  2020-03-24 15:29 ` [PATCH 4/7] nvme: Make nvme_uninit_ctrl symmetric to nvme_init_ctrl Israel Rukshin
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 14+ messages in thread
From: Israel Rukshin @ 2020-03-24 15:29 UTC (permalink / raw)
  To: Linux-nvme, Sagi Grimberg, Christoph Hellwig
  Cc: Shlomi Nimrodi, Israel Rukshin, Max Gurtovoy

In case nvme_sysfs_delete() is called by the user before taking the ctrl
reference count, the ctrl may be freed during the creation and cause the
bug. Take the reference as soon as the controller is externally visible,
which is done by cdev_device_add() in nvme_init_ctrl(). Also take the
reference count at the core layer instead of taking it on each transport
separately.

Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
---
 drivers/nvme/host/core.c   | 2 ++
 drivers/nvme/host/fc.c     | 4 +---
 drivers/nvme/host/pci.c    | 1 -
 drivers/nvme/host/rdma.c   | 3 +--
 drivers/nvme/host/tcp.c    | 3 +--
 drivers/nvme/target/loop.c | 3 +--
 6 files changed, 6 insertions(+), 10 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index a461220..5ccbcac 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -4130,6 +4130,7 @@ int nvme_init_ctrl(struct nvme_ctrl *ctrl, struct device *dev,
 	if (ret)
 		goto out_release_instance;
 
+	nvme_get_ctrl(ctrl);
 	cdev_init(&ctrl->cdev, &nvme_dev_fops);
 	ctrl->cdev.owner = ops->module;
 	ret = cdev_device_add(&ctrl->cdev, ctrl->device);
@@ -4148,6 +4149,7 @@ int nvme_init_ctrl(struct nvme_ctrl *ctrl, struct device *dev,
 
 	return 0;
 out_free_name:
+	nvme_put_ctrl(ctrl);
 	kfree_const(ctrl->device->kobj.name);
 out_release_instance:
 	ida_simple_remove(&nvme_instance_ida, ctrl->instance);
diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 5a70ac3..59d2e2b 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -3181,10 +3181,7 @@ enum {
 		goto fail_ctrl;
 	}
 
-	nvme_get_ctrl(&ctrl->ctrl);
-
 	if (!queue_delayed_work(nvme_wq, &ctrl->connect_work, 0)) {
-		nvme_put_ctrl(&ctrl->ctrl);
 		dev_err(ctrl->ctrl.device,
 			"NVME-FC{%d}: failed to schedule initial connect\n",
 			ctrl->cnum);
@@ -3209,6 +3206,7 @@ enum {
 
 	/* initiate nvme ctrl ref counting teardown */
 	nvme_uninit_ctrl(&ctrl->ctrl);
+	nvme_put_ctrl(&ctrl->ctrl);
 
 	/* Remove core ctrl ref. */
 	nvme_put_ctrl(&ctrl->ctrl);
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index ff0bd2d..4e062c3 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2802,7 +2802,6 @@ static int nvme_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	dev_info(dev->ctrl.device, "pci function %s\n", dev_name(&pdev->dev));
 
 	nvme_reset_ctrl(&dev->ctrl);
-	nvme_get_ctrl(&dev->ctrl);
 	async_schedule(nvme_async_probe, dev);
 
 	return 0;
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 3e85c5c..ca782de 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -2043,8 +2043,6 @@ static struct nvme_ctrl *nvme_rdma_create_ctrl(struct device *dev,
 	dev_info(ctrl->ctrl.device, "new ctrl: NQN \"%s\", addr %pISpcs\n",
 		ctrl->ctrl.opts->subsysnqn, &ctrl->addr);
 
-	nvme_get_ctrl(&ctrl->ctrl);
-
 	mutex_lock(&nvme_rdma_ctrl_mutex);
 	list_add_tail(&ctrl->list, &nvme_rdma_ctrl_list);
 	mutex_unlock(&nvme_rdma_ctrl_mutex);
@@ -2054,6 +2052,7 @@ static struct nvme_ctrl *nvme_rdma_create_ctrl(struct device *dev,
 out_uninit_ctrl:
 	nvme_uninit_ctrl(&ctrl->ctrl);
 	nvme_put_ctrl(&ctrl->ctrl);
+	nvme_put_ctrl(&ctrl->ctrl);
 	if (ret > 0)
 		ret = -EIO;
 	return ERR_PTR(ret);
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 814ea23..202f08e2 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -2429,8 +2429,6 @@ static struct nvme_ctrl *nvme_tcp_create_ctrl(struct device *dev,
 	dev_info(ctrl->ctrl.device, "new ctrl: NQN \"%s\", addr %pISp\n",
 		ctrl->ctrl.opts->subsysnqn, &ctrl->addr);
 
-	nvme_get_ctrl(&ctrl->ctrl);
-
 	mutex_lock(&nvme_tcp_ctrl_mutex);
 	list_add_tail(&ctrl->list, &nvme_tcp_ctrl_list);
 	mutex_unlock(&nvme_tcp_ctrl_mutex);
@@ -2440,6 +2438,7 @@ static struct nvme_ctrl *nvme_tcp_create_ctrl(struct device *dev,
 out_uninit_ctrl:
 	nvme_uninit_ctrl(&ctrl->ctrl);
 	nvme_put_ctrl(&ctrl->ctrl);
+	nvme_put_ctrl(&ctrl->ctrl);
 	if (ret > 0)
 		ret = -EIO;
 	return ERR_PTR(ret);
diff --git a/drivers/nvme/target/loop.c b/drivers/nvme/target/loop.c
index 4df4ebd..a425e28 100644
--- a/drivers/nvme/target/loop.c
+++ b/drivers/nvme/target/loop.c
@@ -618,8 +618,6 @@ static struct nvme_ctrl *nvme_loop_create_ctrl(struct device *dev,
 	dev_info(ctrl->ctrl.device,
 		 "new ctrl: \"%s\"\n", ctrl->ctrl.opts->subsysnqn);
 
-	nvme_get_ctrl(&ctrl->ctrl);
-
 	changed = nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_LIVE);
 	WARN_ON_ONCE(!changed);
 
@@ -637,6 +635,7 @@ static struct nvme_ctrl *nvme_loop_create_ctrl(struct device *dev,
 	kfree(ctrl->queues);
 out_uninit_ctrl:
 	nvme_uninit_ctrl(&ctrl->ctrl);
+	nvme_put_ctrl(&ctrl->ctrl);
 out_put_ctrl:
 	nvme_put_ctrl(&ctrl->ctrl);
 	if (ret > 0)
-- 
1.8.3.1


_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 4/7] nvme: Make nvme_uninit_ctrl symmetric to nvme_init_ctrl
  2020-03-24 15:29 [PATCH 0/7 V3] nvme: Fixes for deleting a ctrl before it was created Israel Rukshin
                   ` (2 preceding siblings ...)
  2020-03-24 15:29 ` [PATCH 3/7] nvme: Fix ctrl use-after-free during sysfs deletion Israel Rukshin
@ 2020-03-24 15:29 ` Israel Rukshin
  2020-03-24 15:42   ` Christoph Hellwig
  2020-03-24 15:29 ` [PATCH 5/7] nvme: Fix controller creation races with teardown flow Israel Rukshin
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 14+ messages in thread
From: Israel Rukshin @ 2020-03-24 15:29 UTC (permalink / raw)
  To: Linux-nvme, Sagi Grimberg, Christoph Hellwig
  Cc: Shlomi Nimrodi, Israel Rukshin, Max Gurtovoy

Put the ctrl reference count at nvme_uninit_ctrl as opposed to
nvme_init_ctrl which takes it. This decrease the reference count at the
core layer instead of decreasing it on each transport separately.
Also move the call of nvme_uninit_ctrl at PCI driver after calling to
nvme_release_prp_pools and nvme_dev_unmap, in order to put the reference
count after using the dev. This is safe because those functions use
nvme_dev which is freed only later at nvme_pci_free_ctrl.

Signed-off-by: Israel Rukshin <israelr@mellanox.com>
---
 drivers/nvme/host/core.c   | 2 +-
 drivers/nvme/host/fc.c     | 1 -
 drivers/nvme/host/pci.c    | 3 +--
 drivers/nvme/host/rdma.c   | 1 -
 drivers/nvme/host/tcp.c    | 1 -
 drivers/nvme/target/loop.c | 2 --
 6 files changed, 2 insertions(+), 8 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 5ccbcac..ba064fd 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -171,7 +171,6 @@ static void nvme_do_delete_ctrl(struct nvme_ctrl *ctrl)
 	nvme_remove_namespaces(ctrl);
 	ctrl->ops->delete_ctrl(ctrl);
 	nvme_uninit_ctrl(ctrl);
-	nvme_put_ctrl(ctrl);
 }
 
 static void nvme_delete_ctrl_work(struct work_struct *work)
@@ -4048,6 +4047,7 @@ void nvme_uninit_ctrl(struct nvme_ctrl *ctrl)
 	nvme_fault_inject_fini(&ctrl->fault_inject);
 	dev_pm_qos_hide_latency_tolerance(ctrl->device);
 	cdev_device_del(&ctrl->cdev, ctrl->device);
+	nvme_put_ctrl(ctrl);
 }
 EXPORT_SYMBOL_GPL(nvme_uninit_ctrl);
 
diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 59d2e2b..a8bf2fb 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -3206,7 +3206,6 @@ enum {
 
 	/* initiate nvme ctrl ref counting teardown */
 	nvme_uninit_ctrl(&ctrl->ctrl);
-	nvme_put_ctrl(&ctrl->ctrl);
 
 	/* Remove core ctrl ref. */
 	nvme_put_ctrl(&ctrl->ctrl);
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 4e062c3..4e79e41 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2873,10 +2873,9 @@ static void nvme_remove(struct pci_dev *pdev)
 	nvme_free_host_mem(dev);
 	nvme_dev_remove_admin(dev);
 	nvme_free_queues(dev, 0);
-	nvme_uninit_ctrl(&dev->ctrl);
 	nvme_release_prp_pools(dev);
 	nvme_dev_unmap(dev);
-	nvme_put_ctrl(&dev->ctrl);
+	nvme_uninit_ctrl(&dev->ctrl);
 }
 
 #ifdef CONFIG_PM_SLEEP
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index ca782de..c99a882 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -2052,7 +2052,6 @@ static struct nvme_ctrl *nvme_rdma_create_ctrl(struct device *dev,
 out_uninit_ctrl:
 	nvme_uninit_ctrl(&ctrl->ctrl);
 	nvme_put_ctrl(&ctrl->ctrl);
-	nvme_put_ctrl(&ctrl->ctrl);
 	if (ret > 0)
 		ret = -EIO;
 	return ERR_PTR(ret);
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 202f08e2..2fc2687 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -2438,7 +2438,6 @@ static struct nvme_ctrl *nvme_tcp_create_ctrl(struct device *dev,
 out_uninit_ctrl:
 	nvme_uninit_ctrl(&ctrl->ctrl);
 	nvme_put_ctrl(&ctrl->ctrl);
-	nvme_put_ctrl(&ctrl->ctrl);
 	if (ret > 0)
 		ret = -EIO;
 	return ERR_PTR(ret);
diff --git a/drivers/nvme/target/loop.c b/drivers/nvme/target/loop.c
index a425e28..0d54e73 100644
--- a/drivers/nvme/target/loop.c
+++ b/drivers/nvme/target/loop.c
@@ -485,7 +485,6 @@ static void nvme_loop_reset_ctrl_work(struct work_struct *work)
 out_disable:
 	dev_warn(ctrl->ctrl.device, "Removing after reset failure\n");
 	nvme_uninit_ctrl(&ctrl->ctrl);
-	nvme_put_ctrl(&ctrl->ctrl);
 }
 
 static const struct nvme_ctrl_ops nvme_loop_ctrl_ops = {
@@ -635,7 +634,6 @@ static struct nvme_ctrl *nvme_loop_create_ctrl(struct device *dev,
 	kfree(ctrl->queues);
 out_uninit_ctrl:
 	nvme_uninit_ctrl(&ctrl->ctrl);
-	nvme_put_ctrl(&ctrl->ctrl);
 out_put_ctrl:
 	nvme_put_ctrl(&ctrl->ctrl);
 	if (ret > 0)
-- 
1.8.3.1


_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 5/7] nvme: Fix controller creation races with teardown flow
  2020-03-24 15:29 [PATCH 0/7 V3] nvme: Fixes for deleting a ctrl before it was created Israel Rukshin
                   ` (3 preceding siblings ...)
  2020-03-24 15:29 ` [PATCH 4/7] nvme: Make nvme_uninit_ctrl symmetric to nvme_init_ctrl Israel Rukshin
@ 2020-03-24 15:29 ` Israel Rukshin
  2020-03-24 15:29 ` [PATCH 6/7] nvme-rdma: Add warning on state change failure at nvme_rdma_setup_ctrl Israel Rukshin
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Israel Rukshin @ 2020-03-24 15:29 UTC (permalink / raw)
  To: Linux-nvme, Sagi Grimberg, Christoph Hellwig
  Cc: Shlomi Nimrodi, Israel Rukshin, Max Gurtovoy

Calling nvme_sysfs_delete() when the controller is in the middle of
creation may cause several bugs. If the controller is in NEW state we
remove delete_controller file and don't delete the controller. The user
will not be able to use nvme disconnect command on that controller again,
although the controller may be active. Other bugs may happen if the
controller is in the middle of create_ctrl callback and
nvme_do_delete_ctrl() starts. For example, freeing I/O tagset at
nvme_do_delete_ctrl() before it was allocated at create_ctrl callback.

To fix all those races don't allow the user to delete the controller
before it was fully created.

Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/nvme/host/core.c | 5 +++++
 drivers/nvme/host/nvme.h | 1 +
 2 files changed, 6 insertions(+)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index ba064fd..9961d0e 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -3228,6 +3228,10 @@ static ssize_t nvme_sysfs_delete(struct device *dev,
 {
 	struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
 
+	/* Can't delete non-created controllers */
+	if (!ctrl->created)
+		return -EBUSY;
+
 	if (device_remove_file_self(dev, attr))
 		nvme_delete_ctrl_sync(ctrl);
 	return count;
@@ -4039,6 +4043,7 @@ void nvme_start_ctrl(struct nvme_ctrl *ctrl)
 		nvme_queue_scan(ctrl);
 		nvme_start_queues(ctrl);
 	}
+	ctrl->created = true;
 }
 EXPORT_SYMBOL_GPL(nvme_start_ctrl);
 
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index d800b9a..2e04a36 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -259,6 +259,7 @@ struct nvme_ctrl {
 	struct nvme_command ka_cmd;
 	struct work_struct fw_act_work;
 	unsigned long events;
+	bool created;
 
 #ifdef CONFIG_NVME_MULTIPATH
 	/* asymmetric namespace access: */
-- 
1.8.3.1


_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 6/7] nvme-rdma: Add warning on state change failure at nvme_rdma_setup_ctrl
  2020-03-24 15:29 [PATCH 0/7 V3] nvme: Fixes for deleting a ctrl before it was created Israel Rukshin
                   ` (4 preceding siblings ...)
  2020-03-24 15:29 ` [PATCH 5/7] nvme: Fix controller creation races with teardown flow Israel Rukshin
@ 2020-03-24 15:29 ` Israel Rukshin
  2020-03-25  0:19   ` Sagi Grimberg
  2020-03-24 15:29 ` [PATCH 7/7] nvme-tcp: Add warning on state change failure at nvme_tcp_setup_ctrl Israel Rukshin
  2020-03-24 16:17 ` [PATCH 0/7 V3] nvme: Fixes for deleting a ctrl before it was created Keith Busch
  7 siblings, 1 reply; 14+ messages in thread
From: Israel Rukshin @ 2020-03-24 15:29 UTC (permalink / raw)
  To: Linux-nvme, Sagi Grimberg, Christoph Hellwig
  Cc: Shlomi Nimrodi, Israel Rukshin, Max Gurtovoy

The transition to LIVE state should not fail in case of a new controller.
Moving to DELETING state before nvme_tcp_create_ctrl() allocates all the
resources may leads to NULL dereference at teardown flow (e.g., IO tagset,
admin_q, connect_q).

Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/nvme/host/rdma.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index c99a882..3ae3011 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -1022,8 +1022,13 @@ static int nvme_rdma_setup_ctrl(struct nvme_rdma_ctrl *ctrl, bool new)
 
 	changed = nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_LIVE);
 	if (!changed) {
-		/* state change failure is ok if we're in DELETING state */
+		/*
+		 * state change failure is ok if we're in DELETING state,
+		 * unless we're during creation of a new controller to
+		 * avoid races with teardown flow.
+		 */
 		WARN_ON_ONCE(ctrl->ctrl.state != NVME_CTRL_DELETING);
+		WARN_ON_ONCE(new);
 		ret = -EINVAL;
 		goto destroy_io;
 	}
-- 
1.8.3.1


_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 7/7] nvme-tcp: Add warning on state change failure at nvme_tcp_setup_ctrl
  2020-03-24 15:29 [PATCH 0/7 V3] nvme: Fixes for deleting a ctrl before it was created Israel Rukshin
                   ` (5 preceding siblings ...)
  2020-03-24 15:29 ` [PATCH 6/7] nvme-rdma: Add warning on state change failure at nvme_rdma_setup_ctrl Israel Rukshin
@ 2020-03-24 15:29 ` Israel Rukshin
  2020-03-24 16:17 ` [PATCH 0/7 V3] nvme: Fixes for deleting a ctrl before it was created Keith Busch
  7 siblings, 0 replies; 14+ messages in thread
From: Israel Rukshin @ 2020-03-24 15:29 UTC (permalink / raw)
  To: Linux-nvme, Sagi Grimberg, Christoph Hellwig
  Cc: Shlomi Nimrodi, Israel Rukshin, Max Gurtovoy

The transition to LIVE state should not fail in case of a new controller.
Moving to DELETING state before nvme_tcp_create_ctrl() allocates all the
resources may leads to NULL dereference at teardown flow (e.g., IO tagset,
admin_q, connect_q).

Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/nvme/host/tcp.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 2fc2687..7aa2adc 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1931,8 +1931,13 @@ static int nvme_tcp_setup_ctrl(struct nvme_ctrl *ctrl, bool new)
 	}
 
 	if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE)) {
-		/* state change failure is ok if we're in DELETING state */
+		/*
+		 * state change failure is ok if we're in DELETING state,
+		 * unless we're during creation of a new controller to
+		 * avoid races with teardown flow.
+		 */
 		WARN_ON_ONCE(ctrl->state != NVME_CTRL_DELETING);
+		WARN_ON_ONCE(new);
 		ret = -EINVAL;
 		goto destroy_io;
 	}
-- 
1.8.3.1


_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/7] nvme-pci: Re-order nvme_pci_free_ctrl
  2020-03-24 15:29 ` [PATCH 2/7] nvme-pci: Re-order nvme_pci_free_ctrl Israel Rukshin
@ 2020-03-24 15:42   ` Christoph Hellwig
  0 siblings, 0 replies; 14+ messages in thread
From: Christoph Hellwig @ 2020-03-24 15:42 UTC (permalink / raw)
  To: Israel Rukshin
  Cc: Shlomi Nimrodi, Max Gurtovoy, Sagi Grimberg, Linux-nvme,
	Christoph Hellwig

On Tue, Mar 24, 2020 at 05:29:40PM +0200, Israel Rukshin wrote:
> Destroy the resources in the same order like in nvme_probe error flow to
> improve code readability.
> 
> Signed-off-by: Israel Rukshin <israelr@mellanox.com>
> Reviewed-by: Max Gurtovoy <maxg@mellanox.com>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/7] nvme: Fix ctrl use-after-free during sysfs deletion
  2020-03-24 15:29 ` [PATCH 3/7] nvme: Fix ctrl use-after-free during sysfs deletion Israel Rukshin
@ 2020-03-24 15:42   ` Christoph Hellwig
  0 siblings, 0 replies; 14+ messages in thread
From: Christoph Hellwig @ 2020-03-24 15:42 UTC (permalink / raw)
  To: Israel Rukshin
  Cc: Shlomi Nimrodi, Max Gurtovoy, Sagi Grimberg, Linux-nvme,
	Christoph Hellwig

On Tue, Mar 24, 2020 at 05:29:41PM +0200, Israel Rukshin wrote:
> In case nvme_sysfs_delete() is called by the user before taking the ctrl
> reference count, the ctrl may be freed during the creation and cause the
> bug. Take the reference as soon as the controller is externally visible,
> which is done by cdev_device_add() in nvme_init_ctrl(). Also take the
> reference count at the core layer instead of taking it on each transport
> separately.
> 
> Signed-off-by: Israel Rukshin <israelr@mellanox.com>
> Reviewed-by: Max Gurtovoy <maxg@mellanox.com>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 4/7] nvme: Make nvme_uninit_ctrl symmetric to nvme_init_ctrl
  2020-03-24 15:29 ` [PATCH 4/7] nvme: Make nvme_uninit_ctrl symmetric to nvme_init_ctrl Israel Rukshin
@ 2020-03-24 15:42   ` Christoph Hellwig
  0 siblings, 0 replies; 14+ messages in thread
From: Christoph Hellwig @ 2020-03-24 15:42 UTC (permalink / raw)
  To: Israel Rukshin
  Cc: Shlomi Nimrodi, Max Gurtovoy, Sagi Grimberg, Linux-nvme,
	Christoph Hellwig

On Tue, Mar 24, 2020 at 05:29:42PM +0200, Israel Rukshin wrote:
> Put the ctrl reference count at nvme_uninit_ctrl as opposed to
> nvme_init_ctrl which takes it. This decrease the reference count at the
> core layer instead of decreasing it on each transport separately.
> Also move the call of nvme_uninit_ctrl at PCI driver after calling to
> nvme_release_prp_pools and nvme_dev_unmap, in order to put the reference
> count after using the dev. This is safe because those functions use
> nvme_dev which is freed only later at nvme_pci_free_ctrl.
> 
> Signed-off-by: Israel Rukshin <israelr@mellanox.com>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/7 V3] nvme: Fixes for deleting a ctrl before it was created
  2020-03-24 15:29 [PATCH 0/7 V3] nvme: Fixes for deleting a ctrl before it was created Israel Rukshin
                   ` (6 preceding siblings ...)
  2020-03-24 15:29 ` [PATCH 7/7] nvme-tcp: Add warning on state change failure at nvme_tcp_setup_ctrl Israel Rukshin
@ 2020-03-24 16:17 ` Keith Busch
  7 siblings, 0 replies; 14+ messages in thread
From: Keith Busch @ 2020-03-24 16:17 UTC (permalink / raw)
  To: Israel Rukshin
  Cc: Max Gurtovoy, Shlomi Nimrodi, Sagi Grimberg, Linux-nvme,
	Christoph Hellwig

Thank you, applied to nvme-5.7.

_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 6/7] nvme-rdma: Add warning on state change failure at nvme_rdma_setup_ctrl
  2020-03-24 15:29 ` [PATCH 6/7] nvme-rdma: Add warning on state change failure at nvme_rdma_setup_ctrl Israel Rukshin
@ 2020-03-25  0:19   ` Sagi Grimberg
  2020-03-25 10:07     ` Israel Rukshin
  0 siblings, 1 reply; 14+ messages in thread
From: Sagi Grimberg @ 2020-03-25  0:19 UTC (permalink / raw)
  To: Israel Rukshin, Linux-nvme, Christoph Hellwig
  Cc: Max Gurtovoy, Shlomi Nimrodi


> The transition to LIVE state should not fail in case of a new controller.
> Moving to DELETING state before nvme_tcp_create_ctrl() allocates all the
> resources may leads to NULL dereference at teardown flow (e.g., IO tagset,
> admin_q, connect_q).
> 
> Signed-off-by: Israel Rukshin <israelr@mellanox.com>
> Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/nvme/host/rdma.c | 7 ++++++-
>   1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> index c99a882..3ae3011 100644
> --- a/drivers/nvme/host/rdma.c
> +++ b/drivers/nvme/host/rdma.c
> @@ -1022,8 +1022,13 @@ static int nvme_rdma_setup_ctrl(struct nvme_rdma_ctrl *ctrl, bool new)
>   
>   	changed = nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_LIVE);
>   	if (!changed) {
> -		/* state change failure is ok if we're in DELETING state */
> +		/*
> +		 * state change failure is ok if we're in DELETING state,
> +		 * unless we're during creation of a new controller to
> +		 * avoid races with teardown flow.
> +		 */
>   		WARN_ON_ONCE(ctrl->ctrl.state != NVME_CTRL_DELETING);
> +		WARN_ON_ONCE(new);
>   		ret = -EINVAL;

I still don't understand this. Why are we warning on new controller?

Are you saying that a the state change must succeed in case its
a new controller? because if its expected to not succeed in case
of a failure then there is no point of the warning.

I am missing something here..

_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 6/7] nvme-rdma: Add warning on state change failure at nvme_rdma_setup_ctrl
  2020-03-25  0:19   ` Sagi Grimberg
@ 2020-03-25 10:07     ` Israel Rukshin
  0 siblings, 0 replies; 14+ messages in thread
From: Israel Rukshin @ 2020-03-25 10:07 UTC (permalink / raw)
  To: Sagi Grimberg, Linux-nvme, Christoph Hellwig; +Cc: Max Gurtovoy, Shlomi Nimrodi

On 3/25/2020 2:19 AM, Sagi Grimberg wrote:
>
>> The transition to LIVE state should not fail in case of a new 
>> controller.
>> Moving to DELETING state before nvme_tcp_create_ctrl() allocates all the
>> resources may leads to NULL dereference at teardown flow (e.g., IO 
>> tagset,
>> admin_q, connect_q).
>>
>> Signed-off-by: Israel Rukshin <israelr@mellanox.com>
>> Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
>> Reviewed-by: Christoph Hellwig <hch@lst.de>
>> ---
>>   drivers/nvme/host/rdma.c | 7 ++++++-
>>   1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
>> index c99a882..3ae3011 100644
>> --- a/drivers/nvme/host/rdma.c
>> +++ b/drivers/nvme/host/rdma.c
>> @@ -1022,8 +1022,13 @@ static int nvme_rdma_setup_ctrl(struct 
>> nvme_rdma_ctrl *ctrl, bool new)
>>         changed = nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_LIVE);
>>       if (!changed) {
>> -        /* state change failure is ok if we're in DELETING state */
>> +        /*
>> +         * state change failure is ok if we're in DELETING state,
>> +         * unless we're during creation of a new controller to
>> +         * avoid races with teardown flow.
>> +         */
>>           WARN_ON_ONCE(ctrl->ctrl.state != NVME_CTRL_DELETING);
>> +        WARN_ON_ONCE(new);
>>           ret = -EINVAL;
>
> I still don't understand this. Why are we warning on new controller?
>
> Are you saying that a the state change must succeed in case its
> a new controller? because if its expected to not succeed in case
> of a failure then there is no point of the warning.
>
> I am missing something here..
Yes, the state change must succeed in case of a new controller.

Moving to DELETING state on new controller is forbidden, to avoid races 
with delete flow.

The state machine allows us to move to DELETING state from CONNECTING state,

but the io tagest for example may not be allocated yet which may cause a 
crash at the delete function.


_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, back to index

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-24 15:29 [PATCH 0/7 V3] nvme: Fixes for deleting a ctrl before it was created Israel Rukshin
2020-03-24 15:29 ` [PATCH 1/7] nvme: Remove unused return code from nvme_delete_ctrl_sync Israel Rukshin
2020-03-24 15:29 ` [PATCH 2/7] nvme-pci: Re-order nvme_pci_free_ctrl Israel Rukshin
2020-03-24 15:42   ` Christoph Hellwig
2020-03-24 15:29 ` [PATCH 3/7] nvme: Fix ctrl use-after-free during sysfs deletion Israel Rukshin
2020-03-24 15:42   ` Christoph Hellwig
2020-03-24 15:29 ` [PATCH 4/7] nvme: Make nvme_uninit_ctrl symmetric to nvme_init_ctrl Israel Rukshin
2020-03-24 15:42   ` Christoph Hellwig
2020-03-24 15:29 ` [PATCH 5/7] nvme: Fix controller creation races with teardown flow Israel Rukshin
2020-03-24 15:29 ` [PATCH 6/7] nvme-rdma: Add warning on state change failure at nvme_rdma_setup_ctrl Israel Rukshin
2020-03-25  0:19   ` Sagi Grimberg
2020-03-25 10:07     ` Israel Rukshin
2020-03-24 15:29 ` [PATCH 7/7] nvme-tcp: Add warning on state change failure at nvme_tcp_setup_ctrl Israel Rukshin
2020-03-24 16:17 ` [PATCH 0/7 V3] nvme: Fixes for deleting a ctrl before it was created Keith Busch

Linux-NVME Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-nvme/0 linux-nvme/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-nvme linux-nvme/ https://lore.kernel.org/linux-nvme \
		linux-nvme@lists.infradead.org
	public-inbox-index linux-nvme

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.infradead.lists.linux-nvme


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git