All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2] nvme-multipath: fix sysfs dangerously created links
@ 2018-02-28  7:06 ` Baegjae Sung
  0 siblings, 0 replies; 6+ messages in thread
From: Baegjae Sung @ 2018-02-28  7:06 UTC (permalink / raw)
  To: keith.busch, axboe, hch, sagi, baegjae; +Cc: linux-nvme, linux-kernel

If multipathing is enabled, each NVMe subsystem creates a head
namespace (e.g., nvme0n1) and multiple private namespaces
(e.g., nvme0c0n1 and nvme0c1n1) in sysfs. When creating links for
private namespaces, links of head namespace are used, so the
namespace creation order must be followed (e.g., nvme0n1 ->
nvme0c1n1). If the order is not followed, links of sysfs will be
incomplete or kernel panic will occur.

The kernel panic was:
  kernel BUG at fs/sysfs/symlink.c:27!
  Call Trace:
    nvme_mpath_add_disk_links+0x5d/0x80 [nvme_core]
    nvme_validate_ns+0x5c2/0x850 [nvme_core]
    nvme_scan_work+0x1af/0x2d0 [nvme_core]

Correct order
Context A     Context B
nvme0n1
nvme0c0n1     nvme0c1n1

Incorrect order
Context A     Context B
              nvme0c1n1
nvme0n1
nvme0c0n1

The nvme_mpath_add_disk (for creating head namespace) is called
just before the nvme_mpath_add_disk_links (for creating private
namespaces). In nvme_mpath_add_disk, the first context acquires
the lock of subsystem and creates a head namespace, and other
contexts do nothing by checking GENHD_FL_UP of a head namespace
after waiting to acquire the lock. We verified the code with or
without multipathing using three vendors of dual-port NVMe SSDs.

Signed-off-by: Baegjae Sung <baegjae@gmail.com>
---
 drivers/nvme/host/core.c      | 12 +++---------
 drivers/nvme/host/multipath.c | 15 ++++++++++-----
 2 files changed, 13 insertions(+), 14 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 0fe7ea35c221..817e5e2766da 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2844,7 +2844,7 @@ static struct nvme_ns_head *nvme_alloc_ns_head(struct nvme_ctrl *ctrl,
 }
 
 static int nvme_init_ns_head(struct nvme_ns *ns, unsigned nsid,
-		struct nvme_id_ns *id, bool *new)
+		struct nvme_id_ns *id)
 {
 	struct nvme_ctrl *ctrl = ns->ctrl;
 	bool is_shared = id->nmic & (1 << 0);
@@ -2860,8 +2860,6 @@ static int nvme_init_ns_head(struct nvme_ns *ns, unsigned nsid,
 			ret = PTR_ERR(head);
 			goto out_unlock;
 		}
-
-		*new = true;
 	} else {
 		struct nvme_ns_ids ids;
 
@@ -2873,8 +2871,6 @@ static int nvme_init_ns_head(struct nvme_ns *ns, unsigned nsid,
 			ret = -EINVAL;
 			goto out_unlock;
 		}
-
-		*new = false;
 	}
 
 	list_add_tail(&ns->siblings, &head->list);
@@ -2945,7 +2941,6 @@ static void nvme_alloc_ns(struct nvme_ctrl *ctrl, unsigned nsid)
 	struct nvme_id_ns *id;
 	char disk_name[DISK_NAME_LEN];
 	int node = dev_to_node(ctrl->dev), flags = GENHD_FL_EXT_DEVT;
-	bool new = true;
 
 	ns = kzalloc_node(sizeof(*ns), GFP_KERNEL, node);
 	if (!ns)
@@ -2971,7 +2966,7 @@ static void nvme_alloc_ns(struct nvme_ctrl *ctrl, unsigned nsid)
 	if (id->ncap == 0)
 		goto out_free_id;
 
-	if (nvme_init_ns_head(ns, nsid, id, &new))
+	if (nvme_init_ns_head(ns, nsid, id))
 		goto out_free_id;
 	nvme_setup_streams_ns(ctrl, ns);
 	
@@ -3037,8 +3032,7 @@ static void nvme_alloc_ns(struct nvme_ctrl *ctrl, unsigned nsid)
 		pr_warn("%s: failed to register lightnvm sysfs group for identification\n",
 			ns->disk->disk_name);
 
-	if (new)
-		nvme_mpath_add_disk(ns->head);
+	nvme_mpath_add_disk(ns->head);
 	nvme_mpath_add_disk_links(ns);
 	return;
  out_unlink_ns:
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 3b211d9e58b8..b7e5c6db4d92 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -198,11 +198,16 @@ void nvme_mpath_add_disk(struct nvme_ns_head *head)
 {
 	if (!head->disk)
 		return;
-	device_add_disk(&head->subsys->dev, head->disk);
-	if (sysfs_create_group(&disk_to_dev(head->disk)->kobj,
-			&nvme_ns_id_attr_group))
-		pr_warn("%s: failed to create sysfs group for identification\n",
-			head->disk->disk_name);
+
+	mutex_lock(&head->subsys->lock);
+	if (!(head->disk->flags & GENHD_FL_UP)) {
+		device_add_disk(&head->subsys->dev, head->disk);
+		if (sysfs_create_group(&disk_to_dev(head->disk)->kobj,
+				&nvme_ns_id_attr_group))
+			pr_warn("%s: failed to create sysfs group for identification\n",
+				head->disk->disk_name);
+	}
+	mutex_unlock(&head->subsys->lock);
 }
 
 void nvme_mpath_add_disk_links(struct nvme_ns *ns)
-- 
2.16.2

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH v2] nvme-multipath: fix sysfs dangerously created links
@ 2018-02-28  7:06 ` Baegjae Sung
  0 siblings, 0 replies; 6+ messages in thread
From: Baegjae Sung @ 2018-02-28  7:06 UTC (permalink / raw)


If multipathing is enabled, each NVMe subsystem creates a head
namespace (e.g., nvme0n1) and multiple private namespaces
(e.g., nvme0c0n1 and nvme0c1n1) in sysfs. When creating links for
private namespaces, links of head namespace are used, so the
namespace creation order must be followed (e.g., nvme0n1 ->
nvme0c1n1). If the order is not followed, links of sysfs will be
incomplete or kernel panic will occur.

The kernel panic was:
  kernel BUG at fs/sysfs/symlink.c:27!
  Call Trace:
    nvme_mpath_add_disk_links+0x5d/0x80 [nvme_core]
    nvme_validate_ns+0x5c2/0x850 [nvme_core]
    nvme_scan_work+0x1af/0x2d0 [nvme_core]

Correct order
Context A     Context B
nvme0n1
nvme0c0n1     nvme0c1n1

Incorrect order
Context A     Context B
              nvme0c1n1
nvme0n1
nvme0c0n1

The nvme_mpath_add_disk (for creating head namespace) is called
just before the nvme_mpath_add_disk_links (for creating private
namespaces). In nvme_mpath_add_disk, the first context acquires
the lock of subsystem and creates a head namespace, and other
contexts do nothing by checking GENHD_FL_UP of a head namespace
after waiting to acquire the lock. We verified the code with or
without multipathing using three vendors of dual-port NVMe SSDs.

Signed-off-by: Baegjae Sung <baegjae at gmail.com>
---
 drivers/nvme/host/core.c      | 12 +++---------
 drivers/nvme/host/multipath.c | 15 ++++++++++-----
 2 files changed, 13 insertions(+), 14 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 0fe7ea35c221..817e5e2766da 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2844,7 +2844,7 @@ static struct nvme_ns_head *nvme_alloc_ns_head(struct nvme_ctrl *ctrl,
 }
 
 static int nvme_init_ns_head(struct nvme_ns *ns, unsigned nsid,
-		struct nvme_id_ns *id, bool *new)
+		struct nvme_id_ns *id)
 {
 	struct nvme_ctrl *ctrl = ns->ctrl;
 	bool is_shared = id->nmic & (1 << 0);
@@ -2860,8 +2860,6 @@ static int nvme_init_ns_head(struct nvme_ns *ns, unsigned nsid,
 			ret = PTR_ERR(head);
 			goto out_unlock;
 		}
-
-		*new = true;
 	} else {
 		struct nvme_ns_ids ids;
 
@@ -2873,8 +2871,6 @@ static int nvme_init_ns_head(struct nvme_ns *ns, unsigned nsid,
 			ret = -EINVAL;
 			goto out_unlock;
 		}
-
-		*new = false;
 	}
 
 	list_add_tail(&ns->siblings, &head->list);
@@ -2945,7 +2941,6 @@ static void nvme_alloc_ns(struct nvme_ctrl *ctrl, unsigned nsid)
 	struct nvme_id_ns *id;
 	char disk_name[DISK_NAME_LEN];
 	int node = dev_to_node(ctrl->dev), flags = GENHD_FL_EXT_DEVT;
-	bool new = true;
 
 	ns = kzalloc_node(sizeof(*ns), GFP_KERNEL, node);
 	if (!ns)
@@ -2971,7 +2966,7 @@ static void nvme_alloc_ns(struct nvme_ctrl *ctrl, unsigned nsid)
 	if (id->ncap == 0)
 		goto out_free_id;
 
-	if (nvme_init_ns_head(ns, nsid, id, &new))
+	if (nvme_init_ns_head(ns, nsid, id))
 		goto out_free_id;
 	nvme_setup_streams_ns(ctrl, ns);
 	
@@ -3037,8 +3032,7 @@ static void nvme_alloc_ns(struct nvme_ctrl *ctrl, unsigned nsid)
 		pr_warn("%s: failed to register lightnvm sysfs group for identification\n",
 			ns->disk->disk_name);
 
-	if (new)
-		nvme_mpath_add_disk(ns->head);
+	nvme_mpath_add_disk(ns->head);
 	nvme_mpath_add_disk_links(ns);
 	return;
  out_unlink_ns:
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 3b211d9e58b8..b7e5c6db4d92 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -198,11 +198,16 @@ void nvme_mpath_add_disk(struct nvme_ns_head *head)
 {
 	if (!head->disk)
 		return;
-	device_add_disk(&head->subsys->dev, head->disk);
-	if (sysfs_create_group(&disk_to_dev(head->disk)->kobj,
-			&nvme_ns_id_attr_group))
-		pr_warn("%s: failed to create sysfs group for identification\n",
-			head->disk->disk_name);
+
+	mutex_lock(&head->subsys->lock);
+	if (!(head->disk->flags & GENHD_FL_UP)) {
+		device_add_disk(&head->subsys->dev, head->disk);
+		if (sysfs_create_group(&disk_to_dev(head->disk)->kobj,
+				&nvme_ns_id_attr_group))
+			pr_warn("%s: failed to create sysfs group for identification\n",
+				head->disk->disk_name);
+	}
+	mutex_unlock(&head->subsys->lock);
 }
 
 void nvme_mpath_add_disk_links(struct nvme_ns *ns)
-- 
2.16.2

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] nvme-multipath: fix sysfs dangerously created links
  2018-02-28  7:06 ` Baegjae Sung
@ 2018-02-28 16:45   ` Christoph Hellwig
  -1 siblings, 0 replies; 6+ messages in thread
From: Christoph Hellwig @ 2018-02-28 16:45 UTC (permalink / raw)
  To: Baegjae Sung; +Cc: keith.busch, axboe, hch, sagi, linux-nvme, linux-kernel

Looks fine,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v2] nvme-multipath: fix sysfs dangerously created links
@ 2018-02-28 16:45   ` Christoph Hellwig
  0 siblings, 0 replies; 6+ messages in thread
From: Christoph Hellwig @ 2018-02-28 16:45 UTC (permalink / raw)


Looks fine,

Reviewed-by: Christoph Hellwig <hch at lst.de>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] nvme-multipath: fix sysfs dangerously created links
  2018-02-28 16:45   ` Christoph Hellwig
@ 2018-02-28 17:11     ` Keith Busch
  -1 siblings, 0 replies; 6+ messages in thread
From: Keith Busch @ 2018-02-28 17:11 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Baegjae Sung, axboe, sagi, linux-nvme, linux-kernel

Thanks, applied.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v2] nvme-multipath: fix sysfs dangerously created links
@ 2018-02-28 17:11     ` Keith Busch
  0 siblings, 0 replies; 6+ messages in thread
From: Keith Busch @ 2018-02-28 17:11 UTC (permalink / raw)


Thanks, applied.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2018-02-28 17:11 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-02-28  7:06 [PATCH v2] nvme-multipath: fix sysfs dangerously created links Baegjae Sung
2018-02-28  7:06 ` Baegjae Sung
2018-02-28 16:45 ` Christoph Hellwig
2018-02-28 16:45   ` Christoph Hellwig
2018-02-28 17:11   ` Keith Busch
2018-02-28 17:11     ` Keith Busch

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.