All of lore.kernel.org
 help / color / mirror / Atom feed
* 'modprobe nvme_core multipath=N' crashes in face of multipath fabric
@ 2018-04-10 20:49 Mike Snitzer
  2018-04-11 13:45 ` Keith Busch
  0 siblings, 1 reply; 7+ messages in thread
From: Mike Snitzer @ 2018-04-10 20:49 UTC (permalink / raw)


This isn't new since the 4.17 merge or anything, I first noticed this
issue existed while using a 4.16-rc4 kernel.

modprobe nvme_core multipath=N

Using mptest's nvme_4port_create.sh

git clone git://github.com/snitm/mptest.git
cd mptest/lib/unittests
perl -pi -e 's|/dev/pmem0|/dev/some_test_device|' nvme_4port_create.sh
./nvme_4port_create.sh

[ 2309.845915] nvmet: adding nsid 1 to subsystem mptestnqn
[ 2310.019820] nvmet: creating controller 1 for subsystem mptestnqn for NQN nqn.2014-08.org.nvmexpress:uuid:71a69a69-edb9-4d51-8cbf-1c29778534ab.
[ 2310.034201] nvme nvme1: NVME-FC{0}: new ctrl: NQN "mptestnqn"
[ 2310.041705] nvmet: creating controller 2 for subsystem mptestnqn for NQN nqn.2014-08.org.nvmexpress:uuid:71a69a69-edb9-4d51-8cbf-1c29778534ab.
[ 2310.056005] nvme nvme2: NVME-FC{1}: new ctrl: NQN "mptestnqn"
[ 2310.056314] sysfs: cannot create duplicate filename '/class/block/nvme1n1'
[ 2310.063386] nvmet: creating controller 3 for subsystem mptestnqn for NQN nqn.2014-08.org.nvmexpress:uuid:71a69a69-edb9-4d51-8cbf-1c29778534ab.
[ 2310.068637] CPU: 3 PID: 1791 Comm: kworker/u497:1 Not tainted 4.16.0.snitm+ #13
[ 2310.088705] Hardware name: Supermicro SYS-1029P-WTR/X11DDW-L, BIOS 2.0a 12/06/2017
[ 2310.096274] Workqueue: nvme-wq nvme_scan_work [nvme_core]
[ 2310.101669] Call Trace:
[ 2310.104129]  dump_stack+0x5a/0x73
[ 2310.107444]  sysfs_warn_dup+0x58/0x70
[ 2310.111109]  sysfs_do_create_link_sd.isra.2+0xa3/0xb0
[ 2310.116162]  device_add+0x2ac/0x5f0
[ 2310.119658]  __device_add_disk+0x19c/0x4a0
[ 2310.123754]  nvme_validate_ns+0x4d2/0x860 [nvme_core]
[ 2310.128807]  ? wake_up_q+0x70/0x70
[ 2310.132214]  nvme_scan_work+0x211/0x2d0 [nvme_core]
[ 2310.137095]  process_one_work+0x158/0x360
[ 2310.141103]  worker_thread+0x47/0x3e0
[ 2310.144772]  kthread+0xf8/0x130
[ 2310.147917]  ? max_active_store+0x80/0x80
[ 2310.151929]  ? kthread_bind+0x10/0x10
[ 2310.155595]  ? do_syscall_64+0x74/0x1a0
[ 2310.159435]  ? SyS_exit_group+0x10/0x10
[ 2310.163273]  ret_from_fork+0x35/0x40
[ 2310.166872] nvme nvme3: NVME-FC{2}: new ctrl: NQN "mptestnqn"
[ 2310.166875] ------------[ cut here ]------------
[ 2310.167189] sysfs: cannot create duplicate filename '/class/block/nvme1n1'
[ 2310.167190] CPU: 3 PID: 8111 Comm: kworker/u497:3 Not tainted 4.16.0.snitm+ #13
[ 2310.167191] Hardware name: Supermicro SYS-1029P-WTR/X11DDW-L, BIOS 2.0a 12/06/2017
[ 2310.167194] Workqueue: nvme-wq nvme_scan_work [nvme_core]
[ 2310.167194] Call Trace:
[ 2310.167196]  dump_stack+0x5a/0x73
[ 2310.167198]  sysfs_warn_dup+0x58/0x70
[ 2310.167199]  sysfs_do_create_link_sd.isra.2+0xa3/0xb0
[ 2310.167200]  device_add+0x2ac/0x5f0
[ 2310.167202]  __device_add_disk+0x19c/0x4a0
[ 2310.167205]  nvme_validate_ns+0x4d2/0x860 [nvme_core]
[ 2310.167206]  ? wake_up_q+0x70/0x70
[ 2310.167208]  nvme_scan_work+0x211/0x2d0 [nvme_core]
[ 2310.167210]  process_one_work+0x158/0x360
[ 2310.167211]  worker_thread+0x47/0x3e0
[ 2310.167212]  kthread+0xf8/0x130
[ 2310.167213]  ? max_active_store+0x80/0x80
[ 2310.167214]  ? kthread_bind+0x10/0x10
[ 2310.167215]  ret_from_fork+0x35/0x40
[ 2310.167225] ------------[ cut here ]------------
[ 2310.167226] kernel BUG at fs/sysfs/group.c:111!

^ permalink raw reply	[flat|nested] 7+ messages in thread

* 'modprobe nvme_core multipath=N' crashes in face of multipath fabric
  2018-04-10 20:49 'modprobe nvme_core multipath=N' crashes in face of multipath fabric Mike Snitzer
@ 2018-04-11 13:45 ` Keith Busch
  2018-04-11 14:17   ` Mike Snitzer
  0 siblings, 1 reply; 7+ messages in thread
From: Keith Busch @ 2018-04-11 13:45 UTC (permalink / raw)


On Tue, Apr 10, 2018@04:49:53PM -0400, Mike Snitzer wrote:
> This isn't new since the 4.17 merge or anything, I first noticed this
> issue existed while using a 4.16-rc4 kernel.
> 
> modprobe nvme_core multipath=N

Thanks for the notice.

There is definitely a bug here when CONFIG_NVME_MULTIPATH=y but nvme_core
multipath is disabled in the presence of shared namespaces. I think we'd
need each namespace to get a different "head" out of the subsystem in
this case, but it may take a moment for me to detangle this.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* 'modprobe nvme_core multipath=N' crashes in face of multipath fabric
  2018-04-11 13:45 ` Keith Busch
@ 2018-04-11 14:17   ` Mike Snitzer
  2018-04-11 17:58     ` Keith Busch
  0 siblings, 1 reply; 7+ messages in thread
From: Mike Snitzer @ 2018-04-11 14:17 UTC (permalink / raw)


On Wed, Apr 11 2018 at  9:45am -0400,
Keith Busch <keith.busch@intel.com> wrote:

> On Tue, Apr 10, 2018@04:49:53PM -0400, Mike Snitzer wrote:
> > This isn't new since the 4.17 merge or anything, I first noticed this
> > issue existed while using a 4.16-rc4 kernel.
> > 
> > modprobe nvme_core multipath=N
> 
> Thanks for the notice.
> 
> There is definitely a bug here when CONFIG_NVME_MULTIPATH=y but nvme_core
> multipath is disabled in the presence of shared namespaces. I think we'd
> need each namespace to get a different "head" out of the subsystem in
> this case, but it may take a moment for me to detangle this.

No problem, it certainly isn't something I could tackle any quicker ;)

Thanks for looking to resolve this though.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* 'modprobe nvme_core multipath=N' crashes in face of multipath fabric
  2018-04-11 14:17   ` Mike Snitzer
@ 2018-04-11 17:58     ` Keith Busch
  2018-04-11 21:46       ` Mike Snitzer
  0 siblings, 1 reply; 7+ messages in thread
From: Keith Busch @ 2018-04-11 17:58 UTC (permalink / raw)


On Wed, Apr 11, 2018@10:17:21AM -0400, Mike Snitzer wrote:
> On Wed, Apr 11 2018 at  9:45am -0400,
> Keith Busch <keith.busch@intel.com> wrote:
> 
> > On Tue, Apr 10, 2018@04:49:53PM -0400, Mike Snitzer wrote:
> > > This isn't new since the 4.17 merge or anything, I first noticed this
> > > issue existed while using a 4.16-rc4 kernel.
> > > 
> > > modprobe nvme_core multipath=N
> > 
> > Thanks for the notice.
> > 
> > There is definitely a bug here when CONFIG_NVME_MULTIPATH=y but nvme_core
> > multipath is disabled in the presence of shared namespaces. I think we'd
> > need each namespace to get a different "head" out of the subsystem in
> > this case, but it may take a moment for me to detangle this.
> 
> No problem, it certainly isn't something I could tackle any quicker ;)
> 
> Thanks for looking to resolve this though.

It doesn't look like we'll be able to allocate new namespace 'heads'
here without complicating this even more.

The below should fix the naming collision by getting new instances for
each namespace that attaches to a head. I'm not sure this is much better,
but maybe Christoph will have a better suggestion.

---
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 38b450ec00eb..683ccaa70cdc 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -350,6 +350,7 @@ static void nvme_free_ns_head(struct kref *ref)
 	ida_simple_remove(&head->subsys->ns_ida, head->instance);
 	list_del_init(&head->entry);
 	cleanup_srcu_struct(&head->srcu);
+	ida_destroy(&head->ns_ida);
 	kfree(head);
 }
 
@@ -366,6 +367,7 @@ static void nvme_free_ns(struct kref *kref)
 		nvme_nvm_unregister(ns);
 
 	put_disk(ns->disk);
+	ida_simple_remove(&ns->head->ns_ida, ns->instance);
 	nvme_put_ns_head(ns->head);
 	nvme_put_ctrl(ns->ctrl);
 	kfree(ns);
@@ -2836,6 +2838,7 @@ static struct nvme_ns_head *nvme_alloc_ns_head(struct nvme_ctrl *ctrl,
 	head->subsys = ctrl->subsys;
 	head->ns_id = nsid;
 	kref_init(&head->ref);
+	ida_init(&head->ns_ida);
 
 	nvme_report_ns_ids(ctrl, nsid, id, &head->ids);
 
@@ -2856,6 +2859,7 @@ static struct nvme_ns_head *nvme_alloc_ns_head(struct nvme_ctrl *ctrl,
 	cleanup_srcu_struct(&head->srcu);
 out_ida_remove:
 	ida_simple_remove(&ctrl->subsys->ns_ida, head->instance);
+	ida_destroy(&head->ns_ida);
 out_free_head:
 	kfree(head);
 out:
@@ -2895,6 +2899,12 @@ static int nvme_init_ns_head(struct nvme_ns *ns, unsigned nsid,
 	list_add_tail(&ns->siblings, &head->list);
 	ns->head = head;
 
+	ret = ida_simple_get(&head->ns_ida, 1, 0, GFP_KERNEL);
+	if (ret >= 0) {
+		ns->instance = ret;
+		ret = 0;
+	}
+
 out_unlock:
 	mutex_unlock(&ctrl->subsys->lock);
 	return ret;
@@ -3003,7 +3013,7 @@ static void nvme_alloc_ns(struct nvme_ctrl *ctrl, unsigned nsid)
 		flags = GENHD_FL_HIDDEN;
 	} else {
 		sprintf(disk_name, "nvme%dn%d", ctrl->subsys->instance,
-				ns->head->instance);
+				ns->instance);
 	}
 #else
 	/*
@@ -3059,6 +3069,7 @@ static void nvme_alloc_ns(struct nvme_ctrl *ctrl, unsigned nsid)
 	mutex_lock(&ctrl->subsys->lock);
 	list_del_rcu(&ns->siblings);
 	mutex_unlock(&ctrl->subsys->lock);
+	ida_simple_remove(&ns->head->ns_ida, ns->instance);
  out_free_id:
 	kfree(id);
  out_free_queue:
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index ced5b3d3269d..3f95b2f7ecf3 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -252,6 +252,7 @@ struct nvme_ns_head {
 	spinlock_t		requeue_lock;
 	struct work_struct	requeue_work;
 #endif
+	struct ida		ns_ida;
 	struct list_head	list;
 	struct srcu_struct      srcu;
 	struct nvme_subsystem	*subsys;
@@ -282,6 +283,7 @@ struct nvme_ns {
 	struct kref kref;
 	struct nvme_ns_head *head;
 
+	int instance;
 	int lba_shift;
 	u16 ms;
 	u16 sgs;
--

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* 'modprobe nvme_core multipath=N' crashes in face of multipath fabric
  2018-04-11 17:58     ` Keith Busch
@ 2018-04-11 21:46       ` Mike Snitzer
  2018-04-12  8:12         ` Sagi Grimberg
  0 siblings, 1 reply; 7+ messages in thread
From: Mike Snitzer @ 2018-04-11 21:46 UTC (permalink / raw)


On Wed, Apr 11 2018 at  1:58pm -0400,
Keith Busch <keith.busch@intel.com> wrote:

> On Wed, Apr 11, 2018@10:17:21AM -0400, Mike Snitzer wrote:
> > On Wed, Apr 11 2018 at  9:45am -0400,
> > Keith Busch <keith.busch@intel.com> wrote:
> > 
> > > On Tue, Apr 10, 2018@04:49:53PM -0400, Mike Snitzer wrote:
> > > > This isn't new since the 4.17 merge or anything, I first noticed this
> > > > issue existed while using a 4.16-rc4 kernel.
> > > > 
> > > > modprobe nvme_core multipath=N
> > > 
> > > Thanks for the notice.
> > > 
> > > There is definitely a bug here when CONFIG_NVME_MULTIPATH=y but nvme_core
> > > multipath is disabled in the presence of shared namespaces. I think we'd
> > > need each namespace to get a different "head" out of the subsystem in
> > > this case, but it may take a moment for me to detangle this.
> > 
> > No problem, it certainly isn't something I could tackle any quicker ;)
> > 
> > Thanks for looking to resolve this though.
> 
> It doesn't look like we'll be able to allocate new namespace 'heads'
> here without complicating this even more.
> 
> The below should fix the naming collision by getting new instances for
> each namespace that attaches to a head. I'm not sure this is much better,
> but maybe Christoph will have a better suggestion.

This patch fixed the issue for me.  Verified that modprobe nvme_core
multipath=N and multipath=Y works with the mptest case I shared in my
original report.

If you do go with this patch, please feel free to add:

Tested-by: Mike Snitzer <snitzer at redhat.com>

Thanks,
Mike

^ permalink raw reply	[flat|nested] 7+ messages in thread

* 'modprobe nvme_core multipath=N' crashes in face of multipath fabric
  2018-04-11 21:46       ` Mike Snitzer
@ 2018-04-12  8:12         ` Sagi Grimberg
  2018-04-12 22:28           ` Keith Busch
  0 siblings, 1 reply; 7+ messages in thread
From: Sagi Grimberg @ 2018-04-12  8:12 UTC (permalink / raw)




On 04/12/2018 12:46 AM, Mike Snitzer wrote:
> On Wed, Apr 11 2018 at  1:58pm -0400,
> Keith Busch <keith.busch@intel.com> wrote:
> 
>> On Wed, Apr 11, 2018@10:17:21AM -0400, Mike Snitzer wrote:
>>> On Wed, Apr 11 2018 at  9:45am -0400,
>>> Keith Busch <keith.busch@intel.com> wrote:
>>>
>>>> On Tue, Apr 10, 2018@04:49:53PM -0400, Mike Snitzer wrote:
>>>>> This isn't new since the 4.17 merge or anything, I first noticed this
>>>>> issue existed while using a 4.16-rc4 kernel.
>>>>>
>>>>> modprobe nvme_core multipath=N
>>>>
>>>> Thanks for the notice.
>>>>
>>>> There is definitely a bug here when CONFIG_NVME_MULTIPATH=y but nvme_core
>>>> multipath is disabled in the presence of shared namespaces. I think we'd
>>>> need each namespace to get a different "head" out of the subsystem in
>>>> this case, but it may take a moment for me to detangle this.
>>>
>>> No problem, it certainly isn't something I could tackle any quicker ;)
>>>
>>> Thanks for looking to resolve this though.
>>
>> It doesn't look like we'll be able to allocate new namespace 'heads'
>> here without complicating this even more.
>>
>> The below should fix the naming collision by getting new instances for
>> each namespace that attaches to a head. I'm not sure this is much better,
>> but maybe Christoph will have a better suggestion.
> 
> This patch fixed the issue for me.  Verified that modprobe nvme_core
> multipath=N and multipath=Y works with the mptest case I shared in my
> original report.
> 
> If you do go with this patch, please feel free to add:
> 
> Tested-by: Mike Snitzer <snitzer at redhat.com>

This looks good to me too, you can add my

Reviewed-by: Sagi Grimberg <sagi at grimberg.me>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* 'modprobe nvme_core multipath=N' crashes in face of multipath fabric
  2018-04-12  8:12         ` Sagi Grimberg
@ 2018-04-12 22:28           ` Keith Busch
  0 siblings, 0 replies; 7+ messages in thread
From: Keith Busch @ 2018-04-12 22:28 UTC (permalink / raw)


On Thu, Apr 12, 2018@11:12:12AM +0300, Sagi Grimberg wrote:
> On 04/12/2018 12:46 AM, Mike Snitzer wrote:
> > This patch fixed the issue for me.  Verified that modprobe nvme_core
> > multipath=N and multipath=Y works with the mptest case I shared in my
> > original report.
> > 
> > If you do go with this patch, please feel free to add:
> > 
> > Tested-by: Mike Snitzer <snitzer at redhat.com>
> 
> This looks good to me too, you can add my
> 
> Reviewed-by: Sagi Grimberg <sagi at grimberg.me>

I'm going to have to make one minor change here since this will still
break if you've a multi-path subsystem with multiple namespaces. I'll
just need to move the IDA up to the subsystem level instead of on the
subsystem heads, but otherwise much the same.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2018-04-12 22:28 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-10 20:49 'modprobe nvme_core multipath=N' crashes in face of multipath fabric Mike Snitzer
2018-04-11 13:45 ` Keith Busch
2018-04-11 14:17   ` Mike Snitzer
2018-04-11 17:58     ` Keith Busch
2018-04-11 21:46       ` Mike Snitzer
2018-04-12  8:12         ` Sagi Grimberg
2018-04-12 22:28           ` Keith Busch

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.