* [PATCH 1/1] nvme-multipath: show wrong nvme ns after dlpar a nvme device
@ 2022-05-27 15:49 wenxiong
2022-05-27 20:52 ` Keith Busch
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: wenxiong @ 2022-05-27 15:49 UTC (permalink / raw)
To: hch; +Cc: kbusch, linux-nvme, hare, wenxiong, Wen Xiong
From: Wen Xiong <wenxiong@linux.ibm.com>
If we have an active partition/namespace on nvme device, nvme device driver
won’t release these controller IDs/Namespaces IDs when dlpar remove.
So we got the wrong nvme devices names if nvme device driver still use old
subsystem id as controller id when dlpar add it back. We expect to see new
nvme devices with new controller IDs/Namespace IDs.
For example, we have a nvme U2 device with 4 namespaces in linux,
After system boots up,
Controller id: 0
Subsystem id: 0
namespaces ids: 1, 2, 3 and 4
nvme devices : nvme0n1, nvme0n2, nvme0n3 and nvme0n4
Dlpar remove with active partitions on nvme0n1,
Controller id : ID=0 still hold reference, ID=0 won’t be removed, but /dev/nvme0 not exists anymore
Subsystem id : 0(nvme-subsys0 still exist in nvme list-subsys)
Namespace ids: ID=1 still hold reference, ID=1 won’t be removed, but /dev/nvme0n1 not exists anymore
Nvme devices: all nvme devices are gone in /dev
Dlpar Add back
Controller id: 1
Subsystem id : 0 still use old controller id=0
namespaces ids: 2, 3, 4 and 5
nvme devices: We saw "nvme0n2, nvme0n3, nvme0n4, nvme0n5" in /dev
/dev/nvme0 not exist anymore,
We expect to see: nvme1n2, nvme1n3, nvme1n4 and nvme1n5
The patch fixes the issue.
Signed-off-by: Wen Xiong <wenxiong@linux.ibm.com>
---
drivers/nvme/host/multipath.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index d464fdf978fb..945974ceef3c 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -495,7 +495,7 @@ int nvme_mpath_alloc_disk(struct nvme_ctrl *ctrl, struct nvme_ns_head *head)
head->disk->fops = &nvme_ns_head_ops;
head->disk->private_data = head;
sprintf(head->disk->disk_name, "nvme%dn%d",
- ctrl->subsys->instance, head->instance);
+ ctrl->instance, head->instance);
blk_queue_flag_set(QUEUE_FLAG_NONROT, head->disk->queue);
blk_queue_flag_set(QUEUE_FLAG_NOWAIT, head->disk->queue);
--
2.31.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH 1/1] nvme-multipath: show wrong nvme ns after dlpar a nvme device
2022-05-27 15:49 [PATCH 1/1] nvme-multipath: show wrong nvme ns after dlpar a nvme device wenxiong
@ 2022-05-27 20:52 ` Keith Busch
2022-05-28 5:17 ` Christoph Hellwig
2022-05-28 12:16 ` Hannes Reinecke
2 siblings, 0 replies; 4+ messages in thread
From: Keith Busch @ 2022-05-27 20:52 UTC (permalink / raw)
To: wenxiong; +Cc: hch, linux-nvme, hare, wenxiong
On Fri, May 27, 2022 at 10:49:37AM -0500, wenxiong@linux.ibm.com wrote:
> From: Wen Xiong <wenxiong@linux.ibm.com>
>
> If we have an active partition/namespace on nvme device, nvme device driver
> won’t release these controller IDs/Namespaces IDs when dlpar remove.
> So we got the wrong nvme devices names if nvme device driver still use old
> subsystem id as controller id when dlpar add it back. We expect to see new
> nvme devices with new controller IDs/Namespace IDs.
It was done this way on purpose. You can dynamically detach and attach
namespaces to any controller in the subsystem, so naming the namespace after
the controller instance doesn't help.
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH 1/1] nvme-multipath: show wrong nvme ns after dlpar a nvme device
2022-05-27 15:49 [PATCH 1/1] nvme-multipath: show wrong nvme ns after dlpar a nvme device wenxiong
2022-05-27 20:52 ` Keith Busch
@ 2022-05-28 5:17 ` Christoph Hellwig
2022-05-28 12:16 ` Hannes Reinecke
2 siblings, 0 replies; 4+ messages in thread
From: Christoph Hellwig @ 2022-05-28 5:17 UTC (permalink / raw)
To: wenxiong; +Cc: hch, kbusch, linux-nvme, hare, wenxiong
On Fri, May 27, 2022 at 10:49:37AM -0500, wenxiong@linux.ibm.com wrote:
> From: Wen Xiong <wenxiong@linux.ibm.com>
>
> If we have an active partition/namespace on nvme device, nvme device driver
> won’t release these controller IDs/Namespaces IDs when dlpar remove.
What is a dlpar?
> So we got the wrong nvme devices names if nvme device driver still use old
> subsystem id as controller id when dlpar add it back. We expect to see new
> nvme devices with new controller IDs/Namespace IDs.
What is wrong?
> diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
> index d464fdf978fb..945974ceef3c 100644
> --- a/drivers/nvme/host/multipath.c
> +++ b/drivers/nvme/host/multipath.c
> @@ -495,7 +495,7 @@ int nvme_mpath_alloc_disk(struct nvme_ctrl *ctrl, struct nvme_ns_head *head)
> head->disk->fops = &nvme_ns_head_ops;
> head->disk->private_data = head;
> sprintf(head->disk->disk_name, "nvme%dn%d",
> - ctrl->subsys->instance, head->instance);
> + ctrl->instance, head->instance);
No, this is broken. The name needs to refer to the subsystem instance
as otherwise we will run into naming conflicts.
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH 1/1] nvme-multipath: show wrong nvme ns after dlpar a nvme device
2022-05-27 15:49 [PATCH 1/1] nvme-multipath: show wrong nvme ns after dlpar a nvme device wenxiong
2022-05-27 20:52 ` Keith Busch
2022-05-28 5:17 ` Christoph Hellwig
@ 2022-05-28 12:16 ` Hannes Reinecke
2 siblings, 0 replies; 4+ messages in thread
From: Hannes Reinecke @ 2022-05-28 12:16 UTC (permalink / raw)
To: wenxiong, hch; +Cc: kbusch, linux-nvme, wenxiong
On 5/27/22 17:49, wenxiong@linux.ibm.com wrote:
> From: Wen Xiong <wenxiong@linux.ibm.com>
>
> If we have an active partition/namespace on nvme device, nvme device driver
> won’t release these controller IDs/Namespaces IDs when dlpar remove.
> So we got the wrong nvme devices names if nvme device driver still use old
> subsystem id as controller id when dlpar add it back. We expect to see new
> nvme devices with new controller IDs/Namespace IDs.
>
> For example, we have a nvme U2 device with 4 namespaces in linux,
>
> After system boots up,
> Controller id: 0
> Subsystem id: 0
> namespaces ids: 1, 2, 3 and 4
> nvme devices : nvme0n1, nvme0n2, nvme0n3 and nvme0n4
>
> Dlpar remove with active partitions on nvme0n1,
> Controller id : ID=0 still hold reference, ID=0 won’t be removed, but /dev/nvme0 not exists anymore
> Subsystem id : 0(nvme-subsys0 still exist in nvme list-subsys)
> Namespace ids: ID=1 still hold reference, ID=1 won’t be removed, but /dev/nvme0n1 not exists anymore
> Nvme devices: all nvme devices are gone in /dev
>
> Dlpar Add back
> Controller id: 1
> Subsystem id : 0 still use old controller id=0
> namespaces ids: 2, 3, 4 and 5
> nvme devices: We saw "nvme0n2, nvme0n3, nvme0n4, nvme0n5" in /dev
> /dev/nvme0 not exist anymore,
> We expect to see: nvme1n2, nvme1n3, nvme1n4 and nvme1n5
>
> The patch fixes the issue.
>
> Signed-off-by: Wen Xiong <wenxiong@linux.ibm.com>
> ---
> drivers/nvme/host/multipath.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
> index d464fdf978fb..945974ceef3c 100644
> --- a/drivers/nvme/host/multipath.c
> +++ b/drivers/nvme/host/multipath.c
> @@ -495,7 +495,7 @@ int nvme_mpath_alloc_disk(struct nvme_ctrl *ctrl, struct nvme_ns_head *head)
> head->disk->fops = &nvme_ns_head_ops;
> head->disk->private_data = head;
> sprintf(head->disk->disk_name, "nvme%dn%d",
> - ctrl->subsys->instance, head->instance);
> + ctrl->instance, head->instance);
>
> blk_queue_flag_set(QUEUE_FLAG_NONROT, head->disk->queue);
> blk_queue_flag_set(QUEUE_FLAG_NOWAIT, head->disk->queue);
No, this is wrong.
The nvme subsystem is an internal construct within the linux subsystem,
and the lifetime of that is determined by the lifetime of the associated
controllers and namespaces.
So as long as a controller object (!) or namespace object exists which
refer to the subsystem NQN, that subsystem object will exist.
When a controller is removed/detached by the dlpar the device nodes will
be removed from the system, and the controller object itself will be
freed/removed after the last reference drops. And only then will the
reference to the subsystem be dropped, causing the subsystem object to
be removed.
So if a new controller referring to the same subsystem NQN is connected
while the subsystem object is still present it will naturally select
that subsystem.
Everything is working as designed.
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2022-05-28 12:16 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-27 15:49 [PATCH 1/1] nvme-multipath: show wrong nvme ns after dlpar a nvme device wenxiong
2022-05-27 20:52 ` Keith Busch
2022-05-28 5:17 ` Christoph Hellwig
2022-05-28 12:16 ` Hannes Reinecke
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.