* [PATCH] nvme: generate uevent once a multipath namespace is operational again
From: Hannes Reinecke @ 2021-05-05 10:33 UTC
To: Christoph Hellwig; +Cc: Sagi Grimberg, Keith Busch, linux-nvme, Hannes Reinecke
In an all paths down scenario I/O will be requeued or aborted, so no
further I/O will be ongoing on this namespace.
This leaves upper layers like MD unable to determine if the namespace
becomes operational again after a successful controller reset.
This patch will send an uevent per multipathed namespace once the
underlying controller is LIVE, allowing MD to start resync.
Signed-off-by: Hannes Reinecke <hare@suse.de>
---
drivers/nvme/host/multipath.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 0551796517e6..f099897aea59 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -100,8 +100,12 @@ void nvme_kick_requeue_lists(struct nvme_ctrl *ctrl)
 	down_read(&ctrl->namespaces_rwsem);
 	list_for_each_entry(ns, &ctrl->namespaces, list) {
-		if (ns->head->disk)
-			kblockd_schedule_work(&ns->head->requeue_work);
+		if (!ns->head->disk)
+			continue;
+		kblockd_schedule_work(&ns->head->requeue_work);
+		if (ctrl->state == NVME_CTRL_LIVE)
+			kobject_uevent(&disk_to_dev(ns->head->disk)->kobj,
+				       KOBJ_CHANGE);
 	}
 	up_read(&ctrl->namespaces_rwsem);
 }
--
2.29.2
_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme
* Re: [PATCH] nvme: generate uevent once a multipath namespace is operational again
From: Christoph Hellwig @ 2021-05-06 7:37 UTC
To: Hannes Reinecke; +Cc: Christoph Hellwig, Sagi Grimberg, Keith Busch, linux-nvme
On Wed, May 05, 2021 at 12:33:05PM +0200, Hannes Reinecke wrote:
> In an all paths down scenario I/O will be requeued or aborted, so no
> further I/O will be ongoing on this namespace.
> This leaves upper layers like MD unable to determine if the namespace
> becomes operational again after a successful controller reset.
> This patch will send an uevent per multipathed namespace once the
> underlying controller is LIVE, allowing MD to start resync.
Do we have any documentation or other examples for this KOBJ_CHANGE
magic? I've seen it in a few places, but it always seemed rather
cargo cult to me. If you have more insight, any chance you could
document it?
> +		if (ctrl->state == NVME_CTRL_LIVE)
> +			kobject_uevent(&disk_to_dev(ns->head->disk)->kobj,
> +				       KOBJ_CHANGE);
Also this should probably use disk_uevent to also notify partitions.
Maybe also for other existing callers.
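For illustration, the loop body with that suggestion applied might look roughly like the fragment below. This is an untested sketch, not a standalone program: it assumes the block layer's disk_uevent(struct gendisk *, enum kobject_action) helper is available to the driver, and it only compiles inside the kernel tree.

```c
	list_for_each_entry(ns, &ctrl->namespaces, list) {
		if (!ns->head->disk)
			continue;
		kblockd_schedule_work(&ns->head->requeue_work);
		/*
		 * Assumption: disk_uevent() emits the event for the whole
		 * disk and for each of its partitions, so no explicit
		 * kobject_uevent() on the disk device is needed here.
		 */
		if (ctrl->state == NVME_CTRL_LIVE)
			disk_uevent(ns->head->disk, KOBJ_CHANGE);
	}
```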
* Re: [PATCH] nvme: generate uevent once a multipath namespace is operational again
From: Hannes Reinecke @ 2021-05-06 8:48 UTC
To: Christoph Hellwig; +Cc: Sagi Grimberg, Keith Busch, linux-nvme
On 5/6/21 9:37 AM, Christoph Hellwig wrote:
> On Wed, May 05, 2021 at 12:33:05PM +0200, Hannes Reinecke wrote:
>> In an all paths down scenario I/O will be requeued or aborted, so no
>> further I/O will be ongoing on this namespace.
>> This leaves upper layers like MD unable to determine if the namespace
>> becomes operational again after a successful controller reset.
>> This patch will send an uevent per multipathed namespace once the
>> underlying controller is LIVE, allowing MD to start resync.
>
> Do we have any documentation or other examples for this KOBJ_CHANGE
> magic? I've seen it in a few places, but it always seemed rather
> cargo cult to me. If you have more insight, any chance you could
> document it?
>
It's not precisely cargo cult, but it is definitely not well documented.
Currently there is an ambiguity between 'KOBJ_ADD' and 'KOBJ_CHANGE';
some devices will only send KOBJ_ADD, others (most notably S/390 DASDs
and device-mapper devices) will send KOBJ_CHANGE to indicate that the
device is now live and ready for use.
Sadly there is no indicator telling you whether a particular device
sends KOBJ_CHANGE at all, so one really has to know what to look
out for.
But the general rule is that 'KOBJ_CHANGE' indicates that a device is
now ready for use; KOBJ_ADD indicates that a device has been added to
the system and _might_ indicate that the device is ready to use.
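Seen from userspace, these events arrive as a run of NUL-terminated KEY=value strings (for example when read from a NETLINK_KOBJECT_UEVENT socket), and KOBJ_CHANGE shows up as ACTION=change. The sketch below shows one way a listener might pick out block-device change events from such a payload. The helper names uevent_get() and is_block_change() are made up for illustration and are not part of any library; reading from the socket is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Look up `key` in a uevent payload made of consecutive NUL-terminated
 * strings. Returns a pointer to the value (the text after "key="), or
 * NULL if the key is absent. Strings without a matching "key=" prefix,
 * such as the leading "action@devpath" header, are skipped.
 */
const char *uevent_get(const char *buf, size_t len, const char *key)
{
	size_t klen, off = 0;

	assert(buf != NULL && key != NULL);
	klen = strlen(key);

	while (off < len) {
		const char *s = buf + off;
		const char *end = memchr(s, '\0', len - off);
		size_t slen;

		if (!end)
			break;	/* unterminated tail: ignore it */
		slen = (size_t)(end - s);
		if (slen > klen && s[klen] == '=' && memcmp(s, key, klen) == 0)
			return s + klen + 1;
		off += slen + 1;
	}
	return NULL;
}

/* Returns 1 if the payload is a "change" event from the block subsystem. */
int is_block_change(const char *buf, size_t len)
{
	const char *action = uevent_get(buf, len, "ACTION");
	const char *subsys = uevent_get(buf, len, "SUBSYSTEM");

	return action && subsys &&
	       strcmp(action, "change") == 0 &&
	       strcmp(subsys, "block") == 0;
}
```

A listener built this way still cannot tell whether a given driver ever sends a change event, which is the ambiguity described above; it can only react when one does arrive.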
>> +		if (ctrl->state == NVME_CTRL_LIVE)
>> +			kobject_uevent(&disk_to_dev(ns->head->disk)->kobj,
>> +				       KOBJ_CHANGE);
>
> Also this should probably use disk_uevent to also notify partitions.
> Maybe also for other existing callers.
>
Indeed, you are correct. Will be fixing it.
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@suse.de +49 911 74053 688
SUSE Software Solutions Germany GmbH, 90409 Nürnberg
GF: F. Imendörffer, HRB 36809 (AG Nürnberg)