On Jul 7 18:56, Klaus Jensen wrote: >On Jul 7 17:57, Hannes Reinecke wrote: >>On 7/7/21 5:49 PM, Klaus Jensen wrote: >>>From: Klaus Jensen >>> >>>Prior to this patch the nvme-ns devices are always children of the >>>NvmeBus owned by the NvmeCtrl. This causes the namespaces to be >>>unrealized when the parent device is removed. However, when subsystems >>>are involved, this is not what we want since the namespaces may be >>>attached to other controllers as well. >>> >>>This patch adds an additional NvmeBus on the subsystem device. When >>>nvme-ns devices are realized, if the parent controller device is linked >>>to a subsystem, the parent bus is set to the subsystem one instead. This >>>makes sure that namespaces are kept alive and not unrealized. >>> >>>Signed-off-by: Klaus Jensen >>>--- >>> hw/nvme/nvme.h | 15 ++++++++------- >>> hw/nvme/ctrl.c | 14 ++++++-------- >>> hw/nvme/ns.c | 18 ++++++++++++++++++ >>> hw/nvme/subsys.c | 3 +++ >>> 4 files changed, 35 insertions(+), 15 deletions(-) >>> >>>diff --git a/hw/nvme/nvme.h b/hw/nvme/nvme.h >>>index c4065467d877..83ffabade4cf 100644 >>>--- a/hw/nvme/nvme.h >>>+++ b/hw/nvme/nvme.h >>>@@ -33,12 +33,20 @@ QEMU_BUILD_BUG_ON(NVME_MAX_NAMESPACES > NVME_NSID_BROADCAST - 1); >>> typedef struct NvmeCtrl NvmeCtrl; >>> typedef struct NvmeNamespace NvmeNamespace; >>>+#define TYPE_NVME_BUS "nvme-bus" >>>+OBJECT_DECLARE_SIMPLE_TYPE(NvmeBus, NVME_BUS) >>>+ >>>+typedef struct NvmeBus { >>>+ BusState parent_bus; >>>+} NvmeBus; >>>+ >>> #define TYPE_NVME_SUBSYS "nvme-subsys" >>> #define NVME_SUBSYS(obj) \ >>> OBJECT_CHECK(NvmeSubsystem, (obj), TYPE_NVME_SUBSYS) >>> typedef struct NvmeSubsystem { >>> DeviceState parent_obj; >>>+ NvmeBus bus; >>> uint8_t subnqn[256]; >>> NvmeCtrl *ctrls[NVME_MAX_CONTROLLERS]; >>>@@ -365,13 +373,6 @@ typedef struct NvmeCQueue { >>> QTAILQ_HEAD(, NvmeRequest) req_list; >>> } NvmeCQueue; >>>-#define TYPE_NVME_BUS "nvme-bus" >>>-#define NVME_BUS(obj) OBJECT_CHECK(NvmeBus, (obj), TYPE_NVME_BUS) >>>- >>>-typedef struct NvmeBus { >>>- BusState parent_bus; >>>-} NvmeBus; >>>- >>> #define TYPE_NVME "nvme" >>> #define NVME(obj) \ >>> OBJECT_CHECK(NvmeCtrl, (obj), TYPE_NVME) >>>diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c >>>index 90e3ee2b70ee..9a3b3a27c293 100644 >>>--- a/hw/nvme/ctrl.c >>>+++ b/hw/nvme/ctrl.c >>>@@ -6514,16 +6514,14 @@ static void nvme_exit(PCIDevice *pci_dev) >>> nvme_ctrl_reset(n); >>>- for (i = 1; i <= NVME_MAX_NAMESPACES; i++) { >>>- ns = nvme_ns(n, i); >>>- if (!ns) { >>>- continue; >>>+ if (n->subsys) { >>>+ for (i = 1; i <= NVME_MAX_NAMESPACES; i++) { >>>+ ns = nvme_ns(n, i); >>>+ if (ns) { >>>+ ns->attached--; >>>+ } >>> } >>>- nvme_ns_cleanup(ns); >> >>So who is removing the namespaces, then? >>I would have expected some cleanup action from the subsystem, seeing >>that we reparent to that ... >> > >Since we "move" the namespaces to the subsystem, and since the >subsystem is non-hotpluggable, they will (and can) not be removed. In >the case that there is no subsystem, nvme_ns_unrealize() will be >called for each child namespace on the controller NvmeBus. > >>>- } >>>- >>>- if (n->subsys) { >>> nvme_subsys_unregister_ctrl(n->subsys, n); >>> } >>>diff --git a/hw/nvme/ns.c b/hw/nvme/ns.c >>>index 3c4f5b8c714a..b7cf1494e75b 100644 >>>--- a/hw/nvme/ns.c >>>+++ b/hw/nvme/ns.c >>>@@ -441,6 +441,15 @@ void nvme_ns_cleanup(NvmeNamespace *ns) >>> } >>> } >>>+static void nvme_ns_unrealize(DeviceState *dev) >>>+{ >>>+ NvmeNamespace *ns = NVME_NS(dev); >>>+ >>>+ nvme_ns_drain(ns); >>>+ nvme_ns_shutdown(ns); >>>+ nvme_ns_cleanup(ns); >>>+} >>>+ >>> static void nvme_ns_realize(DeviceState *dev, Error **errp) >>> { >>> NvmeNamespace *ns = NVME_NS(dev); >>>@@ -462,6 +471,14 @@ static void nvme_ns_realize(DeviceState *dev, Error **errp) >>> "linked to an nvme-subsys device"); >>> return; >>> } >>>+ } else { >>>+ /* >>>+ * If this namespace belongs to a subsystem (through a link on the >>>+ * controller device), reparent the device. >>>+ */ >>>+ if (!qdev_set_parent_bus(dev, &subsys->bus.parent_bus, errp)) { >>>+ return; >>>+ } >> >>What happens if that fails? >>Will we abort? Not create the namespace? >> > >Good point! > >It can actually only fail if the bus implements check_address(), which >it does not, so it always succeeds, so it should assert instead. > Nah, the 'if' is fine. If check_address() should be implemented at some point, errp will be set and invocation of qemu will stop with an error. So I think the error handling is fine as-is.