qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH 0/5] hw/block/nvme: support multi-path for ctrl/ns
@ 2021-01-15 12:05 Minwoo Im
  2021-01-15 12:05 ` [RFC PATCH 1/5] hw/block/nvme: add controller id parameter Minwoo Im
                   ` (5 more replies)
  0 siblings, 6 replies; 11+ messages in thread
From: Minwoo Im @ 2021-01-15 12:05 UTC (permalink / raw)
  To: qemu-devel, qemu-block
  Cc: Keith Busch, Klaus Jensen, Minwoo Im, Kevin Wolf, Max Reitz

Hello,

This series added support for multi-path I/O with multi-controllers and
namespace sharing.  By supporting these features, we can test Linux
kernel mpath(multi-path) code with this NVMe device.

Patches from the first to third added multi-controller support in a NVM
subsystem by adding a mpath.ctrl parameter to nvme device.  The rest of
the patches added namespace sharing support in a NVM subsystem with two
or more controllers by adding mpath.ns parameter to nvme-ns device.

Multi-path enabled in kernel with this series for two controllers with a
namespace:

  root@vm:~/work# nvme list -v
  NVM Express Subsystems

  Subsystem        Subsystem-NQN                                                                                    Controllers
  ---------------- ------------------------------------------------------------------------------------------------ ----------------
  nvme-subsys0     nqn.2019-08.org.qemu:serial                                                                      nvme0, nvme1

  NVM Express Controllers

  Device   SN                   MN                                       FR       TxPort Address        Subsystem    Namespaces
  -------- -------------------- ---------------------------------------- -------- ------ -------------- ------------ ----------------
  nvme0    serial               QEMU NVMe Ctrl                           1.0      pcie   0000:01:00.0   nvme-subsys0 nvme0n1
  nvme1    serial               QEMU NVMe Ctrl                           1.0      pcie   0000:02:00.0   nvme-subsys0 nvme0n1

  NVM Express Namespaces

  Device       NSID     Usage                      Format           Controllers
  ------------ -------- -------------------------- ---------------- ----------------
  nvme0n1      1        268.44  MB / 268.44  MB    512   B +  0 B   nvme0, nvme1

The reason why I put 'RFC' tag to this series is mostly about the last
patch "hw/block/nvme: add namespace sharing param for mpath".  It seems
like QEMU block backing device does not support to be shared among two
or more -device(s).  It means that we just can't give same drive=
property to multiple nvme-ns devices.  This patch has just let -device
maps to -drive one-to-one(1:1), but if namespae sharing is detected and
setup by the host kernel, then a single block device will be selected
for the NVM subsystem.  I'm not sure this is a good start for this
feature, so I put the RFC tag here.

Please kindly review!

Thanks,

Minwoo Im (5):
  hw/block/nvme: add controller id parameter
  nvme: add CMIC enum value for Identify Controller
  hw/block/nvme: add multi-controller param for mpath
  nvme: add NMIC enum value for Identify Namespace
  hw/block/nvme: add namespace sharing param for mpath

 hw/block/nvme-ns.c   | 14 ++++++++++++--
 hw/block/nvme-ns.h   |  2 ++
 hw/block/nvme.c      | 26 ++++++++++++++++++++++++++
 hw/block/nvme.h      |  2 ++
 include/block/nvme.h |  8 ++++++++
 5 files changed, 50 insertions(+), 2 deletions(-)

-- 
2.17.1



^ permalink raw reply	[flat|nested] 11+ messages in thread

* [RFC PATCH 1/5] hw/block/nvme: add controller id parameter
  2021-01-15 12:05 [RFC PATCH 0/5] hw/block/nvme: support multi-path for ctrl/ns Minwoo Im
@ 2021-01-15 12:05 ` Minwoo Im
  2021-01-15 12:05 ` [RFC PATCH 2/5] nvme: add CMIC enum value for Identify Controller Minwoo Im
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Minwoo Im @ 2021-01-15 12:05 UTC (permalink / raw)
  To: qemu-devel, qemu-block
  Cc: Keith Busch, Klaus Jensen, Minwoo Im, Kevin Wolf, Max Reitz

There is Contrller ID field in Identify Controller data structure and
nvme device has never set this field, 0 by default.

Added a parameter for controller identifier in a NVM subsystem.  This is
reflected to Identify Controller data structrue of the controller.  This
parameter is helpful when a user wants to set up multi-controller in a
NVM subsystem.

Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com>
---
 hw/block/nvme.c | 10 ++++++++++
 hw/block/nvme.h |  1 +
 2 files changed, 11 insertions(+)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index cf0fe28fe6eb..132e61c0ee7b 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -23,6 +23,7 @@
  *              max_ioqpairs=<N[optional]>, \
  *              aerl=<N[optional]>, aer_max_queued=<N[optional]>, \
  *              mdts=<N[optional]>,zoned.append_size_limit=<N[optional]> \
+ *              cntlid=<N[optional]>
  *      -device nvme-ns,drive=<drive_id>,bus=<bus_name>,nsid=<nsid>,\
  *              zoned=<true|false[optional]>
  *
@@ -50,6 +51,12 @@
  *   completion when there are no outstanding AERs. When the maximum number of
  *   enqueued events are reached, subsequent events will be dropped.
  *
+ * - `cntlid`
+ *   NVM subsystem unique controller identifier (default: 0).  This property
+ *   is used if a user wants to set up multi-controller in a NVM subsystem.
+ *   This value will be reported through Identify Controller data structure
+ *   with a field named CNTLID[79:78].
+ *
  * - `zoned.append_size_limit`
  *   The maximum I/O size in bytes that is allowed in Zone Append command.
  *   The default is 128KiB. Since internally this this value is maintained as
@@ -4275,6 +4282,8 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
 
     n->bar.vs = NVME_SPEC_VER;
     n->bar.intmc = n->bar.intms = 0;
+
+    id->cntlid = n->params.cntlid;
 }
 
 static void nvme_realize(PCIDevice *pci_dev, Error **errp)
@@ -4345,6 +4354,7 @@ static Property nvme_props[] = {
     DEFINE_BLOCK_PROPERTIES(NvmeCtrl, namespace.blkconf),
     DEFINE_PROP_LINK("pmrdev", NvmeCtrl, pmrdev, TYPE_MEMORY_BACKEND,
                      HostMemoryBackend *),
+    DEFINE_PROP_UINT32("cntlid", NvmeCtrl, params.cntlid, 0),
     DEFINE_PROP_STRING("serial", NvmeCtrl, params.serial),
     DEFINE_PROP_UINT32("cmb_size_mb", NvmeCtrl, params.cmb_size_mb, 0),
     DEFINE_PROP_UINT32("num_queues", NvmeCtrl, params.num_queues, 0),
diff --git a/hw/block/nvme.h b/hw/block/nvme.h
index b7fbcca39d9f..6aa9e89ac5a8 100644
--- a/hw/block/nvme.h
+++ b/hw/block/nvme.h
@@ -11,6 +11,7 @@
 
 typedef struct NvmeParams {
     char     *serial;
+    uint32_t cntlid;
     uint32_t num_queues; /* deprecated since 5.1 */
     uint32_t max_ioqpairs;
     uint16_t msix_qsize;
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [RFC PATCH 2/5] nvme: add CMIC enum value for Identify Controller
  2021-01-15 12:05 [RFC PATCH 0/5] hw/block/nvme: support multi-path for ctrl/ns Minwoo Im
  2021-01-15 12:05 ` [RFC PATCH 1/5] hw/block/nvme: add controller id parameter Minwoo Im
@ 2021-01-15 12:05 ` Minwoo Im
  2021-01-15 12:05 ` [RFC PATCH 3/5] hw/block/nvme: add multi-controller param for mpath Minwoo Im
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Minwoo Im @ 2021-01-15 12:05 UTC (permalink / raw)
  To: qemu-devel, qemu-block
  Cc: Keith Busch, Klaus Jensen, Minwoo Im, Kevin Wolf, Max Reitz

Added Controller Multi-path I/O and Namespace Sharing Capabilities
(CMIC) field to support multi-controller in the following patches.

This field is in Identify Controller data structure in [76].

Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com>
---
 include/block/nvme.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/include/block/nvme.h b/include/block/nvme.h
index 9494246f1f59..e83ec1e124c6 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -941,6 +941,10 @@ enum NvmeIdCtrlLpa {
     NVME_LPA_EXTENDED = 1 << 2,
 };
 
+enum NvmeIdCtrlCmic {
+    NVME_CMIC_MULTI_CTRL    = 1 << 1,
+};
+
 #define NVME_CTRL_SQES_MIN(sqes) ((sqes) & 0xf)
 #define NVME_CTRL_SQES_MAX(sqes) (((sqes) >> 4) & 0xf)
 #define NVME_CTRL_CQES_MIN(cqes) ((cqes) & 0xf)
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [RFC PATCH 3/5] hw/block/nvme: add multi-controller param for mpath
  2021-01-15 12:05 [RFC PATCH 0/5] hw/block/nvme: support multi-path for ctrl/ns Minwoo Im
  2021-01-15 12:05 ` [RFC PATCH 1/5] hw/block/nvme: add controller id parameter Minwoo Im
  2021-01-15 12:05 ` [RFC PATCH 2/5] nvme: add CMIC enum value for Identify Controller Minwoo Im
@ 2021-01-15 12:05 ` Minwoo Im
  2021-01-15 12:05 ` [RFC PATCH 4/5] nvme: add NMIC enum value for Identify Namespace Minwoo Im
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Minwoo Im @ 2021-01-15 12:05 UTC (permalink / raw)
  To: qemu-devel, qemu-block
  Cc: Keith Busch, Klaus Jensen, Minwoo Im, Kevin Wolf, Max Reitz

Added mpath.ctrl parameter to set multi-controller bit[1] in CMIC field
in Identify Controller data structure.  It will indicate that a NVM
subsystem can have two or more controllers in the subsystem.

To set up multi-controller in a NVM subsystem, user needs to give same
serial parameter to the controllers (-device), but different cntlid
parameter to each controllers (-device).

Example:

  -device nvme,ctrlid=0,serial=foo,...
  -device nvme,ctrlid=1,serial=foo,...

The example above prepares two different controllers in a NVM subsystem
with the same serial which leads to same subsystem NQN.

Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com>
---
 hw/block/nvme.c | 10 ++++++++++
 hw/block/nvme.h |  1 +
 2 files changed, 11 insertions(+)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 132e61c0ee7b..50b349cf9ea3 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -57,6 +57,11 @@
  *   This value will be reported through Identify Controller data structure
  *   with a field named CNTLID[79:78].
  *
+ * - `mpath.ctrl`
+ *   Multi-path I/O with multi-controller (default: false). A NVM subsystem
+ *   can hold two or more controllers.  This will be reflected to Identify
+ *   Controller data structure CMIC[76] field.
+ *
  * - `zoned.append_size_limit`
  *   The maximum I/O size in bytes that is allowed in Zone Append command.
  *   The default is 128KiB. Since internally this this value is maintained as
@@ -4284,6 +4289,10 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
     n->bar.intmc = n->bar.intms = 0;
 
     id->cntlid = n->params.cntlid;
+
+    if (n->params.mpath_ctrl) {
+        id->cmic |= NVME_CMIC_MULTI_CTRL;
+    }
 }
 
 static void nvme_realize(PCIDevice *pci_dev, Error **errp)
@@ -4364,6 +4373,7 @@ static Property nvme_props[] = {
     DEFINE_PROP_UINT32("aer_max_queued", NvmeCtrl, params.aer_max_queued, 64),
     DEFINE_PROP_UINT8("mdts", NvmeCtrl, params.mdts, 7),
     DEFINE_PROP_BOOL("use-intel-id", NvmeCtrl, params.use_intel_id, false),
+    DEFINE_PROP_BOOL("mpath.ctrl", NvmeCtrl, params.mpath_ctrl, false),
     DEFINE_PROP_SIZE32("zoned.append_size_limit", NvmeCtrl, params.zasl_bs,
                        NVME_DEFAULT_MAX_ZA_SIZE),
     DEFINE_PROP_END_OF_LIST(),
diff --git a/hw/block/nvme.h b/hw/block/nvme.h
index 6aa9e89ac5a8..73c9c2cff247 100644
--- a/hw/block/nvme.h
+++ b/hw/block/nvme.h
@@ -21,6 +21,7 @@ typedef struct NvmeParams {
     uint8_t  mdts;
     bool     use_intel_id;
     uint32_t zasl_bs;
+    bool     mpath_ctrl;
 } NvmeParams;
 
 typedef struct NvmeAsyncEvent {
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [RFC PATCH 4/5] nvme: add NMIC enum value for Identify Namespace
  2021-01-15 12:05 [RFC PATCH 0/5] hw/block/nvme: support multi-path for ctrl/ns Minwoo Im
                   ` (2 preceding siblings ...)
  2021-01-15 12:05 ` [RFC PATCH 3/5] hw/block/nvme: add multi-controller param for mpath Minwoo Im
@ 2021-01-15 12:05 ` Minwoo Im
  2021-01-15 12:05 ` [RFC PATCH 5/5] hw/block/nvme: add namespace sharing param for mpath Minwoo Im
  2021-01-15 13:57 ` [RFC PATCH 0/5] hw/block/nvme: support multi-path for ctrl/ns Klaus Jensen
  5 siblings, 0 replies; 11+ messages in thread
From: Minwoo Im @ 2021-01-15 12:05 UTC (permalink / raw)
  To: qemu-devel, qemu-block
  Cc: Keith Busch, Klaus Jensen, Minwoo Im, Kevin Wolf, Max Reitz

Added Namespace Multi-path I/O and Namespace Sharing Capabilities (NMIC)
field to support shared namespace from controller(s).

This field is in Identify Namespace data structure in [30].

Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com>
---
 include/block/nvme.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/include/block/nvme.h b/include/block/nvme.h
index e83ec1e124c6..dd7b5ba9ef4c 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -1110,6 +1110,10 @@ enum NvmeNsIdentifierType {
     NVME_NIDT_CSI               = 0x04,
 };
 
+enum NvmeNsNmic {
+    NVME_NMIC_NS_SHARED         = 1 << 0,
+};
+
 enum NvmeCsi {
     NVME_CSI_NVM                = 0x00,
     NVME_CSI_ZONED              = 0x02,
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [RFC PATCH 5/5] hw/block/nvme: add namespace sharing param for mpath
  2021-01-15 12:05 [RFC PATCH 0/5] hw/block/nvme: support multi-path for ctrl/ns Minwoo Im
                   ` (3 preceding siblings ...)
  2021-01-15 12:05 ` [RFC PATCH 4/5] nvme: add NMIC enum value for Identify Namespace Minwoo Im
@ 2021-01-15 12:05 ` Minwoo Im
  2021-01-15 13:57 ` [RFC PATCH 0/5] hw/block/nvme: support multi-path for ctrl/ns Klaus Jensen
  5 siblings, 0 replies; 11+ messages in thread
From: Minwoo Im @ 2021-01-15 12:05 UTC (permalink / raw)
  To: qemu-devel, qemu-block
  Cc: Keith Busch, Klaus Jensen, Minwoo Im, Kevin Wolf, Max Reitz

Added mpath.ns parameter to set multi-namespace bit[0] in NMIC field in
Identify Namespace data structure.  It will indicate that the namespace
can be shared by two or more controllers.

Example:

  To share the namespace between two controllers in a NVM subsystem,
first multi-controllers should be prepared with the same serial:

  -device nvme,id=nvme0,ctrlid=0,serial=foo,...
  -device nvme,id=nvme1,ctrlid=1,serial=foo,...

Then, we can prepare namespace device with mpath.ns enabled.  Also, we
must give the same UUID for those two devices with the same Namespace
ID.

  -device nvme-ns,uuid=<uuid>,bus=nvme0,nsid=1,...
  -device nvme-ns,uuid=<uuid>,bus=nvme1,nsid=1,...

Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com>
---
 hw/block/nvme-ns.c | 14 ++++++++++++--
 hw/block/nvme-ns.h |  2 ++
 hw/block/nvme.c    |  6 ++++++
 3 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c
index 274eaf61b721..cedacda9ebab 100644
--- a/hw/block/nvme-ns.c
+++ b/hw/block/nvme-ns.c
@@ -32,13 +32,18 @@
 
 #define MIN_DISCARD_GRANULARITY (4 * KiB)
 
-static int nvme_ns_init(NvmeNamespace *ns, Error **errp)
+static int nvme_ns_init(NvmeCtrl *n, NvmeNamespace *ns, Error **errp)
 {
     BlockDriverInfo bdi;
     NvmeIdNs *id_ns = &ns->id_ns;
     int lba_index = NVME_ID_NS_FLBAS_INDEX(ns->id_ns.flbas);
     int npdg;
 
+    if (!n->params.mpath_ctrl && ns->params.mpath_ns) {
+        error_setg(errp, "mpath.ctrl must be enabled for ns sharing");
+        return -1;
+    }
+
     ns->id_ns.dlfeat = 0x9;
 
     id_ns->lbaf[lba_index].ds = 31 - clz32(ns->blkconf.logical_block_size);
@@ -63,6 +68,10 @@ static int nvme_ns_init(NvmeNamespace *ns, Error **errp)
 
     id_ns->npda = id_ns->npdg = npdg - 1;
 
+    if (ns->params.mpath_ns) {
+        id_ns->nmic |= NVME_NMIC_NS_SHARED;
+    }
+
     return 0;
 }
 
@@ -314,7 +323,7 @@ int nvme_ns_setup(NvmeCtrl *n, NvmeNamespace *ns, Error **errp)
         return -1;
     }
 
-    if (nvme_ns_init(ns, errp)) {
+    if (nvme_ns_init(n, ns, errp)) {
         return -1;
     }
     if (ns->params.zoned) {
@@ -371,6 +380,7 @@ static Property nvme_ns_props[] = {
     DEFINE_BLOCK_PROPERTIES(NvmeNamespace, blkconf),
     DEFINE_PROP_UINT32("nsid", NvmeNamespace, params.nsid, 0),
     DEFINE_PROP_UUID("uuid", NvmeNamespace, params.uuid),
+    DEFINE_PROP_BOOL("mpath.ns", NvmeNamespace, params.mpath_ns, false),
     DEFINE_PROP_BOOL("zoned", NvmeNamespace, params.zoned, false),
     DEFINE_PROP_SIZE("zoned.zone_size", NvmeNamespace, params.zone_size_bs,
                      NVME_DEFAULT_ZONE_SIZE),
diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h
index f8f3c28c360b..d1f2518f7210 100644
--- a/hw/block/nvme-ns.h
+++ b/hw/block/nvme-ns.h
@@ -29,6 +29,8 @@ typedef struct NvmeNamespaceParams {
     uint32_t nsid;
     QemuUUID uuid;
 
+    bool     mpath_ns;
+
     bool     zoned;
     bool     cross_zone_read;
     uint64_t zone_size_bs;
diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 50b349cf9ea3..e4aa15345c12 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -62,6 +62,12 @@
  *   can hold two or more controllers.  This will be reflected to Identify
  *   Controller data structure CMIC[76] field.
  *
+ * Setting `mpath.ctrl` to true enables the following properties to configure
+ * multi-path for a namespace:
+ *     mpath.ns=<true|false, default: false>
+ *         If set to true, namespace sharing in a NVM subsystem is enabled.
+ *         This will be reflected to Identify Namespace data structure.
+ *
  * - `zoned.append_size_limit`
  *   The maximum I/O size in bytes that is allowed in Zone Append command.
  *   The default is 128KiB. Since internally this this value is maintained as
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 0/5] hw/block/nvme: support multi-path for ctrl/ns
  2021-01-15 12:05 [RFC PATCH 0/5] hw/block/nvme: support multi-path for ctrl/ns Minwoo Im
                   ` (4 preceding siblings ...)
  2021-01-15 12:05 ` [RFC PATCH 5/5] hw/block/nvme: add namespace sharing param for mpath Minwoo Im
@ 2021-01-15 13:57 ` Klaus Jensen
  2021-01-15 17:35   ` Keith Busch
  5 siblings, 1 reply; 11+ messages in thread
From: Klaus Jensen @ 2021-01-15 13:57 UTC (permalink / raw)
  To: Minwoo Im; +Cc: Keith Busch, Kevin Wolf, qemu-devel, qemu-block, Max Reitz

[-- Attachment #1: Type: text/plain, Size: 4552 bytes --]

On Jan 15 21:05, Minwoo Im wrote:
> Hello,
> 
> This series added support for multi-path I/O with multi-controllers and
> namespace sharing.  By supporting these features, we can test Linux
> kernel mpath(multi-path) code with this NVMe device.
> 
> Patches from the first to third added multi-controller support in a NVM
> subsystem by adding a mpath.ctrl parameter to nvme device.  The rest of
> the patches added namespace sharing support in a NVM subsystem with two
> or more controllers by adding mpath.ns parameter to nvme-ns device.
> 
> Multi-path enabled in kernel with this series for two controllers with a
> namespace:
> 
>   root@vm:~/work# nvme list -v
>   NVM Express Subsystems
> 
>   Subsystem        Subsystem-NQN                                                                                    Controllers
>   ---------------- ------------------------------------------------------------------------------------------------ ----------------
>   nvme-subsys0     nqn.2019-08.org.qemu:serial                                                                      nvme0, nvme1
> 
>   NVM Express Controllers
> 
>   Device   SN                   MN                                       FR       TxPort Address        Subsystem    Namespaces
>   -------- -------------------- ---------------------------------------- -------- ------ -------------- ------------ ----------------
>   nvme0    serial               QEMU NVMe Ctrl                           1.0      pcie   0000:01:00.0   nvme-subsys0 nvme0n1
>   nvme1    serial               QEMU NVMe Ctrl                           1.0      pcie   0000:02:00.0   nvme-subsys0 nvme0n1
> 
>   NVM Express Namespaces
> 
>   Device       NSID     Usage                      Format           Controllers
>   ------------ -------- -------------------------- ---------------- ----------------
>   nvme0n1      1        268.44  MB / 268.44  MB    512   B +  0 B   nvme0, nvme1
> 
> The reason why I put 'RFC' tag to this series is mostly about the last
> patch "hw/block/nvme: add namespace sharing param for mpath".  It seems
> like QEMU block backing device does not support to be shared among two
> or more -device(s).  It means that we just can't give same drive=
> property to multiple nvme-ns devices.  This patch has just let -device
> maps to -drive one-to-one(1:1), but if namespae sharing is detected and
> setup by the host kernel, then a single block device will be selected
> for the NVM subsystem.  I'm not sure this is a good start for this
> feature, so I put the RFC tag here.
> 
> Please kindly review!
> 

Hi Minwoo,

First - super awesome that we get this discussion going. I've been
hacking around this a couple of times, but I've never been happy with
the approach.

As you already mentioned, the problem I see with this approach is that
we have separate namespaces attached to separate controllers. So we are
faking it to the max and if I/O starts going through the other
controller we end up on a namespace that is unrelated (different data).
Havoc ensues.

My approach looks a lot like yours, but I hacked around this by adding
extra 'ctrl-0', 'ctrl-1', ..., link-parameters to the namespace device,
replacing the bus. This works well because the namespace then just
registers with multiple controllers. But adding more parameters like
that just isnt nice, so I've been trying to figure out how to allow a
parameter to be specified multiple times, so we could just do more
'ctrl'-parameters.

Alas, since I started thinking about namespace sharing I have been
regretting that I didn't reverse the bus-mechanic for namespace
attachment. It would align better with the theory of operation if it was
the controllers that attached to namespaces, and not the other way
around. So what I would actually really prefer, is that we had multiple
'ns' link parameters on the controller device.

   -device nvme-ns,id=a,nsid=1,...
   -device nvme-ns,id=b,nsid=2,...
   -device nvme-ns,id=c,nsid=3,...
   -device nvme,cntlid=0,serial=foo,ns=a,ns=b
   -device nvme,cntlid=1,serial=foo,ns=a,ns=c

But I havn't been able to figure out how to kick QOM into doing this.
And I'm definitely not sure this is the way to go. Should we instead
introduce a 'nvme-subsys' device and walk a bus?

I'd really appreciate some input on how we should model this if anyone
has any thoughts. And I think we should consider stuff like detached
namespaces as well. Support for Namespace Management. The whole shabang.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 0/5] hw/block/nvme: support multi-path for ctrl/ns
  2021-01-15 13:57 ` [RFC PATCH 0/5] hw/block/nvme: support multi-path for ctrl/ns Klaus Jensen
@ 2021-01-15 17:35   ` Keith Busch
  2021-01-15 17:47     ` Klaus Jensen
  0 siblings, 1 reply; 11+ messages in thread
From: Keith Busch @ 2021-01-15 17:35 UTC (permalink / raw)
  To: Klaus Jensen; +Cc: Kevin Wolf, Minwoo Im, qemu-devel, qemu-block, Max Reitz

On Fri, Jan 15, 2021 at 02:57:45PM +0100, Klaus Jensen wrote:
> 
> As you already mentioned, the problem I see with this approach is that
> we have separate namespaces attached to separate controllers. So we are
> faking it to the max and if I/O starts going through the other
> controller we end up on a namespace that is unrelated (different data).
> Havoc ensues.
> 
> My approach looks a lot like yours, but I hacked around this by adding
> extra 'ctrl-0', 'ctrl-1', ..., link-parameters to the namespace device,
> replacing the bus. This works well because the namespace then just
> registers with multiple controllers. But adding more parameters like
> that just isnt nice, so I've been trying to figure out how to allow a
> parameter to be specified multiple times, so we could just do more
> 'ctrl'-parameters.
> 
> Alas, since I started thinking about namespace sharing I have been
> regretting that I didn't reverse the bus-mechanic for namespace
> attachment. It would align better with the theory of operation if it was
> the controllers that attached to namespaces, and not the other way
> around. So what I would actually really prefer, is that we had multiple
> 'ns' link parameters on the controller device.

Would this work better if we introduce a new device in the nvme hierarchy:
the nvme-subsystem? You could attach multi-path namespaces and
controllers to that, and namespaces you don't want shared can attach
directly to controllers like they do today. You could also auto-assign
cntlid, and you wouldn't need to duplicate serial numbers in your
parameters.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 0/5] hw/block/nvme: support multi-path for ctrl/ns
  2021-01-15 17:35   ` Keith Busch
@ 2021-01-15 17:47     ` Klaus Jensen
  2021-01-15 17:53       ` Keith Busch
  2021-01-15 18:38       ` Minwoo Im
  0 siblings, 2 replies; 11+ messages in thread
From: Klaus Jensen @ 2021-01-15 17:47 UTC (permalink / raw)
  To: Keith Busch; +Cc: Kevin Wolf, Minwoo Im, qemu-devel, qemu-block, Max Reitz

[-- Attachment #1: Type: text/plain, Size: 2618 bytes --]

On Jan 15 09:35, Keith Busch wrote:
> On Fri, Jan 15, 2021 at 02:57:45PM +0100, Klaus Jensen wrote:
> > 
> > As you already mentioned, the problem I see with this approach is that
> > we have separate namespaces attached to separate controllers. So we are
> > faking it to the max and if I/O starts going through the other
> > controller we end up on a namespace that is unrelated (different data).
> > Havoc ensues.
> > 
> > My approach looks a lot like yours, but I hacked around this by adding
> > extra 'ctrl-0', 'ctrl-1', ..., link-parameters to the namespace device,
> > replacing the bus. This works well because the namespace then just
> > registers with multiple controllers. But adding more parameters like
> > that just isnt nice, so I've been trying to figure out how to allow a
> > parameter to be specified multiple times, so we could just do more
> > 'ctrl'-parameters.
> > 
> > Alas, since I started thinking about namespace sharing I have been
> > regretting that I didn't reverse the bus-mechanic for namespace
> > attachment. It would align better with the theory of operation if it was
> > the controllers that attached to namespaces, and not the other way
> > around. So what I would actually really prefer, is that we had multiple
> > 'ns' link parameters on the controller device.
> 
> Would this work better if we introduce a new device in the nvme hierarchy:
> the nvme-subsystem? You could attach multi-path namespaces and
> controllers to that, and namespaces you don't want shared can attach
> directly to controllers like they do today. You could also auto-assign
> cntlid, and you wouldn't need to duplicate serial numbers in your
> parameters.

I kinda POC'ed that, but I think I tried to make it work with a bus and
walking it and all kinds of fancy stuff.

I think it can just be a 'link' parameter, so something like:

  -device nvme-subsys,id=subsys0
  -device nvme,id=nvme0,subsys=subsys0
  -device nvme,id=nvme1,subsys=subsys0
  -device nvme-ns,id=shared-ns1,nsid=1,subsys=subsys0
  -device nvme-ns,id=private-ns2,nsid=2,bus=nvme0

When a controller "registers" with the subsystem it attaches to all
namespaces known, and when a namespace attaches, it attaches to all
controllers known. We can even add a 'detached' bool parameter to the
namespace and keep controllers from attaching, but allowing for later
attachment.

Cool!

Question: NSIDs must be the same on each controller for shared
namespaces, but can private namespaces "share" nsid across controllers
in the subsystem?  I don't think the spec is clear on that point.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 0/5] hw/block/nvme: support multi-path for ctrl/ns
  2021-01-15 17:47     ` Klaus Jensen
@ 2021-01-15 17:53       ` Keith Busch
  2021-01-15 18:38       ` Minwoo Im
  1 sibling, 0 replies; 11+ messages in thread
From: Keith Busch @ 2021-01-15 17:53 UTC (permalink / raw)
  To: Klaus Jensen; +Cc: Kevin Wolf, Minwoo Im, qemu-devel, qemu-block, Max Reitz

On Fri, Jan 15, 2021 at 06:47:20PM +0100, Klaus Jensen wrote:
> Cool!

I thought so too :)
 
> Question: NSIDs must be the same on each controller for shared
> namespaces, but can private namespaces "share" nsid across controllers
> in the subsystem?  I don't think the spec is clear on that point.

The namespace NSID has to be unique within the entire subsystem, whether
they're shared or private.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 0/5] hw/block/nvme: support multi-path for ctrl/ns
  2021-01-15 17:47     ` Klaus Jensen
  2021-01-15 17:53       ` Keith Busch
@ 2021-01-15 18:38       ` Minwoo Im
  1 sibling, 0 replies; 11+ messages in thread
From: Minwoo Im @ 2021-01-15 18:38 UTC (permalink / raw)
  To: Klaus Jensen; +Cc: Keith Busch, Kevin Wolf, qemu-devel, qemu-block, Max Reitz

On 21-01-15 18:47:20, Klaus Jensen wrote:
> On Jan 15 09:35, Keith Busch wrote:
> > On Fri, Jan 15, 2021 at 02:57:45PM +0100, Klaus Jensen wrote:
> > > 
> > > As you already mentioned, the problem I see with this approach is that
> > > we have separate namespaces attached to separate controllers. So we are
> > > faking it to the max and if I/O starts going through the other
> > > controller we end up on a namespace that is unrelated (different data).
> > > Havoc ensues.
> > > 
> > > My approach looks a lot like yours, but I hacked around this by adding
> > > extra 'ctrl-0', 'ctrl-1', ..., link-parameters to the namespace device,
> > > replacing the bus. This works well because the namespace then just
> > > registers with multiple controllers. But adding more parameters like
> > > that just isnt nice, so I've been trying to figure out how to allow a
> > > parameter to be specified multiple times, so we could just do more
> > > 'ctrl'-parameters.
> > > 
> > > Alas, since I started thinking about namespace sharing I have been
> > > regretting that I didn't reverse the bus-mechanic for namespace
> > > attachment. It would align better with the theory of operation if it was
> > > the controllers that attached to namespaces, and not the other way
> > > around. So what I would actually really prefer, is that we had multiple
> > > 'ns' link parameters on the controller device.
> > 
> > Would this work better if we introduce a new device in the nvme hierarchy:
> > the nvme-subsystem? You could attach multi-path namespaces and
> > controllers to that, and namespaces you don't want shared can attach
> > directly to controllers like they do today. You could also auto-assign
> > cntlid, and you wouldn't need to duplicate serial numbers in your
> > parameters.
> 
> I kinda POC'ed that, but I think I tried to make it work with a bus and
> walking it and all kinds of fancy stuff.
> 
> I think it can just be a 'link' parameter, so something like:
> 
>   -device nvme-subsys,id=subsys0

Do we have any plan for default subsys hierarchy?  Or is it going to be
a mandatory root node of nvme controllers and namespaces?

>   -device nvme,id=nvme0,subsys=subsys0
>   -device nvme,id=nvme1,subsys=subsys0
>   -device nvme-ns,id=shared-ns1,nsid=1,subsys=subsys0

In this case, what is the default set-up for shared-ns1?  Is this
namespace going to be ready right after the two nvme controllers being
realized?  If so, do we iterate all the namespace devices in the NVM
subsystem and attach them to this controller in the initial time?

If so, I agree with this approach.

>   -device nvme-ns,id=private-ns2,nsid=2,bus=nvme0

This must be the case what Keith mentioned of directly attaching to a
controller.  It looks nice.  But, one concerning point here is that, in
!shared namespace, if we don't specify 'subsys' property here to attach
it to directly to a controller, it means it implicitly will belong to
the subsys0 where the nvme0 belongs to.  It means that user should give
nsid different than 1 which is already shared.

So, how do we make subsys property as a mandatory for namespace device
and provide optional choice for bus.  If bus is given to a controller,
then it can mean a private namespace, otherwise it can be shared among
controllers in a subsystem.


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2021-01-15 18:46 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-15 12:05 [RFC PATCH 0/5] hw/block/nvme: support multi-path for ctrl/ns Minwoo Im
2021-01-15 12:05 ` [RFC PATCH 1/5] hw/block/nvme: add controller id parameter Minwoo Im
2021-01-15 12:05 ` [RFC PATCH 2/5] nvme: add CMIC enum value for Identify Controller Minwoo Im
2021-01-15 12:05 ` [RFC PATCH 3/5] hw/block/nvme: add multi-controller param for mpath Minwoo Im
2021-01-15 12:05 ` [RFC PATCH 4/5] nvme: add NMIC enum value for Identify Namespace Minwoo Im
2021-01-15 12:05 ` [RFC PATCH 5/5] hw/block/nvme: add namespace sharing param for mpath Minwoo Im
2021-01-15 13:57 ` [RFC PATCH 0/5] hw/block/nvme: support multi-path for ctrl/ns Klaus Jensen
2021-01-15 17:35   ` Keith Busch
2021-01-15 17:47     ` Klaus Jensen
2021-01-15 17:53       ` Keith Busch
2021-01-15 18:38       ` Minwoo Im

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).