* [PATCH 0/4] hw/block/nvme: Implement ZNS finish-zone ZDC AEN
@ 2022-10-21  0:18 clay.mayers
  2022-10-21  0:18 ` [PATCH 1/4] hw/block/nvme: add ZONE_FINISH_RECOMMENDED functionality clay.mayers
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: clay.mayers @ 2022-10-21  0:18 UTC (permalink / raw)
  To: qemu-devel
  Cc: Keith Busch, Klaus Jensen, Fam Zheng, Philippe Mathieu-Daudé

From: Clay Mayers <clay.mayers@kioxia.com>

ZNS controllers have the option to limit the time a zone can remain in
the active state.  It begins with a background process in the controller
setting the finish-zone-recommended (FZR) attribute for a zone.  As part
of setting this attribute, the zone's id is added to the namespace's
zone-descriptor-changed (ZDC) log page.  If enabled, items added to the
ZDC log page generate a ZDC asynchronous event notification (AEN).
Optionally, the controller can induce a "zone excursion", forcing the
zone into the finished state, which also generates a ZDC event.

Zone-enabled applications need to handle ZDC events properly.  In a real
device the timeout is many hours, which makes testing an application
difficult.  This series implements the generation of FZR ZDC events to
speed up OS and application testing.

Added to the zoned NVMe command set are: an optional, per-namespace
timer (zoned.finish_time) that sets the FZR attribute for long-lived
active zones; a per-namespace ZDC log page; AEN results that include
CQE.DW1 (the NSID of the ZDC AEN); and generation of a ZDC AEN when it
has been enabled.  Zone excursions are not modeled.

See section 5.5 of the NVMe Zoned Namespace Command Set Specification v1.1
for more details.
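
For reference, a guest configuration that exercises this behavior could
look like the following.  The image name and serial are illustrative;
``zoned.finish_time`` is the new per-namespace parameter added by this
series, and the other options are the existing zoned-namespace knobs:

```shell
# Illustrative invocation: a zoned namespace whose active zones get the
# Zone Finish Recommended attribute after 30 seconds.
qemu-system-x86_64 \
    -drive file=zns.img,id=nvm,format=raw,if=none \
    -device nvme,serial=deadbeef,id=nvme0 \
    -device nvme-ns,drive=nvm,bus=nvme0,zoned=true,zoned.finish_time=30
```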

Clay Mayers (4):
  hw/block/nvme: add ZONE_FINISH_RECOMMENDED functionality
  hw/block/nvme: add zone descriptor changed log page
  hw/block/nvme: supply dw1 for aen result
  hw/block/nvme: add zone descriptor changed AEN

 docs/system/devices/nvme.rst |   5 ++
 hw/nvme/ctrl.c               | 166 +++++++++++++++++++++++++++++++++--
 hw/nvme/ns.c                 |  15 ++++
 hw/nvme/nvme.h               |  37 +++++++-
 hw/nvme/trace-events         |   3 +-
 include/block/nvme.h         |  14 ++-
 6 files changed, 225 insertions(+), 15 deletions(-)

-- 
2.27.0




* [PATCH 1/4] hw/block/nvme: add ZONE_FINISH_RECOMMENDED functionality
  2022-10-21  0:18 [PATCH 0/4] hw/block/nvme: Implement ZNS finish-zone ZDC AEN clay.mayers
@ 2022-10-21  0:18 ` clay.mayers
  2022-10-21  6:36   ` Klaus Jensen
  2022-10-21  0:18 ` [PATCH 2/4] hw/block/nvme: add zone descriptor changed log page clay.mayers
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: clay.mayers @ 2022-10-21  0:18 UTC (permalink / raw)
  To: qemu-devel
  Cc: Keith Busch, Klaus Jensen, Fam Zheng, Philippe Mathieu-Daudé

From: Clay Mayers <clay.mayers@kioxia.com>

Adds ns.params.zoned.finish_time, which sets the number of
seconds a zone can remain active before the zone attribute
ZONE_FINISH_RECOMMENDED is set.

This requires scanning the exp open, imp open and closed lists
of zones whenever a zone is marked as requiring finishing.  The
expectation is that these lists will be short (tens of items),
allowing a simpler implementation than keeping the lists sorted.
It also confines the overhead to the exceptional case of a
timeout instead of paying it whenever zones change state between
open and closed.  For use cases where this isn't true,
finish_time should be left at 0 (the default), which disables
the feature.

Signed-off-by: Clay Mayers <clay.mayers@kioxia.com>
---
 docs/system/devices/nvme.rst |  5 +++++
 hw/nvme/ctrl.c               | 35 +++++++++++++++++++++++++++++++++++
 hw/nvme/ns.c                 |  8 ++++++++
 hw/nvme/nvme.h               | 18 ++++++++++++++----
 4 files changed, 62 insertions(+), 4 deletions(-)

diff --git a/docs/system/devices/nvme.rst b/docs/system/devices/nvme.rst
index 30f841ef62..1cb0ef844c 100644
--- a/docs/system/devices/nvme.rst
+++ b/docs/system/devices/nvme.rst
@@ -212,6 +212,11 @@ The namespace may be configured with additional parameters
   the minimum memory page size (CAP.MPSMIN). The default value (``0``)
   has this property inherit the ``mdts`` value.
 
+``zoned.finish_time=UINT32`` (default: ``0``)
+  Set the time in seconds for how long a zone can be active before setting the
+  zone attribute ``Zone Finish Recommended``.  The default value (``0``)
+  disables this feature.
+
 Metadata
 --------
 
diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index 87aeba0564..d7e9fae0b0 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -1516,6 +1516,38 @@ static void nvme_clear_events(NvmeCtrl *n, uint8_t event_type)
     }
 }
 
+static void nvme_check_finish(NvmeNamespace *ns, NvmeZoneListHead *list)
+{
+    int64_t now = qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL);
+    NvmeZone *zone;
+
+    QTAILQ_FOREACH(zone, list, entry) {
+        if (zone->finish_ms <= now) {
+            zone->finish_ms = INT64_MAX;
+            zone->d.za |= NVME_ZA_FINISH_RECOMMENDED;
+        } else if (zone->finish_ms != INT64_MAX) {
+            timer_mod_anticipate(ns->active_timer, zone->finish_ms);
+        }
+    }
+}
+
+void nvme_finish_needed(void *opaque)
+{
+    NvmeNamespace *ns = opaque;
+
+    nvme_check_finish(ns, &ns->exp_open_zones);
+    nvme_check_finish(ns, &ns->imp_open_zones);
+    nvme_check_finish(ns, &ns->closed_zones);
+}
+
+void nvme_set_active_timeout(NvmeNamespace *ns, NvmeZone *zone)
+{
+    if (ns->fto_ms) {
+        zone->finish_ms = qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + ns->fto_ms;
+        timer_mod_anticipate(ns->active_timer, zone->finish_ms);
+    }
+}
+
 static inline uint16_t nvme_check_mdts(NvmeCtrl *n, size_t len)
 {
     uint8_t mdts = n->params.mdts;
@@ -1791,6 +1823,7 @@ static uint16_t nvme_zrm_finish(NvmeNamespace *ns, NvmeZone *zone)
 
         /* fallthrough */
     case NVME_ZONE_STATE_EMPTY:
+        zone->d.za &= ~NVME_ZA_FINISH_RECOMMENDED;
         nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_FULL);
         return NVME_SUCCESS;
 
@@ -1891,6 +1924,7 @@ static uint16_t nvme_zrm_open_flags(NvmeCtrl *n, NvmeNamespace *ns,
 
         if (act) {
             nvme_aor_inc_active(ns);
+            nvme_set_active_timeout(ns, zone);
         }
 
         nvme_aor_inc_open(ns);
@@ -3619,6 +3653,7 @@ static uint16_t nvme_set_zd_ext(NvmeNamespace *ns, NvmeZone *zone)
             return status;
         }
         nvme_aor_inc_active(ns);
+        nvme_set_active_timeout(ns, zone);
         zone->d.za |= NVME_ZA_ZD_EXT_VALID;
         nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_CLOSED);
         return NVME_SUCCESS;
diff --git a/hw/nvme/ns.c b/hw/nvme/ns.c
index 62a1f97be0..b577f2d8e0 100644
--- a/hw/nvme/ns.c
+++ b/hw/nvme/ns.c
@@ -322,6 +322,11 @@ static void nvme_ns_init_zoned(NvmeNamespace *ns)
         ns->id_ns.nsfeat &= ~0x4;
     }
 
+    ns->fto_ms = ns->params.fto * INT64_C(1000);
+    if (ns->fto_ms) {
+        ns->active_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL, nvme_finish_needed,
+                                        ns);
+    }
     ns->id_ns_zoned = id_ns_z;
 }
 
@@ -338,6 +343,7 @@ static void nvme_clear_zone(NvmeNamespace *ns, NvmeZone *zone)
             nvme_set_zone_state(zone, NVME_ZONE_STATE_CLOSED);
         }
         nvme_aor_inc_active(ns);
+        nvme_set_active_timeout(ns, zone);
         QTAILQ_INSERT_HEAD(&ns->closed_zones, zone, entry);
     } else {
         trace_pci_nvme_clear_ns_reset(state, zone->d.zslba);
@@ -521,6 +527,7 @@ void nvme_ns_shutdown(NvmeNamespace *ns)
 void nvme_ns_cleanup(NvmeNamespace *ns)
 {
     if (ns->params.zoned) {
+        timer_free(ns->active_timer);
         g_free(ns->id_ns_zoned);
         g_free(ns->zone_array);
         g_free(ns->zd_extensions);
@@ -644,6 +651,7 @@ static Property nvme_ns_props[] = {
     DEFINE_PROP_SIZE("zoned.zrwafg", NvmeNamespace, params.zrwafg, -1),
     DEFINE_PROP_BOOL("eui64-default", NvmeNamespace, params.eui64_default,
                      false),
+    DEFINE_PROP_UINT32("zoned.finish_time", NvmeNamespace, params.fto, 0),
     DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/hw/nvme/nvme.h b/hw/nvme/nvme.h
index 79f5c281c2..9a54dcdb32 100644
--- a/hw/nvme/nvme.h
+++ b/hw/nvme/nvme.h
@@ -93,6 +93,7 @@ static inline NvmeNamespace *nvme_subsys_ns(NvmeSubsystem *subsys,
 typedef struct NvmeZone {
     NvmeZoneDescr   d;
     uint64_t        w_ptr;
+    int64_t         finish_ms;
     QTAILQ_ENTRY(NvmeZone) entry;
 } NvmeZone;
 
@@ -121,12 +122,15 @@ typedef struct NvmeNamespaceParams {
     uint32_t max_active_zones;
     uint32_t max_open_zones;
     uint32_t zd_extension_size;
+    uint32_t fto;
 
     uint32_t numzrwa;
     uint64_t zrwas;
     uint64_t zrwafg;
 } NvmeNamespaceParams;
 
+typedef QTAILQ_HEAD(, NvmeZone) NvmeZoneListHead;
+
 typedef struct NvmeNamespace {
     DeviceState  parent_obj;
     BlockConf    blkconf;
@@ -154,10 +158,10 @@ typedef struct NvmeNamespace {
 
     NvmeIdNsZoned   *id_ns_zoned;
     NvmeZone        *zone_array;
-    QTAILQ_HEAD(, NvmeZone) exp_open_zones;
-    QTAILQ_HEAD(, NvmeZone) imp_open_zones;
-    QTAILQ_HEAD(, NvmeZone) closed_zones;
-    QTAILQ_HEAD(, NvmeZone) full_zones;
+    NvmeZoneListHead exp_open_zones;
+    NvmeZoneListHead imp_open_zones;
+    NvmeZoneListHead closed_zones;
+    NvmeZoneListHead full_zones;
     uint32_t        num_zones;
     uint64_t        zone_size;
     uint64_t        zone_capacity;
@@ -166,6 +170,9 @@ typedef struct NvmeNamespace {
     int32_t         nr_open_zones;
     int32_t         nr_active_zones;
 
+    int64_t         fto_ms;
+    QEMUTimer       *active_timer;
+
     NvmeNamespaceParams params;
 
     struct {
@@ -274,6 +281,9 @@ static inline void nvme_aor_dec_active(NvmeNamespace *ns)
     assert(ns->nr_active_zones >= 0);
 }
 
+void nvme_set_active_timeout(NvmeNamespace *ns, NvmeZone *zone);
+void nvme_finish_needed(void *opaque);
+
 void nvme_ns_init_format(NvmeNamespace *ns);
 int nvme_ns_setup(NvmeNamespace *ns, Error **errp);
 void nvme_ns_drain(NvmeNamespace *ns);
-- 
2.27.0




* [PATCH 2/4] hw/block/nvme: add zone descriptor changed log page
  2022-10-21  0:18 [PATCH 0/4] hw/block/nvme: Implement ZNS finish-zone ZDC AEN clay.mayers
  2022-10-21  0:18 ` [PATCH 1/4] hw/block/nvme: add ZONE_FINISH_RECOMMENDED functionality clay.mayers
@ 2022-10-21  0:18 ` clay.mayers
  2022-10-21  6:26   ` Klaus Jensen
  2022-10-21  0:18 ` [PATCH 3/4] hw/block/nvme: supply dw1 for aen result clay.mayers
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: clay.mayers @ 2022-10-21  0:18 UTC (permalink / raw)
  To: qemu-devel
  Cc: Keith Busch, Klaus Jensen, Fam Zheng, Philippe Mathieu-Daudé

From: Clay Mayers <clay.mayers@kioxia.com>

Zones marked with ZONE_FINISH_RECOMMENDED are added to the zone
descriptor changed log page.  Once read with RAE cleared, they are
removed from the list.

Zones stay in the list regardless of what other states they may
pass through, so applications must be aware of ABA issues: finish
may be recommended, then the zone reset and re-opened, at which
point the attribute is clear again by the time the log is read.

Signed-off-by: Clay Mayers <clay.mayers@kioxia.com>
---
 hw/nvme/ctrl.c       | 50 ++++++++++++++++++++++++++++++++++++++++++++
 hw/nvme/ns.c         |  6 ++++++
 hw/nvme/nvme.h       |  8 +++++++
 hw/nvme/trace-events |  1 +
 include/block/nvme.h |  8 +++++++
 5 files changed, 73 insertions(+)

diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index d7e9fae0b0..3ffd0fb469 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -1516,15 +1516,42 @@ static void nvme_clear_events(NvmeCtrl *n, uint8_t event_type)
     }
 }
 
+static void nvme_zdc_list(NvmeNamespace *ns, NvmeZoneIdList *zlist, bool reset)
+{
+    NvmeZdc *zdc;
+    NvmeZdc *next;
+    int index = 0;
+
+    QTAILQ_FOREACH_SAFE(zdc, &ns->zdc_list, entry, next) {
+        if (index >= ARRAY_SIZE(zlist->zids)) {
+            break;
+        }
+        zlist->zids[index++] = zdc->zone->d.zslba;
+        if (reset) {
+            QTAILQ_REMOVE(&ns->zdc_list, zdc, entry);
+            zdc->zone->zdc_entry = NULL;
+            g_free(zdc);
+        }
+    }
+    zlist->nzid = cpu_to_le16(index);
+}
+
 static void nvme_check_finish(NvmeNamespace *ns, NvmeZoneListHead *list)
 {
     int64_t now = qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL);
     NvmeZone *zone;
+    NvmeZdc  *zdc;
 
     QTAILQ_FOREACH(zone, list, entry) {
         if (zone->finish_ms <= now) {
             zone->finish_ms = INT64_MAX;
             zone->d.za |= NVME_ZA_FINISH_RECOMMENDED;
+            if (!zone->zdc_entry) {
+                zdc = g_malloc0(sizeof(*zdc));
+                zdc->zone = zone;
+                zone->zdc_entry = zdc;
+                QTAILQ_INSERT_TAIL(&ns->zdc_list, zdc, entry);
+            }
         } else if (zone->finish_ms != INT64_MAX) {
             timer_mod_anticipate(ns->active_timer, zone->finish_ms);
         }
@@ -4675,6 +4702,27 @@ static uint16_t nvme_cmd_effects(NvmeCtrl *n, uint8_t csi, uint32_t buf_len,
     return nvme_c2h(n, ((uint8_t *)&log) + off, trans_len, req);
 }
 
+static uint16_t nvme_changed_zones(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
+                                    uint64_t off, NvmeRequest *req)
+{
+    NvmeNamespace *ns;
+    NvmeCmd *cmd = &req->cmd;
+    uint32_t nsid = le32_to_cpu(cmd->nsid);
+    NvmeZoneIdList zlist = { };
+    uint32_t trans_len;
+
+    trace_pci_nvme_changed_zones(nsid);
+
+    if (off >= sizeof(zlist)) {
+        return NVME_INVALID_FIELD | NVME_DNR;
+    }
+    trans_len = MIN(sizeof(zlist) - off, buf_len);
+
+    ns = nvme_ns(n, nsid);
+    if (!ns) {
+        return NVME_INVALID_NSID | NVME_DNR;
+    }
+    nvme_zdc_list(ns, &zlist, !rae);
+
+    return nvme_c2h(n, ((uint8_t *)&zlist) + off, trans_len, req);
+}
+
 static uint16_t nvme_get_log(NvmeCtrl *n, NvmeRequest *req)
 {
     NvmeCmd *cmd = &req->cmd;
@@ -4722,6 +4770,8 @@ static uint16_t nvme_get_log(NvmeCtrl *n, NvmeRequest *req)
         return nvme_changed_nslist(n, rae, len, off, req);
     case NVME_LOG_CMD_EFFECTS:
         return nvme_cmd_effects(n, csi, len, off, req);
+    case NVME_LOG_CHANGED_ZONE:
+        return nvme_changed_zones(n, rae, len, off, req);
     default:
         trace_pci_nvme_err_invalid_log_page(nvme_cid(req), lid);
         return NVME_INVALID_FIELD | NVME_DNR;
diff --git a/hw/nvme/ns.c b/hw/nvme/ns.c
index b577f2d8e0..25cd490c99 100644
--- a/hw/nvme/ns.c
+++ b/hw/nvme/ns.c
@@ -240,6 +240,7 @@ static void nvme_ns_zoned_init_state(NvmeNamespace *ns)
     QTAILQ_INIT(&ns->imp_open_zones);
     QTAILQ_INIT(&ns->closed_zones);
     QTAILQ_INIT(&ns->full_zones);
+    QTAILQ_INIT(&ns->zdc_list);
 
     zone = ns->zone_array;
     for (i = 0; i < ns->num_zones; i++, zone++) {
@@ -526,8 +527,13 @@ void nvme_ns_shutdown(NvmeNamespace *ns)
 
 void nvme_ns_cleanup(NvmeNamespace *ns)
 {
+    NvmeZdc *zdc;
+
     if (ns->params.zoned) {
         timer_free(ns->active_timer);
+        while ((zdc = QTAILQ_FIRST(&ns->zdc_list))) {
+            QTAILQ_REMOVE(&ns->zdc_list, zdc, entry);
+            g_free(zdc);
+        }
         g_free(ns->id_ns_zoned);
         g_free(ns->zone_array);
         g_free(ns->zd_extensions);
diff --git a/hw/nvme/nvme.h b/hw/nvme/nvme.h
index 9a54dcdb32..ae65226150 100644
--- a/hw/nvme/nvme.h
+++ b/hw/nvme/nvme.h
@@ -32,6 +32,7 @@ QEMU_BUILD_BUG_ON(NVME_MAX_NAMESPACES > NVME_NSID_BROADCAST - 1);
 
 typedef struct NvmeCtrl NvmeCtrl;
 typedef struct NvmeNamespace NvmeNamespace;
+typedef struct NvmeZone NvmeZone;
 
 #define TYPE_NVME_BUS "nvme-bus"
 OBJECT_DECLARE_SIMPLE_TYPE(NvmeBus, NVME_BUS)
@@ -90,10 +91,16 @@ static inline NvmeNamespace *nvme_subsys_ns(NvmeSubsystem *subsys,
 #define NVME_NS(obj) \
     OBJECT_CHECK(NvmeNamespace, (obj), TYPE_NVME_NS)
 
+typedef struct NvmeZdc {
+    QTAILQ_ENTRY(NvmeZdc) entry;
+    NvmeZone *zone;
+} NvmeZdc;
+
 typedef struct NvmeZone {
     NvmeZoneDescr   d;
     uint64_t        w_ptr;
     int64_t         finish_ms;
+    NvmeZdc         *zdc_entry;
     QTAILQ_ENTRY(NvmeZone) entry;
 } NvmeZone;
 
@@ -172,6 +179,7 @@ typedef struct NvmeNamespace {
 
     int64_t         fto_ms;
     QEMUTimer       *active_timer;
+    QTAILQ_HEAD(, NvmeZdc) zdc_list;
 
     NvmeNamespaceParams params;
 
diff --git a/hw/nvme/trace-events b/hw/nvme/trace-events
index fccb79f489..337927e607 100644
--- a/hw/nvme/trace-events
+++ b/hw/nvme/trace-events
@@ -64,6 +64,7 @@ pci_nvme_identify_nslist(uint32_t ns) "nsid %"PRIu32""
 pci_nvme_identify_nslist_csi(uint16_t ns, uint8_t csi) "nsid=%"PRIu16", csi=0x%"PRIx8""
 pci_nvme_identify_cmd_set(void) "identify i/o command set"
 pci_nvme_identify_ns_descr_list(uint32_t ns) "nsid %"PRIu32""
+pci_nvme_changed_zones(uint32_t ns) "nsid %"PRIu32""
 pci_nvme_get_log(uint16_t cid, uint8_t lid, uint8_t lsp, uint8_t rae, uint32_t len, uint64_t off) "cid %"PRIu16" lid 0x%"PRIx8" lsp 0x%"PRIx8" rae 0x%"PRIx8" len %"PRIu32" off %"PRIu64""
 pci_nvme_getfeat(uint16_t cid, uint32_t nsid, uint8_t fid, uint8_t sel, uint32_t cdw11) "cid %"PRIu16" nsid 0x%"PRIx32" fid 0x%"PRIx8" sel 0x%"PRIx8" cdw11 0x%"PRIx32""
 pci_nvme_setfeat(uint16_t cid, uint32_t nsid, uint8_t fid, uint8_t save, uint32_t cdw11) "cid %"PRIu16" nsid 0x%"PRIx32" fid 0x%"PRIx8" save 0x%"PRIx8" cdw11 0x%"PRIx32""
diff --git a/include/block/nvme.h b/include/block/nvme.h
index 8027b7126b..c747cc4948 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -1010,6 +1010,7 @@ enum NvmeLogIdentifier {
     NVME_LOG_FW_SLOT_INFO   = 0x03,
     NVME_LOG_CHANGED_NSLIST = 0x04,
     NVME_LOG_CMD_EFFECTS    = 0x05,
+    NVME_LOG_CHANGED_ZONE   = 0xbf,
 };
 
 typedef struct QEMU_PACKED NvmePSD {
@@ -1617,6 +1618,12 @@ typedef enum NvmeVirtualResourceType {
     NVME_VIRT_RES_INTERRUPT     = 0x01,
 } NvmeVirtualResourceType;
 
+typedef struct QEMU_PACKED NvmeZoneIdList {
+    uint16_t nzid;
+    uint16_t rsvd2[3];
+    uint64_t zids[511];
+} NvmeZoneIdList;
+
 static inline void _nvme_check_size(void)
 {
     QEMU_BUILD_BUG_ON(sizeof(NvmeBar) != 4096);
@@ -1655,5 +1662,6 @@ static inline void _nvme_check_size(void)
     QEMU_BUILD_BUG_ON(sizeof(NvmePriCtrlCap) != 4096);
     QEMU_BUILD_BUG_ON(sizeof(NvmeSecCtrlEntry) != 32);
     QEMU_BUILD_BUG_ON(sizeof(NvmeSecCtrlList) != 4096);
+    QEMU_BUILD_BUG_ON(sizeof(NvmeZoneIdList) != 4096);
 }
 #endif
-- 
2.27.0




* [PATCH 3/4] hw/block/nvme: supply dw1 for aen result
  2022-10-21  0:18 [PATCH 0/4] hw/block/nvme: Implement ZNS finish-zone ZDC AEN clay.mayers
  2022-10-21  0:18 ` [PATCH 1/4] hw/block/nvme: add ZONE_FINISH_RECOMMENDED functionality clay.mayers
  2022-10-21  0:18 ` [PATCH 2/4] hw/block/nvme: add zone descriptor changed log page clay.mayers
@ 2022-10-21  0:18 ` clay.mayers
  2022-10-21  5:59   ` Klaus Jensen
  2022-10-21  0:18 ` [PATCH 4/4] hw/block/nvme: add zone descriptor changed AEN clay.mayers
  2022-10-21  5:57 ` [PATCH 0/4] hw/block/nvme: Implement ZNS finish-zone ZDC AEN Klaus Jensen
  4 siblings, 1 reply; 13+ messages in thread
From: clay.mayers @ 2022-10-21  0:18 UTC (permalink / raw)
  To: qemu-devel
  Cc: Keith Busch, Klaus Jensen, Fam Zheng, Philippe Mathieu-Daudé

From: Clay Mayers <clay.mayers@kioxia.com>

For some AENs, cqe.dw1 is required to convey the NSID of the log
page to read.  This is the case for the zone descriptor changed
log page.

Signed-off-by: Clay Mayers <clay.mayers@kioxia.com>
---
 hw/nvme/ctrl.c       | 19 +++++++++++--------
 hw/nvme/nvme.h       |  2 ++
 hw/nvme/trace-events |  2 +-
 include/block/nvme.h |  4 +++-
 4 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index 3ffd0fb469..c7ee54ef5e 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -1445,6 +1445,7 @@ static void nvme_process_aers(void *opaque)
         result->event_type = event->result.event_type;
         result->event_info = event->result.event_info;
         result->log_page = event->result.log_page;
+        req->cqe.dw1 = cpu_to_le32(event->result.nsid);
         g_free(event);
 
         trace_pci_nvme_aer_post_cqe(result->event_type, result->event_info,
@@ -1455,11 +1456,12 @@ static void nvme_process_aers(void *opaque)
 }
 
 static void nvme_enqueue_event(NvmeCtrl *n, uint8_t event_type,
-                               uint8_t event_info, uint8_t log_page)
+                               uint8_t event_info, uint8_t log_page,
+                               uint32_t nsid)
 {
     NvmeAsyncEvent *event;
 
-    trace_pci_nvme_enqueue_event(event_type, event_info, log_page);
+    trace_pci_nvme_enqueue_event(event_type, event_info, log_page, nsid);
 
     if (n->aer_queued == n->params.aer_max_queued) {
         trace_pci_nvme_enqueue_event_noqueue(n->aer_queued);
@@ -1471,6 +1473,7 @@ static void nvme_enqueue_event(NvmeCtrl *n, uint8_t event_type,
         .event_type = event_type,
         .event_info = event_info,
         .log_page   = log_page,
+        .nsid       = nsid,
     };
 
     QTAILQ_INSERT_TAIL(&n->aer_queue, event, entry);
@@ -1505,7 +1508,7 @@ static void nvme_smart_event(NvmeCtrl *n, uint8_t event)
         return;
     }
 
-    nvme_enqueue_event(n, NVME_AER_TYPE_SMART, aer_info, NVME_LOG_SMART_INFO);
+    nvme_enqueue_event(n, NVME_AER_TYPE_SMART, aer_info, NVME_LOG_SMART_INFO, 0);
 }
 
 static void nvme_clear_events(NvmeCtrl *n, uint8_t event_type)
@@ -5823,7 +5826,7 @@ static uint16_t nvme_ns_attachment(NvmeCtrl *n, NvmeRequest *req)
         if (!test_and_set_bit(nsid, ctrl->changed_nsids)) {
             nvme_enqueue_event(ctrl, NVME_AER_TYPE_NOTICE,
                                NVME_AER_INFO_NOTICE_NS_ATTR_CHANGED,
-                               NVME_LOG_CHANGED_NSLIST);
+                               NVME_LOG_CHANGED_NSLIST, 0);
         }
     }
 
@@ -6964,7 +6967,7 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
             if (n->outstanding_aers) {
                 nvme_enqueue_event(n, NVME_AER_TYPE_ERROR,
                                    NVME_AER_INFO_ERR_INVALID_DB_REGISTER,
-                                   NVME_LOG_ERROR_INFO);
+                                   NVME_LOG_ERROR_INFO, 0);
             }
 
             return;
@@ -6981,7 +6984,7 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
             if (n->outstanding_aers) {
                 nvme_enqueue_event(n, NVME_AER_TYPE_ERROR,
                                    NVME_AER_INFO_ERR_INVALID_DB_VALUE,
-                                   NVME_LOG_ERROR_INFO);
+                                   NVME_LOG_ERROR_INFO, 0);
             }
 
             return;
@@ -7026,7 +7029,7 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
             if (n->outstanding_aers) {
                 nvme_enqueue_event(n, NVME_AER_TYPE_ERROR,
                                    NVME_AER_INFO_ERR_INVALID_DB_REGISTER,
-                                   NVME_LOG_ERROR_INFO);
+                                   NVME_LOG_ERROR_INFO, 0);
             }
 
             return;
@@ -7043,7 +7046,7 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
             if (n->outstanding_aers) {
                 nvme_enqueue_event(n, NVME_AER_TYPE_ERROR,
                                    NVME_AER_INFO_ERR_INVALID_DB_VALUE,
-                                   NVME_LOG_ERROR_INFO);
+                                   NVME_LOG_ERROR_INFO, 0);
             }
 
             return;
diff --git a/hw/nvme/nvme.h b/hw/nvme/nvme.h
index ae65226150..2b7997e4a7 100644
--- a/hw/nvme/nvme.h
+++ b/hw/nvme/nvme.h
@@ -477,6 +477,8 @@ typedef struct NvmeCtrl {
     uint64_t    dbbuf_eis;
     bool        dbbuf_enabled;
 
+    bool        zdc_event_queued;
+
     struct {
         MemoryRegion mem;
         uint8_t      *buf;
diff --git a/hw/nvme/trace-events b/hw/nvme/trace-events
index 337927e607..86c01f8762 100644
--- a/hw/nvme/trace-events
+++ b/hw/nvme/trace-events
@@ -80,7 +80,7 @@ pci_nvme_aer_masked(uint8_t type, uint8_t mask) "type 0x%"PRIx8" mask 0x%"PRIx8"
 pci_nvme_aer_post_cqe(uint8_t typ, uint8_t info, uint8_t log_page) "type 0x%"PRIx8" info 0x%"PRIx8" lid 0x%"PRIx8""
 pci_nvme_ns_attachment(uint16_t cid, uint8_t sel) "cid %"PRIu16", sel=0x%"PRIx8""
 pci_nvme_ns_attachment_attach(uint16_t cntlid, uint32_t nsid) "cntlid=0x%"PRIx16", nsid=0x%"PRIx32""
-pci_nvme_enqueue_event(uint8_t typ, uint8_t info, uint8_t log_page) "type 0x%"PRIx8" info 0x%"PRIx8" lid 0x%"PRIx8""
+pci_nvme_enqueue_event(uint8_t typ, uint8_t info, uint8_t log_page, uint32_t nsid) "type 0x%"PRIx8" info 0x%"PRIx8" lid 0x%"PRIx8" nsid %"PRIu32""
 pci_nvme_enqueue_event_noqueue(int queued) "queued %d"
 pci_nvme_enqueue_event_masked(uint8_t typ) "type 0x%"PRIx8""
 pci_nvme_no_outstanding_aers(void) "ignoring event; no outstanding AERs"
diff --git a/include/block/nvme.h b/include/block/nvme.h
index c747cc4948..9467d4b939 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -837,6 +837,7 @@ typedef struct QEMU_PACKED NvmeAerResult {
     uint8_t event_info;
     uint8_t log_page;
     uint8_t resv;
+    uint32_t nsid;
 } NvmeAerResult;
 
 typedef struct QEMU_PACKED NvmeZonedResult {
@@ -1228,6 +1229,7 @@ enum NvmeNsAttachmentOperation {
 #define NVME_AEC_SMART(aec)         (aec & 0xff)
 #define NVME_AEC_NS_ATTR(aec)       ((aec >> 8) & 0x1)
 #define NVME_AEC_FW_ACTIVATION(aec) ((aec >> 9) & 0x1)
+#define NVME_AEC_ZONE_CHANGED(aec)  ((aec >> 27) & 0x1)
 
 #define NVME_ERR_REC_TLER(err_rec)  (err_rec & 0xffff)
 #define NVME_ERR_REC_DULBE(err_rec) (err_rec & 0x10000)
@@ -1627,7 +1629,7 @@ typedef struct QEMU_PACKED NvmeZoneIdList {
 static inline void _nvme_check_size(void)
 {
     QEMU_BUILD_BUG_ON(sizeof(NvmeBar) != 4096);
-    QEMU_BUILD_BUG_ON(sizeof(NvmeAerResult) != 4);
+    QEMU_BUILD_BUG_ON(sizeof(NvmeAerResult) != 8);
     QEMU_BUILD_BUG_ON(sizeof(NvmeZonedResult) != 8);
     QEMU_BUILD_BUG_ON(sizeof(NvmeCqe) != 16);
     QEMU_BUILD_BUG_ON(sizeof(NvmeDsmRange) != 16);
-- 
2.27.0




* [PATCH 4/4] hw/block/nvme: add zone descriptor changed AEN
  2022-10-21  0:18 [PATCH 0/4] hw/block/nvme: Implement ZNS finish-zone ZDC AEN clay.mayers
                   ` (2 preceding siblings ...)
  2022-10-21  0:18 ` [PATCH 3/4] hw/block/nvme: supply dw1 for aen result clay.mayers
@ 2022-10-21  0:18 ` clay.mayers
  2022-10-21  6:41   ` Klaus Jensen
  2022-10-21  5:57 ` [PATCH 0/4] hw/block/nvme: Implement ZNS finish-zone ZDC AEN Klaus Jensen
  4 siblings, 1 reply; 13+ messages in thread
From: clay.mayers @ 2022-10-21  0:18 UTC (permalink / raw)
  To: qemu-devel
  Cc: Keith Busch, Klaus Jensen, Fam Zheng, Philippe Mathieu-Daudé

From: Clay Mayers <clay.mayers@kioxia.com>

If a namespace's params.zoned.finish_time is non-zero,
controllers register with the namespace to be notified
when entries are added to its zone-descriptor-changed
(ZDC) log page.  If the ZDC AEN is enabled, this causes
an AEN to be sent from that controller.

Signed-off-by: Clay Mayers <clay.mayers@kioxia.com>
---
 hw/nvme/ctrl.c       | 62 +++++++++++++++++++++++++++++++++++++++++++-
 hw/nvme/ns.c         |  1 +
 hw/nvme/nvme.h       |  9 +++++++
 include/block/nvme.h |  2 ++
 4 files changed, 73 insertions(+), 1 deletion(-)

diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index c7ee54ef5e..f1cfa272b4 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -1519,6 +1519,52 @@ static void nvme_clear_events(NvmeCtrl *n, uint8_t event_type)
     }
 }
 
+static void nvme_zdc_watch(NvmeCtrl *n, NvmeNamespace *ns, NvmeNotifyFnc fnc)
+{
+    NvmeZdcNotify *watcher = g_malloc0(sizeof(*watcher));
+
+    watcher->n = n;
+    watcher->notify = fnc;
+    QTAILQ_INSERT_TAIL(&ns->zdc_watchers, watcher, entry);
+}
+
+static void nvme_zdc_unwatch(NvmeCtrl *n, NvmeNamespace *ns)
+{
+    NvmeZdcNotify *watcher;
+
+    QTAILQ_FOREACH(watcher, &ns->zdc_watchers, entry) {
+        if (watcher->n == n) {
+            QTAILQ_REMOVE(&ns->zdc_watchers, watcher, entry);
+            g_free(watcher);
+            break;
+        }
+    }
+}
+
+static void nvme_zdc_notify(NvmeNamespace *ns)
+{
+    NvmeZdcNotify *watcher;
+
+    QTAILQ_FOREACH(watcher, &ns->zdc_watchers, entry) {
+        (*watcher->notify)(watcher->n, ns);
+    }
+}
+
+static void nvme_zdc_aen(NvmeCtrl *n, NvmeNamespace *ns)
+{
+    g_assert(n->id_ctrl.oaes & (1 << 27));
+
+    if (!NVME_AEC_ZONE_CHANGED(n->features.async_config)) {
+        return;
+    }
+
+    if (!n->zdc_event_queued) {
+        n->zdc_event_queued = true;
+        nvme_enqueue_event(n, NVME_AER_TYPE_NOTICE,
+                            NVME_AER_INFO_NOTICE_ZONE_DESC_CHANGED,
+                            NVME_LOG_CHANGED_ZONE, ns->params.nsid);
+    }
+}
+
 static void nvme_zdc_list(NvmeNamespace *ns, NvmeZoneIdList *zlist, bool reset)
 {
     NvmeZdc *zdc;
@@ -1554,6 +1600,7 @@ static void nvme_check_finish(NvmeNamespace *ns, NvmeZoneListHead *list)
                 zdc->zone = zone;
                 zone->zdc_entry = zdc;
                 QTAILQ_INSERT_TAIL(&ns->zdc_list, zdc, entry);
+                nvme_zdc_notify(ns);
             }
         } else if (zone->finish_ms != INT64_MAX) {
             timer_mod_anticipate(ns->active_timer, zone->finish_ms);
@@ -4722,6 +4769,14 @@ static uint16_t nvme_changed_zones(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
         return NVME_INVALID_NSID | NVME_DNR;
     }
     nvme_zdc_list(ns, &zlist, !rae);
+    if (!rae) {
+        n->zdc_event_queued = false;
+        nvme_clear_events(n, NVME_AER_TYPE_NOTICE);
+        /* send a new aen if there are still zdc entries */
+        if (!QTAILQ_EMPTY(&ns->zdc_list)) {
+            nvme_zdc_notify(ns);
+        }
+    }
 
     return nvme_c2h(n, ((uint8_t *)&zlist) + off, trans_len, req);
 }
@@ -5808,6 +5863,7 @@ static uint16_t nvme_ns_attachment(NvmeCtrl *n, NvmeRequest *req)
                 return NVME_NS_NOT_ATTACHED | NVME_DNR;
             }
 
+            nvme_zdc_unwatch(n, ns);
             ctrl->namespaces[nsid] = NULL;
             ns->attached--;
 
@@ -7535,7 +7591,7 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
 
     id->cntlid = cpu_to_le16(n->cntlid);
 
-    id->oaes = cpu_to_le32(NVME_OAES_NS_ATTR);
+    id->oaes = cpu_to_le32(NVME_OAES_NS_ATTR | NVME_OAES_ZDC);
     id->ctratt |= cpu_to_le32(NVME_CTRATT_ELBAS);
 
     id->rab = 6;
@@ -7652,6 +7708,10 @@ void nvme_attach_ns(NvmeCtrl *n, NvmeNamespace *ns)
 
     n->dmrsl = MIN_NON_ZERO(n->dmrsl,
                             BDRV_REQUEST_MAX_BYTES / nvme_l2b(ns, 1));
+
+    if (ns->params.fto) {
+        nvme_zdc_watch(n, ns, nvme_zdc_aen);
+    }
 }
 
 static void nvme_realize(PCIDevice *pci_dev, Error **errp)
diff --git a/hw/nvme/ns.c b/hw/nvme/ns.c
index 25cd490c99..5629b61302 100644
--- a/hw/nvme/ns.c
+++ b/hw/nvme/ns.c
@@ -241,6 +241,7 @@ static void nvme_ns_zoned_init_state(NvmeNamespace *ns)
     QTAILQ_INIT(&ns->closed_zones);
     QTAILQ_INIT(&ns->full_zones);
     QTAILQ_INIT(&ns->zdc_list);
+    QTAILQ_INIT(&ns->zdc_watchers);
 
     zone = ns->zone_array;
     for (i = 0; i < ns->num_zones; i++, zone++) {
diff --git a/hw/nvme/nvme.h b/hw/nvme/nvme.h
index 2b7997e4a7..5499105e7b 100644
--- a/hw/nvme/nvme.h
+++ b/hw/nvme/nvme.h
@@ -91,6 +91,14 @@ static inline NvmeNamespace *nvme_subsys_ns(NvmeSubsystem *subsys,
 #define NVME_NS(obj) \
     OBJECT_CHECK(NvmeNamespace, (obj), TYPE_NVME_NS)
 
+typedef void (*NvmeNotifyFnc)(NvmeCtrl *n, NvmeNamespace *ns);
+
+typedef struct NvmeZdcNotify {
+    QTAILQ_ENTRY(NvmeZdcNotify) entry;
+    NvmeNotifyFnc notify;
+    NvmeCtrl *n;
+} NvmeZdcNotify;
+
 typedef struct NvmeZdc {
     QTAILQ_ENTRY(NvmeZdc) entry;
     NvmeZone *zone;
@@ -179,6 +187,7 @@ typedef struct NvmeNamespace {
 
     int64_t         fto_ms;
     QEMUTimer       *active_timer;
+    QTAILQ_HEAD(, NvmeZdcNotify) zdc_watchers;
     QTAILQ_HEAD(, NvmeZdc) zdc_list;
 
     NvmeNamespaceParams params;
diff --git a/include/block/nvme.h b/include/block/nvme.h
index 9467d4b939..1662046c0d 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -830,6 +830,7 @@ enum NvmeAsyncEventRequest {
     NVME_AER_INFO_SMART_TEMP_THRESH         = 1,
     NVME_AER_INFO_SMART_SPARE_THRESH        = 2,
     NVME_AER_INFO_NOTICE_NS_ATTR_CHANGED    = 0,
+    NVME_AER_INFO_NOTICE_ZONE_DESC_CHANGED  = 0xef,
 };
 
 typedef struct QEMU_PACKED NvmeAerResult {
@@ -1133,6 +1134,7 @@ typedef struct NvmeIdCtrlNvm {
 
 enum NvmeIdCtrlOaes {
     NVME_OAES_NS_ATTR   = 1 << 8,
+    NVME_OAES_ZDC       = 1 << 27,
 };
 
 enum NvmeIdCtrlCtratt {
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/4] hw/block/nvme: Implement ZNS finish-zone ZDC AEN
  2022-10-21  0:18 [PATCH 0/4] hw/block/nvme: Implement ZNS finish-zone ZDC AEN clay.mayers
                   ` (3 preceding siblings ...)
  2022-10-21  0:18 ` [PATCH 4/4] hw/block/nvme: add zone descriptor changed AEN clay.mayers
@ 2022-10-21  5:57 ` Klaus Jensen
  4 siblings, 0 replies; 13+ messages in thread
From: Klaus Jensen @ 2022-10-21  5:57 UTC (permalink / raw)
  To: clay.mayers
  Cc: qemu-devel, Keith Busch, Fam Zheng, Philippe Mathieu-Daudé,
	Dmitry Fomichev

[-- Attachment #1: Type: text/plain, Size: 2098 bytes --]

On Oct 20 17:18, clay.mayers@kioxia.com wrote:
> From: Clay Mayers <clay.mayers@kioxia.com>
> 
> ZNS controllers have the option to limit the time a zone can remain in
> the active state.  It begins with a background process in the controller
> setting the finish-zone-recommended FZR attribute for a zone.  As part of
> setting this attribute, the zone's id is added to the namespace's
> zone-descriptor-changed (ZDC) log page. If enabled, items added to the
> ZDC log page generate a ZDC asynchronous event notification (AEN). Optionally,
> the controller can induce a "zone excursion", forcing the zone into the
> finished state, which also generates a ZDC event.
> 
> Zone-enabled applications need to properly handle ZDC events. In a real
> device, the timeout is many hours, making application testing difficult.
> This series implements the generation of FZR ZDC events to speed up OS and
> application testing.
> 
> Added to the zoned NVMe command set are: an optional, per-namespace timer
> (zoned.finish_time) that sets the FZR attribute for long-lived active zones;
> a per-namespace ZDC log page; and AEN results extended to include CQE.DW1
> (the NSID of the ZDC AEN), with a ZDC AEN generated if it has been enabled.
> Zone excursions are not modeled.
> 
> See section 5.5 of the NVMe Zoned Namespace Command Set Specification v1.1
> for more details.
> 
> Clay Mayers (4):
>   hw/block/nvme: add ZONE_FINISH_RECOMMENDED functionality
>   hw/block/nvme: add zone descriptor changed log page
>   hw/block/nvme: supply dw1 for aen result
>   hw/block/nvme: add zone descriptor changed AEN
> 
>  docs/system/devices/nvme.rst |   5 ++
>  hw/nvme/ctrl.c               | 166 +++++++++++++++++++++++++++++++++--
>  hw/nvme/ns.c                 |  15 ++++
>  hw/nvme/nvme.h               |  37 +++++++-
>  hw/nvme/trace-events         |   3 +-
>  include/block/nvme.h         |  14 ++-
>  6 files changed, 225 insertions(+), 15 deletions(-)
> 
> -- 
> 2.27.0
> 


Hi Clay,

Thanks! Very nicely done, I have only a few comments on the individual
patches.

Adding Dmitry on CC.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]


* Re: [PATCH 3/4] hw/block/nvme: supply dw1 for aen result
  2022-10-21  0:18 ` [PATCH 3/4] hw/block/nvme: supply dw1 for aen result clay.mayers
@ 2022-10-21  5:59   ` Klaus Jensen
  2022-10-21 15:25     ` Clay Mayers
  0 siblings, 1 reply; 13+ messages in thread
From: Klaus Jensen @ 2022-10-21  5:59 UTC (permalink / raw)
  To: clay.mayers
  Cc: qemu-devel, Keith Busch, Fam Zheng, Philippe Mathieu-Daudé

[-- Attachment #1: Type: text/plain, Size: 997 bytes --]

On Oct 20 17:18, clay.mayers@kioxia.com wrote:
> From: Clay Mayers <clay.mayers@kioxia.com>
> 
> The AEN's cqe.dw1 is sometimes required to convey the NSID of the log
> page to read.  This is the case for the zone descriptor changed log
> page.
> 
> Signed-off-by: Clay Mayers <clay.mayers@kioxia.com>
> ---
>  hw/nvme/ctrl.c       | 19 +++++++++++--------
>  hw/nvme/nvme.h       |  2 ++
>  hw/nvme/trace-events |  2 +-
>  include/block/nvme.h |  4 +++-
>  4 files changed, 17 insertions(+), 10 deletions(-)
> 
> diff --git a/hw/nvme/nvme.h b/hw/nvme/nvme.h
> index ae65226150..2b7997e4a7 100644
> --- a/hw/nvme/nvme.h
> +++ b/hw/nvme/nvme.h
> @@ -477,6 +477,8 @@ typedef struct NvmeCtrl {
>      uint64_t    dbbuf_eis;
>      bool        dbbuf_enabled;
>  
> +    bool        zdc_event_queued;
> +
>      struct {
>          MemoryRegion mem;
>          uint8_t      *buf;

Looks unrelated to this patch.

Otherwise,

Reviewed-by: Klaus Jensen <k.jensen@samsung.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]


* Re: [PATCH 2/4] hw/block/nvme: add zone descriptor changed log page
  2022-10-21  0:18 ` [PATCH 2/4] hw/block/nvme: add zone descriptor changed log page clay.mayers
@ 2022-10-21  6:26   ` Klaus Jensen
  2022-10-21 15:24     ` Clay Mayers
  0 siblings, 1 reply; 13+ messages in thread
From: Klaus Jensen @ 2022-10-21  6:26 UTC (permalink / raw)
  To: clay.mayers
  Cc: qemu-devel, Keith Busch, Fam Zheng, Philippe Mathieu-Daudé

[-- Attachment #1: Type: text/plain, Size: 8582 bytes --]

On Oct 20 17:18, clay.mayers@kioxia.com wrote:
> From: Clay Mayers <clay.mayers@kioxia.com>
> 
> Zones marked with ZONE_FINISH_RECOMMENDED are added to the zone
> descriptor changed log page.  Once read with RAE cleared, they are
> removed from the list.
> 
> Zones stay in the list regardless of what other states the zones may
> go through, so applications must be aware of ABA issues where finish
> may be recommended, then the zone is freed and re-opened, and the
> attribute is now clear.
> 
> Signed-off-by: Clay Mayers <clay.mayers@kioxia.com>
> ---
>  hw/nvme/ctrl.c       | 50 ++++++++++++++++++++++++++++++++++++++++++++
>  hw/nvme/ns.c         |  6 ++++++
>  hw/nvme/nvme.h       |  8 +++++++
>  hw/nvme/trace-events |  1 +
>  include/block/nvme.h |  8 +++++++
>  5 files changed, 73 insertions(+)
> 
> diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
> index d7e9fae0b0..3ffd0fb469 100644
> --- a/hw/nvme/ctrl.c
> +++ b/hw/nvme/ctrl.c
> @@ -1516,15 +1516,42 @@ static void nvme_clear_events(NvmeCtrl *n, uint8_t event_type)
>      }
>  }
>  
> +static void nvme_zdc_list(NvmeNamespace *ns, NvmeZoneIdList *zlist, bool reset)
> +{
> +    NvmeZdc *zdc;
> +    NvmeZdc *next;
> +    int index = 0;
> +
> +    QTAILQ_FOREACH_SAFE(zdc, &ns->zdc_list, entry, next) {
> +        if (index >= ARRAY_SIZE(zlist->zids)) {
> +            break;
> +        }

I could've sworn that a previous revision of the spec stated that this
required `nzid` to be set to 0xffff and the log page be zeroed in this
case.

However, upon reading it again, that doesn't seem to be the case, so this
seems to be just fine.

> +        zlist->zids[index++] = zdc->zone->d.zslba;
> +        if (reset) {
> +            QTAILQ_REMOVE(&ns->zdc_list, zdc, entry);
> +            zdc->zone->zdc_entry = NULL;
> +            g_free(zdc);
> +        }
> +    }
> +    zlist->nzid = cpu_to_le16(index);
> +}
> +
>  static void nvme_check_finish(NvmeNamespace *ns, NvmeZoneListHead *list)
>  {
>      int64_t now = qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL);
>      NvmeZone *zone;
> +    NvmeZdc  *zdc;
>  
>      QTAILQ_FOREACH(zone, list, entry) {
>          if (zone->finish_ms <= now) {
>              zone->finish_ms = INT64_MAX;
>              zone->d.za |= NVME_ZA_FINISH_RECOMMENDED;
> +            if (!zone->zdc_entry) {
> +                zdc = g_malloc0(sizeof(*zdc));
> +                zdc->zone = zone;
> +                zone->zdc_entry = zdc;
> +                QTAILQ_INSERT_TAIL(&ns->zdc_list, zdc, entry);
> +            }
>          } else if (zone->finish_ms != INT64_MAX) {
>              timer_mod_anticipate(ns->active_timer, zone->finish_ms);
>          }
> @@ -4675,6 +4702,27 @@ static uint16_t nvme_cmd_effects(NvmeCtrl *n, uint8_t csi, uint32_t buf_len,
>      return nvme_c2h(n, ((uint8_t *)&log) + off, trans_len, req);
>  }
>  
> +static uint16_t nvme_changed_zones(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
> +                                    uint64_t off, NvmeRequest *req)
> +{
> +    NvmeNamespace *ns;
> +    NvmeCmd *cmd = &req->cmd;
> +    uint32_t nsid = le32_to_cpu(cmd->nsid);
> +    NvmeZoneIdList zlist = { };
> +    uint32_t trans_len = MIN(sizeof(zlist) - off, buf_len);

I believe this is unsafe if off >= sizeof(zlist).

> +
> +    nsid = le32_to_cpu(cmd->nsid);
> +    trace_pci_nvme_changed_zones(nsid);
> +
> +    ns = nvme_ns(n, nsid);
> +    if (!ns) {
> +        return NVME_INVALID_NSID | NVME_DNR;
> +    }
> +    nvme_zdc_list(ns, &zlist, !rae);
> +
> +    return nvme_c2h(n, ((uint8_t *)&zlist) + off, trans_len, req);
> +}
> +
>  static uint16_t nvme_get_log(NvmeCtrl *n, NvmeRequest *req)
>  {
>      NvmeCmd *cmd = &req->cmd;
> @@ -4722,6 +4770,8 @@ static uint16_t nvme_get_log(NvmeCtrl *n, NvmeRequest *req)
>          return nvme_changed_nslist(n, rae, len, off, req);
>      case NVME_LOG_CMD_EFFECTS:
>          return nvme_cmd_effects(n, csi, len, off, req);
> +    case NVME_LOG_CHANGED_ZONE:
> +        return nvme_changed_zones(n, rae, len, off, req);
>      default:
>          trace_pci_nvme_err_invalid_log_page(nvme_cid(req), lid);
>          return NVME_INVALID_FIELD | NVME_DNR;
> diff --git a/hw/nvme/ns.c b/hw/nvme/ns.c
> index b577f2d8e0..25cd490c99 100644
> --- a/hw/nvme/ns.c
> +++ b/hw/nvme/ns.c
> @@ -240,6 +240,7 @@ static void nvme_ns_zoned_init_state(NvmeNamespace *ns)
>      QTAILQ_INIT(&ns->imp_open_zones);
>      QTAILQ_INIT(&ns->closed_zones);
>      QTAILQ_INIT(&ns->full_zones);
> +    QTAILQ_INIT(&ns->zdc_list);
>  
>      zone = ns->zone_array;
>      for (i = 0; i < ns->num_zones; i++, zone++) {
> @@ -526,8 +527,13 @@ void nvme_ns_shutdown(NvmeNamespace *ns)
>  
>  void nvme_ns_cleanup(NvmeNamespace *ns)
>  {
> +    NvmeZdc *zdc;
> +
>      if (ns->params.zoned) {
>          timer_free(ns->active_timer);
> +        while ((zdc = QTAILQ_FIRST(&ns->zdc_list))) {
> +            QTAILQ_REMOVE(&ns->zdc_list, zdc, entry);
> +            g_free(zdc);
> +        }
>          g_free(ns->id_ns_zoned);
>          g_free(ns->zone_array);
>          g_free(ns->zd_extensions);
> diff --git a/hw/nvme/nvme.h b/hw/nvme/nvme.h
> index 9a54dcdb32..ae65226150 100644
> --- a/hw/nvme/nvme.h
> +++ b/hw/nvme/nvme.h
> @@ -32,6 +32,7 @@ QEMU_BUILD_BUG_ON(NVME_MAX_NAMESPACES > NVME_NSID_BROADCAST - 1);
>  
>  typedef struct NvmeCtrl NvmeCtrl;
>  typedef struct NvmeNamespace NvmeNamespace;
> +typedef struct NvmeZone NvmeZone;
>  
>  #define TYPE_NVME_BUS "nvme-bus"
>  OBJECT_DECLARE_SIMPLE_TYPE(NvmeBus, NVME_BUS)
> @@ -90,10 +91,16 @@ static inline NvmeNamespace *nvme_subsys_ns(NvmeSubsystem *subsys,
>  #define NVME_NS(obj) \
>      OBJECT_CHECK(NvmeNamespace, (obj), TYPE_NVME_NS)
>  
> +typedef struct NvmeZdc {
> +    QTAILQ_ENTRY(NvmeZdc) entry;
> +    NvmeZone *zone;
> +} NvmeZdc;
> +
>  typedef struct NvmeZone {
>      NvmeZoneDescr   d;
>      uint64_t        w_ptr;
>      int64_t         finish_ms;
> +    NvmeZdc         *zdc_entry;
>      QTAILQ_ENTRY(NvmeZone) entry;
>  } NvmeZone;
>  
> @@ -172,6 +179,7 @@ typedef struct NvmeNamespace {
>  
>      int64_t         fto_ms;
>      QEMUTimer       *active_timer;
> +    QTAILQ_HEAD(, NvmeZdc) zdc_list;
>  
>      NvmeNamespaceParams params;
>  
> diff --git a/hw/nvme/trace-events b/hw/nvme/trace-events
> index fccb79f489..337927e607 100644
> --- a/hw/nvme/trace-events
> +++ b/hw/nvme/trace-events
> @@ -64,6 +64,7 @@ pci_nvme_identify_nslist(uint32_t ns) "nsid %"PRIu32""
>  pci_nvme_identify_nslist_csi(uint16_t ns, uint8_t csi) "nsid=%"PRIu16", csi=0x%"PRIx8""
>  pci_nvme_identify_cmd_set(void) "identify i/o command set"
>  pci_nvme_identify_ns_descr_list(uint32_t ns) "nsid %"PRIu32""
> +pci_nvme_changed_zones(uint32_t ns) "nsid %"PRIu32""
>  pci_nvme_get_log(uint16_t cid, uint8_t lid, uint8_t lsp, uint8_t rae, uint32_t len, uint64_t off) "cid %"PRIu16" lid 0x%"PRIx8" lsp 0x%"PRIx8" rae 0x%"PRIx8" len %"PRIu32" off %"PRIu64""
>  pci_nvme_getfeat(uint16_t cid, uint32_t nsid, uint8_t fid, uint8_t sel, uint32_t cdw11) "cid %"PRIu16" nsid 0x%"PRIx32" fid 0x%"PRIx8" sel 0x%"PRIx8" cdw11 0x%"PRIx32""
>  pci_nvme_setfeat(uint16_t cid, uint32_t nsid, uint8_t fid, uint8_t save, uint32_t cdw11) "cid %"PRIu16" nsid 0x%"PRIx32" fid 0x%"PRIx8" save 0x%"PRIx8" cdw11 0x%"PRIx32""
> diff --git a/include/block/nvme.h b/include/block/nvme.h
> index 8027b7126b..c747cc4948 100644
> --- a/include/block/nvme.h
> +++ b/include/block/nvme.h
> @@ -1010,6 +1010,7 @@ enum NvmeLogIdentifier {
>      NVME_LOG_FW_SLOT_INFO   = 0x03,
>      NVME_LOG_CHANGED_NSLIST = 0x04,
>      NVME_LOG_CMD_EFFECTS    = 0x05,
> +    NVME_LOG_CHANGED_ZONE   = 0xbf,
>  };
>  
>  typedef struct QEMU_PACKED NvmePSD {
> @@ -1617,6 +1618,12 @@ typedef enum NvmeVirtualResourceType {
>      NVME_VIRT_RES_INTERRUPT     = 0x01,
>  } NvmeVirtualResourceType;
>  
> +typedef struct QEMU_PACKED NvmeZoneIdList {
> +    uint16_t nzid;
> +    uint16_t rsvd2[3];
> +    uint64_t zids[511];
> +} NvmeZoneIdList;
> +
>  static inline void _nvme_check_size(void)
>  {
>      QEMU_BUILD_BUG_ON(sizeof(NvmeBar) != 4096);
> @@ -1655,5 +1662,6 @@ static inline void _nvme_check_size(void)
>      QEMU_BUILD_BUG_ON(sizeof(NvmePriCtrlCap) != 4096);
>      QEMU_BUILD_BUG_ON(sizeof(NvmeSecCtrlEntry) != 32);
>      QEMU_BUILD_BUG_ON(sizeof(NvmeSecCtrlList) != 4096);
> +    QEMU_BUILD_BUG_ON(sizeof(NvmeZoneIdList) != 4096);
>  }
>  #endif
> -- 
> 2.27.0
> 

-- 
One of us - No more doubt, silence or taboo about mental illness.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]


* Re: [PATCH 1/4] hw/block/nvme: add ZONE_FINISH_RECOMMENDED functionality
  2022-10-21  0:18 ` [PATCH 1/4] hw/block/nvme: add ZONE_FINISH_RECOMMENDED functionality clay.mayers
@ 2022-10-21  6:36   ` Klaus Jensen
  0 siblings, 0 replies; 13+ messages in thread
From: Klaus Jensen @ 2022-10-21  6:36 UTC (permalink / raw)
  To: clay.mayers
  Cc: qemu-devel, Keith Busch, Fam Zheng, Philippe Mathieu-Daudé

[-- Attachment #1: Type: text/plain, Size: 1185 bytes --]

On Oct 20 17:18, clay.mayers@kioxia.com wrote:
> From: Clay Mayers <clay.mayers@kioxia.com>
> 
> Adds ns.param.zoned.finish_time, which sets the number of
> seconds a zone can remain active before the zone attribute
> ZONE_FINISH_RECOMMENDED is set.
> 
> This requires scanning the explicitly open, implicitly open, and closed
> lists of zones whenever a zone is marked as requiring finishing.  The
> expectation is that these lists will be short (tens of items), allowing a
> simpler implementation than keeping the lists sorted.  It also confines
> the overhead to the exceptional case of a timeout rather than incurring it
> whenever zones change state between open and closed.  For use cases where
> this isn't true, finish_time should be 0 to disable this feature (the
> default).
> 
> Signed-off-by: Clay Mayers <clay.mayers@kioxia.com>
> ---
>  docs/system/devices/nvme.rst |  5 +++++
>  hw/nvme/ctrl.c               | 35 +++++++++++++++++++++++++++++++++++
>  hw/nvme/ns.c                 |  8 ++++++++
>  hw/nvme/nvme.h               | 18 ++++++++++++++----
>  4 files changed, 62 insertions(+), 4 deletions(-)
> 

Single timer? Check.

LGTM.

Reviewed-by: Klaus Jensen <k.jensen@samsung.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]


* Re: [PATCH 4/4] hw/block/nvme: add zone descriptor changed AEN
  2022-10-21  0:18 ` [PATCH 4/4] hw/block/nvme: add zone descriptor changed AEN clay.mayers
@ 2022-10-21  6:41   ` Klaus Jensen
  2022-10-21 16:39     ` Clay Mayers
  0 siblings, 1 reply; 13+ messages in thread
From: Klaus Jensen @ 2022-10-21  6:41 UTC (permalink / raw)
  To: clay.mayers
  Cc: qemu-devel, Keith Busch, Fam Zheng, Philippe Mathieu-Daudé

[-- Attachment #1: Type: text/plain, Size: 857 bytes --]

On Oct 20 17:18, clay.mayers@kioxia.com wrote:
> From: Clay Mayers <clay.mayers@kioxia.com>
> 
> If a namespace's param.zoned.finish_time is non-zero,
> controllers register with the namespace to be notified
> when entries are added to its zone-descriptor-changed
> log page.  If the zone-descriptor-changed AEN is enabled,
> this will cause an AEN to be sent from that controller.
> 
> Signed-off-by: Clay Mayers <clay.mayers@kioxia.com>
> ---
>  hw/nvme/ctrl.c       | 62 +++++++++++++++++++++++++++++++++++++++++++-
>  hw/nvme/ns.c         |  1 +
>  hw/nvme/nvme.h       |  9 +++++++
>  include/block/nvme.h |  2 ++
>  4 files changed, 73 insertions(+), 1 deletion(-)
> 

If the controller is hotplugged (device_del'ed), you need to remove the
controller from the watch list as well. I think in nvme_exit().

Otherwise, looks good!

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]


* RE: [PATCH 2/4] hw/block/nvme: add zone descriptor changed log page
  2022-10-21  6:26   ` Klaus Jensen
@ 2022-10-21 15:24     ` Clay Mayers
  0 siblings, 0 replies; 13+ messages in thread
From: Clay Mayers @ 2022-10-21 15:24 UTC (permalink / raw)
  To: Klaus Jensen
  Cc: qemu-devel, Keith Busch, Fam Zheng, Philippe Mathieu-Daudé

> From: Klaus Jensen <its@irrelevant.dk>
> Sent: Thursday, October 20, 2022 11:26 PM
> 
> On Oct 20 17:18, clay.mayers@kioxia.com wrote:
> > From: Clay Mayers <clay.mayers@kioxia.com>
> >
> > Zones marked with ZONE_FINISH_RECOMMENDED are added to the zone
> > descriptor changed log page.  Once read with RAE cleared, they are
> > removed from the list.
> >
> > Zones stay in the list regardless of what other states the zones may
> > go through, so applications must be aware of ABA issues where finish
> > may be recommended, then the zone is freed and re-opened, and the
> > attribute is now clear.
> >
> > Signed-off-by: Clay Mayers <clay.mayers@kioxia.com>
> > ---
> >  hw/nvme/ctrl.c       | 50 ++++++++++++++++++++++++++++++++++++++++++++
> >  hw/nvme/ns.c         |  6 ++++++
> >  hw/nvme/nvme.h       |  8 +++++++
> >  hw/nvme/trace-events |  1 +
> >  include/block/nvme.h |  8 +++++++
> >  5 files changed, 73 insertions(+)
> >
> > diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
> > index d7e9fae0b0..3ffd0fb469 100644
> > --- a/hw/nvme/ctrl.c
> > +++ b/hw/nvme/ctrl.c
> > @@ -1516,15 +1516,42 @@ static void nvme_clear_events(NvmeCtrl *n,
> uint8_t event_type)
> >      }
> >  }
> >
> > +static void nvme_zdc_list(NvmeNamespace *ns, NvmeZoneIdList *zlist, bool
> reset)
> > +{
> > +    NvmeZdc *zdc;
> > +    NvmeZdc *next;
> > +    int index = 0;
> > +
> > +    QTAILQ_FOREACH_SAFE(zdc, &ns->zdc_list, entry, next) {
> > +        if (index >= ARRAY_SIZE(zlist->zids)) {
> > +            break;
> > +        }
> 
> I could've sworn that a previous revision of the spec stated that this
> required `nzid` to be set to 0xffff and the log page be zeroed in this
> case.
> 
> However, upon reading it again, that doesn't seem to be the case, so this
> seems to be just fine.

My reading was it only does that if the list can't be returned.

> 
> > +        zlist->zids[index++] = zdc->zone->d.zslba;
> > +        if (reset) {
> > +            QTAILQ_REMOVE(&ns->zdc_list, zdc, entry);
> > +            zdc->zone->zdc_entry = NULL;
> > +            g_free(zdc);
> > +        }
> > +    }
> > +    zlist->nzid = cpu_to_le16(index);
> > +}
> > +
> >  static void nvme_check_finish(NvmeNamespace *ns, NvmeZoneListHead
> *list)
> >  {
> >      int64_t now = qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL);
> >      NvmeZone *zone;
> > +    NvmeZdc  *zdc;
> >
> >      QTAILQ_FOREACH(zone, list, entry) {
> >          if (zone->finish_ms <= now) {
> >              zone->finish_ms = INT64_MAX;
> >              zone->d.za |= NVME_ZA_FINISH_RECOMMENDED;
> > +            if (!zone->zdc_entry) {
> > +                zdc = g_malloc0(sizeof(*zdc));
> > +                zdc->zone = zone;
> > +                zone->zdc_entry = zdc;
> > +                QTAILQ_INSERT_TAIL(&ns->zdc_list, zdc, entry);
> > +            }
> >          } else if (zone->finish_ms != INT64_MAX) {
> >              timer_mod_anticipate(ns->active_timer, zone->finish_ms);
> >          }
> > @@ -4675,6 +4702,27 @@ static uint16_t nvme_cmd_effects(NvmeCtrl *n,
> uint8_t csi, uint32_t buf_len,
> >      return nvme_c2h(n, ((uint8_t *)&log) + off, trans_len, req);
> >  }
> >
> > +static uint16_t nvme_changed_zones(NvmeCtrl *n, uint8_t rae, uint32_t
> buf_len,
> > +                                    uint64_t off, NvmeRequest *req)
> > +{
> > +    NvmeNamespace *ns;
> > +    NvmeCmd *cmd = &req->cmd;
> > +    uint32_t nsid = le32_to_cpu(cmd->nsid);
> > +    NvmeZoneIdList zlist = { };
> > +    uint32_t trans_len = MIN(sizeof(zlist) - off, buf_len);
> 
> I believe this is unsafe if off >= sizeof(zlist).

Yep, I copied that from how most of the other log pages are
handled, but missed the check for off > sizeof(zlist).

> 
> > +
> > +    nsid = le32_to_cpu(cmd->nsid);
> > +    trace_pci_nvme_changed_zones(nsid);
> > +
> > +    ns = nvme_ns(n, nsid);
> > +    if (!ns) {
> > +        return NVME_INVALID_NSID | NVME_DNR;
> > +    }
> > +    nvme_zdc_list(ns, &zlist, !rae);
> > +
> > +    return nvme_c2h(n, ((uint8_t *)&zlist) + off, trans_len, req);
> > +}
> > +
> >  static uint16_t nvme_get_log(NvmeCtrl *n, NvmeRequest *req)
> >  {
> >      NvmeCmd *cmd = &req->cmd;
> > @@ -4722,6 +4770,8 @@ static uint16_t nvme_get_log(NvmeCtrl *n,
> NvmeRequest *req)
> >          return nvme_changed_nslist(n, rae, len, off, req);
> >      case NVME_LOG_CMD_EFFECTS:
> >          return nvme_cmd_effects(n, csi, len, off, req);
> > +    case NVME_LOG_CHANGED_ZONE:
> > +        return nvme_changed_zones(n, rae, len, off, req);
> >      default:
> >          trace_pci_nvme_err_invalid_log_page(nvme_cid(req), lid);
> >          return NVME_INVALID_FIELD | NVME_DNR;
> > diff --git a/hw/nvme/ns.c b/hw/nvme/ns.c
> > index b577f2d8e0..25cd490c99 100644
> > --- a/hw/nvme/ns.c
> > +++ b/hw/nvme/ns.c
> > @@ -240,6 +240,7 @@ static void
> nvme_ns_zoned_init_state(NvmeNamespace *ns)
> >      QTAILQ_INIT(&ns->imp_open_zones);
> >      QTAILQ_INIT(&ns->closed_zones);
> >      QTAILQ_INIT(&ns->full_zones);
> > +    QTAILQ_INIT(&ns->zdc_list);
> >
> >      zone = ns->zone_array;
> >      for (i = 0; i < ns->num_zones; i++, zone++) {
> > @@ -526,8 +527,13 @@ void nvme_ns_shutdown(NvmeNamespace *ns)
> >
> >  void nvme_ns_cleanup(NvmeNamespace *ns)
> >  {
> > +    NvmeZdc *zdc;
> > +
> >      if (ns->params.zoned) {
> >          timer_free(ns->active_timer);
> > +        while ((zdc = QTAILQ_FIRST(&ns->zdc_list))) {
> > +            QTAILQ_REMOVE(&ns->zdc_list, zdc, entry);
> > +            g_free(zdc);
> > +        }
> >          g_free(ns->id_ns_zoned);
> >          g_free(ns->zone_array);
> >          g_free(ns->zd_extensions);
> > diff --git a/hw/nvme/nvme.h b/hw/nvme/nvme.h
> > index 9a54dcdb32..ae65226150 100644
> > --- a/hw/nvme/nvme.h
> > +++ b/hw/nvme/nvme.h
> > @@ -32,6 +32,7 @@ QEMU_BUILD_BUG_ON(NVME_MAX_NAMESPACES >
> NVME_NSID_BROADCAST - 1);
> >
> >  typedef struct NvmeCtrl NvmeCtrl;
> >  typedef struct NvmeNamespace NvmeNamespace;
> > +typedef struct NvmeZone NvmeZone;
> >
> >  #define TYPE_NVME_BUS "nvme-bus"
> >  OBJECT_DECLARE_SIMPLE_TYPE(NvmeBus, NVME_BUS)
> > @@ -90,10 +91,16 @@ static inline NvmeNamespace
> *nvme_subsys_ns(NvmeSubsystem *subsys,
> >  #define NVME_NS(obj) \
> >      OBJECT_CHECK(NvmeNamespace, (obj), TYPE_NVME_NS)
> >
> > +typedef struct NvmeZdc {
> > +    QTAILQ_ENTRY(NvmeZdc) entry;
> > +    NvmeZone *zone;
> > +} NvmeZdc;
> > +
> >  typedef struct NvmeZone {
> >      NvmeZoneDescr   d;
> >      uint64_t        w_ptr;
> >      int64_t         finish_ms;
> > +    NvmeZdc         *zdc_entry;
> >      QTAILQ_ENTRY(NvmeZone) entry;
> >  } NvmeZone;
> >
> > @@ -172,6 +179,7 @@ typedef struct NvmeNamespace {
> >
> >      int64_t         fto_ms;
> >      QEMUTimer       *active_timer;
> > +    QTAILQ_HEAD(, NvmeZdc) zdc_list;
> >
> >      NvmeNamespaceParams params;
> >
> > diff --git a/hw/nvme/trace-events b/hw/nvme/trace-events
> > index fccb79f489..337927e607 100644
> > --- a/hw/nvme/trace-events
> > +++ b/hw/nvme/trace-events
> > @@ -64,6 +64,7 @@ pci_nvme_identify_nslist(uint32_t ns) "nsid %"PRIu32""
> >  pci_nvme_identify_nslist_csi(uint16_t ns, uint8_t csi) "nsid=%"PRIu16",
> csi=0x%"PRIx8""
> >  pci_nvme_identify_cmd_set(void) "identify i/o command set"
> >  pci_nvme_identify_ns_descr_list(uint32_t ns) "nsid %"PRIu32""
> > +pci_nvme_changed_zones(uint32_t ns) "nsid %"PRIu32""
> >  pci_nvme_get_log(uint16_t cid, uint8_t lid, uint8_t lsp, uint8_t rae, uint32_t
> len, uint64_t off) "cid %"PRIu16" lid 0x%"PRIx8" lsp 0x%"PRIx8" rae 0x%"PRIx8"
> len %"PRIu32" off %"PRIu64""
> >  pci_nvme_getfeat(uint16_t cid, uint32_t nsid, uint8_t fid, uint8_t sel,
> uint32_t cdw11) "cid %"PRIu16" nsid 0x%"PRIx32" fid 0x%"PRIx8" sel
> 0x%"PRIx8" cdw11 0x%"PRIx32""
> >  pci_nvme_setfeat(uint16_t cid, uint32_t nsid, uint8_t fid, uint8_t save,
> uint32_t cdw11) "cid %"PRIu16" nsid 0x%"PRIx32" fid 0x%"PRIx8" save
> 0x%"PRIx8" cdw11 0x%"PRIx32""
> > diff --git a/include/block/nvme.h b/include/block/nvme.h
> > index 8027b7126b..c747cc4948 100644
> > --- a/include/block/nvme.h
> > +++ b/include/block/nvme.h
> > @@ -1010,6 +1010,7 @@ enum NvmeLogIdentifier {
> >      NVME_LOG_FW_SLOT_INFO   = 0x03,
> >      NVME_LOG_CHANGED_NSLIST = 0x04,
> >      NVME_LOG_CMD_EFFECTS    = 0x05,
> > +    NVME_LOG_CHANGED_ZONE   = 0xbf,
> >  };
> >
> >  typedef struct QEMU_PACKED NvmePSD {
> > @@ -1617,6 +1618,12 @@ typedef enum NvmeVirtualResourceType {
> >      NVME_VIRT_RES_INTERRUPT     = 0x01,
> >  } NvmeVirtualResourceType;
> >
> > +typedef struct QEMU_PACKED NvmeZoneIdList {
> > +    uint16_t nzid;
> > +    uint16_t rsvd2[3];
> > +    uint64_t zids[511];
> > +} NvmeZoneIdList;
> > +
> >  static inline void _nvme_check_size(void)
> >  {
> >      QEMU_BUILD_BUG_ON(sizeof(NvmeBar) != 4096);
> > @@ -1655,5 +1662,6 @@ static inline void _nvme_check_size(void)
> >      QEMU_BUILD_BUG_ON(sizeof(NvmePriCtrlCap) != 4096);
> >      QEMU_BUILD_BUG_ON(sizeof(NvmeSecCtrlEntry) != 32);
> >      QEMU_BUILD_BUG_ON(sizeof(NvmeSecCtrlList) != 4096);
> > +    QEMU_BUILD_BUG_ON(sizeof(NvmeZoneIdList) != 4096);
> >  }
> >  #endif
> > --
> > 2.27.0
> >
> 
> --
> One of us - No more doubt, silence or taboo about mental illness.


* RE: [PATCH 3/4] hw/block/nvme: supply dw1 for aen result
  2022-10-21  5:59   ` Klaus Jensen
@ 2022-10-21 15:25     ` Clay Mayers
  0 siblings, 0 replies; 13+ messages in thread
From: Clay Mayers @ 2022-10-21 15:25 UTC (permalink / raw)
  To: Klaus Jensen
  Cc: qemu-devel, Keith Busch, Fam Zheng, Philippe Mathieu-Daudé

> From: Klaus Jensen <its@irrelevant.dk>
> Sent: Thursday, October 20, 2022 10:59 PM
> 
> On Oct 20 17:18, clay.mayers@kioxia.com wrote:
> > From: Clay Mayers <clay.mayers@kioxia.com>
> >
> > The AEN's cqe.dw1 is sometimes required to convey the NSID of the log
> > page to read.  This is the case for the zone descriptor changed log
> > page.
> >
> > Signed-off-by: Clay Mayers <clay.mayers@kioxia.com>
> > ---
> >  hw/nvme/ctrl.c       | 19 +++++++++++--------
> >  hw/nvme/nvme.h       |  2 ++
> >  hw/nvme/trace-events |  2 +-
> >  include/block/nvme.h |  4 +++-
> >  4 files changed, 17 insertions(+), 10 deletions(-)
> >
> > diff --git a/hw/nvme/nvme.h b/hw/nvme/nvme.h
> > index ae65226150..2b7997e4a7 100644
> > --- a/hw/nvme/nvme.h
> > +++ b/hw/nvme/nvme.h
> > @@ -477,6 +477,8 @@ typedef struct NvmeCtrl {
> >      uint64_t    dbbuf_eis;
> >      bool        dbbuf_enabled;
> >
> > +    bool        zdc_event_queued;
> > +
> >      struct {
> >          MemoryRegion mem;
> >          uint8_t      *buf;
> 
> Looks unrelated to this patch.

Yep - should be in patch 4.

> 
> Otherwise,
> 
> Reviewed-by: Klaus Jensen <k.jensen@samsung.com>


* RE: [PATCH 4/4] hw/block/nvme: add zone descriptor changed AEN
  2022-10-21  6:41   ` Klaus Jensen
@ 2022-10-21 16:39     ` Clay Mayers
  0 siblings, 0 replies; 13+ messages in thread
From: Clay Mayers @ 2022-10-21 16:39 UTC (permalink / raw)
  To: Klaus Jensen
  Cc: qemu-devel, Keith Busch, Fam Zheng, Philippe Mathieu-Daudé

> From: Klaus Jensen <its@irrelevant.dk>
> Sent: Thursday, October 20, 2022 11:41 PM
> 
> On Oct 20 17:18, clay.mayers@kioxia.com wrote:
> > From: Clay Mayers <clay.mayers@kioxia.com>
> >
> > If a namespace's param.zoned.finish_time is non-zero,
> > controllers register with the namespace to be notified
> > when entries are added to its zone-descriptor-changed
> > log page.  If the zone-descriptor-changed AEN is enabled,
> > this will cause an AEN to be sent from that controller.
> >
> > Signed-off-by: Clay Mayers <clay.mayers@kioxia.com>
> > ---
> >  hw/nvme/ctrl.c       | 62 +++++++++++++++++++++++++++++++++++++++++++-
> >  hw/nvme/ns.c         |  1 +
> >  hw/nvme/nvme.h       |  9 +++++++
> >  include/block/nvme.h |  2 ++
> >  4 files changed, 73 insertions(+), 1 deletion(-)
> >
> 
> If the controller is hotplugged (device_del'ed), you need to remove the
> controller from the watch list as well. I think in nvme_exit().
> 
> Otherwise, looks good!

Thanks for the quick and useful review.  Adding the watch list was 
a significant design change and I wasn't certain it was in the
spirit of the existing code.  The logic split between ctrl/ns was to
handle shared namespaces as smoothly as possible.



end of thread, other threads:[~2022-10-21 19:01 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-10-21  0:18 [PATCH 0/4] hw/block/nvme: Implement ZNS finish-zone ZDC AEN clay.mayers
2022-10-21  0:18 ` [PATCH 1/4] hw/block/nvme: add ZONE_FINISH_RECOMMENDED functionality clay.mayers
2022-10-21  6:36   ` Klaus Jensen
2022-10-21  0:18 ` [PATCH 2/4] hw/block/nvme: add zone descriptor changed log page clay.mayers
2022-10-21  6:26   ` Klaus Jensen
2022-10-21 15:24     ` Clay Mayers
2022-10-21  0:18 ` [PATCH 3/4] hw/block/nvme: supply dw1 for aen result clay.mayers
2022-10-21  5:59   ` Klaus Jensen
2022-10-21 15:25     ` Clay Mayers
2022-10-21  0:18 ` [PATCH 4/4] hw/block/nvme: add zone descriptor changed AEN clay.mayers
2022-10-21  6:41   ` Klaus Jensen
2022-10-21 16:39     ` Clay Mayers
2022-10-21  5:57 ` [PATCH 0/4] hw/block/nvme: Implement ZNS finish-zone ZDC AEN Klaus Jensen
