All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/9] nvme qemu cleanups and fixes
@ 2020-09-30 22:04 Keith Busch
  2020-09-30 22:04 ` [PATCH 1/9] hw/block/nvme: remove pointless rw indirection Keith Busch
                   ` (10 more replies)
  0 siblings, 11 replies; 36+ messages in thread
From: Keith Busch @ 2020-09-30 22:04 UTC (permalink / raw)
  To: qemu-block, qemu-devel, Klaus Jensen
  Cc: Dmitry Fomichev, Niklas Cassel, Philippe Mathieu-Daudé,
	Keith Busch, Kevin Wolf

After going through the zns enabling, I notice the controller enabling
is not correct. Then I just continued maked more stuff. The series, I
think, contains some of the less controversial patches from the two
conflicting zns series, preceeded by some cleanups and fixes from me.

If this is all fine, I took the liberty of porting the zns enabling to
it and made a public branch for consideration here:

 http://git.infradead.org/qemu-nvme.git/shortlog/refs/heads/kb-zns 

Dmitry Fomichev (1):
  hw/block/nvme: report actual LBA data shift in LBAF

Keith Busch (5):
  hw/block/nvme: remove pointless rw indirection
  hw/block/nvme: fix log page offset check
  hw/block/nvme: support per-namespace smart log
  hw/block/nvme: validate command set selected
  hw/block/nvme: support for admin-only command set

Klaus Jensen (3):
  hw/block/nvme: reject io commands if only admin command set selected
  hw/block/nvme: add nsid to get/setfeat trace events
  hw/block/nvme: add trace event for requests with non-zero status code

 hw/block/nvme-ns.c    |   5 ++
 hw/block/nvme.c       | 194 ++++++++++++++++++++----------------------
 hw/block/trace-events |   6 +-
 include/block/nvme.h  |  11 +++
 4 files changed, 114 insertions(+), 102 deletions(-)

-- 
2.24.1



^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 1/9] hw/block/nvme: remove pointless rw indirection
  2020-09-30 22:04 [PATCH 0/9] nvme qemu cleanups and fixes Keith Busch
@ 2020-09-30 22:04 ` Keith Busch
  2020-10-01  4:05   ` Klaus Jensen
  2020-10-06  1:49   ` Dmitry Fomichev
  2020-09-30 22:04 ` [PATCH 2/9] hw/block/nvme: fix log page offset check Keith Busch
                   ` (9 subsequent siblings)
  10 siblings, 2 replies; 36+ messages in thread
From: Keith Busch @ 2020-09-30 22:04 UTC (permalink / raw)
  To: qemu-block, qemu-devel, Klaus Jensen
  Cc: Dmitry Fomichev, Niklas Cassel, Philippe Mathieu-Daudé,
	Keith Busch, Kevin Wolf

The code switches on the opcode to invoke a function specific to that
opcode. There's no point in consolidating back to a common function that
just switches on that same opcode without any actual common code.
Restore the opcode specific behavior without going back through another
level of switches.

Signed-off-by: Keith Busch <kbusch@kernel.org>
---
 hw/block/nvme.c | 91 ++++++++++++++++---------------------------------
 1 file changed, 29 insertions(+), 62 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index da8344f196..db52ea0db9 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -927,68 +927,12 @@ static void nvme_rw_cb(void *opaque, int ret)
     nvme_enqueue_req_completion(nvme_cq(req), req);
 }
 
-static uint16_t nvme_do_aio(BlockBackend *blk, int64_t offset, size_t len,
-                            NvmeRequest *req)
-{
-    BlockAcctCookie *acct = &req->acct;
-    BlockAcctStats *stats = blk_get_stats(blk);
-
-    bool is_write = false;
-
-    trace_pci_nvme_do_aio(nvme_cid(req), req->cmd.opcode,
-                          nvme_io_opc_str(req->cmd.opcode), blk_name(blk),
-                          offset, len);
-
-    switch (req->cmd.opcode) {
-    case NVME_CMD_FLUSH:
-        block_acct_start(stats, acct, 0, BLOCK_ACCT_FLUSH);
-        req->aiocb = blk_aio_flush(blk, nvme_rw_cb, req);
-        break;
-
-    case NVME_CMD_WRITE_ZEROES:
-        block_acct_start(stats, acct, len, BLOCK_ACCT_WRITE);
-        req->aiocb = blk_aio_pwrite_zeroes(blk, offset, len,
-                                           BDRV_REQ_MAY_UNMAP, nvme_rw_cb,
-                                           req);
-        break;
-
-    case NVME_CMD_WRITE:
-        is_write = true;
-
-        /* fallthrough */
-
-    case NVME_CMD_READ:
-        block_acct_start(stats, acct, len,
-                         is_write ? BLOCK_ACCT_WRITE : BLOCK_ACCT_READ);
-
-        if (req->qsg.sg) {
-            if (is_write) {
-                req->aiocb = dma_blk_write(blk, &req->qsg, offset,
-                                           BDRV_SECTOR_SIZE, nvme_rw_cb, req);
-            } else {
-                req->aiocb = dma_blk_read(blk, &req->qsg, offset,
-                                          BDRV_SECTOR_SIZE, nvme_rw_cb, req);
-            }
-        } else {
-            if (is_write) {
-                req->aiocb = blk_aio_pwritev(blk, offset, &req->iov, 0,
-                                             nvme_rw_cb, req);
-            } else {
-                req->aiocb = blk_aio_preadv(blk, offset, &req->iov, 0,
-                                            nvme_rw_cb, req);
-            }
-        }
-
-        break;
-    }
-
-    return NVME_NO_COMPLETE;
-}
-
 static uint16_t nvme_flush(NvmeCtrl *n, NvmeRequest *req)
 {
-    NvmeNamespace *ns = req->ns;
-    return nvme_do_aio(ns->blkconf.blk, 0, 0, req);
+    block_acct_start(blk_get_stats(n->conf.blk), &req->acct, 0,
+                     BLOCK_ACCT_FLUSH);
+    req->aiocb = blk_aio_flush(n->conf.blk, nvme_rw_cb, req);
+    return NVME_NO_COMPLETE;
 }
 
 static uint16_t nvme_write_zeroes(NvmeCtrl *n, NvmeRequest *req)
@@ -1009,7 +953,11 @@ static uint16_t nvme_write_zeroes(NvmeCtrl *n, NvmeRequest *req)
         return status;
     }
 
-    return nvme_do_aio(ns->blkconf.blk, offset, count, req);
+    block_acct_start(blk_get_stats(n->conf.blk), &req->acct, 0,
+                     BLOCK_ACCT_WRITE);
+    req->aiocb = blk_aio_pwrite_zeroes(n->conf.blk, offset, count,
+                                       BDRV_REQ_MAY_UNMAP, nvme_rw_cb, req);
+    return NVME_NO_COMPLETE;
 }
 
 static uint16_t nvme_rw(NvmeCtrl *n, NvmeRequest *req)
@@ -1023,6 +971,7 @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeRequest *req)
     uint64_t data_offset = nvme_l2b(ns, slba);
     enum BlockAcctType acct = req->cmd.opcode == NVME_CMD_WRITE ?
         BLOCK_ACCT_WRITE : BLOCK_ACCT_READ;
+    BlockBackend *blk = ns->blkconf.blk;
     uint16_t status;
 
     trace_pci_nvme_rw(nvme_cid(req), nvme_io_opc_str(rw->opcode),
@@ -1045,7 +994,25 @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeRequest *req)
         goto invalid;
     }
 
-    return nvme_do_aio(ns->blkconf.blk, data_offset, data_size, req);
+    block_acct_start(blk_get_stats(blk), &req->acct, data_size, acct);
+    if (req->qsg.sg) {
+        if (acct == BLOCK_ACCT_WRITE) {
+            req->aiocb = dma_blk_write(blk, &req->qsg, data_offset,
+                                       BDRV_SECTOR_SIZE, nvme_rw_cb, req);
+        } else {
+            req->aiocb = dma_blk_read(blk, &req->qsg, data_offset,
+                                      BDRV_SECTOR_SIZE, nvme_rw_cb, req);
+        }
+    } else {
+        if (acct == BLOCK_ACCT_WRITE) {
+            req->aiocb = blk_aio_pwritev(blk, data_offset, &req->iov, 0,
+                                         nvme_rw_cb, req);
+        } else {
+            req->aiocb = blk_aio_preadv(blk, data_offset, &req->iov, 0,
+                                        nvme_rw_cb, req);
+        }
+    }
+    return NVME_NO_COMPLETE;
 
 invalid:
     block_acct_invalid(blk_get_stats(ns->blkconf.blk), acct);
-- 
2.24.1



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 2/9] hw/block/nvme: fix log page offset check
  2020-09-30 22:04 [PATCH 0/9] nvme qemu cleanups and fixes Keith Busch
  2020-09-30 22:04 ` [PATCH 1/9] hw/block/nvme: remove pointless rw indirection Keith Busch
@ 2020-09-30 22:04 ` Keith Busch
  2020-09-30 23:18   ` Dmitry Fomichev
                     ` (2 more replies)
  2020-09-30 22:04 ` [PATCH 3/9] hw/block/nvme: support per-namespace smart log Keith Busch
                   ` (8 subsequent siblings)
  10 siblings, 3 replies; 36+ messages in thread
From: Keith Busch @ 2020-09-30 22:04 UTC (permalink / raw)
  To: qemu-block, qemu-devel, Klaus Jensen
  Cc: Dmitry Fomichev, Niklas Cassel, Philippe Mathieu-Daudé,
	Keith Busch, Kevin Wolf

Return error if the requested offset starts after the size of the log
being returned. Also, move the check for earlier in the function so
we're not doing unnecessary calculations.

Signed-off-by: Keith Busch <kbusch@kernel.org>
---
 hw/block/nvme.c | 22 ++++++++++------------
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index db52ea0db9..8d2b5be567 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -1179,6 +1179,10 @@ static uint16_t nvme_smart_info(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
         return NVME_INVALID_FIELD | NVME_DNR;
     }
 
+    if (off >= sizeof(smart)) {
+        return NVME_INVALID_FIELD | NVME_DNR;
+    }
+
     for (int i = 1; i <= n->num_namespaces; i++) {
         NvmeNamespace *ns = nvme_ns(n, i);
         if (!ns) {
@@ -1193,10 +1197,6 @@ static uint16_t nvme_smart_info(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
         write_commands += s->nr_ops[BLOCK_ACCT_WRITE];
     }
 
-    if (off > sizeof(smart)) {
-        return NVME_INVALID_FIELD | NVME_DNR;
-    }
-
     trans_len = MIN(sizeof(smart) - off, buf_len);
 
     memset(&smart, 0x0, sizeof(smart));
@@ -1234,12 +1234,11 @@ static uint16_t nvme_fw_log_info(NvmeCtrl *n, uint32_t buf_len, uint64_t off,
         .afi = 0x1,
     };
 
-    strpadcpy((char *)&fw_log.frs1, sizeof(fw_log.frs1), "1.0", ' ');
-
-    if (off > sizeof(fw_log)) {
+    if (off >= sizeof(fw_log)) {
         return NVME_INVALID_FIELD | NVME_DNR;
     }
 
+    strpadcpy((char *)&fw_log.frs1, sizeof(fw_log.frs1), "1.0", ' ');
     trans_len = MIN(sizeof(fw_log) - off, buf_len);
 
     return nvme_dma(n, (uint8_t *) &fw_log + off, trans_len,
@@ -1252,16 +1251,15 @@ static uint16_t nvme_error_info(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
     uint32_t trans_len;
     NvmeErrorLog errlog;
 
-    if (!rae) {
-        nvme_clear_events(n, NVME_AER_TYPE_ERROR);
+    if (off >= sizeof(errlog)) {
+        return NVME_INVALID_FIELD | NVME_DNR;
     }
 
-    if (off > sizeof(errlog)) {
-        return NVME_INVALID_FIELD | NVME_DNR;
+    if (!rae) {
+        nvme_clear_events(n, NVME_AER_TYPE_ERROR);
     }
 
     memset(&errlog, 0x0, sizeof(errlog));
-
     trans_len = MIN(sizeof(errlog) - off, buf_len);
 
     return nvme_dma(n, (uint8_t *)&errlog, trans_len,
-- 
2.24.1



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 3/9] hw/block/nvme: support per-namespace smart log
  2020-09-30 22:04 [PATCH 0/9] nvme qemu cleanups and fixes Keith Busch
  2020-09-30 22:04 ` [PATCH 1/9] hw/block/nvme: remove pointless rw indirection Keith Busch
  2020-09-30 22:04 ` [PATCH 2/9] hw/block/nvme: fix log page offset check Keith Busch
@ 2020-09-30 22:04 ` Keith Busch
  2020-10-01  4:10   ` Klaus Jensen
                     ` (2 more replies)
  2020-09-30 22:04 ` [PATCH 4/9] hw/block/nvme: validate command set selected Keith Busch
                   ` (7 subsequent siblings)
  10 siblings, 3 replies; 36+ messages in thread
From: Keith Busch @ 2020-09-30 22:04 UTC (permalink / raw)
  To: qemu-block, qemu-devel, Klaus Jensen
  Cc: Dmitry Fomichev, Niklas Cassel, Philippe Mathieu-Daudé,
	Keith Busch, Kevin Wolf

Let the user specify a specific namespace if they want to get access
stats for a specific namespace.

Signed-off-by: Keith Busch <kbusch@kernel.org>
---
 hw/block/nvme.c      | 66 +++++++++++++++++++++++++++-----------------
 include/block/nvme.h |  1 +
 2 files changed, 41 insertions(+), 26 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 8d2b5be567..41389b2b09 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -1164,48 +1164,62 @@ static uint16_t nvme_create_sq(NvmeCtrl *n, NvmeRequest *req)
     return NVME_SUCCESS;
 }
 
+struct nvme_stats {
+    uint64_t units_read;
+    uint64_t units_written;
+    uint64_t read_commands;
+    uint64_t write_commands;
+};
+
+static void nvme_set_blk_stats(NvmeNamespace *ns, struct nvme_stats *stats)
+{
+    BlockAcctStats *s = blk_get_stats(ns->blkconf.blk);
+
+    stats->units_read += s->nr_bytes[BLOCK_ACCT_READ] >> BDRV_SECTOR_BITS;
+    stats->units_written += s->nr_bytes[BLOCK_ACCT_WRITE] >> BDRV_SECTOR_BITS;
+    stats->read_commands += s->nr_ops[BLOCK_ACCT_READ];
+    stats->write_commands += s->nr_ops[BLOCK_ACCT_WRITE];
+}
+
 static uint16_t nvme_smart_info(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
                                 uint64_t off, NvmeRequest *req)
 {
     uint32_t nsid = le32_to_cpu(req->cmd.nsid);
-
+    struct nvme_stats stats = { 0 };
+    NvmeSmartLog smart = { 0 };
     uint32_t trans_len;
+    NvmeNamespace *ns;
     time_t current_ms;
-    uint64_t units_read = 0, units_written = 0;
-    uint64_t read_commands = 0, write_commands = 0;
-    NvmeSmartLog smart;
-
-    if (nsid && nsid != 0xffffffff) {
-        return NVME_INVALID_FIELD | NVME_DNR;
-    }
 
     if (off >= sizeof(smart)) {
         return NVME_INVALID_FIELD | NVME_DNR;
     }
 
-    for (int i = 1; i <= n->num_namespaces; i++) {
-        NvmeNamespace *ns = nvme_ns(n, i);
-        if (!ns) {
-            continue;
-        }
-
-        BlockAcctStats *s = blk_get_stats(ns->blkconf.blk);
+    if (nsid != 0xffffffff) {
+        ns = nvme_ns(n, nsid);
+        if (!ns)
+            return NVME_INVALID_NSID | NVME_DNR;
+        nvme_set_blk_stats(ns, &stats);
+    } else {
+        int i;
 
-        units_read += s->nr_bytes[BLOCK_ACCT_READ] >> BDRV_SECTOR_BITS;
-        units_written += s->nr_bytes[BLOCK_ACCT_WRITE] >> BDRV_SECTOR_BITS;
-        read_commands += s->nr_ops[BLOCK_ACCT_READ];
-        write_commands += s->nr_ops[BLOCK_ACCT_WRITE];
+        for (i = 1; i <= n->num_namespaces; i++) {
+            ns = nvme_ns(n, i);
+            if (!ns) {
+                continue;
+            }
+            nvme_set_blk_stats(ns, &stats);
+        }
     }
 
     trans_len = MIN(sizeof(smart) - off, buf_len);
 
-    memset(&smart, 0x0, sizeof(smart));
-
-    smart.data_units_read[0] = cpu_to_le64(DIV_ROUND_UP(units_read, 1000));
-    smart.data_units_written[0] = cpu_to_le64(DIV_ROUND_UP(units_written,
+    smart.data_units_read[0] = cpu_to_le64(DIV_ROUND_UP(stats.units_read,
+                                                        1000));
+    smart.data_units_written[0] = cpu_to_le64(DIV_ROUND_UP(stats.units_written,
                                                            1000));
-    smart.host_read_commands[0] = cpu_to_le64(read_commands);
-    smart.host_write_commands[0] = cpu_to_le64(write_commands);
+    smart.host_read_commands[0] = cpu_to_le64(stats.read_commands);
+    smart.host_write_commands[0] = cpu_to_le64(stats.write_commands);
 
     smart.temperature = cpu_to_le16(n->temperature);
 
@@ -2708,7 +2722,7 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
     id->acl = 3;
     id->aerl = n->params.aerl;
     id->frmw = (NVME_NUM_FW_SLOTS << 1) | NVME_FRMW_SLOT1_RO;
-    id->lpa = NVME_LPA_EXTENDED;
+    id->lpa = NVME_LPA_NS_SMART | NVME_LPA_EXTENDED;
 
     /* recommended default value (~70 C) */
     id->wctemp = cpu_to_le16(NVME_TEMPERATURE_WARNING);
diff --git a/include/block/nvme.h b/include/block/nvme.h
index 58647bcdad..868cf53f0b 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -849,6 +849,7 @@ enum NvmeIdCtrlFrmw {
 };
 
 enum NvmeIdCtrlLpa {
+    NVME_LPA_NS_SMART = 1 << 0,
     NVME_LPA_EXTENDED = 1 << 2,
 };
 
-- 
2.24.1



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 4/9] hw/block/nvme: validate command set selected
  2020-09-30 22:04 [PATCH 0/9] nvme qemu cleanups and fixes Keith Busch
                   ` (2 preceding siblings ...)
  2020-09-30 22:04 ` [PATCH 3/9] hw/block/nvme: support per-namespace smart log Keith Busch
@ 2020-09-30 22:04 ` Keith Busch
  2020-10-01  4:14   ` Klaus Jensen
  2020-09-30 22:04 ` [PATCH 5/9] hw/block/nvme: support for admin-only command set Keith Busch
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 36+ messages in thread
From: Keith Busch @ 2020-09-30 22:04 UTC (permalink / raw)
  To: qemu-block, qemu-devel, Klaus Jensen
  Cc: Dmitry Fomichev, Niklas Cassel, Philippe Mathieu-Daudé,
	Keith Busch, Kevin Wolf

Fail to start the controller if the user requests a command set that the
controller does not support.

Signed-off-by: Keith Busch <kbusch@kernel.org>
---
 hw/block/nvme.c       | 6 +++++-
 hw/block/trace-events | 1 +
 include/block/nvme.h  | 4 ++++
 3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 41389b2b09..6c582e6874 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -2049,6 +2049,10 @@ static int nvme_start_ctrl(NvmeCtrl *n)
         trace_pci_nvme_err_startfail_acq_misaligned(n->bar.acq);
         return -1;
     }
+    if (unlikely(!(NVME_CAP_CSS(n->bar.cap) & (1 << NVME_CC_CSS(n->bar.cc))))) {
+        trace_pci_nvme_err_startfail_css(NVME_CC_CSS(n->bar.cc));
+        return -1;
+    }
     if (unlikely(NVME_CC_MPS(n->bar.cc) <
                  NVME_CAP_MPSMIN(n->bar.cap))) {
         trace_pci_nvme_err_startfail_page_too_small(
@@ -2750,7 +2754,7 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
     NVME_CAP_SET_MQES(n->bar.cap, 0x7ff);
     NVME_CAP_SET_CQR(n->bar.cap, 1);
     NVME_CAP_SET_TO(n->bar.cap, 0xf);
-    NVME_CAP_SET_CSS(n->bar.cap, 1);
+    NVME_CAP_SET_CSS(n->bar.cap, NVME_CAP_CSS_NVM);
     NVME_CAP_SET_MPSMAX(n->bar.cap, 4);
 
     n->bar.vs = NVME_SPEC_VER;
diff --git a/hw/block/trace-events b/hw/block/trace-events
index 446cca08e9..7720e1b4d9 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -133,6 +133,7 @@ pci_nvme_err_startfail_cqent_too_small(uint8_t log2ps, uint8_t maxlog2ps) "nvme_
 pci_nvme_err_startfail_cqent_too_large(uint8_t log2ps, uint8_t maxlog2ps) "nvme_start_ctrl failed because the completion queue entry size is too large: log2size=%u, max=%u"
 pci_nvme_err_startfail_sqent_too_small(uint8_t log2ps, uint8_t maxlog2ps) "nvme_start_ctrl failed because the submission queue entry size is too small: log2size=%u, min=%u"
 pci_nvme_err_startfail_sqent_too_large(uint8_t log2ps, uint8_t maxlog2ps) "nvme_start_ctrl failed because the submission queue entry size is too large: log2size=%u, max=%u"
+pci_nvme_err_startfail_css(uint8_t css) "nvme_start_ctrl failed because invalid command set selected:%u"
 pci_nvme_err_startfail_asqent_sz_zero(void) "nvme_start_ctrl failed because the admin submission queue size is zero"
 pci_nvme_err_startfail_acqent_sz_zero(void) "nvme_start_ctrl failed because the admin completion queue size is zero"
 pci_nvme_err_startfail(void) "setting controller enable bit failed"
diff --git a/include/block/nvme.h b/include/block/nvme.h
index 868cf53f0b..bc20a2ba5e 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -82,6 +82,10 @@ enum NvmeCapMask {
 #define NVME_CAP_SET_PMRS(cap, val) (cap |= (uint64_t)(val & CAP_PMR_MASK)\
                                                             << CAP_PMR_SHIFT)
 
+enum NvmeCapCss {
+    NVME_CAP_CSS_NVM = 1 << 0,
+};
+
 enum NvmeCcShift {
     CC_EN_SHIFT     = 0,
     CC_CSS_SHIFT    = 4,
-- 
2.24.1



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 5/9] hw/block/nvme: support for admin-only command set
  2020-09-30 22:04 [PATCH 0/9] nvme qemu cleanups and fixes Keith Busch
                   ` (3 preceding siblings ...)
  2020-09-30 22:04 ` [PATCH 4/9] hw/block/nvme: validate command set selected Keith Busch
@ 2020-09-30 22:04 ` Keith Busch
  2020-10-01  0:11   ` Dmitry Fomichev
  2020-10-01  4:17   ` Klaus Jensen
  2020-09-30 22:04 ` [PATCH 6/9] hw/block/nvme: reject io commands if only admin command set selected Keith Busch
                   ` (5 subsequent siblings)
  10 siblings, 2 replies; 36+ messages in thread
From: Keith Busch @ 2020-09-30 22:04 UTC (permalink / raw)
  To: qemu-block, qemu-devel, Klaus Jensen
  Cc: Dmitry Fomichev, Niklas Cassel, Philippe Mathieu-Daudé,
	Keith Busch, Kevin Wolf

Signed-off-by: Keith Busch <kbusch@kernel.org>
---
 hw/block/nvme.c      | 1 +
 include/block/nvme.h | 3 ++-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 6c582e6874..ec7363ea40 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -2755,6 +2755,7 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
     NVME_CAP_SET_CQR(n->bar.cap, 1);
     NVME_CAP_SET_TO(n->bar.cap, 0xf);
     NVME_CAP_SET_CSS(n->bar.cap, NVME_CAP_CSS_NVM);
+    NVME_CAP_SET_CSS(n->bar.cap, NVME_CAP_CSS_ADMIN_ONLY);
     NVME_CAP_SET_MPSMAX(n->bar.cap, 4);
 
     n->bar.vs = NVME_SPEC_VER;
diff --git a/include/block/nvme.h b/include/block/nvme.h
index bc20a2ba5e..521533fd2a 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -83,7 +83,8 @@ enum NvmeCapMask {
                                                             << CAP_PMR_SHIFT)
 
 enum NvmeCapCss {
-    NVME_CAP_CSS_NVM = 1 << 0,
+    NVME_CAP_CSS_NVM        = 1 << 0,
+    NVME_CAP_CSS_ADMIN_ONLY = 1 << 7,
 };
 
 enum NvmeCcShift {
-- 
2.24.1



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 6/9] hw/block/nvme: reject io commands if only admin command set selected
  2020-09-30 22:04 [PATCH 0/9] nvme qemu cleanups and fixes Keith Busch
                   ` (4 preceding siblings ...)
  2020-09-30 22:04 ` [PATCH 5/9] hw/block/nvme: support for admin-only command set Keith Busch
@ 2020-09-30 22:04 ` Keith Busch
  2020-09-30 23:11   ` Dmitry Fomichev
  2020-09-30 22:04 ` [PATCH 7/9] hw/block/nvme: add nsid to get/setfeat trace events Keith Busch
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 36+ messages in thread
From: Keith Busch @ 2020-09-30 22:04 UTC (permalink / raw)
  To: qemu-block, qemu-devel, Klaus Jensen
  Cc: Dmitry Fomichev, Niklas Cassel, Philippe Mathieu-Daudé,
	Keith Busch, Kevin Wolf

From: Klaus Jensen <k.jensen@samsung.com>

If the host sets CC.CSS to 111b, all commands submitted to I/O queues
should be completed with status Invalid Command Opcode.

Note that this is technically a v1.4 feature, but it does not hurt to
implement before we finally bump the reported version implemented.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
---
 hw/block/nvme.c      | 4 ++++
 include/block/nvme.h | 5 +++++
 2 files changed, 9 insertions(+)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index ec7363ea40..80730e1c03 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -1026,6 +1026,10 @@ static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeRequest *req)
     trace_pci_nvme_io_cmd(nvme_cid(req), nsid, nvme_sqid(req),
                           req->cmd.opcode, nvme_io_opc_str(req->cmd.opcode));
 
+    if (NVME_CC_CSS(n->bar.cc) == NVME_CC_CSS_ADMIN_ONLY) {
+        return NVME_INVALID_OPCODE | NVME_DNR;
+    }
+
     if (!nvme_nsid_valid(n, nsid)) {
         return NVME_INVALID_NSID | NVME_DNR;
     }
diff --git a/include/block/nvme.h b/include/block/nvme.h
index 521533fd2a..6de2d5aa75 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -115,6 +115,11 @@ enum NvmeCcMask {
 #define NVME_CC_IOSQES(cc) ((cc >> CC_IOSQES_SHIFT) & CC_IOSQES_MASK)
 #define NVME_CC_IOCQES(cc) ((cc >> CC_IOCQES_SHIFT) & CC_IOCQES_MASK)
 
+enum NvmeCcCss {
+    NVME_CC_CSS_NVM        = 0x0,
+    NVME_CC_CSS_ADMIN_ONLY = 0x7,
+};
+
 enum NvmeCstsShift {
     CSTS_RDY_SHIFT      = 0,
     CSTS_CFS_SHIFT      = 1,
-- 
2.24.1



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 7/9] hw/block/nvme: add nsid to get/setfeat trace events
  2020-09-30 22:04 [PATCH 0/9] nvme qemu cleanups and fixes Keith Busch
                   ` (5 preceding siblings ...)
  2020-09-30 22:04 ` [PATCH 6/9] hw/block/nvme: reject io commands if only admin command set selected Keith Busch
@ 2020-09-30 22:04 ` Keith Busch
  2020-09-30 22:04 ` [PATCH 8/9] hw/block/nvme: add trace event for requests with non-zero status code Keith Busch
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 36+ messages in thread
From: Keith Busch @ 2020-09-30 22:04 UTC (permalink / raw)
  To: qemu-block, qemu-devel, Klaus Jensen
  Cc: Dmitry Fomichev, Niklas Cassel, Philippe Mathieu-Daudé,
	Keith Busch, Kevin Wolf

From: Klaus Jensen <k.jensen@samsung.com>

Include the namespace id in the pci_nvme_{get,set}feat trace events.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
---
 hw/block/nvme.c       | 4 ++--
 hw/block/trace-events | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 80730e1c03..dc971c9653 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -1643,7 +1643,7 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeRequest *req)
         [NVME_ARBITRATION] = NVME_ARB_AB_NOLIMIT,
     };
 
-    trace_pci_nvme_getfeat(nvme_cid(req), fid, sel, dw11);
+    trace_pci_nvme_getfeat(nvme_cid(req), nsid, fid, sel, dw11);
 
     if (!nvme_feature_support[fid]) {
         return NVME_INVALID_FIELD | NVME_DNR;
@@ -1781,7 +1781,7 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeRequest *req)
     uint8_t fid = NVME_GETSETFEAT_FID(dw10);
     uint8_t save = NVME_SETFEAT_SAVE(dw10);
 
-    trace_pci_nvme_setfeat(nvme_cid(req), fid, save, dw11);
+    trace_pci_nvme_setfeat(nvme_cid(req), nsid, fid, save, dw11);
 
     if (save) {
         return NVME_FID_NOT_SAVEABLE | NVME_DNR;
diff --git a/hw/block/trace-events b/hw/block/trace-events
index 7720e1b4d9..180c43d258 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -53,8 +53,8 @@ pci_nvme_identify_ns(uint32_t ns) "nsid %"PRIu32""
 pci_nvme_identify_nslist(uint32_t ns) "nsid %"PRIu32""
 pci_nvme_identify_ns_descr_list(uint32_t ns) "nsid %"PRIu32""
 pci_nvme_get_log(uint16_t cid, uint8_t lid, uint8_t lsp, uint8_t rae, uint32_t len, uint64_t off) "cid %"PRIu16" lid 0x%"PRIx8" lsp 0x%"PRIx8" rae 0x%"PRIx8" len %"PRIu32" off %"PRIu64""
-pci_nvme_getfeat(uint16_t cid, uint8_t fid, uint8_t sel, uint32_t cdw11) "cid %"PRIu16" fid 0x%"PRIx8" sel 0x%"PRIx8" cdw11 0x%"PRIx32""
-pci_nvme_setfeat(uint16_t cid, uint8_t fid, uint8_t save, uint32_t cdw11) "cid %"PRIu16" fid 0x%"PRIx8" save 0x%"PRIx8" cdw11 0x%"PRIx32""
+pci_nvme_getfeat(uint16_t cid, uint32_t nsid, uint8_t fid, uint8_t sel, uint32_t cdw11) "cid %"PRIu16" nsid 0x%"PRIx32" fid 0x%"PRIx8" sel 0x%"PRIx8" cdw11 0x%"PRIx32""
+pci_nvme_setfeat(uint16_t cid, uint32_t nsid, uint8_t fid, uint8_t save, uint32_t cdw11) "cid %"PRIu16" nsid 0x%"PRIx32" fid 0x%"PRIx8" save 0x%"PRIx8" cdw11 0x%"PRIx32""
 pci_nvme_getfeat_vwcache(const char* result) "get feature volatile write cache, result=%s"
 pci_nvme_getfeat_numq(int result) "get feature number of queues, result=%d"
 pci_nvme_setfeat_numq(int reqcq, int reqsq, int gotcq, int gotsq) "requested cq_count=%d sq_count=%d, responding with cq_count=%d sq_count=%d"
-- 
2.24.1



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 8/9] hw/block/nvme: add trace event for requests with non-zero status code
  2020-09-30 22:04 [PATCH 0/9] nvme qemu cleanups and fixes Keith Busch
                   ` (6 preceding siblings ...)
  2020-09-30 22:04 ` [PATCH 7/9] hw/block/nvme: add nsid to get/setfeat trace events Keith Busch
@ 2020-09-30 22:04 ` Keith Busch
  2020-09-30 23:21   ` Dmitry Fomichev
  2020-10-01 15:25   ` Philippe Mathieu-Daudé
  2020-09-30 22:04 ` [PATCH 9/9] hw/block/nvme: report actual LBA data shift in LBAF Keith Busch
                   ` (2 subsequent siblings)
  10 siblings, 2 replies; 36+ messages in thread
From: Keith Busch @ 2020-09-30 22:04 UTC (permalink / raw)
  To: qemu-block, qemu-devel, Klaus Jensen
  Cc: Dmitry Fomichev, Niklas Cassel, Philippe Mathieu-Daudé,
	Keith Busch, Kevin Wolf

From: Klaus Jensen <k.jensen@samsung.com>

If a command results in a non-zero status code, trace it.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
---
 hw/block/nvme.c       | 6 ++++++
 hw/block/trace-events | 1 +
 2 files changed, 7 insertions(+)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index dc971c9653..16804d0278 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -777,6 +777,12 @@ static void nvme_enqueue_req_completion(NvmeCQueue *cq, NvmeRequest *req)
     assert(cq->cqid == req->sq->cqid);
     trace_pci_nvme_enqueue_req_completion(nvme_cid(req), cq->cqid,
                                           req->status);
+
+    if (req->status) {
+        trace_pci_nvme_err_req_status(nvme_cid(req), nvme_nsid(req->ns),
+                                      req->status, req->cmd.opcode);
+    }
+
     QTAILQ_REMOVE(&req->sq->out_req_list, req, entry);
     QTAILQ_INSERT_TAIL(&cq->req_list, req, entry);
     timer_mod(cq->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + 500);
diff --git a/hw/block/trace-events b/hw/block/trace-events
index 180c43d258..ff3ca4bbf6 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -89,6 +89,7 @@ pci_nvme_mmio_shutdown_cleared(void) "shutdown bit cleared"
 
 # nvme traces for error conditions
 pci_nvme_err_mdts(uint16_t cid, size_t len) "cid %"PRIu16" len %zu"
+pci_nvme_err_req_status(uint16_t cid, uint32_t nsid, uint16_t status, uint8_t opc) "cid %"PRIu16" nsid %"PRIu32" status 0x%"PRIx16" opc 0x%"PRIx8""
 pci_nvme_err_addr_read(uint64_t addr) "addr 0x%"PRIx64""
 pci_nvme_err_addr_write(uint64_t addr) "addr 0x%"PRIx64""
 pci_nvme_err_cfs(void) "controller fatal status"
-- 
2.24.1



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 9/9] hw/block/nvme: report actual LBA data shift in LBAF
  2020-09-30 22:04 [PATCH 0/9] nvme qemu cleanups and fixes Keith Busch
                   ` (7 preceding siblings ...)
  2020-09-30 22:04 ` [PATCH 8/9] hw/block/nvme: add trace event for requests with non-zero status code Keith Busch
@ 2020-09-30 22:04 ` Keith Busch
  2020-10-01  9:48   ` Klaus Jensen
  2020-10-01 18:46 ` [PATCH 0/9] nvme qemu cleanups and fixes Klaus Jensen
  2020-10-13  9:04 ` Klaus Jensen
  10 siblings, 1 reply; 36+ messages in thread
From: Keith Busch @ 2020-09-30 22:04 UTC (permalink / raw)
  To: qemu-block, qemu-devel, Klaus Jensen
  Cc: Dmitry Fomichev, Niklas Cassel, Philippe Mathieu-Daudé,
	Keith Busch, Kevin Wolf

From: Dmitry Fomichev <dmitry.fomichev@wdc.com>

Calculate the data shift value to report based on the set value of
logical_block_size device property.

In the process, use a local variable to calculate the LBA format
index instead of the hardcoded value 0. This makes the code more
readable and it will make it easier to add support for multiple LBA
formats in the future.

Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
---
 hw/block/nvme-ns.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c
index 2ba0263dda..a85e5fdb42 100644
--- a/hw/block/nvme-ns.c
+++ b/hw/block/nvme-ns.c
@@ -47,6 +47,8 @@ static void nvme_ns_init(NvmeNamespace *ns)
 
 static int nvme_ns_init_blk(NvmeCtrl *n, NvmeNamespace *ns, Error **errp)
 {
+    int lba_index;
+
     if (!blkconf_blocksizes(&ns->blkconf, errp)) {
         return -1;
     }
@@ -67,6 +69,9 @@ static int nvme_ns_init_blk(NvmeCtrl *n, NvmeNamespace *ns, Error **errp)
         n->features.vwc = 0x1;
     }
 
+    lba_index = NVME_ID_NS_FLBAS_INDEX(ns->id_ns.flbas);
+    ns->id_ns.lbaf[lba_index].ds = 31 - clz32(ns->blkconf.logical_block_size);
+
     return 0;
 }
 
-- 
2.24.1



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* RE: [PATCH 6/9] hw/block/nvme: reject io commands if only admin command set selected
  2020-09-30 22:04 ` [PATCH 6/9] hw/block/nvme: reject io commands if only admin command set selected Keith Busch
@ 2020-09-30 23:11   ` Dmitry Fomichev
  0 siblings, 0 replies; 36+ messages in thread
From: Dmitry Fomichev @ 2020-09-30 23:11 UTC (permalink / raw)
  To: Keith Busch, qemu-block, qemu-devel, Klaus Jensen
  Cc: Kevin Wolf, Niklas Cassel, Philippe Mathieu-Daudé

> -----Original Message-----
> From: Keith Busch <kbusch@kernel.org>
> Sent: Wednesday, September 30, 2020 6:04 PM
> To: qemu-block@nongnu.org; qemu-devel@nongnu.org; Klaus Jensen
> <k.jensen@samsung.com>
> Cc: Niklas Cassel <Niklas.Cassel@wdc.com>; Dmitry Fomichev
> <Dmitry.Fomichev@wdc.com>; Kevin Wolf <kwolf@redhat.com>; Philippe
> Mathieu-Daudé <philmd@redhat.com>; Keith Busch <kbusch@kernel.org>
> Subject: [PATCH 6/9] hw/block/nvme: reject io commands if only admin
> command set selected
> 
> From: Klaus Jensen <k.jensen@samsung.com>
> 
> If the host sets CC.CSS to 111b, all commands submitted to I/O queues
> should be completed with status Invalid Command Opcode.
> 
> Note that this is technically a v1.4 feature, but it does not hurt to
> implement before we finally bump the reported version implemented.
> 
> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> Signed-off-by: Keith Busch <kbusch@kernel.org>

Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>

> ---
>  hw/block/nvme.c      | 4 ++++
>  include/block/nvme.h | 5 +++++
>  2 files changed, 9 insertions(+)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index ec7363ea40..80730e1c03 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -1026,6 +1026,10 @@ static uint16_t nvme_io_cmd(NvmeCtrl *n,
> NvmeRequest *req)
>      trace_pci_nvme_io_cmd(nvme_cid(req), nsid, nvme_sqid(req),
>                            req->cmd.opcode, nvme_io_opc_str(req->cmd.opcode));
> 
> +    if (NVME_CC_CSS(n->bar.cc) == NVME_CC_CSS_ADMIN_ONLY) {
> +        return NVME_INVALID_OPCODE | NVME_DNR;
> +    }
> +
>      if (!nvme_nsid_valid(n, nsid)) {
>          return NVME_INVALID_NSID | NVME_DNR;
>      }
> diff --git a/include/block/nvme.h b/include/block/nvme.h
> index 521533fd2a..6de2d5aa75 100644
> --- a/include/block/nvme.h
> +++ b/include/block/nvme.h
> @@ -115,6 +115,11 @@ enum NvmeCcMask {
>  #define NVME_CC_IOSQES(cc) ((cc >> CC_IOSQES_SHIFT) &
> CC_IOSQES_MASK)
>  #define NVME_CC_IOCQES(cc) ((cc >> CC_IOCQES_SHIFT) &
> CC_IOCQES_MASK)
> 
> +enum NvmeCcCss {
> +    NVME_CC_CSS_NVM        = 0x0,
> +    NVME_CC_CSS_ADMIN_ONLY = 0x7,
> +};
> +
>  enum NvmeCstsShift {
>      CSTS_RDY_SHIFT      = 0,
>      CSTS_CFS_SHIFT      = 1,
> --
> 2.24.1



^ permalink raw reply	[flat|nested] 36+ messages in thread

* RE: [PATCH 2/9] hw/block/nvme: fix log page offset check
  2020-09-30 22:04 ` [PATCH 2/9] hw/block/nvme: fix log page offset check Keith Busch
@ 2020-09-30 23:18   ` Dmitry Fomichev
  2020-10-01  4:05   ` Klaus Jensen
  2020-10-01 10:11   ` Philippe Mathieu-Daudé
  2 siblings, 0 replies; 36+ messages in thread
From: Dmitry Fomichev @ 2020-09-30 23:18 UTC (permalink / raw)
  To: Keith Busch, qemu-block, qemu-devel, Klaus Jensen
  Cc: Kevin Wolf, Niklas Cassel, Philippe Mathieu-Daudé

> -----Original Message-----
> From: Keith Busch <kbusch@kernel.org>
> Sent: Wednesday, September 30, 2020 6:04 PM
> To: qemu-block@nongnu.org; qemu-devel@nongnu.org; Klaus Jensen
> <k.jensen@samsung.com>
> Cc: Niklas Cassel <Niklas.Cassel@wdc.com>; Dmitry Fomichev
> <Dmitry.Fomichev@wdc.com>; Kevin Wolf <kwolf@redhat.com>; Philippe
> Mathieu-Daudé <philmd@redhat.com>; Keith Busch <kbusch@kernel.org>
> Subject: [PATCH 2/9] hw/block/nvme: fix log page offset check
> 
> Return error if the requested offset starts after the size of the log
> being returned. Also, move the check for earlier in the function so
> we're not doing unnecessary calculations.
> 
> Signed-off-by: Keith Busch <kbusch@kernel.org>

Reviewed- by: Dmitry Fomichev <dmitry.fomichev@wdc.com>

> ---
>  hw/block/nvme.c | 22 ++++++++++------------
>  1 file changed, 10 insertions(+), 12 deletions(-)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index db52ea0db9..8d2b5be567 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -1179,6 +1179,10 @@ static uint16_t nvme_smart_info(NvmeCtrl *n,
> uint8_t rae, uint32_t buf_len,
>          return NVME_INVALID_FIELD | NVME_DNR;
>      }
> 
> +    if (off >= sizeof(smart)) {
> +        return NVME_INVALID_FIELD | NVME_DNR;
> +    }
> +
>      for (int i = 1; i <= n->num_namespaces; i++) {
>          NvmeNamespace *ns = nvme_ns(n, i);
>          if (!ns) {
> @@ -1193,10 +1197,6 @@ static uint16_t nvme_smart_info(NvmeCtrl *n,
> uint8_t rae, uint32_t buf_len,
>          write_commands += s->nr_ops[BLOCK_ACCT_WRITE];
>      }
> 
> -    if (off > sizeof(smart)) {
> -        return NVME_INVALID_FIELD | NVME_DNR;
> -    }
> -
>      trans_len = MIN(sizeof(smart) - off, buf_len);
> 
>      memset(&smart, 0x0, sizeof(smart));
> @@ -1234,12 +1234,11 @@ static uint16_t nvme_fw_log_info(NvmeCtrl *n,
> uint32_t buf_len, uint64_t off,
>          .afi = 0x1,
>      };
> 
> -    strpadcpy((char *)&fw_log.frs1, sizeof(fw_log.frs1), "1.0", ' ');
> -
> -    if (off > sizeof(fw_log)) {
> +    if (off >= sizeof(fw_log)) {
>          return NVME_INVALID_FIELD | NVME_DNR;
>      }
> 
> +    strpadcpy((char *)&fw_log.frs1, sizeof(fw_log.frs1), "1.0", ' ');
>      trans_len = MIN(sizeof(fw_log) - off, buf_len);
> 
>      return nvme_dma(n, (uint8_t *) &fw_log + off, trans_len,
> @@ -1252,16 +1251,15 @@ static uint16_t nvme_error_info(NvmeCtrl *n,
> uint8_t rae, uint32_t buf_len,
>      uint32_t trans_len;
>      NvmeErrorLog errlog;
> 
> -    if (!rae) {
> -        nvme_clear_events(n, NVME_AER_TYPE_ERROR);
> +    if (off >= sizeof(errlog)) {
> +        return NVME_INVALID_FIELD | NVME_DNR;
>      }
> 
> -    if (off > sizeof(errlog)) {
> -        return NVME_INVALID_FIELD | NVME_DNR;
> +    if (!rae) {
> +        nvme_clear_events(n, NVME_AER_TYPE_ERROR);
>      }
> 
>      memset(&errlog, 0x0, sizeof(errlog));
> -
>      trans_len = MIN(sizeof(errlog) - off, buf_len);
> 
>      return nvme_dma(n, (uint8_t *)&errlog, trans_len,
> --
> 2.24.1



^ permalink raw reply	[flat|nested] 36+ messages in thread

* RE: [PATCH 8/9] hw/block/nvme: add trace event for requests with non-zero status code
  2020-09-30 22:04 ` [PATCH 8/9] hw/block/nvme: add trace event for requests with non-zero status code Keith Busch
@ 2020-09-30 23:21   ` Dmitry Fomichev
  2020-10-01 15:25   ` Philippe Mathieu-Daudé
  1 sibling, 0 replies; 36+ messages in thread
From: Dmitry Fomichev @ 2020-09-30 23:21 UTC (permalink / raw)
  To: Keith Busch, qemu-block, qemu-devel, Klaus Jensen
  Cc: Kevin Wolf, Niklas Cassel, Philippe Mathieu-Daudé

> -----Original Message-----
> From: Keith Busch <kbusch@kernel.org>
> Sent: Wednesday, September 30, 2020 6:04 PM
> To: qemu-block@nongnu.org; qemu-devel@nongnu.org; Klaus Jensen
> <k.jensen@samsung.com>
> Cc: Niklas Cassel <Niklas.Cassel@wdc.com>; Dmitry Fomichev
> <Dmitry.Fomichev@wdc.com>; Kevin Wolf <kwolf@redhat.com>; Philippe
> Mathieu-Daudé <philmd@redhat.com>; Keith Busch <kbusch@kernel.org>
> Subject: [PATCH 8/9] hw/block/nvme: add trace event for requests with
> non-zero status code
> 
> From: Klaus Jensen <k.jensen@samsung.com>
> 
> If a command results in a non-zero status code, trace it.
> 
> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> Signed-off-by: Keith Busch <kbusch@kernel.org>
> ---
>  hw/block/nvme.c       | 6 ++++++
>  hw/block/trace-events | 1 +
>  2 files changed, 7 insertions(+)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index dc971c9653..16804d0278 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -777,6 +777,12 @@ static void
> nvme_enqueue_req_completion(NvmeCQueue *cq, NvmeRequest *req)
>      assert(cq->cqid == req->sq->cqid);
>      trace_pci_nvme_enqueue_req_completion(nvme_cid(req), cq->cqid,
>                                            req->status);
> +
> +    if (req->status) {
> +        trace_pci_nvme_err_req_status(nvme_cid(req), nvme_nsid(req->ns),
> +                                      req->status, req->cmd.opcode);
> +    }
> +

Very useful.
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>

>      QTAILQ_REMOVE(&req->sq->out_req_list, req, entry);
>      QTAILQ_INSERT_TAIL(&cq->req_list, req, entry);
>      timer_mod(cq->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) +
> 500);
> diff --git a/hw/block/trace-events b/hw/block/trace-events
> index 180c43d258..ff3ca4bbf6 100644
> --- a/hw/block/trace-events
> +++ b/hw/block/trace-events
> @@ -89,6 +89,7 @@ pci_nvme_mmio_shutdown_cleared(void) "shutdown
> bit cleared"
> 
>  # nvme traces for error conditions
>  pci_nvme_err_mdts(uint16_t cid, size_t len) "cid %"PRIu16" len %zu"
> +pci_nvme_err_req_status(uint16_t cid, uint32_t nsid, uint16_t status,
> uint8_t opc) "cid %"PRIu16" nsid %"PRIu32" status 0x%"PRIx16" opc
> 0x%"PRIx8""
>  pci_nvme_err_addr_read(uint64_t addr) "addr 0x%"PRIx64""
>  pci_nvme_err_addr_write(uint64_t addr) "addr 0x%"PRIx64""
>  pci_nvme_err_cfs(void) "controller fatal status"
> --
> 2.24.1



^ permalink raw reply	[flat|nested] 36+ messages in thread

* RE: [PATCH 5/9] hw/block/nvme: support for admin-only command set
  2020-09-30 22:04 ` [PATCH 5/9] hw/block/nvme: support for admin-only command set Keith Busch
@ 2020-10-01  0:11   ` Dmitry Fomichev
  2020-10-01  4:17   ` Klaus Jensen
  1 sibling, 0 replies; 36+ messages in thread
From: Dmitry Fomichev @ 2020-10-01  0:11 UTC (permalink / raw)
  To: Keith Busch, qemu-block, qemu-devel, Klaus Jensen
  Cc: Kevin Wolf, Niklas Cassel, Philippe Mathieu-Daudé

> -----Original Message-----
> From: Keith Busch <kbusch@kernel.org>
> Sent: Wednesday, September 30, 2020 6:04 PM
> To: qemu-block@nongnu.org; qemu-devel@nongnu.org; Klaus Jensen
> <k.jensen@samsung.com>
> Cc: Niklas Cassel <Niklas.Cassel@wdc.com>; Dmitry Fomichev
> <Dmitry.Fomichev@wdc.com>; Kevin Wolf <kwolf@redhat.com>; Philippe
> Mathieu-Daudé <philmd@redhat.com>; Keith Busch <kbusch@kernel.org>
> Subject: [PATCH 5/9] hw/block/nvme: support for admin-only command set
> 
> Signed-off-by: Keith Busch <kbusch@kernel.org>
> ---
>  hw/block/nvme.c      | 1 +
>  include/block/nvme.h | 3 ++-
>  2 files changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index 6c582e6874..ec7363ea40 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -2755,6 +2755,7 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice
> *pci_dev)
>      NVME_CAP_SET_CQR(n->bar.cap, 1);
>      NVME_CAP_SET_TO(n->bar.cap, 0xf);
>      NVME_CAP_SET_CSS(n->bar.cap, NVME_CAP_CSS_NVM);
> +    NVME_CAP_SET_CSS(n->bar.cap, NVME_CAP_CSS_ADMIN_ONLY);

This could be

-     NVME_CAP_SET_CSS(n->bar.cap, NVME_CAP_CSS_NVM);
+    NVME_CAP_SET_CSS(n->bar.cap, (NVME_CAP_CSS_NVM  | NVME_CAP_CSS_ADMIN_ONLY));

Unfortunately, parentheses are needed above because NVME_CAP_SET_CSS macro and
other similar macros use "val" instead of (val). A possible cleanup topic...

>      NVME_CAP_SET_MPSMAX(n->bar.cap, 4);
> 
>      n->bar.vs = NVME_SPEC_VER;
> diff --git a/include/block/nvme.h b/include/block/nvme.h
> index bc20a2ba5e..521533fd2a 100644
> --- a/include/block/nvme.h
> +++ b/include/block/nvme.h
> @@ -83,7 +83,8 @@ enum NvmeCapMask {
>                                                              << CAP_PMR_SHIFT)
> 
>  enum NvmeCapCss {
> -    NVME_CAP_CSS_NVM = 1 << 0,
> +    NVME_CAP_CSS_NVM        = 1 << 0,
> +    NVME_CAP_CSS_ADMIN_ONLY = 1 << 7,
>  };
> 
>  enum NvmeCcShift {
> --
> 2.24.1



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 1/9] hw/block/nvme: remove pointless rw indirection
  2020-09-30 22:04 ` [PATCH 1/9] hw/block/nvme: remove pointless rw indirection Keith Busch
@ 2020-10-01  4:05   ` Klaus Jensen
  2020-10-01  8:48     ` Klaus Jensen
  2020-10-01 18:34     ` Klaus Jensen
  2020-10-06  1:49   ` Dmitry Fomichev
  1 sibling, 2 replies; 36+ messages in thread
From: Klaus Jensen @ 2020-10-01  4:05 UTC (permalink / raw)
  To: Keith Busch
  Cc: Kevin Wolf, Niklas Cassel, qemu-block, Klaus Jensen, qemu-devel,
	Philippe Mathieu-Daudé

[-- Attachment #1: Type: text/plain, Size: 5772 bytes --]

On Sep 30 15:04, Keith Busch wrote:
> The code switches on the opcode to invoke a function specific to that
> opcode. There's no point in consolidating back to a common function that
> just switches on that same opcode without any actual common code.
> Restore the opcode specific behavior without going back through another
> level of switches.
> 
> Signed-off-by: Keith Busch <kbusch@kernel.org>

Reviewed-by: Klaus Jensen <k.jensen@samsung.com>

Point taken. I could've sweared I had a better reason for this.

> ---
>  hw/block/nvme.c | 91 ++++++++++++++++---------------------------------
>  1 file changed, 29 insertions(+), 62 deletions(-)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index da8344f196..db52ea0db9 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -927,68 +927,12 @@ static void nvme_rw_cb(void *opaque, int ret)
>      nvme_enqueue_req_completion(nvme_cq(req), req);
>  }
>  
> -static uint16_t nvme_do_aio(BlockBackend *blk, int64_t offset, size_t len,
> -                            NvmeRequest *req)
> -{
> -    BlockAcctCookie *acct = &req->acct;
> -    BlockAcctStats *stats = blk_get_stats(blk);
> -
> -    bool is_write = false;
> -
> -    trace_pci_nvme_do_aio(nvme_cid(req), req->cmd.opcode,
> -                          nvme_io_opc_str(req->cmd.opcode), blk_name(blk),
> -                          offset, len);
> -
> -    switch (req->cmd.opcode) {
> -    case NVME_CMD_FLUSH:
> -        block_acct_start(stats, acct, 0, BLOCK_ACCT_FLUSH);
> -        req->aiocb = blk_aio_flush(blk, nvme_rw_cb, req);
> -        break;
> -
> -    case NVME_CMD_WRITE_ZEROES:
> -        block_acct_start(stats, acct, len, BLOCK_ACCT_WRITE);
> -        req->aiocb = blk_aio_pwrite_zeroes(blk, offset, len,
> -                                           BDRV_REQ_MAY_UNMAP, nvme_rw_cb,
> -                                           req);
> -        break;
> -
> -    case NVME_CMD_WRITE:
> -        is_write = true;
> -
> -        /* fallthrough */
> -
> -    case NVME_CMD_READ:
> -        block_acct_start(stats, acct, len,
> -                         is_write ? BLOCK_ACCT_WRITE : BLOCK_ACCT_READ);
> -
> -        if (req->qsg.sg) {
> -            if (is_write) {
> -                req->aiocb = dma_blk_write(blk, &req->qsg, offset,
> -                                           BDRV_SECTOR_SIZE, nvme_rw_cb, req);
> -            } else {
> -                req->aiocb = dma_blk_read(blk, &req->qsg, offset,
> -                                          BDRV_SECTOR_SIZE, nvme_rw_cb, req);
> -            }
> -        } else {
> -            if (is_write) {
> -                req->aiocb = blk_aio_pwritev(blk, offset, &req->iov, 0,
> -                                             nvme_rw_cb, req);
> -            } else {
> -                req->aiocb = blk_aio_preadv(blk, offset, &req->iov, 0,
> -                                            nvme_rw_cb, req);
> -            }
> -        }
> -
> -        break;
> -    }
> -
> -    return NVME_NO_COMPLETE;
> -}
> -
>  static uint16_t nvme_flush(NvmeCtrl *n, NvmeRequest *req)
>  {
> -    NvmeNamespace *ns = req->ns;
> -    return nvme_do_aio(ns->blkconf.blk, 0, 0, req);
> +    block_acct_start(blk_get_stats(n->conf.blk), &req->acct, 0,
> +                     BLOCK_ACCT_FLUSH);
> +    req->aiocb = blk_aio_flush(n->conf.blk, nvme_rw_cb, req);
> +    return NVME_NO_COMPLETE;
>  }
>  
>  static uint16_t nvme_write_zeroes(NvmeCtrl *n, NvmeRequest *req)
> @@ -1009,7 +953,11 @@ static uint16_t nvme_write_zeroes(NvmeCtrl *n, NvmeRequest *req)
>          return status;
>      }
>  
> -    return nvme_do_aio(ns->blkconf.blk, offset, count, req);
> +    block_acct_start(blk_get_stats(n->conf.blk), &req->acct, 0,
> +                     BLOCK_ACCT_WRITE);
> +    req->aiocb = blk_aio_pwrite_zeroes(n->conf.blk, offset, count,
> +                                       BDRV_REQ_MAY_UNMAP, nvme_rw_cb, req);
> +    return NVME_NO_COMPLETE;
>  }
>  
>  static uint16_t nvme_rw(NvmeCtrl *n, NvmeRequest *req)
> @@ -1023,6 +971,7 @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeRequest *req)
>      uint64_t data_offset = nvme_l2b(ns, slba);
>      enum BlockAcctType acct = req->cmd.opcode == NVME_CMD_WRITE ?
>          BLOCK_ACCT_WRITE : BLOCK_ACCT_READ;
> +    BlockBackend *blk = ns->blkconf.blk;
>      uint16_t status;
>  
>      trace_pci_nvme_rw(nvme_cid(req), nvme_io_opc_str(rw->opcode),
> @@ -1045,7 +994,25 @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeRequest *req)
>          goto invalid;
>      }
>  
> -    return nvme_do_aio(ns->blkconf.blk, data_offset, data_size, req);
> +    block_acct_start(blk_get_stats(blk), &req->acct, data_size, acct);
> +    if (req->qsg.sg) {
> +        if (acct == BLOCK_ACCT_WRITE) {
> +            req->aiocb = dma_blk_write(blk, &req->qsg, data_offset,
> +                                       BDRV_SECTOR_SIZE, nvme_rw_cb, req);
> +        } else {
> +            req->aiocb = dma_blk_read(blk, &req->qsg, data_offset,
> +                                      BDRV_SECTOR_SIZE, nvme_rw_cb, req);
> +        }
> +    } else {
> +        if (acct == BLOCK_ACCT_WRITE) {
> +            req->aiocb = blk_aio_pwritev(blk, data_offset, &req->iov, 0,
> +                                         nvme_rw_cb, req);
> +        } else {
> +            req->aiocb = blk_aio_preadv(blk, data_offset, &req->iov, 0,
> +                                        nvme_rw_cb, req);
> +        }
> +    }
> +    return NVME_NO_COMPLETE;
>  
>  invalid:
>      block_acct_invalid(blk_get_stats(ns->blkconf.blk), acct);
> -- 
> 2.24.1
> 
> 

-- 
One of us - No more doubt, silence or taboo about mental illness.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/9] hw/block/nvme: fix log page offset check
  2020-09-30 22:04 ` [PATCH 2/9] hw/block/nvme: fix log page offset check Keith Busch
  2020-09-30 23:18   ` Dmitry Fomichev
@ 2020-10-01  4:05   ` Klaus Jensen
  2020-10-01 10:11   ` Philippe Mathieu-Daudé
  2 siblings, 0 replies; 36+ messages in thread
From: Klaus Jensen @ 2020-10-01  4:05 UTC (permalink / raw)
  To: Keith Busch
  Cc: Kevin Wolf, Niklas Cassel, qemu-block, Klaus Jensen, qemu-devel,
	Philippe Mathieu-Daudé

[-- Attachment #1: Type: text/plain, Size: 2708 bytes --]

On Sep 30 15:04, Keith Busch wrote:
> Return error if the requested offset starts after the size of the log
> being returned. Also, move the check for earlier in the function so
> we're not doing unnecessary calculations.
> 
> Signed-off-by: Keith Busch <kbusch@kernel.org>

Reviewed-by: Klaus Jensen <k.jensen@samsung.com>

> ---
>  hw/block/nvme.c | 22 ++++++++++------------
>  1 file changed, 10 insertions(+), 12 deletions(-)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index db52ea0db9..8d2b5be567 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -1179,6 +1179,10 @@ static uint16_t nvme_smart_info(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
>          return NVME_INVALID_FIELD | NVME_DNR;
>      }
>  
> +    if (off >= sizeof(smart)) {
> +        return NVME_INVALID_FIELD | NVME_DNR;
> +    }
> +
>      for (int i = 1; i <= n->num_namespaces; i++) {
>          NvmeNamespace *ns = nvme_ns(n, i);
>          if (!ns) {
> @@ -1193,10 +1197,6 @@ static uint16_t nvme_smart_info(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
>          write_commands += s->nr_ops[BLOCK_ACCT_WRITE];
>      }
>  
> -    if (off > sizeof(smart)) {
> -        return NVME_INVALID_FIELD | NVME_DNR;
> -    }
> -
>      trans_len = MIN(sizeof(smart) - off, buf_len);
>  
>      memset(&smart, 0x0, sizeof(smart));
> @@ -1234,12 +1234,11 @@ static uint16_t nvme_fw_log_info(NvmeCtrl *n, uint32_t buf_len, uint64_t off,
>          .afi = 0x1,
>      };
>  
> -    strpadcpy((char *)&fw_log.frs1, sizeof(fw_log.frs1), "1.0", ' ');
> -
> -    if (off > sizeof(fw_log)) {
> +    if (off >= sizeof(fw_log)) {
>          return NVME_INVALID_FIELD | NVME_DNR;
>      }
>  
> +    strpadcpy((char *)&fw_log.frs1, sizeof(fw_log.frs1), "1.0", ' ');
>      trans_len = MIN(sizeof(fw_log) - off, buf_len);
>  
>      return nvme_dma(n, (uint8_t *) &fw_log + off, trans_len,
> @@ -1252,16 +1251,15 @@ static uint16_t nvme_error_info(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
>      uint32_t trans_len;
>      NvmeErrorLog errlog;
>  
> -    if (!rae) {
> -        nvme_clear_events(n, NVME_AER_TYPE_ERROR);
> +    if (off >= sizeof(errlog)) {
> +        return NVME_INVALID_FIELD | NVME_DNR;
>      }
>  
> -    if (off > sizeof(errlog)) {
> -        return NVME_INVALID_FIELD | NVME_DNR;
> +    if (!rae) {
> +        nvme_clear_events(n, NVME_AER_TYPE_ERROR);
>      }
>  
>      memset(&errlog, 0x0, sizeof(errlog));
> -
>      trans_len = MIN(sizeof(errlog) - off, buf_len);
>  
>      return nvme_dma(n, (uint8_t *)&errlog, trans_len,
> -- 
> 2.24.1
> 
> 

-- 
One of us - No more doubt, silence or taboo about mental illness.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 3/9] hw/block/nvme: support per-namespace smart log
  2020-09-30 22:04 ` [PATCH 3/9] hw/block/nvme: support per-namespace smart log Keith Busch
@ 2020-10-01  4:10   ` Klaus Jensen
  2020-10-01 15:20     ` Keith Busch
  2020-10-02  8:48   ` Klaus Jensen
  2020-10-06  1:57   ` Dmitry Fomichev
  2 siblings, 1 reply; 36+ messages in thread
From: Klaus Jensen @ 2020-10-01  4:10 UTC (permalink / raw)
  To: Keith Busch
  Cc: Kevin Wolf, Niklas Cassel, qemu-block, Klaus Jensen, qemu-devel,
	Philippe Mathieu-Daudé

[-- Attachment #1: Type: text/plain, Size: 5079 bytes --]

On Sep 30 15:04, Keith Busch wrote:
> Let the user specify a specific namespace if they want to get access
> stats for a specific namespace.
> 

I don't think this makes sense for v1.3+.

NVM Express v1.3d, Section 5.14.1.2: "There is no namespace specific
information defined in the SMART / Health log page in this revision of
the specification.  therefore the controller log page and namespace
specific log page contain identical information".

I have no idea why the TWG decided this, but that's the way it is ;)

> Signed-off-by: Keith Busch <kbusch@kernel.org>
> ---
>  hw/block/nvme.c      | 66 +++++++++++++++++++++++++++-----------------
>  include/block/nvme.h |  1 +
>  2 files changed, 41 insertions(+), 26 deletions(-)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index 8d2b5be567..41389b2b09 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -1164,48 +1164,62 @@ static uint16_t nvme_create_sq(NvmeCtrl *n, NvmeRequest *req)
>      return NVME_SUCCESS;
>  }
>  
> +struct nvme_stats {
> +    uint64_t units_read;
> +    uint64_t units_written;
> +    uint64_t read_commands;
> +    uint64_t write_commands;
> +};
> +
> +static void nvme_set_blk_stats(NvmeNamespace *ns, struct nvme_stats *stats)
> +{
> +    BlockAcctStats *s = blk_get_stats(ns->blkconf.blk);
> +
> +    stats->units_read += s->nr_bytes[BLOCK_ACCT_READ] >> BDRV_SECTOR_BITS;
> +    stats->units_written += s->nr_bytes[BLOCK_ACCT_WRITE] >> BDRV_SECTOR_BITS;
> +    stats->read_commands += s->nr_ops[BLOCK_ACCT_READ];
> +    stats->write_commands += s->nr_ops[BLOCK_ACCT_WRITE];
> +}
> +
>  static uint16_t nvme_smart_info(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
>                                  uint64_t off, NvmeRequest *req)
>  {
>      uint32_t nsid = le32_to_cpu(req->cmd.nsid);
> -
> +    struct nvme_stats stats = { 0 };
> +    NvmeSmartLog smart = { 0 };
>      uint32_t trans_len;
> +    NvmeNamespace *ns;
>      time_t current_ms;
> -    uint64_t units_read = 0, units_written = 0;
> -    uint64_t read_commands = 0, write_commands = 0;
> -    NvmeSmartLog smart;
> -
> -    if (nsid && nsid != 0xffffffff) {
> -        return NVME_INVALID_FIELD | NVME_DNR;
> -    }
>  
>      if (off >= sizeof(smart)) {
>          return NVME_INVALID_FIELD | NVME_DNR;
>      }
>  
> -    for (int i = 1; i <= n->num_namespaces; i++) {
> -        NvmeNamespace *ns = nvme_ns(n, i);
> -        if (!ns) {
> -            continue;
> -        }
> -
> -        BlockAcctStats *s = blk_get_stats(ns->blkconf.blk);
> +    if (nsid != 0xffffffff) {
> +        ns = nvme_ns(n, nsid);
> +        if (!ns)
> +            return NVME_INVALID_NSID | NVME_DNR;
> +        nvme_set_blk_stats(ns, &stats);
> +    } else {
> +        int i;
>  
> -        units_read += s->nr_bytes[BLOCK_ACCT_READ] >> BDRV_SECTOR_BITS;
> -        units_written += s->nr_bytes[BLOCK_ACCT_WRITE] >> BDRV_SECTOR_BITS;
> -        read_commands += s->nr_ops[BLOCK_ACCT_READ];
> -        write_commands += s->nr_ops[BLOCK_ACCT_WRITE];
> +        for (i = 1; i <= n->num_namespaces; i++) {
> +            ns = nvme_ns(n, i);
> +            if (!ns) {
> +                continue;
> +            }
> +            nvme_set_blk_stats(ns, &stats);
> +        }
>      }
>  
>      trans_len = MIN(sizeof(smart) - off, buf_len);
>  
> -    memset(&smart, 0x0, sizeof(smart));
> -
> -    smart.data_units_read[0] = cpu_to_le64(DIV_ROUND_UP(units_read, 1000));
> -    smart.data_units_written[0] = cpu_to_le64(DIV_ROUND_UP(units_written,
> +    smart.data_units_read[0] = cpu_to_le64(DIV_ROUND_UP(stats.units_read,
> +                                                        1000));
> +    smart.data_units_written[0] = cpu_to_le64(DIV_ROUND_UP(stats.units_written,
>                                                             1000));
> -    smart.host_read_commands[0] = cpu_to_le64(read_commands);
> -    smart.host_write_commands[0] = cpu_to_le64(write_commands);
> +    smart.host_read_commands[0] = cpu_to_le64(stats.read_commands);
> +    smart.host_write_commands[0] = cpu_to_le64(stats.write_commands);
>  
>      smart.temperature = cpu_to_le16(n->temperature);
>  
> @@ -2708,7 +2722,7 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
>      id->acl = 3;
>      id->aerl = n->params.aerl;
>      id->frmw = (NVME_NUM_FW_SLOTS << 1) | NVME_FRMW_SLOT1_RO;
> -    id->lpa = NVME_LPA_EXTENDED;
> +    id->lpa = NVME_LPA_NS_SMART | NVME_LPA_EXTENDED;
>  
>      /* recommended default value (~70 C) */
>      id->wctemp = cpu_to_le16(NVME_TEMPERATURE_WARNING);
> diff --git a/include/block/nvme.h b/include/block/nvme.h
> index 58647bcdad..868cf53f0b 100644
> --- a/include/block/nvme.h
> +++ b/include/block/nvme.h
> @@ -849,6 +849,7 @@ enum NvmeIdCtrlFrmw {
>  };
>  
>  enum NvmeIdCtrlLpa {
> +    NVME_LPA_NS_SMART = 1 << 0,
>      NVME_LPA_EXTENDED = 1 << 2,
>  };
>  
> -- 
> 2.24.1
> 
> 

-- 
One of us - No more doubt, silence or taboo about mental illness.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 4/9] hw/block/nvme: validate command set selected
  2020-09-30 22:04 ` [PATCH 4/9] hw/block/nvme: validate command set selected Keith Busch
@ 2020-10-01  4:14   ` Klaus Jensen
  0 siblings, 0 replies; 36+ messages in thread
From: Klaus Jensen @ 2020-10-01  4:14 UTC (permalink / raw)
  To: Keith Busch
  Cc: Kevin Wolf, Niklas Cassel, qemu-block, Klaus Jensen, qemu-devel,
	Philippe Mathieu-Daudé

[-- Attachment #1: Type: text/plain, Size: 3304 bytes --]

On Sep 30 15:04, Keith Busch wrote:
> Fail to start the controller if the user requests a command set that the
> controller does not support.
> 
> Signed-off-by: Keith Busch <kbusch@kernel.org>

Reviewed-by: Klaus Jensen <k.jensen@samsung.com>

> ---
>  hw/block/nvme.c       | 6 +++++-
>  hw/block/trace-events | 1 +
>  include/block/nvme.h  | 4 ++++
>  3 files changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index 41389b2b09..6c582e6874 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -2049,6 +2049,10 @@ static int nvme_start_ctrl(NvmeCtrl *n)
>          trace_pci_nvme_err_startfail_acq_misaligned(n->bar.acq);
>          return -1;
>      }
> +    if (unlikely(!(NVME_CAP_CSS(n->bar.cap) & (1 << NVME_CC_CSS(n->bar.cc))))) {
> +        trace_pci_nvme_err_startfail_css(NVME_CC_CSS(n->bar.cc));
> +        return -1;
> +    }
>      if (unlikely(NVME_CC_MPS(n->bar.cc) <
>                   NVME_CAP_MPSMIN(n->bar.cap))) {
>          trace_pci_nvme_err_startfail_page_too_small(
> @@ -2750,7 +2754,7 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
>      NVME_CAP_SET_MQES(n->bar.cap, 0x7ff);
>      NVME_CAP_SET_CQR(n->bar.cap, 1);
>      NVME_CAP_SET_TO(n->bar.cap, 0xf);
> -    NVME_CAP_SET_CSS(n->bar.cap, 1);
> +    NVME_CAP_SET_CSS(n->bar.cap, NVME_CAP_CSS_NVM);
>      NVME_CAP_SET_MPSMAX(n->bar.cap, 4);
>  
>      n->bar.vs = NVME_SPEC_VER;
> diff --git a/hw/block/trace-events b/hw/block/trace-events
> index 446cca08e9..7720e1b4d9 100644
> --- a/hw/block/trace-events
> +++ b/hw/block/trace-events
> @@ -133,6 +133,7 @@ pci_nvme_err_startfail_cqent_too_small(uint8_t log2ps, uint8_t maxlog2ps) "nvme_
>  pci_nvme_err_startfail_cqent_too_large(uint8_t log2ps, uint8_t maxlog2ps) "nvme_start_ctrl failed because the completion queue entry size is too large: log2size=%u, max=%u"
>  pci_nvme_err_startfail_sqent_too_small(uint8_t log2ps, uint8_t maxlog2ps) "nvme_start_ctrl failed because the submission queue entry size is too small: log2size=%u, min=%u"
>  pci_nvme_err_startfail_sqent_too_large(uint8_t log2ps, uint8_t maxlog2ps) "nvme_start_ctrl failed because the submission queue entry size is too large: log2size=%u, max=%u"
> +pci_nvme_err_startfail_css(uint8_t css) "nvme_start_ctrl failed because invalid command set selected:%u"
>  pci_nvme_err_startfail_asqent_sz_zero(void) "nvme_start_ctrl failed because the admin submission queue size is zero"
>  pci_nvme_err_startfail_acqent_sz_zero(void) "nvme_start_ctrl failed because the admin completion queue size is zero"
>  pci_nvme_err_startfail(void) "setting controller enable bit failed"
> diff --git a/include/block/nvme.h b/include/block/nvme.h
> index 868cf53f0b..bc20a2ba5e 100644
> --- a/include/block/nvme.h
> +++ b/include/block/nvme.h
> @@ -82,6 +82,10 @@ enum NvmeCapMask {
>  #define NVME_CAP_SET_PMRS(cap, val) (cap |= (uint64_t)(val & CAP_PMR_MASK)\
>                                                              << CAP_PMR_SHIFT)
>  
> +enum NvmeCapCss {
> +    NVME_CAP_CSS_NVM = 1 << 0,
> +};
> +
>  enum NvmeCcShift {
>      CC_EN_SHIFT     = 0,
>      CC_CSS_SHIFT    = 4,
> -- 
> 2.24.1
> 
> 

-- 
One of us - No more doubt, silence or taboo about mental illness.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 5/9] hw/block/nvme: support for admin-only command set
  2020-09-30 22:04 ` [PATCH 5/9] hw/block/nvme: support for admin-only command set Keith Busch
  2020-10-01  0:11   ` Dmitry Fomichev
@ 2020-10-01  4:17   ` Klaus Jensen
  1 sibling, 0 replies; 36+ messages in thread
From: Klaus Jensen @ 2020-10-01  4:17 UTC (permalink / raw)
  To: Keith Busch
  Cc: Kevin Wolf, Niklas Cassel, qemu-block, Klaus Jensen, qemu-devel,
	Philippe Mathieu-Daudé

[-- Attachment #1: Type: text/plain, Size: 1339 bytes --]

On Sep 30 15:04, Keith Busch wrote:
> Signed-off-by: Keith Busch <kbusch@kernel.org>
> ---

Reviewed-by: Klaus Jensen <k.jensen@samsung.com>

>  hw/block/nvme.c      | 1 +
>  include/block/nvme.h | 3 ++-
>  2 files changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index 6c582e6874..ec7363ea40 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -2755,6 +2755,7 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
>      NVME_CAP_SET_CQR(n->bar.cap, 1);
>      NVME_CAP_SET_TO(n->bar.cap, 0xf);
>      NVME_CAP_SET_CSS(n->bar.cap, NVME_CAP_CSS_NVM);
> +    NVME_CAP_SET_CSS(n->bar.cap, NVME_CAP_CSS_ADMIN_ONLY);
>      NVME_CAP_SET_MPSMAX(n->bar.cap, 4);
>  
>      n->bar.vs = NVME_SPEC_VER;
> diff --git a/include/block/nvme.h b/include/block/nvme.h
> index bc20a2ba5e..521533fd2a 100644
> --- a/include/block/nvme.h
> +++ b/include/block/nvme.h
> @@ -83,7 +83,8 @@ enum NvmeCapMask {
>                                                              << CAP_PMR_SHIFT)
>  
>  enum NvmeCapCss {
> -    NVME_CAP_CSS_NVM = 1 << 0,
> +    NVME_CAP_CSS_NVM        = 1 << 0,
> +    NVME_CAP_CSS_ADMIN_ONLY = 1 << 7,
>  };
>  
>  enum NvmeCcShift {
> -- 
> 2.24.1
> 
> 

-- 
One of us - No more doubt, silence or taboo about mental illness.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 1/9] hw/block/nvme: remove pointless rw indirection
  2020-10-01  4:05   ` Klaus Jensen
@ 2020-10-01  8:48     ` Klaus Jensen
  2020-10-01 15:24       ` Keith Busch
  2020-10-01 18:34     ` Klaus Jensen
  1 sibling, 1 reply; 36+ messages in thread
From: Klaus Jensen @ 2020-10-01  8:48 UTC (permalink / raw)
  To: Keith Busch
  Cc: Kevin Wolf, Niklas Cassel, qemu-block, Klaus Jensen, qemu-devel,
	Philippe Mathieu-Daudé

[-- Attachment #1: Type: text/plain, Size: 643 bytes --]

On Oct  1 06:05, Klaus Jensen wrote:
> On Sep 30 15:04, Keith Busch wrote:
> > The code switches on the opcode to invoke a function specific to that
> > opcode. There's no point in consolidating back to a common function that
> > just switches on that same opcode without any actual common code.
> > Restore the opcode specific behavior without going back through another
> > level of switches.
> > 
> > Signed-off-by: Keith Busch <kbusch@kernel.org>
> 
> Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
> 
> Point taken. I could've sweared I had a better reason for this.
> 

Can you also remove the nvme_do_aio trace event?

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 9/9] hw/block/nvme: report actual LBA data shift in LBAF
  2020-09-30 22:04 ` [PATCH 9/9] hw/block/nvme: report actual LBA data shift in LBAF Keith Busch
@ 2020-10-01  9:48   ` Klaus Jensen
  0 siblings, 0 replies; 36+ messages in thread
From: Klaus Jensen @ 2020-10-01  9:48 UTC (permalink / raw)
  To: Keith Busch
  Cc: Kevin Wolf, Niklas Cassel, qemu-block, Klaus Jensen, qemu-devel,
	Philippe Mathieu-Daudé

[-- Attachment #1: Type: text/plain, Size: 1589 bytes --]

On Sep 30 15:04, Keith Busch wrote:
> From: Dmitry Fomichev <dmitry.fomichev@wdc.com>
> 
> Calculate the data shift value to report based on the set value of
> logical_block_size device property.
> 
> In the process, use a local variable to calculate the LBA format
> index instead of the hardcoded value 0. This makes the code more
> readable and it will make it easier to add support for multiple LBA
> formats in the future.
> 
> Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
> Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
> Signed-off-by: Keith Busch <kbusch@kernel.org>
> ---
>  hw/block/nvme-ns.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c
> index 2ba0263dda..a85e5fdb42 100644
> --- a/hw/block/nvme-ns.c
> +++ b/hw/block/nvme-ns.c
> @@ -47,6 +47,8 @@ static void nvme_ns_init(NvmeNamespace *ns)
>  
>  static int nvme_ns_init_blk(NvmeCtrl *n, NvmeNamespace *ns, Error **errp)
>  {
> +    int lba_index;
> +
>      if (!blkconf_blocksizes(&ns->blkconf, errp)) {
>          return -1;
>      }
> @@ -67,6 +69,9 @@ static int nvme_ns_init_blk(NvmeCtrl *n, NvmeNamespace *ns, Error **errp)
>          n->features.vwc = 0x1;
>      }
>  
> +    lba_index = NVME_ID_NS_FLBAS_INDEX(ns->id_ns.flbas);
> +    ns->id_ns.lbaf[lba_index].ds = 31 - clz32(ns->blkconf.logical_block_size);
> +
>      return 0;
>  }
>  

I think that back when I reviewed this, it was before the
multi-namespace patch.

I believe this setup should move to the nvme_ns_init() function instead.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/9] hw/block/nvme: fix log page offset check
  2020-09-30 22:04 ` [PATCH 2/9] hw/block/nvme: fix log page offset check Keith Busch
  2020-09-30 23:18   ` Dmitry Fomichev
  2020-10-01  4:05   ` Klaus Jensen
@ 2020-10-01 10:11   ` Philippe Mathieu-Daudé
  2 siblings, 0 replies; 36+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-10-01 10:11 UTC (permalink / raw)
  To: Keith Busch, qemu-block, qemu-devel, Klaus Jensen
  Cc: Dmitry Fomichev, Niklas Cassel, Kevin Wolf

On 10/1/20 12:04 AM, Keith Busch wrote:
> Return error if the requested offset starts after the size of the log
> being returned. Also, move the check for earlier in the function so
> we're not doing unnecessary calculations.
> 
> Signed-off-by: Keith Busch <kbusch@kernel.org>
> ---
>  hw/block/nvme.c | 22 ++++++++++------------
>  1 file changed, 10 insertions(+), 12 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 3/9] hw/block/nvme: support per-namespace smart log
  2020-10-01  4:10   ` Klaus Jensen
@ 2020-10-01 15:20     ` Keith Busch
  2020-10-01 17:18       ` Klaus Jensen
  0 siblings, 1 reply; 36+ messages in thread
From: Keith Busch @ 2020-10-01 15:20 UTC (permalink / raw)
  To: Klaus Jensen
  Cc: Kevin Wolf, Niklas Cassel, qemu-block, Klaus Jensen, qemu-devel,
	Philippe Mathieu-Daudé

On Thu, Oct 01, 2020 at 06:10:57AM +0200, Klaus Jensen wrote:
> On Sep 30 15:04, Keith Busch wrote:
> > Let the user specify a specific namespace if they want to get access
> > stats for a specific namespace.
> > 
> 
> I don't think this makes sense for v1.3+.
> 
> NVM Express v1.3d, Section 5.14.1.2: "There is no namespace specific
> information defined in the SMART / Health log page in this revision of
> the specification.  therefore the controller log page and namespace
> specific log page contain identical information".
> 
> I have no idea why the TWG decided this, but that's the way it is ;)

I don't think they did that. The behavior you're referring to is specific to
controllers that operate a particular way: "If the log page is not supported on
a per namespace basis ...". They were trying to provide a spec compliant way
for controllers to return a success status if you supplied a valid NSID when
the controller doesn't support per-namespace smart data..

The previous paragraph is more clear on this: "The controller may also support
requesting the log page on a per namespace basis, as indicated by bit 0 of the
LPA field in the Identify Controller data structure".


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 1/9] hw/block/nvme: remove pointless rw indirection
  2020-10-01  8:48     ` Klaus Jensen
@ 2020-10-01 15:24       ` Keith Busch
  0 siblings, 0 replies; 36+ messages in thread
From: Keith Busch @ 2020-10-01 15:24 UTC (permalink / raw)
  To: Klaus Jensen
  Cc: Kevin Wolf, Niklas Cassel, qemu-block, Klaus Jensen, qemu-devel,
	Philippe Mathieu-Daudé

On Thu, Oct 01, 2020 at 10:48:03AM +0200, Klaus Jensen wrote:
> On Oct  1 06:05, Klaus Jensen wrote:
> > On Sep 30 15:04, Keith Busch wrote:
> > > The code switches on the opcode to invoke a function specific to that
> > > opcode. There's no point in consolidating back to a common function that
> > > just switches on that same opcode without any actual common code.
> > > Restore the opcode specific behavior without going back through another
> > > level of switches.
> > > 
> > > Signed-off-by: Keith Busch <kbusch@kernel.org>
> > 
> > Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
> > 
> > Point taken. I could've sweared I had a better reason for this.
> > 
> 
> Can you also remove the nvme_do_aio trace event?

Ah, thanks for pointing that out. I'll fix up the trace.

I think you may have a case where this might make sense still in the making? If
so we can reevaluate when it's ready.


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 8/9] hw/block/nvme: add trace event for requests with non-zero status code
  2020-09-30 22:04 ` [PATCH 8/9] hw/block/nvme: add trace event for requests with non-zero status code Keith Busch
  2020-09-30 23:21   ` Dmitry Fomichev
@ 2020-10-01 15:25   ` Philippe Mathieu-Daudé
  1 sibling, 0 replies; 36+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-10-01 15:25 UTC (permalink / raw)
  To: Keith Busch, qemu-block, qemu-devel, Klaus Jensen
  Cc: Dmitry Fomichev, Niklas Cassel, Kevin Wolf

On 10/1/20 12:04 AM, Keith Busch wrote:
> From: Klaus Jensen <k.jensen@samsung.com>
> 
> If a command results in a non-zero status code, trace it.
> 
> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> Signed-off-by: Keith Busch <kbusch@kernel.org>
> ---
>  hw/block/nvme.c       | 6 ++++++
>  hw/block/trace-events | 1 +
>  2 files changed, 7 insertions(+)

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 3/9] hw/block/nvme: support per-namespace smart log
  2020-10-01 15:20     ` Keith Busch
@ 2020-10-01 17:18       ` Klaus Jensen
  2020-10-01 17:30         ` Keith Busch
  0 siblings, 1 reply; 36+ messages in thread
From: Klaus Jensen @ 2020-10-01 17:18 UTC (permalink / raw)
  To: Keith Busch
  Cc: Kevin Wolf, Niklas Cassel, qemu-block, Klaus Jensen, qemu-devel,
	Philippe Mathieu-Daudé

[-- Attachment #1: Type: text/plain, Size: 1922 bytes --]

On Oct  1 09:20, Keith Busch wrote:
> On Thu, Oct 01, 2020 at 06:10:57AM +0200, Klaus Jensen wrote:
> > On Sep 30 15:04, Keith Busch wrote:
> > > Let the user specify a specific namespace if they want to get access
> > > stats for a specific namespace.
> > > 
> > 
> > I don't think this makes sense for v1.3+.
> > 
> > NVM Express v1.3d, Section 5.14.1.2: "There is no namespace specific
> > information defined in the SMART / Health log page in this revision of
> > the specification.  therefore the controller log page and namespace
> > specific log page contain identical information".
> > 
> > I have no idea why the TWG decided this, but that's the way it is ;)
> 
> I don't think they did that. The behavior you're referring to is specific to
> controllers that operate a particular way: "If the log page is not supported on
> a per namespace basis ...". They were trying to provide a spec compliant way
> for controllers to return a success status if you supplied a valid NSID when
> the controller doesn't support per-namespace smart data..
> 
> The previous paragraph is more clear on this: "The controller may also support
> requesting the log page on a per namespace basis, as indicated by bit 0 of the
> LPA field in the Identify Controller data structure".

OK, so I agree that it makes sense for it to be supported on a per
namespace basis, but I think the spec is just keeping the door open for
future namespace specific stuff in the log page - currently there is
none.

Figure 94 (the actual SMART log page) says that the Data Units
Read/Written are controller wide, so there really is no namespace
specific information. Maybe this could be in the context of shared
namespaces? How would a controller know how much data has been
read/written from it without asking the other controllers? What if a
controller is detached from the namespace - you'd lose those numbers.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 3/9] hw/block/nvme: support per-namespace smart log
  2020-10-01 17:18       ` Klaus Jensen
@ 2020-10-01 17:30         ` Keith Busch
  2020-10-01 17:34           ` Klaus Jensen
  0 siblings, 1 reply; 36+ messages in thread
From: Keith Busch @ 2020-10-01 17:30 UTC (permalink / raw)
  To: Klaus Jensen
  Cc: Kevin Wolf, Niklas Cassel, qemu-block, Klaus Jensen, qemu-devel,
	Philippe Mathieu-Daudé

On Thu, Oct 01, 2020 at 07:18:37PM +0200, Klaus Jensen wrote:
> OK, so I agree that it makes sense for it to be supported on a per
> namespace basis, but I think the spec is just keeping the door open for
> future namespace specific stuff in the log page - currently there is
> none.
> 
> Figure 94 (the actual SMART log page) says that the Data Units
> Read/Written are controller wide, so there really is no namespace
> specific information. Maybe this could be in the context of shared
> namespaces? How would a controller know how much data has been
> read/written from it without asking the other controllers? What if a
> controller is detached from the namespace - you'd lose those numbers.

That text is wrong. There is no "controller" scope to the smart log.
Figure 191 says the smart scope is to the subsystem or the namespace. It
doesn't matter which controller performed an IO to a particular
namespace; the log needs to report the same information regardless of
which controller you query. How that is coordinated within the subsystem
is a detail not defined by spec.

Not that that particular detail matters here, as we don't support
multi-controller subsystems (yet!). But the smart log text has missed an
update to reflect this, so it looks like trivial ECN material to me.


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 3/9] hw/block/nvme: support per-namespace smart log
  2020-10-01 17:30         ` Keith Busch
@ 2020-10-01 17:34           ` Klaus Jensen
  0 siblings, 0 replies; 36+ messages in thread
From: Klaus Jensen @ 2020-10-01 17:34 UTC (permalink / raw)
  To: Keith Busch
  Cc: Kevin Wolf, Niklas Cassel, qemu-block, Klaus Jensen, qemu-devel,
	Philippe Mathieu-Daudé

[-- Attachment #1: Type: text/plain, Size: 1386 bytes --]

On Oct  1 10:30, Keith Busch wrote:
> On Thu, Oct 01, 2020 at 07:18:37PM +0200, Klaus Jensen wrote:
> > OK, so I agree that it makes sense for it to be supported on a per
> > namespace basis, but I think the spec is just keeping the door open for
> > future namespace specific stuff in the log page - currently there is
> > none.
> > 
> > Figure 94 (the actual SMART log page) says that the Data Units
> > Read/Written are controller wide, so there really is no namespace
> > specific information. Maybe this could be in the context of shared
> > namespaces? How would a controller know how much data has been
> > read/written from it without asking the other controllers? What if a
> > controller is detached from the namespace - you'd lose those numbers.
> 
> That text is wrong. There is no "controller" scope to the smart log.
> Figure 191 says the smart scope is to the subsystem or the namespace. It
> doesn't matter which controller performed an IO to a particular
> namespace; the log needs to report the same information regardless of
> which controller you query. How that is coordinated within the subsystem
> is a detail not defined by spec.
> 

Oh! Thats new in v1.4. So they updated that, but forgot figure 194. In
v1.3 it is controller and namespace scope.

Anyway, fair enough. In that case,

Reviewed-by: Klaus Jensen <k.jensen@samsung.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 1/9] hw/block/nvme: remove pointless rw indirection
  2020-10-01  4:05   ` Klaus Jensen
  2020-10-01  8:48     ` Klaus Jensen
@ 2020-10-01 18:34     ` Klaus Jensen
  1 sibling, 0 replies; 36+ messages in thread
From: Klaus Jensen @ 2020-10-01 18:34 UTC (permalink / raw)
  To: Keith Busch
  Cc: Kevin Wolf, Niklas Cassel, qemu-block, Klaus Jensen, qemu-devel,
	Philippe Mathieu-Daudé

[-- Attachment #1: Type: text/plain, Size: 6283 bytes --]

On Oct  1 06:05, Klaus Jensen wrote:
> On Sep 30 15:04, Keith Busch wrote:
> > The code switches on the opcode to invoke a function specific to that
> > opcode. There's no point in consolidating back to a common function that
> > just switches on that same opcode without any actual common code.
> > Restore the opcode specific behavior without going back through another
> > level of switches.
> > 
> > Signed-off-by: Keith Busch <kbusch@kernel.org>
> 
> Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
> 
> Point taken. I could've sweared I had a better reason for this.
> 
> > ---
> >  hw/block/nvme.c | 91 ++++++++++++++++---------------------------------
> >  1 file changed, 29 insertions(+), 62 deletions(-)
> > 
> > diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> > index da8344f196..db52ea0db9 100644
> > --- a/hw/block/nvme.c
> > +++ b/hw/block/nvme.c
> > @@ -927,68 +927,12 @@ static void nvme_rw_cb(void *opaque, int ret)
> >      nvme_enqueue_req_completion(nvme_cq(req), req);
> >  }
> >  
> > -static uint16_t nvme_do_aio(BlockBackend *blk, int64_t offset, size_t len,
> > -                            NvmeRequest *req)
> > -{
> > -    BlockAcctCookie *acct = &req->acct;
> > -    BlockAcctStats *stats = blk_get_stats(blk);
> > -
> > -    bool is_write = false;
> > -
> > -    trace_pci_nvme_do_aio(nvme_cid(req), req->cmd.opcode,
> > -                          nvme_io_opc_str(req->cmd.opcode), blk_name(blk),
> > -                          offset, len);
> > -
> > -    switch (req->cmd.opcode) {
> > -    case NVME_CMD_FLUSH:
> > -        block_acct_start(stats, acct, 0, BLOCK_ACCT_FLUSH);
> > -        req->aiocb = blk_aio_flush(blk, nvme_rw_cb, req);
> > -        break;
> > -
> > -    case NVME_CMD_WRITE_ZEROES:
> > -        block_acct_start(stats, acct, len, BLOCK_ACCT_WRITE);
> > -        req->aiocb = blk_aio_pwrite_zeroes(blk, offset, len,
> > -                                           BDRV_REQ_MAY_UNMAP, nvme_rw_cb,
> > -                                           req);
> > -        break;
> > -
> > -    case NVME_CMD_WRITE:
> > -        is_write = true;
> > -
> > -        /* fallthrough */
> > -
> > -    case NVME_CMD_READ:
> > -        block_acct_start(stats, acct, len,
> > -                         is_write ? BLOCK_ACCT_WRITE : BLOCK_ACCT_READ);
> > -
> > -        if (req->qsg.sg) {
> > -            if (is_write) {
> > -                req->aiocb = dma_blk_write(blk, &req->qsg, offset,
> > -                                           BDRV_SECTOR_SIZE, nvme_rw_cb, req);
> > -            } else {
> > -                req->aiocb = dma_blk_read(blk, &req->qsg, offset,
> > -                                          BDRV_SECTOR_SIZE, nvme_rw_cb, req);
> > -            }
> > -        } else {
> > -            if (is_write) {
> > -                req->aiocb = blk_aio_pwritev(blk, offset, &req->iov, 0,
> > -                                             nvme_rw_cb, req);
> > -            } else {
> > -                req->aiocb = blk_aio_preadv(blk, offset, &req->iov, 0,
> > -                                            nvme_rw_cb, req);
> > -            }
> > -        }
> > -
> > -        break;
> > -    }
> > -
> > -    return NVME_NO_COMPLETE;
> > -}
> > -
> >  static uint16_t nvme_flush(NvmeCtrl *n, NvmeRequest *req)
> >  {
> > -    NvmeNamespace *ns = req->ns;
> > -    return nvme_do_aio(ns->blkconf.blk, 0, 0, req);
> > +    block_acct_start(blk_get_stats(n->conf.blk), &req->acct, 0,

Uh, ouch!

This and the rest needs to be changed to ns->blkconf.blk and not
n->conf.blk.

> > +                     BLOCK_ACCT_FLUSH);
> > +    req->aiocb = blk_aio_flush(n->conf.blk, nvme_rw_cb, req);
> > +    return NVME_NO_COMPLETE;
> >  }
> >  
> >  static uint16_t nvme_write_zeroes(NvmeCtrl *n, NvmeRequest *req)
> > @@ -1009,7 +953,11 @@ static uint16_t nvme_write_zeroes(NvmeCtrl *n, NvmeRequest *req)
> >          return status;
> >      }
> >  
> > -    return nvme_do_aio(ns->blkconf.blk, offset, count, req);
> > +    block_acct_start(blk_get_stats(n->conf.blk), &req->acct, 0,
> > +                     BLOCK_ACCT_WRITE);
> > +    req->aiocb = blk_aio_pwrite_zeroes(n->conf.blk, offset, count,
> > +                                       BDRV_REQ_MAY_UNMAP, nvme_rw_cb, req);
> > +    return NVME_NO_COMPLETE;
> >  }
> >  
> >  static uint16_t nvme_rw(NvmeCtrl *n, NvmeRequest *req)
> > @@ -1023,6 +971,7 @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeRequest *req)
> >      uint64_t data_offset = nvme_l2b(ns, slba);
> >      enum BlockAcctType acct = req->cmd.opcode == NVME_CMD_WRITE ?
> >          BLOCK_ACCT_WRITE : BLOCK_ACCT_READ;
> > +    BlockBackend *blk = ns->blkconf.blk;
> >      uint16_t status;
> >  
> >      trace_pci_nvme_rw(nvme_cid(req), nvme_io_opc_str(rw->opcode),
> > @@ -1045,7 +994,25 @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeRequest *req)
> >          goto invalid;
> >      }
> >  
> > -    return nvme_do_aio(ns->blkconf.blk, data_offset, data_size, req);
> > +    block_acct_start(blk_get_stats(blk), &req->acct, data_size, acct);
> > +    if (req->qsg.sg) {
> > +        if (acct == BLOCK_ACCT_WRITE) {
> > +            req->aiocb = dma_blk_write(blk, &req->qsg, data_offset,
> > +                                       BDRV_SECTOR_SIZE, nvme_rw_cb, req);
> > +        } else {
> > +            req->aiocb = dma_blk_read(blk, &req->qsg, data_offset,
> > +                                      BDRV_SECTOR_SIZE, nvme_rw_cb, req);
> > +        }
> > +    } else {
> > +        if (acct == BLOCK_ACCT_WRITE) {
> > +            req->aiocb = blk_aio_pwritev(blk, data_offset, &req->iov, 0,
> > +                                         nvme_rw_cb, req);
> > +        } else {
> > +            req->aiocb = blk_aio_preadv(blk, data_offset, &req->iov, 0,
> > +                                        nvme_rw_cb, req);
> > +        }
> > +    }
> > +    return NVME_NO_COMPLETE;
> >  
> >  invalid:
> >      block_acct_invalid(blk_get_stats(ns->blkconf.blk), acct);
> > -- 
> > 2.24.1
> > 
> > 
> 
> -- 
> One of us - No more doubt, silence or taboo about mental illness.



-- 
One of us - No more doubt, silence or taboo about mental illness.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 0/9] nvme qemu cleanups and fixes
  2020-09-30 22:04 [PATCH 0/9] nvme qemu cleanups and fixes Keith Busch
                   ` (8 preceding siblings ...)
  2020-09-30 22:04 ` [PATCH 9/9] hw/block/nvme: report actual LBA data shift in LBAF Keith Busch
@ 2020-10-01 18:46 ` Klaus Jensen
  2020-10-13  9:04 ` Klaus Jensen
  10 siblings, 0 replies; 36+ messages in thread
From: Klaus Jensen @ 2020-10-01 18:46 UTC (permalink / raw)
  To: Keith Busch
  Cc: Kevin Wolf, Niklas Cassel, qemu-block, Klaus Jensen, qemu-devel,
	Philippe Mathieu-Daudé

[-- Attachment #1: Type: text/plain, Size: 711 bytes --]

On Sep 30 15:04, Keith Busch wrote:
> After going through the zns enabling, I notice the controller enabling
> is not correct. Then I just continued maked more stuff. The series, I
> think, contains some of the less controversial patches from the two
> conflicting zns series, preceeded by some cleanups and fixes from me.
> 
> If this is all fine, I took the liberty of porting the zns enabling to
> it and made a public branch for consideration here:
> 
>  http://git.infradead.org/qemu-nvme.git/shortlog/refs/heads/kb-zns 
> 

I rebased my patches on top of Keith's fixups as well. And in compliance
with the concensus, I removed persistence.

https://irrelevant.dk/g/pci-nvme.git/log/?h=zns

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 3/9] hw/block/nvme: support per-namespace smart log
  2020-09-30 22:04 ` [PATCH 3/9] hw/block/nvme: support per-namespace smart log Keith Busch
  2020-10-01  4:10   ` Klaus Jensen
@ 2020-10-02  8:48   ` Klaus Jensen
  2020-10-06  1:57   ` Dmitry Fomichev
  2 siblings, 0 replies; 36+ messages in thread
From: Klaus Jensen @ 2020-10-02  8:48 UTC (permalink / raw)
  To: Keith Busch
  Cc: Kevin Wolf, Niklas Cassel, qemu-block, Klaus Jensen, qemu-devel,
	Philippe Mathieu-Daudé

[-- Attachment #1: Type: text/plain, Size: 1660 bytes --]

On Sep 30 15:04, Keith Busch wrote:
> Let the user specify a specific namespace if they want to get access
> stats for a specific namespace.
> 
> Signed-off-by: Keith Busch <kbusch@kernel.org>
> ---
>  hw/block/nvme.c      | 66 +++++++++++++++++++++++++++-----------------
>  include/block/nvme.h |  1 +
>  2 files changed, 41 insertions(+), 26 deletions(-)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index 8d2b5be567..41389b2b09 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
>  static uint16_t nvme_smart_info(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
>                                  uint64_t off, NvmeRequest *req)
>  {
>      uint32_t nsid = le32_to_cpu(req->cmd.nsid);
> -
> +    struct nvme_stats stats = { 0 };
> +    NvmeSmartLog smart = { 0 };
>      uint32_t trans_len;
> +    NvmeNamespace *ns;
>      time_t current_ms;
> -    uint64_t units_read = 0, units_written = 0;
> -    uint64_t read_commands = 0, write_commands = 0;
> -    NvmeSmartLog smart;
> -
> -    if (nsid && nsid != 0xffffffff) {
> -        return NVME_INVALID_FIELD | NVME_DNR;
> -    }
>  
>      if (off >= sizeof(smart)) {
>          return NVME_INVALID_FIELD | NVME_DNR;
>      }
>  
> -    for (int i = 1; i <= n->num_namespaces; i++) {
> -        NvmeNamespace *ns = nvme_ns(n, i);
> -        if (!ns) {
> -            continue;
> -        }
> -
> -        BlockAcctStats *s = blk_get_stats(ns->blkconf.blk);
> +    if (nsid != 0xffffffff) {
> +        ns = nvme_ns(n, nsid);
> +        if (!ns)
> +            return NVME_INVALID_NSID | NVME_DNR;

Btw, this is failing style check (missing braces).

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 484 bytes --]

^ permalink raw reply	[flat|nested] 36+ messages in thread

* RE: [PATCH 1/9] hw/block/nvme: remove pointless rw indirection
  2020-09-30 22:04 ` [PATCH 1/9] hw/block/nvme: remove pointless rw indirection Keith Busch
  2020-10-01  4:05   ` Klaus Jensen
@ 2020-10-06  1:49   ` Dmitry Fomichev
  1 sibling, 0 replies; 36+ messages in thread
From: Dmitry Fomichev @ 2020-10-06  1:49 UTC (permalink / raw)
  To: Keith Busch, qemu-block, qemu-devel, Klaus Jensen
  Cc: Kevin Wolf, Niklas Cassel, Philippe Mathieu-Daudé

> -----Original Message-----
> From: Keith Busch <kbusch@kernel.org>
> Sent: Wednesday, September 30, 2020 6:04 PM
> To: qemu-block@nongnu.org; qemu-devel@nongnu.org; Klaus Jensen
> <k.jensen@samsung.com>
> Cc: Niklas Cassel <Niklas.Cassel@wdc.com>; Dmitry Fomichev
> <Dmitry.Fomichev@wdc.com>; Kevin Wolf <kwolf@redhat.com>; Philippe
> Mathieu-Daudé <philmd@redhat.com>; Keith Busch <kbusch@kernel.org>
> Subject: [PATCH 1/9] hw/block/nvme: remove pointless rw indirection
> 
> The code switches on the opcode to invoke a function specific to that
> opcode. There's no point in consolidating back to a common function that
> just switches on that same opcode without any actual common code.
> Restore the opcode specific behavior without going back through another
> level of switches.
> 
> Signed-off-by: Keith Busch <kbusch@kernel.org>
> ---
>  hw/block/nvme.c | 91 ++++++++++++++++---------------------------------
>  1 file changed, 29 insertions(+), 62 deletions(-)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index da8344f196..db52ea0db9 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -927,68 +927,12 @@ static void nvme_rw_cb(void *opaque, int ret)
>      nvme_enqueue_req_completion(nvme_cq(req), req);
>  }
> 
> -static uint16_t nvme_do_aio(BlockBackend *blk, int64_t offset, size_t len,
> -                            NvmeRequest *req)
> -{
> -    BlockAcctCookie *acct = &req->acct;
> -    BlockAcctStats *stats = blk_get_stats(blk);
> -
> -    bool is_write = false;
> -
> -    trace_pci_nvme_do_aio(nvme_cid(req), req->cmd.opcode,
> -                          nvme_io_opc_str(req->cmd.opcode), blk_name(blk),
> -                          offset, len);
> -
> -    switch (req->cmd.opcode) {
> -    case NVME_CMD_FLUSH:
> -        block_acct_start(stats, acct, 0, BLOCK_ACCT_FLUSH);
> -        req->aiocb = blk_aio_flush(blk, nvme_rw_cb, req);
> -        break;
> -
> -    case NVME_CMD_WRITE_ZEROES:
> -        block_acct_start(stats, acct, len, BLOCK_ACCT_WRITE);
> -        req->aiocb = blk_aio_pwrite_zeroes(blk, offset, len,
> -                                           BDRV_REQ_MAY_UNMAP, nvme_rw_cb,
> -                                           req);
> -        break;
> -
> -    case NVME_CMD_WRITE:
> -        is_write = true;
> -
> -        /* fallthrough */
> -
> -    case NVME_CMD_READ:
> -        block_acct_start(stats, acct, len,
> -                         is_write ? BLOCK_ACCT_WRITE : BLOCK_ACCT_READ);
> -
> -        if (req->qsg.sg) {
> -            if (is_write) {
> -                req->aiocb = dma_blk_write(blk, &req->qsg, offset,
> -                                           BDRV_SECTOR_SIZE, nvme_rw_cb, req);
> -            } else {
> -                req->aiocb = dma_blk_read(blk, &req->qsg, offset,
> -                                          BDRV_SECTOR_SIZE, nvme_rw_cb, req);
> -            }
> -        } else {
> -            if (is_write) {
> -                req->aiocb = blk_aio_pwritev(blk, offset, &req->iov, 0,
> -                                             nvme_rw_cb, req);
> -            } else {
> -                req->aiocb = blk_aio_preadv(blk, offset, &req->iov, 0,
> -                                            nvme_rw_cb, req);
> -            }
> -        }
> -
> -        break;
> -    }
> -
> -    return NVME_NO_COMPLETE;
> -}
> -
>  static uint16_t nvme_flush(NvmeCtrl *n, NvmeRequest *req)
>  {
> -    NvmeNamespace *ns = req->ns;
> -    return nvme_do_aio(ns->blkconf.blk, 0, 0, req);
> +    block_acct_start(blk_get_stats(n->conf.blk), &req->acct, 0,

This should be
block_acct_start(blk_get_stats(req->ns->blkconf.blk), &req->acct, 0,

> +                     BLOCK_ACCT_FLUSH);
> +    req->aiocb = blk_aio_flush(n->conf.blk, nvme_rw_cb, req);

and here
req->aiocb = blk_aio_flush(req->ns->blkconf.blk, nvme_rw_cb, req);

> +    return NVME_NO_COMPLETE;
>  }
> 
>  static uint16_t nvme_write_zeroes(NvmeCtrl *n, NvmeRequest *req)
> @@ -1009,7 +953,11 @@ static uint16_t nvme_write_zeroes(NvmeCtrl *n,
> NvmeRequest *req)
>          return status;
>      }
> 
> -    return nvme_do_aio(ns->blkconf.blk, offset, count, req);
> +    block_acct_start(blk_get_stats(n->conf.blk), &req->acct, 0,

 block_acct_start(blk_get_stats(ns->blkconf.blk), &req->acct, 0,

> +                     BLOCK_ACCT_WRITE);
> +    req->aiocb = blk_aio_pwrite_zeroes(n->conf.blk, offset, count,

Here as well,
 req->aiocb = blk_aio_pwrite_zeroes(ns->blkconf.blk, offset, count,

> +                                       BDRV_REQ_MAY_UNMAP, nvme_rw_cb, req);
> +    return NVME_NO_COMPLETE;
>  }
> 
>  static uint16_t nvme_rw(NvmeCtrl *n, NvmeRequest *req)
> @@ -1023,6 +971,7 @@ static uint16_t nvme_rw(NvmeCtrl *n,
> NvmeRequest *req)
>      uint64_t data_offset = nvme_l2b(ns, slba);
>      enum BlockAcctType acct = req->cmd.opcode == NVME_CMD_WRITE ?
>          BLOCK_ACCT_WRITE : BLOCK_ACCT_READ;
> +    BlockBackend *blk = ns->blkconf.blk;
>      uint16_t status;
> 
>      trace_pci_nvme_rw(nvme_cid(req), nvme_io_opc_str(rw->opcode),
> @@ -1045,7 +994,25 @@ static uint16_t nvme_rw(NvmeCtrl *n,
> NvmeRequest *req)
>          goto invalid;
>      }
> 
> -    return nvme_do_aio(ns->blkconf.blk, data_offset, data_size, req);
> +    block_acct_start(blk_get_stats(blk), &req->acct, data_size, acct);
> +    if (req->qsg.sg) {
> +        if (acct == BLOCK_ACCT_WRITE) {
> +            req->aiocb = dma_blk_write(blk, &req->qsg, data_offset,
> +                                       BDRV_SECTOR_SIZE, nvme_rw_cb, req);
> +        } else {
> +            req->aiocb = dma_blk_read(blk, &req->qsg, data_offset,
> +                                      BDRV_SECTOR_SIZE, nvme_rw_cb, req);
> +        }
> +    } else {
> +        if (acct == BLOCK_ACCT_WRITE) {
> +            req->aiocb = blk_aio_pwritev(blk, data_offset, &req->iov, 0,
> +                                         nvme_rw_cb, req);
> +        } else {
> +            req->aiocb = blk_aio_preadv(blk, data_offset, &req->iov, 0,
> +                                        nvme_rw_cb, req);
> +        }
> +    }
> +    return NVME_NO_COMPLETE;
> 
>  invalid:
>      block_acct_invalid(blk_get_stats(ns->blkconf.blk), acct);
> --
> 2.24.1



^ permalink raw reply	[flat|nested] 36+ messages in thread

* RE: [PATCH 3/9] hw/block/nvme: support per-namespace smart log
  2020-09-30 22:04 ` [PATCH 3/9] hw/block/nvme: support per-namespace smart log Keith Busch
  2020-10-01  4:10   ` Klaus Jensen
  2020-10-02  8:48   ` Klaus Jensen
@ 2020-10-06  1:57   ` Dmitry Fomichev
  2 siblings, 0 replies; 36+ messages in thread
From: Dmitry Fomichev @ 2020-10-06  1:57 UTC (permalink / raw)
  To: Keith Busch, qemu-block, qemu-devel, Klaus Jensen
  Cc: Kevin Wolf, Niklas Cassel, Philippe Mathieu-Daudé

> -----Original Message-----
> From: Keith Busch <kbusch@kernel.org>
> Sent: Wednesday, September 30, 2020 6:04 PM
> To: qemu-block@nongnu.org; qemu-devel@nongnu.org; Klaus Jensen
> <k.jensen@samsung.com>
> Cc: Niklas Cassel <Niklas.Cassel@wdc.com>; Dmitry Fomichev
> <Dmitry.Fomichev@wdc.com>; Kevin Wolf <kwolf@redhat.com>; Philippe
> Mathieu-Daudé <philmd@redhat.com>; Keith Busch <kbusch@kernel.org>
> Subject: [PATCH 3/9] hw/block/nvme: support per-namespace smart log
> 
> Let the user specify a specific namespace if they want to get access
> stats for a specific namespace.
> 
> Signed-off-by: Keith Busch <kbusch@kernel.org>
> ---
>  hw/block/nvme.c      | 66 +++++++++++++++++++++++++++-----------------
>  include/block/nvme.h |  1 +
>  2 files changed, 41 insertions(+), 26 deletions(-)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index 8d2b5be567..41389b2b09 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -1164,48 +1164,62 @@ static uint16_t nvme_create_sq(NvmeCtrl *n,
> NvmeRequest *req)
>      return NVME_SUCCESS;
>  }
> 
> +struct nvme_stats {
> +    uint64_t units_read;
> +    uint64_t units_written;
> +    uint64_t read_commands;
> +    uint64_t write_commands;
> +};
> +
> +static void nvme_set_blk_stats(NvmeNamespace *ns, struct nvme_stats
> *stats)
> +{
> +    BlockAcctStats *s = blk_get_stats(ns->blkconf.blk);
> +
> +    stats->units_read += s->nr_bytes[BLOCK_ACCT_READ] >>
> BDRV_SECTOR_BITS;
> +    stats->units_written += s->nr_bytes[BLOCK_ACCT_WRITE] >>
> BDRV_SECTOR_BITS;
> +    stats->read_commands += s->nr_ops[BLOCK_ACCT_READ];
> +    stats->write_commands += s->nr_ops[BLOCK_ACCT_WRITE];
> +}
> +
>  static uint16_t nvme_smart_info(NvmeCtrl *n, uint8_t rae, uint32_t
> buf_len,
>                                  uint64_t off, NvmeRequest *req)
>  {
>      uint32_t nsid = le32_to_cpu(req->cmd.nsid);
> -
> +    struct nvme_stats stats = { 0 };
> +    NvmeSmartLog smart = { 0 };
>      uint32_t trans_len;
> +    NvmeNamespace *ns;
>      time_t current_ms;
> -    uint64_t units_read = 0, units_written = 0;
> -    uint64_t read_commands = 0, write_commands = 0;
> -    NvmeSmartLog smart;
> -
> -    if (nsid && nsid != 0xffffffff) {
> -        return NVME_INVALID_FIELD | NVME_DNR;
> -    }
> 
>      if (off >= sizeof(smart)) {
>          return NVME_INVALID_FIELD | NVME_DNR;
>      }
> 
> -    for (int i = 1; i <= n->num_namespaces; i++) {
> -        NvmeNamespace *ns = nvme_ns(n, i);
> -        if (!ns) {
> -            continue;
> -        }
> -
> -        BlockAcctStats *s = blk_get_stats(ns->blkconf.blk);
> +    if (nsid != 0xffffffff) {
> +        ns = nvme_ns(n, nsid);
> +        if (!ns)
> +            return NVME_INVALID_NSID | NVME_DNR;

checkpatch will complain about missing curly braces here

> +        nvme_set_blk_stats(ns, &stats);
> +    } else {
> +        int i;
> 
> -        units_read += s->nr_bytes[BLOCK_ACCT_READ] >>
> BDRV_SECTOR_BITS;
> -        units_written += s->nr_bytes[BLOCK_ACCT_WRITE] >>
> BDRV_SECTOR_BITS;
> -        read_commands += s->nr_ops[BLOCK_ACCT_READ];
> -        write_commands += s->nr_ops[BLOCK_ACCT_WRITE];
> +        for (i = 1; i <= n->num_namespaces; i++) {
> +            ns = nvme_ns(n, i);
> +            if (!ns) {
> +                continue;
> +            }
> +            nvme_set_blk_stats(ns, &stats);
> +        }
>      }
> 
>      trans_len = MIN(sizeof(smart) - off, buf_len);
> 
> -    memset(&smart, 0x0, sizeof(smart));
> -
> -    smart.data_units_read[0] = cpu_to_le64(DIV_ROUND_UP(units_read,
> 1000));
> -    smart.data_units_written[0] =
> cpu_to_le64(DIV_ROUND_UP(units_written,
> +    smart.data_units_read[0] =
> cpu_to_le64(DIV_ROUND_UP(stats.units_read,
> +                                                        1000));
> +    smart.data_units_written[0] =
> cpu_to_le64(DIV_ROUND_UP(stats.units_written,
>                                                             1000));
> -    smart.host_read_commands[0] = cpu_to_le64(read_commands);
> -    smart.host_write_commands[0] = cpu_to_le64(write_commands);
> +    smart.host_read_commands[0] = cpu_to_le64(stats.read_commands);
> +    smart.host_write_commands[0] = cpu_to_le64(stats.write_commands);
> 
>      smart.temperature = cpu_to_le16(n->temperature);
> 
> @@ -2708,7 +2722,7 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice
> *pci_dev)
>      id->acl = 3;
>      id->aerl = n->params.aerl;
>      id->frmw = (NVME_NUM_FW_SLOTS << 1) | NVME_FRMW_SLOT1_RO;
> -    id->lpa = NVME_LPA_EXTENDED;
> +    id->lpa = NVME_LPA_NS_SMART | NVME_LPA_EXTENDED;
> 
>      /* recommended default value (~70 C) */
>      id->wctemp = cpu_to_le16(NVME_TEMPERATURE_WARNING);
> diff --git a/include/block/nvme.h b/include/block/nvme.h
> index 58647bcdad..868cf53f0b 100644
> --- a/include/block/nvme.h
> +++ b/include/block/nvme.h
> @@ -849,6 +849,7 @@ enum NvmeIdCtrlFrmw {
>  };
> 
>  enum NvmeIdCtrlLpa {
> +    NVME_LPA_NS_SMART = 1 << 0,
>      NVME_LPA_EXTENDED = 1 << 2,
>  };
> 
> --
> 2.24.1



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 0/9] nvme qemu cleanups and fixes
  2020-09-30 22:04 [PATCH 0/9] nvme qemu cleanups and fixes Keith Busch
                   ` (9 preceding siblings ...)
  2020-10-01 18:46 ` [PATCH 0/9] nvme qemu cleanups and fixes Klaus Jensen
@ 2020-10-13  9:04 ` Klaus Jensen
  2020-10-13 17:48   ` Keith Busch
  10 siblings, 1 reply; 36+ messages in thread
From: Klaus Jensen @ 2020-10-13  9:04 UTC (permalink / raw)
  To: Keith Busch
  Cc: Kevin Wolf, Niklas Cassel, qemu-block, Klaus Jensen, qemu-devel,
	Philippe Mathieu-Daudé

[-- Attachment #1: Type: text/plain, Size: 1647 bytes --]

On Sep 30 15:04, Keith Busch wrote:
> After going through the zns enabling, I notice the controller enabling
> is not correct. Then I just continued maked more stuff. The series, I
> think, contains some of the less controversial patches from the two
> conflicting zns series, preceeded by some cleanups and fixes from me.
> 
> If this is all fine, I took the liberty of porting the zns enabling to
> it and made a public branch for consideration here:
> 
>  http://git.infradead.org/qemu-nvme.git/shortlog/refs/heads/kb-zns 
> 
> Dmitry Fomichev (1):
>   hw/block/nvme: report actual LBA data shift in LBAF
> 
> Keith Busch (5):
>   hw/block/nvme: remove pointless rw indirection
>   hw/block/nvme: fix log page offset check
>   hw/block/nvme: support per-namespace smart log
>   hw/block/nvme: validate command set selected
>   hw/block/nvme: support for admin-only command set
> 
> Klaus Jensen (3):
>   hw/block/nvme: reject io commands if only admin command set selected
>   hw/block/nvme: add nsid to get/setfeat trace events
>   hw/block/nvme: add trace event for requests with non-zero status code
> 
>  hw/block/nvme-ns.c    |   5 ++
>  hw/block/nvme.c       | 194 ++++++++++++++++++++----------------------
>  hw/block/trace-events |   6 +-
>  include/block/nvme.h  |  11 +++
>  4 files changed, 114 insertions(+), 102 deletions(-)
> 
> -- 
> 2.24.1
> 
> 

These fixes all look good to me apart from the odd fixes that has been
mentioned in the reviews. Since soft freeze is only two weeks away (Oct
27th), it would be nice to get this staged on nvme-next so we can get a
pull sent off to Peter.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 0/9] nvme qemu cleanups and fixes
  2020-10-13  9:04 ` Klaus Jensen
@ 2020-10-13 17:48   ` Keith Busch
  2020-10-13 18:36     ` Klaus Jensen
  0 siblings, 1 reply; 36+ messages in thread
From: Keith Busch @ 2020-10-13 17:48 UTC (permalink / raw)
  To: Klaus Jensen
  Cc: Kevin Wolf, Niklas Cassel, qemu-block, Klaus Jensen, qemu-devel,
	Philippe Mathieu-Daudé

On Tue, Oct 13, 2020 at 11:04:01AM +0200, Klaus Jensen wrote:
> On Sep 30 15:04, Keith Busch wrote:
> > After going through the zns enabling, I notice the controller enabling
> > is not correct. Then I just continued maked more stuff. The series, I
> > think, contains some of the less controversial patches from the two
> > conflicting zns series, preceeded by some cleanups and fixes from me.
> > 
> > If this is all fine, I took the liberty of porting the zns enabling to
> > it and made a public branch for consideration here:
> > 
> >  http://git.infradead.org/qemu-nvme.git/shortlog/refs/heads/kb-zns 
> > 
> > Dmitry Fomichev (1):
> >   hw/block/nvme: report actual LBA data shift in LBAF
> > 
> > Keith Busch (5):
> >   hw/block/nvme: remove pointless rw indirection
> >   hw/block/nvme: fix log page offset check
> >   hw/block/nvme: support per-namespace smart log
> >   hw/block/nvme: validate command set selected
> >   hw/block/nvme: support for admin-only command set
> > 
> > Klaus Jensen (3):
> >   hw/block/nvme: reject io commands if only admin command set selected
> >   hw/block/nvme: add nsid to get/setfeat trace events
> >   hw/block/nvme: add trace event for requests with non-zero status code
> > 
> >  hw/block/nvme-ns.c    |   5 ++
> >  hw/block/nvme.c       | 194 ++++++++++++++++++++----------------------
> >  hw/block/trace-events |   6 +-
> >  include/block/nvme.h  |  11 +++
> >  4 files changed, 114 insertions(+), 102 deletions(-)
> > 
> > -- 
> > 2.24.1
> > 
> > 
> 
> These fixes all look good to me apart from the odd fixes that has been
> mentioned in the reviews. Since soft freeze is only two weeks away (Oct
> 27th), it would be nice to get this staged on nvme-next so we can get a
> pull sent off to Peter.

I've fixed up the comments mentioned and added the received reviews.
Since it was pretty trivial fixups and passes my basic santify tests, I
went ahead and pushed to nvme-next.


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 0/9] nvme qemu cleanups and fixes
  2020-10-13 17:48   ` Keith Busch
@ 2020-10-13 18:36     ` Klaus Jensen
  0 siblings, 0 replies; 36+ messages in thread
From: Klaus Jensen @ 2020-10-13 18:36 UTC (permalink / raw)
  To: Keith Busch
  Cc: Kevin Wolf, Niklas Cassel, qemu-block, Klaus Jensen, qemu-devel,
	Philippe Mathieu-Daudé

[-- Attachment #1: Type: text/plain, Size: 2130 bytes --]

On Oct 13 10:48, Keith Busch wrote:
> On Tue, Oct 13, 2020 at 11:04:01AM +0200, Klaus Jensen wrote:
> > On Sep 30 15:04, Keith Busch wrote:
> > > After going through the zns enabling, I notice the controller enabling
> > > is not correct. Then I just continued maked more stuff. The series, I
> > > think, contains some of the less controversial patches from the two
> > > conflicting zns series, preceeded by some cleanups and fixes from me.
> > > 
> > > If this is all fine, I took the liberty of porting the zns enabling to
> > > it and made a public branch for consideration here:
> > > 
> > >  http://git.infradead.org/qemu-nvme.git/shortlog/refs/heads/kb-zns 
> > > 
> > > Dmitry Fomichev (1):
> > >   hw/block/nvme: report actual LBA data shift in LBAF
> > > 
> > > Keith Busch (5):
> > >   hw/block/nvme: remove pointless rw indirection
> > >   hw/block/nvme: fix log page offset check
> > >   hw/block/nvme: support per-namespace smart log
> > >   hw/block/nvme: validate command set selected
> > >   hw/block/nvme: support for admin-only command set
> > > 
> > > Klaus Jensen (3):
> > >   hw/block/nvme: reject io commands if only admin command set selected
> > >   hw/block/nvme: add nsid to get/setfeat trace events
> > >   hw/block/nvme: add trace event for requests with non-zero status code
> > > 
> > >  hw/block/nvme-ns.c    |   5 ++
> > >  hw/block/nvme.c       | 194 ++++++++++++++++++++----------------------
> > >  hw/block/trace-events |   6 +-
> > >  include/block/nvme.h  |  11 +++
> > >  4 files changed, 114 insertions(+), 102 deletions(-)
> > > 
> > > -- 
> > > 2.24.1
> > > 
> > > 
> > 
> > These fixes all look good to me apart from the odd fixes that has been
> > mentioned in the reviews. Since soft freeze is only two weeks away (Oct
> > 27th), it would be nice to get this staged on nvme-next so we can get a
> > pull sent off to Peter.
> 
> I've fixed up the comments mentioned and added the received reviews.
> Since it was pretty trivial fixups and passes my basic santify tests, I
> went ahead and pushed to nvme-next.
> 

Looks good, thanks!

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2020-10-13 18:41 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-30 22:04 [PATCH 0/9] nvme qemu cleanups and fixes Keith Busch
2020-09-30 22:04 ` [PATCH 1/9] hw/block/nvme: remove pointless rw indirection Keith Busch
2020-10-01  4:05   ` Klaus Jensen
2020-10-01  8:48     ` Klaus Jensen
2020-10-01 15:24       ` Keith Busch
2020-10-01 18:34     ` Klaus Jensen
2020-10-06  1:49   ` Dmitry Fomichev
2020-09-30 22:04 ` [PATCH 2/9] hw/block/nvme: fix log page offset check Keith Busch
2020-09-30 23:18   ` Dmitry Fomichev
2020-10-01  4:05   ` Klaus Jensen
2020-10-01 10:11   ` Philippe Mathieu-Daudé
2020-09-30 22:04 ` [PATCH 3/9] hw/block/nvme: support per-namespace smart log Keith Busch
2020-10-01  4:10   ` Klaus Jensen
2020-10-01 15:20     ` Keith Busch
2020-10-01 17:18       ` Klaus Jensen
2020-10-01 17:30         ` Keith Busch
2020-10-01 17:34           ` Klaus Jensen
2020-10-02  8:48   ` Klaus Jensen
2020-10-06  1:57   ` Dmitry Fomichev
2020-09-30 22:04 ` [PATCH 4/9] hw/block/nvme: validate command set selected Keith Busch
2020-10-01  4:14   ` Klaus Jensen
2020-09-30 22:04 ` [PATCH 5/9] hw/block/nvme: support for admin-only command set Keith Busch
2020-10-01  0:11   ` Dmitry Fomichev
2020-10-01  4:17   ` Klaus Jensen
2020-09-30 22:04 ` [PATCH 6/9] hw/block/nvme: reject io commands if only admin command set selected Keith Busch
2020-09-30 23:11   ` Dmitry Fomichev
2020-09-30 22:04 ` [PATCH 7/9] hw/block/nvme: add nsid to get/setfeat trace events Keith Busch
2020-09-30 22:04 ` [PATCH 8/9] hw/block/nvme: add trace event for requests with non-zero status code Keith Busch
2020-09-30 23:21   ` Dmitry Fomichev
2020-10-01 15:25   ` Philippe Mathieu-Daudé
2020-09-30 22:04 ` [PATCH 9/9] hw/block/nvme: report actual LBA data shift in LBAF Keith Busch
2020-10-01  9:48   ` Klaus Jensen
2020-10-01 18:46 ` [PATCH 0/9] nvme qemu cleanups and fixes Klaus Jensen
2020-10-13  9:04 ` Klaus Jensen
2020-10-13 17:48   ` Keith Busch
2020-10-13 18:36     ` Klaus Jensen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.