* [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces
@ 2020-04-15  5:50 Klaus Jensen
  2020-04-15  5:50 ` [PATCH v7 01/48] nvme: rename trace events to nvme_dev Klaus Jensen
                   ` (49 more replies)
  0 siblings, 50 replies; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:50 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

Hi,

v7 is mostly just changes proposed by Maxim. Also, Gollu's patch for the
bit bucket sgl descriptor has been added (patch #43), but it is of a
pretty manageable size.



Changes since v6
~~~~~~~~~~~~~~~~

* 01/48 ("nvme: rename trace events to nvme_dev")
  - indentation nitpicks (Maxim)

* 04/48 ("nvme: bump spec data structures to v1.3")
  - added defines for temperature threshold and sensor selection (Maxim)
  - added defines for the EUI64 and NGUID NIDT types (Maxim)

* 09/48 ("nvme: add max_ioqpairs device parameter")
  - hoisted a bugfix out (Maxim)

* 11/48 ("nvme: refactor device realization")
  - pulled the removal of the duplicated cmbsz/cmbloc struct members into
    a separate patch (Maxim)
  - fixed an incomplete error message (Maxim)

* 12/48 ("nvme: add temperature threshold feature")
  - use the defines added in 04/48 (Maxim)
  - remove the addition of the unused 'temperature' NvmeCtrl member

* 13/48 ("nvme: add support for the get log page command")
  - use the already defined NvmeErrorLog type (Maxim)
  - add the 'temperature' NvmeCtrl member here instead (Maxim)

* 15/48 ("nvme: add missing mandatory features")
  - fix off-by-one when validating dw11 for the interrupt vector config
    feature (Maxim)
  - some nitpicks (Maxim)

* 16/48 ("nvme: additional tracing")
  - moved some trace events (that were added in later patches) here

* 19/48 ("nvme: support identify namespace descriptor list")
  - refactored for the nth time; I think I'm finally kind of happy with
    it

* 20/48 ("nvme: enforce valid queue creation sequence")
  - added a comment (Maxim)

* 24/48 ("nvme: add mapping helpers")
  - fixed an off-by-one error that would cause valid reads/writes to the
    CMB to be rejected (Maxim); see the sketch after this changelog
  - added a check on len so we don't add zero-sized elements to the iovec
    (Maxim)

* 26/48 ("nvme: remove redundant has_sg member")
  - pulled the s/dma_acct/block_acct change to a separate patch (Maxim)

* 36/48 ("nvme: allow multiple aios per command")
  - renamed nvme_req_register_aio to nvme_req_add_aio and added some
    comments (Maxim)
  - killed the prinfo patch
  - pulled a bunch of refactoring stuff into other patches to make this
    one easier to read (Maxim)

* 42/48 ("nvme: add support for scatter gather lists")
  - pulled the nvme_addr_read hardening into its own patch (Maxim)
  - refactored the logic around allowed sgl descriptor types (Maxim)
  - additional comments (Maxim)

* 45/48 ("nvme: support multiple namespaces")
  - fix an inconsistent return value
  - fix an error in the edge case where all namespaces are allocated (Maxim)

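For reference, the off-by-one in 24/48 is roughly of the "a transfer that
ends exactly at the CMB boundary gets rejected" variety. A sketch of what
an inclusive bounds check looks like (helper name and exact shape are
illustrative here, not the literal hunk):

    static inline bool nvme_addr_is_cmb(NvmeCtrl *n, hwaddr addr, size_t len)
    {
        hwaddr lo = n->ctrl_mem.addr;
        hwaddr hi = lo + int128_get64(n->ctrl_mem.size);

        /* 'addr + len <= hi' rather than '< hi', so that a transfer
         * ending exactly at the end of the CMB is not rejected */
        return addr >= lo && addr + len <= hi;
    }

The accompanying len check is just an 'if (len) { ... }' guard around
qemu_iovec_add() so that zero-sized elements never make it into the iovec.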


git-backport-diff against rebased v6
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Key:
[----] : patches are identical
[####] : number of functional differences between upstream/downstream patch
[down] : patch is downstream-only
The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively

001/48:[0006] [FC] 'nvme: rename trace events to nvme_dev'
002/48:[----] [--] 'nvme: remove superfluous breaks'
003/48:[----] [--] 'nvme: move device parameters to separate struct'
004/48:[0017] [FC] 'nvme: bump spec data structures to v1.3'
005/48:[down]      'nvme: use constants in identify'
006/48:[----] [--] 'nvme: refactor nvme_addr_read'
007/48:[----] [--] 'nvme: add support for the abort command'
008/48:[down]      'nvme: fix pci doorbell size calculation'
009/48:[0002] [FC] 'nvme: add max_ioqpairs device parameter'
010/48:[down]      'nvme: remove redundant cmbloc/cmbsz members'
011/48:[0022] [FC] 'nvme: refactor device realization'
012/48:[0013] [FC] 'nvme: add temperature threshold feature'
013/48:[0007] [FC] 'nvme: add support for the get log page command'
014/48:[----] [-C] 'nvme: add support for the asynchronous event request command'
015/48:[0002] [FC] 'nvme: add missing mandatory features'
016/48:[0014] [FC] 'nvme: additional tracing'
017/48:[----] [--] 'nvme: make sure ncqr and nsqr is valid'
018/48:[----] [--] 'nvme: add log specific field to trace events'
019/48:[0029] [FC] 'nvme: support identify namespace descriptor list'
020/48:[0005] [FC] 'nvme: enforce valid queue creation sequence'
021/48:[----] [--] 'nvme: provide the mandatory subnqn field'
022/48:[----] [--] 'nvme: bump supported version to v1.3'
023/48:[----] [--] 'nvme: memset preallocated requests structures'
024/48:[0010] [FC] 'nvme: add mapping helpers'
025/48:[down]      'nvme: replace dma_acct with blk_acct equivalent'
026/48:[0007] [FC] 'nvme: remove redundant has_sg member'
027/48:[0011] [FC] 'nvme: refactor dma read/write'
028/48:[0004] [FC] 'nvme: pass request along for tracing'
029/48:[----] [--] 'nvme: add request mapping helper'
030/48:[----] [--] 'nvme: verify validity of prp lists in the cmb'
031/48:[----] [-C] 'nvme: refactor request bounds checking'
032/48:[----] [--] 'nvme: add check for mdts'
033/48:[down]      'nvme: be consistent about zeros vs zeroes'
034/48:[down]      'nvme: refactor NvmeRequest'
035/48:[down]      'nvme: remove NvmeCmd parameter'
036/48:[0218] [FC] 'nvme: allow multiple aios per command'
037/48:[down]      'nvme: add nvme_check_rw helper'
038/48:[----] [--] 'nvme: use preallocated qsg/iov in nvme_dma_prp'
039/48:[----] [--] 'pci: pass along the return value of dma_memory_rw'
040/48:[----] [-C] 'nvme: handle dma errors'
041/48:[down]      'nvme: harden cmb access'
042/48:[0102] [FC] 'nvme: add support for scatter gather lists'
043/48:[down]      'nvme: add support for sgl bit bucket descriptor'
044/48:[----] [-C] 'nvme: refactor identify active namespace id list'
045/48:[0036] [FC] 'nvme: support multiple namespaces'
046/48:[----] [--] 'pci: allocate pci id for nvme'
047/48:[----] [--] 'nvme: change controller pci id'
048/48:[----] [-C] 'nvme: make lba data size configurable'



Gollu Appalanaidu (1):
  nvme: add support for sgl bit bucket descriptor

Klaus Jensen (47):
  nvme: rename trace events to nvme_dev
  nvme: remove superfluous breaks
  nvme: move device parameters to separate struct
  nvme: bump spec data structures to v1.3
  nvme: use constants in identify
  nvme: refactor nvme_addr_read
  nvme: add support for the abort command
  nvme: fix pci doorbell size calculation
  nvme: add max_ioqpairs device parameter
  nvme: remove redundant cmbloc/cmbsz members
  nvme: refactor device realization
  nvme: add temperature threshold feature
  nvme: add support for the get log page command
  nvme: add support for the asynchronous event request command
  nvme: add missing mandatory features
  nvme: additional tracing
  nvme: make sure ncqr and nsqr is valid
  nvme: add log specific field to trace events
  nvme: support identify namespace descriptor list
  nvme: enforce valid queue creation sequence
  nvme: provide the mandatory subnqn field
  nvme: bump supported version to v1.3
  nvme: memset preallocated requests structures
  nvme: add mapping helpers
  nvme: replace dma_acct with blk_acct equivalent
  nvme: remove redundant has_sg member
  nvme: refactor dma read/write
  nvme: pass request along for tracing
  nvme: add request mapping helper
  nvme: verify validity of prp lists in the cmb
  nvme: refactor request bounds checking
  nvme: add check for mdts
  nvme: be consistent about zeros vs zeroes
  nvme: refactor NvmeRequest
  nvme: remove NvmeCmd parameter
  nvme: allow multiple aios per command
  nvme: add nvme_check_rw helper
  nvme: use preallocated qsg/iov in nvme_dma_prp
  pci: pass along the return value of dma_memory_rw
  nvme: handle dma errors
  nvme: harden cmb access
  nvme: add support for scatter gather lists
  nvme: refactor identify active namespace id list
  nvme: support multiple namespaces
  pci: allocate pci id for nvme
  nvme: change controller pci id
  nvme: make lba data size configurable

 MAINTAINERS            |    1 +
 block/nvme.c           |   22 +-
 docs/specs/nvme.txt    |   25 +
 docs/specs/pci-ids.txt |    1 +
 hw/block/Makefile.objs |    2 +-
 hw/block/nvme-ns.c     |  162 +++
 hw/block/nvme-ns.h     |   62 ++
 hw/block/nvme.c        | 2139 ++++++++++++++++++++++++++++++++--------
 hw/block/nvme.h        |  177 +++-
 hw/block/trace-events  |  206 ++--
 hw/core/machine.c      |    1 +
 include/block/nvme.h   |  190 +++-
 include/hw/pci/pci.h   |    4 +-
 13 files changed, 2413 insertions(+), 579 deletions(-)
 create mode 100644 docs/specs/nvme.txt
 create mode 100644 hw/block/nvme-ns.c
 create mode 100644 hw/block/nvme-ns.h

-- 
2.26.0




* [PATCH v7 01/48] nvme: rename trace events to nvme_dev
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
@ 2020-04-15  5:50 ` Klaus Jensen
  2020-04-15  5:50 ` [PATCH v7 02/48] nvme: remove superfluous breaks Klaus Jensen
                   ` (48 subsequent siblings)
  49 siblings, 0 replies; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:50 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

Change the prefix of all nvme device-related trace events to 'nvme_dev'
so they do not clash with trace events from the nvme block driver.

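Each line in hw/block/trace-events declares an event for which the build
generates a matching trace_<name>() helper, so the call sites in nvme.c
and the declarations in trace-events have to be renamed in lockstep. For
example (taken from the diff below):

    # hw/block/trace-events
    nvme_dev_irq_msix(uint32_t vector) "raising MSI-X IRQ vector %u"

    /* hw/block/nvme.c */
    trace_nvme_dev_irq_msix(cq->vector);
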
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Acked-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 hw/block/nvme.c       | 190 +++++++++++++++++++++---------------------
 hw/block/trace-events | 172 +++++++++++++++++++-------------------
 2 files changed, 180 insertions(+), 182 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index d28335cbf377..01e18fb9eb1f 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -112,16 +112,16 @@ static void nvme_irq_assert(NvmeCtrl *n, NvmeCQueue *cq)
 {
     if (cq->irq_enabled) {
         if (msix_enabled(&(n->parent_obj))) {
-            trace_nvme_irq_msix(cq->vector);
+            trace_nvme_dev_irq_msix(cq->vector);
             msix_notify(&(n->parent_obj), cq->vector);
         } else {
-            trace_nvme_irq_pin();
+            trace_nvme_dev_irq_pin();
             assert(cq->cqid < 64);
             n->irq_status |= 1 << cq->cqid;
             nvme_irq_check(n);
         }
     } else {
-        trace_nvme_irq_masked();
+        trace_nvme_dev_irq_masked();
     }
 }
 
@@ -146,7 +146,7 @@ static uint16_t nvme_map_prp(QEMUSGList *qsg, QEMUIOVector *iov, uint64_t prp1,
     int num_prps = (len >> n->page_bits) + 1;
 
     if (unlikely(!prp1)) {
-        trace_nvme_err_invalid_prp();
+        trace_nvme_dev_err_invalid_prp();
         return NVME_INVALID_FIELD | NVME_DNR;
     } else if (n->cmbsz && prp1 >= n->ctrl_mem.addr &&
                prp1 < n->ctrl_mem.addr + int128_get64(n->ctrl_mem.size)) {
@@ -160,7 +160,7 @@ static uint16_t nvme_map_prp(QEMUSGList *qsg, QEMUIOVector *iov, uint64_t prp1,
     len -= trans_len;
     if (len) {
         if (unlikely(!prp2)) {
-            trace_nvme_err_invalid_prp2_missing();
+            trace_nvme_dev_err_invalid_prp2_missing();
             goto unmap;
         }
         if (len > n->page_size) {
@@ -176,7 +176,7 @@ static uint16_t nvme_map_prp(QEMUSGList *qsg, QEMUIOVector *iov, uint64_t prp1,
 
                 if (i == n->max_prp_ents - 1 && len > n->page_size) {
                     if (unlikely(!prp_ent || prp_ent & (n->page_size - 1))) {
-                        trace_nvme_err_invalid_prplist_ent(prp_ent);
+                        trace_nvme_dev_err_invalid_prplist_ent(prp_ent);
                         goto unmap;
                     }
 
@@ -189,7 +189,7 @@ static uint16_t nvme_map_prp(QEMUSGList *qsg, QEMUIOVector *iov, uint64_t prp1,
                 }
 
                 if (unlikely(!prp_ent || prp_ent & (n->page_size - 1))) {
-                    trace_nvme_err_invalid_prplist_ent(prp_ent);
+                    trace_nvme_dev_err_invalid_prplist_ent(prp_ent);
                     goto unmap;
                 }
 
@@ -204,7 +204,7 @@ static uint16_t nvme_map_prp(QEMUSGList *qsg, QEMUIOVector *iov, uint64_t prp1,
             }
         } else {
             if (unlikely(prp2 & (n->page_size - 1))) {
-                trace_nvme_err_invalid_prp2_align(prp2);
+                trace_nvme_dev_err_invalid_prp2_align(prp2);
                 goto unmap;
             }
             if (qsg->nsg) {
@@ -252,20 +252,20 @@ static uint16_t nvme_dma_read_prp(NvmeCtrl *n, uint8_t *ptr, uint32_t len,
     QEMUIOVector iov;
     uint16_t status = NVME_SUCCESS;
 
-    trace_nvme_dma_read(prp1, prp2);
+    trace_nvme_dev_dma_read(prp1, prp2);
 
     if (nvme_map_prp(&qsg, &iov, prp1, prp2, len, n)) {
         return NVME_INVALID_FIELD | NVME_DNR;
     }
     if (qsg.nsg > 0) {
         if (unlikely(dma_buf_read(ptr, len, &qsg))) {
-            trace_nvme_err_invalid_dma();
+            trace_nvme_dev_err_invalid_dma();
             status = NVME_INVALID_FIELD | NVME_DNR;
         }
         qemu_sglist_destroy(&qsg);
     } else {
         if (unlikely(qemu_iovec_from_buf(&iov, 0, ptr, len) != len)) {
-            trace_nvme_err_invalid_dma();
+            trace_nvme_dev_err_invalid_dma();
             status = NVME_INVALID_FIELD | NVME_DNR;
         }
         qemu_iovec_destroy(&iov);
@@ -354,7 +354,7 @@ static uint16_t nvme_write_zeros(NvmeCtrl *n, NvmeNamespace *ns, NvmeCmd *cmd,
     uint32_t count = nlb << data_shift;
 
     if (unlikely(slba + nlb > ns->id_ns.nsze)) {
-        trace_nvme_err_invalid_lba_range(slba, nlb, ns->id_ns.nsze);
+        trace_nvme_dev_err_invalid_lba_range(slba, nlb, ns->id_ns.nsze);
         return NVME_LBA_RANGE | NVME_DNR;
     }
 
@@ -382,11 +382,11 @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeNamespace *ns, NvmeCmd *cmd,
     int is_write = rw->opcode == NVME_CMD_WRITE ? 1 : 0;
     enum BlockAcctType acct = is_write ? BLOCK_ACCT_WRITE : BLOCK_ACCT_READ;
 
-    trace_nvme_rw(is_write ? "write" : "read", nlb, data_size, slba);
+    trace_nvme_dev_rw(is_write ? "write" : "read", nlb, data_size, slba);
 
     if (unlikely((slba + nlb) > ns->id_ns.nsze)) {
         block_acct_invalid(blk_get_stats(n->conf.blk), acct);
-        trace_nvme_err_invalid_lba_range(slba, nlb, ns->id_ns.nsze);
+        trace_nvme_dev_err_invalid_lba_range(slba, nlb, ns->id_ns.nsze);
         return NVME_LBA_RANGE | NVME_DNR;
     }
 
@@ -421,7 +421,7 @@ static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
     uint32_t nsid = le32_to_cpu(cmd->nsid);
 
     if (unlikely(nsid == 0 || nsid > n->num_namespaces)) {
-        trace_nvme_err_invalid_ns(nsid, n->num_namespaces);
+        trace_nvme_dev_err_invalid_ns(nsid, n->num_namespaces);
         return NVME_INVALID_NSID | NVME_DNR;
     }
 
@@ -435,7 +435,7 @@ static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
     case NVME_CMD_READ:
         return nvme_rw(n, ns, cmd, req);
     default:
-        trace_nvme_err_invalid_opc(cmd->opcode);
+        trace_nvme_dev_err_invalid_opc(cmd->opcode);
         return NVME_INVALID_OPCODE | NVME_DNR;
     }
 }
@@ -460,11 +460,11 @@ static uint16_t nvme_del_sq(NvmeCtrl *n, NvmeCmd *cmd)
     uint16_t qid = le16_to_cpu(c->qid);
 
     if (unlikely(!qid || nvme_check_sqid(n, qid))) {
-        trace_nvme_err_invalid_del_sq(qid);
+        trace_nvme_dev_err_invalid_del_sq(qid);
         return NVME_INVALID_QID | NVME_DNR;
     }
 
-    trace_nvme_del_sq(qid);
+    trace_nvme_dev_del_sq(qid);
 
     sq = n->sq[qid];
     while (!QTAILQ_EMPTY(&sq->out_req_list)) {
@@ -528,26 +528,26 @@ static uint16_t nvme_create_sq(NvmeCtrl *n, NvmeCmd *cmd)
     uint16_t qflags = le16_to_cpu(c->sq_flags);
     uint64_t prp1 = le64_to_cpu(c->prp1);
 
-    trace_nvme_create_sq(prp1, sqid, cqid, qsize, qflags);
+    trace_nvme_dev_create_sq(prp1, sqid, cqid, qsize, qflags);
 
     if (unlikely(!cqid || nvme_check_cqid(n, cqid))) {
-        trace_nvme_err_invalid_create_sq_cqid(cqid);
+        trace_nvme_dev_err_invalid_create_sq_cqid(cqid);
         return NVME_INVALID_CQID | NVME_DNR;
     }
     if (unlikely(!sqid || !nvme_check_sqid(n, sqid))) {
-        trace_nvme_err_invalid_create_sq_sqid(sqid);
+        trace_nvme_dev_err_invalid_create_sq_sqid(sqid);
         return NVME_INVALID_QID | NVME_DNR;
     }
     if (unlikely(!qsize || qsize > NVME_CAP_MQES(n->bar.cap))) {
-        trace_nvme_err_invalid_create_sq_size(qsize);
+        trace_nvme_dev_err_invalid_create_sq_size(qsize);
         return NVME_MAX_QSIZE_EXCEEDED | NVME_DNR;
     }
     if (unlikely(!prp1 || prp1 & (n->page_size - 1))) {
-        trace_nvme_err_invalid_create_sq_addr(prp1);
+        trace_nvme_dev_err_invalid_create_sq_addr(prp1);
         return NVME_INVALID_FIELD | NVME_DNR;
     }
     if (unlikely(!(NVME_SQ_FLAGS_PC(qflags)))) {
-        trace_nvme_err_invalid_create_sq_qflags(NVME_SQ_FLAGS_PC(qflags));
+        trace_nvme_dev_err_invalid_create_sq_qflags(NVME_SQ_FLAGS_PC(qflags));
         return NVME_INVALID_FIELD | NVME_DNR;
     }
     sq = g_malloc0(sizeof(*sq));
@@ -573,17 +573,17 @@ static uint16_t nvme_del_cq(NvmeCtrl *n, NvmeCmd *cmd)
     uint16_t qid = le16_to_cpu(c->qid);
 
     if (unlikely(!qid || nvme_check_cqid(n, qid))) {
-        trace_nvme_err_invalid_del_cq_cqid(qid);
+        trace_nvme_dev_err_invalid_del_cq_cqid(qid);
         return NVME_INVALID_CQID | NVME_DNR;
     }
 
     cq = n->cq[qid];
     if (unlikely(!QTAILQ_EMPTY(&cq->sq_list))) {
-        trace_nvme_err_invalid_del_cq_notempty(qid);
+        trace_nvme_dev_err_invalid_del_cq_notempty(qid);
         return NVME_INVALID_QUEUE_DEL;
     }
     nvme_irq_deassert(n, cq);
-    trace_nvme_del_cq(qid);
+    trace_nvme_dev_del_cq(qid);
     nvme_free_cq(cq, n);
     return NVME_SUCCESS;
 }
@@ -616,27 +616,27 @@ static uint16_t nvme_create_cq(NvmeCtrl *n, NvmeCmd *cmd)
     uint16_t qflags = le16_to_cpu(c->cq_flags);
     uint64_t prp1 = le64_to_cpu(c->prp1);
 
-    trace_nvme_create_cq(prp1, cqid, vector, qsize, qflags,
-                         NVME_CQ_FLAGS_IEN(qflags) != 0);
+    trace_nvme_dev_create_cq(prp1, cqid, vector, qsize, qflags,
+                             NVME_CQ_FLAGS_IEN(qflags) != 0);
 
     if (unlikely(!cqid || !nvme_check_cqid(n, cqid))) {
-        trace_nvme_err_invalid_create_cq_cqid(cqid);
+        trace_nvme_dev_err_invalid_create_cq_cqid(cqid);
         return NVME_INVALID_CQID | NVME_DNR;
     }
     if (unlikely(!qsize || qsize > NVME_CAP_MQES(n->bar.cap))) {
-        trace_nvme_err_invalid_create_cq_size(qsize);
+        trace_nvme_dev_err_invalid_create_cq_size(qsize);
         return NVME_MAX_QSIZE_EXCEEDED | NVME_DNR;
     }
     if (unlikely(!prp1)) {
-        trace_nvme_err_invalid_create_cq_addr(prp1);
+        trace_nvme_dev_err_invalid_create_cq_addr(prp1);
         return NVME_INVALID_FIELD | NVME_DNR;
     }
     if (unlikely(vector > n->num_queues)) {
-        trace_nvme_err_invalid_create_cq_vector(vector);
+        trace_nvme_dev_err_invalid_create_cq_vector(vector);
         return NVME_INVALID_IRQ_VECTOR | NVME_DNR;
     }
     if (unlikely(!(NVME_CQ_FLAGS_PC(qflags)))) {
-        trace_nvme_err_invalid_create_cq_qflags(NVME_CQ_FLAGS_PC(qflags));
+        trace_nvme_dev_err_invalid_create_cq_qflags(NVME_CQ_FLAGS_PC(qflags));
         return NVME_INVALID_FIELD | NVME_DNR;
     }
 
@@ -651,7 +651,7 @@ static uint16_t nvme_identify_ctrl(NvmeCtrl *n, NvmeIdentify *c)
     uint64_t prp1 = le64_to_cpu(c->prp1);
     uint64_t prp2 = le64_to_cpu(c->prp2);
 
-    trace_nvme_identify_ctrl();
+    trace_nvme_dev_identify_ctrl();
 
     return nvme_dma_read_prp(n, (uint8_t *)&n->id_ctrl, sizeof(n->id_ctrl),
         prp1, prp2);
@@ -664,10 +664,10 @@ static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeIdentify *c)
     uint64_t prp1 = le64_to_cpu(c->prp1);
     uint64_t prp2 = le64_to_cpu(c->prp2);
 
-    trace_nvme_identify_ns(nsid);
+    trace_nvme_dev_identify_ns(nsid);
 
     if (unlikely(nsid == 0 || nsid > n->num_namespaces)) {
-        trace_nvme_err_invalid_ns(nsid, n->num_namespaces);
+        trace_nvme_dev_err_invalid_ns(nsid, n->num_namespaces);
         return NVME_INVALID_NSID | NVME_DNR;
     }
 
@@ -687,7 +687,7 @@ static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeIdentify *c)
     uint16_t ret;
     int i, j = 0;
 
-    trace_nvme_identify_nslist(min_nsid);
+    trace_nvme_dev_identify_nslist(min_nsid);
 
     list = g_malloc0(data_len);
     for (i = 0; i < n->num_namespaces; i++) {
@@ -716,14 +716,14 @@ static uint16_t nvme_identify(NvmeCtrl *n, NvmeCmd *cmd)
     case 0x02:
         return nvme_identify_nslist(n, c);
     default:
-        trace_nvme_err_invalid_identify_cns(le32_to_cpu(c->cns));
+        trace_nvme_dev_err_invalid_identify_cns(le32_to_cpu(c->cns));
         return NVME_INVALID_FIELD | NVME_DNR;
     }
 }
 
 static inline void nvme_set_timestamp(NvmeCtrl *n, uint64_t ts)
 {
-    trace_nvme_setfeat_timestamp(ts);
+    trace_nvme_dev_setfeat_timestamp(ts);
 
     n->host_timestamp = le64_to_cpu(ts);
     n->timestamp_set_qemu_clock_ms = qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL);
@@ -756,7 +756,7 @@ static inline uint64_t nvme_get_timestamp(const NvmeCtrl *n)
     /* If the host timestamp is non-zero, set the timestamp origin */
     ts.origin = n->host_timestamp ? 0x01 : 0x00;
 
-    trace_nvme_getfeat_timestamp(ts.all);
+    trace_nvme_dev_getfeat_timestamp(ts.all);
 
     return cpu_to_le64(ts.all);
 }
@@ -780,17 +780,17 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
     switch (dw10) {
     case NVME_VOLATILE_WRITE_CACHE:
         result = blk_enable_write_cache(n->conf.blk);
-        trace_nvme_getfeat_vwcache(result ? "enabled" : "disabled");
+        trace_nvme_dev_getfeat_vwcache(result ? "enabled" : "disabled");
         break;
     case NVME_NUMBER_OF_QUEUES:
         result = cpu_to_le32((n->num_queues - 2) | ((n->num_queues - 2) << 16));
-        trace_nvme_getfeat_numq(result);
+        trace_nvme_dev_getfeat_numq(result);
         break;
     case NVME_TIMESTAMP:
         return nvme_get_feature_timestamp(n, cmd);
         break;
     default:
-        trace_nvme_err_invalid_getfeat(dw10);
+        trace_nvme_dev_err_invalid_getfeat(dw10);
         return NVME_INVALID_FIELD | NVME_DNR;
     }
 
@@ -826,9 +826,9 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
         blk_set_enable_write_cache(n->conf.blk, dw11 & 1);
         break;
     case NVME_NUMBER_OF_QUEUES:
-        trace_nvme_setfeat_numq((dw11 & 0xFFFF) + 1,
-                                ((dw11 >> 16) & 0xFFFF) + 1,
-                                n->num_queues - 1, n->num_queues - 1);
+        trace_nvme_dev_setfeat_numq((dw11 & 0xFFFF) + 1,
+                                    ((dw11 >> 16) & 0xFFFF) + 1,
+                                    n->num_queues - 1, n->num_queues - 1);
         req->cqe.result =
             cpu_to_le32((n->num_queues - 2) | ((n->num_queues - 2) << 16));
         break;
@@ -838,7 +838,7 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
         break;
 
     default:
-        trace_nvme_err_invalid_setfeat(dw10);
+        trace_nvme_dev_err_invalid_setfeat(dw10);
         return NVME_INVALID_FIELD | NVME_DNR;
     }
     return NVME_SUCCESS;
@@ -862,7 +862,7 @@ static uint16_t nvme_admin_cmd(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
     case NVME_ADM_CMD_GET_FEATURES:
         return nvme_get_feature(n, cmd, req);
     default:
-        trace_nvme_err_invalid_admin_opc(cmd->opcode);
+        trace_nvme_dev_err_invalid_admin_opc(cmd->opcode);
         return NVME_INVALID_OPCODE | NVME_DNR;
     }
 }
@@ -925,77 +925,77 @@ static int nvme_start_ctrl(NvmeCtrl *n)
     uint32_t page_size = 1 << page_bits;
 
     if (unlikely(n->cq[0])) {
-        trace_nvme_err_startfail_cq();
+        trace_nvme_dev_err_startfail_cq();
         return -1;
     }
     if (unlikely(n->sq[0])) {
-        trace_nvme_err_startfail_sq();
+        trace_nvme_dev_err_startfail_sq();
         return -1;
     }
     if (unlikely(!n->bar.asq)) {
-        trace_nvme_err_startfail_nbarasq();
+        trace_nvme_dev_err_startfail_nbarasq();
         return -1;
     }
     if (unlikely(!n->bar.acq)) {
-        trace_nvme_err_startfail_nbaracq();
+        trace_nvme_dev_err_startfail_nbaracq();
         return -1;
     }
     if (unlikely(n->bar.asq & (page_size - 1))) {
-        trace_nvme_err_startfail_asq_misaligned(n->bar.asq);
+        trace_nvme_dev_err_startfail_asq_misaligned(n->bar.asq);
         return -1;
     }
     if (unlikely(n->bar.acq & (page_size - 1))) {
-        trace_nvme_err_startfail_acq_misaligned(n->bar.acq);
+        trace_nvme_dev_err_startfail_acq_misaligned(n->bar.acq);
         return -1;
     }
     if (unlikely(NVME_CC_MPS(n->bar.cc) <
                  NVME_CAP_MPSMIN(n->bar.cap))) {
-        trace_nvme_err_startfail_page_too_small(
+        trace_nvme_dev_err_startfail_page_too_small(
                     NVME_CC_MPS(n->bar.cc),
                     NVME_CAP_MPSMIN(n->bar.cap));
         return -1;
     }
     if (unlikely(NVME_CC_MPS(n->bar.cc) >
                  NVME_CAP_MPSMAX(n->bar.cap))) {
-        trace_nvme_err_startfail_page_too_large(
+        trace_nvme_dev_err_startfail_page_too_large(
                     NVME_CC_MPS(n->bar.cc),
                     NVME_CAP_MPSMAX(n->bar.cap));
         return -1;
     }
     if (unlikely(NVME_CC_IOCQES(n->bar.cc) <
                  NVME_CTRL_CQES_MIN(n->id_ctrl.cqes))) {
-        trace_nvme_err_startfail_cqent_too_small(
+        trace_nvme_dev_err_startfail_cqent_too_small(
                     NVME_CC_IOCQES(n->bar.cc),
                     NVME_CTRL_CQES_MIN(n->bar.cap));
         return -1;
     }
     if (unlikely(NVME_CC_IOCQES(n->bar.cc) >
                  NVME_CTRL_CQES_MAX(n->id_ctrl.cqes))) {
-        trace_nvme_err_startfail_cqent_too_large(
+        trace_nvme_dev_err_startfail_cqent_too_large(
                     NVME_CC_IOCQES(n->bar.cc),
                     NVME_CTRL_CQES_MAX(n->bar.cap));
         return -1;
     }
     if (unlikely(NVME_CC_IOSQES(n->bar.cc) <
                  NVME_CTRL_SQES_MIN(n->id_ctrl.sqes))) {
-        trace_nvme_err_startfail_sqent_too_small(
+        trace_nvme_dev_err_startfail_sqent_too_small(
                     NVME_CC_IOSQES(n->bar.cc),
                     NVME_CTRL_SQES_MIN(n->bar.cap));
         return -1;
     }
     if (unlikely(NVME_CC_IOSQES(n->bar.cc) >
                  NVME_CTRL_SQES_MAX(n->id_ctrl.sqes))) {
-        trace_nvme_err_startfail_sqent_too_large(
+        trace_nvme_dev_err_startfail_sqent_too_large(
                     NVME_CC_IOSQES(n->bar.cc),
                     NVME_CTRL_SQES_MAX(n->bar.cap));
         return -1;
     }
     if (unlikely(!NVME_AQA_ASQS(n->bar.aqa))) {
-        trace_nvme_err_startfail_asqent_sz_zero();
+        trace_nvme_dev_err_startfail_asqent_sz_zero();
         return -1;
     }
     if (unlikely(!NVME_AQA_ACQS(n->bar.aqa))) {
-        trace_nvme_err_startfail_acqent_sz_zero();
+        trace_nvme_dev_err_startfail_acqent_sz_zero();
         return -1;
     }
 
@@ -1018,14 +1018,14 @@ static void nvme_write_bar(NvmeCtrl *n, hwaddr offset, uint64_t data,
     unsigned size)
 {
     if (unlikely(offset & (sizeof(uint32_t) - 1))) {
-        NVME_GUEST_ERR(nvme_ub_mmiowr_misaligned32,
+        NVME_GUEST_ERR(nvme_dev_ub_mmiowr_misaligned32,
                        "MMIO write not 32-bit aligned,"
                        " offset=0x%"PRIx64"", offset);
         /* should be ignored, fall through for now */
     }
 
     if (unlikely(size < sizeof(uint32_t))) {
-        NVME_GUEST_ERR(nvme_ub_mmiowr_toosmall,
+        NVME_GUEST_ERR(nvme_dev_ub_mmiowr_toosmall,
                        "MMIO write smaller than 32-bits,"
                        " offset=0x%"PRIx64", size=%u",
                        offset, size);
@@ -1035,32 +1035,30 @@ static void nvme_write_bar(NvmeCtrl *n, hwaddr offset, uint64_t data,
     switch (offset) {
     case 0xc:   /* INTMS */
         if (unlikely(msix_enabled(&(n->parent_obj)))) {
-            NVME_GUEST_ERR(nvme_ub_mmiowr_intmask_with_msix,
+            NVME_GUEST_ERR(nvme_dev_ub_mmiowr_intmask_with_msix,
                            "undefined access to interrupt mask set"
                            " when MSI-X is enabled");
             /* should be ignored, fall through for now */
         }
         n->bar.intms |= data & 0xffffffff;
         n->bar.intmc = n->bar.intms;
-        trace_nvme_mmio_intm_set(data & 0xffffffff,
-                                 n->bar.intmc);
+        trace_nvme_dev_mmio_intm_set(data & 0xffffffff, n->bar.intmc);
         nvme_irq_check(n);
         break;
     case 0x10:  /* INTMC */
         if (unlikely(msix_enabled(&(n->parent_obj)))) {
-            NVME_GUEST_ERR(nvme_ub_mmiowr_intmask_with_msix,
+            NVME_GUEST_ERR(nvme_dev_ub_mmiowr_intmask_with_msix,
                            "undefined access to interrupt mask clr"
                            " when MSI-X is enabled");
             /* should be ignored, fall through for now */
         }
         n->bar.intms &= ~(data & 0xffffffff);
         n->bar.intmc = n->bar.intms;
-        trace_nvme_mmio_intm_clr(data & 0xffffffff,
-                                 n->bar.intmc);
+        trace_nvme_dev_mmio_intm_clr(data & 0xffffffff, n->bar.intmc);
         nvme_irq_check(n);
         break;
     case 0x14:  /* CC */
-        trace_nvme_mmio_cfg(data & 0xffffffff);
+        trace_nvme_dev_mmio_cfg(data & 0xffffffff);
         /* Windows first sends data, then sends enable bit */
         if (!NVME_CC_EN(data) && !NVME_CC_EN(n->bar.cc) &&
             !NVME_CC_SHN(data) && !NVME_CC_SHN(n->bar.cc))
@@ -1071,42 +1069,42 @@ static void nvme_write_bar(NvmeCtrl *n, hwaddr offset, uint64_t data,
         if (NVME_CC_EN(data) && !NVME_CC_EN(n->bar.cc)) {
             n->bar.cc = data;
             if (unlikely(nvme_start_ctrl(n))) {
-                trace_nvme_err_startfail();
+                trace_nvme_dev_err_startfail();
                 n->bar.csts = NVME_CSTS_FAILED;
             } else {
-                trace_nvme_mmio_start_success();
+                trace_nvme_dev_mmio_start_success();
                 n->bar.csts = NVME_CSTS_READY;
             }
         } else if (!NVME_CC_EN(data) && NVME_CC_EN(n->bar.cc)) {
-            trace_nvme_mmio_stopped();
+            trace_nvme_dev_mmio_stopped();
             nvme_clear_ctrl(n);
             n->bar.csts &= ~NVME_CSTS_READY;
         }
         if (NVME_CC_SHN(data) && !(NVME_CC_SHN(n->bar.cc))) {
-            trace_nvme_mmio_shutdown_set();
+            trace_nvme_dev_mmio_shutdown_set();
             nvme_clear_ctrl(n);
             n->bar.cc = data;
             n->bar.csts |= NVME_CSTS_SHST_COMPLETE;
         } else if (!NVME_CC_SHN(data) && NVME_CC_SHN(n->bar.cc)) {
-            trace_nvme_mmio_shutdown_cleared();
+            trace_nvme_dev_mmio_shutdown_cleared();
             n->bar.csts &= ~NVME_CSTS_SHST_COMPLETE;
             n->bar.cc = data;
         }
         break;
     case 0x1C:  /* CSTS */
         if (data & (1 << 4)) {
-            NVME_GUEST_ERR(nvme_ub_mmiowr_ssreset_w1c_unsupported,
+            NVME_GUEST_ERR(nvme_dev_ub_mmiowr_ssreset_w1c_unsupported,
                            "attempted to W1C CSTS.NSSRO"
                            " but CAP.NSSRS is zero (not supported)");
         } else if (data != 0) {
-            NVME_GUEST_ERR(nvme_ub_mmiowr_ro_csts,
+            NVME_GUEST_ERR(nvme_dev_ub_mmiowr_ro_csts,
                            "attempted to set a read only bit"
                            " of controller status");
         }
         break;
     case 0x20:  /* NSSR */
         if (data == 0x4E564D65) {
-            trace_nvme_ub_mmiowr_ssreset_unsupported();
+            trace_nvme_dev_ub_mmiowr_ssreset_unsupported();
         } else {
             /* The spec says that writes of other values have no effect */
             return;
@@ -1114,35 +1112,35 @@ static void nvme_write_bar(NvmeCtrl *n, hwaddr offset, uint64_t data,
         break;
     case 0x24:  /* AQA */
         n->bar.aqa = data & 0xffffffff;
-        trace_nvme_mmio_aqattr(data & 0xffffffff);
+        trace_nvme_dev_mmio_aqattr(data & 0xffffffff);
         break;
     case 0x28:  /* ASQ */
         n->bar.asq = data;
-        trace_nvme_mmio_asqaddr(data);
+        trace_nvme_dev_mmio_asqaddr(data);
         break;
     case 0x2c:  /* ASQ hi */
         n->bar.asq |= data << 32;
-        trace_nvme_mmio_asqaddr_hi(data, n->bar.asq);
+        trace_nvme_dev_mmio_asqaddr_hi(data, n->bar.asq);
         break;
     case 0x30:  /* ACQ */
-        trace_nvme_mmio_acqaddr(data);
+        trace_nvme_dev_mmio_acqaddr(data);
         n->bar.acq = data;
         break;
     case 0x34:  /* ACQ hi */
         n->bar.acq |= data << 32;
-        trace_nvme_mmio_acqaddr_hi(data, n->bar.acq);
+        trace_nvme_dev_mmio_acqaddr_hi(data, n->bar.acq);
         break;
     case 0x38:  /* CMBLOC */
-        NVME_GUEST_ERR(nvme_ub_mmiowr_cmbloc_reserved,
+        NVME_GUEST_ERR(nvme_dev_ub_mmiowr_cmbloc_reserved,
                        "invalid write to reserved CMBLOC"
                        " when CMBSZ is zero, ignored");
         return;
     case 0x3C:  /* CMBSZ */
-        NVME_GUEST_ERR(nvme_ub_mmiowr_cmbsz_readonly,
+        NVME_GUEST_ERR(nvme_dev_ub_mmiowr_cmbsz_readonly,
                        "invalid write to read only CMBSZ, ignored");
         return;
     default:
-        NVME_GUEST_ERR(nvme_ub_mmiowr_invalid,
+        NVME_GUEST_ERR(nvme_dev_ub_mmiowr_invalid,
                        "invalid MMIO write,"
                        " offset=0x%"PRIx64", data=%"PRIx64"",
                        offset, data);
@@ -1157,12 +1155,12 @@ static uint64_t nvme_mmio_read(void *opaque, hwaddr addr, unsigned size)
     uint64_t val = 0;
 
     if (unlikely(addr & (sizeof(uint32_t) - 1))) {
-        NVME_GUEST_ERR(nvme_ub_mmiord_misaligned32,
+        NVME_GUEST_ERR(nvme_dev_ub_mmiord_misaligned32,
                        "MMIO read not 32-bit aligned,"
                        " offset=0x%"PRIx64"", addr);
         /* should RAZ, fall through for now */
     } else if (unlikely(size < sizeof(uint32_t))) {
-        NVME_GUEST_ERR(nvme_ub_mmiord_toosmall,
+        NVME_GUEST_ERR(nvme_dev_ub_mmiord_toosmall,
                        "MMIO read smaller than 32-bits,"
                        " offset=0x%"PRIx64"", addr);
         /* should RAZ, fall through for now */
@@ -1171,7 +1169,7 @@ static uint64_t nvme_mmio_read(void *opaque, hwaddr addr, unsigned size)
     if (addr < sizeof(n->bar)) {
         memcpy(&val, ptr + addr, size);
     } else {
-        NVME_GUEST_ERR(nvme_ub_mmiord_invalid_ofs,
+        NVME_GUEST_ERR(nvme_dev_ub_mmiord_invalid_ofs,
                        "MMIO read beyond last register,"
                        " offset=0x%"PRIx64", returning 0", addr);
     }
@@ -1184,7 +1182,7 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
     uint32_t qid;
 
     if (unlikely(addr & ((1 << 2) - 1))) {
-        NVME_GUEST_ERR(nvme_ub_db_wr_misaligned,
+        NVME_GUEST_ERR(nvme_dev_ub_db_wr_misaligned,
                        "doorbell write not 32-bit aligned,"
                        " offset=0x%"PRIx64", ignoring", addr);
         return;
@@ -1199,7 +1197,7 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
 
         qid = (addr - (0x1000 + (1 << 2))) >> 3;
         if (unlikely(nvme_check_cqid(n, qid))) {
-            NVME_GUEST_ERR(nvme_ub_db_wr_invalid_cq,
+            NVME_GUEST_ERR(nvme_dev_ub_db_wr_invalid_cq,
                            "completion queue doorbell write"
                            " for nonexistent queue,"
                            " sqid=%"PRIu32", ignoring", qid);
@@ -1208,7 +1206,7 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
 
         cq = n->cq[qid];
         if (unlikely(new_head >= cq->size)) {
-            NVME_GUEST_ERR(nvme_ub_db_wr_invalid_cqhead,
+            NVME_GUEST_ERR(nvme_dev_ub_db_wr_invalid_cqhead,
                            "completion queue doorbell write value"
                            " beyond queue size, sqid=%"PRIu32","
                            " new_head=%"PRIu16", ignoring",
@@ -1237,7 +1235,7 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
 
         qid = (addr - 0x1000) >> 3;
         if (unlikely(nvme_check_sqid(n, qid))) {
-            NVME_GUEST_ERR(nvme_ub_db_wr_invalid_sq,
+            NVME_GUEST_ERR(nvme_dev_ub_db_wr_invalid_sq,
                            "submission queue doorbell write"
                            " for nonexistent queue,"
                            " sqid=%"PRIu32", ignoring", qid);
@@ -1246,7 +1244,7 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
 
         sq = n->sq[qid];
         if (unlikely(new_tail >= sq->size)) {
-            NVME_GUEST_ERR(nvme_ub_db_wr_invalid_sqtail,
+            NVME_GUEST_ERR(nvme_dev_ub_db_wr_invalid_sqtail,
                            "submission queue doorbell write value"
                            " beyond queue size, sqid=%"PRIu32","
                            " new_tail=%"PRIu16", ignoring",
diff --git a/hw/block/trace-events b/hw/block/trace-events
index bf6d11b58b85..75b0c7a0cb60 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -29,96 +29,96 @@ hd_geometry_guess(void *blk, uint32_t cyls, uint32_t heads, uint32_t secs, int t
 
 # nvme.c
 # nvme traces for successful events
-nvme_irq_msix(uint32_t vector) "raising MSI-X IRQ vector %u"
-nvme_irq_pin(void) "pulsing IRQ pin"
-nvme_irq_masked(void) "IRQ is masked"
-nvme_dma_read(uint64_t prp1, uint64_t prp2) "DMA read, prp1=0x%"PRIx64" prp2=0x%"PRIx64""
-nvme_rw(const char *verb, uint32_t blk_count, uint64_t byte_count, uint64_t lba) "%s %"PRIu32" blocks (%"PRIu64" bytes) from LBA %"PRIu64""
-nvme_create_sq(uint64_t addr, uint16_t sqid, uint16_t cqid, uint16_t qsize, uint16_t qflags) "create submission queue, addr=0x%"PRIx64", sqid=%"PRIu16", cqid=%"PRIu16", qsize=%"PRIu16", qflags=%"PRIu16""
-nvme_create_cq(uint64_t addr, uint16_t cqid, uint16_t vector, uint16_t size, uint16_t qflags, int ien) "create completion queue, addr=0x%"PRIx64", cqid=%"PRIu16", vector=%"PRIu16", qsize=%"PRIu16", qflags=%"PRIu16", ien=%d"
-nvme_del_sq(uint16_t qid) "deleting submission queue sqid=%"PRIu16""
-nvme_del_cq(uint16_t cqid) "deleted completion queue, cqid=%"PRIu16""
-nvme_identify_ctrl(void) "identify controller"
-nvme_identify_ns(uint16_t ns) "identify namespace, nsid=%"PRIu16""
-nvme_identify_nslist(uint16_t ns) "identify namespace list, nsid=%"PRIu16""
-nvme_getfeat_vwcache(const char* result) "get feature volatile write cache, result=%s"
-nvme_getfeat_numq(int result) "get feature number of queues, result=%d"
-nvme_setfeat_numq(int reqcq, int reqsq, int gotcq, int gotsq) "requested cq_count=%d sq_count=%d, responding with cq_count=%d sq_count=%d"
-nvme_setfeat_timestamp(uint64_t ts) "set feature timestamp = 0x%"PRIx64""
-nvme_getfeat_timestamp(uint64_t ts) "get feature timestamp = 0x%"PRIx64""
-nvme_mmio_intm_set(uint64_t data, uint64_t new_mask) "wrote MMIO, interrupt mask set, data=0x%"PRIx64", new_mask=0x%"PRIx64""
-nvme_mmio_intm_clr(uint64_t data, uint64_t new_mask) "wrote MMIO, interrupt mask clr, data=0x%"PRIx64", new_mask=0x%"PRIx64""
-nvme_mmio_cfg(uint64_t data) "wrote MMIO, config controller config=0x%"PRIx64""
-nvme_mmio_aqattr(uint64_t data) "wrote MMIO, admin queue attributes=0x%"PRIx64""
-nvme_mmio_asqaddr(uint64_t data) "wrote MMIO, admin submission queue address=0x%"PRIx64""
-nvme_mmio_acqaddr(uint64_t data) "wrote MMIO, admin completion queue address=0x%"PRIx64""
-nvme_mmio_asqaddr_hi(uint64_t data, uint64_t new_addr) "wrote MMIO, admin submission queue high half=0x%"PRIx64", new_address=0x%"PRIx64""
-nvme_mmio_acqaddr_hi(uint64_t data, uint64_t new_addr) "wrote MMIO, admin completion queue high half=0x%"PRIx64", new_address=0x%"PRIx64""
-nvme_mmio_start_success(void) "setting controller enable bit succeeded"
-nvme_mmio_stopped(void) "cleared controller enable bit"
-nvme_mmio_shutdown_set(void) "shutdown bit set"
-nvme_mmio_shutdown_cleared(void) "shutdown bit cleared"
+nvme_dev_irq_msix(uint32_t vector) "raising MSI-X IRQ vector %u"
+nvme_dev_irq_pin(void) "pulsing IRQ pin"
+nvme_dev_irq_masked(void) "IRQ is masked"
+nvme_dev_dma_read(uint64_t prp1, uint64_t prp2) "DMA read, prp1=0x%"PRIx64" prp2=0x%"PRIx64""
+nvme_dev_rw(const char *verb, uint32_t blk_count, uint64_t byte_count, uint64_t lba) "%s %"PRIu32" blocks (%"PRIu64" bytes) from LBA %"PRIu64""
+nvme_dev_create_sq(uint64_t addr, uint16_t sqid, uint16_t cqid, uint16_t qsize, uint16_t qflags) "create submission queue, addr=0x%"PRIx64", sqid=%"PRIu16", cqid=%"PRIu16", qsize=%"PRIu16", qflags=%"PRIu16""
+nvme_dev_create_cq(uint64_t addr, uint16_t cqid, uint16_t vector, uint16_t size, uint16_t qflags, int ien) "create completion queue, addr=0x%"PRIx64", cqid=%"PRIu16", vector=%"PRIu16", qsize=%"PRIu16", qflags=%"PRIu16", ien=%d"
+nvme_dev_del_sq(uint16_t qid) "deleting submission queue sqid=%"PRIu16""
+nvme_dev_del_cq(uint16_t cqid) "deleted completion queue, cqid=%"PRIu16""
+nvme_dev_identify_ctrl(void) "identify controller"
+nvme_dev_identify_ns(uint16_t ns) "identify namespace, nsid=%"PRIu16""
+nvme_dev_identify_nslist(uint16_t ns) "identify namespace list, nsid=%"PRIu16""
+nvme_dev_getfeat_vwcache(const char* result) "get feature volatile write cache, result=%s"
+nvme_dev_getfeat_numq(int result) "get feature number of queues, result=%d"
+nvme_dev_setfeat_numq(int reqcq, int reqsq, int gotcq, int gotsq) "requested cq_count=%d sq_count=%d, responding with cq_count=%d sq_count=%d"
+nvme_dev_setfeat_timestamp(uint64_t ts) "set feature timestamp = 0x%"PRIx64""
+nvme_dev_getfeat_timestamp(uint64_t ts) "get feature timestamp = 0x%"PRIx64""
+nvme_dev_mmio_intm_set(uint64_t data, uint64_t new_mask) "wrote MMIO, interrupt mask set, data=0x%"PRIx64", new_mask=0x%"PRIx64""
+nvme_dev_mmio_intm_clr(uint64_t data, uint64_t new_mask) "wrote MMIO, interrupt mask clr, data=0x%"PRIx64", new_mask=0x%"PRIx64""
+nvme_dev_mmio_cfg(uint64_t data) "wrote MMIO, config controller config=0x%"PRIx64""
+nvme_dev_mmio_aqattr(uint64_t data) "wrote MMIO, admin queue attributes=0x%"PRIx64""
+nvme_dev_mmio_asqaddr(uint64_t data) "wrote MMIO, admin submission queue address=0x%"PRIx64""
+nvme_dev_mmio_acqaddr(uint64_t data) "wrote MMIO, admin completion queue address=0x%"PRIx64""
+nvme_dev_mmio_asqaddr_hi(uint64_t data, uint64_t new_addr) "wrote MMIO, admin submission queue high half=0x%"PRIx64", new_address=0x%"PRIx64""
+nvme_dev_mmio_acqaddr_hi(uint64_t data, uint64_t new_addr) "wrote MMIO, admin completion queue high half=0x%"PRIx64", new_address=0x%"PRIx64""
+nvme_dev_mmio_start_success(void) "setting controller enable bit succeeded"
+nvme_dev_mmio_stopped(void) "cleared controller enable bit"
+nvme_dev_mmio_shutdown_set(void) "shutdown bit set"
+nvme_dev_mmio_shutdown_cleared(void) "shutdown bit cleared"
 
 # nvme traces for error conditions
-nvme_err_invalid_dma(void) "PRP/SGL is too small for transfer size"
-nvme_err_invalid_prplist_ent(uint64_t prplist) "PRP list entry is null or not page aligned: 0x%"PRIx64""
-nvme_err_invalid_prp2_align(uint64_t prp2) "PRP2 is not page aligned: 0x%"PRIx64""
-nvme_err_invalid_prp2_missing(void) "PRP2 is null and more data to be transferred"
-nvme_err_invalid_prp(void) "invalid PRP"
-nvme_err_invalid_ns(uint32_t ns, uint32_t limit) "invalid namespace %u not within 1-%u"
-nvme_err_invalid_opc(uint8_t opc) "invalid opcode 0x%"PRIx8""
-nvme_err_invalid_admin_opc(uint8_t opc) "invalid admin opcode 0x%"PRIx8""
-nvme_err_invalid_lba_range(uint64_t start, uint64_t len, uint64_t limit) "Invalid LBA start=%"PRIu64" len=%"PRIu64" limit=%"PRIu64""
-nvme_err_invalid_del_sq(uint16_t qid) "invalid submission queue deletion, sid=%"PRIu16""
-nvme_err_invalid_create_sq_cqid(uint16_t cqid) "failed creating submission queue, invalid cqid=%"PRIu16""
-nvme_err_invalid_create_sq_sqid(uint16_t sqid) "failed creating submission queue, invalid sqid=%"PRIu16""
-nvme_err_invalid_create_sq_size(uint16_t qsize) "failed creating submission queue, invalid qsize=%"PRIu16""
-nvme_err_invalid_create_sq_addr(uint64_t addr) "failed creating submission queue, addr=0x%"PRIx64""
-nvme_err_invalid_create_sq_qflags(uint16_t qflags) "failed creating submission queue, qflags=%"PRIu16""
-nvme_err_invalid_del_cq_cqid(uint16_t cqid) "failed deleting completion queue, cqid=%"PRIu16""
-nvme_err_invalid_del_cq_notempty(uint16_t cqid) "failed deleting completion queue, it is not empty, cqid=%"PRIu16""
-nvme_err_invalid_create_cq_cqid(uint16_t cqid) "failed creating completion queue, cqid=%"PRIu16""
-nvme_err_invalid_create_cq_size(uint16_t size) "failed creating completion queue, size=%"PRIu16""
-nvme_err_invalid_create_cq_addr(uint64_t addr) "failed creating completion queue, addr=0x%"PRIx64""
-nvme_err_invalid_create_cq_vector(uint16_t vector) "failed creating completion queue, vector=%"PRIu16""
-nvme_err_invalid_create_cq_qflags(uint16_t qflags) "failed creating completion queue, qflags=%"PRIu16""
-nvme_err_invalid_identify_cns(uint16_t cns) "identify, invalid cns=0x%"PRIx16""
-nvme_err_invalid_getfeat(int dw10) "invalid get features, dw10=0x%"PRIx32""
-nvme_err_invalid_setfeat(uint32_t dw10) "invalid set features, dw10=0x%"PRIx32""
-nvme_err_startfail_cq(void) "nvme_start_ctrl failed because there are non-admin completion queues"
-nvme_err_startfail_sq(void) "nvme_start_ctrl failed because there are non-admin submission queues"
-nvme_err_startfail_nbarasq(void) "nvme_start_ctrl failed because the admin submission queue address is null"
-nvme_err_startfail_nbaracq(void) "nvme_start_ctrl failed because the admin completion queue address is null"
-nvme_err_startfail_asq_misaligned(uint64_t addr) "nvme_start_ctrl failed because the admin submission queue address is misaligned: 0x%"PRIx64""
-nvme_err_startfail_acq_misaligned(uint64_t addr) "nvme_start_ctrl failed because the admin completion queue address is misaligned: 0x%"PRIx64""
-nvme_err_startfail_page_too_small(uint8_t log2ps, uint8_t maxlog2ps) "nvme_start_ctrl failed because the page size is too small: log2size=%u, min=%u"
-nvme_err_startfail_page_too_large(uint8_t log2ps, uint8_t maxlog2ps) "nvme_start_ctrl failed because the page size is too large: log2size=%u, max=%u"
-nvme_err_startfail_cqent_too_small(uint8_t log2ps, uint8_t maxlog2ps) "nvme_start_ctrl failed because the completion queue entry size is too small: log2size=%u, min=%u"
-nvme_err_startfail_cqent_too_large(uint8_t log2ps, uint8_t maxlog2ps) "nvme_start_ctrl failed because the completion queue entry size is too large: log2size=%u, max=%u"
-nvme_err_startfail_sqent_too_small(uint8_t log2ps, uint8_t maxlog2ps) "nvme_start_ctrl failed because the submission queue entry size is too small: log2size=%u, min=%u"
-nvme_err_startfail_sqent_too_large(uint8_t log2ps, uint8_t maxlog2ps) "nvme_start_ctrl failed because the submission queue entry size is too large: log2size=%u, max=%u"
-nvme_err_startfail_asqent_sz_zero(void) "nvme_start_ctrl failed because the admin submission queue size is zero"
-nvme_err_startfail_acqent_sz_zero(void) "nvme_start_ctrl failed because the admin completion queue size is zero"
-nvme_err_startfail(void) "setting controller enable bit failed"
+nvme_dev_err_invalid_dma(void) "PRP/SGL is too small for transfer size"
+nvme_dev_err_invalid_prplist_ent(uint64_t prplist) "PRP list entry is null or not page aligned: 0x%"PRIx64""
+nvme_dev_err_invalid_prp2_align(uint64_t prp2) "PRP2 is not page aligned: 0x%"PRIx64""
+nvme_dev_err_invalid_prp2_missing(void) "PRP2 is null and more data to be transferred"
+nvme_dev_err_invalid_prp(void) "invalid PRP"
+nvme_dev_err_invalid_ns(uint32_t ns, uint32_t limit) "invalid namespace %u not within 1-%u"
+nvme_dev_err_invalid_opc(uint8_t opc) "invalid opcode 0x%"PRIx8""
+nvme_dev_err_invalid_admin_opc(uint8_t opc) "invalid admin opcode 0x%"PRIx8""
+nvme_dev_err_invalid_lba_range(uint64_t start, uint64_t len, uint64_t limit) "Invalid LBA start=%"PRIu64" len=%"PRIu64" limit=%"PRIu64""
+nvme_dev_err_invalid_del_sq(uint16_t qid) "invalid submission queue deletion, sid=%"PRIu16""
+nvme_dev_err_invalid_create_sq_cqid(uint16_t cqid) "failed creating submission queue, invalid cqid=%"PRIu16""
+nvme_dev_err_invalid_create_sq_sqid(uint16_t sqid) "failed creating submission queue, invalid sqid=%"PRIu16""
+nvme_dev_err_invalid_create_sq_size(uint16_t qsize) "failed creating submission queue, invalid qsize=%"PRIu16""
+nvme_dev_err_invalid_create_sq_addr(uint64_t addr) "failed creating submission queue, addr=0x%"PRIx64""
+nvme_dev_err_invalid_create_sq_qflags(uint16_t qflags) "failed creating submission queue, qflags=%"PRIu16""
+nvme_dev_err_invalid_del_cq_cqid(uint16_t cqid) "failed deleting completion queue, cqid=%"PRIu16""
+nvme_dev_err_invalid_del_cq_notempty(uint16_t cqid) "failed deleting completion queue, it is not empty, cqid=%"PRIu16""
+nvme_dev_err_invalid_create_cq_cqid(uint16_t cqid) "failed creating completion queue, cqid=%"PRIu16""
+nvme_dev_err_invalid_create_cq_size(uint16_t size) "failed creating completion queue, size=%"PRIu16""
+nvme_dev_err_invalid_create_cq_addr(uint64_t addr) "failed creating completion queue, addr=0x%"PRIx64""
+nvme_dev_err_invalid_create_cq_vector(uint16_t vector) "failed creating completion queue, vector=%"PRIu16""
+nvme_dev_err_invalid_create_cq_qflags(uint16_t qflags) "failed creating completion queue, qflags=%"PRIu16""
+nvme_dev_err_invalid_identify_cns(uint16_t cns) "identify, invalid cns=0x%"PRIx16""
+nvme_dev_err_invalid_getfeat(int dw10) "invalid get features, dw10=0x%"PRIx32""
+nvme_dev_err_invalid_setfeat(uint32_t dw10) "invalid set features, dw10=0x%"PRIx32""
+nvme_dev_err_startfail_cq(void) "nvme_start_ctrl failed because there are non-admin completion queues"
+nvme_dev_err_startfail_sq(void) "nvme_start_ctrl failed because there are non-admin submission queues"
+nvme_dev_err_startfail_nbarasq(void) "nvme_start_ctrl failed because the admin submission queue address is null"
+nvme_dev_err_startfail_nbaracq(void) "nvme_start_ctrl failed because the admin completion queue address is null"
+nvme_dev_err_startfail_asq_misaligned(uint64_t addr) "nvme_start_ctrl failed because the admin submission queue address is misaligned: 0x%"PRIx64""
+nvme_dev_err_startfail_acq_misaligned(uint64_t addr) "nvme_start_ctrl failed because the admin completion queue address is misaligned: 0x%"PRIx64""
+nvme_dev_err_startfail_page_too_small(uint8_t log2ps, uint8_t maxlog2ps) "nvme_start_ctrl failed because the page size is too small: log2size=%u, min=%u"
+nvme_dev_err_startfail_page_too_large(uint8_t log2ps, uint8_t maxlog2ps) "nvme_start_ctrl failed because the page size is too large: log2size=%u, max=%u"
+nvme_dev_err_startfail_cqent_too_small(uint8_t log2ps, uint8_t maxlog2ps) "nvme_start_ctrl failed because the completion queue entry size is too small: log2size=%u, min=%u"
+nvme_dev_err_startfail_cqent_too_large(uint8_t log2ps, uint8_t maxlog2ps) "nvme_start_ctrl failed because the completion queue entry size is too large: log2size=%u, max=%u"
+nvme_dev_err_startfail_sqent_too_small(uint8_t log2ps, uint8_t maxlog2ps) "nvme_start_ctrl failed because the submission queue entry size is too small: log2size=%u, min=%u"
+nvme_dev_err_startfail_sqent_too_large(uint8_t log2ps, uint8_t maxlog2ps) "nvme_start_ctrl failed because the submission queue entry size is too large: log2size=%u, max=%u"
+nvme_dev_err_startfail_asqent_sz_zero(void) "nvme_start_ctrl failed because the admin submission queue size is zero"
+nvme_dev_err_startfail_acqent_sz_zero(void) "nvme_start_ctrl failed because the admin completion queue size is zero"
+nvme_dev_err_startfail(void) "setting controller enable bit failed"
 
 # Traces for undefined behavior
-nvme_ub_mmiowr_misaligned32(uint64_t offset) "MMIO write not 32-bit aligned, offset=0x%"PRIx64""
-nvme_ub_mmiowr_toosmall(uint64_t offset, unsigned size) "MMIO write smaller than 32 bits, offset=0x%"PRIx64", size=%u"
-nvme_ub_mmiowr_intmask_with_msix(void) "undefined access to interrupt mask set when MSI-X is enabled"
-nvme_ub_mmiowr_ro_csts(void) "attempted to set a read only bit of controller status"
-nvme_ub_mmiowr_ssreset_w1c_unsupported(void) "attempted to W1C CSTS.NSSRO but CAP.NSSRS is zero (not supported)"
-nvme_ub_mmiowr_ssreset_unsupported(void) "attempted NVM subsystem reset but CAP.NSSRS is zero (not supported)"
-nvme_ub_mmiowr_cmbloc_reserved(void) "invalid write to reserved CMBLOC when CMBSZ is zero, ignored"
-nvme_ub_mmiowr_cmbsz_readonly(void) "invalid write to read only CMBSZ, ignored"
-nvme_ub_mmiowr_invalid(uint64_t offset, uint64_t data) "invalid MMIO write, offset=0x%"PRIx64", data=0x%"PRIx64""
-nvme_ub_mmiord_misaligned32(uint64_t offset) "MMIO read not 32-bit aligned, offset=0x%"PRIx64""
-nvme_ub_mmiord_toosmall(uint64_t offset) "MMIO read smaller than 32-bits, offset=0x%"PRIx64""
-nvme_ub_mmiord_invalid_ofs(uint64_t offset) "MMIO read beyond last register, offset=0x%"PRIx64", returning 0"
-nvme_ub_db_wr_misaligned(uint64_t offset) "doorbell write not 32-bit aligned, offset=0x%"PRIx64", ignoring"
-nvme_ub_db_wr_invalid_cq(uint32_t qid) "completion queue doorbell write for nonexistent queue, cqid=%"PRIu32", ignoring"
-nvme_ub_db_wr_invalid_cqhead(uint32_t qid, uint16_t new_head) "completion queue doorbell write value beyond queue size, cqid=%"PRIu32", new_head=%"PRIu16", ignoring"
-nvme_ub_db_wr_invalid_sq(uint32_t qid) "submission queue doorbell write for nonexistent queue, sqid=%"PRIu32", ignoring"
-nvme_ub_db_wr_invalid_sqtail(uint32_t qid, uint16_t new_tail) "submission queue doorbell write value beyond queue size, sqid=%"PRIu32", new_head=%"PRIu16", ignoring"
+nvme_dev_ub_mmiowr_misaligned32(uint64_t offset) "MMIO write not 32-bit aligned, offset=0x%"PRIx64""
+nvme_dev_ub_mmiowr_toosmall(uint64_t offset, unsigned size) "MMIO write smaller than 32 bits, offset=0x%"PRIx64", size=%u"
+nvme_dev_ub_mmiowr_intmask_with_msix(void) "undefined access to interrupt mask set when MSI-X is enabled"
+nvme_dev_ub_mmiowr_ro_csts(void) "attempted to set a read only bit of controller status"
+nvme_dev_ub_mmiowr_ssreset_w1c_unsupported(void) "attempted to W1C CSTS.NSSRO but CAP.NSSRS is zero (not supported)"
+nvme_dev_ub_mmiowr_ssreset_unsupported(void) "attempted NVM subsystem reset but CAP.NSSRS is zero (not supported)"
+nvme_dev_ub_mmiowr_cmbloc_reserved(void) "invalid write to reserved CMBLOC when CMBSZ is zero, ignored"
+nvme_dev_ub_mmiowr_cmbsz_readonly(void) "invalid write to read only CMBSZ, ignored"
+nvme_dev_ub_mmiowr_invalid(uint64_t offset, uint64_t data) "invalid MMIO write, offset=0x%"PRIx64", data=0x%"PRIx64""
+nvme_dev_ub_mmiord_misaligned32(uint64_t offset) "MMIO read not 32-bit aligned, offset=0x%"PRIx64""
+nvme_dev_ub_mmiord_toosmall(uint64_t offset) "MMIO read smaller than 32-bits, offset=0x%"PRIx64""
+nvme_dev_ub_mmiord_invalid_ofs(uint64_t offset) "MMIO read beyond last register, offset=0x%"PRIx64", returning 0"
+nvme_dev_ub_db_wr_misaligned(uint64_t offset) "doorbell write not 32-bit aligned, offset=0x%"PRIx64", ignoring"
+nvme_dev_ub_db_wr_invalid_cq(uint32_t qid) "completion queue doorbell write for nonexistent queue, cqid=%"PRIu32", ignoring"
+nvme_dev_ub_db_wr_invalid_cqhead(uint32_t qid, uint16_t new_head) "completion queue doorbell write value beyond queue size, cqid=%"PRIu32", new_head=%"PRIu16", ignoring"
+nvme_dev_ub_db_wr_invalid_sq(uint32_t qid) "submission queue doorbell write for nonexistent queue, sqid=%"PRIu32", ignoring"
+nvme_dev_ub_db_wr_invalid_sqtail(uint32_t qid, uint16_t new_tail) "submission queue doorbell write value beyond queue size, sqid=%"PRIu32", new_head=%"PRIu16", ignoring"
 
 # xen-block.c
 xen_block_realize(const char *type, uint32_t disk, uint32_t partition) "%s d%up%u"
-- 
2.26.0




* [PATCH v7 02/48] nvme: remove superfluous breaks
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
  2020-04-15  5:50 ` [PATCH v7 01/48] nvme: rename trace events to nvme_dev Klaus Jensen
@ 2020-04-15  5:50 ` Klaus Jensen
  2020-04-15  6:57   ` Philippe Mathieu-Daudé
  2020-04-15  5:50 ` [PATCH v7 03/48] nvme: move device parameters to separate struct Klaus Jensen
                   ` (47 subsequent siblings)
  49 siblings, 1 reply; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:50 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

These break statements were left over when commit 3036a626e9ef ("nvme:
add Get/Set Feature Timestamp support") was merged.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Acked-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 hw/block/nvme.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 01e18fb9eb1f..da0e8af42823 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -788,7 +788,6 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
         break;
     case NVME_TIMESTAMP:
         return nvme_get_feature_timestamp(n, cmd);
-        break;
     default:
         trace_nvme_dev_err_invalid_getfeat(dw10);
         return NVME_INVALID_FIELD | NVME_DNR;
@@ -832,11 +831,8 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
         req->cqe.result =
             cpu_to_le32((n->num_queues - 2) | ((n->num_queues - 2) << 16));
         break;
-
     case NVME_TIMESTAMP:
         return nvme_set_feature_timestamp(n, cmd);
-        break;
-
     default:
         trace_nvme_dev_err_invalid_setfeat(dw10);
         return NVME_INVALID_FIELD | NVME_DNR;
-- 
2.26.0



^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v7 03/48] nvme: move device parameters to separate struct
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
  2020-04-15  5:50 ` [PATCH v7 01/48] nvme: rename trace events to nvme_dev Klaus Jensen
  2020-04-15  5:50 ` [PATCH v7 02/48] nvme: remove superfluous breaks Klaus Jensen
@ 2020-04-15  5:50 ` Klaus Jensen
  2020-04-15  6:58   ` Philippe Mathieu-Daudé
  2020-04-15  5:50 ` [PATCH v7 04/48] nvme: bump spec data structures to v1.3 Klaus Jensen
                   ` (46 subsequent siblings)
  49 siblings, 1 reply; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:50 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

Move device configuration parameters to a separate struct to make it
explicit what is configurable and what is set internally.
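
As an illustration of the new scheme only (the 'example_flag' parameter
below is hypothetical and not part of this patch), a future configurable
would be added by extending NvmeParams and DEFINE_NVME_PROPERTIES rather
than NvmeCtrl directly:

    #define DEFINE_NVME_PROPERTIES(_state, _props) \
        DEFINE_PROP_STRING("serial", _state, _props.serial), \
        DEFINE_PROP_UINT32("cmb_size_mb", _state, _props.cmb_size_mb, 0), \
        DEFINE_PROP_UINT32("num_queues", _state, _props.num_queues, 64), \
        DEFINE_PROP_BOOL("example_flag", _state, _props.example_flag, false)

    typedef struct NvmeParams {
        char     *serial;
        uint32_t num_queues;
        uint32_t cmb_size_mb;
        bool     example_flag; /* hypothetical example parameter */
    } NvmeParams;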

Signed-off-by: Klaus Jensen <klaus.jensen@cnexlabs.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Acked-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 hw/block/nvme.c | 44 ++++++++++++++++++++++----------------------
 hw/block/nvme.h | 16 +++++++++++++---
 2 files changed, 35 insertions(+), 25 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index da0e8af42823..249f759f076e 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -64,12 +64,12 @@ static void nvme_addr_read(NvmeCtrl *n, hwaddr addr, void *buf, int size)
 
 static int nvme_check_sqid(NvmeCtrl *n, uint16_t sqid)
 {
-    return sqid < n->num_queues && n->sq[sqid] != NULL ? 0 : -1;
+    return sqid < n->params.num_queues && n->sq[sqid] != NULL ? 0 : -1;
 }
 
 static int nvme_check_cqid(NvmeCtrl *n, uint16_t cqid)
 {
-    return cqid < n->num_queues && n->cq[cqid] != NULL ? 0 : -1;
+    return cqid < n->params.num_queues && n->cq[cqid] != NULL ? 0 : -1;
 }
 
 static void nvme_inc_cq_tail(NvmeCQueue *cq)
@@ -631,7 +631,7 @@ static uint16_t nvme_create_cq(NvmeCtrl *n, NvmeCmd *cmd)
         trace_nvme_dev_err_invalid_create_cq_addr(prp1);
         return NVME_INVALID_FIELD | NVME_DNR;
     }
-    if (unlikely(vector > n->num_queues)) {
+    if (unlikely(vector > n->params.num_queues)) {
         trace_nvme_dev_err_invalid_create_cq_vector(vector);
         return NVME_INVALID_IRQ_VECTOR | NVME_DNR;
     }
@@ -783,7 +783,8 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
         trace_nvme_dev_getfeat_vwcache(result ? "enabled" : "disabled");
         break;
     case NVME_NUMBER_OF_QUEUES:
-        result = cpu_to_le32((n->num_queues - 2) | ((n->num_queues - 2) << 16));
+        result = cpu_to_le32((n->params.num_queues - 2) |
+                             ((n->params.num_queues - 2) << 16));
         trace_nvme_dev_getfeat_numq(result);
         break;
     case NVME_TIMESTAMP:
@@ -827,9 +828,10 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
     case NVME_NUMBER_OF_QUEUES:
         trace_nvme_dev_setfeat_numq((dw11 & 0xFFFF) + 1,
                                     ((dw11 >> 16) & 0xFFFF) + 1,
-                                    n->num_queues - 1, n->num_queues - 1);
-        req->cqe.result =
-            cpu_to_le32((n->num_queues - 2) | ((n->num_queues - 2) << 16));
+                                    n->params.num_queues - 1,
+                                    n->params.num_queues - 1);
+        req->cqe.result = cpu_to_le32((n->params.num_queues - 2) |
+                                      ((n->params.num_queues - 2) << 16));
         break;
     case NVME_TIMESTAMP:
         return nvme_set_feature_timestamp(n, cmd);
@@ -900,12 +902,12 @@ static void nvme_clear_ctrl(NvmeCtrl *n)
 
     blk_drain(n->conf.blk);
 
-    for (i = 0; i < n->num_queues; i++) {
+    for (i = 0; i < n->params.num_queues; i++) {
         if (n->sq[i] != NULL) {
             nvme_free_sq(n->sq[i], n);
         }
     }
-    for (i = 0; i < n->num_queues; i++) {
+    for (i = 0; i < n->params.num_queues; i++) {
         if (n->cq[i] != NULL) {
             nvme_free_cq(n->cq[i], n);
         }
@@ -1306,7 +1308,7 @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp)
     int64_t bs_size;
     uint8_t *pci_conf;
 
-    if (!n->num_queues) {
+    if (!n->params.num_queues) {
         error_setg(errp, "num_queues can't be zero");
         return;
     }
@@ -1322,7 +1324,7 @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp)
         return;
     }
 
-    if (!n->serial) {
+    if (!n->params.serial) {
         error_setg(errp, "serial property not set");
         return;
     }
@@ -1339,25 +1341,25 @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp)
     pcie_endpoint_cap_init(pci_dev, 0x80);
 
     n->num_namespaces = 1;
-    n->reg_size = pow2ceil(0x1004 + 2 * (n->num_queues + 1) * 4);
+    n->reg_size = pow2ceil(0x1004 + 2 * (n->params.num_queues + 1) * 4);
     n->ns_size = bs_size / (uint64_t)n->num_namespaces;
 
     n->namespaces = g_new0(NvmeNamespace, n->num_namespaces);
-    n->sq = g_new0(NvmeSQueue *, n->num_queues);
-    n->cq = g_new0(NvmeCQueue *, n->num_queues);
+    n->sq = g_new0(NvmeSQueue *, n->params.num_queues);
+    n->cq = g_new0(NvmeCQueue *, n->params.num_queues);
 
     memory_region_init_io(&n->iomem, OBJECT(n), &nvme_mmio_ops, n,
                           "nvme", n->reg_size);
     pci_register_bar(pci_dev, 0,
         PCI_BASE_ADDRESS_SPACE_MEMORY | PCI_BASE_ADDRESS_MEM_TYPE_64,
         &n->iomem);
-    msix_init_exclusive_bar(pci_dev, n->num_queues, 4, NULL);
+    msix_init_exclusive_bar(pci_dev, n->params.num_queues, 4, NULL);
 
     id->vid = cpu_to_le16(pci_get_word(pci_conf + PCI_VENDOR_ID));
     id->ssvid = cpu_to_le16(pci_get_word(pci_conf + PCI_SUBSYSTEM_VENDOR_ID));
     strpadcpy((char *)id->mn, sizeof(id->mn), "QEMU NVMe Ctrl", ' ');
     strpadcpy((char *)id->fr, sizeof(id->fr), "1.0", ' ');
-    strpadcpy((char *)id->sn, sizeof(id->sn), n->serial, ' ');
+    strpadcpy((char *)id->sn, sizeof(id->sn), n->params.serial, ' ');
     id->rab = 6;
     id->ieee[0] = 0x00;
     id->ieee[1] = 0x02;
@@ -1386,7 +1388,7 @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp)
     n->bar.vs = 0x00010200;
     n->bar.intmc = n->bar.intms = 0;
 
-    if (n->cmb_size_mb) {
+    if (n->params.cmb_size_mb) {
 
         NVME_CMBLOC_SET_BIR(n->bar.cmbloc, 2);
         NVME_CMBLOC_SET_OFST(n->bar.cmbloc, 0);
@@ -1397,7 +1399,7 @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp)
         NVME_CMBSZ_SET_RDS(n->bar.cmbsz, 1);
         NVME_CMBSZ_SET_WDS(n->bar.cmbsz, 1);
         NVME_CMBSZ_SET_SZU(n->bar.cmbsz, 2); /* MBs */
-        NVME_CMBSZ_SET_SZ(n->bar.cmbsz, n->cmb_size_mb);
+        NVME_CMBSZ_SET_SZ(n->bar.cmbsz, n->params.cmb_size_mb);
 
         n->cmbloc = n->bar.cmbloc;
         n->cmbsz = n->bar.cmbsz;
@@ -1436,7 +1438,7 @@ static void nvme_exit(PCIDevice *pci_dev)
     g_free(n->cq);
     g_free(n->sq);
 
-    if (n->cmb_size_mb) {
+    if (n->params.cmb_size_mb) {
         g_free(n->cmbuf);
     }
     msix_uninit_exclusive_bar(pci_dev);
@@ -1444,9 +1446,7 @@ static void nvme_exit(PCIDevice *pci_dev)
 
 static Property nvme_props[] = {
     DEFINE_BLOCK_PROPERTIES(NvmeCtrl, conf),
-    DEFINE_PROP_STRING("serial", NvmeCtrl, serial),
-    DEFINE_PROP_UINT32("cmb_size_mb", NvmeCtrl, cmb_size_mb, 0),
-    DEFINE_PROP_UINT32("num_queues", NvmeCtrl, num_queues, 64),
+    DEFINE_NVME_PROPERTIES(NvmeCtrl, params),
     DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/hw/block/nvme.h b/hw/block/nvme.h
index 557194ee1954..9957c4a200e2 100644
--- a/hw/block/nvme.h
+++ b/hw/block/nvme.h
@@ -1,7 +1,19 @@
 #ifndef HW_NVME_H
 #define HW_NVME_H
+
 #include "block/nvme.h"
 
+#define DEFINE_NVME_PROPERTIES(_state, _props) \
+    DEFINE_PROP_STRING("serial", _state, _props.serial), \
+    DEFINE_PROP_UINT32("cmb_size_mb", _state, _props.cmb_size_mb, 0), \
+    DEFINE_PROP_UINT32("num_queues", _state, _props.num_queues, 64)
+
+typedef struct NvmeParams {
+    char     *serial;
+    uint32_t num_queues;
+    uint32_t cmb_size_mb;
+} NvmeParams;
+
 typedef struct NvmeAsyncEvent {
     QSIMPLEQ_ENTRY(NvmeAsyncEvent) entry;
     NvmeAerResult result;
@@ -63,6 +75,7 @@ typedef struct NvmeCtrl {
     MemoryRegion ctrl_mem;
     NvmeBar      bar;
     BlockConf    conf;
+    NvmeParams   params;
 
     uint32_t    page_size;
     uint16_t    page_bits;
@@ -71,10 +84,8 @@ typedef struct NvmeCtrl {
     uint16_t    sqe_size;
     uint32_t    reg_size;
     uint32_t    num_namespaces;
-    uint32_t    num_queues;
     uint32_t    max_q_ents;
     uint64_t    ns_size;
-    uint32_t    cmb_size_mb;
     uint32_t    cmbsz;
     uint32_t    cmbloc;
     uint8_t     *cmbuf;
@@ -82,7 +93,6 @@ typedef struct NvmeCtrl {
     uint64_t    host_timestamp;                 /* Timestamp sent by the host */
     uint64_t    timestamp_set_qemu_clock_ms;    /* QEMU clock time */
 
-    char            *serial;
     NvmeNamespace   *namespaces;
     NvmeSQueue      **sq;
     NvmeCQueue      **cq;
-- 
2.26.0



^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v7 04/48] nvme: bump spec data structures to v1.3
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (2 preceding siblings ...)
  2020-04-15  5:50 ` [PATCH v7 03/48] nvme: move device parameters to separate struct Klaus Jensen
@ 2020-04-15  5:50 ` Klaus Jensen
  2020-04-15  5:50 ` [PATCH v7 05/48] nvme: use constants in identify Klaus Jensen
                   ` (45 subsequent siblings)
  49 siblings, 0 replies; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:50 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Fam Zheng, Beata Michalska, Klaus Jensen, qemu-devel,
	Max Reitz, Klaus Jensen, Keith Busch, Javier Gonzalez,
	Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

Add missing fields in the Identify Controller and Identify Namespace
data structures to bring them in line with NVMe v1.3.

This also adds data structures and defines for SGL support, which
requires a couple of trivial changes to the nvme block driver as well.
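
As a minimal sketch of how the new NvmeCmdDptr union and PSDT values fit
together (the helper functions and addresses below are illustrative and
not part of this patch), a submission queue entry selects PRP or SGL
addressing through bits 7:6 of the command flags:

    #include "qemu/osdep.h"
    #include "block/nvme.h"

    /* illustrative helpers; not part of the patch */
    static void nvme_cmd_set_prp(NvmeCmd *cmd, uint64_t prp1, uint64_t prp2)
    {
        cmd->flags |= PSDT_PRP << 6;        /* PSDT = 00b: use PRPs */
        cmd->dptr.prp1 = cpu_to_le64(prp1);
        cmd->dptr.prp2 = cpu_to_le64(prp2);
    }

    static void nvme_cmd_set_sgl(NvmeCmd *cmd, uint64_t addr, uint32_t len)
    {
        cmd->flags |= PSDT_SGL_MPTR_CONTIGUOUS << 6;    /* use SGLs */
        cmd->dptr.sgl.addr = cpu_to_le64(addr);
        cmd->dptr.sgl.len  = cpu_to_le32(len);
        cmd->dptr.sgl.type = NVME_SGL_DESCR_TYPE_DATA_BLOCK << 4 |
                             NVME_SGL_DESCR_SUBTYPE_ADDRESS;
    }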

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Acked-by: Fam Zheng <fam@euphon.net>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 block/nvme.c         |  18 ++---
 hw/block/nvme.c      |  12 ++--
 include/block/nvme.h | 162 ++++++++++++++++++++++++++++++++++++++-----
 3 files changed, 160 insertions(+), 32 deletions(-)

diff --git a/block/nvme.c b/block/nvme.c
index 7b7c0cc5d673..7302cc19ade4 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -446,7 +446,7 @@ static void nvme_identify(BlockDriverState *bs, int namespace, Error **errp)
         error_setg(errp, "Cannot map buffer for DMA");
         goto out;
     }
-    cmd.prp1 = cpu_to_le64(iova);
+    cmd.dptr.prp1 = cpu_to_le64(iova);
 
     if (nvme_cmd_sync(bs, s->queues[0], &cmd)) {
         error_setg(errp, "Failed to identify controller");
@@ -545,7 +545,7 @@ static bool nvme_add_io_queue(BlockDriverState *bs, Error **errp)
     }
     cmd = (NvmeCmd) {
         .opcode = NVME_ADM_CMD_CREATE_CQ,
-        .prp1 = cpu_to_le64(q->cq.iova),
+        .dptr.prp1 = cpu_to_le64(q->cq.iova),
         .cdw10 = cpu_to_le32(((queue_size - 1) << 16) | (n & 0xFFFF)),
         .cdw11 = cpu_to_le32(0x3),
     };
@@ -556,7 +556,7 @@ static bool nvme_add_io_queue(BlockDriverState *bs, Error **errp)
     }
     cmd = (NvmeCmd) {
         .opcode = NVME_ADM_CMD_CREATE_SQ,
-        .prp1 = cpu_to_le64(q->sq.iova),
+        .dptr.prp1 = cpu_to_le64(q->sq.iova),
         .cdw10 = cpu_to_le32(((queue_size - 1) << 16) | (n & 0xFFFF)),
         .cdw11 = cpu_to_le32(0x1 | (n << 16)),
     };
@@ -906,16 +906,16 @@ try_map:
     case 0:
         abort();
     case 1:
-        cmd->prp1 = pagelist[0];
-        cmd->prp2 = 0;
+        cmd->dptr.prp1 = pagelist[0];
+        cmd->dptr.prp2 = 0;
         break;
     case 2:
-        cmd->prp1 = pagelist[0];
-        cmd->prp2 = pagelist[1];
+        cmd->dptr.prp1 = pagelist[0];
+        cmd->dptr.prp2 = pagelist[1];
         break;
     default:
-        cmd->prp1 = pagelist[0];
-        cmd->prp2 = cpu_to_le64(req->prp_list_iova + sizeof(uint64_t));
+        cmd->dptr.prp1 = pagelist[0];
+        cmd->dptr.prp2 = cpu_to_le64(req->prp_list_iova + sizeof(uint64_t));
         break;
     }
     trace_nvme_cmd_map_qiov(s, cmd, req, qiov, entries);
diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 249f759f076e..088668f28bae 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -372,8 +372,8 @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeNamespace *ns, NvmeCmd *cmd,
     NvmeRwCmd *rw = (NvmeRwCmd *)cmd;
     uint32_t nlb  = le32_to_cpu(rw->nlb) + 1;
     uint64_t slba = le64_to_cpu(rw->slba);
-    uint64_t prp1 = le64_to_cpu(rw->prp1);
-    uint64_t prp2 = le64_to_cpu(rw->prp2);
+    uint64_t prp1 = le64_to_cpu(rw->dptr.prp1);
+    uint64_t prp2 = le64_to_cpu(rw->dptr.prp2);
 
     uint8_t lba_index  = NVME_ID_NS_FLBAS_INDEX(ns->id_ns.flbas);
     uint8_t data_shift = ns->id_ns.lbaf[lba_index].ds;
@@ -763,8 +763,8 @@ static inline uint64_t nvme_get_timestamp(const NvmeCtrl *n)
 
 static uint16_t nvme_get_feature_timestamp(NvmeCtrl *n, NvmeCmd *cmd)
 {
-    uint64_t prp1 = le64_to_cpu(cmd->prp1);
-    uint64_t prp2 = le64_to_cpu(cmd->prp2);
+    uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
+    uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
 
     uint64_t timestamp = nvme_get_timestamp(n);
 
@@ -802,8 +802,8 @@ static uint16_t nvme_set_feature_timestamp(NvmeCtrl *n, NvmeCmd *cmd)
 {
     uint16_t ret;
     uint64_t timestamp;
-    uint64_t prp1 = le64_to_cpu(cmd->prp1);
-    uint64_t prp2 = le64_to_cpu(cmd->prp2);
+    uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
+    uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
 
     ret = nvme_dma_write_prp(n, (uint8_t *)&timestamp,
                                 sizeof(timestamp), prp1, prp2);
diff --git a/include/block/nvme.h b/include/block/nvme.h
index 8fb941c6537c..b30744068d46 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -205,15 +205,53 @@ enum NvmeCmbszMask {
 #define NVME_CMBSZ_GETSIZE(cmbsz) \
     (NVME_CMBSZ_SZ(cmbsz) * (1 << (12 + 4 * NVME_CMBSZ_SZU(cmbsz))))
 
+enum NvmeSglDescriptorType {
+    NVME_SGL_DESCR_TYPE_DATA_BLOCK          = 0x0,
+    NVME_SGL_DESCR_TYPE_BIT_BUCKET          = 0x1,
+    NVME_SGL_DESCR_TYPE_SEGMENT             = 0x2,
+    NVME_SGL_DESCR_TYPE_LAST_SEGMENT        = 0x3,
+    NVME_SGL_DESCR_TYPE_KEYED_DATA_BLOCK    = 0x4,
+
+    NVME_SGL_DESCR_TYPE_VENDOR_SPECIFIC     = 0xf,
+};
+
+enum NvmeSglDescriptorSubtype {
+    NVME_SGL_DESCR_SUBTYPE_ADDRESS = 0x0,
+};
+
+typedef struct NvmeSglDescriptor {
+    uint64_t addr;
+    uint32_t len;
+    uint8_t  rsvd[3];
+    uint8_t  type;
+} NvmeSglDescriptor;
+
+#define NVME_SGL_TYPE(type)     ((type >> 4) & 0xf)
+#define NVME_SGL_SUBTYPE(type)  (type & 0xf)
+
+typedef union NvmeCmdDptr {
+    struct {
+        uint64_t    prp1;
+        uint64_t    prp2;
+    };
+
+    NvmeSglDescriptor sgl;
+} NvmeCmdDptr;
+
+enum NvmePsdt {
+    PSDT_PRP                 = 0x0,
+    PSDT_SGL_MPTR_CONTIGUOUS = 0x1,
+    PSDT_SGL_MPTR_SGL        = 0x2,
+};
+
 typedef struct NvmeCmd {
     uint8_t     opcode;
-    uint8_t     fuse;
+    uint8_t     flags;
     uint16_t    cid;
     uint32_t    nsid;
     uint64_t    res1;
     uint64_t    mptr;
-    uint64_t    prp1;
-    uint64_t    prp2;
+    NvmeCmdDptr dptr;
     uint32_t    cdw10;
     uint32_t    cdw11;
     uint32_t    cdw12;
@@ -222,6 +260,9 @@ typedef struct NvmeCmd {
     uint32_t    cdw15;
 } NvmeCmd;
 
+#define NVME_CMD_FLAGS_FUSE(flags) (flags & 0x3)
+#define NVME_CMD_FLAGS_PSDT(flags) ((flags >> 6) & 0x3)
+
 enum NvmeAdminCommands {
     NVME_ADM_CMD_DELETE_SQ      = 0x00,
     NVME_ADM_CMD_CREATE_SQ      = 0x01,
@@ -321,8 +362,7 @@ typedef struct NvmeRwCmd {
     uint32_t    nsid;
     uint64_t    rsvd2;
     uint64_t    mptr;
-    uint64_t    prp1;
-    uint64_t    prp2;
+    NvmeCmdDptr dptr;
     uint64_t    slba;
     uint16_t    nlb;
     uint16_t    control;
@@ -362,8 +402,7 @@ typedef struct NvmeDsmCmd {
     uint16_t    cid;
     uint32_t    nsid;
     uint64_t    rsvd2[2];
-    uint64_t    prp1;
-    uint64_t    prp2;
+    NvmeCmdDptr dptr;
     uint32_t    nr;
     uint32_t    attributes;
     uint32_t    rsvd12[4];
@@ -427,6 +466,12 @@ enum NvmeStatusCodes {
     NVME_CMD_ABORT_MISSING_FUSE = 0x000a,
     NVME_INVALID_NSID           = 0x000b,
     NVME_CMD_SEQ_ERROR          = 0x000c,
+    NVME_INVALID_SGL_SEG_DESCR  = 0x000d,
+    NVME_INVALID_NUM_SGL_DESCRS = 0x000e,
+    NVME_DATA_SGL_LEN_INVALID   = 0x000f,
+    NVME_MD_SGL_LEN_INVALID     = 0x0010,
+    NVME_SGL_DESCR_TYPE_INVALID = 0x0011,
+    NVME_INVALID_USE_OF_CMB     = 0x0012,
     NVME_LBA_RANGE              = 0x0080,
     NVME_CAP_EXCEEDED           = 0x0081,
     NVME_NS_NOT_READY           = 0x0082,
@@ -515,7 +560,7 @@ enum NvmeSmartWarn {
     NVME_SMART_FAILED_VOLATILE_MEDIA  = 1 << 4,
 };
 
-enum LogIdentifier {
+enum NvmeLogIdentifier {
     NVME_LOG_ERROR_INFO     = 0x01,
     NVME_LOG_SMART_INFO     = 0x02,
     NVME_LOG_FW_SLOT_INFO   = 0x03,
@@ -533,6 +578,15 @@ typedef struct NvmePSD {
     uint8_t     resv[16];
 } NvmePSD;
 
+#define NVME_IDENTIFY_DATA_SIZE 4096
+
+enum {
+    NVME_ID_CNS_NS             = 0x0,
+    NVME_ID_CNS_CTRL           = 0x1,
+    NVME_ID_CNS_NS_ACTIVE_LIST = 0x2,
+    NVME_ID_CNS_NS_DESCR_LIST  = 0x3,
+};
+
 typedef struct NvmeIdCtrl {
     uint16_t    vid;
     uint16_t    ssvid;
@@ -543,7 +597,15 @@ typedef struct NvmeIdCtrl {
     uint8_t     ieee[3];
     uint8_t     cmic;
     uint8_t     mdts;
-    uint8_t     rsvd255[178];
+    uint16_t    cntlid;
+    uint32_t    ver;
+    uint32_t    rtd3r;
+    uint32_t    rtd3e;
+    uint32_t    oaes;
+    uint32_t    ctratt;
+    uint8_t     rsvd100[12];
+    uint8_t     fguid[16];
+    uint8_t     rsvd128[128];
     uint16_t    oacs;
     uint8_t     acl;
     uint8_t     aerl;
@@ -551,10 +613,28 @@ typedef struct NvmeIdCtrl {
     uint8_t     lpa;
     uint8_t     elpe;
     uint8_t     npss;
-    uint8_t     rsvd511[248];
+    uint8_t     avscc;
+    uint8_t     apsta;
+    uint16_t    wctemp;
+    uint16_t    cctemp;
+    uint16_t    mtfa;
+    uint32_t    hmpre;
+    uint32_t    hmmin;
+    uint8_t     tnvmcap[16];
+    uint8_t     unvmcap[16];
+    uint32_t    rpmbs;
+    uint16_t    edstt;
+    uint8_t     dsto;
+    uint8_t     fwug;
+    uint16_t    kas;
+    uint16_t    hctma;
+    uint16_t    mntmt;
+    uint16_t    mxtmt;
+    uint32_t    sanicap;
+    uint8_t     rsvd332[180];
     uint8_t     sqes;
     uint8_t     cqes;
-    uint16_t    rsvd515;
+    uint16_t    maxcmd;
     uint32_t    nn;
     uint16_t    oncs;
     uint16_t    fuses;
@@ -562,8 +642,14 @@ typedef struct NvmeIdCtrl {
     uint8_t     vwc;
     uint16_t    awun;
     uint16_t    awupf;
-    uint8_t     rsvd703[174];
-    uint8_t     rsvd2047[1344];
+    uint8_t     nvscc;
+    uint8_t     rsvd531;
+    uint16_t    acwu;
+    uint8_t     rsvd534[2];
+    uint32_t    sgls;
+    uint8_t     rsvd540[228];
+    uint8_t     subnqn[256];
+    uint8_t     rsvd1024[1024];
     NvmePSD     psd[32];
     uint8_t     vs[1024];
 } NvmeIdCtrl;
@@ -589,6 +675,16 @@ enum NvmeIdCtrlOncs {
 #define NVME_CTRL_CQES_MIN(cqes) ((cqes) & 0xf)
 #define NVME_CTRL_CQES_MAX(cqes) (((cqes) >> 4) & 0xf)
 
+#define NVME_CTRL_SGLS_SUPPORTED_MASK            (0x3 <<  0)
+#define NVME_CTRL_SGLS_SUPPORTED_NO_ALIGNMENT    (0x1 <<  0)
+#define NVME_CTRL_SGLS_SUPPORTED_DWORD_ALIGNMENT (0x1 <<  1)
+#define NVME_CTRL_SGLS_KEYED                     (0x1 <<  2)
+#define NVME_CTRL_SGLS_BITBUCKET                 (0x1 << 16)
+#define NVME_CTRL_SGLS_MPTR_CONTIGUOUS           (0x1 << 17)
+#define NVME_CTRL_SGLS_EXCESS_LENGTH             (0x1 << 18)
+#define NVME_CTRL_SGLS_MPTR_SGL                  (0x1 << 19)
+#define NVME_CTRL_SGLS_ADDR_OFFSET               (0x1 << 20)
+
 typedef struct NvmeFeatureVal {
     uint32_t    arbitration;
     uint32_t    power_mgmt;
@@ -611,6 +707,15 @@ typedef struct NvmeFeatureVal {
 #define NVME_INTC_THR(intc)     (intc & 0xff)
 #define NVME_INTC_TIME(intc)    ((intc >> 8) & 0xff)
 
+#define NVME_TEMP_THSEL(temp)  ((temp >> 20) & 0x3)
+#define NVME_TEMP_THSEL_OVER   0x0
+#define NVME_TEMP_THSEL_UNDER  0x1
+
+#define NVME_TEMP_TMPSEL(temp)     ((temp >> 16) & 0xf)
+#define NVME_TEMP_TMPSEL_COMPOSITE 0x0
+
+#define NVME_TEMP_TMPTH(temp) ((temp >>  0) & 0xffff)
+
 enum NvmeFeatureIds {
     NVME_ARBITRATION                = 0x1,
     NVME_POWER_MANAGEMENT           = 0x2,
@@ -653,18 +758,41 @@ typedef struct NvmeIdNs {
     uint8_t     mc;
     uint8_t     dpc;
     uint8_t     dps;
-
     uint8_t     nmic;
     uint8_t     rescap;
     uint8_t     fpi;
     uint8_t     dlfeat;
-
-    uint8_t     res34[94];
+    uint16_t    nawun;
+    uint16_t    nawupf;
+    uint16_t    nacwu;
+    uint16_t    nabsn;
+    uint16_t    nabo;
+    uint16_t    nabspf;
+    uint16_t    noiob;
+    uint8_t     nvmcap[16];
+    uint8_t     rsvd64[40];
+    uint8_t     nguid[16];
+    uint64_t    eui64;
     NvmeLBAF    lbaf[16];
-    uint8_t     res192[192];
+    uint8_t     rsvd192[192];
     uint8_t     vs[3712];
 } NvmeIdNs;
 
+typedef struct NvmeIdNsDescr {
+    uint8_t nidt;
+    uint8_t nidl;
+    uint8_t rsvd2[2];
+} NvmeIdNsDescr;
+
+#define NVME_NIDT_EUI64_LEN 8
+#define NVME_NIDT_NGUID_LEN 16
+#define NVME_NIDT_UUID_LEN  16
+
+enum {
+    NVME_NIDT_EUI64 = 0x1,
+    NVME_NIDT_NGUID = 0x2,
+    NVME_NIDT_UUID  = 0x3,
+};
 
 /*Deallocate Logical Block Features*/
 #define NVME_ID_NS_DLFEAT_GUARD_CRC(dlfeat)       ((dlfeat) & 0x10)
-- 
2.26.0



^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v7 05/48] nvme: use constants in identify
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (3 preceding siblings ...)
  2020-04-15  5:50 ` [PATCH v7 04/48] nvme: bump spec data structures to v1.3 Klaus Jensen
@ 2020-04-15  5:50 ` Klaus Jensen
  2020-04-15  7:01   ` Philippe Mathieu-Daudé
  2020-04-15  5:50 ` [PATCH v7 06/48] nvme: refactor nvme_addr_read Klaus Jensen
                   ` (44 subsequent siblings)
  49 siblings, 1 reply; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:50 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 hw/block/nvme.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 088668f28bae..622103c42d0a 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -679,7 +679,7 @@ static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeIdentify *c)
 
 static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeIdentify *c)
 {
-    static const int data_len = 4 * KiB;
+    static const int data_len = NVME_IDENTIFY_DATA_SIZE;
     uint32_t min_nsid = le32_to_cpu(c->nsid);
     uint64_t prp1 = le64_to_cpu(c->prp1);
     uint64_t prp2 = le64_to_cpu(c->prp2);
@@ -709,11 +709,11 @@ static uint16_t nvme_identify(NvmeCtrl *n, NvmeCmd *cmd)
     NvmeIdentify *c = (NvmeIdentify *)cmd;
 
     switch (le32_to_cpu(c->cns)) {
-    case 0x00:
+    case NVME_ID_CNS_NS:
         return nvme_identify_ns(n, c);
-    case 0x01:
+    case NVME_ID_CNS_CTRL:
         return nvme_identify_ctrl(n, c);
-    case 0x02:
+    case NVME_ID_CNS_NS_ACTIVE_LIST:
         return nvme_identify_nslist(n, c);
     default:
         trace_nvme_dev_err_invalid_identify_cns(le32_to_cpu(c->cns));
-- 
2.26.0



^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v7 06/48] nvme: refactor nvme_addr_read
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (4 preceding siblings ...)
  2020-04-15  5:50 ` [PATCH v7 05/48] nvme: use constants in identify Klaus Jensen
@ 2020-04-15  5:50 ` Klaus Jensen
  2020-04-15  7:03   ` Philippe Mathieu-Daudé
  2020-04-15  5:50 ` [PATCH v7 07/48] nvme: add support for the abort command Klaus Jensen
                   ` (43 subsequent siblings)
  49 siblings, 1 reply; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:50 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

Pull the controller memory buffer check to its own function. The check
will be used on its own in later patches.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Acked-by: Keith Busch <kbusch@kernel.org>
---
 hw/block/nvme.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 622103c42d0a..02d3dde90842 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -52,14 +52,22 @@
 
 static void nvme_process_sq(void *opaque);
 
+static inline bool nvme_addr_is_cmb(NvmeCtrl *n, hwaddr addr)
+{
+    hwaddr low = n->ctrl_mem.addr;
+    hwaddr hi  = n->ctrl_mem.addr + int128_get64(n->ctrl_mem.size);
+
+    return addr >= low && addr < hi;
+}
+
 static void nvme_addr_read(NvmeCtrl *n, hwaddr addr, void *buf, int size)
 {
-    if (n->cmbsz && addr >= n->ctrl_mem.addr &&
-                addr < (n->ctrl_mem.addr + int128_get64(n->ctrl_mem.size))) {
+    if (n->cmbsz && nvme_addr_is_cmb(n, addr)) {
         memcpy(buf, (void *)&n->cmbuf[addr - n->ctrl_mem.addr], size);
-    } else {
-        pci_dma_read(&n->parent_obj, addr, buf, size);
+        return;
     }
+
+    pci_dma_read(&n->parent_obj, addr, buf, size);
 }
 
 static int nvme_check_sqid(NvmeCtrl *n, uint16_t sqid)
-- 
2.26.0



^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v7 07/48] nvme: add support for the abort command
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (5 preceding siblings ...)
  2020-04-15  5:50 ` [PATCH v7 06/48] nvme: refactor nvme_addr_read Klaus Jensen
@ 2020-04-15  5:50 ` Klaus Jensen
  2020-04-15  5:51 ` [PATCH v7 08/48] nvme: fix pci doorbell size calculation Klaus Jensen
                   ` (42 subsequent siblings)
  49 siblings, 0 replies; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:50 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

Required for compliance with NVMe revision 1.2.1. See NVM Express 1.2.1,
Section 5.1 ("Abort command").

The Abort command is a best-effort command; for now, the device always
fails to abort the given command by setting bit 0 of dword 0 of the
completion queue entry, indicating that the command was not aborted.
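
For reference (a host-side sketch with made-up values, not part of the
patch), command dword 10 of Abort carries the submission queue id in the
lower 16 bits and the command id to abort in the upper 16 bits, which is
what nvme_abort() decodes below:

    #include <stdint.h>

    /* illustrative encoding of Abort's cdw10 */
    static uint32_t nvme_abort_cdw10(uint16_t sqid, uint16_t cid)
    {
        return (uint32_t)cid << 16 | sqid;
    }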

Signed-off-by: Klaus Jensen <klaus.jensen@cnexlabs.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Acked-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 hw/block/nvme.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 02d3dde90842..bea37c73732a 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -729,6 +729,18 @@ static uint16_t nvme_identify(NvmeCtrl *n, NvmeCmd *cmd)
     }
 }
 
+static uint16_t nvme_abort(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
+{
+    uint16_t sqid = le32_to_cpu(cmd->cdw10) & 0xffff;
+
+    req->cqe.result = 1;
+    if (nvme_check_sqid(n, sqid)) {
+        return NVME_INVALID_FIELD | NVME_DNR;
+    }
+
+    return NVME_SUCCESS;
+}
+
 static inline void nvme_set_timestamp(NvmeCtrl *n, uint64_t ts)
 {
     trace_nvme_dev_setfeat_timestamp(ts);
@@ -863,6 +875,8 @@ static uint16_t nvme_admin_cmd(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
         return nvme_create_cq(n, cmd);
     case NVME_ADM_CMD_IDENTIFY:
         return nvme_identify(n, cmd);
+    case NVME_ADM_CMD_ABORT:
+        return nvme_abort(n, cmd, req);
     case NVME_ADM_CMD_SET_FEATURES:
         return nvme_set_feature(n, cmd, req);
     case NVME_ADM_CMD_GET_FEATURES:
@@ -1373,6 +1387,19 @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp)
     id->ieee[1] = 0x02;
     id->ieee[2] = 0xb3;
     id->oacs = cpu_to_le16(0);
+
+    /*
+     * Because the controller always completes the Abort command immediately,
+     * there can never be more than one concurrently executing Abort command,
+     * so this value is never used for anything. Note that there can easily be
+     * many Abort commands in the queues, but they are not considered
+     * "executing" until processed by nvme_abort.
+     *
+     * The specification recommends a value of 3 for Abort Command Limit (four
+     * concurrently outstanding Abort commands), so let's use that though it is
+     * inconsequential.
+     */
+    id->acl = 3;
     id->frmw = 7 << 1;
     id->lpa = 1 << 0;
     id->sqes = (0x6 << 4) | 0x6;
-- 
2.26.0



^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v7 08/48] nvme: fix pci doorbell size calculation
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (6 preceding siblings ...)
  2020-04-15  5:50 ` [PATCH v7 07/48] nvme: add support for the abort command Klaus Jensen
@ 2020-04-15  5:51 ` Klaus Jensen
  2020-04-15  5:51 ` [PATCH v7 09/48] nvme: add max_ioqpairs device parameter Klaus Jensen
                   ` (41 subsequent siblings)
  49 siblings, 0 replies; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:51 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

The size of the BAR is 0x1000 (main registers) plus 8 bytes for each
queue pair (two 4-byte doorbells). Currently, the size of the BAR is
calculated like so:

    n->reg_size = pow2ceil(0x1004 + 2 * (n->params.num_queues + 1) * 4);

Since the 'num_queues' parameter already accounts for the admin queue
pair, it does not need to be incremented by one. Also, the base size
should simply be 0x1000.

    n->reg_size = pow2ceil(0x1000 + 2 * n->params.num_queues * 4);

Thus, with the default value of num_queues (64), we set aside room for
1 admin queue pair and 63 I/O queue pairs (4 bytes per doorbell, 2
doorbells per queue pair).
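
As a worked example (not part of the patch itself): with num_queues=64,
the doorbell region ends at 0x1000 + 2 * 64 * 4 = 0x1200 bytes, which
pow2ceil() rounds up to a 0x2000 (8 KiB) BAR.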

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
---
 hw/block/nvme.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index bea37c73732a..03278726422d 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -1363,7 +1363,7 @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp)
     pcie_endpoint_cap_init(pci_dev, 0x80);
 
     n->num_namespaces = 1;
-    n->reg_size = pow2ceil(0x1004 + 2 * (n->params.num_queues + 1) * 4);
+    n->reg_size = pow2ceil(0x1000 + 2 * n->params.num_queues * 4);
     n->ns_size = bs_size / (uint64_t)n->num_namespaces;
 
     n->namespaces = g_new0(NvmeNamespace, n->num_namespaces);
-- 
2.26.0



^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v7 09/48] nvme: add max_ioqpairs device parameter
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (7 preceding siblings ...)
  2020-04-15  5:51 ` [PATCH v7 08/48] nvme: fix pci doorbell size calculation Klaus Jensen
@ 2020-04-15  5:51 ` Klaus Jensen
  2020-04-15  5:51 ` [PATCH v7 10/48] nvme: remove redundant cmbloc/cmbsz members Klaus Jensen
                   ` (40 subsequent siblings)
  49 siblings, 0 replies; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:51 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

The num_queues device parameter has a slightly confusing meaning because
it accounts for the admin queue pair, which is not really optional.
Secondly, it is really the maximum number of queue pairs allowed.

Add a new max_ioqpairs parameter that only accounts for I/O queue pairs,
but keep num_queues for compatibility.
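
For example (the drive id, serial and queue count are made up):

    -drive file=nvme.img,if=none,id=nvm
    -device nvme,drive=nvm,serial=deadbeef,max_ioqpairs=8

The deprecated num_queues=9 would describe the same configuration, since
it also counted the admin queue pair.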

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 hw/block/nvme.c | 45 ++++++++++++++++++++++++++-------------------
 hw/block/nvme.h |  4 +++-
 2 files changed, 29 insertions(+), 20 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 03278726422d..f45909dad480 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -19,7 +19,7 @@
  *      -drive file=<file>,if=none,id=<drive_id>
  *      -device nvme,drive=<drive_id>,serial=<serial>,id=<id[optional]>, \
  *              cmb_size_mb=<cmb_size_mb[optional]>, \
- *              num_queues=<N[optional]>
+ *              max_ioqpairs=<N[optional]>
  *
  * Note cmb_size_mb denotes size of CMB in MB. CMB is assumed to be at
  * offset 0 in BAR2 and supports only WDS, RDS and SQS for now.
@@ -27,6 +27,7 @@
 
 #include "qemu/osdep.h"
 #include "qemu/units.h"
+#include "qemu/error-report.h"
 #include "hw/block/block.h"
 #include "hw/pci/msix.h"
 #include "hw/pci/pci.h"
@@ -72,12 +73,12 @@ static void nvme_addr_read(NvmeCtrl *n, hwaddr addr, void *buf, int size)
 
 static int nvme_check_sqid(NvmeCtrl *n, uint16_t sqid)
 {
-    return sqid < n->params.num_queues && n->sq[sqid] != NULL ? 0 : -1;
+    return sqid < n->params.max_ioqpairs + 1 && n->sq[sqid] != NULL ? 0 : -1;
 }
 
 static int nvme_check_cqid(NvmeCtrl *n, uint16_t cqid)
 {
-    return cqid < n->params.num_queues && n->cq[cqid] != NULL ? 0 : -1;
+    return cqid < n->params.max_ioqpairs + 1 && n->cq[cqid] != NULL ? 0 : -1;
 }
 
 static void nvme_inc_cq_tail(NvmeCQueue *cq)
@@ -639,7 +640,7 @@ static uint16_t nvme_create_cq(NvmeCtrl *n, NvmeCmd *cmd)
         trace_nvme_dev_err_invalid_create_cq_addr(prp1);
         return NVME_INVALID_FIELD | NVME_DNR;
     }
-    if (unlikely(vector > n->params.num_queues)) {
+    if (unlikely(vector > n->params.max_ioqpairs + 1)) {
         trace_nvme_dev_err_invalid_create_cq_vector(vector);
         return NVME_INVALID_IRQ_VECTOR | NVME_DNR;
     }
@@ -803,8 +804,8 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
         trace_nvme_dev_getfeat_vwcache(result ? "enabled" : "disabled");
         break;
     case NVME_NUMBER_OF_QUEUES:
-        result = cpu_to_le32((n->params.num_queues - 2) |
-                             ((n->params.num_queues - 2) << 16));
+        result = cpu_to_le32((n->params.max_ioqpairs - 1) |
+                             ((n->params.max_ioqpairs - 1) << 16));
         trace_nvme_dev_getfeat_numq(result);
         break;
     case NVME_TIMESTAMP:
@@ -848,10 +849,10 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
     case NVME_NUMBER_OF_QUEUES:
         trace_nvme_dev_setfeat_numq((dw11 & 0xFFFF) + 1,
                                     ((dw11 >> 16) & 0xFFFF) + 1,
-                                    n->params.num_queues - 1,
-                                    n->params.num_queues - 1);
-        req->cqe.result = cpu_to_le32((n->params.num_queues - 2) |
-                                      ((n->params.num_queues - 2) << 16));
+                                    n->params.max_ioqpairs,
+                                    n->params.max_ioqpairs);
+        req->cqe.result = cpu_to_le32((n->params.max_ioqpairs - 1) |
+                                      ((n->params.max_ioqpairs - 1) << 16));
         break;
     case NVME_TIMESTAMP:
         return nvme_set_feature_timestamp(n, cmd);
@@ -924,12 +925,12 @@ static void nvme_clear_ctrl(NvmeCtrl *n)
 
     blk_drain(n->conf.blk);
 
-    for (i = 0; i < n->params.num_queues; i++) {
+    for (i = 0; i < n->params.max_ioqpairs + 1; i++) {
         if (n->sq[i] != NULL) {
             nvme_free_sq(n->sq[i], n);
         }
     }
-    for (i = 0; i < n->params.num_queues; i++) {
+    for (i = 0; i < n->params.max_ioqpairs + 1; i++) {
         if (n->cq[i] != NULL) {
             nvme_free_cq(n->cq[i], n);
         }
@@ -1330,9 +1331,15 @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp)
     int64_t bs_size;
     uint8_t *pci_conf;
 
-    if (!n->params.num_queues) {
-        error_setg(errp, "num_queues can't be zero");
-        return;
+    if (n->params.num_queues) {
+        warn_report("nvme: num_queues is deprecated; please use max_ioqpairs "
+                    "instead");
+
+        n->params.max_ioqpairs = n->params.num_queues - 1;
+    }
+
+    if (!n->params.max_ioqpairs) {
+        error_setg(errp, "max_ioqpairs can't be less than 1");
     }
 
     if (!n->conf.blk) {
@@ -1363,19 +1370,19 @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp)
     pcie_endpoint_cap_init(pci_dev, 0x80);
 
     n->num_namespaces = 1;
-    n->reg_size = pow2ceil(0x1000 + 2 * n->params.num_queues * 4);
+    n->reg_size = pow2ceil(0x1008 + 2 * (n->params.max_ioqpairs) * 4);
     n->ns_size = bs_size / (uint64_t)n->num_namespaces;
 
     n->namespaces = g_new0(NvmeNamespace, n->num_namespaces);
-    n->sq = g_new0(NvmeSQueue *, n->params.num_queues);
-    n->cq = g_new0(NvmeCQueue *, n->params.num_queues);
+    n->sq = g_new0(NvmeSQueue *, n->params.max_ioqpairs + 1);
+    n->cq = g_new0(NvmeCQueue *, n->params.max_ioqpairs + 1);
 
     memory_region_init_io(&n->iomem, OBJECT(n), &nvme_mmio_ops, n,
                           "nvme", n->reg_size);
     pci_register_bar(pci_dev, 0,
         PCI_BASE_ADDRESS_SPACE_MEMORY | PCI_BASE_ADDRESS_MEM_TYPE_64,
         &n->iomem);
-    msix_init_exclusive_bar(pci_dev, n->params.num_queues, 4, NULL);
+    msix_init_exclusive_bar(pci_dev, n->params.max_ioqpairs + 1, 4, NULL);
 
     id->vid = cpu_to_le16(pci_get_word(pci_conf + PCI_VENDOR_ID));
     id->ssvid = cpu_to_le16(pci_get_word(pci_conf + PCI_SUBSYSTEM_VENDOR_ID));
diff --git a/hw/block/nvme.h b/hw/block/nvme.h
index 9957c4a200e2..98f5b9479244 100644
--- a/hw/block/nvme.h
+++ b/hw/block/nvme.h
@@ -6,11 +6,13 @@
 #define DEFINE_NVME_PROPERTIES(_state, _props) \
     DEFINE_PROP_STRING("serial", _state, _props.serial), \
     DEFINE_PROP_UINT32("cmb_size_mb", _state, _props.cmb_size_mb, 0), \
-    DEFINE_PROP_UINT32("num_queues", _state, _props.num_queues, 64)
+    DEFINE_PROP_UINT32("num_queues", _state, _props.num_queues, 0), \
+    DEFINE_PROP_UINT32("max_ioqpairs", _state, _props.max_ioqpairs, 64)
 
 typedef struct NvmeParams {
     char     *serial;
     uint32_t num_queues;
+    uint32_t max_ioqpairs;
     uint32_t cmb_size_mb;
 } NvmeParams;
 
-- 
2.26.0



^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v7 10/48] nvme: remove redundant cmbloc/cmbsz members
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (8 preceding siblings ...)
  2020-04-15  5:51 ` [PATCH v7 09/48] nvme: add max_ioqpairs device parameter Klaus Jensen
@ 2020-04-15  5:51 ` Klaus Jensen
  2020-04-15  7:10   ` Philippe Mathieu-Daudé
  2020-04-15  5:51 ` [PATCH v7 11/48] nvme: refactor device realization Klaus Jensen
                   ` (39 subsequent siblings)
  49 siblings, 1 reply; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:51 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
---
 hw/block/nvme.c | 7 ++-----
 hw/block/nvme.h | 2 --
 2 files changed, 2 insertions(+), 7 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index f45909dad480..123539a5d0ae 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -63,7 +63,7 @@ static inline bool nvme_addr_is_cmb(NvmeCtrl *n, hwaddr addr)
 
 static void nvme_addr_read(NvmeCtrl *n, hwaddr addr, void *buf, int size)
 {
-    if (n->cmbsz && nvme_addr_is_cmb(n, addr)) {
+    if (n->bar.cmbsz && nvme_addr_is_cmb(n, addr)) {
         memcpy(buf, (void *)&n->cmbuf[addr - n->ctrl_mem.addr], size);
         return;
     }
@@ -157,7 +157,7 @@ static uint16_t nvme_map_prp(QEMUSGList *qsg, QEMUIOVector *iov, uint64_t prp1,
     if (unlikely(!prp1)) {
         trace_nvme_dev_err_invalid_prp();
         return NVME_INVALID_FIELD | NVME_DNR;
-    } else if (n->cmbsz && prp1 >= n->ctrl_mem.addr &&
+    } else if (n->bar.cmbsz && prp1 >= n->ctrl_mem.addr &&
                prp1 < n->ctrl_mem.addr + int128_get64(n->ctrl_mem.size)) {
         qsg->nsg = 0;
         qemu_iovec_init(iov, num_prps);
@@ -1443,9 +1443,6 @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp)
         NVME_CMBSZ_SET_SZU(n->bar.cmbsz, 2); /* MBs */
         NVME_CMBSZ_SET_SZ(n->bar.cmbsz, n->params.cmb_size_mb);
 
-        n->cmbloc = n->bar.cmbloc;
-        n->cmbsz = n->bar.cmbsz;
-
         n->cmbuf = g_malloc0(NVME_CMBSZ_GETSIZE(n->bar.cmbsz));
         memory_region_init_io(&n->ctrl_mem, OBJECT(n), &nvme_cmb_ops, n,
                               "nvme-cmb", NVME_CMBSZ_GETSIZE(n->bar.cmbsz));
diff --git a/hw/block/nvme.h b/hw/block/nvme.h
index 98f5b9479244..ad1786953be9 100644
--- a/hw/block/nvme.h
+++ b/hw/block/nvme.h
@@ -88,8 +88,6 @@ typedef struct NvmeCtrl {
     uint32_t    num_namespaces;
     uint32_t    max_q_ents;
     uint64_t    ns_size;
-    uint32_t    cmbsz;
-    uint32_t    cmbloc;
     uint8_t     *cmbuf;
     uint64_t    irq_status;
     uint64_t    host_timestamp;                 /* Timestamp sent by the host */
-- 
2.26.0



^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v7 11/48] nvme: refactor device realization
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (9 preceding siblings ...)
  2020-04-15  5:51 ` [PATCH v7 10/48] nvme: remove redundant cmbloc/cmbsz members Klaus Jensen
@ 2020-04-15  5:51 ` Klaus Jensen
  2020-04-15  7:14   ` Philippe Mathieu-Daudé
  2020-04-15  5:51 ` [PATCH v7 12/48] nvme: add temperature threshold feature Klaus Jensen
                   ` (38 subsequent siblings)
  49 siblings, 1 reply; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:51 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

This patch splits up nvme_realize into multiple individual functions,
each initializing a different subset of the device.

Signed-off-by: Klaus Jensen <klaus.jensen@cnexlabs.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Acked-by: Keith Busch <kbusch@kernel.org>
---
 hw/block/nvme.c | 178 +++++++++++++++++++++++++++++++-----------------
 hw/block/nvme.h |  21 ++++++
 2 files changed, 136 insertions(+), 63 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 123539a5d0ae..d1c42ee4765c 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -44,6 +44,8 @@
 #include "trace.h"
 #include "nvme.h"
 
+#define NVME_CMB_BIR 2
+
 #define NVME_GUEST_ERR(trace, fmt, ...) \
     do { \
         (trace_##trace)(__VA_ARGS__); \
@@ -1322,73 +1324,112 @@ static const MemoryRegionOps nvme_cmb_ops = {
     },
 };
 
-static void nvme_realize(PCIDevice *pci_dev, Error **errp)
+static int nvme_check_constraints(NvmeCtrl *n, Error **errp)
 {
-    NvmeCtrl *n = NVME(pci_dev);
-    NvmeIdCtrl *id = &n->id_ctrl;
+    NvmeParams *params = &n->params;
 
-    int i;
-    int64_t bs_size;
-    uint8_t *pci_conf;
-
-    if (n->params.num_queues) {
+    if (params->num_queues) {
         warn_report("nvme: num_queues is deprecated; please use max_ioqpairs "
                     "instead");
 
-        n->params.max_ioqpairs = n->params.num_queues - 1;
+        params->max_ioqpairs = params->num_queues - 1;
     }
 
-    if (!n->params.max_ioqpairs) {
-        error_setg(errp, "max_ioqpairs can't be less than 1");
+    if (params->max_ioqpairs < 1 ||
+        params->max_ioqpairs > PCI_MSIX_FLAGS_QSIZE) {
+        error_setg(errp, "nvme: max_ioqpairs must be between 1 and %d",
+                   PCI_MSIX_FLAGS_QSIZE);
+        return -1;
     }
 
     if (!n->conf.blk) {
-        error_setg(errp, "drive property not set");
-        return;
+        error_setg(errp, "nvme: block backend not configured");
+        return -1;
     }
 
-    bs_size = blk_getlength(n->conf.blk);
-    if (bs_size < 0) {
-        error_setg(errp, "could not get backing file size");
-        return;
+    if (!params->serial) {
+        error_setg(errp, "nvme: serial not configured");
+        return -1;
     }
 
-    if (!n->params.serial) {
-        error_setg(errp, "serial property not set");
-        return;
-    }
+    return 0;
+}
+
+static int nvme_init_blk(NvmeCtrl *n, Error **errp)
+{
     blkconf_blocksizes(&n->conf);
     if (!blkconf_apply_backend_options(&n->conf, blk_is_read_only(n->conf.blk),
                                        false, errp)) {
-        return;
+        return -1;
     }
 
-    pci_conf = pci_dev->config;
-    pci_conf[PCI_INTERRUPT_PIN] = 1;
-    pci_config_set_prog_interface(pci_dev->config, 0x2);
-    pci_config_set_class(pci_dev->config, PCI_CLASS_STORAGE_EXPRESS);
-    pcie_endpoint_cap_init(pci_dev, 0x80);
+    return 0;
+}
 
+static void nvme_init_state(NvmeCtrl *n)
+{
     n->num_namespaces = 1;
     n->reg_size = pow2ceil(0x1008 + 2 * (n->params.max_ioqpairs) * 4);
-    n->ns_size = bs_size / (uint64_t)n->num_namespaces;
-
     n->namespaces = g_new0(NvmeNamespace, n->num_namespaces);
     n->sq = g_new0(NvmeSQueue *, n->params.max_ioqpairs + 1);
     n->cq = g_new0(NvmeCQueue *, n->params.max_ioqpairs + 1);
+}
 
-    memory_region_init_io(&n->iomem, OBJECT(n), &nvme_mmio_ops, n,
-                          "nvme", n->reg_size);
-    pci_register_bar(pci_dev, 0,
-        PCI_BASE_ADDRESS_SPACE_MEMORY | PCI_BASE_ADDRESS_MEM_TYPE_64,
-        &n->iomem);
+static void nvme_init_cmb(NvmeCtrl *n, PCIDevice *pci_dev)
+{
+    NVME_CMBLOC_SET_BIR(n->bar.cmbloc, NVME_CMB_BIR);
+    NVME_CMBLOC_SET_OFST(n->bar.cmbloc, 0);
+
+    NVME_CMBSZ_SET_SQS(n->bar.cmbsz, 1);
+    NVME_CMBSZ_SET_CQS(n->bar.cmbsz, 0);
+    NVME_CMBSZ_SET_LISTS(n->bar.cmbsz, 0);
+    NVME_CMBSZ_SET_RDS(n->bar.cmbsz, 1);
+    NVME_CMBSZ_SET_WDS(n->bar.cmbsz, 1);
+    NVME_CMBSZ_SET_SZU(n->bar.cmbsz, 2);
+    NVME_CMBSZ_SET_SZ(n->bar.cmbsz, n->params.cmb_size_mb);
+
+    n->cmbuf = g_malloc0(NVME_CMBSZ_GETSIZE(n->bar.cmbsz));
+    memory_region_init_io(&n->ctrl_mem, OBJECT(n), &nvme_cmb_ops, n,
+                          "nvme-cmb", NVME_CMBSZ_GETSIZE(n->bar.cmbsz));
+    pci_register_bar(pci_dev, NVME_CMBLOC_BIR(n->bar.cmbloc),
+                     PCI_BASE_ADDRESS_SPACE_MEMORY |
+                     PCI_BASE_ADDRESS_MEM_TYPE_64 |
+                     PCI_BASE_ADDRESS_MEM_PREFETCH, &n->ctrl_mem);
+}
+
+static void nvme_init_pci(NvmeCtrl *n, PCIDevice *pci_dev)
+{
+    uint8_t *pci_conf = pci_dev->config;
+
+    pci_conf[PCI_INTERRUPT_PIN] = 1;
+    pci_config_set_prog_interface(pci_conf, 0x2);
+    pci_config_set_vendor_id(pci_conf, PCI_VENDOR_ID_INTEL);
+    pci_config_set_device_id(pci_conf, 0x5845);
+    pci_config_set_class(pci_conf, PCI_CLASS_STORAGE_EXPRESS);
+    pcie_endpoint_cap_init(pci_dev, 0x80);
+
+    memory_region_init_io(&n->iomem, OBJECT(n), &nvme_mmio_ops, n, "nvme",
+                          n->reg_size);
+    pci_register_bar(pci_dev, 0, PCI_BASE_ADDRESS_SPACE_MEMORY |
+                     PCI_BASE_ADDRESS_MEM_TYPE_64, &n->iomem);
     msix_init_exclusive_bar(pci_dev, n->params.max_ioqpairs + 1, 4, NULL);
 
+    if (n->params.cmb_size_mb) {
+        nvme_init_cmb(n, pci_dev);
+    }
+}
+
+static void nvme_init_ctrl(NvmeCtrl *n)
+{
+    NvmeIdCtrl *id = &n->id_ctrl;
+    NvmeParams *params = &n->params;
+    uint8_t *pci_conf = n->parent_obj.config;
+
     id->vid = cpu_to_le16(pci_get_word(pci_conf + PCI_VENDOR_ID));
     id->ssvid = cpu_to_le16(pci_get_word(pci_conf + PCI_SUBSYSTEM_VENDOR_ID));
     strpadcpy((char *)id->mn, sizeof(id->mn), "QEMU NVMe Ctrl", ' ');
     strpadcpy((char *)id->fr, sizeof(id->fr), "1.0", ' ');
-    strpadcpy((char *)id->sn, sizeof(id->sn), n->params.serial, ' ');
+    strpadcpy((char *)id->sn, sizeof(id->sn), params->serial, ' ');
     id->rab = 6;
     id->ieee[0] = 0x00;
     id->ieee[1] = 0x02;
@@ -1429,43 +1470,54 @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp)
 
     n->bar.vs = 0x00010200;
     n->bar.intmc = n->bar.intms = 0;
+}
 
-    if (n->params.cmb_size_mb) {
+static int nvme_init_namespace(NvmeCtrl *n, NvmeNamespace *ns, Error **errp)
+{
+    int64_t bs_size;
+    NvmeIdNs *id_ns = &ns->id_ns;
 
-        NVME_CMBLOC_SET_BIR(n->bar.cmbloc, 2);
-        NVME_CMBLOC_SET_OFST(n->bar.cmbloc, 0);
+    bs_size = blk_getlength(n->conf.blk);
+    if (bs_size < 0) {
+        error_setg_errno(errp, -bs_size, "blk_getlength");
+        return -1;
+    }
 
-        NVME_CMBSZ_SET_SQS(n->bar.cmbsz, 1);
-        NVME_CMBSZ_SET_CQS(n->bar.cmbsz, 0);
-        NVME_CMBSZ_SET_LISTS(n->bar.cmbsz, 0);
-        NVME_CMBSZ_SET_RDS(n->bar.cmbsz, 1);
-        NVME_CMBSZ_SET_WDS(n->bar.cmbsz, 1);
-        NVME_CMBSZ_SET_SZU(n->bar.cmbsz, 2); /* MBs */
-        NVME_CMBSZ_SET_SZ(n->bar.cmbsz, n->params.cmb_size_mb);
+    id_ns->lbaf[0].ds = BDRV_SECTOR_BITS;
+    n->ns_size = bs_size;
 
-        n->cmbuf = g_malloc0(NVME_CMBSZ_GETSIZE(n->bar.cmbsz));
-        memory_region_init_io(&n->ctrl_mem, OBJECT(n), &nvme_cmb_ops, n,
-                              "nvme-cmb", NVME_CMBSZ_GETSIZE(n->bar.cmbsz));
-        pci_register_bar(pci_dev, NVME_CMBLOC_BIR(n->bar.cmbloc),
-            PCI_BASE_ADDRESS_SPACE_MEMORY | PCI_BASE_ADDRESS_MEM_TYPE_64 |
-            PCI_BASE_ADDRESS_MEM_PREFETCH, &n->ctrl_mem);
+    id_ns->nsze = cpu_to_le64(nvme_ns_nlbas(n, ns));
 
+    /* no thin provisioning */
+    id_ns->ncap = id_ns->nsze;
+    id_ns->nuse = id_ns->ncap;
+
+    return 0;
+}
+
+static void nvme_realize(PCIDevice *pci_dev, Error **errp)
+{
+    NvmeCtrl *n = NVME(pci_dev);
+    int i;
+
+    if (nvme_check_constraints(n, errp)) {
+        return;
+    }
+
+    nvme_init_state(n);
+
+    if (nvme_init_blk(n, errp)) {
+        return;
     }
 
     for (i = 0; i < n->num_namespaces; i++) {
-        NvmeNamespace *ns = &n->namespaces[i];
-        NvmeIdNs *id_ns = &ns->id_ns;
-        id_ns->nsfeat = 0;
-        id_ns->nlbaf = 0;
-        id_ns->flbas = 0;
-        id_ns->mc = 0;
-        id_ns->dpc = 0;
-        id_ns->dps = 0;
-        id_ns->lbaf[0].ds = BDRV_SECTOR_BITS;
-        id_ns->ncap  = id_ns->nuse = id_ns->nsze =
-            cpu_to_le64(n->ns_size >>
-                id_ns->lbaf[NVME_ID_NS_FLBAS_INDEX(ns->id_ns.flbas)].ds);
+        if (nvme_init_namespace(n, &n->namespaces[i], errp)) {
+            return;
+        }
     }
+
+    nvme_init_pci(n, pci_dev);
+    nvme_init_ctrl(n);
 }
 
 static void nvme_exit(PCIDevice *pci_dev)
diff --git a/hw/block/nvme.h b/hw/block/nvme.h
index ad1786953be9..b7c465560eea 100644
--- a/hw/block/nvme.h
+++ b/hw/block/nvme.h
@@ -67,6 +67,22 @@ typedef struct NvmeNamespace {
     NvmeIdNs        id_ns;
 } NvmeNamespace;
 
+static inline NvmeLBAF *nvme_ns_lbaf(NvmeNamespace *ns)
+{
+    NvmeIdNs *id_ns = &ns->id_ns;
+    return &id_ns->lbaf[NVME_ID_NS_FLBAS_INDEX(id_ns->flbas)];
+}
+
+static inline uint8_t nvme_ns_lbads(NvmeNamespace *ns)
+{
+    return nvme_ns_lbaf(ns)->ds;
+}
+
+static inline size_t nvme_ns_lbads_bytes(NvmeNamespace *ns)
+{
+    return 1 << nvme_ns_lbads(ns);
+}
+
 #define TYPE_NVME "nvme"
 #define NVME(obj) \
         OBJECT_CHECK(NvmeCtrl, (obj), TYPE_NVME)
@@ -101,4 +117,9 @@ typedef struct NvmeCtrl {
     NvmeIdCtrl      id_ctrl;
 } NvmeCtrl;
 
+static inline uint64_t nvme_ns_nlbas(NvmeCtrl *n, NvmeNamespace *ns)
+{
+    return n->ns_size >> nvme_ns_lbads(ns);
+}
+
 #endif /* HW_NVME_H */
-- 
2.26.0



^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v7 12/48] nvme: add temperature threshold feature
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (10 preceding siblings ...)
  2020-04-15  5:51 ` [PATCH v7 11/48] nvme: refactor device realization Klaus Jensen
@ 2020-04-15  5:51 ` Klaus Jensen
  2020-04-15  7:19   ` Philippe Mathieu-Daudé
  2020-04-15  5:51 ` [PATCH v7 13/48] nvme: add support for the get log page command Klaus Jensen
                   ` (37 subsequent siblings)
  49 siblings, 1 reply; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:51 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

It might seem weird to implement this feature for an emulated device,
but it is mandatory to support, and the feature is useful for testing
asynchronous event request support, which will be added in a later
patch.
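
As a sketch of how a host would exercise the feature (the threshold value
below is made up), Set Features command dword 11 is composed to match the
NVME_TEMP_* accessors added earlier in this series:

    #include <stdint.h>

    /* compose Set Features (Temperature Threshold) dword 11; illustrative */
    static uint32_t temp_thresh_dw11(void)
    {
        uint16_t tmpth = 0x160;  /* threshold in Kelvin (~79 C), made up */
        return 0x0 << 20 |       /* THSEL: over temperature threshold    */
               0x0 << 16 |       /* TMPSEL: composite temperature        */
               tmpth;
    }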

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Acked-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 hw/block/nvme.c      | 48 ++++++++++++++++++++++++++++++++++++++++++++
 hw/block/nvme.h      |  1 +
 include/block/nvme.h |  8 +++++++-
 3 files changed, 56 insertions(+), 1 deletion(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index d1c42ee4765c..e777cc9075c1 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -45,6 +45,9 @@
 #include "nvme.h"
 
 #define NVME_CMB_BIR 2
+#define NVME_TEMPERATURE 0x143
+#define NVME_TEMPERATURE_WARNING 0x157
+#define NVME_TEMPERATURE_CRITICAL 0x175
 
 #define NVME_GUEST_ERR(trace, fmt, ...) \
     do { \
@@ -798,9 +801,31 @@ static uint16_t nvme_get_feature_timestamp(NvmeCtrl *n, NvmeCmd *cmd)
 static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
 {
     uint32_t dw10 = le32_to_cpu(cmd->cdw10);
+    uint32_t dw11 = le32_to_cpu(cmd->cdw11);
     uint32_t result;
 
     switch (dw10) {
+    case NVME_TEMPERATURE_THRESHOLD:
+        result = 0;
+
+        /*
+         * The controller only implements the Composite Temperature sensor, so
+         * return 0 for all other sensors.
+         */
+        if (NVME_TEMP_TMPSEL(dw11) != NVME_TEMP_TMPSEL_COMPOSITE) {
+            break;
+        }
+
+        switch (NVME_TEMP_THSEL(dw11)) {
+        case NVME_TEMP_THSEL_OVER:
+            result = cpu_to_le16(n->features.temp_thresh_hi);
+            break;
+        case NVME_TEMP_THSEL_UNDER:
+            result = cpu_to_le16(n->features.temp_thresh_low);
+            break;
+        }
+
+        break;
     case NVME_VOLATILE_WRITE_CACHE:
         result = blk_enable_write_cache(n->conf.blk);
         trace_nvme_dev_getfeat_vwcache(result ? "enabled" : "disabled");
@@ -845,6 +870,23 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
     uint32_t dw11 = le32_to_cpu(cmd->cdw11);
 
     switch (dw10) {
+    case NVME_TEMPERATURE_THRESHOLD:
+        if (NVME_TEMP_TMPSEL(dw11) != NVME_TEMP_TMPSEL_COMPOSITE) {
+            break;
+        }
+
+        switch (NVME_TEMP_THSEL(dw11)) {
+        case NVME_TEMP_THSEL_OVER:
+            n->features.temp_thresh_hi = NVME_TEMP_TMPTH(dw11);
+            break;
+        case NVME_TEMP_THSEL_UNDER:
+            n->features.temp_thresh_low = NVME_TEMP_TMPTH(dw11);
+            break;
+        default:
+            return NVME_INVALID_FIELD | NVME_DNR;
+        }
+
+        break;
     case NVME_VOLATILE_WRITE_CACHE:
         blk_set_enable_write_cache(n->conf.blk, dw11 & 1);
         break;
@@ -1373,6 +1415,7 @@ static void nvme_init_state(NvmeCtrl *n)
     n->namespaces = g_new0(NvmeNamespace, n->num_namespaces);
     n->sq = g_new0(NvmeSQueue *, n->params.max_ioqpairs + 1);
     n->cq = g_new0(NvmeCQueue *, n->params.max_ioqpairs + 1);
+    n->features.temp_thresh_hi = NVME_TEMPERATURE_WARNING;
 }
 
 static void nvme_init_cmb(NvmeCtrl *n, PCIDevice *pci_dev)
@@ -1450,6 +1493,11 @@ static void nvme_init_ctrl(NvmeCtrl *n)
     id->acl = 3;
     id->frmw = 7 << 1;
     id->lpa = 1 << 0;
+
+    /* recommended default value (~70 C) */
+    id->wctemp = cpu_to_le16(NVME_TEMPERATURE_WARNING);
+    id->cctemp = cpu_to_le16(NVME_TEMPERATURE_CRITICAL);
+
     id->sqes = (0x6 << 4) | 0x6;
     id->cqes = (0x4 << 4) | 0x4;
     id->nn = cpu_to_le32(n->num_namespaces);
diff --git a/hw/block/nvme.h b/hw/block/nvme.h
index b7c465560eea..807c4ad19dcc 100644
--- a/hw/block/nvme.h
+++ b/hw/block/nvme.h
@@ -115,6 +115,7 @@ typedef struct NvmeCtrl {
     NvmeSQueue      admin_sq;
     NvmeCQueue      admin_cq;
     NvmeIdCtrl      id_ctrl;
+    NvmeFeatureVal  features;
 } NvmeCtrl;
 
 static inline uint64_t nvme_ns_nlbas(NvmeCtrl *n, NvmeNamespace *ns)
diff --git a/include/block/nvme.h b/include/block/nvme.h
index b30744068d46..a0519814ecec 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -688,7 +688,13 @@ enum NvmeIdCtrlOncs {
 typedef struct NvmeFeatureVal {
     uint32_t    arbitration;
     uint32_t    power_mgmt;
-    uint32_t    temp_thresh;
+    union {
+        struct {
+            uint16_t temp_thresh_hi;
+            uint16_t temp_thresh_low;
+        };
+        uint32_t temp_thresh;
+    };
     uint32_t    err_rec;
     uint32_t    volatile_wc;
     uint32_t    num_queues;
-- 
2.26.0




* [PATCH v7 13/48] nvme: add support for the get log page command
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (11 preceding siblings ...)
  2020-04-15  5:51 ` [PATCH v7 12/48] nvme: add temperature threshold feature Klaus Jensen
@ 2020-04-15  5:51 ` Klaus Jensen
  2020-04-15  5:51 ` [PATCH v7 14/48] nvme: add support for the asynchronous event request command Klaus Jensen
                   ` (36 subsequent siblings)
  49 siblings, 0 replies; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:51 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

Add support for the Get Log Page command and basic implementations of
the mandatory Error Information, SMART / Health Information and Firmware
Slot Information log pages.

In violation of the specification, the SMART / Health Information log
page does not persist information over the lifetime of the controller
because the device has no place to store such persistent state.

Note that the LPA field in the Identify Controller data structure
intentionally has bit 0 cleared because there is no namespace specific
information in the SMART / Health information log page.

Required for compliance with NVMe revision 1.2.1. See NVM Express 1.2.1,
Section 5.10 ("Get Log Page command").
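
As a worked example of the dword arithmetic in nvme_get_log() below (the
values are illustrative): a host reading the full 512-byte SMART / Health
Information log page from offset 0 sets NUMDL to 127 and NUMDU, LPOL and
LPOU to 0, and the handler recovers a 512 byte transfer at offset 0:

    /* mirrors the length/offset computation in nvme_get_log() */
    uint32_t numdl = 127, numdu = 0;   /* NUMD is 0's based, in dwords */
    uint64_t lpol = 0, lpou = 0;
    size_t len = (((numdu << 16) | numdl) + 1) << 2;    /* 512 bytes */
    uint64_t off = (lpou << 32) | lpol;                 /* 0 */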

Signed-off-by: Klaus Jensen <klaus.jensen@cnexlabs.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Acked-by: Keith Busch <kbusch@kernel.org>
---
 hw/block/nvme.c       | 138 +++++++++++++++++++++++++++++++++++++++++-
 hw/block/nvme.h       |  11 ++++
 hw/block/trace-events |   2 +
 3 files changed, 150 insertions(+), 1 deletion(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index e777cc9075c1..76acc112fa7e 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -569,6 +569,138 @@ static uint16_t nvme_create_sq(NvmeCtrl *n, NvmeCmd *cmd)
     return NVME_SUCCESS;
 }
 
+static uint16_t nvme_smart_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
+                                uint64_t off, NvmeRequest *req)
+{
+    uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
+    uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
+    uint32_t nsid = le32_to_cpu(cmd->nsid);
+
+    uint32_t trans_len;
+    time_t current_ms;
+    uint64_t units_read = 0, units_written = 0;
+    uint64_t read_commands = 0, write_commands = 0;
+    NvmeSmartLog smart;
+    BlockAcctStats *s;
+
+    if (nsid && nsid != 0xffffffff) {
+        return NVME_INVALID_FIELD | NVME_DNR;
+    }
+
+    s = blk_get_stats(n->conf.blk);
+
+    units_read = s->nr_bytes[BLOCK_ACCT_READ] >> BDRV_SECTOR_BITS;
+    units_written = s->nr_bytes[BLOCK_ACCT_WRITE] >> BDRV_SECTOR_BITS;
+    read_commands = s->nr_ops[BLOCK_ACCT_READ];
+    write_commands = s->nr_ops[BLOCK_ACCT_WRITE];
+
+    if (off > sizeof(smart)) {
+        return NVME_INVALID_FIELD | NVME_DNR;
+    }
+
+    trans_len = MIN(sizeof(smart) - off, buf_len);
+
+    memset(&smart, 0x0, sizeof(smart));
+
+    smart.data_units_read[0] = cpu_to_le64(units_read / 1000);
+    smart.data_units_written[0] = cpu_to_le64(units_written / 1000);
+    smart.host_read_commands[0] = cpu_to_le64(read_commands);
+    smart.host_write_commands[0] = cpu_to_le64(write_commands);
+
+    smart.temperature[0] = n->temperature & 0xff;
+    smart.temperature[1] = (n->temperature >> 8) & 0xff;
+
+    if ((n->temperature > n->features.temp_thresh_hi) ||
+        (n->temperature < n->features.temp_thresh_low)) {
+        smart.critical_warning |= NVME_SMART_TEMPERATURE;
+    }
+
+    current_ms = qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL);
+    smart.power_on_hours[0] =
+        cpu_to_le64((((current_ms - n->starttime_ms) / 1000) / 60) / 60);
+
+    return nvme_dma_read_prp(n, (uint8_t *) &smart + off, trans_len, prp1,
+                             prp2);
+}
+
+static uint16_t nvme_fw_log_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
+                                 uint64_t off, NvmeRequest *req)
+{
+    uint32_t trans_len;
+    uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
+    uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
+    NvmeFwSlotInfoLog fw_log;
+
+    if (off > sizeof(fw_log)) {
+        return NVME_INVALID_FIELD | NVME_DNR;
+    }
+
+    memset(&fw_log, 0, sizeof(NvmeFwSlotInfoLog));
+
+    trans_len = MIN(sizeof(fw_log) - off, buf_len);
+
+    return nvme_dma_read_prp(n, (uint8_t *) &fw_log + off, trans_len, prp1,
+                             prp2);
+}
+
+static uint16_t nvme_error_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
+                                uint64_t off, NvmeRequest *req)
+{
+    uint32_t trans_len;
+    uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
+    uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
+    NvmeErrorLog errlog;
+
+    if (off > sizeof(errlog)) {
+        return NVME_INVALID_FIELD | NVME_DNR;
+    }
+
+    memset(&errlog, 0x0, sizeof(errlog));
+
+    trans_len = MIN(sizeof(errlog) - off, buf_len);
+
+    return nvme_dma_read_prp(n, (uint8_t *)&errlog, trans_len, prp1, prp2);
+}
+
+static uint16_t nvme_get_log(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
+{
+    uint32_t dw10 = le32_to_cpu(cmd->cdw10);
+    uint32_t dw11 = le32_to_cpu(cmd->cdw11);
+    uint32_t dw12 = le32_to_cpu(cmd->cdw12);
+    uint32_t dw13 = le32_to_cpu(cmd->cdw13);
+    uint8_t  lid = dw10 & 0xff;
+    uint8_t  rae = (dw10 >> 15) & 0x1;
+    uint32_t numdl, numdu;
+    uint64_t off, lpol, lpou;
+    size_t   len;
+
+    numdl = (dw10 >> 16);
+    numdu = (dw11 & 0xffff);
+    lpol = dw12;
+    lpou = dw13;
+
+    len = (((numdu << 16) | numdl) + 1) << 2;
+    off = (lpou << 32ULL) | lpol;
+
+    if (off & 0x3) {
+        return NVME_INVALID_FIELD | NVME_DNR;
+    }
+
+    trace_nvme_dev_get_log(nvme_cid(req), lid, rae, len, off);
+
+    switch (lid) {
+    case NVME_LOG_ERROR_INFO:
+        return nvme_error_info(n, cmd, len, off, req);
+    case NVME_LOG_SMART_INFO:
+        return nvme_smart_info(n, cmd, len, off, req);
+    case NVME_LOG_FW_SLOT_INFO:
+        return nvme_fw_log_info(n, cmd, len, off, req);
+    default:
+        trace_nvme_dev_err_invalid_log_page(nvme_cid(req), lid);
+        return NVME_INVALID_FIELD | NVME_DNR;
+    }
+}
+
 static void nvme_free_cq(NvmeCQueue *cq, NvmeCtrl *n)
 {
     n->cq[cq->cqid] = NULL;
@@ -914,6 +1046,8 @@ static uint16_t nvme_admin_cmd(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
         return nvme_del_sq(n, cmd);
     case NVME_ADM_CMD_CREATE_SQ:
         return nvme_create_sq(n, cmd);
+    case NVME_ADM_CMD_GET_LOG_PAGE:
+        return nvme_get_log(n, cmd, req);
     case NVME_ADM_CMD_DELETE_CQ:
         return nvme_del_cq(n, cmd);
     case NVME_ADM_CMD_CREATE_CQ:
@@ -1415,7 +1549,9 @@ static void nvme_init_state(NvmeCtrl *n)
     n->namespaces = g_new0(NvmeNamespace, n->num_namespaces);
     n->sq = g_new0(NvmeSQueue *, n->params.max_ioqpairs + 1);
     n->cq = g_new0(NvmeCQueue *, n->params.max_ioqpairs + 1);
+    n->temperature = NVME_TEMPERATURE;
     n->features.temp_thresh_hi = NVME_TEMPERATURE_WARNING;
+    n->starttime_ms = qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL);
 }
 
 static void nvme_init_cmb(NvmeCtrl *n, PCIDevice *pci_dev)
@@ -1492,7 +1628,7 @@ static void nvme_init_ctrl(NvmeCtrl *n)
      */
     id->acl = 3;
     id->frmw = 7 << 1;
-    id->lpa = 1 << 0;
+    id->lpa = 1 << 2;
 
     /* recommended default value (~70 C) */
     id->wctemp = cpu_to_le16(NVME_TEMPERATURE_WARNING);
diff --git a/hw/block/nvme.h b/hw/block/nvme.h
index 807c4ad19dcc..54644815e10a 100644
--- a/hw/block/nvme.h
+++ b/hw/block/nvme.h
@@ -108,6 +108,8 @@ typedef struct NvmeCtrl {
     uint64_t    irq_status;
     uint64_t    host_timestamp;                 /* Timestamp sent by the host */
     uint64_t    timestamp_set_qemu_clock_ms;    /* QEMU clock time */
+    uint64_t    starttime_ms;
+    uint16_t    temperature;
 
     NvmeNamespace   *namespaces;
     NvmeSQueue      **sq;
@@ -123,4 +125,13 @@ static inline uint64_t nvme_ns_nlbas(NvmeCtrl *n, NvmeNamespace *ns)
     return n->ns_size >> nvme_ns_lbads(ns);
 }
 
+static inline uint16_t nvme_cid(NvmeRequest *req)
+{
+    if (req) {
+        return le16_to_cpu(req->cqe.cid);
+    }
+
+    return 0xffff;
+}
+
 #endif /* HW_NVME_H */
diff --git a/hw/block/trace-events b/hw/block/trace-events
index 75b0c7a0cb60..1a5b5d764594 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -46,6 +46,7 @@ nvme_dev_getfeat_numq(int result) "get feature number of queues, result=%d"
 nvme_dev_setfeat_numq(int reqcq, int reqsq, int gotcq, int gotsq) "requested cq_count=%d sq_count=%d, responding with cq_count=%d sq_count=%d"
 nvme_dev_setfeat_timestamp(uint64_t ts) "set feature timestamp = 0x%"PRIx64""
 nvme_dev_getfeat_timestamp(uint64_t ts) "get feature timestamp = 0x%"PRIx64""
+nvme_dev_get_log(uint16_t cid, uint8_t lid, uint8_t rae, uint32_t len, uint64_t off) "cid %"PRIu16" lid 0x%"PRIx8" rae 0x%"PRIx8" len %"PRIu32" off %"PRIu64""
 nvme_dev_mmio_intm_set(uint64_t data, uint64_t new_mask) "wrote MMIO, interrupt mask set, data=0x%"PRIx64", new_mask=0x%"PRIx64""
 nvme_dev_mmio_intm_clr(uint64_t data, uint64_t new_mask) "wrote MMIO, interrupt mask clr, data=0x%"PRIx64", new_mask=0x%"PRIx64""
 nvme_dev_mmio_cfg(uint64_t data) "wrote MMIO, config controller config=0x%"PRIx64""
@@ -85,6 +86,7 @@ nvme_dev_err_invalid_create_cq_qflags(uint16_t qflags) "failed creating completi
 nvme_dev_err_invalid_identify_cns(uint16_t cns) "identify, invalid cns=0x%"PRIx16""
 nvme_dev_err_invalid_getfeat(int dw10) "invalid get features, dw10=0x%"PRIx32""
 nvme_dev_err_invalid_setfeat(uint32_t dw10) "invalid set features, dw10=0x%"PRIx32""
+nvme_dev_err_invalid_log_page(uint16_t cid, uint16_t lid) "cid %"PRIu16" lid 0x%"PRIx16""
 nvme_dev_err_startfail_cq(void) "nvme_start_ctrl failed because there are non-admin completion queues"
 nvme_dev_err_startfail_sq(void) "nvme_start_ctrl failed because there are non-admin submission queues"
 nvme_dev_err_startfail_nbarasq(void) "nvme_start_ctrl failed because the admin submission queue address is null"
-- 
2.26.0




* [PATCH v7 14/48] nvme: add support for the asynchronous event request command
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (12 preceding siblings ...)
  2020-04-15  5:51 ` [PATCH v7 13/48] nvme: add support for the get log page command Klaus Jensen
@ 2020-04-15  5:51 ` Klaus Jensen
  2020-04-15  5:51 ` [PATCH v7 15/48] nvme: add missing mandatory features Klaus Jensen
                   ` (35 subsequent siblings)
  49 siblings, 0 replies; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:51 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

Required for compliance with NVMe revision 1.2.1. See NVM Express 1.2.1,
Section 5.2 ("Asynchronous Event Request command").

Mostly imported from Keith's qemu-nvme tree. Modified to add a limit on
the number of queued events (controllable with the aer_max_queued device
parameter). The spec states that the controller *should* retain events,
so we make a best effort here.
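
Note that AERL (reported in Identify Controller and set from the new aerl
parameter) is 0's based, so the default of 3 allows the host to keep four
requests outstanding; a small sketch of the accounting enforced in
nvme_aer() below (the values are an example):

    /* 0's based limit: aerl == 3 allows up to 4 outstanding AERs; a
     * fifth request completes with NVME_AER_LIMIT_EXCEEDED */
    uint8_t aerl = 3;
    uint8_t outstanding_aers = 4;
    int reject = outstanding_aers > aerl;   /* 1 (reject) */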

Signed-off-by: Klaus Jensen <klaus.jensen@cnexlabs.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Acked-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 hw/block/nvme.c       | 178 ++++++++++++++++++++++++++++++++++++++++--
 hw/block/nvme.h       |  14 +++-
 hw/block/trace-events |   9 +++
 include/block/nvme.h  |   8 +-
 4 files changed, 199 insertions(+), 10 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 76acc112fa7e..b45840ddf8b3 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -325,6 +325,85 @@ static void nvme_enqueue_req_completion(NvmeCQueue *cq, NvmeRequest *req)
     timer_mod(cq->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + 500);
 }
 
+static void nvme_process_aers(void *opaque)
+{
+    NvmeCtrl *n = opaque;
+    NvmeAsyncEvent *event, *next;
+
+    trace_nvme_dev_process_aers(n->aer_queued);
+
+    QTAILQ_FOREACH_SAFE(event, &n->aer_queue, entry, next) {
+        NvmeRequest *req;
+        NvmeAerResult *result;
+
+        /* can't post cqe if there is nothing to complete */
+        if (!n->outstanding_aers) {
+            trace_nvme_dev_no_outstanding_aers();
+            break;
+        }
+
+        /* ignore if masked (cqe posted, but event not cleared) */
+        if (n->aer_mask & (1 << event->result.event_type)) {
+            trace_nvme_dev_aer_masked(event->result.event_type, n->aer_mask);
+            continue;
+        }
+
+        QTAILQ_REMOVE(&n->aer_queue, event, entry);
+        n->aer_queued--;
+
+        n->aer_mask |= 1 << event->result.event_type;
+        n->outstanding_aers--;
+
+        req = n->aer_reqs[n->outstanding_aers];
+
+        result = (NvmeAerResult *) &req->cqe.result;
+        result->event_type = event->result.event_type;
+        result->event_info = event->result.event_info;
+        result->log_page = event->result.log_page;
+        g_free(event);
+
+        req->status = NVME_SUCCESS;
+
+        trace_nvme_dev_aer_post_cqe(result->event_type, result->event_info,
+                                    result->log_page);
+
+        nvme_enqueue_req_completion(&n->admin_cq, req);
+    }
+}
+
+static void nvme_enqueue_event(NvmeCtrl *n, uint8_t event_type,
+                               uint8_t event_info, uint8_t log_page)
+{
+    NvmeAsyncEvent *event;
+
+    trace_nvme_dev_enqueue_event(event_type, event_info, log_page);
+
+    if (n->aer_queued == n->params.aer_max_queued) {
+        trace_nvme_dev_enqueue_event_noqueue(n->aer_queued);
+        return;
+    }
+
+    event = g_new(NvmeAsyncEvent, 1);
+    event->result = (NvmeAerResult) {
+        .event_type = event_type,
+        .event_info = event_info,
+        .log_page   = log_page,
+    };
+
+    QTAILQ_INSERT_TAIL(&n->aer_queue, event, entry);
+    n->aer_queued++;
+
+    nvme_process_aers(n);
+}
+
+static void nvme_clear_events(NvmeCtrl *n, uint8_t event_type)
+{
+    n->aer_mask &= ~(1 << event_type);
+    if (!QTAILQ_EMPTY(&n->aer_queue)) {
+        nvme_process_aers(n);
+    }
+}
+
 static void nvme_rw_cb(void *opaque, int ret)
 {
     NvmeRequest *req = opaque;
@@ -569,8 +648,9 @@ static uint16_t nvme_create_sq(NvmeCtrl *n, NvmeCmd *cmd)
     return NVME_SUCCESS;
 }
 
-static uint16_t nvme_smart_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
-                                uint64_t off, NvmeRequest *req)
+static uint16_t nvme_smart_info(NvmeCtrl *n, NvmeCmd *cmd, uint8_t rae,
+                                uint32_t buf_len, uint64_t off,
+                                NvmeRequest *req)
 {
     uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
     uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
@@ -619,6 +699,10 @@ static uint16_t nvme_smart_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
     smart.power_on_hours[0] =
         cpu_to_le64((((current_ms - n->starttime_ms) / 1000) / 60) / 60);
 
+    if (!rae) {
+        nvme_clear_events(n, NVME_AER_TYPE_SMART);
+    }
+
     return nvme_dma_read_prp(n, (uint8_t *) &smart + off, trans_len, prp1,
                              prp2);
 }
@@ -643,14 +727,19 @@ static uint16_t nvme_fw_log_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
                              prp2);
 }
 
-static uint16_t nvme_error_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
-                                uint64_t off, NvmeRequest *req)
+static uint16_t nvme_error_info(NvmeCtrl *n, NvmeCmd *cmd, uint8_t rae,
+                                uint32_t buf_len, uint64_t off,
+                                NvmeRequest *req)
 {
     uint32_t trans_len;
     uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
     uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
     NvmeErrorLog errlog;
 
+    if (!rae) {
+        nvme_clear_events(n, NVME_AER_TYPE_ERROR);
+    }
+
     if (off > sizeof(errlog)) {
         return NVME_INVALID_FIELD | NVME_DNR;
     }
@@ -690,9 +779,9 @@ static uint16_t nvme_get_log(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
 
     switch (lid) {
     case NVME_LOG_ERROR_INFO:
-        return nvme_error_info(n, cmd, len, off, req);
+        return nvme_error_info(n, cmd, rae, len, off, req);
     case NVME_LOG_SMART_INFO:
-        return nvme_smart_info(n, cmd, len, off, req);
+        return nvme_smart_info(n, cmd, rae, len, off, req);
     case NVME_LOG_FW_SLOT_INFO:
         return nvme_fw_log_info(n, cmd, len, off, req);
     default:
@@ -969,6 +1058,9 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
         break;
     case NVME_TIMESTAMP:
         return nvme_get_feature_timestamp(n, cmd);
+    case NVME_ASYNCHRONOUS_EVENT_CONF:
+        result = cpu_to_le32(n->features.async_config);
+        break;
     default:
         trace_nvme_dev_err_invalid_getfeat(dw10);
         return NVME_INVALID_FIELD | NVME_DNR;
@@ -1018,6 +1110,14 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
             return NVME_INVALID_FIELD | NVME_DNR;
         }
 
+        if (((n->temperature > n->features.temp_thresh_hi) ||
+            (n->temperature < n->features.temp_thresh_low)) &&
+            NVME_AEC_SMART(n->features.async_config) & NVME_SMART_TEMPERATURE) {
+            nvme_enqueue_event(n, NVME_AER_TYPE_SMART,
+                               NVME_AER_INFO_SMART_TEMP_THRESH,
+                               NVME_LOG_SMART_INFO);
+        }
+
         break;
     case NVME_VOLATILE_WRITE_CACHE:
         blk_set_enable_write_cache(n->conf.blk, dw11 & 1);
@@ -1032,6 +1132,9 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
         break;
     case NVME_TIMESTAMP:
         return nvme_set_feature_timestamp(n, cmd);
+    case NVME_ASYNCHRONOUS_EVENT_CONF:
+        n->features.async_config = dw11;
+        break;
     default:
         trace_nvme_dev_err_invalid_setfeat(dw10);
         return NVME_INVALID_FIELD | NVME_DNR;
@@ -1039,6 +1142,25 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
     return NVME_SUCCESS;
 }
 
+static uint16_t nvme_aer(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
+{
+    trace_nvme_dev_aer(nvme_cid(req));
+
+    if (n->outstanding_aers > n->params.aerl) {
+        trace_nvme_dev_aer_aerl_exceeded();
+        return NVME_AER_LIMIT_EXCEEDED;
+    }
+
+    n->aer_reqs[n->outstanding_aers] = req;
+    n->outstanding_aers++;
+
+    if (!QTAILQ_EMPTY(&n->aer_queue)) {
+        nvme_process_aers(n);
+    }
+
+    return NVME_NO_COMPLETE;
+}
+
 static uint16_t nvme_admin_cmd(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
 {
     switch (cmd->opcode) {
@@ -1060,6 +1182,8 @@ static uint16_t nvme_admin_cmd(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
         return nvme_set_feature(n, cmd, req);
     case NVME_ADM_CMD_GET_FEATURES:
         return nvme_get_feature(n, cmd, req);
+    case NVME_ADM_CMD_ASYNC_EV_REQ:
+        return nvme_aer(n, cmd, req);
     default:
         trace_nvme_dev_err_invalid_admin_opc(cmd->opcode);
         return NVME_INVALID_OPCODE | NVME_DNR;
@@ -1114,6 +1238,15 @@ static void nvme_clear_ctrl(NvmeCtrl *n)
         }
     }
 
+    while (!QTAILQ_EMPTY(&n->aer_queue)) {
+        NvmeAsyncEvent *event = QTAILQ_FIRST(&n->aer_queue);
+        QTAILQ_REMOVE(&n->aer_queue, event, entry);
+        g_free(event);
+    }
+
+    n->aer_queued = 0;
+    n->outstanding_aers = 0;
+
     blk_flush(n->conf.blk);
     n->bar.cc = 0;
 }
@@ -1210,6 +1343,8 @@ static int nvme_start_ctrl(NvmeCtrl *n)
 
     nvme_set_timestamp(n, 0ULL);
 
+    QTAILQ_INIT(&n->aer_queue);
+
     return 0;
 }
 
@@ -1400,6 +1535,13 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
                            "completion queue doorbell write"
                            " for nonexistent queue,"
                            " sqid=%"PRIu32", ignoring", qid);
+
+            if (n->outstanding_aers) {
+                nvme_enqueue_event(n, NVME_AER_TYPE_ERROR,
+                                   NVME_AER_INFO_ERR_INVALID_DB_REGISTER,
+                                   NVME_LOG_ERROR_INFO);
+            }
+
             return;
         }
 
@@ -1410,6 +1552,13 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
                            " beyond queue size, sqid=%"PRIu32","
                            " new_head=%"PRIu16", ignoring",
                            qid, new_head);
+
+            if (n->outstanding_aers) {
+                nvme_enqueue_event(n, NVME_AER_TYPE_ERROR,
+                                   NVME_AER_INFO_ERR_INVALID_DB_VALUE,
+                                   NVME_LOG_ERROR_INFO);
+            }
+
             return;
         }
 
@@ -1438,6 +1587,13 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
                            "submission queue doorbell write"
                            " for nonexistent queue,"
                            " sqid=%"PRIu32", ignoring", qid);
+
+            if (n->outstanding_aers) {
+                nvme_enqueue_event(n, NVME_AER_TYPE_ERROR,
+                                   NVME_AER_INFO_ERR_INVALID_DB_REGISTER,
+                                   NVME_LOG_ERROR_INFO);
+            }
+
             return;
         }
 
@@ -1448,6 +1604,13 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
                            " beyond queue size, sqid=%"PRIu32","
                            " new_tail=%"PRIu16", ignoring",
                            qid, new_tail);
+
+            if (n->outstanding_aers) {
+                nvme_enqueue_event(n, NVME_AER_TYPE_ERROR,
+                                   NVME_AER_INFO_ERR_INVALID_DB_VALUE,
+                                   NVME_LOG_ERROR_INFO);
+            }
+
             return;
         }
 
@@ -1552,6 +1715,7 @@ static void nvme_init_state(NvmeCtrl *n)
     n->temperature = NVME_TEMPERATURE;
     n->features.temp_thresh_hi = NVME_TEMPERATURE_WARNING;
     n->starttime_ms = qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL);
+    n->aer_reqs = g_new0(NvmeRequest *, n->params.aerl + 1);
 }
 
 static void nvme_init_cmb(NvmeCtrl *n, PCIDevice *pci_dev)
@@ -1627,6 +1791,7 @@ static void nvme_init_ctrl(NvmeCtrl *n)
      * inconsequential.
      */
     id->acl = 3;
+    id->aerl = n->params.aerl;
     id->frmw = 7 << 1;
     id->lpa = 1 << 2;
 
@@ -1712,6 +1877,7 @@ static void nvme_exit(PCIDevice *pci_dev)
     g_free(n->namespaces);
     g_free(n->cq);
     g_free(n->sq);
+    g_free(n->aer_reqs);
 
     if (n->params.cmb_size_mb) {
         g_free(n->cmbuf);
diff --git a/hw/block/nvme.h b/hw/block/nvme.h
index 54644815e10a..321d37aeaca4 100644
--- a/hw/block/nvme.h
+++ b/hw/block/nvme.h
@@ -7,17 +7,21 @@
     DEFINE_PROP_STRING("serial", _state, _props.serial), \
     DEFINE_PROP_UINT32("cmb_size_mb", _state, _props.cmb_size_mb, 0), \
     DEFINE_PROP_UINT32("num_queues", _state, _props.num_queues, 0), \
-    DEFINE_PROP_UINT32("max_ioqpairs", _state, _props.max_ioqpairs, 64)
+    DEFINE_PROP_UINT32("max_ioqpairs", _state, _props.max_ioqpairs, 64), \
+    DEFINE_PROP_UINT8("aerl", _state, _props.aerl, 3), \
+    DEFINE_PROP_UINT32("aer_max_queued", _state, _props.aer_max_queued, 64)
 
 typedef struct NvmeParams {
     char     *serial;
     uint32_t num_queues;
     uint32_t max_ioqpairs;
     uint32_t cmb_size_mb;
+    uint8_t  aerl;
+    uint32_t aer_max_queued;
 } NvmeParams;
 
 typedef struct NvmeAsyncEvent {
-    QSIMPLEQ_ENTRY(NvmeAsyncEvent) entry;
+    QTAILQ_ENTRY(NvmeAsyncEvent) entry;
     NvmeAerResult result;
 } NvmeAsyncEvent;
 
@@ -104,6 +108,7 @@ typedef struct NvmeCtrl {
     uint32_t    num_namespaces;
     uint32_t    max_q_ents;
     uint64_t    ns_size;
+    uint8_t     outstanding_aers;
     uint8_t     *cmbuf;
     uint64_t    irq_status;
     uint64_t    host_timestamp;                 /* Timestamp sent by the host */
@@ -111,6 +116,11 @@ typedef struct NvmeCtrl {
     uint64_t    starttime_ms;
     uint16_t    temperature;
 
+    uint8_t     aer_mask;
+    NvmeRequest **aer_reqs;
+    QTAILQ_HEAD(, NvmeAsyncEvent) aer_queue;
+    int         aer_queued;
+
     NvmeNamespace   *namespaces;
     NvmeSQueue      **sq;
     NvmeCQueue      **cq;
diff --git a/hw/block/trace-events b/hw/block/trace-events
index 1a5b5d764594..a0a6de7dd5aa 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -47,6 +47,15 @@ nvme_dev_setfeat_numq(int reqcq, int reqsq, int gotcq, int gotsq) "requested cq_
 nvme_dev_setfeat_timestamp(uint64_t ts) "set feature timestamp = 0x%"PRIx64""
 nvme_dev_getfeat_timestamp(uint64_t ts) "get feature timestamp = 0x%"PRIx64""
 nvme_dev_get_log(uint16_t cid, uint8_t lid, uint8_t rae, uint32_t len, uint64_t off) "cid %"PRIu16" lid 0x%"PRIx8" rae 0x%"PRIx8" len %"PRIu32" off %"PRIu64""
+nvme_dev_process_aers(int queued) "queued %d"
+nvme_dev_aer(uint16_t cid) "cid %"PRIu16""
+nvme_dev_aer_aerl_exceeded(void) "aerl exceeded"
+nvme_dev_aer_masked(uint8_t type, uint8_t mask) "type 0x%"PRIx8" mask 0x%"PRIx8""
+nvme_dev_aer_post_cqe(uint8_t typ, uint8_t info, uint8_t log_page) "type 0x%"PRIx8" info 0x%"PRIx8" lid 0x%"PRIx8""
+nvme_dev_enqueue_event(uint8_t typ, uint8_t info, uint8_t log_page) "type 0x%"PRIx8" info 0x%"PRIx8" lid 0x%"PRIx8""
+nvme_dev_enqueue_event_noqueue(int queued) "queued %d"
+nvme_dev_enqueue_event_masked(uint8_t typ) "type 0x%"PRIx8""
+nvme_dev_no_outstanding_aers(void) "ignoring event; no outstanding AERs"
 nvme_dev_mmio_intm_set(uint64_t data, uint64_t new_mask) "wrote MMIO, interrupt mask set, data=0x%"PRIx64", new_mask=0x%"PRIx64""
 nvme_dev_mmio_intm_clr(uint64_t data, uint64_t new_mask) "wrote MMIO, interrupt mask clr, data=0x%"PRIx64", new_mask=0x%"PRIx64""
 nvme_dev_mmio_cfg(uint64_t data) "wrote MMIO, config controller config=0x%"PRIx64""
diff --git a/include/block/nvme.h b/include/block/nvme.h
index a0519814ecec..0df5534cb8bf 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -425,8 +425,8 @@ enum NvmeAsyncEventRequest {
     NVME_AER_TYPE_SMART                     = 1,
     NVME_AER_TYPE_IO_SPECIFIC               = 6,
     NVME_AER_TYPE_VENDOR_SPECIFIC           = 7,
-    NVME_AER_INFO_ERR_INVALID_SQ            = 0,
-    NVME_AER_INFO_ERR_INVALID_DB            = 1,
+    NVME_AER_INFO_ERR_INVALID_DB_REGISTER   = 0,
+    NVME_AER_INFO_ERR_INVALID_DB_VALUE      = 1,
     NVME_AER_INFO_ERR_DIAG_FAIL             = 2,
     NVME_AER_INFO_ERR_PERS_INTERNAL_ERR     = 3,
     NVME_AER_INFO_ERR_TRANS_INTERNAL_ERR    = 4,
@@ -722,6 +722,10 @@ typedef struct NvmeFeatureVal {
 
 #define NVME_TEMP_TMPTH(temp) ((temp >>  0) & 0xffff)
 
+#define NVME_AEC_SMART(aec)         (aec & 0xff)
+#define NVME_AEC_NS_ATTR(aec)       ((aec >> 8) & 0x1)
+#define NVME_AEC_FW_ACTIVATION(aec) ((aec >> 9) & 0x1)
+
 enum NvmeFeatureIds {
     NVME_ARBITRATION                = 0x1,
     NVME_POWER_MANAGEMENT           = 0x2,
-- 
2.26.0




* [PATCH v7 15/48] nvme: add missing mandatory features
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (13 preceding siblings ...)
  2020-04-15  5:51 ` [PATCH v7 14/48] nvme: add support for the asynchronous event request command Klaus Jensen
@ 2020-04-15  5:51 ` Klaus Jensen
  2020-04-15  5:51 ` [PATCH v7 16/48] nvme: additional tracing Klaus Jensen
                   ` (34 subsequent siblings)
  49 siblings, 0 replies; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:51 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

Add support for returning a reasonable response to Get/Set Features for
the mandatory features.
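
One of the additions below that is easy to misread is Interrupt Vector
Configuration: the vector to query is passed in cdw11 bits 15:0 and the
returned value carries a "coalescing disabled" flag in bit 16. A hedged
sketch of decoding the result (the example value follows from
nvme_init_state(), where vector 0, the admin queue, has coalescing
disabled):

    uint32_t result = 0x10000;       /* example CQE DW0 for vector 0  */
    uint16_t iv = result & 0xffff;   /* interrupt vector (0)          */
    int coalescing_disabled =
        !!(result & (1 << 16));      /* NVME_INTVC_NOCOALESCING       */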

Signed-off-by: Klaus Jensen <klaus.jensen@cnexlabs.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Acked-by: Keith Busch <kbusch@kernel.org>
---
 hw/block/nvme.c       | 60 ++++++++++++++++++++++++++++++++++++++++++-
 hw/block/trace-events |  2 ++
 include/block/nvme.h  |  6 ++++-
 3 files changed, 66 insertions(+), 2 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index b45840ddf8b3..e448a1c85cb9 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -1025,7 +1025,15 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
     uint32_t dw11 = le32_to_cpu(cmd->cdw11);
     uint32_t result;
 
+    trace_nvme_dev_getfeat(nvme_cid(req), dw10);
+
     switch (dw10) {
+    case NVME_ARBITRATION:
+        result = cpu_to_le32(n->features.arbitration);
+        break;
+    case NVME_POWER_MANAGEMENT:
+        result = cpu_to_le32(n->features.power_mgmt);
+        break;
     case NVME_TEMPERATURE_THRESHOLD:
         result = 0;
 
@@ -1046,9 +1054,12 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
             break;
         }
 
+        break;
+    case NVME_ERROR_RECOVERY:
+        result = cpu_to_le32(n->features.err_rec);
         break;
     case NVME_VOLATILE_WRITE_CACHE:
-        result = blk_enable_write_cache(n->conf.blk);
+        result = cpu_to_le32(blk_enable_write_cache(n->conf.blk));
         trace_nvme_dev_getfeat_vwcache(result ? "enabled" : "disabled");
         break;
     case NVME_NUMBER_OF_QUEUES:
@@ -1058,6 +1069,19 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
         break;
     case NVME_TIMESTAMP:
         return nvme_get_feature_timestamp(n, cmd);
+    case NVME_INTERRUPT_COALESCING:
+        result = cpu_to_le32(n->features.int_coalescing);
+        break;
+    case NVME_INTERRUPT_VECTOR_CONF:
+        if ((dw11 & 0xffff) >= n->params.max_ioqpairs + 1) {
+            return NVME_INVALID_FIELD | NVME_DNR;
+        }
+
+        result = cpu_to_le32(n->features.int_vector_config[dw11 & 0xffff]);
+        break;
+    case NVME_WRITE_ATOMICITY:
+        result = cpu_to_le32(n->features.write_atomicity);
+        break;
     case NVME_ASYNCHRONOUS_EVENT_CONF:
         result = cpu_to_le32(n->features.async_config);
         break;
@@ -1093,6 +1117,8 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
     uint32_t dw10 = le32_to_cpu(cmd->cdw10);
     uint32_t dw11 = le32_to_cpu(cmd->cdw11);
 
+    trace_nvme_dev_setfeat(nvme_cid(req), dw10, dw11);
+
     switch (dw10) {
     case NVME_TEMPERATURE_THRESHOLD:
         if (NVME_TEMP_TMPSEL(dw11) != NVME_TEMP_TMPSEL_COMPOSITE) {
@@ -1120,6 +1146,10 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
 
         break;
     case NVME_VOLATILE_WRITE_CACHE:
+        if (blk_enable_write_cache(n->conf.blk)) {
+            blk_flush(n->conf.blk);
+        }
+
         blk_set_enable_write_cache(n->conf.blk, dw11 & 1);
         break;
     case NVME_NUMBER_OF_QUEUES:
@@ -1135,6 +1165,13 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
     case NVME_ASYNCHRONOUS_EVENT_CONF:
         n->features.async_config = dw11;
         break;
+    case NVME_ARBITRATION:
+    case NVME_POWER_MANAGEMENT:
+    case NVME_ERROR_RECOVERY:
+    case NVME_INTERRUPT_COALESCING:
+    case NVME_INTERRUPT_VECTOR_CONF:
+    case NVME_WRITE_ATOMICITY:
+        return NVME_FEAT_NOT_CHANGABLE | NVME_DNR;
     default:
         trace_nvme_dev_err_invalid_setfeat(dw10);
         return NVME_INVALID_FIELD | NVME_DNR;
@@ -1715,6 +1752,25 @@ static void nvme_init_state(NvmeCtrl *n)
     n->temperature = NVME_TEMPERATURE;
     n->features.temp_thresh_hi = NVME_TEMPERATURE_WARNING;
     n->starttime_ms = qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL);
+
+    /*
+     * There is no limit on the number of commands that the controller may
+     * launch at one time from a particular Submission Queue.
+     */
+    n->features.arbitration = NVME_ARB_AB_NOLIMIT;
+
+    n->features.int_vector_config = g_malloc0_n(n->params.max_ioqpairs + 1,
+        sizeof(*n->features.int_vector_config));
+
+    for (int i = 0; i < n->params.max_ioqpairs + 1; i++) {
+        n->features.int_vector_config[i] = i;
+
+        /* interrupt coalescing is not supported for the admin queue */
+        if (i == 0) {
+            n->features.int_vector_config[i] |= NVME_INTVC_NOCOALESCING;
+        }
+    }
+
     n->aer_reqs = g_new0(NvmeRequest *, n->params.aerl + 1);
 }
 
@@ -1803,6 +1859,7 @@ static void nvme_init_ctrl(NvmeCtrl *n)
     id->cqes = (0x4 << 4) | 0x4;
     id->nn = cpu_to_le32(n->num_namespaces);
     id->oncs = cpu_to_le16(NVME_ONCS_WRITE_ZEROS | NVME_ONCS_TIMESTAMP);
+
     id->psd[0].mp = cpu_to_le16(0x9c4);
     id->psd[0].enlat = cpu_to_le32(0x10);
     id->psd[0].exlat = cpu_to_le32(0x4);
@@ -1878,6 +1935,7 @@ static void nvme_exit(PCIDevice *pci_dev)
     g_free(n->cq);
     g_free(n->sq);
     g_free(n->aer_reqs);
+    g_free(n->features.int_vector_config);
 
     if (n->params.cmb_size_mb) {
         g_free(n->cmbuf);
diff --git a/hw/block/trace-events b/hw/block/trace-events
index a0a6de7dd5aa..b6fde13419bf 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -41,6 +41,8 @@ nvme_dev_del_cq(uint16_t cqid) "deleted completion queue, cqid=%"PRIu16""
 nvme_dev_identify_ctrl(void) "identify controller"
 nvme_dev_identify_ns(uint16_t ns) "identify namespace, nsid=%"PRIu16""
 nvme_dev_identify_nslist(uint16_t ns) "identify namespace list, nsid=%"PRIu16""
+nvme_dev_getfeat(uint16_t cid, uint32_t fid) "cid %"PRIu16" fid 0x%"PRIx32""
+nvme_dev_setfeat(uint16_t cid, uint32_t fid, uint32_t val) "cid %"PRIu16" fid 0x%"PRIx32" val 0x%"PRIx32""
 nvme_dev_getfeat_vwcache(const char* result) "get feature volatile write cache, result=%s"
 nvme_dev_getfeat_numq(int result) "get feature number of queues, result=%d"
 nvme_dev_setfeat_numq(int reqcq, int reqsq, int gotcq, int gotsq) "requested cq_count=%d sq_count=%d, responding with cq_count=%d sq_count=%d"
diff --git a/include/block/nvme.h b/include/block/nvme.h
index 0df5534cb8bf..88e5385a9d3f 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -490,7 +490,8 @@ enum NvmeStatusCodes {
     NVME_FW_REQ_RESET           = 0x010b,
     NVME_INVALID_QUEUE_DEL      = 0x010c,
     NVME_FID_NOT_SAVEABLE       = 0x010d,
-    NVME_FID_NOT_NSID_SPEC      = 0x010f,
+    NVME_FEAT_NOT_CHANGABLE     = 0x010e,
+    NVME_FEAT_NOT_NS_SPEC       = 0x010f,
     NVME_FW_REQ_SUSYSTEM_RESET  = 0x0110,
     NVME_CONFLICTING_ATTRS      = 0x0180,
     NVME_INVALID_PROT_INFO      = 0x0181,
@@ -706,6 +707,7 @@ typedef struct NvmeFeatureVal {
 } NvmeFeatureVal;
 
 #define NVME_ARB_AB(arb)    (arb & 0x7)
+#define NVME_ARB_AB_NOLIMIT 0x7
 #define NVME_ARB_LPW(arb)   ((arb >> 8) & 0xff)
 #define NVME_ARB_MPW(arb)   ((arb >> 16) & 0xff)
 #define NVME_ARB_HPW(arb)   ((arb >> 24) & 0xff)
@@ -713,6 +715,8 @@ typedef struct NvmeFeatureVal {
 #define NVME_INTC_THR(intc)     (intc & 0xff)
 #define NVME_INTC_TIME(intc)    ((intc >> 8) & 0xff)
 
+#define NVME_INTVC_NOCOALESCING (0x1 << 16)
+
 #define NVME_TEMP_THSEL(temp)  ((temp >> 20) & 0x3)
 #define NVME_TEMP_THSEL_OVER   0x0
 #define NVME_TEMP_THSEL_UNDER  0x1
-- 
2.26.0




* [PATCH v7 16/48] nvme: additional tracing
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (14 preceding siblings ...)
  2020-04-15  5:51 ` [PATCH v7 15/48] nvme: add missing mandatory features Klaus Jensen
@ 2020-04-15  5:51 ` Klaus Jensen
  2020-04-15  5:51 ` [PATCH v7 17/48] nvme: make sure ncqr and nsqr is valid Klaus Jensen
                   ` (33 subsequent siblings)
  49 siblings, 0 replies; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:51 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

Also, streamline nvme_identify_ns and nvme_identify_nslist. They do not
need to repeat the command; it is already part of the trace event name.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
---
 hw/block/nvme.c       | 20 ++++++++++++++++++++
 hw/block/trace-events | 13 +++++++++++--
 2 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index e448a1c85cb9..7094767eeccb 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -320,6 +320,8 @@ static void nvme_post_cqes(void *opaque)
 static void nvme_enqueue_req_completion(NvmeCQueue *cq, NvmeRequest *req)
 {
     assert(cq->cqid == req->sq->cqid);
+    trace_nvme_dev_enqueue_req_completion(nvme_cid(req), cq->cqid,
+                                          req->status);
     QTAILQ_REMOVE(&req->sq->out_req_list, req, entry);
     QTAILQ_INSERT_TAIL(&cq->req_list, req, entry);
     timer_mod(cq->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + 500);
@@ -411,6 +413,8 @@ static void nvme_rw_cb(void *opaque, int ret)
     NvmeCtrl *n = sq->ctrl;
     NvmeCQueue *cq = n->cq[sq->cqid];
 
+    trace_nvme_dev_rw_cb(nvme_cid(req));
+
     if (!ret) {
         block_acct_done(blk_get_stats(n->conf.blk), &req->acct);
         req->status = NVME_SUCCESS;
@@ -446,6 +450,8 @@ static uint16_t nvme_write_zeros(NvmeCtrl *n, NvmeNamespace *ns, NvmeCmd *cmd,
     uint64_t offset = slba << data_shift;
     uint32_t count = nlb << data_shift;
 
+    trace_nvme_dev_write_zeroes(nvme_cid(req), slba, nlb);
+
     if (unlikely(slba + nlb > ns->id_ns.nsze)) {
         trace_nvme_dev_err_invalid_lba_range(slba, nlb, ns->id_ns.nsze);
         return NVME_LBA_RANGE | NVME_DNR;
@@ -513,6 +519,9 @@ static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
     NvmeNamespace *ns;
     uint32_t nsid = le32_to_cpu(cmd->nsid);
 
+    trace_nvme_dev_io_cmd(nvme_cid(req), nsid, le16_to_cpu(req->sq->sqid),
+                          cmd->opcode);
+
     if (unlikely(nsid == 0 || nsid > n->num_namespaces)) {
         trace_nvme_dev_err_invalid_ns(nsid, n->num_namespaces);
         return NVME_INVALID_NSID | NVME_DNR;
@@ -1200,6 +1209,9 @@ static uint16_t nvme_aer(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
 
 static uint16_t nvme_admin_cmd(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
 {
+    trace_nvme_dev_admin_cmd(nvme_cid(req), le16_to_cpu(req->sq->sqid),
+                             cmd->opcode);
+
     switch (cmd->opcode) {
     case NVME_ADM_CMD_DELETE_SQ:
         return nvme_del_sq(n, cmd);
@@ -1525,6 +1537,8 @@ static uint64_t nvme_mmio_read(void *opaque, hwaddr addr, unsigned size)
     uint8_t *ptr = (uint8_t *)&n->bar;
     uint64_t val = 0;
 
+    trace_nvme_dev_mmio_read(addr);
+
     if (unlikely(addr & (sizeof(uint32_t) - 1))) {
         NVME_GUEST_ERR(nvme_dev_ub_mmiord_misaligned32,
                        "MMIO read not 32-bit aligned,"
@@ -1599,6 +1613,8 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
             return;
         }
 
+        trace_nvme_dev_mmio_doorbell_cq(cq->cqid, new_head);
+
         start_sqs = nvme_cq_full(cq) ? 1 : 0;
         cq->head = new_head;
         if (start_sqs) {
@@ -1651,6 +1667,8 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
             return;
         }
 
+        trace_nvme_dev_mmio_doorbell_sq(sq->sqid, new_tail);
+
         sq->tail = new_tail;
         timer_mod(sq->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + 500);
     }
@@ -1659,6 +1677,8 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
 static void nvme_mmio_write(void *opaque, hwaddr addr, uint64_t data,
     unsigned size)
 {
+    trace_nvme_dev_mmio_write(addr, data);
+
     NvmeCtrl *n = (NvmeCtrl *)opaque;
     if (addr < sizeof(n->bar)) {
         nvme_write_bar(n, addr, data, size);
diff --git a/hw/block/trace-events b/hw/block/trace-events
index b6fde13419bf..659091fc2fed 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -33,14 +33,18 @@ nvme_dev_irq_msix(uint32_t vector) "raising MSI-X IRQ vector %u"
 nvme_dev_irq_pin(void) "pulsing IRQ pin"
 nvme_dev_irq_masked(void) "IRQ is masked"
 nvme_dev_dma_read(uint64_t prp1, uint64_t prp2) "DMA read, prp1=0x%"PRIx64" prp2=0x%"PRIx64""
+nvme_dev_io_cmd(uint16_t cid, uint32_t nsid, uint16_t sqid, uint8_t opcode) "cid %"PRIu16" nsid %"PRIu32" sqid %"PRIu16" opc 0x%"PRIx8""
+nvme_dev_admin_cmd(uint16_t cid, uint16_t sqid, uint8_t opcode) "cid %"PRIu16" sqid %"PRIu16" opc 0x%"PRIx8""
 nvme_dev_rw(const char *verb, uint32_t blk_count, uint64_t byte_count, uint64_t lba) "%s %"PRIu32" blocks (%"PRIu64" bytes) from LBA %"PRIu64""
+nvme_dev_rw_cb(uint16_t cid) "cid %"PRIu16""
+nvme_dev_write_zeroes(uint16_t cid, uint64_t slba, uint32_t nlb) "cid %"PRIu16" slba %"PRIu64" nlb %"PRIu32""
 nvme_dev_create_sq(uint64_t addr, uint16_t sqid, uint16_t cqid, uint16_t qsize, uint16_t qflags) "create submission queue, addr=0x%"PRIx64", sqid=%"PRIu16", cqid=%"PRIu16", qsize=%"PRIu16", qflags=%"PRIu16""
 nvme_dev_create_cq(uint64_t addr, uint16_t cqid, uint16_t vector, uint16_t size, uint16_t qflags, int ien) "create completion queue, addr=0x%"PRIx64", cqid=%"PRIu16", vector=%"PRIu16", qsize=%"PRIu16", qflags=%"PRIu16", ien=%d"
 nvme_dev_del_sq(uint16_t qid) "deleting submission queue sqid=%"PRIu16""
 nvme_dev_del_cq(uint16_t cqid) "deleted completion queue, cqid=%"PRIu16""
 nvme_dev_identify_ctrl(void) "identify controller"
-nvme_dev_identify_ns(uint16_t ns) "identify namespace, nsid=%"PRIu16""
-nvme_dev_identify_nslist(uint16_t ns) "identify namespace list, nsid=%"PRIu16""
+nvme_dev_identify_ns(uint32_t ns) "nsid %"PRIu32""
+nvme_dev_identify_nslist(uint32_t ns) "nsid %"PRIu32""
 nvme_dev_getfeat(uint16_t cid, uint32_t fid) "cid %"PRIu16" fid 0x%"PRIx32""
 nvme_dev_setfeat(uint16_t cid, uint32_t fid, uint32_t val) "cid %"PRIu16" fid 0x%"PRIx32" val 0x%"PRIx32""
 nvme_dev_getfeat_vwcache(const char* result) "get feature volatile write cache, result=%s"
@@ -54,10 +58,13 @@ nvme_dev_aer(uint16_t cid) "cid %"PRIu16""
 nvme_dev_aer_aerl_exceeded(void) "aerl exceeded"
 nvme_dev_aer_masked(uint8_t type, uint8_t mask) "type 0x%"PRIx8" mask 0x%"PRIx8""
 nvme_dev_aer_post_cqe(uint8_t typ, uint8_t info, uint8_t log_page) "type 0x%"PRIx8" info 0x%"PRIx8" lid 0x%"PRIx8""
+nvme_dev_enqueue_req_completion(uint16_t cid, uint16_t cqid, uint16_t status) "cid %"PRIu16" cqid %"PRIu16" status 0x%"PRIx16""
 nvme_dev_enqueue_event(uint8_t typ, uint8_t info, uint8_t log_page) "type 0x%"PRIx8" info 0x%"PRIx8" lid 0x%"PRIx8""
 nvme_dev_enqueue_event_noqueue(int queued) "queued %d"
 nvme_dev_enqueue_event_masked(uint8_t typ) "type 0x%"PRIx8""
 nvme_dev_no_outstanding_aers(void) "ignoring event; no outstanding AERs"
+nvme_dev_mmio_read(uint64_t addr) "addr 0x%"PRIx64""
+nvme_dev_mmio_write(uint64_t addr, uint64_t data) "addr 0x%"PRIx64" data 0x%"PRIx64""
 nvme_dev_mmio_intm_set(uint64_t data, uint64_t new_mask) "wrote MMIO, interrupt mask set, data=0x%"PRIx64", new_mask=0x%"PRIx64""
 nvme_dev_mmio_intm_clr(uint64_t data, uint64_t new_mask) "wrote MMIO, interrupt mask clr, data=0x%"PRIx64", new_mask=0x%"PRIx64""
 nvme_dev_mmio_cfg(uint64_t data) "wrote MMIO, config controller config=0x%"PRIx64""
@@ -70,6 +77,8 @@ nvme_dev_mmio_start_success(void) "setting controller enable bit succeeded"
 nvme_dev_mmio_stopped(void) "cleared controller enable bit"
 nvme_dev_mmio_shutdown_set(void) "shutdown bit set"
 nvme_dev_mmio_shutdown_cleared(void) "shutdown bit cleared"
+nvme_dev_mmio_doorbell_cq(uint16_t cqid, uint16_t new_head) "cqid %"PRIu16" new_head %"PRIu16""
+nvme_dev_mmio_doorbell_sq(uint16_t sqid, uint16_t new_tail) "cqid %"PRIu16" new_tail %"PRIu16""
 
 # nvme traces for error conditions
 nvme_dev_err_invalid_dma(void) "PRP/SGL is too small for transfer size"
-- 
2.26.0




* [PATCH v7 17/48] nvme: make sure ncqr and nsqr is valid
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (15 preceding siblings ...)
  2020-04-15  5:51 ` [PATCH v7 16/48] nvme: additional tracing Klaus Jensen
@ 2020-04-15  5:51 ` Klaus Jensen
  2020-04-15  5:51 ` [PATCH v7 18/48] nvme: add log specific field to trace events Klaus Jensen
                   ` (32 subsequent siblings)
  49 siblings, 0 replies; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:51 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

0xffff is not an allowed value for NCQR and NSQR in Set Features on
Number of Queues.
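
For context, cdw11 for Set Features (Number of Queues) carries two 0's
based 16-bit queue counts (NSQR and NCQR); a short sketch of the values
the new check guards against (the variable names are illustrative):

    /* asking for four I/O submission and four I/O completion queues */
    uint32_t dw11_ok  = (3u << 16) | 3u;
    /* 0xffff in either half is not allowed and now fails with
     * Invalid Field in Command */
    uint32_t dw11_bad = (0xffffu << 16) | 3u;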

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Acked-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 hw/block/nvme.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 7094767eeccb..c1e3ae81666a 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -1162,6 +1162,14 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
         blk_set_enable_write_cache(n->conf.blk, dw11 & 1);
         break;
     case NVME_NUMBER_OF_QUEUES:
+        /*
+         * NVMe v1.3, Section 5.21.1.7: 0xffff is not an allowed value for NCQR
+         * and NSQR.
+         */
+        if ((dw11 & 0xffff) == 0xffff || ((dw11 >> 16) & 0xffff) == 0xffff) {
+            return NVME_INVALID_FIELD | NVME_DNR;
+        }
+
         trace_nvme_dev_setfeat_numq((dw11 & 0xFFFF) + 1,
                                     ((dw11 >> 16) & 0xFFFF) + 1,
                                     n->params.max_ioqpairs,
-- 
2.26.0




* [PATCH v7 18/48] nvme: add log specific field to trace events
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (16 preceding siblings ...)
  2020-04-15  5:51 ` [PATCH v7 17/48] nvme: make sure ncqr and nsqr is valid Klaus Jensen
@ 2020-04-15  5:51 ` Klaus Jensen
  2020-04-15  5:51 ` [PATCH v7 19/48] nvme: support identify namespace descriptor list Klaus Jensen
                   ` (31 subsequent siblings)
  49 siblings, 0 replies; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:51 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

The LSP field is not used directly now, but include it in the trace.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 hw/block/nvme.c       | 3 ++-
 hw/block/trace-events | 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index c1e3ae81666a..d4622278450e 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -767,6 +767,7 @@ static uint16_t nvme_get_log(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
     uint32_t dw12 = le32_to_cpu(cmd->cdw12);
     uint32_t dw13 = le32_to_cpu(cmd->cdw13);
     uint8_t  lid = dw10 & 0xff;
+    uint8_t  lsp = (dw10 >> 8) & 0xf;
     uint8_t  rae = (dw10 >> 15) & 0x1;
     uint32_t numdl, numdu;
     uint64_t off, lpol, lpou;
@@ -784,7 +785,7 @@ static uint16_t nvme_get_log(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
         return NVME_INVALID_FIELD | NVME_DNR;
     }
 
-    trace_nvme_dev_get_log(nvme_cid(req), lid, rae, len, off);
+    trace_nvme_dev_get_log(nvme_cid(req), lid, lsp, rae, len, off);
 
     switch (lid) {
     case NVME_LOG_ERROR_INFO:
diff --git a/hw/block/trace-events b/hw/block/trace-events
index 659091fc2fed..fb5b26f6f5f6 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -52,7 +52,7 @@ nvme_dev_getfeat_numq(int result) "get feature number of queues, result=%d"
 nvme_dev_setfeat_numq(int reqcq, int reqsq, int gotcq, int gotsq) "requested cq_count=%d sq_count=%d, responding with cq_count=%d sq_count=%d"
 nvme_dev_setfeat_timestamp(uint64_t ts) "set feature timestamp = 0x%"PRIx64""
 nvme_dev_getfeat_timestamp(uint64_t ts) "get feature timestamp = 0x%"PRIx64""
-nvme_dev_get_log(uint16_t cid, uint8_t lid, uint8_t rae, uint32_t len, uint64_t off) "cid %"PRIu16" lid 0x%"PRIx8" rae 0x%"PRIx8" len %"PRIu32" off %"PRIu64""
+nvme_dev_get_log(uint16_t cid, uint8_t lid, uint8_t lsp, uint8_t rae, uint32_t len, uint64_t off) "cid %"PRIu16" lid 0x%"PRIx8" lsp 0x%"PRIx8" rae 0x%"PRIx8" len %"PRIu32" off %"PRIu64""
 nvme_dev_process_aers(int queued) "queued %d"
 nvme_dev_aer(uint16_t cid) "cid %"PRIu16""
 nvme_dev_aer_aerl_exceeded(void) "aerl exceeded"
-- 
2.26.0




* [PATCH v7 19/48] nvme: support identify namespace descriptor list
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (17 preceding siblings ...)
  2020-04-15  5:51 ` [PATCH v7 18/48] nvme: add log specific field to trace events Klaus Jensen
@ 2020-04-15  5:51 ` Klaus Jensen
  2020-04-15  5:51 ` [PATCH v7 20/48] nvme: enforce valid queue creation sequence Klaus Jensen
                   ` (30 subsequent siblings)
  49 siblings, 0 replies; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:51 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

Since we are not providing the NGUID or EUI64 fields, we must support
the Namespace UUID. We do not have any way of storing a persistent
unique identifier, so conjure up a UUID that is just the namespace id.
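
For illustration (not part of the patch), a namespace with nsid 1 will
report a Namespace UUID of 00000001-0000-0000-0000-000000000000: the
nsid is stored big-endian in the first four bytes of the 16-byte field
and the rest stays zero, which is what the stl_be_p() call below does:

    /* open-coded equivalent of stl_be_p(&ns_descrs->uuid.v, nsid) */
    uint8_t uuid[16] = { 0 };
    uint32_t nsid = 1;
    uuid[0] = nsid >> 24;
    uuid[1] = nsid >> 16;
    uuid[2] = nsid >> 8;
    uuid[3] = nsid;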

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
---
 hw/block/nvme.c       | 39 +++++++++++++++++++++++++++++++++++++++
 hw/block/trace-events |  1 +
 2 files changed, 40 insertions(+)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index d4622278450e..f40bc861facc 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -949,6 +949,43 @@ static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeIdentify *c)
     return ret;
 }
 
+static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeIdentify *c)
+{
+    uint32_t nsid = le32_to_cpu(c->nsid);
+    uint64_t prp1 = le64_to_cpu(c->prp1);
+    uint64_t prp2 = le64_to_cpu(c->prp2);
+
+    uint8_t list[NVME_IDENTIFY_DATA_SIZE];
+
+    struct data {
+        struct {
+            NvmeIdNsDescr hdr;
+            uint8_t v[16];
+        } uuid;
+    };
+
+    struct data *ns_descrs = (struct data *)list;
+
+    trace_nvme_dev_identify_ns_descr_list(nsid);
+
+    if (unlikely(nsid == 0 || nsid > n->num_namespaces)) {
+        trace_nvme_dev_err_invalid_ns(nsid, n->num_namespaces);
+        return NVME_INVALID_NSID | NVME_DNR;
+    }
+
+    /*
+     * Because the NGUID and EUI64 fields are 0 in the Identify Namespace data
+     * structure, a Namespace UUID (nidt = 0x3) must be reported in the
+     * Namespace Identification Descriptor. Add a very basic Namespace UUID
+     * here.
+     */
+    ns_descrs->uuid.hdr.nidt = NVME_NIDT_UUID;
+    ns_descrs->uuid.hdr.nidl = NVME_NIDT_UUID_LEN;
+    stl_be_p(&ns_descrs->uuid.v, nsid);
+
+    return nvme_dma_read_prp(n, list, NVME_IDENTIFY_DATA_SIZE, prp1, prp2);
+}
+
 static uint16_t nvme_identify(NvmeCtrl *n, NvmeCmd *cmd)
 {
     NvmeIdentify *c = (NvmeIdentify *)cmd;
@@ -960,6 +997,8 @@ static uint16_t nvme_identify(NvmeCtrl *n, NvmeCmd *cmd)
         return nvme_identify_ctrl(n, c);
     case NVME_ID_CNS_NS_ACTIVE_LIST:
         return nvme_identify_nslist(n, c);
+    case NVME_ID_CNS_NS_DESCR_LIST:
+        return nvme_identify_ns_descr_list(n, c);
     default:
         trace_nvme_dev_err_invalid_identify_cns(le32_to_cpu(c->cns));
         return NVME_INVALID_FIELD | NVME_DNR;
diff --git a/hw/block/trace-events b/hw/block/trace-events
index fb5b26f6f5f6..7ecd47131ac2 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -45,6 +45,7 @@ nvme_dev_del_cq(uint16_t cqid) "deleted completion queue, cqid=%"PRIu16""
 nvme_dev_identify_ctrl(void) "identify controller"
 nvme_dev_identify_ns(uint32_t ns) "nsid %"PRIu32""
 nvme_dev_identify_nslist(uint32_t ns) "nsid %"PRIu32""
+nvme_dev_identify_ns_descr_list(uint32_t ns) "nsid %"PRIu32""
 nvme_dev_getfeat(uint16_t cid, uint32_t fid) "cid %"PRIu16" fid 0x%"PRIx32""
 nvme_dev_setfeat(uint16_t cid, uint32_t fid, uint32_t val) "cid %"PRIu16" fid 0x%"PRIx32" val 0x%"PRIx32""
 nvme_dev_getfeat_vwcache(const char* result) "get feature volatile write cache, result=%s"
-- 
2.26.0




* [PATCH v7 20/48] nvme: enforce valid queue creation sequence
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (18 preceding siblings ...)
  2020-04-15  5:51 ` [PATCH v7 19/48] nvme: support identify namespace descriptor list Klaus Jensen
@ 2020-04-15  5:51 ` Klaus Jensen
  2020-04-15  5:51 ` [PATCH v7 21/48] nvme: provide the mandatory subnqn field Klaus Jensen
                   ` (29 subsequent siblings)
  49 siblings, 0 replies; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:51 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

Support returning Command Sequence Error if Set Features on Number of
Queues is called after queues have been created.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 hw/block/nvme.c | 12 ++++++++++++
 hw/block/nvme.h |  1 +
 2 files changed, 13 insertions(+)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index f40bc861facc..d88e21a14b77 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -888,6 +888,13 @@ static uint16_t nvme_create_cq(NvmeCtrl *n, NvmeCmd *cmd)
     cq = g_malloc0(sizeof(*cq));
     nvme_init_cq(cq, n, prp1, cqid, vector, qsize + 1,
         NVME_CQ_FLAGS_IEN(qflags));
+
+    /*
+     * It is only required to set qs_created when creating a completion queue;
+     * creating a submission queue without a matching completion queue will
+     * fail.
+     */
+    n->qs_created = true;
     return NVME_SUCCESS;
 }
 
@@ -1202,6 +1209,10 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
         blk_set_enable_write_cache(n->conf.blk, dw11 & 1);
         break;
     case NVME_NUMBER_OF_QUEUES:
+        if (n->qs_created) {
+            return NVME_CMD_SEQ_ERROR | NVME_DNR;
+        }
+
         /*
          * NVMe v1.3, Section 5.21.1.7: 0xffff is not an allowed value for NCQR
          * and NSQR.
@@ -1343,6 +1354,7 @@ static void nvme_clear_ctrl(NvmeCtrl *n)
 
     n->aer_queued = 0;
     n->outstanding_aers = 0;
+    n->qs_created = false;
 
     blk_flush(n->conf.blk);
     n->bar.cc = 0;
diff --git a/hw/block/nvme.h b/hw/block/nvme.h
index 321d37aeaca4..f72ffddae160 100644
--- a/hw/block/nvme.h
+++ b/hw/block/nvme.h
@@ -99,6 +99,7 @@ typedef struct NvmeCtrl {
     BlockConf    conf;
     NvmeParams   params;
 
+    bool        qs_created;
     uint32_t    page_size;
     uint16_t    page_bits;
     uint16_t    max_prp_ents;
-- 
2.26.0




* [PATCH v7 21/48] nvme: provide the mandatory subnqn field
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (19 preceding siblings ...)
  2020-04-15  5:51 ` [PATCH v7 20/48] nvme: enforce valid queue creation sequence Klaus Jensen
@ 2020-04-15  5:51 ` Klaus Jensen
  2020-04-15  5:51 ` [PATCH v7 22/48] nvme: bump supported version to v1.3 Klaus Jensen
                   ` (28 subsequent siblings)
  49 siblings, 0 replies; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:51 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 hw/block/nvme.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index d88e21a14b77..d5c293476411 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -1940,6 +1940,9 @@ static void nvme_init_ctrl(NvmeCtrl *n)
     id->nn = cpu_to_le32(n->num_namespaces);
     id->oncs = cpu_to_le16(NVME_ONCS_WRITE_ZEROS | NVME_ONCS_TIMESTAMP);
 
+    pstrcpy((char *) id->subnqn, sizeof(id->subnqn), "nqn.2019-08.org.qemu:");
+    pstrcat((char *) id->subnqn, sizeof(id->subnqn), n->params.serial);
+
     id->psd[0].mp = cpu_to_le16(0x9c4);
     id->psd[0].enlat = cpu_to_le32(0x10);
     id->psd[0].exlat = cpu_to_le32(0x4);
-- 
2.26.0




* [PATCH v7 22/48] nvme: bump supported version to v1.3
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (20 preceding siblings ...)
  2020-04-15  5:51 ` [PATCH v7 21/48] nvme: provide the mandatory subnqn field Klaus Jensen
@ 2020-04-15  5:51 ` Klaus Jensen
  2020-04-15  5:51 ` [PATCH v7 23/48] nvme: memset preallocated requests structures Klaus Jensen
                   ` (27 subsequent siblings)
  49 siblings, 0 replies; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:51 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 hw/block/nvme.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index d5c293476411..59935d4641a6 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -44,6 +44,7 @@
 #include "trace.h"
 #include "nvme.h"
 
+#define NVME_SPEC_VER 0x00010300
 #define NVME_CMB_BIR 2
 #define NVME_TEMPERATURE 0x143
 #define NVME_TEMPERATURE_WARNING 0x157
@@ -1913,6 +1914,7 @@ static void nvme_init_ctrl(NvmeCtrl *n)
     id->ieee[0] = 0x00;
     id->ieee[1] = 0x02;
     id->ieee[2] = 0xb3;
+    id->ver = cpu_to_le32(NVME_SPEC_VER);
     id->oacs = cpu_to_le16(0);
 
     /*
@@ -1957,7 +1959,7 @@ static void nvme_init_ctrl(NvmeCtrl *n)
     NVME_CAP_SET_CSS(n->bar.cap, 1);
     NVME_CAP_SET_MPSMAX(n->bar.cap, 4);
 
-    n->bar.vs = 0x00010200;
+    n->bar.vs = NVME_SPEC_VER;
     n->bar.intmc = n->bar.intms = 0;
 }
 
-- 
2.26.0




* [PATCH v7 23/48] nvme: memset preallocated requests structures
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (21 preceding siblings ...)
  2020-04-15  5:51 ` [PATCH v7 22/48] nvme: bump supported version to v1.3 Klaus Jensen
@ 2020-04-15  5:51 ` Klaus Jensen
  2020-04-15  5:51 ` [PATCH v7 24/48] nvme: add mapping helpers Klaus Jensen
                   ` (26 subsequent siblings)
  49 siblings, 0 replies; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:51 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

This is preparatory to subsequent patches that change how QSGs/IOVs are
handled. It is important that the qsg and iov members of the NvmeRequest
are initially zeroed.
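
To make the dependency concrete (editor's note): later in this series,
"nvme: remove redundant has_sg member" tears mappings down by inspecting the
structures themselves, roughly like the sketch below, which is only safe if
the members start out zeroed:

    /* sketch of the cleanup pattern used later in the series */
    if (req->qsg.nalloc) {
        qemu_sglist_destroy(&req->qsg);
    }
    if (req->iov.nalloc) {
        qemu_iovec_destroy(&req->iov);
    }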

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 hw/block/nvme.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 59935d4641a6..1d4705693287 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -604,7 +604,7 @@ static void nvme_init_sq(NvmeSQueue *sq, NvmeCtrl *n, uint64_t dma_addr,
     sq->size = size;
     sq->cqid = cqid;
     sq->head = sq->tail = 0;
-    sq->io_req = g_new(NvmeRequest, sq->size);
+    sq->io_req = g_new0(NvmeRequest, sq->size);
 
     QTAILQ_INIT(&sq->req_list);
     QTAILQ_INIT(&sq->out_req_list);
-- 
2.26.0




* [PATCH v7 24/48] nvme: add mapping helpers
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (22 preceding siblings ...)
  2020-04-15  5:51 ` [PATCH v7 23/48] nvme: memset preallocated requests structures Klaus Jensen
@ 2020-04-15  5:51 ` Klaus Jensen
  2020-04-15  5:51 ` [PATCH v7 25/48] nvme: replace dma_acct with blk_acct equivalent Klaus Jensen
                   ` (25 subsequent siblings)
  49 siblings, 0 replies; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:51 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

Add nvme_map_addr, nvme_map_addr_cmb and nvme_addr_to_cmb helpers and
use them in nvme_map_prp.

This fixes a bug where, in the case of a CMB transfer, the device would map
the buffer with an incorrect length: the CMB branch added the second PRP
entry to the iovec with trans_len (the length of the first segment) instead
of the remaining length.
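
A rough usage sketch of the new helper (editor's illustration; 'n', 'prp1',
'prp2', 'trans_len' and 'len' stand for the usual controller state and PRP
bookkeeping inside a command handler):

    QEMUSGList qsg = {0};
    QEMUIOVector iov = {0};
    uint16_t status;

    /* The first mapped address picks the representation: CMB addresses are
     * collected in the iovec, anything else in the scatter/gather list. */
    status = nvme_map_addr(n, &qsg, &iov, prp1, trans_len);
    if (status) {
        return status;
    }

    /* Mixing is rejected: if prp1 was in host memory and prp2 is in the
     * CMB (or vice versa), this returns NVME_INVALID_USE_OF_CMB | NVME_DNR. */
    status = nvme_map_addr(n, &qsg, &iov, prp2, len);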

Fixes: b2b2b67a00574 ("nvme: Add support for Read Data and Write Data in CMBs.")
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
---
 hw/block/nvme.c       | 105 +++++++++++++++++++++++++++++++++++-------
 hw/block/trace-events |   1 +
 2 files changed, 89 insertions(+), 17 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 1d4705693287..b62b053d7c38 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -59,6 +59,11 @@
 
 static void nvme_process_sq(void *opaque);
 
+static inline void *nvme_addr_to_cmb(NvmeCtrl *n, hwaddr addr)
+{
+    return &n->cmbuf[addr - n->ctrl_mem.addr];
+}
+
 static inline bool nvme_addr_is_cmb(NvmeCtrl *n, hwaddr addr)
 {
     hwaddr low = n->ctrl_mem.addr;
@@ -70,7 +75,7 @@ static inline bool nvme_addr_is_cmb(NvmeCtrl *n, hwaddr addr)
 static void nvme_addr_read(NvmeCtrl *n, hwaddr addr, void *buf, int size)
 {
     if (n->bar.cmbsz && nvme_addr_is_cmb(n, addr)) {
-        memcpy(buf, (void *)&n->cmbuf[addr - n->ctrl_mem.addr], size);
+        memcpy(buf, nvme_addr_to_cmb(n, addr), size);
         return;
     }
 
@@ -153,29 +158,87 @@ static void nvme_irq_deassert(NvmeCtrl *n, NvmeCQueue *cq)
     }
 }
 
+static uint16_t nvme_map_addr_cmb(NvmeCtrl *n, QEMUIOVector *iov, hwaddr addr,
+                                  size_t len)
+{
+    if (!len) {
+        return NVME_SUCCESS;
+    }
+
+    if (!nvme_addr_is_cmb(n, addr) || !nvme_addr_is_cmb(n, addr + len - 1)) {
+        return NVME_DATA_TRAS_ERROR;
+    }
+
+    qemu_iovec_add(iov, nvme_addr_to_cmb(n, addr), len);
+
+    return NVME_SUCCESS;
+}
+
+static uint16_t nvme_map_addr(NvmeCtrl *n, QEMUSGList *qsg, QEMUIOVector *iov,
+                              hwaddr addr, size_t len)
+{
+    if (!len) {
+        return NVME_SUCCESS;
+    }
+
+    if (nvme_addr_is_cmb(n, addr)) {
+        if (qsg && qsg->sg) {
+            return NVME_INVALID_USE_OF_CMB | NVME_DNR;
+        }
+
+        assert(iov);
+
+        if (!iov->iov) {
+            qemu_iovec_init(iov, 1);
+        }
+
+        return nvme_map_addr_cmb(n, iov, addr, len);
+    }
+
+    if (iov && iov->iov) {
+        return NVME_INVALID_USE_OF_CMB | NVME_DNR;
+    }
+
+    assert(qsg);
+
+    if (!qsg->sg) {
+        pci_dma_sglist_init(qsg, &n->parent_obj, 1);
+    }
+
+    qemu_sglist_add(qsg, addr, len);
+
+    return NVME_SUCCESS;
+}
+
 static uint16_t nvme_map_prp(QEMUSGList *qsg, QEMUIOVector *iov, uint64_t prp1,
                              uint64_t prp2, uint32_t len, NvmeCtrl *n)
 {
     hwaddr trans_len = n->page_size - (prp1 % n->page_size);
     trans_len = MIN(len, trans_len);
     int num_prps = (len >> n->page_bits) + 1;
+    uint16_t status;
 
     if (unlikely(!prp1)) {
         trace_nvme_dev_err_invalid_prp();
         return NVME_INVALID_FIELD | NVME_DNR;
-    } else if (n->bar.cmbsz && prp1 >= n->ctrl_mem.addr &&
-               prp1 < n->ctrl_mem.addr + int128_get64(n->ctrl_mem.size)) {
-        qsg->nsg = 0;
+    }
+
+    if (nvme_addr_is_cmb(n, prp1)) {
         qemu_iovec_init(iov, num_prps);
-        qemu_iovec_add(iov, (void *)&n->cmbuf[prp1 - n->ctrl_mem.addr], trans_len);
     } else {
         pci_dma_sglist_init(qsg, &n->parent_obj, num_prps);
-        qemu_sglist_add(qsg, prp1, trans_len);
     }
+
+    status = nvme_map_addr(n, qsg, iov, prp1, trans_len);
+    if (status) {
+        goto unmap;
+    }
+
     len -= trans_len;
     if (len) {
         if (unlikely(!prp2)) {
             trace_nvme_dev_err_invalid_prp2_missing();
+            status = NVME_INVALID_FIELD | NVME_DNR;
             goto unmap;
         }
         if (len > n->page_size) {
@@ -192,6 +255,7 @@ static uint16_t nvme_map_prp(QEMUSGList *qsg, QEMUIOVector *iov, uint64_t prp1,
                 if (i == n->max_prp_ents - 1 && len > n->page_size) {
                     if (unlikely(!prp_ent || prp_ent & (n->page_size - 1))) {
                         trace_nvme_dev_err_invalid_prplist_ent(prp_ent);
+                        status = NVME_INVALID_FIELD | NVME_DNR;
                         goto unmap;
                     }
 
@@ -205,14 +269,14 @@ static uint16_t nvme_map_prp(QEMUSGList *qsg, QEMUIOVector *iov, uint64_t prp1,
 
                 if (unlikely(!prp_ent || prp_ent & (n->page_size - 1))) {
                     trace_nvme_dev_err_invalid_prplist_ent(prp_ent);
+                    status = NVME_INVALID_FIELD | NVME_DNR;
                     goto unmap;
                 }
 
                 trans_len = MIN(len, n->page_size);
-                if (qsg->nsg){
-                    qemu_sglist_add(qsg, prp_ent, trans_len);
-                } else {
-                    qemu_iovec_add(iov, (void *)&n->cmbuf[prp_ent - n->ctrl_mem.addr], trans_len);
+                status = nvme_map_addr(n, qsg, iov, prp_ent, trans_len);
+                if (status) {
+                    goto unmap;
                 }
                 len -= trans_len;
                 i++;
@@ -220,20 +284,27 @@ static uint16_t nvme_map_prp(QEMUSGList *qsg, QEMUIOVector *iov, uint64_t prp1,
         } else {
             if (unlikely(prp2 & (n->page_size - 1))) {
                 trace_nvme_dev_err_invalid_prp2_align(prp2);
+                status = NVME_INVALID_FIELD | NVME_DNR;
                 goto unmap;
             }
-            if (qsg->nsg) {
-                qemu_sglist_add(qsg, prp2, len);
-            } else {
-                qemu_iovec_add(iov, (void *)&n->cmbuf[prp2 - n->ctrl_mem.addr], trans_len);
+            status = nvme_map_addr(n, qsg, iov, prp2, len);
+            if (status) {
+                goto unmap;
             }
         }
     }
     return NVME_SUCCESS;
 
- unmap:
-    qemu_sglist_destroy(qsg);
-    return NVME_INVALID_FIELD | NVME_DNR;
+unmap:
+    if (iov && iov->iov) {
+        qemu_iovec_destroy(iov);
+    }
+
+    if (qsg && qsg->sg) {
+        qemu_sglist_destroy(qsg);
+    }
+
+    return status;
 }
 
 static uint16_t nvme_dma_write_prp(NvmeCtrl *n, uint8_t *ptr, uint32_t len,
diff --git a/hw/block/trace-events b/hw/block/trace-events
index 7ecd47131ac2..aff67b52d1e8 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -33,6 +33,7 @@ nvme_dev_irq_msix(uint32_t vector) "raising MSI-X IRQ vector %u"
 nvme_dev_irq_pin(void) "pulsing IRQ pin"
 nvme_dev_irq_masked(void) "IRQ is masked"
 nvme_dev_dma_read(uint64_t prp1, uint64_t prp2) "DMA read, prp1=0x%"PRIx64" prp2=0x%"PRIx64""
+nvme_dev_map_prp(uint16_t cid, uint8_t opc, uint64_t trans_len, uint32_t len, uint64_t prp1, uint64_t prp2, int num_prps) "cid %"PRIu16" opc 0x%"PRIx8" trans_len %"PRIu64" len %"PRIu32" prp1 0x%"PRIx64" prp2 0x%"PRIx64" num_prps %d"
 nvme_dev_io_cmd(uint16_t cid, uint32_t nsid, uint16_t sqid, uint8_t opcode) "cid %"PRIu16" nsid %"PRIu32" sqid %"PRIu16" opc 0x%"PRIx8""
 nvme_dev_admin_cmd(uint16_t cid, uint16_t sqid, uint8_t opcode) "cid %"PRIu16" sqid %"PRIu16" opc 0x%"PRIx8""
 nvme_dev_rw(const char *verb, uint32_t blk_count, uint64_t byte_count, uint64_t lba) "%s %"PRIu32" blocks (%"PRIu64" bytes) from LBA %"PRIu64""
-- 
2.26.0




* [PATCH v7 25/48] nvme: replace dma_acct with blk_acct equivalent
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (23 preceding siblings ...)
  2020-04-15  5:51 ` [PATCH v7 24/48] nvme: add mapping helpers Klaus Jensen
@ 2020-04-15  5:51 ` Klaus Jensen
  2020-04-15  5:51 ` [PATCH v7 26/48] nvme: remove redundant has_sg member Klaus Jensen
                   ` (24 subsequent siblings)
  49 siblings, 0 replies; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:51 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

The QSG isn't always initialized, so accounting could be wrong. Issue a
call to block_acct_start instead, with the size taken from the QSG or IOV
depending on the kind of I/O.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 hw/block/nvme.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index b62b053d7c38..c9f7badd5a15 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -566,9 +566,10 @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeNamespace *ns, NvmeCmd *cmd,
         return NVME_INVALID_FIELD | NVME_DNR;
     }
 
-    dma_acct_start(n->conf.blk, &req->acct, &req->qsg, acct);
     if (req->qsg.nsg > 0) {
         req->has_sg = true;
+        block_acct_start(blk_get_stats(n->conf.blk), &req->acct, req->qsg.size,
+                         acct);
         req->aiocb = is_write ?
             dma_blk_write(n->conf.blk, &req->qsg, data_offset, BDRV_SECTOR_SIZE,
                           nvme_rw_cb, req) :
@@ -576,6 +577,8 @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeNamespace *ns, NvmeCmd *cmd,
                          nvme_rw_cb, req);
     } else {
         req->has_sg = false;
+        block_acct_start(blk_get_stats(n->conf.blk), &req->acct, req->iov.size,
+                         acct);
         req->aiocb = is_write ?
             blk_aio_pwritev(n->conf.blk, data_offset, &req->iov, 0, nvme_rw_cb,
                             req) :
-- 
2.26.0




* [PATCH v7 26/48] nvme: remove redundant has_sg member
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (24 preceding siblings ...)
  2020-04-15  5:51 ` [PATCH v7 25/48] nvme: replace dma_acct with blk_acct equivalent Klaus Jensen
@ 2020-04-15  5:51 ` Klaus Jensen
  2020-04-15  5:51 ` [PATCH v7 27/48] nvme: refactor dma read/write Klaus Jensen
                   ` (23 subsequent siblings)
  49 siblings, 0 replies; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:51 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

Remove the has_sg member from NvmeRequest since it is redundant: whether a
scatter/gather list or an iovec was set up can be seen directly from the
qsg.nalloc and iov.nalloc members.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 hw/block/nvme.c | 11 ++++++-----
 hw/block/nvme.h |  1 -
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index c9f7badd5a15..3e41b1337bf7 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -494,16 +494,20 @@ static void nvme_rw_cb(void *opaque, int ret)
         block_acct_failed(blk_get_stats(n->conf.blk), &req->acct);
         req->status = NVME_INTERNAL_DEV_ERROR;
     }
-    if (req->has_sg) {
+
+    if (req->qsg.nalloc) {
         qemu_sglist_destroy(&req->qsg);
     }
+    if (req->iov.nalloc) {
+        qemu_iovec_destroy(&req->iov);
+    }
+
     nvme_enqueue_req_completion(cq, req);
 }
 
 static uint16_t nvme_flush(NvmeCtrl *n, NvmeNamespace *ns, NvmeCmd *cmd,
     NvmeRequest *req)
 {
-    req->has_sg = false;
     block_acct_start(blk_get_stats(n->conf.blk), &req->acct, 0,
          BLOCK_ACCT_FLUSH);
     req->aiocb = blk_aio_flush(n->conf.blk, nvme_rw_cb, req);
@@ -529,7 +533,6 @@ static uint16_t nvme_write_zeros(NvmeCtrl *n, NvmeNamespace *ns, NvmeCmd *cmd,
         return NVME_LBA_RANGE | NVME_DNR;
     }
 
-    req->has_sg = false;
     block_acct_start(blk_get_stats(n->conf.blk), &req->acct, 0,
                      BLOCK_ACCT_WRITE);
     req->aiocb = blk_aio_pwrite_zeroes(n->conf.blk, offset, count,
@@ -567,7 +570,6 @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeNamespace *ns, NvmeCmd *cmd,
     }
 
     if (req->qsg.nsg > 0) {
-        req->has_sg = true;
         block_acct_start(blk_get_stats(n->conf.blk), &req->acct, req->qsg.size,
                          acct);
         req->aiocb = is_write ?
@@ -576,7 +578,6 @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeNamespace *ns, NvmeCmd *cmd,
             dma_blk_read(n->conf.blk, &req->qsg, data_offset, BDRV_SECTOR_SIZE,
                          nvme_rw_cb, req);
     } else {
-        req->has_sg = false;
         block_acct_start(blk_get_stats(n->conf.blk), &req->acct, req->iov.size,
                          acct);
         req->aiocb = is_write ?
diff --git a/hw/block/nvme.h b/hw/block/nvme.h
index f72ffddae160..a946ae88d817 100644
--- a/hw/block/nvme.h
+++ b/hw/block/nvme.h
@@ -29,7 +29,6 @@ typedef struct NvmeRequest {
     struct NvmeSQueue       *sq;
     BlockAIOCB              *aiocb;
     uint16_t                status;
-    bool                    has_sg;
     NvmeCqe                 cqe;
     BlockAcctCookie         acct;
     QEMUSGList              qsg;
-- 
2.26.0




* [PATCH v7 27/48] nvme: refactor dma read/write
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (25 preceding siblings ...)
  2020-04-15  5:51 ` [PATCH v7 26/48] nvme: remove redundant has_sg member Klaus Jensen
@ 2020-04-15  5:51 ` Klaus Jensen
  2020-04-15  5:51 ` [PATCH v7 28/48] nvme: pass request along for tracing Klaus Jensen
                   ` (22 subsequent siblings)
  49 siblings, 0 replies; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:51 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

Refactor the nvme_dma_{read,write}_prp functions into a common function
taking a DMADirection parameter.
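
For reference (editor's note, not part of the patch): the direction mirrors
the converted call sites below, i.e. DMA_DIRECTION_FROM_DEVICE moves data
from the controller into host memory (Identify, log pages, Get Features)
and DMA_DIRECTION_TO_DEVICE moves data from host memory into the controller
(Set Features, Timestamp):

    nvme_dma_prp(n, (uint8_t *)&n->id_ctrl, sizeof(n->id_ctrl), prp1, prp2,
                 DMA_DIRECTION_FROM_DEVICE);   /* controller -> host */
    nvme_dma_prp(n, (uint8_t *)&timestamp, sizeof(timestamp), prp1, prp2,
                 DMA_DIRECTION_TO_DEVICE);     /* host -> controller */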

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 hw/block/nvme.c | 88 ++++++++++++++++++++++++-------------------------
 1 file changed, 43 insertions(+), 45 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 3e41b1337bf7..2ff7dd695cd7 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -307,55 +307,50 @@ unmap:
     return status;
 }
 
-static uint16_t nvme_dma_write_prp(NvmeCtrl *n, uint8_t *ptr, uint32_t len,
-                                   uint64_t prp1, uint64_t prp2)
+static uint16_t nvme_dma_prp(NvmeCtrl *n, uint8_t *ptr, uint32_t len,
+                             uint64_t prp1, uint64_t prp2, DMADirection dir)
 {
     QEMUSGList qsg;
     QEMUIOVector iov;
     uint16_t status = NVME_SUCCESS;
 
-    if (nvme_map_prp(&qsg, &iov, prp1, prp2, len, n)) {
-        return NVME_INVALID_FIELD | NVME_DNR;
+    status = nvme_map_prp(&qsg, &iov, prp1, prp2, len, n);
+    if (status) {
+        return status;
     }
+
     if (qsg.nsg > 0) {
-        if (dma_buf_write(ptr, len, &qsg)) {
-            status = NVME_INVALID_FIELD | NVME_DNR;
+        uint64_t residual;
+
+        if (dir == DMA_DIRECTION_TO_DEVICE) {
+            residual = dma_buf_write(ptr, len, &qsg);
+        } else {
+            residual = dma_buf_read(ptr, len, &qsg);
         }
-        qemu_sglist_destroy(&qsg);
-    } else {
-        if (qemu_iovec_to_buf(&iov, 0, ptr, len) != len) {
-            status = NVME_INVALID_FIELD | NVME_DNR;
-        }
-        qemu_iovec_destroy(&iov);
-    }
-    return status;
-}
 
-static uint16_t nvme_dma_read_prp(NvmeCtrl *n, uint8_t *ptr, uint32_t len,
-    uint64_t prp1, uint64_t prp2)
-{
-    QEMUSGList qsg;
-    QEMUIOVector iov;
-    uint16_t status = NVME_SUCCESS;
-
-    trace_nvme_dev_dma_read(prp1, prp2);
-
-    if (nvme_map_prp(&qsg, &iov, prp1, prp2, len, n)) {
-        return NVME_INVALID_FIELD | NVME_DNR;
-    }
-    if (qsg.nsg > 0) {
-        if (unlikely(dma_buf_read(ptr, len, &qsg))) {
+        if (unlikely(residual)) {
             trace_nvme_dev_err_invalid_dma();
             status = NVME_INVALID_FIELD | NVME_DNR;
         }
+
         qemu_sglist_destroy(&qsg);
     } else {
-        if (unlikely(qemu_iovec_from_buf(&iov, 0, ptr, len) != len)) {
+        size_t bytes;
+
+        if (dir == DMA_DIRECTION_TO_DEVICE) {
+            bytes = qemu_iovec_to_buf(&iov, 0, ptr, len);
+        } else {
+            bytes = qemu_iovec_from_buf(&iov, 0, ptr, len);
+        }
+
+        if (unlikely(bytes != len)) {
             trace_nvme_dev_err_invalid_dma();
             status = NVME_INVALID_FIELD | NVME_DNR;
         }
+
         qemu_iovec_destroy(&iov);
     }
+
     return status;
 }
 
@@ -788,8 +783,8 @@ static uint16_t nvme_smart_info(NvmeCtrl *n, NvmeCmd *cmd, uint8_t rae,
         nvme_clear_events(n, NVME_AER_TYPE_SMART);
     }
 
-    return nvme_dma_read_prp(n, (uint8_t *) &smart + off, trans_len, prp1,
-                             prp2);
+    return nvme_dma_prp(n, (uint8_t *) &smart + off, trans_len, prp1, prp2,
+                        DMA_DIRECTION_FROM_DEVICE);
 }
 
 static uint16_t nvme_fw_log_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
@@ -808,8 +803,8 @@ static uint16_t nvme_fw_log_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
 
     trans_len = MIN(sizeof(fw_log) - off, buf_len);
 
-    return nvme_dma_read_prp(n, (uint8_t *) &fw_log + off, trans_len, prp1,
-                             prp2);
+    return nvme_dma_prp(n, (uint8_t *) &fw_log + off, trans_len, prp1, prp2,
+                        DMA_DIRECTION_FROM_DEVICE);
 }
 
 static uint16_t nvme_error_info(NvmeCtrl *n, NvmeCmd *cmd, uint8_t rae,
@@ -833,7 +828,8 @@ static uint16_t nvme_error_info(NvmeCtrl *n, NvmeCmd *cmd, uint8_t rae,
 
     trans_len = MIN(sizeof(errlog) - off, buf_len);
 
-    return nvme_dma_read_prp(n, (uint8_t *)&errlog, trans_len, prp1, prp2);
+    return nvme_dma_prp(n, (uint8_t *)&errlog, trans_len, prp1, prp2,
+                        DMA_DIRECTION_FROM_DEVICE);
 }
 
 static uint16_t nvme_get_log(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
@@ -981,8 +977,8 @@ static uint16_t nvme_identify_ctrl(NvmeCtrl *n, NvmeIdentify *c)
 
     trace_nvme_dev_identify_ctrl();
 
-    return nvme_dma_read_prp(n, (uint8_t *)&n->id_ctrl, sizeof(n->id_ctrl),
-        prp1, prp2);
+    return nvme_dma_prp(n, (uint8_t *)&n->id_ctrl, sizeof(n->id_ctrl), prp1,
+                        prp2, DMA_DIRECTION_FROM_DEVICE);
 }
 
 static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeIdentify *c)
@@ -1001,8 +997,8 @@ static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeIdentify *c)
 
     ns = &n->namespaces[nsid - 1];
 
-    return nvme_dma_read_prp(n, (uint8_t *)&ns->id_ns, sizeof(ns->id_ns),
-        prp1, prp2);
+    return nvme_dma_prp(n, (uint8_t *)&ns->id_ns, sizeof(ns->id_ns), prp1,
+                        prp2, DMA_DIRECTION_FROM_DEVICE);
 }
 
 static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeIdentify *c)
@@ -1027,7 +1023,8 @@ static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeIdentify *c)
             break;
         }
     }
-    ret = nvme_dma_read_prp(n, (uint8_t *)list, data_len, prp1, prp2);
+    ret = nvme_dma_prp(n, (uint8_t *)list, data_len, prp1, prp2,
+                       DMA_DIRECTION_FROM_DEVICE);
     g_free(list);
     return ret;
 }
@@ -1066,7 +1063,8 @@ static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeIdentify *c)
     ns_descrs->uuid.hdr.nidl = NVME_NIDT_UUID_LEN;
     stl_be_p(&ns_descrs->uuid.v, nsid);
 
-    return nvme_dma_read_prp(n, list, NVME_IDENTIFY_DATA_SIZE, prp1, prp2);
+    return nvme_dma_prp(n, list, NVME_IDENTIFY_DATA_SIZE, prp1, prp2,
+                        DMA_DIRECTION_FROM_DEVICE);
 }
 
 static uint16_t nvme_identify(NvmeCtrl *n, NvmeCmd *cmd)
@@ -1147,8 +1145,8 @@ static uint16_t nvme_get_feature_timestamp(NvmeCtrl *n, NvmeCmd *cmd)
 
     uint64_t timestamp = nvme_get_timestamp(n);
 
-    return nvme_dma_read_prp(n, (uint8_t *)&timestamp,
-                                 sizeof(timestamp), prp1, prp2);
+    return nvme_dma_prp(n, (uint8_t *)&timestamp, sizeof(timestamp), prp1,
+                        prp2, DMA_DIRECTION_FROM_DEVICE);
 }
 
 static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
@@ -1233,8 +1231,8 @@ static uint16_t nvme_set_feature_timestamp(NvmeCtrl *n, NvmeCmd *cmd)
     uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
     uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
 
-    ret = nvme_dma_write_prp(n, (uint8_t *)&timestamp,
-                                sizeof(timestamp), prp1, prp2);
+    ret = nvme_dma_prp(n, (uint8_t *)&timestamp, sizeof(timestamp), prp1,
+                       prp2, DMA_DIRECTION_TO_DEVICE);
     if (ret != NVME_SUCCESS) {
         return ret;
     }
-- 
2.26.0




* [PATCH v7 28/48] nvme: pass request along for tracing
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (26 preceding siblings ...)
  2020-04-15  5:51 ` [PATCH v7 27/48] nvme: refactor dma read/write Klaus Jensen
@ 2020-04-15  5:51 ` Klaus Jensen
  2020-04-15  5:51 ` [PATCH v7 29/48] nvme: add request mapping helper Klaus Jensen
                   ` (21 subsequent siblings)
  49 siblings, 0 replies; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:51 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 hw/block/nvme.c       | 67 +++++++++++++++++++++++++------------------
 hw/block/trace-events |  2 +-
 2 files changed, 40 insertions(+), 29 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 2ff7dd695cd7..66f92f6f6f2d 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -210,14 +210,18 @@ static uint16_t nvme_map_addr(NvmeCtrl *n, QEMUSGList *qsg, QEMUIOVector *iov,
     return NVME_SUCCESS;
 }
 
-static uint16_t nvme_map_prp(QEMUSGList *qsg, QEMUIOVector *iov, uint64_t prp1,
-                             uint64_t prp2, uint32_t len, NvmeCtrl *n)
+static uint16_t nvme_map_prp(NvmeCtrl *n, QEMUSGList *qsg, QEMUIOVector *iov,
+                             uint64_t prp1, uint64_t prp2, uint32_t len,
+                             NvmeRequest *req)
 {
     hwaddr trans_len = n->page_size - (prp1 % n->page_size);
     trans_len = MIN(len, trans_len);
     int num_prps = (len >> n->page_bits) + 1;
     uint16_t status;
 
+    trace_nvme_dev_map_prp(nvme_cid(req), trans_len, len, prp1, prp2,
+                           num_prps);
+
     if (unlikely(!prp1)) {
         trace_nvme_dev_err_invalid_prp();
         return NVME_INVALID_FIELD | NVME_DNR;
@@ -308,13 +312,14 @@ unmap:
 }
 
 static uint16_t nvme_dma_prp(NvmeCtrl *n, uint8_t *ptr, uint32_t len,
-                             uint64_t prp1, uint64_t prp2, DMADirection dir)
+                             uint64_t prp1, uint64_t prp2, DMADirection dir,
+                             NvmeRequest *req)
 {
     QEMUSGList qsg;
     QEMUIOVector iov;
     uint16_t status = NVME_SUCCESS;
 
-    status = nvme_map_prp(&qsg, &iov, prp1, prp2, len, n);
+    status = nvme_map_prp(n, &qsg, &iov, prp1, prp2, len, req);
     if (status) {
         return status;
     }
@@ -559,7 +564,7 @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeNamespace *ns, NvmeCmd *cmd,
         return NVME_LBA_RANGE | NVME_DNR;
     }
 
-    if (nvme_map_prp(&req->qsg, &req->iov, prp1, prp2, data_size, n)) {
+    if (nvme_map_prp(n, &req->qsg, &req->iov, prp1, prp2, data_size, req)) {
         block_acct_invalid(blk_get_stats(n->conf.blk), acct);
         return NVME_INVALID_FIELD | NVME_DNR;
     }
@@ -784,7 +789,7 @@ static uint16_t nvme_smart_info(NvmeCtrl *n, NvmeCmd *cmd, uint8_t rae,
     }
 
     return nvme_dma_prp(n, (uint8_t *) &smart + off, trans_len, prp1, prp2,
-                        DMA_DIRECTION_FROM_DEVICE);
+                        DMA_DIRECTION_FROM_DEVICE, req);
 }
 
 static uint16_t nvme_fw_log_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
@@ -804,7 +809,7 @@ static uint16_t nvme_fw_log_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
     trans_len = MIN(sizeof(fw_log) - off, buf_len);
 
     return nvme_dma_prp(n, (uint8_t *) &fw_log + off, trans_len, prp1, prp2,
-                        DMA_DIRECTION_FROM_DEVICE);
+                        DMA_DIRECTION_FROM_DEVICE, req);
 }
 
 static uint16_t nvme_error_info(NvmeCtrl *n, NvmeCmd *cmd, uint8_t rae,
@@ -829,7 +834,7 @@ static uint16_t nvme_error_info(NvmeCtrl *n, NvmeCmd *cmd, uint8_t rae,
     trans_len = MIN(sizeof(errlog) - off, buf_len);
 
     return nvme_dma_prp(n, (uint8_t *)&errlog, trans_len, prp1, prp2,
-                        DMA_DIRECTION_FROM_DEVICE);
+                        DMA_DIRECTION_FROM_DEVICE, req);
 }
 
 static uint16_t nvme_get_log(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
@@ -970,7 +975,8 @@ static uint16_t nvme_create_cq(NvmeCtrl *n, NvmeCmd *cmd)
     return NVME_SUCCESS;
 }
 
-static uint16_t nvme_identify_ctrl(NvmeCtrl *n, NvmeIdentify *c)
+static uint16_t nvme_identify_ctrl(NvmeCtrl *n, NvmeIdentify *c,
+                                   NvmeRequest *req)
 {
     uint64_t prp1 = le64_to_cpu(c->prp1);
     uint64_t prp2 = le64_to_cpu(c->prp2);
@@ -978,10 +984,11 @@ static uint16_t nvme_identify_ctrl(NvmeCtrl *n, NvmeIdentify *c)
     trace_nvme_dev_identify_ctrl();
 
     return nvme_dma_prp(n, (uint8_t *)&n->id_ctrl, sizeof(n->id_ctrl), prp1,
-                        prp2, DMA_DIRECTION_FROM_DEVICE);
+                        prp2, DMA_DIRECTION_FROM_DEVICE, req);
 }
 
-static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeIdentify *c)
+static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeIdentify *c,
+                                 NvmeRequest *req)
 {
     NvmeNamespace *ns;
     uint32_t nsid = le32_to_cpu(c->nsid);
@@ -998,10 +1005,11 @@ static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeIdentify *c)
     ns = &n->namespaces[nsid - 1];
 
     return nvme_dma_prp(n, (uint8_t *)&ns->id_ns, sizeof(ns->id_ns), prp1,
-                        prp2, DMA_DIRECTION_FROM_DEVICE);
+                        prp2, DMA_DIRECTION_FROM_DEVICE, req);
 }
 
-static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeIdentify *c)
+static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeIdentify *c,
+                                     NvmeRequest *req)
 {
     static const int data_len = NVME_IDENTIFY_DATA_SIZE;
     uint32_t min_nsid = le32_to_cpu(c->nsid);
@@ -1024,12 +1032,13 @@ static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeIdentify *c)
         }
     }
     ret = nvme_dma_prp(n, (uint8_t *)list, data_len, prp1, prp2,
-                       DMA_DIRECTION_FROM_DEVICE);
+                       DMA_DIRECTION_FROM_DEVICE, req);
     g_free(list);
     return ret;
 }
 
-static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeIdentify *c)
+static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeIdentify *c,
+                                            NvmeRequest *req)
 {
     uint32_t nsid = le32_to_cpu(c->nsid);
     uint64_t prp1 = le64_to_cpu(c->prp1);
@@ -1064,22 +1073,22 @@ static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeIdentify *c)
     stl_be_p(&ns_descrs->uuid.v, nsid);
 
     return nvme_dma_prp(n, list, NVME_IDENTIFY_DATA_SIZE, prp1, prp2,
-                        DMA_DIRECTION_FROM_DEVICE);
+                        DMA_DIRECTION_FROM_DEVICE, req);
 }
 
-static uint16_t nvme_identify(NvmeCtrl *n, NvmeCmd *cmd)
+static uint16_t nvme_identify(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
 {
     NvmeIdentify *c = (NvmeIdentify *)cmd;
 
     switch (le32_to_cpu(c->cns)) {
     case NVME_ID_CNS_NS:
-        return nvme_identify_ns(n, c);
+        return nvme_identify_ns(n, c, req);
     case NVME_ID_CNS_CTRL:
-        return nvme_identify_ctrl(n, c);
+        return nvme_identify_ctrl(n, c, req);
     case NVME_ID_CNS_NS_ACTIVE_LIST:
-        return nvme_identify_nslist(n, c);
+        return nvme_identify_nslist(n, c, req);
     case NVME_ID_CNS_NS_DESCR_LIST:
-        return nvme_identify_ns_descr_list(n, c);
+        return nvme_identify_ns_descr_list(n, c, req);
     default:
         trace_nvme_dev_err_invalid_identify_cns(le32_to_cpu(c->cns));
         return NVME_INVALID_FIELD | NVME_DNR;
@@ -1138,7 +1147,8 @@ static inline uint64_t nvme_get_timestamp(const NvmeCtrl *n)
     return cpu_to_le64(ts.all);
 }
 
-static uint16_t nvme_get_feature_timestamp(NvmeCtrl *n, NvmeCmd *cmd)
+static uint16_t nvme_get_feature_timestamp(NvmeCtrl *n, NvmeCmd *cmd,
+                                           NvmeRequest *req)
 {
     uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
     uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
@@ -1146,7 +1156,7 @@ static uint16_t nvme_get_feature_timestamp(NvmeCtrl *n, NvmeCmd *cmd)
     uint64_t timestamp = nvme_get_timestamp(n);
 
     return nvme_dma_prp(n, (uint8_t *)&timestamp, sizeof(timestamp), prp1,
-                        prp2, DMA_DIRECTION_FROM_DEVICE);
+                        prp2, DMA_DIRECTION_FROM_DEVICE, req);
 }
 
 static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
@@ -1198,7 +1208,7 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
         trace_nvme_dev_getfeat_numq(result);
         break;
     case NVME_TIMESTAMP:
-        return nvme_get_feature_timestamp(n, cmd);
+        return nvme_get_feature_timestamp(n, cmd, req);
     case NVME_INTERRUPT_COALESCING:
         result = cpu_to_le32(n->features.int_coalescing);
         break;
@@ -1224,7 +1234,8 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
     return NVME_SUCCESS;
 }
 
-static uint16_t nvme_set_feature_timestamp(NvmeCtrl *n, NvmeCmd *cmd)
+static uint16_t nvme_set_feature_timestamp(NvmeCtrl *n, NvmeCmd *cmd,
+                                           NvmeRequest *req)
 {
     uint16_t ret;
     uint64_t timestamp;
@@ -1232,7 +1243,7 @@ static uint16_t nvme_set_feature_timestamp(NvmeCtrl *n, NvmeCmd *cmd)
     uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
 
     ret = nvme_dma_prp(n, (uint8_t *)&timestamp, sizeof(timestamp), prp1,
-                       prp2, DMA_DIRECTION_TO_DEVICE);
+                       prp2, DMA_DIRECTION_TO_DEVICE, req);
     if (ret != NVME_SUCCESS) {
         return ret;
     }
@@ -1303,7 +1314,7 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
                                       ((n->params.max_ioqpairs - 1) << 16));
         break;
     case NVME_TIMESTAMP:
-        return nvme_set_feature_timestamp(n, cmd);
+        return nvme_set_feature_timestamp(n, cmd, req);
     case NVME_ASYNCHRONOUS_EVENT_CONF:
         n->features.async_config = dw11;
         break;
@@ -1357,7 +1368,7 @@ static uint16_t nvme_admin_cmd(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
     case NVME_ADM_CMD_CREATE_CQ:
         return nvme_create_cq(n, cmd);
     case NVME_ADM_CMD_IDENTIFY:
-        return nvme_identify(n, cmd);
+        return nvme_identify(n, cmd, req);
     case NVME_ADM_CMD_ABORT:
         return nvme_abort(n, cmd, req);
     case NVME_ADM_CMD_SET_FEATURES:
diff --git a/hw/block/trace-events b/hw/block/trace-events
index aff67b52d1e8..e050af87ece4 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -33,7 +33,7 @@ nvme_dev_irq_msix(uint32_t vector) "raising MSI-X IRQ vector %u"
 nvme_dev_irq_pin(void) "pulsing IRQ pin"
 nvme_dev_irq_masked(void) "IRQ is masked"
 nvme_dev_dma_read(uint64_t prp1, uint64_t prp2) "DMA read, prp1=0x%"PRIx64" prp2=0x%"PRIx64""
-nvme_dev_map_prp(uint16_t cid, uint8_t opc, uint64_t trans_len, uint32_t len, uint64_t prp1, uint64_t prp2, int num_prps) "cid %"PRIu16" opc 0x%"PRIx8" trans_len %"PRIu64" len %"PRIu32" prp1 0x%"PRIx64" prp2 0x%"PRIx64" num_prps %d"
+nvme_dev_map_prp(uint16_t cid, uint64_t trans_len, uint32_t len, uint64_t prp1, uint64_t prp2, int num_prps) "cid %"PRIu16" trans_len %"PRIu64" len %"PRIu32" prp1 0x%"PRIx64" prp2 0x%"PRIx64" num_prps %d"
 nvme_dev_io_cmd(uint16_t cid, uint32_t nsid, uint16_t sqid, uint8_t opcode) "cid %"PRIu16" nsid %"PRIu32" sqid %"PRIu16" opc 0x%"PRIx8""
 nvme_dev_admin_cmd(uint16_t cid, uint16_t sqid, uint8_t opcode) "cid %"PRIu16" sqid %"PRIu16" opc 0x%"PRIx8""
 nvme_dev_rw(const char *verb, uint32_t blk_count, uint64_t byte_count, uint64_t lba) "%s %"PRIu32" blocks (%"PRIu64" bytes) from LBA %"PRIu64""
-- 
2.26.0




* [PATCH v7 29/48] nvme: add request mapping helper
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (27 preceding siblings ...)
  2020-04-15  5:51 ` [PATCH v7 28/48] nvme: pass request along for tracing Klaus Jensen
@ 2020-04-15  5:51 ` Klaus Jensen
  2020-04-15  5:51 ` [PATCH v7 30/48] nvme: verify validity of prp lists in the cmb Klaus Jensen
                   ` (20 subsequent siblings)
  49 siblings, 0 replies; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:51 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

Introduce the nvme_map helper to remove some noise in the main nvme_rw
function.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 hw/block/nvme.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 66f92f6f6f2d..1f4ce48b9cbb 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -359,6 +359,15 @@ static uint16_t nvme_dma_prp(NvmeCtrl *n, uint8_t *ptr, uint32_t len,
     return status;
 }
 
+static uint16_t nvme_map(NvmeCtrl *n, NvmeCmd *cmd, QEMUSGList *qsg,
+                         QEMUIOVector *iov, size_t len, NvmeRequest *req)
+{
+    uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
+    uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
+
+    return nvme_map_prp(n, qsg, iov, prp1, prp2, len, req);
+}
+
 static void nvme_post_cqes(void *opaque)
 {
     NvmeCQueue *cq = opaque;
@@ -546,8 +555,6 @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeNamespace *ns, NvmeCmd *cmd,
     NvmeRwCmd *rw = (NvmeRwCmd *)cmd;
     uint32_t nlb  = le32_to_cpu(rw->nlb) + 1;
     uint64_t slba = le64_to_cpu(rw->slba);
-    uint64_t prp1 = le64_to_cpu(rw->dptr.prp1);
-    uint64_t prp2 = le64_to_cpu(rw->dptr.prp2);
 
     uint8_t lba_index  = NVME_ID_NS_FLBAS_INDEX(ns->id_ns.flbas);
     uint8_t data_shift = ns->id_ns.lbaf[lba_index].ds;
@@ -564,7 +571,7 @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeNamespace *ns, NvmeCmd *cmd,
         return NVME_LBA_RANGE | NVME_DNR;
     }
 
-    if (nvme_map_prp(n, &req->qsg, &req->iov, prp1, prp2, data_size, req)) {
+    if (nvme_map(n, cmd, &req->qsg, &req->iov, data_size, req)) {
         block_acct_invalid(blk_get_stats(n->conf.blk), acct);
         return NVME_INVALID_FIELD | NVME_DNR;
     }
-- 
2.26.0




* [PATCH v7 30/48] nvme: verify validity of prp lists in the cmb
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (28 preceding siblings ...)
  2020-04-15  5:51 ` [PATCH v7 29/48] nvme: add request mapping helper Klaus Jensen
@ 2020-04-15  5:51 ` Klaus Jensen
  2020-04-15  5:51 ` [PATCH v7 31/48] nvme: refactor request bounds checking Klaus Jensen
                   ` (19 subsequent siblings)
  49 siblings, 0 replies; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:51 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

The device already supported PRP lists in the CMB, but it neither checked
their validity nor announced the support in the CMBSZ.LISTS field.

If some entries of a PRP list are in the CMB, then ALL entries must be
there. This patch verifies that and properly announces support for PRP
lists in the CMB.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 hw/block/nvme.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 1f4ce48b9cbb..3e5e99644a4e 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -218,6 +218,7 @@ static uint16_t nvme_map_prp(NvmeCtrl *n, QEMUSGList *qsg, QEMUIOVector *iov,
     trans_len = MIN(len, trans_len);
     int num_prps = (len >> n->page_bits) + 1;
     uint16_t status;
+    bool prp_list_in_cmb = false;
 
     trace_nvme_dev_map_prp(nvme_cid(req), trans_len, len, prp1, prp2,
                            num_prps);
@@ -245,11 +246,16 @@ static uint16_t nvme_map_prp(NvmeCtrl *n, QEMUSGList *qsg, QEMUIOVector *iov,
             status = NVME_INVALID_FIELD | NVME_DNR;
             goto unmap;
         }
+
         if (len > n->page_size) {
             uint64_t prp_list[n->max_prp_ents];
             uint32_t nents, prp_trans;
             int i = 0;
 
+            if (nvme_addr_is_cmb(n, prp2)) {
+                prp_list_in_cmb = true;
+            }
+
             nents = (len + n->page_size - 1) >> n->page_bits;
             prp_trans = MIN(n->max_prp_ents, nents) * sizeof(uint64_t);
             nvme_addr_read(n, prp2, (void *)prp_list, prp_trans);
@@ -263,6 +269,11 @@ static uint16_t nvme_map_prp(NvmeCtrl *n, QEMUSGList *qsg, QEMUIOVector *iov,
                         goto unmap;
                     }
 
+                    if (prp_list_in_cmb != nvme_addr_is_cmb(n, prp_ent)) {
+                        status = NVME_INVALID_USE_OF_CMB | NVME_DNR;
+                        goto unmap;
+                    }
+
                     i = 0;
                     nents = (len + n->page_size - 1) >> n->page_bits;
                     prp_trans = MIN(n->max_prp_ents, nents) * sizeof(uint64_t);
@@ -282,6 +293,7 @@ static uint16_t nvme_map_prp(NvmeCtrl *n, QEMUSGList *qsg, QEMUIOVector *iov,
                 if (status) {
                     goto unmap;
                 }
+
                 len -= trans_len;
                 i++;
             }
@@ -1953,7 +1965,7 @@ static void nvme_init_cmb(NvmeCtrl *n, PCIDevice *pci_dev)
 
     NVME_CMBSZ_SET_SQS(n->bar.cmbsz, 1);
     NVME_CMBSZ_SET_CQS(n->bar.cmbsz, 0);
-    NVME_CMBSZ_SET_LISTS(n->bar.cmbsz, 0);
+    NVME_CMBSZ_SET_LISTS(n->bar.cmbsz, 1);
     NVME_CMBSZ_SET_RDS(n->bar.cmbsz, 1);
     NVME_CMBSZ_SET_WDS(n->bar.cmbsz, 1);
     NVME_CMBSZ_SET_SZU(n->bar.cmbsz, 2);
-- 
2.26.0




* [PATCH v7 31/48] nvme: refactor request bounds checking
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (29 preceding siblings ...)
  2020-04-15  5:51 ` [PATCH v7 30/48] nvme: verify validity of prp lists in the cmb Klaus Jensen
@ 2020-04-15  5:51 ` Klaus Jensen
  2020-04-15  5:51 ` [PATCH v7 32/48] nvme: add check for mdts Klaus Jensen
                   ` (18 subsequent siblings)
  49 siblings, 0 replies; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:51 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 hw/block/nvme.c | 28 ++++++++++++++++++++++------
 1 file changed, 22 insertions(+), 6 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 3e5e99644a4e..7528d75905d4 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -499,6 +499,20 @@ static void nvme_clear_events(NvmeCtrl *n, uint8_t event_type)
     }
 }
 
+static inline uint16_t nvme_check_bounds(NvmeCtrl *n, NvmeNamespace *ns,
+                                         uint64_t slba, uint32_t nlb,
+                                         NvmeRequest *req)
+{
+    uint64_t nsze = le64_to_cpu(ns->id_ns.nsze);
+
+    if (unlikely(UINT64_MAX - slba < nlb || slba + nlb > nsze)) {
+        trace_nvme_dev_err_invalid_lba_range(slba, nlb, nsze);
+        return NVME_LBA_RANGE | NVME_DNR;
+    }
+
+    return NVME_SUCCESS;
+}
+
 static void nvme_rw_cb(void *opaque, int ret)
 {
     NvmeRequest *req = opaque;
@@ -546,12 +560,13 @@ static uint16_t nvme_write_zeros(NvmeCtrl *n, NvmeNamespace *ns, NvmeCmd *cmd,
     uint32_t nlb  = le16_to_cpu(rw->nlb) + 1;
     uint64_t offset = slba << data_shift;
     uint32_t count = nlb << data_shift;
+    uint16_t status;
 
     trace_nvme_dev_write_zeroes(nvme_cid(req), slba, nlb);
 
-    if (unlikely(slba + nlb > ns->id_ns.nsze)) {
-        trace_nvme_dev_err_invalid_lba_range(slba, nlb, ns->id_ns.nsze);
-        return NVME_LBA_RANGE | NVME_DNR;
+    status = nvme_check_bounds(n, ns, slba, nlb, req);
+    if (status) {
+        return status;
     }
 
     block_acct_start(blk_get_stats(n->conf.blk), &req->acct, 0,
@@ -574,13 +589,14 @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeNamespace *ns, NvmeCmd *cmd,
     uint64_t data_offset = slba << data_shift;
     int is_write = rw->opcode == NVME_CMD_WRITE ? 1 : 0;
     enum BlockAcctType acct = is_write ? BLOCK_ACCT_WRITE : BLOCK_ACCT_READ;
+    uint16_t status;
 
     trace_nvme_dev_rw(is_write ? "write" : "read", nlb, data_size, slba);
 
-    if (unlikely((slba + nlb) > ns->id_ns.nsze)) {
+    status = nvme_check_bounds(n, ns, slba, nlb, req);
+    if (status) {
         block_acct_invalid(blk_get_stats(n->conf.blk), acct);
-        trace_nvme_dev_err_invalid_lba_range(slba, nlb, ns->id_ns.nsze);
-        return NVME_LBA_RANGE | NVME_DNR;
+        return status;
     }
 
     if (nvme_map(n, cmd, &req->qsg, &req->iov, data_size, req)) {
-- 
2.26.0




* [PATCH v7 32/48] nvme: add check for mdts
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (30 preceding siblings ...)
  2020-04-15  5:51 ` [PATCH v7 31/48] nvme: refactor request bounds checking Klaus Jensen
@ 2020-04-15  5:51 ` Klaus Jensen
  2020-04-15  5:51 ` [PATCH v7 33/48] nvme: be consistent about zeros vs zeroes Klaus Jensen
                   ` (17 subsequent siblings)
  49 siblings, 0 replies; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:51 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

Add 'mdts' device parameter to control the Maximum Data Transfer Size of
the controller and check that it is respected.
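
A quick worked example (editor's sketch; assumes the default 4 KiB memory
page size and the default mdts=7 from this patch): as in nvme_check_mdts()
below, a value of 0 disables the check, otherwise the limit is
page_size << mdts.

    #include <inttypes.h>
    #include <stdio.h>

    int main(void)
    {
        uint32_t page_size = 4096;   /* 4 KiB memory pages (CC.MPS = 0)  */
        uint8_t mdts = 7;            /* default of the new parameter     */

        uint32_t limit = mdts ? page_size << mdts : 0;   /* 0 = no limit */

        printf("max data transfer size: %" PRIu32 " KiB\n", limit / 1024);
        /* prints 512 -> reads/writes larger than 512 KiB now fail with
         * Invalid Field in Command */
        return 0;
    }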

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 hw/block/nvme.c       | 29 ++++++++++++++++++++++++++++-
 hw/block/nvme.h       |  4 +++-
 hw/block/trace-events |  1 +
 3 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 7528d75905d4..d8edd071b261 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -19,7 +19,8 @@
  *      -drive file=<file>,if=none,id=<drive_id>
  *      -device nvme,drive=<drive_id>,serial=<serial>,id=<id[optional]>, \
  *              cmb_size_mb=<cmb_size_mb[optional]>, \
- *              max_ioqpairs=<N[optional]>
+ *              max_ioqpairs=<N[optional]>, \
+ *              mdts=<mdts[optional]>
  *
  * Note cmb_size_mb denotes size of CMB in MB. CMB is assumed to be at
  * offset 0 in BAR2 and supports only WDS, RDS and SQS for now.
@@ -499,6 +500,19 @@ static void nvme_clear_events(NvmeCtrl *n, uint8_t event_type)
     }
 }
 
+static inline uint16_t nvme_check_mdts(NvmeCtrl *n, size_t len,
+                                       NvmeRequest *req)
+{
+    uint8_t mdts = n->params.mdts;
+
+    if (mdts && len > n->page_size << mdts) {
+        trace_nvme_dev_err_mdts(nvme_cid(req), n->page_size << mdts, len);
+        return NVME_INVALID_FIELD | NVME_DNR;
+    }
+
+    return NVME_SUCCESS;
+}
+
 static inline uint16_t nvme_check_bounds(NvmeCtrl *n, NvmeNamespace *ns,
                                          uint64_t slba, uint32_t nlb,
                                          NvmeRequest *req)
@@ -593,6 +607,12 @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeNamespace *ns, NvmeCmd *cmd,
 
     trace_nvme_dev_rw(is_write ? "write" : "read", nlb, data_size, slba);
 
+    status = nvme_check_mdts(n, data_size, req);
+    if (status) {
+        block_acct_invalid(blk_get_stats(n->conf.blk), acct);
+        return status;
+    }
+
     status = nvme_check_bounds(n, ns, slba, nlb, req);
     if (status) {
         block_acct_invalid(blk_get_stats(n->conf.blk), acct);
@@ -884,6 +904,7 @@ static uint16_t nvme_get_log(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
     uint32_t numdl, numdu;
     uint64_t off, lpol, lpou;
     size_t   len;
+    uint16_t status;
 
     numdl = (dw10 >> 16);
     numdu = (dw11 & 0xffff);
@@ -899,6 +920,11 @@ static uint16_t nvme_get_log(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
 
     trace_nvme_dev_get_log(nvme_cid(req), lid, lsp, rae, len, off);
 
+    status = nvme_check_mdts(n, len, req);
+    if (status) {
+        return status;
+    }
+
     switch (lid) {
     case NVME_LOG_ERROR_INFO:
         return nvme_error_info(n, cmd, rae, len, off, req);
@@ -2033,6 +2059,7 @@ static void nvme_init_ctrl(NvmeCtrl *n)
     id->ieee[0] = 0x00;
     id->ieee[1] = 0x02;
     id->ieee[2] = 0xb3;
+    id->mdts = params->mdts;
     id->ver = cpu_to_le32(NVME_SPEC_VER);
     id->oacs = cpu_to_le16(0);
 
diff --git a/hw/block/nvme.h b/hw/block/nvme.h
index a946ae88d817..a25568723d0d 100644
--- a/hw/block/nvme.h
+++ b/hw/block/nvme.h
@@ -9,7 +9,8 @@
     DEFINE_PROP_UINT32("num_queues", _state, _props.num_queues, 0), \
     DEFINE_PROP_UINT32("max_ioqpairs", _state, _props.max_ioqpairs, 64), \
     DEFINE_PROP_UINT8("aerl", _state, _props.aerl, 3), \
-    DEFINE_PROP_UINT32("aer_max_queued", _state, _props.aer_max_queued, 64)
+    DEFINE_PROP_UINT32("aer_max_queued", _state, _props.aer_max_queued, 64), \
+    DEFINE_PROP_UINT8("mdts", _state, _props.mdts, 7)
 
 typedef struct NvmeParams {
     char     *serial;
@@ -18,6 +19,7 @@ typedef struct NvmeParams {
     uint32_t cmb_size_mb;
     uint8_t  aerl;
     uint32_t aer_max_queued;
+    uint8_t  mdts;
 } NvmeParams;
 
 typedef struct NvmeAsyncEvent {
diff --git a/hw/block/trace-events b/hw/block/trace-events
index e050af87ece4..291422a5b77d 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -83,6 +83,7 @@ nvme_dev_mmio_doorbell_cq(uint16_t cqid, uint16_t new_head) "cqid %"PRIu16" new_
 nvme_dev_mmio_doorbell_sq(uint16_t sqid, uint16_t new_tail) "cqid %"PRIu16" new_tail %"PRIu16""
 
 # nvme traces for error conditions
+nvme_dev_err_mdts(uint16_t cid, size_t mdts, size_t len) "cid %"PRIu16" mdts %"PRIu64" len %"PRIu64""
 nvme_dev_err_invalid_dma(void) "PRP/SGL is too small for transfer size"
 nvme_dev_err_invalid_prplist_ent(uint64_t prplist) "PRP list entry is null or not page aligned: 0x%"PRIx64""
 nvme_dev_err_invalid_prp2_align(uint64_t prp2) "PRP2 is not page aligned: 0x%"PRIx64""
-- 
2.26.0




* [PATCH v7 33/48] nvme: be consistent about zeros vs zeroes
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (31 preceding siblings ...)
  2020-04-15  5:51 ` [PATCH v7 32/48] nvme: add check for mdts Klaus Jensen
@ 2020-04-15  5:51 ` Klaus Jensen
  2020-04-15  5:51 ` [PATCH v7 34/48] nvme: refactor NvmeRequest Klaus Jensen
                   ` (16 subsequent siblings)
  49 siblings, 0 replies; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:51 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

The spec in general uses 'zeroes' and not 'zeros'.

Now, according to the Oxford dictionary, 'zeroes' is the action of
zeroing something, i.e. "he zeroes the range" and 'zeros' is the plural
of zero. Thus, Write Zeroes should actually be called Write Zeros, but
alas, let us align with the spec.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
---
 block/nvme.c         | 4 ++--
 hw/block/nvme.c      | 8 ++++----
 include/block/nvme.h | 4 ++--
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/block/nvme.c b/block/nvme.c
index 7302cc19ade4..304e975e0270 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -465,7 +465,7 @@ static void nvme_identify(BlockDriverState *bs, int namespace, Error **errp)
                           s->page_size / sizeof(uint64_t) * s->page_size);
 
     oncs = le16_to_cpu(idctrl->oncs);
-    s->supports_write_zeroes = !!(oncs & NVME_ONCS_WRITE_ZEROS);
+    s->supports_write_zeroes = !!(oncs & NVME_ONCS_WRITE_ZEROES);
     s->supports_discard = !!(oncs & NVME_ONCS_DSM);
 
     memset(resp, 0, 4096);
@@ -1119,7 +1119,7 @@ static coroutine_fn int nvme_co_pwrite_zeroes(BlockDriverState *bs,
     }
 
     NvmeCmd cmd = {
-        .opcode = NVME_CMD_WRITE_ZEROS,
+        .opcode = NVME_CMD_WRITE_ZEROES,
         .nsid = cpu_to_le32(s->nsid),
         .cdw10 = cpu_to_le32((offset >> s->blkshift) & 0xFFFFFFFF),
         .cdw11 = cpu_to_le32(((offset >> s->blkshift) >> 32) & 0xFFFFFFFF),
diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index d8edd071b261..94d42046149e 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -564,7 +564,7 @@ static uint16_t nvme_flush(NvmeCtrl *n, NvmeNamespace *ns, NvmeCmd *cmd,
     return NVME_NO_COMPLETE;
 }
 
-static uint16_t nvme_write_zeros(NvmeCtrl *n, NvmeNamespace *ns, NvmeCmd *cmd,
+static uint16_t nvme_write_zeroes(NvmeCtrl *n, NvmeNamespace *ns, NvmeCmd *cmd,
     NvmeRequest *req)
 {
     NvmeRwCmd *rw = (NvmeRwCmd *)cmd;
@@ -662,8 +662,8 @@ static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
     switch (cmd->opcode) {
     case NVME_CMD_FLUSH:
         return nvme_flush(n, ns, cmd, req);
-    case NVME_CMD_WRITE_ZEROS:
-        return nvme_write_zeros(n, ns, cmd, req);
+    case NVME_CMD_WRITE_ZEROES:
+        return nvme_write_zeroes(n, ns, cmd, req);
     case NVME_CMD_WRITE:
     case NVME_CMD_READ:
         return nvme_rw(n, ns, cmd, req);
@@ -2086,7 +2086,7 @@ static void nvme_init_ctrl(NvmeCtrl *n)
     id->sqes = (0x6 << 4) | 0x6;
     id->cqes = (0x4 << 4) | 0x4;
     id->nn = cpu_to_le32(n->num_namespaces);
-    id->oncs = cpu_to_le16(NVME_ONCS_WRITE_ZEROS | NVME_ONCS_TIMESTAMP);
+    id->oncs = cpu_to_le16(NVME_ONCS_WRITE_ZEROES | NVME_ONCS_TIMESTAMP);
 
     pstrcpy((char *) id->subnqn, sizeof(id->subnqn), "nqn.2019-08.org.qemu:");
     pstrcat((char *) id->subnqn, sizeof(id->subnqn), n->params.serial);
diff --git a/include/block/nvme.h b/include/block/nvme.h
index 88e5385a9d3f..c4c669e32fc4 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -287,7 +287,7 @@ enum NvmeIoCommands {
     NVME_CMD_READ               = 0x02,
     NVME_CMD_WRITE_UNCOR        = 0x04,
     NVME_CMD_COMPARE            = 0x05,
-    NVME_CMD_WRITE_ZEROS        = 0x08,
+    NVME_CMD_WRITE_ZEROES       = 0x08,
     NVME_CMD_DSM                = 0x09,
 };
 
@@ -665,7 +665,7 @@ enum NvmeIdCtrlOncs {
     NVME_ONCS_COMPARE       = 1 << 0,
     NVME_ONCS_WRITE_UNCORR  = 1 << 1,
     NVME_ONCS_DSM           = 1 << 2,
-    NVME_ONCS_WRITE_ZEROS   = 1 << 3,
+    NVME_ONCS_WRITE_ZEROES  = 1 << 3,
     NVME_ONCS_FEATURES      = 1 << 4,
     NVME_ONCS_RESRVATIONS   = 1 << 5,
     NVME_ONCS_TIMESTAMP     = 1 << 6,
-- 
2.26.0




* [PATCH v7 34/48] nvme: refactor NvmeRequest
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (32 preceding siblings ...)
  2020-04-15  5:51 ` [PATCH v7 33/48] nvme: be consistent about zeros vs zeroes Klaus Jensen
@ 2020-04-15  5:51 ` Klaus Jensen
  2020-04-15  5:51 ` [PATCH v7 35/48] nvme: remove NvmeCmd parameter Klaus Jensen
                   ` (15 subsequent siblings)
  49 siblings, 0 replies; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:51 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

Add a reference to the NvmeNamespace in the NvmeRequest and move
clearing of the structure from "clear before use" to "clear after use".

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
---
 hw/block/nvme.c | 38 +++++++++++++++++++++-----------------
 hw/block/nvme.h |  1 +
 2 files changed, 22 insertions(+), 17 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 94d42046149e..a7c5f93fc545 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -159,6 +159,12 @@ static void nvme_irq_deassert(NvmeCtrl *n, NvmeCQueue *cq)
     }
 }
 
+static void nvme_req_clear(NvmeRequest *req)
+{
+    req->ns = NULL;
+    memset(&req->cqe, 0x0, sizeof(req->cqe));
+}
+
 static uint16_t nvme_map_addr_cmb(NvmeCtrl *n, QEMUIOVector *iov, hwaddr addr,
                                   size_t len)
 {
@@ -404,6 +410,7 @@ static void nvme_post_cqes(void *opaque)
         nvme_inc_cq_tail(cq);
         pci_dma_write(&n->parent_obj, addr, (void *)&req->cqe,
             sizeof(req->cqe));
+        nvme_req_clear(req);
         QTAILQ_INSERT_TAIL(&sq->req_list, req, entry);
     }
     if (cq->tail != cq->head) {
@@ -513,10 +520,10 @@ static inline uint16_t nvme_check_mdts(NvmeCtrl *n, size_t len,
     return NVME_SUCCESS;
 }
 
-static inline uint16_t nvme_check_bounds(NvmeCtrl *n, NvmeNamespace *ns,
-                                         uint64_t slba, uint32_t nlb,
-                                         NvmeRequest *req)
+static inline uint16_t nvme_check_bounds(NvmeCtrl *n, uint64_t slba,
+                                         uint32_t nlb, NvmeRequest *req)
 {
+    NvmeNamespace *ns = req->ns;
     uint64_t nsze = le64_to_cpu(ns->id_ns.nsze);
 
     if (unlikely(UINT64_MAX - slba < nlb || slba + nlb > nsze)) {
@@ -554,8 +561,7 @@ static void nvme_rw_cb(void *opaque, int ret)
     nvme_enqueue_req_completion(cq, req);
 }
 
-static uint16_t nvme_flush(NvmeCtrl *n, NvmeNamespace *ns, NvmeCmd *cmd,
-    NvmeRequest *req)
+static uint16_t nvme_flush(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
 {
     block_acct_start(blk_get_stats(n->conf.blk), &req->acct, 0,
          BLOCK_ACCT_FLUSH);
@@ -564,10 +570,10 @@ static uint16_t nvme_flush(NvmeCtrl *n, NvmeNamespace *ns, NvmeCmd *cmd,
     return NVME_NO_COMPLETE;
 }
 
-static uint16_t nvme_write_zeroes(NvmeCtrl *n, NvmeNamespace *ns, NvmeCmd *cmd,
-    NvmeRequest *req)
+static uint16_t nvme_write_zeroes(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
 {
     NvmeRwCmd *rw = (NvmeRwCmd *)cmd;
+    NvmeNamespace *ns = req->ns;
     const uint8_t lba_index = NVME_ID_NS_FLBAS_INDEX(ns->id_ns.flbas);
     const uint8_t data_shift = ns->id_ns.lbaf[lba_index].ds;
     uint64_t slba = le64_to_cpu(rw->slba);
@@ -578,7 +584,7 @@ static uint16_t nvme_write_zeroes(NvmeCtrl *n, NvmeNamespace *ns, NvmeCmd *cmd,
 
     trace_nvme_dev_write_zeroes(nvme_cid(req), slba, nlb);
 
-    status = nvme_check_bounds(n, ns, slba, nlb, req);
+    status = nvme_check_bounds(n, slba, nlb, req);
     if (status) {
         return status;
     }
@@ -590,10 +596,10 @@ static uint16_t nvme_write_zeroes(NvmeCtrl *n, NvmeNamespace *ns, NvmeCmd *cmd,
     return NVME_NO_COMPLETE;
 }
 
-static uint16_t nvme_rw(NvmeCtrl *n, NvmeNamespace *ns, NvmeCmd *cmd,
-    NvmeRequest *req)
+static uint16_t nvme_rw(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
 {
     NvmeRwCmd *rw = (NvmeRwCmd *)cmd;
+    NvmeNamespace *ns = req->ns;
     uint32_t nlb  = le32_to_cpu(rw->nlb) + 1;
     uint64_t slba = le64_to_cpu(rw->slba);
 
@@ -613,7 +619,7 @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeNamespace *ns, NvmeCmd *cmd,
         return status;
     }
 
-    status = nvme_check_bounds(n, ns, slba, nlb, req);
+    status = nvme_check_bounds(n, slba, nlb, req);
     if (status) {
         block_acct_invalid(blk_get_stats(n->conf.blk), acct);
         return status;
@@ -647,7 +653,6 @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeNamespace *ns, NvmeCmd *cmd,
 
 static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
 {
-    NvmeNamespace *ns;
     uint32_t nsid = le32_to_cpu(cmd->nsid);
 
     trace_nvme_dev_io_cmd(nvme_cid(req), nsid, le16_to_cpu(req->sq->sqid),
@@ -658,15 +663,15 @@ static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
         return NVME_INVALID_NSID | NVME_DNR;
     }
 
-    ns = &n->namespaces[nsid - 1];
+    req->ns = &n->namespaces[nsid - 1];
     switch (cmd->opcode) {
     case NVME_CMD_FLUSH:
-        return nvme_flush(n, ns, cmd, req);
+        return nvme_flush(n, cmd, req);
     case NVME_CMD_WRITE_ZEROES:
-        return nvme_write_zeroes(n, ns, cmd, req);
+        return nvme_write_zeroes(n, cmd, req);
     case NVME_CMD_WRITE:
     case NVME_CMD_READ:
-        return nvme_rw(n, ns, cmd, req);
+        return nvme_rw(n, cmd, req);
     default:
         trace_nvme_dev_err_invalid_opc(cmd->opcode);
         return NVME_INVALID_OPCODE | NVME_DNR;
@@ -1463,7 +1468,6 @@ static void nvme_process_sq(void *opaque)
         req = QTAILQ_FIRST(&sq->req_list);
         QTAILQ_REMOVE(&sq->req_list, req, entry);
         QTAILQ_INSERT_TAIL(&sq->out_req_list, req, entry);
-        memset(&req->cqe, 0, sizeof(req->cqe));
         req->cqe.cid = cmd.cid;
 
         status = sq->sqid ? nvme_io_cmd(n, &cmd, req) :
diff --git a/hw/block/nvme.h b/hw/block/nvme.h
index a25568723d0d..11a42fa213ab 100644
--- a/hw/block/nvme.h
+++ b/hw/block/nvme.h
@@ -29,6 +29,7 @@ typedef struct NvmeAsyncEvent {
 
 typedef struct NvmeRequest {
     struct NvmeSQueue       *sq;
+    struct NvmeNamespace    *ns;
     BlockAIOCB              *aiocb;
     uint16_t                status;
     NvmeCqe                 cqe;
-- 
2.26.0




* [PATCH v7 35/48] nvme: remove NvmeCmd parameter
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (33 preceding siblings ...)
  2020-04-15  5:51 ` [PATCH v7 34/48] nvme: refactor NvmeRequest Klaus Jensen
@ 2020-04-15  5:51 ` Klaus Jensen
  2020-04-15  5:51 ` [PATCH v7 36/48] nvme: allow multiple aios per command Klaus Jensen
                   ` (14 subsequent siblings)
  49 siblings, 0 replies; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:51 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

Keep a copy of the raw nvme command in the NvmeRequest and remove the
now redundant NvmeCmd parameter.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
---
 hw/block/nvme.c | 177 +++++++++++++++++++++++++-----------------------
 hw/block/nvme.h |   1 +
 2 files changed, 93 insertions(+), 85 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index a7c5f93fc545..846aa31eaae9 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -378,9 +378,10 @@ static uint16_t nvme_dma_prp(NvmeCtrl *n, uint8_t *ptr, uint32_t len,
     return status;
 }
 
-static uint16_t nvme_map(NvmeCtrl *n, NvmeCmd *cmd, QEMUSGList *qsg,
-                         QEMUIOVector *iov, size_t len, NvmeRequest *req)
+static uint16_t nvme_map(NvmeCtrl *n, QEMUSGList *qsg, QEMUIOVector *iov,
+                         size_t len, NvmeRequest *req)
 {
+    NvmeCmd *cmd = &req->cmd;
     uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
     uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
 
@@ -561,7 +562,7 @@ static void nvme_rw_cb(void *opaque, int ret)
     nvme_enqueue_req_completion(cq, req);
 }
 
-static uint16_t nvme_flush(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
+static uint16_t nvme_flush(NvmeCtrl *n, NvmeRequest *req)
 {
     block_acct_start(blk_get_stats(n->conf.blk), &req->acct, 0,
          BLOCK_ACCT_FLUSH);
@@ -570,9 +571,9 @@ static uint16_t nvme_flush(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
     return NVME_NO_COMPLETE;
 }
 
-static uint16_t nvme_write_zeroes(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
+static uint16_t nvme_write_zeroes(NvmeCtrl *n, NvmeRequest *req)
 {
-    NvmeRwCmd *rw = (NvmeRwCmd *)cmd;
+    NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd;
     NvmeNamespace *ns = req->ns;
     const uint8_t lba_index = NVME_ID_NS_FLBAS_INDEX(ns->id_ns.flbas);
     const uint8_t data_shift = ns->id_ns.lbaf[lba_index].ds;
@@ -596,9 +597,9 @@ static uint16_t nvme_write_zeroes(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
     return NVME_NO_COMPLETE;
 }
 
-static uint16_t nvme_rw(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
+static uint16_t nvme_rw(NvmeCtrl *n, NvmeRequest *req)
 {
-    NvmeRwCmd *rw = (NvmeRwCmd *)cmd;
+    NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd;
     NvmeNamespace *ns = req->ns;
     uint32_t nlb  = le32_to_cpu(rw->nlb) + 1;
     uint64_t slba = le64_to_cpu(rw->slba);
@@ -625,7 +626,7 @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
         return status;
     }
 
-    if (nvme_map(n, cmd, &req->qsg, &req->iov, data_size, req)) {
+    if (nvme_map(n, &req->qsg, &req->iov, data_size, req)) {
         block_acct_invalid(blk_get_stats(n->conf.blk), acct);
         return NVME_INVALID_FIELD | NVME_DNR;
     }
@@ -651,12 +652,12 @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
     return NVME_NO_COMPLETE;
 }
 
-static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
+static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeRequest *req)
 {
-    uint32_t nsid = le32_to_cpu(cmd->nsid);
+    uint32_t nsid = le32_to_cpu(req->cmd.nsid);
 
     trace_nvme_dev_io_cmd(nvme_cid(req), nsid, le16_to_cpu(req->sq->sqid),
-                          cmd->opcode);
+                          req->cmd.opcode);
 
     if (unlikely(nsid == 0 || nsid > n->num_namespaces)) {
         trace_nvme_dev_err_invalid_ns(nsid, n->num_namespaces);
@@ -664,16 +665,16 @@ static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
     }
 
     req->ns = &n->namespaces[nsid - 1];
-    switch (cmd->opcode) {
+    switch (req->cmd.opcode) {
     case NVME_CMD_FLUSH:
-        return nvme_flush(n, cmd, req);
+        return nvme_flush(n, req);
     case NVME_CMD_WRITE_ZEROES:
-        return nvme_write_zeroes(n, cmd, req);
+        return nvme_write_zeroes(n, req);
     case NVME_CMD_WRITE:
     case NVME_CMD_READ:
-        return nvme_rw(n, cmd, req);
+        return nvme_rw(n, req);
     default:
-        trace_nvme_dev_err_invalid_opc(cmd->opcode);
+        trace_nvme_dev_err_invalid_opc(req->cmd.opcode);
         return NVME_INVALID_OPCODE | NVME_DNR;
     }
 }
@@ -689,10 +690,10 @@ static void nvme_free_sq(NvmeSQueue *sq, NvmeCtrl *n)
     }
 }
 
-static uint16_t nvme_del_sq(NvmeCtrl *n, NvmeCmd *cmd)
+static uint16_t nvme_del_sq(NvmeCtrl *n, NvmeRequest *req)
 {
-    NvmeDeleteQ *c = (NvmeDeleteQ *)cmd;
-    NvmeRequest *req, *next;
+    NvmeDeleteQ *c = (NvmeDeleteQ *)&req->cmd;
+    NvmeRequest *r, *next;
     NvmeSQueue *sq;
     NvmeCQueue *cq;
     uint16_t qid = le16_to_cpu(c->qid);
@@ -706,19 +707,19 @@ static uint16_t nvme_del_sq(NvmeCtrl *n, NvmeCmd *cmd)
 
     sq = n->sq[qid];
     while (!QTAILQ_EMPTY(&sq->out_req_list)) {
-        req = QTAILQ_FIRST(&sq->out_req_list);
-        assert(req->aiocb);
-        blk_aio_cancel(req->aiocb);
+        r = QTAILQ_FIRST(&sq->out_req_list);
+        assert(r->aiocb);
+        blk_aio_cancel(r->aiocb);
     }
     if (!nvme_check_cqid(n, sq->cqid)) {
         cq = n->cq[sq->cqid];
         QTAILQ_REMOVE(&cq->sq_list, sq, entry);
 
         nvme_post_cqes(cq);
-        QTAILQ_FOREACH_SAFE(req, &cq->req_list, entry, next) {
-            if (req->sq == sq) {
-                QTAILQ_REMOVE(&cq->req_list, req, entry);
-                QTAILQ_INSERT_TAIL(&sq->req_list, req, entry);
+        QTAILQ_FOREACH_SAFE(r, &cq->req_list, entry, next) {
+            if (r->sq == sq) {
+                QTAILQ_REMOVE(&cq->req_list, r, entry);
+                QTAILQ_INSERT_TAIL(&sq->req_list, r, entry);
             }
         }
     }
@@ -755,10 +756,10 @@ static void nvme_init_sq(NvmeSQueue *sq, NvmeCtrl *n, uint64_t dma_addr,
     n->sq[sqid] = sq;
 }
 
-static uint16_t nvme_create_sq(NvmeCtrl *n, NvmeCmd *cmd)
+static uint16_t nvme_create_sq(NvmeCtrl *n, NvmeRequest *req)
 {
     NvmeSQueue *sq;
-    NvmeCreateSq *c = (NvmeCreateSq *)cmd;
+    NvmeCreateSq *c = (NvmeCreateSq *)&req->cmd;
 
     uint16_t cqid = le16_to_cpu(c->cqid);
     uint16_t sqid = le16_to_cpu(c->sqid);
@@ -793,10 +794,10 @@ static uint16_t nvme_create_sq(NvmeCtrl *n, NvmeCmd *cmd)
     return NVME_SUCCESS;
 }
 
-static uint16_t nvme_smart_info(NvmeCtrl *n, NvmeCmd *cmd, uint8_t rae,
-                                uint32_t buf_len, uint64_t off,
-                                NvmeRequest *req)
+static uint16_t nvme_smart_info(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
+                                uint64_t off, NvmeRequest *req)
 {
+    NvmeCmd *cmd = &req->cmd;
     uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
     uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
     uint32_t nsid = le32_to_cpu(cmd->nsid);
@@ -852,10 +853,11 @@ static uint16_t nvme_smart_info(NvmeCtrl *n, NvmeCmd *cmd, uint8_t rae,
                         DMA_DIRECTION_FROM_DEVICE, req);
 }
 
-static uint16_t nvme_fw_log_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
-                                 uint64_t off, NvmeRequest *req)
+static uint16_t nvme_fw_log_info(NvmeCtrl *n, uint32_t buf_len, uint64_t off,
+                                 NvmeRequest *req)
 {
     uint32_t trans_len;
+    NvmeCmd *cmd = &req->cmd;
     uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
     uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
     NvmeFwSlotInfoLog fw_log;
@@ -872,11 +874,11 @@ static uint16_t nvme_fw_log_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
                         DMA_DIRECTION_FROM_DEVICE, req);
 }
 
-static uint16_t nvme_error_info(NvmeCtrl *n, NvmeCmd *cmd, uint8_t rae,
-                                uint32_t buf_len, uint64_t off,
-                                NvmeRequest *req)
+static uint16_t nvme_error_info(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
+                                uint64_t off, NvmeRequest *req)
 {
     uint32_t trans_len;
+    NvmeCmd *cmd = &req->cmd;
     uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
     uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
     NvmeErrorLog errlog;
@@ -897,8 +899,10 @@ static uint16_t nvme_error_info(NvmeCtrl *n, NvmeCmd *cmd, uint8_t rae,
                         DMA_DIRECTION_FROM_DEVICE, req);
 }
 
-static uint16_t nvme_get_log(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
+static uint16_t nvme_get_log(NvmeCtrl *n, NvmeRequest *req)
 {
+    NvmeCmd *cmd = &req->cmd;
+
     uint32_t dw10 = le32_to_cpu(cmd->cdw10);
     uint32_t dw11 = le32_to_cpu(cmd->cdw11);
     uint32_t dw12 = le32_to_cpu(cmd->cdw12);
@@ -932,11 +936,11 @@ static uint16_t nvme_get_log(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
 
     switch (lid) {
     case NVME_LOG_ERROR_INFO:
-        return nvme_error_info(n, cmd, rae, len, off, req);
+        return nvme_error_info(n, rae, len, off, req);
     case NVME_LOG_SMART_INFO:
-        return nvme_smart_info(n, cmd, rae, len, off, req);
+        return nvme_smart_info(n, rae, len, off, req);
     case NVME_LOG_FW_SLOT_INFO:
-        return nvme_fw_log_info(n, cmd, len, off, req);
+        return nvme_fw_log_info(n, len, off, req);
     default:
         trace_nvme_dev_err_invalid_log_page(nvme_cid(req), lid);
         return NVME_INVALID_FIELD | NVME_DNR;
@@ -954,9 +958,9 @@ static void nvme_free_cq(NvmeCQueue *cq, NvmeCtrl *n)
     }
 }
 
-static uint16_t nvme_del_cq(NvmeCtrl *n, NvmeCmd *cmd)
+static uint16_t nvme_del_cq(NvmeCtrl *n, NvmeRequest *req)
 {
-    NvmeDeleteQ *c = (NvmeDeleteQ *)cmd;
+    NvmeDeleteQ *c = (NvmeDeleteQ *)&req->cmd;
     NvmeCQueue *cq;
     uint16_t qid = le16_to_cpu(c->qid);
 
@@ -994,10 +998,10 @@ static void nvme_init_cq(NvmeCQueue *cq, NvmeCtrl *n, uint64_t dma_addr,
     cq->timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, nvme_post_cqes, cq);
 }
 
-static uint16_t nvme_create_cq(NvmeCtrl *n, NvmeCmd *cmd)
+static uint16_t nvme_create_cq(NvmeCtrl *n, NvmeRequest *req)
 {
     NvmeCQueue *cq;
-    NvmeCreateCq *c = (NvmeCreateCq *)cmd;
+    NvmeCreateCq *c = (NvmeCreateCq *)&req->cmd;
     uint16_t cqid = le16_to_cpu(c->cqid);
     uint16_t vector = le16_to_cpu(c->irq_vector);
     uint16_t qsize = le16_to_cpu(c->qsize);
@@ -1041,9 +1045,9 @@ static uint16_t nvme_create_cq(NvmeCtrl *n, NvmeCmd *cmd)
     return NVME_SUCCESS;
 }
 
-static uint16_t nvme_identify_ctrl(NvmeCtrl *n, NvmeIdentify *c,
-                                   NvmeRequest *req)
+static uint16_t nvme_identify_ctrl(NvmeCtrl *n, NvmeRequest *req)
 {
+    NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
     uint64_t prp1 = le64_to_cpu(c->prp1);
     uint64_t prp2 = le64_to_cpu(c->prp2);
 
@@ -1053,10 +1057,10 @@ static uint16_t nvme_identify_ctrl(NvmeCtrl *n, NvmeIdentify *c,
                         prp2, DMA_DIRECTION_FROM_DEVICE, req);
 }
 
-static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeIdentify *c,
-                                 NvmeRequest *req)
+static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req)
 {
     NvmeNamespace *ns;
+    NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
     uint32_t nsid = le32_to_cpu(c->nsid);
     uint64_t prp1 = le64_to_cpu(c->prp1);
     uint64_t prp2 = le64_to_cpu(c->prp2);
@@ -1074,9 +1078,9 @@ static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeIdentify *c,
                         prp2, DMA_DIRECTION_FROM_DEVICE, req);
 }
 
-static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeIdentify *c,
-                                     NvmeRequest *req)
+static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req)
 {
+    NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
     static const int data_len = NVME_IDENTIFY_DATA_SIZE;
     uint32_t min_nsid = le32_to_cpu(c->nsid);
     uint64_t prp1 = le64_to_cpu(c->prp1);
@@ -1103,9 +1107,9 @@ static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeIdentify *c,
     return ret;
 }
 
-static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeIdentify *c,
-                                            NvmeRequest *req)
+static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req)
 {
+    NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
     uint32_t nsid = le32_to_cpu(c->nsid);
     uint64_t prp1 = le64_to_cpu(c->prp1);
     uint64_t prp2 = le64_to_cpu(c->prp2);
@@ -1142,28 +1146,28 @@ static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeIdentify *c,
                         DMA_DIRECTION_FROM_DEVICE, req);
 }
 
-static uint16_t nvme_identify(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
+static uint16_t nvme_identify(NvmeCtrl *n, NvmeRequest *req)
 {
-    NvmeIdentify *c = (NvmeIdentify *)cmd;
+    NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
 
     switch (le32_to_cpu(c->cns)) {
     case NVME_ID_CNS_NS:
-        return nvme_identify_ns(n, c, req);
+        return nvme_identify_ns(n, req);
     case NVME_ID_CNS_CTRL:
-        return nvme_identify_ctrl(n, c, req);
+        return nvme_identify_ctrl(n, req);
     case NVME_ID_CNS_NS_ACTIVE_LIST:
-        return nvme_identify_nslist(n, c, req);
+        return nvme_identify_nslist(n, req);
     case NVME_ID_CNS_NS_DESCR_LIST:
-        return nvme_identify_ns_descr_list(n, c, req);
+        return nvme_identify_ns_descr_list(n, req);
     default:
         trace_nvme_dev_err_invalid_identify_cns(le32_to_cpu(c->cns));
         return NVME_INVALID_FIELD | NVME_DNR;
     }
 }
 
-static uint16_t nvme_abort(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
+static uint16_t nvme_abort(NvmeCtrl *n, NvmeRequest *req)
 {
-    uint16_t sqid = le32_to_cpu(cmd->cdw10) & 0xffff;
+    uint16_t sqid = le32_to_cpu(req->cmd.cdw10) & 0xffff;
 
     req->cqe.result = 1;
     if (nvme_check_sqid(n, sqid)) {
@@ -1213,9 +1217,9 @@ static inline uint64_t nvme_get_timestamp(const NvmeCtrl *n)
     return cpu_to_le64(ts.all);
 }
 
-static uint16_t nvme_get_feature_timestamp(NvmeCtrl *n, NvmeCmd *cmd,
-                                           NvmeRequest *req)
+static uint16_t nvme_get_feature_timestamp(NvmeCtrl *n, NvmeRequest *req)
 {
+    NvmeCmd *cmd = &req->cmd;
     uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
     uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
 
@@ -1225,8 +1229,9 @@ static uint16_t nvme_get_feature_timestamp(NvmeCtrl *n, NvmeCmd *cmd,
                         prp2, DMA_DIRECTION_FROM_DEVICE, req);
 }
 
-static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
+static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeRequest *req)
 {
+    NvmeCmd *cmd = &req->cmd;
     uint32_t dw10 = le32_to_cpu(cmd->cdw10);
     uint32_t dw11 = le32_to_cpu(cmd->cdw11);
     uint32_t result;
@@ -1274,7 +1279,7 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
         trace_nvme_dev_getfeat_numq(result);
         break;
     case NVME_TIMESTAMP:
-        return nvme_get_feature_timestamp(n, cmd, req);
+        return nvme_get_feature_timestamp(n, req);
     case NVME_INTERRUPT_COALESCING:
         result = cpu_to_le32(n->features.int_coalescing);
         break;
@@ -1300,11 +1305,11 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
     return NVME_SUCCESS;
 }
 
-static uint16_t nvme_set_feature_timestamp(NvmeCtrl *n, NvmeCmd *cmd,
-                                           NvmeRequest *req)
+static uint16_t nvme_set_feature_timestamp(NvmeCtrl *n, NvmeRequest *req)
 {
     uint16_t ret;
     uint64_t timestamp;
+    NvmeCmd *cmd = &req->cmd;
     uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
     uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
 
@@ -1319,8 +1324,9 @@ static uint16_t nvme_set_feature_timestamp(NvmeCtrl *n, NvmeCmd *cmd,
     return NVME_SUCCESS;
 }
 
-static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
+static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeRequest *req)
 {
+    NvmeCmd *cmd = &req->cmd;
     uint32_t dw10 = le32_to_cpu(cmd->cdw10);
     uint32_t dw11 = le32_to_cpu(cmd->cdw11);
 
@@ -1380,7 +1386,7 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
                                       ((n->params.max_ioqpairs - 1) << 16));
         break;
     case NVME_TIMESTAMP:
-        return nvme_set_feature_timestamp(n, cmd, req);
+        return nvme_set_feature_timestamp(n, req);
     case NVME_ASYNCHRONOUS_EVENT_CONF:
         n->features.async_config = dw11;
         break;
@@ -1398,7 +1404,7 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
     return NVME_SUCCESS;
 }
 
-static uint16_t nvme_aer(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
+static uint16_t nvme_aer(NvmeCtrl *n, NvmeRequest *req)
 {
     trace_nvme_dev_aer(nvme_cid(req));
 
@@ -1417,34 +1423,34 @@ static uint16_t nvme_aer(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
     return NVME_NO_COMPLETE;
 }
 
-static uint16_t nvme_admin_cmd(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
+static uint16_t nvme_admin_cmd(NvmeCtrl *n, NvmeRequest *req)
 {
     trace_nvme_dev_admin_cmd(nvme_cid(req), le16_to_cpu(req->sq->sqid),
-                             cmd->opcode);
+                             req->cmd.opcode);
 
-    switch (cmd->opcode) {
+    switch (req->cmd.opcode) {
     case NVME_ADM_CMD_DELETE_SQ:
-        return nvme_del_sq(n, cmd);
+        return nvme_del_sq(n, req);
     case NVME_ADM_CMD_CREATE_SQ:
-        return nvme_create_sq(n, cmd);
+        return nvme_create_sq(n, req);
     case NVME_ADM_CMD_GET_LOG_PAGE:
-        return nvme_get_log(n, cmd, req);
+        return nvme_get_log(n, req);
     case NVME_ADM_CMD_DELETE_CQ:
-        return nvme_del_cq(n, cmd);
+        return nvme_del_cq(n, req);
     case NVME_ADM_CMD_CREATE_CQ:
-        return nvme_create_cq(n, cmd);
+        return nvme_create_cq(n, req);
     case NVME_ADM_CMD_IDENTIFY:
-        return nvme_identify(n, cmd, req);
+        return nvme_identify(n, req);
     case NVME_ADM_CMD_ABORT:
-        return nvme_abort(n, cmd, req);
+        return nvme_abort(n, req);
     case NVME_ADM_CMD_SET_FEATURES:
-        return nvme_set_feature(n, cmd, req);
+        return nvme_set_feature(n, req);
     case NVME_ADM_CMD_GET_FEATURES:
-        return nvme_get_feature(n, cmd, req);
+        return nvme_get_feature(n, req);
     case NVME_ADM_CMD_ASYNC_EV_REQ:
-        return nvme_aer(n, cmd, req);
+        return nvme_aer(n, req);
     default:
-        trace_nvme_dev_err_invalid_admin_opc(cmd->opcode);
+        trace_nvme_dev_err_invalid_admin_opc(req->cmd.opcode);
         return NVME_INVALID_OPCODE | NVME_DNR;
     }
 }
@@ -1469,9 +1475,10 @@ static void nvme_process_sq(void *opaque)
         QTAILQ_REMOVE(&sq->req_list, req, entry);
         QTAILQ_INSERT_TAIL(&sq->out_req_list, req, entry);
         req->cqe.cid = cmd.cid;
+        memcpy(&req->cmd, &cmd, sizeof(NvmeCmd));
 
-        status = sq->sqid ? nvme_io_cmd(n, &cmd, req) :
-            nvme_admin_cmd(n, &cmd, req);
+        status = sq->sqid ? nvme_io_cmd(n, req) :
+            nvme_admin_cmd(n, req);
         if (status != NVME_NO_COMPLETE) {
             req->status = status;
             nvme_enqueue_req_completion(cq, req);
diff --git a/hw/block/nvme.h b/hw/block/nvme.h
index 11a42fa213ab..a781528a0ccd 100644
--- a/hw/block/nvme.h
+++ b/hw/block/nvme.h
@@ -33,6 +33,7 @@ typedef struct NvmeRequest {
     BlockAIOCB              *aiocb;
     uint16_t                status;
     NvmeCqe                 cqe;
+    NvmeCmd                 cmd;
     BlockAcctCookie         acct;
     QEMUSGList              qsg;
     QEMUIOVector            iov;
-- 
2.26.0




* [PATCH v7 36/48] nvme: allow multiple aios per command
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (34 preceding siblings ...)
  2020-04-15  5:51 ` [PATCH v7 35/48] nvme: remove NvmeCmd parameter Klaus Jensen
@ 2020-04-15  5:51 ` Klaus Jensen
  2020-04-15  5:51 ` [PATCH v7 37/48] nvme: add nvme_check_rw helper Klaus Jensen
                   ` (13 subsequent siblings)
  49 siblings, 0 replies; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:51 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

This refactors how the device issues asynchronous block backend
requests. The NvmeRequest now holds a queue of NvmeAIOs that are
associated with the command. This allows multiple aios to be issued for
a command. Only when all outstanding aios have completed will the
device post a completion queue entry.

Because the device is currently guaranteed to issue only a single aio
per command, the benefit is not immediately obvious. But this
functionality is required to support metadata, the Dataset Management
command and other features.

Signed-off-by: Klaus Jensen <klaus.jensen@cnexlabs.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Acked-by: Keith Busch <kbusch@kernel.org>
---
 hw/block/nvme.c       | 328 +++++++++++++++++++++++++++++++++---------
 hw/block/nvme.h       | 101 +++++++++++--
 hw/block/trace-events |   3 +
 3 files changed, 350 insertions(+), 82 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 846aa31eaae9..c123be10fd0d 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -59,6 +59,7 @@
     } while (0)
 
 static void nvme_process_sq(void *opaque);
+static void nvme_aio_cb(void *opaque, int ret);
 
 static inline void *nvme_addr_to_cmb(NvmeCtrl *n, hwaddr addr)
 {
@@ -163,6 +164,17 @@ static void nvme_req_clear(NvmeRequest *req)
 {
     req->ns = NULL;
     memset(&req->cqe, 0x0, sizeof(req->cqe));
+    req->status = NVME_SUCCESS;
+    req->slba = req->nlb = 0x0;
+    req->cb = req->cb_arg = NULL;
+
+    if (req->qsg.sg) {
+        qemu_sglist_destroy(&req->qsg);
+    }
+
+    if (req->iov.iov) {
+        qemu_iovec_destroy(&req->iov);
+    }
 }
 
 static uint16_t nvme_map_addr_cmb(NvmeCtrl *n, QEMUIOVector *iov, hwaddr addr,
@@ -388,6 +400,109 @@ static uint16_t nvme_map(NvmeCtrl *n, QEMUSGList *qsg, QEMUIOVector *iov,
     return nvme_map_prp(n, qsg, iov, prp1, prp2, len, req);
 }
 
+static void nvme_aio_destroy(NvmeAIO *aio)
+{
+    g_free(aio);
+}
+
+/*
+ * Register an asynchronous I/O operation with the NvmeRequest. The NvmeRequest
+ * will not complete until all registered AIO's have completed and the
+ * aio_tailq goes empty.
+ */
+static inline void nvme_req_add_aio(NvmeRequest *req, NvmeAIO *aio,
+                                    NvmeAIOOp opc)
+{
+    aio->opc = opc;
+
+    trace_nvme_dev_req_add_aio(nvme_cid(req), aio, blk_name(aio->blk),
+                               aio->offset, aio->len,
+                               nvme_aio_opc_str(aio), req);
+
+    if (req) {
+        QTAILQ_INSERT_TAIL(&req->aio_tailq, aio, tailq_entry);
+    }
+}
+
+/*
+ * Submit an asynchronous I/O operation as described by the given NvmeAIO. This
+ * function takes care of accounting and special handling of reads and writes
+ * going to the Controller Memory Buffer.
+ */
+static void nvme_submit_aio(NvmeAIO *aio)
+{
+    BlockBackend *blk = aio->blk;
+    BlockAcctCookie *acct = &aio->acct;
+    BlockAcctStats *stats = blk_get_stats(blk);
+
+    bool is_write;
+
+    switch (aio->opc) {
+    case NVME_AIO_OPC_NONE:
+        break;
+
+    case NVME_AIO_OPC_FLUSH:
+        block_acct_start(stats, acct, 0, BLOCK_ACCT_FLUSH);
+        aio->aiocb = blk_aio_flush(blk, nvme_aio_cb, aio);
+        break;
+
+    case NVME_AIO_OPC_WRITE_ZEROES:
+        block_acct_start(stats, acct, aio->len, BLOCK_ACCT_WRITE);
+        aio->aiocb = blk_aio_pwrite_zeroes(blk, aio->offset, aio->len,
+                                           BDRV_REQ_MAY_UNMAP, nvme_aio_cb,
+                                           aio);
+        break;
+
+    case NVME_AIO_OPC_READ:
+    case NVME_AIO_OPC_WRITE:
+        is_write = (aio->opc == NVME_AIO_OPC_WRITE);
+
+        block_acct_start(stats, acct, aio->len,
+                         is_write ? BLOCK_ACCT_WRITE : BLOCK_ACCT_READ);
+
+        if (aio->qsg) {
+            if (is_write) {
+                aio->aiocb = dma_blk_write(blk, aio->qsg, aio->offset,
+                                           BDRV_SECTOR_SIZE, nvme_aio_cb, aio);
+            } else {
+                aio->aiocb = dma_blk_read(blk, aio->qsg, aio->offset,
+                                          BDRV_SECTOR_SIZE, nvme_aio_cb, aio);
+            }
+        } else {
+            if (is_write) {
+                aio->aiocb = blk_aio_pwritev(blk, aio->offset, aio->iov, 0,
+                                             nvme_aio_cb, aio);
+            } else {
+                aio->aiocb = blk_aio_preadv(blk, aio->offset, aio->iov, 0,
+                                            nvme_aio_cb, aio);
+            }
+        }
+
+        break;
+    }
+}
+
+static void nvme_rw_aio(BlockBackend *blk, uint64_t offset, NvmeRequest *req)
+{
+    NvmeAIO *aio;
+    size_t len = req->qsg.nsg > 0 ? req->qsg.size : req->iov.size;
+
+    aio = g_new0(NvmeAIO, 1);
+
+    *aio = (NvmeAIO) {
+        .blk = blk,
+        .offset = offset,
+        .len = len,
+        .req = req,
+        .qsg = req->qsg.sg ? &req->qsg : NULL,
+        .iov = req->iov.iov ? &req->iov : NULL,
+    };
+
+    nvme_req_add_aio(req, aio, nvme_req_is_write(req) ?
+                     NVME_AIO_OPC_WRITE : NVME_AIO_OPC_READ);
+    nvme_submit_aio(aio);
+}
+
 static void nvme_post_cqes(void *opaque)
 {
     NvmeCQueue *cq = opaque;
@@ -424,6 +539,7 @@ static void nvme_enqueue_req_completion(NvmeCQueue *cq, NvmeRequest *req)
     assert(cq->cqid == req->sq->cqid);
     trace_nvme_dev_enqueue_req_completion(nvme_cid(req), cq->cqid,
                                           req->status);
+
     QTAILQ_REMOVE(&req->sq->out_req_list, req, entry);
     QTAILQ_INSERT_TAIL(&cq->req_list, req, entry);
     timer_mod(cq->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + 500);
@@ -535,38 +651,104 @@ static inline uint16_t nvme_check_bounds(NvmeCtrl *n, uint64_t slba,
     return NVME_SUCCESS;
 }
 
-static void nvme_rw_cb(void *opaque, int ret)
+static void nvme_rw_cb(NvmeRequest *req, void *opaque)
 {
-    NvmeRequest *req = opaque;
     NvmeSQueue *sq = req->sq;
     NvmeCtrl *n = sq->ctrl;
     NvmeCQueue *cq = n->cq[sq->cqid];
 
     trace_nvme_dev_rw_cb(nvme_cid(req));
 
-    if (!ret) {
-        block_acct_done(blk_get_stats(n->conf.blk), &req->acct);
-        req->status = NVME_SUCCESS;
-    } else {
-        block_acct_failed(blk_get_stats(n->conf.blk), &req->acct);
-        req->status = NVME_INTERNAL_DEV_ERROR;
-    }
-
-    if (req->qsg.nalloc) {
-        qemu_sglist_destroy(&req->qsg);
-    }
-    if (req->iov.nalloc) {
-        qemu_iovec_destroy(&req->iov);
-    }
-
     nvme_enqueue_req_completion(cq, req);
 }
 
+static void nvme_aio_cb(void *opaque, int ret)
+{
+    NvmeAIO *aio = opaque;
+    NvmeRequest *req = aio->req;
+
+    BlockBackend *blk = aio->blk;
+    BlockAcctCookie *acct = &aio->acct;
+    BlockAcctStats *stats = blk_get_stats(blk);
+
+    Error *local_err = NULL;
+
+    trace_nvme_dev_aio_cb(nvme_cid(req), aio, blk_name(blk), aio->offset,
+                          nvme_aio_opc_str(aio), req);
+
+    if (req) {
+        QTAILQ_REMOVE(&req->aio_tailq, aio, tailq_entry);
+    }
+
+    if (!ret) {
+        block_acct_done(stats, acct);
+    } else {
+        block_acct_failed(stats, acct);
+
+        if (req) {
+            uint16_t status;
+
+            switch (aio->opc) {
+            case NVME_AIO_OPC_READ:
+                status = NVME_UNRECOVERED_READ;
+                break;
+            case NVME_AIO_OPC_WRITE:
+            case NVME_AIO_OPC_WRITE_ZEROES:
+                status = NVME_WRITE_FAULT;
+                break;
+            default:
+                status = NVME_INTERNAL_DEV_ERROR;
+                break;
+            }
+
+            trace_nvme_dev_err_aio(nvme_cid(req), aio, blk_name(blk),
+                                   aio->offset, nvme_aio_opc_str(aio), req,
+                                   status);
+
+            error_setg_errno(&local_err, -ret, "aio failed");
+            error_report_err(local_err);
+
+            /*
+             * An Internal Error trumps all other errors. For other errors,
+             * only set the first error encountered. Any additional errors will
+             * be recorded in the error information log page.
+             */
+            if (!req->status || (status & 0xfff) == NVME_INTERNAL_DEV_ERROR) {
+                req->status = status;
+            }
+        }
+    }
+
+    if (aio->cb) {
+        aio->cb(aio, aio->cb_arg, ret);
+    }
+
+    if (req && QTAILQ_EMPTY(&req->aio_tailq)) {
+        if (req->cb) {
+            req->cb(req, req->cb_arg);
+        } else {
+            NvmeSQueue *sq = req->sq;
+            NvmeCtrl *n = sq->ctrl;
+            NvmeCQueue *cq = n->cq[sq->cqid];
+
+            nvme_enqueue_req_completion(cq, req);
+        }
+    }
+
+    nvme_aio_destroy(aio);
+}
+
 static uint16_t nvme_flush(NvmeCtrl *n, NvmeRequest *req)
 {
-    block_acct_start(blk_get_stats(n->conf.blk), &req->acct, 0,
-         BLOCK_ACCT_FLUSH);
-    req->aiocb = blk_aio_flush(n->conf.blk, nvme_rw_cb, req);
+    NvmeAIO *aio = g_new0(NvmeAIO, 1);
+
+    *aio = (NvmeAIO) {
+        .blk = n->conf.blk,
+        .req = req,
+    };
+
+    nvme_req_add_aio(req, aio, NVME_AIO_OPC_FLUSH);
+    nvme_submit_aio(aio);
 
     return NVME_NO_COMPLETE;
 }
@@ -575,25 +757,38 @@ static uint16_t nvme_write_zeroes(NvmeCtrl *n, NvmeRequest *req)
 {
     NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd;
     NvmeNamespace *ns = req->ns;
-    const uint8_t lba_index = NVME_ID_NS_FLBAS_INDEX(ns->id_ns.flbas);
-    const uint8_t data_shift = ns->id_ns.lbaf[lba_index].ds;
-    uint64_t slba = le64_to_cpu(rw->slba);
-    uint32_t nlb  = le16_to_cpu(rw->nlb) + 1;
-    uint64_t offset = slba << data_shift;
-    uint32_t count = nlb << data_shift;
+    NvmeAIO *aio;
+
+    int64_t offset;
+    size_t count;
     uint16_t status;
 
-    trace_nvme_dev_write_zeroes(nvme_cid(req), slba, nlb);
 
-    status = nvme_check_bounds(n, slba, nlb, req);
+    req->slba = le64_to_cpu(rw->slba);
+    req->nlb  = le16_to_cpu(rw->nlb) + 1;
+
+    trace_nvme_dev_write_zeroes(nvme_cid(req), req->slba, req->nlb);
+
+    status = nvme_check_bounds(n, req->slba, req->nlb, req);
     if (status) {
         return status;
     }
 
-    block_acct_start(blk_get_stats(n->conf.blk), &req->acct, 0,
-                     BLOCK_ACCT_WRITE);
-    req->aiocb = blk_aio_pwrite_zeroes(n->conf.blk, offset, count,
-                                        BDRV_REQ_MAY_UNMAP, nvme_rw_cb, req);
+    offset = req->slba << nvme_ns_lbads(ns);
+    count = req->nlb << nvme_ns_lbads(ns);
+
+    aio = g_new0(NvmeAIO, 1);
+
+    *aio = (NvmeAIO) {
+        .blk = n->conf.blk,
+        .offset = offset,
+        .len = count,
+        .req = req,
+    };
+
+    nvme_req_add_aio(req, aio, NVME_AIO_OPC_WRITE_ZEROES);
+    nvme_submit_aio(aio);
+
     return NVME_NO_COMPLETE;
 }
 
@@ -601,55 +796,43 @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeRequest *req)
 {
     NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd;
     NvmeNamespace *ns = req->ns;
-    uint32_t nlb  = le32_to_cpu(rw->nlb) + 1;
-    uint64_t slba = le64_to_cpu(rw->slba);
+    uint32_t len;
+    int status;
 
-    uint8_t lba_index  = NVME_ID_NS_FLBAS_INDEX(ns->id_ns.flbas);
-    uint8_t data_shift = ns->id_ns.lbaf[lba_index].ds;
-    uint64_t data_size = (uint64_t)nlb << data_shift;
-    uint64_t data_offset = slba << data_shift;
-    int is_write = rw->opcode == NVME_CMD_WRITE ? 1 : 0;
-    enum BlockAcctType acct = is_write ? BLOCK_ACCT_WRITE : BLOCK_ACCT_READ;
-    uint16_t status;
+    enum BlockAcctType acct =
+        nvme_req_is_write(req) ? BLOCK_ACCT_WRITE : BLOCK_ACCT_READ;
 
-    trace_nvme_dev_rw(is_write ? "write" : "read", nlb, data_size, slba);
+    req->nlb  = le16_to_cpu(rw->nlb) + 1;
+    req->slba = le64_to_cpu(rw->slba);
 
-    status = nvme_check_mdts(n, data_size, req);
+    len = req->nlb << nvme_ns_lbads(ns);
+
+    trace_nvme_dev_rw(nvme_req_is_write(req) ? "write" : "read", req->nlb,
+                      req->nlb << nvme_ns_lbads(req->ns), req->slba);
+
+    status = nvme_check_mdts(n, len, req);
     if (status) {
-        block_acct_invalid(blk_get_stats(n->conf.blk), acct);
-        return status;
+        goto invalid;
     }
 
-    status = nvme_check_bounds(n, slba, nlb, req);
+    status = nvme_check_bounds(n, req->slba, req->nlb, req);
     if (status) {
-        block_acct_invalid(blk_get_stats(n->conf.blk), acct);
-        return status;
+        goto invalid;
     }
 
-    if (nvme_map(n, &req->qsg, &req->iov, data_size, req)) {
-        block_acct_invalid(blk_get_stats(n->conf.blk), acct);
-        return NVME_INVALID_FIELD | NVME_DNR;
+    status = nvme_map(n, &req->qsg, &req->iov, len, req);
+    if (status) {
+        goto invalid;
     }
 
-    if (req->qsg.nsg > 0) {
-        block_acct_start(blk_get_stats(n->conf.blk), &req->acct, req->qsg.size,
-                         acct);
-        req->aiocb = is_write ?
-            dma_blk_write(n->conf.blk, &req->qsg, data_offset, BDRV_SECTOR_SIZE,
-                          nvme_rw_cb, req) :
-            dma_blk_read(n->conf.blk, &req->qsg, data_offset, BDRV_SECTOR_SIZE,
-                         nvme_rw_cb, req);
-    } else {
-        block_acct_start(blk_get_stats(n->conf.blk), &req->acct, req->iov.size,
-                         acct);
-        req->aiocb = is_write ?
-            blk_aio_pwritev(n->conf.blk, data_offset, &req->iov, 0, nvme_rw_cb,
-                            req) :
-            blk_aio_preadv(n->conf.blk, data_offset, &req->iov, 0, nvme_rw_cb,
-                           req);
-    }
+    nvme_rw_aio(n->conf.blk, req->slba << nvme_ns_lbads(ns), req);
+    nvme_req_set_cb(req, nvme_rw_cb, NULL);
 
     return NVME_NO_COMPLETE;
+
+invalid:
+    block_acct_invalid(blk_get_stats(n->conf.blk), acct);
+    return status;
 }
 
 static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeRequest *req)
@@ -696,6 +879,7 @@ static uint16_t nvme_del_sq(NvmeCtrl *n, NvmeRequest *req)
     NvmeRequest *r, *next;
     NvmeSQueue *sq;
     NvmeCQueue *cq;
+    NvmeAIO *aio;
     uint16_t qid = le16_to_cpu(c->qid);
 
     if (unlikely(!qid || nvme_check_sqid(n, qid))) {
@@ -708,8 +892,11 @@ static uint16_t nvme_del_sq(NvmeCtrl *n, NvmeRequest *req)
     sq = n->sq[qid];
     while (!QTAILQ_EMPTY(&sq->out_req_list)) {
         r = QTAILQ_FIRST(&sq->out_req_list);
-        assert(r->aiocb);
-        blk_aio_cancel(r->aiocb);
+        while (!QTAILQ_EMPTY(&r->aio_tailq)) {
+            aio = QTAILQ_FIRST(&r->aio_tailq);
+            assert(aio->aiocb);
+            blk_aio_cancel(aio->aiocb);
+        }
     }
     if (!nvme_check_cqid(n, sq->cqid)) {
         cq = n->cq[sq->cqid];
@@ -746,6 +933,7 @@ static void nvme_init_sq(NvmeSQueue *sq, NvmeCtrl *n, uint64_t dma_addr,
     QTAILQ_INIT(&sq->out_req_list);
     for (i = 0; i < sq->size; i++) {
         sq->io_req[i].sq = sq;
+        QTAILQ_INIT(&(sq->io_req[i].aio_tailq));
         QTAILQ_INSERT_TAIL(&(sq->req_list), &sq->io_req[i], entry);
     }
     sq->timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, nvme_process_sq, sq);
diff --git a/hw/block/nvme.h b/hw/block/nvme.h
index a781528a0ccd..ed4a5ce4121d 100644
--- a/hw/block/nvme.h
+++ b/hw/block/nvme.h
@@ -27,18 +27,41 @@ typedef struct NvmeAsyncEvent {
     NvmeAerResult result;
 } NvmeAsyncEvent;
 
-typedef struct NvmeRequest {
-    struct NvmeSQueue       *sq;
-    struct NvmeNamespace    *ns;
-    BlockAIOCB              *aiocb;
-    uint16_t                status;
-    NvmeCqe                 cqe;
-    NvmeCmd                 cmd;
-    BlockAcctCookie         acct;
-    QEMUSGList              qsg;
-    QEMUIOVector            iov;
-    QTAILQ_ENTRY(NvmeRequest)entry;
-} NvmeRequest;
+typedef struct NvmeRequest NvmeRequest;
+typedef void NvmeRequestCompletionFunc(NvmeRequest *req, void *opaque);
+
+struct NvmeRequest {
+    struct NvmeSQueue    *sq;
+    struct NvmeNamespace *ns;
+
+    NvmeCqe  cqe;
+    NvmeCmd  cmd;
+    uint16_t status;
+
+    uint64_t slba;
+    uint32_t nlb;
+
+    QEMUSGList   qsg;
+    QEMUIOVector iov;
+
+    NvmeRequestCompletionFunc *cb;
+    void                      *cb_arg;
+
+    QTAILQ_HEAD(, NvmeAIO)    aio_tailq;
+    QTAILQ_ENTRY(NvmeRequest) entry;
+};
+
+static inline void nvme_req_set_cb(NvmeRequest *req,
+                                   NvmeRequestCompletionFunc *cb, void *cb_arg)
+{
+    req->cb = cb;
+    req->cb_arg = cb_arg;
+}
+
+static inline void nvme_req_clear_cb(NvmeRequest *req)
+{
+    req->cb = req->cb_arg = NULL;
+}
 
 typedef struct NvmeSQueue {
     struct NvmeCtrl *ctrl;
@@ -90,6 +113,60 @@ static inline size_t nvme_ns_lbads_bytes(NvmeNamespace *ns)
     return 1 << nvme_ns_lbads(ns);
 }
 
+typedef enum NvmeAIOOp {
+    NVME_AIO_OPC_NONE         = 0x0,
+    NVME_AIO_OPC_FLUSH        = 0x1,
+    NVME_AIO_OPC_READ         = 0x2,
+    NVME_AIO_OPC_WRITE        = 0x3,
+    NVME_AIO_OPC_WRITE_ZEROES = 0x4,
+} NvmeAIOOp;
+
+typedef struct NvmeAIO NvmeAIO;
+typedef void NvmeAIOCompletionFunc(NvmeAIO *aio, void *opaque, int ret);
+
+struct NvmeAIO {
+    NvmeRequest *req;
+
+    NvmeAIOOp       opc;
+    int64_t         offset;
+    size_t          len;
+    BlockBackend    *blk;
+    BlockAIOCB      *aiocb;
+    BlockAcctCookie acct;
+
+    NvmeAIOCompletionFunc *cb;
+    void                  *cb_arg;
+
+    QEMUSGList   *qsg;
+    QEMUIOVector *iov;
+
+    QTAILQ_ENTRY(NvmeAIO) tailq_entry;
+};
+
+static inline const char *nvme_aio_opc_str(NvmeAIO *aio)
+{
+    switch (aio->opc) {
+    case NVME_AIO_OPC_NONE:         return "NVME_AIO_OP_NONE";
+    case NVME_AIO_OPC_FLUSH:        return "NVME_AIO_OP_FLUSH";
+    case NVME_AIO_OPC_READ:         return "NVME_AIO_OP_READ";
+    case NVME_AIO_OPC_WRITE:        return "NVME_AIO_OP_WRITE";
+    case NVME_AIO_OPC_WRITE_ZEROES: return "NVME_AIO_OP_WRITE_ZEROES";
+    default:                        return "NVME_AIO_OP_UNKNOWN";
+    }
+}
+
+static inline bool nvme_req_is_write(NvmeRequest *req)
+{
+    switch (req->cmd.opcode) {
+    case NVME_CMD_WRITE:
+    case NVME_CMD_WRITE_UNCOR:
+    case NVME_CMD_WRITE_ZEROES:
+        return true;
+    default:
+        return false;
+    }
+}
+
 #define TYPE_NVME "nvme"
 #define NVME(obj) \
         OBJECT_CHECK(NvmeCtrl, (obj), TYPE_NVME)
diff --git a/hw/block/trace-events b/hw/block/trace-events
index 291422a5b77d..7c277a2999c0 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -34,6 +34,8 @@ nvme_dev_irq_pin(void) "pulsing IRQ pin"
 nvme_dev_irq_masked(void) "IRQ is masked"
 nvme_dev_dma_read(uint64_t prp1, uint64_t prp2) "DMA read, prp1=0x%"PRIx64" prp2=0x%"PRIx64""
 nvme_dev_map_prp(uint16_t cid, uint64_t trans_len, uint32_t len, uint64_t prp1, uint64_t prp2, int num_prps) "cid %"PRIu16" trans_len %"PRIu64" len %"PRIu32" prp1 0x%"PRIx64" prp2 0x%"PRIx64" num_prps %d"
+nvme_dev_req_add_aio(uint16_t cid, void *aio, const char *blkname, uint64_t offset, uint64_t count, const char *opc, void *req) "cid %"PRIu16" aio %p blk \"%s\" offset %"PRIu64" count %"PRIu64" opc \"%s\" req %p"
+nvme_dev_aio_cb(uint16_t cid, void *aio, const char *blkname, uint64_t offset, const char *opc, void *req) "cid %"PRIu16" aio %p blk \"%s\" offset %"PRIu64" opc \"%s\" req %p"
 nvme_dev_io_cmd(uint16_t cid, uint32_t nsid, uint16_t sqid, uint8_t opcode) "cid %"PRIu16" nsid %"PRIu32" sqid %"PRIu16" opc 0x%"PRIx8""
 nvme_dev_admin_cmd(uint16_t cid, uint16_t sqid, uint8_t opcode) "cid %"PRIu16" sqid %"PRIu16" opc 0x%"PRIx8""
 nvme_dev_rw(const char *verb, uint32_t blk_count, uint64_t byte_count, uint64_t lba) "%s %"PRIu32" blocks (%"PRIu64" bytes) from LBA %"PRIu64""
@@ -84,6 +86,7 @@ nvme_dev_mmio_doorbell_sq(uint16_t sqid, uint16_t new_tail) "cqid %"PRIu16" new_
 
 # nvme traces for error conditions
 nvme_dev_err_mdts(uint16_t cid, size_t mdts, size_t len) "cid %"PRIu16" mdts %"PRIu64" len %"PRIu64""
+nvme_dev_err_aio(uint16_t cid, void *aio, const char *blkname, uint64_t offset, const char *opc, void *req, uint16_t status) "cid %"PRIu16" aio %p blk \"%s\" offset %"PRIu64" opc \"%s\" req %p status 0x%"PRIx16""
 nvme_dev_err_invalid_dma(void) "PRP/SGL is too small for transfer size"
 nvme_dev_err_invalid_prplist_ent(uint64_t prplist) "PRP list entry is null or not page aligned: 0x%"PRIx64""
 nvme_dev_err_invalid_prp2_align(uint64_t prp2) "PRP2 is not page aligned: 0x%"PRIx64""
-- 
2.26.0
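
To make the benefit above concrete: every aio is tracked on
req->aio_tailq and nvme_aio_cb() only enqueues the completion once that
list is empty, so a future handler can fan a single command out into
several backend operations. The following is a hypothetical sketch, not
code from the series; the NvmeRange struct and the ranges/nr parameters
are invented for illustration, while NvmeAIO, nvme_req_add_aio and
nvme_submit_aio are the interfaces added by this patch:

    /* hypothetical range descriptor, for illustration only */
    typedef struct NvmeRange {
        uint64_t slba;
        uint32_t nlb;
    } NvmeRange;

    /* zero out several LBA ranges with one command by queuing one
     * write-zeroes aio per range; the CQE is posted only after the
     * last nvme_aio_cb() finds req->aio_tailq empty */
    static void nvme_zero_ranges_sketch(NvmeCtrl *n, NvmeRequest *req,
                                        NvmeRange *ranges, int nr)
    {
        NvmeNamespace *ns = req->ns;
        int i;

        for (i = 0; i < nr; i++) {
            NvmeAIO *aio = g_new0(NvmeAIO, 1);

            *aio = (NvmeAIO) {
                .blk = n->conf.blk,
                .offset = ranges[i].slba << nvme_ns_lbads(ns),
                .len = (size_t)ranges[i].nlb << nvme_ns_lbads(ns),
                .req = req,
            };

            nvme_req_add_aio(req, aio, NVME_AIO_OPC_WRITE_ZEROES);
            nvme_submit_aio(aio);
        }
    }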




* [PATCH v7 37/48] nvme: add nvme_check_rw helper
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (35 preceding siblings ...)
  2020-04-15  5:51 ` [PATCH v7 36/48] nvme: allow multiple aios per command Klaus Jensen
@ 2020-04-15  5:51 ` Klaus Jensen
  2020-04-15  5:51 ` [PATCH v7 38/48] nvme: use preallocated qsg/iov in nvme_dma_prp Klaus Jensen
                   ` (12 subsequent siblings)
  49 siblings, 0 replies; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:51 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
---
 hw/block/nvme.c | 26 ++++++++++++++++++++------
 1 file changed, 20 insertions(+), 6 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index c123be10fd0d..ffc49985321b 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -651,6 +651,25 @@ static inline uint16_t nvme_check_bounds(NvmeCtrl *n, uint64_t slba,
     return NVME_SUCCESS;
 }
 
+static uint16_t nvme_check_rw(NvmeCtrl *n, NvmeRequest *req)
+{
+    NvmeNamespace *ns = req->ns;
+    size_t len = req->nlb << nvme_ns_lbads(ns);
+    uint16_t status;
+
+    status = nvme_check_mdts(n, len, req);
+    if (status) {
+        return status;
+    }
+
+    status = nvme_check_bounds(n, req->slba, req->nlb, req);
+    if (status) {
+        return status;
+    }
+
+    return NVME_SUCCESS;
+}
+
 static void nvme_rw_cb(NvmeRequest *req, void *opaque)
 {
     NvmeSQueue *sq = req->sq;
@@ -810,12 +829,7 @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeRequest *req)
     trace_nvme_dev_rw(nvme_req_is_write(req) ? "write" : "read", req->nlb,
                       req->nlb << nvme_ns_lbads(req->ns), req->slba);
 
-    status = nvme_check_mdts(n, len, req);
-    if (status) {
-        goto invalid;
-    }
-
-    status = nvme_check_bounds(n, req->slba, req->nlb, req);
+    status = nvme_check_rw(n, req);
     if (status) {
         goto invalid;
     }
-- 
2.26.0




* [PATCH v7 38/48] nvme: use preallocated qsg/iov in nvme_dma_prp
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (36 preceding siblings ...)
  2020-04-15  5:51 ` [PATCH v7 37/48] nvme: add nvme_check_rw helper Klaus Jensen
@ 2020-04-15  5:51 ` Klaus Jensen
  2020-04-15  5:51 ` [PATCH v7 39/48] pci: pass along the return value of dma_memory_rw Klaus Jensen
                   ` (11 subsequent siblings)
  49 siblings, 0 replies; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:51 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

Since cleanup of the request qsg/iov has been moved to the common
nvme_enqueue_req_completion function, there is no need to use a
stack-allocated qsg/iov in nvme_dma_prp.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Acked-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 hw/block/nvme.c | 18 ++++++------------
 1 file changed, 6 insertions(+), 12 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index ffc49985321b..eb15a0bd3cf9 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -346,45 +346,39 @@ static uint16_t nvme_dma_prp(NvmeCtrl *n, uint8_t *ptr, uint32_t len,
                              uint64_t prp1, uint64_t prp2, DMADirection dir,
                              NvmeRequest *req)
 {
-    QEMUSGList qsg;
-    QEMUIOVector iov;
     uint16_t status = NVME_SUCCESS;
 
-    status = nvme_map_prp(n, &qsg, &iov, prp1, prp2, len, req);
+    status = nvme_map_prp(n, &req->qsg, &req->iov, prp1, prp2, len, req);
     if (status) {
         return status;
     }
 
-    if (qsg.nsg > 0) {
+    if (req->qsg.nsg > 0) {
         uint64_t residual;
 
         if (dir == DMA_DIRECTION_TO_DEVICE) {
-            residual = dma_buf_write(ptr, len, &qsg);
+            residual = dma_buf_write(ptr, len, &req->qsg);
         } else {
-            residual = dma_buf_read(ptr, len, &qsg);
+            residual = dma_buf_read(ptr, len, &req->qsg);
         }
 
         if (unlikely(residual)) {
             trace_nvme_dev_err_invalid_dma();
             status = NVME_INVALID_FIELD | NVME_DNR;
         }
-
-        qemu_sglist_destroy(&qsg);
     } else {
         size_t bytes;
 
         if (dir == DMA_DIRECTION_TO_DEVICE) {
-            bytes = qemu_iovec_to_buf(&iov, 0, ptr, len);
+            bytes = qemu_iovec_to_buf(&req->iov, 0, ptr, len);
         } else {
-            bytes = qemu_iovec_from_buf(&iov, 0, ptr, len);
+            bytes = qemu_iovec_from_buf(&req->iov, 0, ptr, len);
         }
 
         if (unlikely(bytes != len)) {
             trace_nvme_dev_err_invalid_dma();
             status = NVME_INVALID_FIELD | NVME_DNR;
         }
-
-        qemu_iovec_destroy(&iov);
     }
 
     return status;
-- 
2.26.0




* [PATCH v7 39/48] pci: pass along the return value of dma_memory_rw
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (37 preceding siblings ...)
  2020-04-15  5:51 ` [PATCH v7 38/48] nvme: use preallocated qsg/iov in nvme_dma_prp Klaus Jensen
@ 2020-04-15  5:51 ` Klaus Jensen
  2020-04-15  5:51 ` [PATCH v7 40/48] nvme: handle dma errors Klaus Jensen
                   ` (10 subsequent siblings)
  49 siblings, 0 replies; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:51 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Michael S . Tsirkin, Klaus Jensen,
	qemu-devel, Max Reitz, Klaus Jensen, Keith Busch,
	Javier Gonzalez, Maxim Levitsky, Philippe Mathieu-Daudé

From: Klaus Jensen <k.jensen@samsung.com>

The nvme device needs to know the return value of dma_memory_rw to pass
block/011 from blktests. So pass it along instead of ignoring it.

There are no existing users of the return value, so this patch should be
safe.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Keith Busch <kbusch@kernel.org>
---
 include/hw/pci/pci.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index cfedf5a995d7..da9057b8db97 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -784,8 +784,7 @@ static inline AddressSpace *pci_get_address_space(PCIDevice *dev)
 static inline int pci_dma_rw(PCIDevice *dev, dma_addr_t addr,
                              void *buf, dma_addr_t len, DMADirection dir)
 {
-    dma_memory_rw(pci_get_address_space(dev), addr, buf, len, dir);
-    return 0;
+    return dma_memory_rw(pci_get_address_space(dev), addr, buf, len, dir);
 }
 
 static inline int pci_dma_read(PCIDevice *dev, dma_addr_t addr,
-- 
2.26.0



^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v7 40/48] nvme: handle dma errors
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (38 preceding siblings ...)
  2020-04-15  5:51 ` [PATCH v7 39/48] pci: pass along the return value of dma_memory_rw Klaus Jensen
@ 2020-04-15  5:51 ` Klaus Jensen
  2020-04-15  7:26   ` Philippe Mathieu-Daudé
  2020-04-15  5:51 ` [PATCH v7 41/48] nvme: harden cmb access Klaus Jensen
                   ` (9 subsequent siblings)
  49 siblings, 1 reply; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:51 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

Handling DMA errors gracefully is required for the device to pass the
block/011 test ("disable PCI device while doing I/O") in the blktests
suite.

With this patch the device passes the test by retrying "critical"
transfers (posting of completion entries and processing of submission
queue entries).

If DMA errors occur at any other point in the execution of the command
(say, while mapping the PRPs), the command is aborted with a Data
Transfer Error status code.
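
In short, the two policies look roughly like this (simplified from the
hunks below):

    /* critical transfer (SQ entry fetch, CQ entry post): retry later */
    if (nvme_addr_read(n, addr, (void *)&cmd, sizeof(cmd))) {
        timer_mod(sq->timer,
                  qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + 500 * SCALE_MS);
        break;
    }

    /* per-command transfer (e.g. fetching a PRP list): fail the command */
    if (nvme_addr_read(n, prp2, (void *)prp_list, prp_trans)) {
        status = NVME_DATA_TRANSFER_ERROR;
        goto unmap;
    }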

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Acked-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 hw/block/nvme.c       | 45 ++++++++++++++++++++++++++++++++-----------
 hw/block/trace-events |  2 ++
 include/block/nvme.h  |  2 +-
 3 files changed, 37 insertions(+), 12 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index eb15a0bd3cf9..6dcd9c4b4cd0 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -74,14 +74,14 @@ static inline bool nvme_addr_is_cmb(NvmeCtrl *n, hwaddr addr)
     return addr >= low && addr < hi;
 }
 
-static void nvme_addr_read(NvmeCtrl *n, hwaddr addr, void *buf, int size)
+static int nvme_addr_read(NvmeCtrl *n, hwaddr addr, void *buf, int size)
 {
     if (n->bar.cmbsz && nvme_addr_is_cmb(n, addr)) {
         memcpy(buf, nvme_addr_to_cmb(n, addr), size);
-        return;
+        return 0;
     }
 
-    pci_dma_read(&n->parent_obj, addr, buf, size);
+    return pci_dma_read(&n->parent_obj, addr, buf, size);
 }
 
 static int nvme_check_sqid(NvmeCtrl *n, uint16_t sqid)
@@ -185,7 +185,7 @@ static uint16_t nvme_map_addr_cmb(NvmeCtrl *n, QEMUIOVector *iov, hwaddr addr,
     }
 
     if (!nvme_addr_is_cmb(n, addr) || !nvme_addr_is_cmb(n, addr + len - 1)) {
-        return NVME_DATA_TRAS_ERROR;
+        return NVME_DATA_TRANSFER_ERROR;
     }
 
     qemu_iovec_add(iov, nvme_addr_to_cmb(n, addr), len);
@@ -238,6 +238,7 @@ static uint16_t nvme_map_prp(NvmeCtrl *n, QEMUSGList *qsg, QEMUIOVector *iov,
     int num_prps = (len >> n->page_bits) + 1;
     uint16_t status;
     bool prp_list_in_cmb = false;
+    int ret;
 
     trace_nvme_dev_map_prp(nvme_cid(req), trans_len, len, prp1, prp2,
                            num_prps);
@@ -277,7 +278,12 @@ static uint16_t nvme_map_prp(NvmeCtrl *n, QEMUSGList *qsg, QEMUIOVector *iov,
 
             nents = (len + n->page_size - 1) >> n->page_bits;
             prp_trans = MIN(n->max_prp_ents, nents) * sizeof(uint64_t);
-            nvme_addr_read(n, prp2, (void *)prp_list, prp_trans);
+            ret = nvme_addr_read(n, prp2, (void *)prp_list, prp_trans);
+            if (ret) {
+                trace_nvme_dev_err_addr_read(prp2);
+                status = NVME_DATA_TRANSFER_ERROR;
+                goto unmap;
+            }
             while (len != 0) {
                 uint64_t prp_ent = le64_to_cpu(prp_list[i]);
 
@@ -296,8 +302,13 @@ static uint16_t nvme_map_prp(NvmeCtrl *n, QEMUSGList *qsg, QEMUIOVector *iov,
                     i = 0;
                     nents = (len + n->page_size - 1) >> n->page_bits;
                     prp_trans = MIN(n->max_prp_ents, nents) * sizeof(uint64_t);
-                    nvme_addr_read(n, prp_ent, (void *)prp_list,
-                        prp_trans);
+                    ret = nvme_addr_read(n, prp_ent, (void *)prp_list,
+                                         prp_trans);
+                    if (ret) {
+                        trace_nvme_dev_err_addr_read(prp_ent);
+                        status = NVME_DATA_TRANSFER_ERROR;
+                        goto unmap;
+                    }
                     prp_ent = le64_to_cpu(prp_list[i]);
                 }
 
@@ -502,6 +513,7 @@ static void nvme_post_cqes(void *opaque)
     NvmeCQueue *cq = opaque;
     NvmeCtrl *n = cq->ctrl;
     NvmeRequest *req, *next;
+    int ret;
 
     QTAILQ_FOREACH_SAFE(req, &cq->req_list, entry, next) {
         NvmeSQueue *sq;
@@ -511,15 +523,21 @@ static void nvme_post_cqes(void *opaque)
             break;
         }
 
-        QTAILQ_REMOVE(&cq->req_list, req, entry);
         sq = req->sq;
         req->cqe.status = cpu_to_le16((req->status << 1) | cq->phase);
         req->cqe.sq_id = cpu_to_le16(sq->sqid);
         req->cqe.sq_head = cpu_to_le16(sq->head);
         addr = cq->dma_addr + cq->tail * n->cqe_size;
+        ret = pci_dma_write(&n->parent_obj, addr, (void *)&req->cqe,
+                            sizeof(req->cqe));
+        if (ret) {
+            trace_nvme_dev_err_addr_write(addr);
+            timer_mod(cq->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) +
+                      500 * SCALE_MS);
+            break;
+        }
+        QTAILQ_REMOVE(&cq->req_list, req, entry);
         nvme_inc_cq_tail(cq);
-        pci_dma_write(&n->parent_obj, addr, (void *)&req->cqe,
-            sizeof(req->cqe));
         nvme_req_clear(req);
         QTAILQ_INSERT_TAIL(&sq->req_list, req, entry);
     }
@@ -1664,7 +1682,12 @@ static void nvme_process_sq(void *opaque)
 
     while (!(nvme_sq_empty(sq) || QTAILQ_EMPTY(&sq->req_list))) {
         addr = sq->dma_addr + sq->head * n->sqe_size;
-        nvme_addr_read(n, addr, (void *)&cmd, sizeof(cmd));
+        if (nvme_addr_read(n, addr, (void *)&cmd, sizeof(cmd))) {
+            trace_nvme_dev_err_addr_read(addr);
+            timer_mod(sq->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) +
+                      500 * SCALE_MS);
+            break;
+        }
         nvme_inc_sq_head(sq);
 
         req = QTAILQ_FIRST(&sq->req_list);
diff --git a/hw/block/trace-events b/hw/block/trace-events
index 7c277a2999c0..75bde5e676a5 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -87,6 +87,8 @@ nvme_dev_mmio_doorbell_sq(uint16_t sqid, uint16_t new_tail) "cqid %"PRIu16" new_
 # nvme traces for error conditions
 nvme_dev_err_mdts(uint16_t cid, size_t mdts, size_t len) "cid %"PRIu16" mdts %"PRIu64" len %"PRIu64""
 nvme_dev_err_aio(uint16_t cid, void *aio, const char *blkname, uint64_t offset, const char *opc, void *req, uint16_t status) "cid %"PRIu16" aio %p blk \"%s\" offset %"PRIu64" opc \"%s\" req %p status 0x%"PRIx16""
+nvme_dev_err_addr_read(uint64_t addr) "addr 0x%"PRIx64""
+nvme_dev_err_addr_write(uint64_t addr) "addr 0x%"PRIx64""
 nvme_dev_err_invalid_dma(void) "PRP/SGL is too small for transfer size"
 nvme_dev_err_invalid_prplist_ent(uint64_t prplist) "PRP list entry is null or not page aligned: 0x%"PRIx64""
 nvme_dev_err_invalid_prp2_align(uint64_t prp2) "PRP2 is not page aligned: 0x%"PRIx64""
diff --git a/include/block/nvme.h b/include/block/nvme.h
index c4c669e32fc4..03bee32c27c4 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -457,7 +457,7 @@ enum NvmeStatusCodes {
     NVME_INVALID_OPCODE         = 0x0001,
     NVME_INVALID_FIELD          = 0x0002,
     NVME_CID_CONFLICT           = 0x0003,
-    NVME_DATA_TRAS_ERROR        = 0x0004,
+    NVME_DATA_TRANSFER_ERROR    = 0x0004,
     NVME_POWER_LOSS_ABORT       = 0x0005,
     NVME_INTERNAL_DEV_ERROR     = 0x0006,
     NVME_CMD_ABORT_REQ          = 0x0007,
-- 
2.26.0



^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v7 41/48] nvme: harden cmb access
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (39 preceding siblings ...)
  2020-04-15  5:51 ` [PATCH v7 40/48] nvme: handle dma errors Klaus Jensen
@ 2020-04-15  5:51 ` Klaus Jensen
  2020-04-15  5:51 ` [PATCH v7 42/48] nvme: add support for scatter gather lists Klaus Jensen
                   ` (8 subsequent siblings)
  49 siblings, 0 replies; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:51 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

Since the controller has so far only supported PRPs, it has not been
necessary to validate the ending address (addr + len - 1) of a CMB access;
it was guaranteed to be within the CMB.

This changes when the controller adds support for SGLs (next patch), so
add that check.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
---
 hw/block/nvme.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 6dcd9c4b4cd0..5140bc32913d 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -76,7 +76,12 @@ static inline bool nvme_addr_is_cmb(NvmeCtrl *n, hwaddr addr)
 
 static int nvme_addr_read(NvmeCtrl *n, hwaddr addr, void *buf, int size)
 {
-    if (n->bar.cmbsz && nvme_addr_is_cmb(n, addr)) {
+    hwaddr hi = addr + size - 1;
+    if (hi < addr) {
+        return 1;
+    }
+
+    if (n->bar.cmbsz && nvme_addr_is_cmb(n, addr) && nvme_addr_is_cmb(n, hi)) {
         memcpy(buf, nvme_addr_to_cmb(n, addr), size);
         return 0;
     }
-- 
2.26.0



^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v7 42/48] nvme: add support for scatter gather lists
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (40 preceding siblings ...)
  2020-04-15  5:51 ` [PATCH v7 41/48] nvme: harden cmb access Klaus Jensen
@ 2020-04-15  5:51 ` Klaus Jensen
  2020-04-15  5:51 ` [PATCH v7 43/48] nvme: add support for sgl bit bucket descriptor Klaus Jensen
                   ` (7 subsequent siblings)
  49 siblings, 0 replies; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:51 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

For now, support the Data Block, Segment and Last Segment descriptor
types.

See NVM Express 1.3d, Section 4.4 ("Scatter Gather List (SGL)").
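
For reference, each SGL descriptor is a 16-byte structure; the layout
assumed by the mapping code below is, as defined elsewhere in this series
(shown here only as a sketch):

    typedef struct NvmeSglDescriptor {
        uint64_t addr;    /* data block or next segment address */
        uint32_t len;     /* length in bytes */
        uint8_t  rsvd[3];
        uint8_t  type;    /* descriptor type in the upper nibble */
    } NvmeSglDescriptor;

    #define NVME_SGL_TYPE(type)  ((type >> 4) & 0xf)

A Segment descriptor points to an array of further descriptors, a Last
Segment descriptor points to the final such array, and Data Block
descriptors describe the actual data buffers.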

Signed-off-by: Klaus Jensen <klaus.jensen@cnexlabs.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Acked-by: Keith Busch <kbusch@kernel.org>
---
 hw/block/nvme.c       | 332 ++++++++++++++++++++++++++++++++++--------
 hw/block/trace-events |   4 +
 2 files changed, 278 insertions(+), 58 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 5140bc32913d..a19085e605e7 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -358,13 +358,263 @@ unmap:
     return status;
 }
 
-static uint16_t nvme_dma_prp(NvmeCtrl *n, uint8_t *ptr, uint32_t len,
-                             uint64_t prp1, uint64_t prp2, DMADirection dir,
+/*
+ * Map 'nsgld' data block descriptors from 'segment'. The number of bytes
+ * mapped is subtracted from *len.
+ */
+static uint16_t nvme_map_sgl_data(NvmeCtrl *n, QEMUSGList *qsg,
+                                  QEMUIOVector *iov,
+                                  NvmeSglDescriptor *segment, uint64_t nsgld,
+                                  size_t *len, NvmeRequest *req)
+{
+    dma_addr_t addr, trans_len;
+    uint32_t dlen;
+    uint16_t status;
+
+    for (int i = 0; i < nsgld; i++) {
+        uint8_t type = NVME_SGL_TYPE(segment[i].type);
+
+        switch (type) {
+        case NVME_SGL_DESCR_TYPE_DATA_BLOCK:
+            break;
+        case NVME_SGL_DESCR_TYPE_SEGMENT:
+        case NVME_SGL_DESCR_TYPE_LAST_SEGMENT:
+            return NVME_INVALID_NUM_SGL_DESCRS | NVME_DNR;
+        default:
+            return NVME_SGL_DESCR_TYPE_INVALID | NVME_DNR;
+        }
+
+        dlen = le32_to_cpu(segment[i].len);
+        if (!dlen) {
+            continue;
+        }
+
+        if (*len == 0) {
+            /*
+             * All data has been mapped, but the SGL contains additional
+             * segments and/or descriptors. The controller may ignore the
+             * rest of the SGL if it advertises support for excess length.
+             */
+            uint16_t sgls = le16_to_cpu(n->id_ctrl.sgls);
+            if (sgls & NVME_CTRL_SGLS_EXCESS_LENGTH) {
+                break;
+            }
+
+            trace_nvme_dev_err_invalid_sgl_excess_length(nvme_cid(req));
+            return NVME_DATA_SGL_LEN_INVALID | NVME_DNR;
+        }
+
+        trans_len = MIN(*len, dlen);
+        addr = le64_to_cpu(segment[i].addr);
+
+        if (UINT64_MAX - addr < dlen) {
+            return NVME_DATA_SGL_LEN_INVALID | NVME_DNR;
+        }
+
+        status = nvme_map_addr(n, qsg, iov, addr, trans_len);
+        if (status) {
+            return status;
+        }
+
+        *len -= trans_len;
+    }
+
+    return NVME_SUCCESS;
+}
+
+static uint16_t nvme_map_sgl(NvmeCtrl *n, QEMUSGList *qsg, QEMUIOVector *iov,
+                             NvmeSglDescriptor sgl, size_t len,
                              NvmeRequest *req)
+{
+    /*
+     * Read the segment in chunks of 256 descriptors (one 4k page) to avoid
+     * dynamically allocating a potentially huge SGL. The spec allows the SGL
+     * to be larger (as in number of bytes required to describe the SGL
+     * descriptors and segment chain) than the command transfer size, so it is
+     * not bounded by MDTS.
+     */
+    const int SEG_CHUNK_SIZE = 256;
+
+    NvmeSglDescriptor segment[SEG_CHUNK_SIZE], *sgld, *last_sgld;
+    uint64_t nsgld;
+    uint32_t seg_len;
+    uint16_t status;
+    bool sgl_in_cmb = false;
+    hwaddr addr;
+    int ret;
+
+    sgld = &sgl;
+    addr = le64_to_cpu(sgl.addr);
+
+    trace_nvme_dev_map_sgl(nvme_cid(req), NVME_SGL_TYPE(sgl.type), req->nlb,
+                           len);
+
+    /*
+     * If the entire transfer can be described with a single data block it can
+     * be mapped directly.
+     */
+    if (NVME_SGL_TYPE(sgl.type) == NVME_SGL_DESCR_TYPE_DATA_BLOCK) {
+        status = nvme_map_sgl_data(n, qsg, iov, sgld, 1, &len, req);
+        if (status) {
+            goto unmap;
+        }
+
+        goto out;
+    }
+
+    /*
+     * If the segment is located in the CMB, the submission queue of the
+     * request must also reside there.
+     */
+    if (nvme_addr_is_cmb(n, addr)) {
+        if (!nvme_addr_is_cmb(n, req->sq->dma_addr)) {
+            return NVME_INVALID_USE_OF_CMB | NVME_DNR;
+        }
+
+        sgl_in_cmb = true;
+    }
+
+    for (;;) {
+        switch (NVME_SGL_TYPE(sgld->type)) {
+        case NVME_SGL_DESCR_TYPE_SEGMENT:
+        case NVME_SGL_DESCR_TYPE_LAST_SEGMENT:
+            break;
+        default:
+            return NVME_INVALID_SGL_SEG_DESCR | NVME_DNR;
+        }
+
+        seg_len = le32_to_cpu(sgld->len);
+
+        /* check the length of the (Last) Segment descriptor */
+        if (!seg_len || seg_len & 0xf) {
+            return NVME_INVALID_SGL_SEG_DESCR | NVME_DNR;
+        }
+
+        if (UINT64_MAX - addr < seg_len) {
+            return NVME_DATA_SGL_LEN_INVALID | NVME_DNR;
+        }
+
+        nsgld = seg_len / sizeof(NvmeSglDescriptor);
+
+        while (nsgld > SEG_CHUNK_SIZE) {
+            if (nvme_addr_read(n, addr, segment, sizeof(segment))) {
+                trace_nvme_dev_err_addr_read(addr);
+                status = NVME_DATA_TRANSFER_ERROR;
+                goto unmap;
+            }
+
+            status = nvme_map_sgl_data(n, qsg, iov, segment, SEG_CHUNK_SIZE,
+                                       &len, req);
+            if (status) {
+                goto unmap;
+            }
+
+            nsgld -= SEG_CHUNK_SIZE;
+            addr += SEG_CHUNK_SIZE * sizeof(NvmeSglDescriptor);
+        }
+
+        ret = nvme_addr_read(n, addr, segment, nsgld *
+                             sizeof(NvmeSglDescriptor));
+        if (ret) {
+            trace_nvme_dev_err_addr_read(addr);
+            status = NVME_DATA_TRANSFER_ERROR;
+            goto unmap;
+        }
+
+        last_sgld = &segment[nsgld - 1];
+
+        /* if the segment ends with a Data Block, then we are done */
+        if (NVME_SGL_TYPE(last_sgld->type) == NVME_SGL_DESCR_TYPE_DATA_BLOCK) {
+            status = nvme_map_sgl_data(n, qsg, iov, segment, nsgld, &len, req);
+            if (status) {
+                goto unmap;
+            }
+
+            goto out;
+        }
+
+        /*
+         * If the last descriptor was not a Data Block, then the current
+         * segment must not be a Last Segment.
+         */
+        if (NVME_SGL_TYPE(sgld->type) == NVME_SGL_DESCR_TYPE_LAST_SEGMENT) {
+            status = NVME_INVALID_SGL_SEG_DESCR | NVME_DNR;
+            goto unmap;
+        }
+
+        sgld = last_sgld;
+        addr = le64_to_cpu(sgld->addr);
+
+        /*
+         * Do not map the last descriptor; it will be a Segment or Last Segment
+         * descriptor and is handled by the next iteration.
+         */
+        status = nvme_map_sgl_data(n, qsg, iov, segment, nsgld - 1, &len, req);
+        if (status) {
+            goto unmap;
+        }
+
+        /*
+         * If the next segment is in the CMB, make sure that the sgl was
+         * already located there.
+         */
+        if (sgl_in_cmb != nvme_addr_is_cmb(n, addr)) {
+            status = NVME_INVALID_USE_OF_CMB | NVME_DNR;
+            goto unmap;
+        }
+    }
+
+out:
+    /* if there is any residual left in len, the SGL was too short */
+    if (len) {
+        status = NVME_DATA_SGL_LEN_INVALID | NVME_DNR;
+        goto unmap;
+    }
+
+    return NVME_SUCCESS;
+
+unmap:
+    if (iov->iov) {
+        qemu_iovec_destroy(iov);
+    }
+
+    if (qsg->sg) {
+        qemu_sglist_destroy(qsg);
+    }
+
+    return status;
+}
+
+static uint16_t nvme_map(NvmeCtrl *n, QEMUSGList *qsg, QEMUIOVector *iov,
+                         size_t len, NvmeRequest *req)
+{
+    uint64_t prp1, prp2;
+
+    switch (NVME_CMD_FLAGS_PSDT(req->cmd.flags)) {
+    case PSDT_PRP:
+        prp1 = le64_to_cpu(req->cmd.dptr.prp1);
+        prp2 = le64_to_cpu(req->cmd.dptr.prp2);
+
+        return nvme_map_prp(n, qsg, iov, prp1, prp2, len, req);
+    case PSDT_SGL_MPTR_CONTIGUOUS:
+    case PSDT_SGL_MPTR_SGL:
+        /* SGLs shall not be used for Admin commands in NVMe over PCIe */
+        if (!req->sq->sqid) {
+            return NVME_INVALID_FIELD | NVME_DNR;
+        }
+
+        return nvme_map_sgl(n, qsg, iov, req->cmd.dptr.sgl, len, req);
+    default:
+        return NVME_INVALID_FIELD;
+    }
+}
+
+static uint16_t nvme_dma(NvmeCtrl *n, uint8_t *ptr, uint32_t len,
+                         DMADirection dir, NvmeRequest *req)
 {
     uint16_t status = NVME_SUCCESS;
 
-    status = nvme_map_prp(n, &req->qsg, &req->iov, prp1, prp2, len, req);
+    status = nvme_map(n, &req->qsg, &req->iov, len, req);
     if (status) {
         return status;
     }
@@ -400,16 +650,6 @@ static uint16_t nvme_dma_prp(NvmeCtrl *n, uint8_t *ptr, uint32_t len,
     return status;
 }
 
-static uint16_t nvme_map(NvmeCtrl *n, QEMUSGList *qsg, QEMUIOVector *iov,
-                         size_t len, NvmeRequest *req)
-{
-    NvmeCmd *cmd = &req->cmd;
-    uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
-    uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
-
-    return nvme_map_prp(n, qsg, iov, prp1, prp2, len, req);
-}
-
 static void nvme_aio_destroy(NvmeAIO *aio)
 {
     g_free(aio);
@@ -1016,10 +1256,7 @@ static uint16_t nvme_create_sq(NvmeCtrl *n, NvmeRequest *req)
 static uint16_t nvme_smart_info(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
                                 uint64_t off, NvmeRequest *req)
 {
-    NvmeCmd *cmd = &req->cmd;
-    uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
-    uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
-    uint32_t nsid = le32_to_cpu(cmd->nsid);
+    uint32_t nsid = le32_to_cpu(req->cmd.nsid);
 
     uint32_t trans_len;
     time_t current_ms;
@@ -1068,17 +1305,14 @@ static uint16_t nvme_smart_info(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
         nvme_clear_events(n, NVME_AER_TYPE_SMART);
     }
 
-    return nvme_dma_prp(n, (uint8_t *) &smart + off, trans_len, prp1, prp2,
-                        DMA_DIRECTION_FROM_DEVICE, req);
+    return nvme_dma(n, (uint8_t *) &smart + off, trans_len,
+                    DMA_DIRECTION_FROM_DEVICE, req);
 }
 
 static uint16_t nvme_fw_log_info(NvmeCtrl *n, uint32_t buf_len, uint64_t off,
                                  NvmeRequest *req)
 {
     uint32_t trans_len;
-    NvmeCmd *cmd = &req->cmd;
-    uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
-    uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
     NvmeFwSlotInfoLog fw_log;
 
     if (off > sizeof(fw_log)) {
@@ -1089,17 +1323,14 @@ static uint16_t nvme_fw_log_info(NvmeCtrl *n, uint32_t buf_len, uint64_t off,
 
     trans_len = MIN(sizeof(fw_log) - off, buf_len);
 
-    return nvme_dma_prp(n, (uint8_t *) &fw_log + off, trans_len, prp1, prp2,
-                        DMA_DIRECTION_FROM_DEVICE, req);
+    return nvme_dma(n, (uint8_t *) &fw_log + off, trans_len,
+                    DMA_DIRECTION_FROM_DEVICE, req);
 }
 
 static uint16_t nvme_error_info(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
                                 uint64_t off, NvmeRequest *req)
 {
     uint32_t trans_len;
-    NvmeCmd *cmd = &req->cmd;
-    uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
-    uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
     NvmeErrorLog errlog;
 
     if (!rae) {
@@ -1114,8 +1345,8 @@ static uint16_t nvme_error_info(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
 
     trans_len = MIN(sizeof(errlog) - off, buf_len);
 
-    return nvme_dma_prp(n, (uint8_t *)&errlog, trans_len, prp1, prp2,
-                        DMA_DIRECTION_FROM_DEVICE, req);
+    return nvme_dma(n, (uint8_t *)&errlog, trans_len,
+                    DMA_DIRECTION_FROM_DEVICE, req);
 }
 
 static uint16_t nvme_get_log(NvmeCtrl *n, NvmeRequest *req)
@@ -1266,14 +1497,10 @@ static uint16_t nvme_create_cq(NvmeCtrl *n, NvmeRequest *req)
 
 static uint16_t nvme_identify_ctrl(NvmeCtrl *n, NvmeRequest *req)
 {
-    NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
-    uint64_t prp1 = le64_to_cpu(c->prp1);
-    uint64_t prp2 = le64_to_cpu(c->prp2);
-
     trace_nvme_dev_identify_ctrl();
 
-    return nvme_dma_prp(n, (uint8_t *)&n->id_ctrl, sizeof(n->id_ctrl), prp1,
-                        prp2, DMA_DIRECTION_FROM_DEVICE, req);
+    return nvme_dma(n, (uint8_t *)&n->id_ctrl, sizeof(n->id_ctrl),
+                    DMA_DIRECTION_FROM_DEVICE, req);
 }
 
 static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req)
@@ -1281,8 +1508,6 @@ static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req)
     NvmeNamespace *ns;
     NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
     uint32_t nsid = le32_to_cpu(c->nsid);
-    uint64_t prp1 = le64_to_cpu(c->prp1);
-    uint64_t prp2 = le64_to_cpu(c->prp2);
 
     trace_nvme_dev_identify_ns(nsid);
 
@@ -1293,8 +1518,8 @@ static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req)
 
     ns = &n->namespaces[nsid - 1];
 
-    return nvme_dma_prp(n, (uint8_t *)&ns->id_ns, sizeof(ns->id_ns), prp1,
-                        prp2, DMA_DIRECTION_FROM_DEVICE, req);
+    return nvme_dma(n, (uint8_t *)&ns->id_ns, sizeof(ns->id_ns),
+                    DMA_DIRECTION_FROM_DEVICE, req);
 }
 
 static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req)
@@ -1302,8 +1527,6 @@ static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req)
     NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
     static const int data_len = NVME_IDENTIFY_DATA_SIZE;
     uint32_t min_nsid = le32_to_cpu(c->nsid);
-    uint64_t prp1 = le64_to_cpu(c->prp1);
-    uint64_t prp2 = le64_to_cpu(c->prp2);
     uint32_t *list;
     uint16_t ret;
     int i, j = 0;
@@ -1320,8 +1543,8 @@ static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req)
             break;
         }
     }
-    ret = nvme_dma_prp(n, (uint8_t *)list, data_len, prp1, prp2,
-                       DMA_DIRECTION_FROM_DEVICE, req);
+    ret = nvme_dma(n, (uint8_t *)list, data_len, DMA_DIRECTION_FROM_DEVICE,
+                   req);
     g_free(list);
     return ret;
 }
@@ -1330,8 +1553,6 @@ static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req)
 {
     NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
     uint32_t nsid = le32_to_cpu(c->nsid);
-    uint64_t prp1 = le64_to_cpu(c->prp1);
-    uint64_t prp2 = le64_to_cpu(c->prp2);
 
     uint8_t list[NVME_IDENTIFY_DATA_SIZE];
 
@@ -1361,8 +1582,8 @@ static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req)
     ns_descrs->uuid.hdr.nidl = NVME_NIDT_UUID_LEN;
     stl_be_p(&ns_descrs->uuid.v, nsid);
 
-    return nvme_dma_prp(n, list, NVME_IDENTIFY_DATA_SIZE, prp1, prp2,
-                        DMA_DIRECTION_FROM_DEVICE, req);
+    return nvme_dma(n, list, NVME_IDENTIFY_DATA_SIZE,
+                    DMA_DIRECTION_FROM_DEVICE, req);
 }
 
 static uint16_t nvme_identify(NvmeCtrl *n, NvmeRequest *req)
@@ -1438,14 +1659,10 @@ static inline uint64_t nvme_get_timestamp(const NvmeCtrl *n)
 
 static uint16_t nvme_get_feature_timestamp(NvmeCtrl *n, NvmeRequest *req)
 {
-    NvmeCmd *cmd = &req->cmd;
-    uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
-    uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
-
     uint64_t timestamp = nvme_get_timestamp(n);
 
-    return nvme_dma_prp(n, (uint8_t *)&timestamp, sizeof(timestamp), prp1,
-                        prp2, DMA_DIRECTION_FROM_DEVICE, req);
+    return nvme_dma(n, (uint8_t *)&timestamp, sizeof(timestamp),
+                    DMA_DIRECTION_FROM_DEVICE, req);
 }
 
 static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeRequest *req)
@@ -1528,12 +1745,9 @@ static uint16_t nvme_set_feature_timestamp(NvmeCtrl *n, NvmeRequest *req)
 {
     uint16_t ret;
     uint64_t timestamp;
-    NvmeCmd *cmd = &req->cmd;
-    uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
-    uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
 
-    ret = nvme_dma_prp(n, (uint8_t *)&timestamp, sizeof(timestamp), prp1,
-                       prp2, DMA_DIRECTION_TO_DEVICE, req);
+    ret = nvme_dma(n, (uint8_t *)&timestamp, sizeof(timestamp),
+                   DMA_DIRECTION_TO_DEVICE, req);
     if (ret != NVME_SUCCESS) {
         return ret;
     }
@@ -2323,6 +2537,8 @@ static void nvme_init_ctrl(NvmeCtrl *n)
     id->nn = cpu_to_le32(n->num_namespaces);
     id->oncs = cpu_to_le16(NVME_ONCS_WRITE_ZEROES | NVME_ONCS_TIMESTAMP);
 
+    id->sgls = cpu_to_le32(NVME_CTRL_SGLS_SUPPORTED_NO_ALIGNMENT);
+
     pstrcpy((char *) id->subnqn, sizeof(id->subnqn), "nqn.2019-08.org.qemu:");
     pstrcat((char *) id->subnqn, sizeof(id->subnqn), n->params.serial);
 
diff --git a/hw/block/trace-events b/hw/block/trace-events
index 75bde5e676a5..accbb04fe396 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -34,6 +34,7 @@ nvme_dev_irq_pin(void) "pulsing IRQ pin"
 nvme_dev_irq_masked(void) "IRQ is masked"
 nvme_dev_dma_read(uint64_t prp1, uint64_t prp2) "DMA read, prp1=0x%"PRIx64" prp2=0x%"PRIx64""
 nvme_dev_map_prp(uint16_t cid, uint64_t trans_len, uint32_t len, uint64_t prp1, uint64_t prp2, int num_prps) "cid %"PRIu16" trans_len %"PRIu64" len %"PRIu32" prp1 0x%"PRIx64" prp2 0x%"PRIx64" num_prps %d"
+nvme_dev_map_sgl(uint16_t cid, uint8_t typ, uint32_t nlb, uint64_t len) "cid %"PRIu16" type 0x%"PRIx8" nlb %"PRIu32" len %"PRIu64""
 nvme_dev_req_add_aio(uint16_t cid, void *aio, const char *blkname, uint64_t offset, uint64_t count, const char *opc, void *req) "cid %"PRIu16" aio %p blk \"%s\" offset %"PRIu64" count %"PRIu64" opc \"%s\" req %p"
 nvme_dev_aio_cb(uint16_t cid, void *aio, const char *blkname, uint64_t offset, const char *opc, void *req) "cid %"PRIu16" aio %p blk \"%s\" offset %"PRIu64" opc \"%s\" req %p"
 nvme_dev_io_cmd(uint16_t cid, uint32_t nsid, uint16_t sqid, uint8_t opcode) "cid %"PRIu16" nsid %"PRIu32" sqid %"PRIu16" opc 0x%"PRIx8""
@@ -89,6 +90,9 @@ nvme_dev_err_mdts(uint16_t cid, size_t mdts, size_t len) "cid %"PRIu16" mdts %"P
 nvme_dev_err_aio(uint16_t cid, void *aio, const char *blkname, uint64_t offset, const char *opc, void *req, uint16_t status) "cid %"PRIu16" aio %p blk \"%s\" offset %"PRIu64" opc \"%s\" req %p status 0x%"PRIx16""
 nvme_dev_err_addr_read(uint64_t addr) "addr 0x%"PRIx64""
 nvme_dev_err_addr_write(uint64_t addr) "addr 0x%"PRIx64""
+nvme_dev_err_invalid_sgld(uint16_t cid, uint8_t typ) "cid %"PRIu16" type 0x%"PRIx8""
+nvme_dev_err_invalid_num_sgld(uint16_t cid, uint8_t typ) "cid %"PRIu16" type 0x%"PRIx8""
+nvme_dev_err_invalid_sgl_excess_length(uint16_t cid) "cid %"PRIu16""
 nvme_dev_err_invalid_dma(void) "PRP/SGL is too small for transfer size"
 nvme_dev_err_invalid_prplist_ent(uint64_t prplist) "PRP list entry is null or not page aligned: 0x%"PRIx64""
 nvme_dev_err_invalid_prp2_align(uint64_t prp2) "PRP2 is not page aligned: 0x%"PRIx64""
-- 
2.26.0



^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v7 43/48] nvme: add support for sgl bit bucket descriptor
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (41 preceding siblings ...)
  2020-04-15  5:51 ` [PATCH v7 42/48] nvme: add support for scatter gather lists Klaus Jensen
@ 2020-04-15  5:51 ` Klaus Jensen
  2020-04-15  5:51 ` [PATCH v7 44/48] nvme: refactor identify active namespace id list Klaus Jensen
                   ` (6 subsequent siblings)
  49 siblings, 0 replies; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:51 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, Gollu Appalanaidu,
	qemu-devel, Max Reitz, Klaus Jensen, Keith Busch,
	Javier Gonzalez, Maxim Levitsky

From: Gollu Appalanaidu <anaidu.gollu@samsung.com>

This adds support for SGL descriptor type 0x1 (bit bucket descriptor).
See the NVM Express v1.3d specification, Section 4.4 ("Scatter Gather
List (SGL)").
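
An illustrative (host-side) use of the descriptor, assuming a read where
only the head and tail of the data are wanted:

    /*
     * Read 12288 bytes from the device but keep only the first and last
     * 4096 bytes; the middle 4096 bytes are discarded by the controller.
     *
     *   { addr = buf0, len = 4096, type = Data Block }
     *   { addr = 0,    len = 4096, type = Bit Bucket }
     *   { addr = buf1, len = 4096, type = Data Block }
     */

For host-to-controller transfers (writes) a bit bucket descriptor carries
no meaning and is simply skipped.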

Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
---
 hw/block/nvme.c | 33 +++++++++++++++++++++++++++------
 1 file changed, 27 insertions(+), 6 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index a19085e605e7..f295f027b8e2 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -375,6 +375,10 @@ static uint16_t nvme_map_sgl_data(NvmeCtrl *n, QEMUSGList *qsg,
         uint8_t type = NVME_SGL_TYPE(segment[i].type);
 
         switch (type) {
+        case NVME_SGL_DESCR_TYPE_BIT_BUCKET:
+            if (nvme_req_is_write(req)) {
+                continue;
+            }
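+            /* fall through */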
         case NVME_SGL_DESCR_TYPE_DATA_BLOCK:
             break;
         case NVME_SGL_DESCR_TYPE_SEGMENT:
@@ -385,6 +389,7 @@ static uint16_t nvme_map_sgl_data(NvmeCtrl *n, QEMUSGList *qsg,
         }
 
         dlen = le32_to_cpu(segment[i].len);
+
         if (!dlen) {
             continue;
         }
@@ -405,6 +410,11 @@ static uint16_t nvme_map_sgl_data(NvmeCtrl *n, QEMUSGList *qsg,
         }
 
         trans_len = MIN(*len, dlen);
+
+        if (type == NVME_SGL_DESCR_TYPE_BIT_BUCKET) {
+            goto next;
+        }
+
         addr = le64_to_cpu(segment[i].addr);
 
         if (UINT64_MAX - addr < dlen) {
@@ -416,6 +426,7 @@ static uint16_t nvme_map_sgl_data(NvmeCtrl *n, QEMUSGList *qsg,
             return status;
         }
 
+next:
         *len -= trans_len;
     }
 
@@ -486,7 +497,8 @@ static uint16_t nvme_map_sgl(NvmeCtrl *n, QEMUSGList *qsg, QEMUIOVector *iov,
         seg_len = le32_to_cpu(sgld->len);
 
         /* check the length of the (Last) Segment descriptor */
-        if (!seg_len || seg_len & 0xf) {
+        if ((!seg_len || seg_len & 0xf) &&
+            (NVME_SGL_TYPE(sgld->type) != NVME_SGL_DESCR_TYPE_BIT_BUCKET)) {
             return NVME_INVALID_SGL_SEG_DESCR | NVME_DNR;
         }
 
@@ -523,19 +535,27 @@ static uint16_t nvme_map_sgl(NvmeCtrl *n, QEMUSGList *qsg, QEMUIOVector *iov,
 
         last_sgld = &segment[nsgld - 1];
 
-        /* if the segment ends with a Data Block, then we are done */
-        if (NVME_SGL_TYPE(last_sgld->type) == NVME_SGL_DESCR_TYPE_DATA_BLOCK) {
+        /*
+         * If the segment ends with a Data Block or Bit Bucket Descriptor Type,
+         * then we are done.
+         */
+        switch (NVME_SGL_TYPE(last_sgld->type)) {
+        case NVME_SGL_DESCR_TYPE_DATA_BLOCK:
+        case NVME_SGL_DESCR_TYPE_BIT_BUCKET:
             status = nvme_map_sgl_data(n, qsg, iov, segment, nsgld, &len, req);
             if (status) {
                 goto unmap;
             }
 
             goto out;
+
+        default:
+            break;
         }
 
         /*
-         * If the last descriptor was not a Data Block, then the current
-         * segment must not be a Last Segment.
+         * If the last descriptor was not a Data Block or Bit Bucket, then the
+         * current segment must not be a Last Segment.
          */
         if (NVME_SGL_TYPE(sgld->type) == NVME_SGL_DESCR_TYPE_LAST_SEGMENT) {
             status = NVME_INVALID_SGL_SEG_DESCR | NVME_DNR;
@@ -2537,7 +2557,8 @@ static void nvme_init_ctrl(NvmeCtrl *n)
     id->nn = cpu_to_le32(n->num_namespaces);
     id->oncs = cpu_to_le16(NVME_ONCS_WRITE_ZEROES | NVME_ONCS_TIMESTAMP);
 
-    id->sgls = cpu_to_le32(NVME_CTRL_SGLS_SUPPORTED_NO_ALIGNMENT);
+    id->sgls = cpu_to_le32(NVME_CTRL_SGLS_SUPPORTED_NO_ALIGNMENT |
+                           NVME_CTRL_SGLS_BITBUCKET);
 
     pstrcpy((char *) id->subnqn, sizeof(id->subnqn), "nqn.2019-08.org.qemu:");
     pstrcat((char *) id->subnqn, sizeof(id->subnqn), n->params.serial);
-- 
2.26.0



^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v7 44/48] nvme: refactor identify active namespace id list
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (42 preceding siblings ...)
  2020-04-15  5:51 ` [PATCH v7 43/48] nvme: add support for sgl bit bucket descriptor Klaus Jensen
@ 2020-04-15  5:51 ` Klaus Jensen
  2020-04-15  5:51 ` [PATCH v7 45/48] nvme: support multiple namespaces Klaus Jensen
                   ` (5 subsequent siblings)
  49 siblings, 0 replies; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:51 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

Prepare to support inactive namespaces.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 hw/block/nvme.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index f295f027b8e2..05a6fa334a70 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -1549,16 +1549,16 @@ static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req)
     uint32_t min_nsid = le32_to_cpu(c->nsid);
     uint32_t *list;
     uint16_t ret;
-    int i, j = 0;
+    int j = 0;
 
     trace_nvme_dev_identify_nslist(min_nsid);
 
     list = g_malloc0(data_len);
-    for (i = 0; i < n->num_namespaces; i++) {
-        if (i < min_nsid) {
+    for (int i = 1; i <= n->num_namespaces; i++) {
+        if (i <= min_nsid) {
             continue;
         }
-        list[j++] = cpu_to_le32(i + 1);
+        list[j++] = cpu_to_le32(i);
         if (j == data_len / sizeof(uint32_t)) {
             break;
         }
-- 
2.26.0



^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v7 45/48] nvme: support multiple namespaces
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (43 preceding siblings ...)
  2020-04-15  5:51 ` [PATCH v7 44/48] nvme: refactor identify active namespace id list Klaus Jensen
@ 2020-04-15  5:51 ` Klaus Jensen
  2020-04-15  7:38   ` Philippe Mathieu-Daudé
  2020-04-15  5:51 ` [PATCH v7 46/48] pci: allocate pci id for nvme Klaus Jensen
                   ` (4 subsequent siblings)
  49 siblings, 1 reply; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:51 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

This adds support for multiple namespaces by introducing a new 'nvme-ns'
device model. The nvme device creates a bus named after its device id; the
nvme-ns devices then connect to this bus and register themselves with the
nvme device.

This changes how an nvme device is created. Example with two namespaces:

  -drive file=nvme0n1.img,if=none,id=disk1
  -drive file=nvme0n2.img,if=none,id=disk2
  -device nvme,serial=deadbeef,id=nvme0
  -device nvme-ns,drive=disk1,bus=nvme0,nsid=1
  -device nvme-ns,drive=disk2,bus=nvme0,nsid=2

The drive property is kept on the nvme device for backward compatibility,
but it is now optional. Specifying a drive on the nvme device will always
create a namespace with nsid 1.
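
Namespace lookup in the command path then becomes a simple array access on
the controller; roughly (a sketch of the nvme_ns() helper this patch adds
to hw/block/nvme.h, modulo exact wording):

    static inline NvmeNamespace *nvme_ns(NvmeCtrl *n, uint32_t nsid)
    {
        if (!nsid || nsid > n->num_namespaces) {
            return NULL;
        }

        return n->namespaces[nsid - 1];
    }

A NULL return means the nsid is in range but no nvme-ns device has
registered itself for it (an inactive namespace).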

Signed-off-by: Klaus Jensen <klaus.jensen@cnexlabs.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
---
 hw/block/Makefile.objs |   2 +-
 hw/block/nvme-ns.c     | 157 +++++++++++++++++++++++++++
 hw/block/nvme-ns.h     |  60 +++++++++++
 hw/block/nvme.c        | 233 +++++++++++++++++++++++++++--------------
 hw/block/nvme.h        |  47 ++++-----
 hw/block/trace-events  |   8 +-
 6 files changed, 396 insertions(+), 111 deletions(-)
 create mode 100644 hw/block/nvme-ns.c
 create mode 100644 hw/block/nvme-ns.h

diff --git a/hw/block/Makefile.objs b/hw/block/Makefile.objs
index 4b4a2b338dc4..d9141d6a4b9b 100644
--- a/hw/block/Makefile.objs
+++ b/hw/block/Makefile.objs
@@ -7,7 +7,7 @@ common-obj-$(CONFIG_PFLASH_CFI02) += pflash_cfi02.o
 common-obj-$(CONFIG_XEN) += xen-block.o
 common-obj-$(CONFIG_ECC) += ecc.o
 common-obj-$(CONFIG_ONENAND) += onenand.o
-common-obj-$(CONFIG_NVME_PCI) += nvme.o
+common-obj-$(CONFIG_NVME_PCI) += nvme.o nvme-ns.o
 common-obj-$(CONFIG_SWIM) += swim.o
 
 common-obj-$(CONFIG_SH4) += tc58128.o
diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c
new file mode 100644
index 000000000000..bd64d4a94632
--- /dev/null
+++ b/hw/block/nvme-ns.c
@@ -0,0 +1,157 @@
+#include "qemu/osdep.h"
+#include "qemu/units.h"
+#include "qemu/cutils.h"
+#include "qemu/log.h"
+#include "hw/block/block.h"
+#include "hw/pci/pci.h"
+#include "sysemu/sysemu.h"
+#include "sysemu/block-backend.h"
+#include "qapi/error.h"
+
+#include "hw/qdev-properties.h"
+#include "hw/qdev-core.h"
+
+#include "nvme.h"
+#include "nvme-ns.h"
+
+static int nvme_ns_init(NvmeNamespace *ns)
+{
+    NvmeIdNs *id_ns = &ns->id_ns;
+
+    id_ns->lbaf[0].ds = BDRV_SECTOR_BITS;
+    id_ns->nsze = cpu_to_le64(nvme_ns_nlbas(ns));
+
+    /* no thin provisioning */
+    id_ns->ncap = id_ns->nsze;
+    id_ns->nuse = id_ns->ncap;
+
+    return 0;
+}
+
+static int nvme_ns_init_blk(NvmeCtrl *n, NvmeNamespace *ns, NvmeIdCtrl *id,
+                            Error **errp)
+{
+    uint64_t perm, shared_perm;
+
+    Error *local_err = NULL;
+    int ret;
+
+    perm = BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE;
+    shared_perm = BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED |
+        BLK_PERM_GRAPH_MOD;
+
+    ret = blk_set_perm(ns->blk, perm, shared_perm, &local_err);
+    if (ret) {
+        error_propagate_prepend(errp, local_err,
+                                "could not set block permissions: ");
+        return ret;
+    }
+
+    ns->size = blk_getlength(ns->blk);
+    if (ns->size < 0) {
+        error_setg_errno(errp, -ns->size, "could not get blockdev size");
+        return -1;
+    }
+
+    switch (n->conf.wce) {
+    case ON_OFF_AUTO_ON:
+        n->features.volatile_wc = 1;
+        break;
+    case ON_OFF_AUTO_OFF:
+        n->features.volatile_wc = 0;
+        break;
+    case ON_OFF_AUTO_AUTO:
+        n->features.volatile_wc = blk_enable_write_cache(ns->blk);
+        break;
+    default:
+        abort();
+    }
+
+    blk_set_enable_write_cache(ns->blk, n->features.volatile_wc);
+
+    return 0;
+}
+
+static int nvme_ns_check_constraints(NvmeNamespace *ns, Error **errp)
+{
+    if (!ns->blk) {
+        error_setg(errp, "block backend not configured");
+        return -1;
+    }
+
+    return 0;
+}
+
+int nvme_ns_setup(NvmeCtrl *n, NvmeNamespace *ns, Error **errp)
+{
+    if (nvme_ns_check_constraints(ns, errp)) {
+        return -1;
+    }
+
+    if (nvme_ns_init_blk(n, ns, &n->id_ctrl, errp)) {
+        return -1;
+    }
+
+    nvme_ns_init(ns);
+    if (nvme_register_namespace(n, ns, errp)) {
+        return -1;
+    }
+
+    return 0;
+}
+
+static void nvme_ns_realize(DeviceState *dev, Error **errp)
+{
+    NvmeNamespace *ns = NVME_NS(dev);
+    BusState *s = qdev_get_parent_bus(dev);
+    NvmeCtrl *n = NVME(s->parent);
+    Error *local_err = NULL;
+
+    if (nvme_ns_setup(n, ns, &local_err)) {
+        error_propagate_prepend(errp, local_err,
+                                "could not setup namespace: ");
+        return;
+    }
+}
+
+static Property nvme_ns_props[] = {
+    DEFINE_NVME_NS_PROPERTIES(NvmeNamespace, params),
+    DEFINE_PROP_END_OF_LIST(),
+};
+
+static void nvme_ns_class_init(ObjectClass *oc, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(oc);
+
+    set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
+
+    dc->bus_type = TYPE_NVME_BUS;
+    dc->realize = nvme_ns_realize;
+    device_class_set_props(dc, nvme_ns_props);
+    dc->desc = "virtual nvme namespace";
+}
+
+static void nvme_ns_instance_init(Object *obj)
+{
+    NvmeNamespace *ns = NVME_NS(obj);
+    char *bootindex = g_strdup_printf("/namespace@%d,0", ns->params.nsid);
+
+    device_add_bootindex_property(obj, &ns->bootindex, "bootindex",
+                                  bootindex, DEVICE(obj), &error_abort);
+
+    g_free(bootindex);
+}
+
+static const TypeInfo nvme_ns_info = {
+    .name = TYPE_NVME_NS,
+    .parent = TYPE_DEVICE,
+    .class_init = nvme_ns_class_init,
+    .instance_size = sizeof(NvmeNamespace),
+    .instance_init = nvme_ns_instance_init,
+};
+
+static void nvme_ns_register_types(void)
+{
+    type_register_static(&nvme_ns_info);
+}
+
+type_init(nvme_ns_register_types)
diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h
new file mode 100644
index 000000000000..3c3651d485d0
--- /dev/null
+++ b/hw/block/nvme-ns.h
@@ -0,0 +1,60 @@
+#ifndef NVME_NS_H
+#define NVME_NS_H
+
+#define TYPE_NVME_NS "nvme-ns"
+#define NVME_NS(obj) \
+    OBJECT_CHECK(NvmeNamespace, (obj), TYPE_NVME_NS)
+
+#define DEFINE_NVME_NS_PROPERTIES(_state, _props) \
+    DEFINE_PROP_DRIVE("drive", _state, blk), \
+    DEFINE_PROP_UINT32("nsid", _state, _props.nsid, 0)
+
+typedef struct NvmeNamespaceParams {
+    uint32_t nsid;
+} NvmeNamespaceParams;
+
+typedef struct NvmeNamespace {
+    DeviceState  parent_obj;
+    BlockBackend *blk;
+    int32_t      bootindex;
+    int64_t      size;
+
+    NvmeIdNs            id_ns;
+    NvmeNamespaceParams params;
+} NvmeNamespace;
+
+static inline uint32_t nvme_nsid(NvmeNamespace *ns)
+{
+    if (ns) {
+        return ns->params.nsid;
+    }
+
+    return -1;
+}
+
+static inline NvmeLBAF *nvme_ns_lbaf(NvmeNamespace *ns)
+{
+    NvmeIdNs *id_ns = &ns->id_ns;
+    return &id_ns->lbaf[NVME_ID_NS_FLBAS_INDEX(id_ns->flbas)];
+}
+
+static inline uint8_t nvme_ns_lbads(NvmeNamespace *ns)
+{
+    return nvme_ns_lbaf(ns)->ds;
+}
+
+static inline size_t nvme_ns_lbads_bytes(NvmeNamespace *ns)
+{
+    return 1 << nvme_ns_lbads(ns);
+}
+
+static inline uint64_t nvme_ns_nlbas(NvmeNamespace *ns)
+{
+    return ns->size >> nvme_ns_lbads(ns);
+}
+
+typedef struct NvmeCtrl NvmeCtrl;
+
+int nvme_ns_setup(NvmeCtrl *n, NvmeNamespace *ns, Error **errp);
+
+#endif /* NVME_NS_H */
diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 05a6fa334a70..e338d0893a70 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -17,10 +17,11 @@
 /**
  * Usage: add options:
  *      -drive file=<file>,if=none,id=<drive_id>
- *      -device nvme,drive=<drive_id>,serial=<serial>,id=<id[optional]>, \
+ *      -device nvme,serial=<serial>,id=<bus_name>, \
  *              cmb_size_mb=<cmb_size_mb[optional]>, \
  *              max_ioqpairs=<N[optional]>, \
  *              mdts=<mdts[optional]>
+ *      -device nvme-ns,drive=<drive_id>,bus=bus_name,nsid=1
  *
  * Note cmb_size_mb denotes size of CMB in MB. CMB is assumed to be at
  * offset 0 in BAR2 and supports only WDS, RDS and SQS for now.
@@ -44,6 +45,7 @@
 #include "qemu/cutils.h"
 #include "trace.h"
 #include "nvme.h"
+#include "nvme-ns.h"
 
 #define NVME_SPEC_VER 0x00010300
 #define NVME_CMB_BIR 2
@@ -89,6 +91,11 @@ static int nvme_addr_read(NvmeCtrl *n, hwaddr addr, void *buf, int size)
     return pci_dma_read(&n->parent_obj, addr, buf, size);
 }
 
+static uint16_t nvme_nsid_valid(NvmeCtrl *n, uint32_t nsid)
+{
+    return nsid && nsid <= n->num_namespaces;
+}
+
 static int nvme_check_sqid(NvmeCtrl *n, uint16_t sqid)
 {
     return sqid < n->params.max_ioqpairs + 1 && n->sq[sqid] != NULL ? 0 : -1;
@@ -949,11 +956,12 @@ static uint16_t nvme_check_rw(NvmeCtrl *n, NvmeRequest *req)
 
 static void nvme_rw_cb(NvmeRequest *req, void *opaque)
 {
+    NvmeNamespace *ns = req->ns;
     NvmeSQueue *sq = req->sq;
     NvmeCtrl *n = sq->ctrl;
     NvmeCQueue *cq = n->cq[sq->cqid];
 
-    trace_nvme_dev_rw_cb(nvme_cid(req));
+    trace_nvme_dev_rw_cb(nvme_cid(req), nvme_nsid(ns));
 
     nvme_enqueue_req_completion(cq, req);
 }
@@ -1036,10 +1044,11 @@ static void nvme_aio_cb(void *opaque, int ret)
 
 static uint16_t nvme_flush(NvmeCtrl *n, NvmeRequest *req)
 {
+    NvmeNamespace *ns = req->ns;
     NvmeAIO *aio = g_new0(NvmeAIO, 1);
 
     *aio = (NvmeAIO) {
-        .blk = n->conf.blk,
+        .blk = ns->blk,
         .req = req,
     };
 
@@ -1063,11 +1072,12 @@ static uint16_t nvme_write_zeroes(NvmeCtrl *n, NvmeRequest *req)
     req->slba = le64_to_cpu(rw->slba);
     req->nlb  = le16_to_cpu(rw->nlb) + 1;
 
-    trace_nvme_dev_write_zeroes(nvme_cid(req), req->slba, req->nlb);
+    trace_nvme_dev_write_zeroes(nvme_cid(req), nvme_nsid(ns), req->slba,
+                                req->nlb);
 
     status = nvme_check_bounds(n, req->slba, req->nlb, req);
     if (status) {
-        return status;
+        goto invalid;
     }
 
     offset = req->slba << nvme_ns_lbads(ns);
@@ -1076,7 +1086,7 @@ static uint16_t nvme_write_zeroes(NvmeCtrl *n, NvmeRequest *req)
     aio = g_new0(NvmeAIO, 1);
 
     *aio = (NvmeAIO) {
-        .blk = n->conf.blk,
+        .blk = ns->blk,
         .offset = offset,
         .len = count,
         .req = req,
@@ -1086,6 +1096,10 @@ static uint16_t nvme_write_zeroes(NvmeCtrl *n, NvmeRequest *req)
     nvme_submit_aio(aio);
 
     return NVME_NO_COMPLETE;
+
+invalid:
+    block_acct_invalid(blk_get_stats(ns->blk), BLOCK_ACCT_WRITE);
+    return status;
 }
 
 static uint16_t nvme_rw(NvmeCtrl *n, NvmeRequest *req)
@@ -1100,11 +1114,11 @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeRequest *req)
 
     req->nlb  = le16_to_cpu(rw->nlb) + 1;
     req->slba = le64_to_cpu(rw->slba);
-
     len = req->nlb << nvme_ns_lbads(ns);
 
-    trace_nvme_dev_rw(nvme_req_is_write(req) ? "write" : "read", req->nlb,
-                      req->nlb << nvme_ns_lbads(req->ns), req->slba);
+    trace_nvme_dev_rw(nvme_cid(req), nvme_req_is_write(req) ? "write" : "read",
+                      nvme_nsid(ns), req->nlb, req->nlb << nvme_ns_lbads(ns),
+                      req->slba);
 
     status = nvme_check_rw(n, req);
     if (status) {
@@ -1116,13 +1130,13 @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeRequest *req)
         goto invalid;
     }
 
-    nvme_rw_aio(n->conf.blk, req->slba << nvme_ns_lbads(ns), req);
+    nvme_rw_aio(ns->blk, req->slba << nvme_ns_lbads(ns), req);
     nvme_req_set_cb(req, nvme_rw_cb, NULL);
 
     return NVME_NO_COMPLETE;
 
 invalid:
-    block_acct_invalid(blk_get_stats(n->conf.blk), acct);
+    block_acct_invalid(blk_get_stats(ns->blk), acct);
     return status;
 }
 
@@ -1133,12 +1147,15 @@ static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeRequest *req)
     trace_nvme_dev_io_cmd(nvme_cid(req), nsid, le16_to_cpu(req->sq->sqid),
                           req->cmd.opcode);
 
-    if (unlikely(nsid == 0 || nsid > n->num_namespaces)) {
-        trace_nvme_dev_err_invalid_ns(nsid, n->num_namespaces);
+    if (!nvme_nsid_valid(n, nsid)) {
         return NVME_INVALID_NSID | NVME_DNR;
     }
 
-    req->ns = &n->namespaces[nsid - 1];
+    req->ns = nvme_ns(n, nsid);
+    if (unlikely(!req->ns)) {
+        return NVME_INVALID_FIELD | NVME_DNR;
+    }
+
     switch (req->cmd.opcode) {
     case NVME_CMD_FLUSH:
         return nvme_flush(n, req);
@@ -1283,18 +1300,24 @@ static uint16_t nvme_smart_info(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
     uint64_t units_read = 0, units_written = 0;
     uint64_t read_commands = 0, write_commands = 0;
     NvmeSmartLog smart;
-    BlockAcctStats *s;
 
     if (nsid && nsid != 0xffffffff) {
         return NVME_INVALID_FIELD | NVME_DNR;
     }
 
-    s = blk_get_stats(n->conf.blk);
+    for (int i = 1; i <= n->num_namespaces; i++) {
+        NvmeNamespace *ns = nvme_ns(n, i);
+        if (!ns) {
+            continue;
+        }
 
-    units_read = s->nr_bytes[BLOCK_ACCT_READ] >> BDRV_SECTOR_BITS;
-    units_written = s->nr_bytes[BLOCK_ACCT_WRITE] >> BDRV_SECTOR_BITS;
-    read_commands = s->nr_ops[BLOCK_ACCT_READ];
-    write_commands = s->nr_ops[BLOCK_ACCT_WRITE];
+        BlockAcctStats *s = blk_get_stats(ns->blk);
+
+        units_read += s->nr_bytes[BLOCK_ACCT_READ] >> BDRV_SECTOR_BITS;
+        units_written += s->nr_bytes[BLOCK_ACCT_WRITE] >> BDRV_SECTOR_BITS;
+        read_commands += s->nr_ops[BLOCK_ACCT_READ];
+        write_commands += s->nr_ops[BLOCK_ACCT_WRITE];
+    }
 
     if (off > sizeof(smart)) {
         return NVME_INVALID_FIELD | NVME_DNR;
@@ -1527,18 +1550,23 @@ static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req)
 {
     NvmeNamespace *ns;
     NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
+    NvmeIdNs *id_ns, inactive = { 0 };
     uint32_t nsid = le32_to_cpu(c->nsid);
 
     trace_nvme_dev_identify_ns(nsid);
 
-    if (unlikely(nsid == 0 || nsid > n->num_namespaces)) {
-        trace_nvme_dev_err_invalid_ns(nsid, n->num_namespaces);
+    if (!nvme_nsid_valid(n, nsid)) {
         return NVME_INVALID_NSID | NVME_DNR;
     }
 
-    ns = &n->namespaces[nsid - 1];
+    ns = nvme_ns(n, nsid);
+    if (unlikely(!ns)) {
+        id_ns = &inactive;
+    } else {
+        id_ns = &ns->id_ns;
+    }
 
-    return nvme_dma(n, (uint8_t *)&ns->id_ns, sizeof(ns->id_ns),
+    return nvme_dma(n, (uint8_t *)id_ns, sizeof(NvmeIdNs),
                     DMA_DIRECTION_FROM_DEVICE, req);
 }
 
@@ -1555,7 +1583,7 @@ static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req)
 
     list = g_malloc0(data_len);
     for (int i = 1; i <= n->num_namespaces; i++) {
-        if (i <= min_nsid) {
+        if (i <= min_nsid || !nvme_ns(n, i)) {
             continue;
         }
         list[j++] = cpu_to_le32(i);
@@ -1573,7 +1601,6 @@ static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req)
 {
     NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
     uint32_t nsid = le32_to_cpu(c->nsid);
-
     uint8_t list[NVME_IDENTIFY_DATA_SIZE];
 
     struct data {
@@ -1587,11 +1614,14 @@ static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req)
 
     trace_nvme_dev_identify_ns_descr_list(nsid);
 
-    if (unlikely(nsid == 0 || nsid > n->num_namespaces)) {
-        trace_nvme_dev_err_invalid_ns(nsid, n->num_namespaces);
+    if (!nvme_nsid_valid(n, nsid)) {
         return NVME_INVALID_NSID | NVME_DNR;
     }
 
+    if (unlikely(!nvme_ns(n, nsid))) {
+        return NVME_INVALID_FIELD | NVME_DNR;
+    }
+
     /*
      * Because the NGUID and EUI64 fields are 0 in the Identify Namespace data
      * structure, a Namespace UUID (nidt = 0x3) must be reported in the
@@ -1726,7 +1756,7 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeRequest *req)
         result = cpu_to_le32(n->features.err_rec);
         break;
     case NVME_VOLATILE_WRITE_CACHE:
-        result = cpu_to_le32(blk_enable_write_cache(n->conf.blk));
+        result = cpu_to_le32(n->features.volatile_wc);
         trace_nvme_dev_getfeat_vwcache(result ? "enabled" : "disabled");
         break;
     case NVME_NUMBER_OF_QUEUES:
@@ -1779,6 +1809,8 @@ static uint16_t nvme_set_feature_timestamp(NvmeCtrl *n, NvmeRequest *req)
 
 static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeRequest *req)
 {
+    NvmeNamespace *ns;
+
     NvmeCmd *cmd = &req->cmd;
     uint32_t dw10 = le32_to_cpu(cmd->cdw10);
     uint32_t dw11 = le32_to_cpu(cmd->cdw11);
@@ -1812,12 +1844,23 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeRequest *req)
 
         break;
     case NVME_VOLATILE_WRITE_CACHE:
-        if (blk_enable_write_cache(n->conf.blk)) {
-            blk_flush(n->conf.blk);
+        n->features.volatile_wc = dw11;
+
+        for (int i = 1; i <= n->num_namespaces; i++) {
+            ns = nvme_ns(n, i);
+            if (!ns) {
+                continue;
+            }
+
+            if (blk_enable_write_cache(ns->blk)) {
+                blk_flush(ns->blk);
+            }
+
+            blk_set_enable_write_cache(ns->blk, dw11 & 1);
         }
 
-        blk_set_enable_write_cache(n->conf.blk, dw11 & 1);
         break;
+
     case NVME_NUMBER_OF_QUEUES:
         if (n->qs_created) {
             return NVME_CMD_SEQ_ERROR | NVME_DNR;
@@ -1946,9 +1989,17 @@ static void nvme_process_sq(void *opaque)
 
 static void nvme_clear_ctrl(NvmeCtrl *n)
 {
+    NvmeNamespace *ns;
     int i;
 
-    blk_drain(n->conf.blk);
+    for (i = 1; i <= n->num_namespaces; i++) {
+        ns = nvme_ns(n, i);
+        if (!ns) {
+            continue;
+        }
+
+        blk_drain(ns->blk);
+    }
 
     for (i = 0; i < n->params.max_ioqpairs + 1; i++) {
         if (n->sq[i] != NULL) {
@@ -1971,7 +2022,15 @@ static void nvme_clear_ctrl(NvmeCtrl *n)
     n->outstanding_aers = 0;
     n->qs_created = false;
 
-    blk_flush(n->conf.blk);
+    for (i = 1; i <= n->num_namespaces; i++) {
+        ns = nvme_ns(n, i);
+        if (!ns) {
+            continue;
+        }
+
+        blk_flush(ns->blk);
+    }
+
     n->bar.cc = 0;
 }
 
@@ -2406,6 +2465,11 @@ static int nvme_check_constraints(NvmeCtrl *n, Error **errp)
         params->max_ioqpairs = params->num_queues - 1;
     }
 
+    if (n->namespace.blk) {
+        warn_report("nvme: drive is deprecated; please use an nvme-ns device "
+                    "instead");
+    }
+
     if (params->max_ioqpairs < 1 ||
         params->max_ioqpairs > PCI_MSIX_FLAGS_QSIZE) {
         error_setg(errp, "nvme: max_ioqpairs must be between 1 and %d",
@@ -2413,11 +2477,6 @@ static int nvme_check_constraints(NvmeCtrl *n, Error **errp)
         return -1;
     }
 
-    if (!n->conf.blk) {
-        error_setg(errp, "nvme: block backend not configured");
-        return -1;
-    }
-
     if (!params->serial) {
         error_setg(errp, "nvme: serial not configured");
         return -1;
@@ -2426,22 +2485,10 @@ static int nvme_check_constraints(NvmeCtrl *n, Error **errp)
     return 0;
 }
 
-static int nvme_init_blk(NvmeCtrl *n, Error **errp)
-{
-    blkconf_blocksizes(&n->conf);
-    if (!blkconf_apply_backend_options(&n->conf, blk_is_read_only(n->conf.blk),
-                                       false, errp)) {
-        return -1;
-    }
-
-    return 0;
-}
-
 static void nvme_init_state(NvmeCtrl *n)
 {
-    n->num_namespaces = 1;
+    n->num_namespaces = NVME_MAX_NAMESPACES;
     n->reg_size = pow2ceil(0x1008 + 2 * (n->params.max_ioqpairs) * 4);
-    n->namespaces = g_new0(NvmeNamespace, n->num_namespaces);
     n->sq = g_new0(NvmeSQueue *, n->params.max_ioqpairs + 1);
     n->cq = g_new0(NvmeCQueue *, n->params.max_ioqpairs + 1);
     n->temperature = NVME_TEMPERATURE;
@@ -2557,6 +2604,7 @@ static void nvme_init_ctrl(NvmeCtrl *n)
     id->nn = cpu_to_le32(n->num_namespaces);
     id->oncs = cpu_to_le16(NVME_ONCS_WRITE_ZEROES | NVME_ONCS_TIMESTAMP);
 
+    id->vwc = 0x1;
     id->sgls = cpu_to_le32(NVME_CTRL_SGLS_SUPPORTED_NO_ALIGNMENT |
                            NVME_CTRL_SGLS_BITBUCKET);
 
@@ -2566,9 +2614,6 @@ static void nvme_init_ctrl(NvmeCtrl *n)
     id->psd[0].mp = cpu_to_le16(0x9c4);
     id->psd[0].enlat = cpu_to_le32(0x10);
     id->psd[0].exlat = cpu_to_le32(0x4);
-    if (blk_enable_write_cache(n->conf.blk)) {
-        id->vwc = 1;
-    }
 
     n->bar.cap = 0;
     NVME_CAP_SET_MQES(n->bar.cap, 0x7ff);
@@ -2581,25 +2626,39 @@ static void nvme_init_ctrl(NvmeCtrl *n)
     n->bar.intmc = n->bar.intms = 0;
 }
 
-static int nvme_init_namespace(NvmeCtrl *n, NvmeNamespace *ns, Error **errp)
+int nvme_register_namespace(NvmeCtrl *n, NvmeNamespace *ns, Error **errp)
 {
-    int64_t bs_size;
-    NvmeIdNs *id_ns = &ns->id_ns;
+    uint32_t nsid = nvme_nsid(ns);
 
-    bs_size = blk_getlength(n->conf.blk);
-    if (bs_size < 0) {
-        error_setg_errno(errp, -bs_size, "blk_getlength");
+    if (nsid > NVME_MAX_NAMESPACES) {
+        error_setg(errp, "invalid namespace id (must be between 0 and %d)",
+                   NVME_MAX_NAMESPACES);
         return -1;
     }
 
-    id_ns->lbaf[0].ds = BDRV_SECTOR_BITS;
-    n->ns_size = bs_size;
+    if (!nsid) {
+        for (int i = 1; i <= n->num_namespaces; i++) {
+            NvmeNamespace *ns = nvme_ns(n, i);
+            if (!ns) {
+                nsid = i;
+                break;
+            }
+        }
 
-    id_ns->nsze = cpu_to_le64(nvme_ns_nlbas(n, ns));
+        if (!nsid) {
+            error_setg(errp, "no free namespace id");
+            return -1;
+        }
+    } else {
+        if (n->namespaces[nsid - 1]) {
+            error_setg(errp, "namespace id '%d' is already in use", nsid);
+            return -1;
+        }
+    }
 
-    /* no thin provisioning */
-    id_ns->ncap = id_ns->nsze;
-    id_ns->nuse = id_ns->ncap;
+    trace_nvme_dev_register_namespace(nsid);
+
+    n->namespaces[nsid - 1] = ns;
 
     return 0;
 }
@@ -2607,26 +2666,28 @@ static int nvme_init_namespace(NvmeCtrl *n, NvmeNamespace *ns, Error **errp)
 static void nvme_realize(PCIDevice *pci_dev, Error **errp)
 {
     NvmeCtrl *n = NVME(pci_dev);
-    int i;
+    NvmeNamespace *ns;
 
     if (nvme_check_constraints(n, errp)) {
         return;
     }
 
+    qbus_create_inplace(&n->bus, sizeof(NvmeBus), TYPE_NVME_BUS,
+                        &pci_dev->qdev, n->parent_obj.qdev.id);
+
     nvme_init_state(n);
+    nvme_init_pci(n, pci_dev);
+    nvme_init_ctrl(n);
 
-    if (nvme_init_blk(n, errp)) {
-        return;
-    }
+    /* setup a namespace if the controller drive property was given */
+    if (n->namespace.blk) {
+        ns = &n->namespace;
+        ns->params.nsid = 1;
 
-    for (i = 0; i < n->num_namespaces; i++) {
-        if (nvme_init_namespace(n, &n->namespaces[i], errp)) {
+        if (nvme_ns_setup(n, ns, errp)) {
             return;
         }
     }
-
-    nvme_init_pci(n, pci_dev);
-    nvme_init_ctrl(n);
 }
 
 static void nvme_exit(PCIDevice *pci_dev)
@@ -2647,7 +2708,8 @@ static void nvme_exit(PCIDevice *pci_dev)
 }
 
 static Property nvme_props[] = {
-    DEFINE_BLOCK_PROPERTIES(NvmeCtrl, conf),
+    DEFINE_BLOCK_PROPERTIES_BASE(NvmeCtrl, conf), \
+    DEFINE_PROP_DRIVE("drive", NvmeCtrl, namespace.blk), \
     DEFINE_NVME_PROPERTIES(NvmeCtrl, params),
     DEFINE_PROP_END_OF_LIST(),
 };
@@ -2679,26 +2741,35 @@ static void nvme_instance_init(Object *obj)
 {
     NvmeCtrl *s = NVME(obj);
 
-    device_add_bootindex_property(obj, &s->conf.bootindex,
-                                  "bootindex", "/namespace@1,0",
-                                  DEVICE(obj), &error_abort);
+    if (s->namespace.blk) {
+        device_add_bootindex_property(obj, &s->conf.bootindex,
+                                      "bootindex", "/namespace@1,0",
+                                      DEVICE(obj), &error_abort);
+    }
 }
 
 static const TypeInfo nvme_info = {
     .name          = TYPE_NVME,
     .parent        = TYPE_PCI_DEVICE,
     .instance_size = sizeof(NvmeCtrl),
-    .class_init    = nvme_class_init,
     .instance_init = nvme_instance_init,
+    .class_init    = nvme_class_init,
     .interfaces = (InterfaceInfo[]) {
         { INTERFACE_PCIE_DEVICE },
         { }
     },
 };
 
+static const TypeInfo nvme_bus_info = {
+    .name = TYPE_NVME_BUS,
+    .parent = TYPE_BUS,
+    .instance_size = sizeof(NvmeBus),
+};
+
 static void nvme_register_types(void)
 {
     type_register_static(&nvme_info);
+    type_register_static(&nvme_bus_info);
 }
 
 type_init(nvme_register_types)
diff --git a/hw/block/nvme.h b/hw/block/nvme.h
index ed4a5ce4121d..f42c17651b7b 100644
--- a/hw/block/nvme.h
+++ b/hw/block/nvme.h
@@ -2,6 +2,9 @@
 #define HW_NVME_H
 
 #include "block/nvme.h"
+#include "nvme-ns.h"
+
+#define NVME_MAX_NAMESPACES 256
 
 #define DEFINE_NVME_PROPERTIES(_state, _props) \
     DEFINE_PROP_STRING("serial", _state, _props.serial), \
@@ -93,26 +96,6 @@ typedef struct NvmeCQueue {
     QTAILQ_HEAD(, NvmeRequest) req_list;
 } NvmeCQueue;
 
-typedef struct NvmeNamespace {
-    NvmeIdNs        id_ns;
-} NvmeNamespace;
-
-static inline NvmeLBAF *nvme_ns_lbaf(NvmeNamespace *ns)
-{
-    NvmeIdNs *id_ns = &ns->id_ns;
-    return &id_ns->lbaf[NVME_ID_NS_FLBAS_INDEX(id_ns->flbas)];
-}
-
-static inline uint8_t nvme_ns_lbads(NvmeNamespace *ns)
-{
-    return nvme_ns_lbaf(ns)->ds;
-}
-
-static inline size_t nvme_ns_lbads_bytes(NvmeNamespace *ns)
-{
-    return 1 << nvme_ns_lbads(ns);
-}
-
 typedef enum NvmeAIOOp {
     NVME_AIO_OPC_NONE         = 0x0,
     NVME_AIO_OPC_FLUSH        = 0x1,
@@ -167,6 +150,13 @@ static inline bool nvme_req_is_write(NvmeRequest *req)
     }
 }
 
+#define TYPE_NVME_BUS "nvme-bus"
+#define NVME_BUS(obj) OBJECT_CHECK(NvmeBus, (obj), TYPE_NVME_BUS)
+
+typedef struct NvmeBus {
+    BusState parent_bus;
+} NvmeBus;
+
 #define TYPE_NVME "nvme"
 #define NVME(obj) \
         OBJECT_CHECK(NvmeCtrl, (obj), TYPE_NVME)
@@ -176,8 +166,9 @@ typedef struct NvmeCtrl {
     MemoryRegion iomem;
     MemoryRegion ctrl_mem;
     NvmeBar      bar;
-    BlockConf    conf;
     NvmeParams   params;
+    NvmeBus      bus;
+    BlockConf    conf;
 
     bool        qs_created;
     uint32_t    page_size;
@@ -188,7 +179,6 @@ typedef struct NvmeCtrl {
     uint32_t    reg_size;
     uint32_t    num_namespaces;
     uint32_t    max_q_ents;
-    uint64_t    ns_size;
     uint8_t     outstanding_aers;
     uint8_t     *cmbuf;
     uint64_t    irq_status;
@@ -202,7 +192,8 @@ typedef struct NvmeCtrl {
     QTAILQ_HEAD(, NvmeAsyncEvent) aer_queue;
     int         aer_queued;
 
-    NvmeNamespace   *namespaces;
+    NvmeNamespace   namespace;
+    NvmeNamespace   *namespaces[NVME_MAX_NAMESPACES];
     NvmeSQueue      **sq;
     NvmeCQueue      **cq;
     NvmeSQueue      admin_sq;
@@ -211,9 +202,13 @@ typedef struct NvmeCtrl {
     NvmeFeatureVal  features;
 } NvmeCtrl;
 
-static inline uint64_t nvme_ns_nlbas(NvmeCtrl *n, NvmeNamespace *ns)
+static inline NvmeNamespace *nvme_ns(NvmeCtrl *n, uint32_t nsid)
 {
-    return n->ns_size >> nvme_ns_lbads(ns);
+    if (!nsid || nsid > n->num_namespaces) {
+        return NULL;
+    }
+
+    return n->namespaces[nsid - 1];
 }
 
 static inline uint16_t nvme_cid(NvmeRequest *req)
@@ -225,4 +220,6 @@ static inline uint16_t nvme_cid(NvmeRequest *req)
     return 0xffff;
 }
 
+int nvme_register_namespace(NvmeCtrl *n, NvmeNamespace *ns, Error **errp);
+
 #endif /* HW_NVME_H */
diff --git a/hw/block/trace-events b/hw/block/trace-events
index accbb04fe396..8fec87260555 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -29,6 +29,7 @@ hd_geometry_guess(void *blk, uint32_t cyls, uint32_t heads, uint32_t secs, int t
 
 # nvme.c
 # nvme traces for successful events
+nvme_dev_register_namespace(uint32_t nsid) "nsid %"PRIu32""
 nvme_dev_irq_msix(uint32_t vector) "raising MSI-X IRQ vector %u"
 nvme_dev_irq_pin(void) "pulsing IRQ pin"
 nvme_dev_irq_masked(void) "IRQ is masked"
@@ -39,9 +40,9 @@ nvme_dev_req_add_aio(uint16_t cid, void *aio, const char *blkname, uint64_t offs
 nvme_dev_aio_cb(uint16_t cid, void *aio, const char *blkname, uint64_t offset, const char *opc, void *req) "cid %"PRIu16" aio %p blk \"%s\" offset %"PRIu64" opc \"%s\" req %p"
 nvme_dev_io_cmd(uint16_t cid, uint32_t nsid, uint16_t sqid, uint8_t opcode) "cid %"PRIu16" nsid %"PRIu32" sqid %"PRIu16" opc 0x%"PRIx8""
 nvme_dev_admin_cmd(uint16_t cid, uint16_t sqid, uint8_t opcode) "cid %"PRIu16" sqid %"PRIu16" opc 0x%"PRIx8""
-nvme_dev_rw(const char *verb, uint32_t blk_count, uint64_t byte_count, uint64_t lba) "%s %"PRIu32" blocks (%"PRIu64" bytes) from LBA %"PRIu64""
-nvme_dev_rw_cb(uint16_t cid) "cid %"PRIu16""
-nvme_dev_write_zeroes(uint16_t cid, uint64_t slba, uint32_t nlb) "cid %"PRIu16" slba %"PRIu64" nlb %"PRIu32""
+nvme_dev_rw(uint16_t cid, const char *verb, uint32_t nsid, uint32_t nlb, uint64_t count, uint64_t lba) "cid %"PRIu16" %s nsid %"PRIu32" nlb %"PRIu32" count %"PRIu64" lba 0x%"PRIx64""
+nvme_dev_rw_cb(uint16_t cid, uint32_t nsid) "cid %"PRIu16" nsid %"PRIu32""
+nvme_dev_write_zeroes(uint16_t cid, uint32_t nsid, uint64_t slba, uint32_t nlb) "cid %"PRIu16" nsid %"PRIu32" slba %"PRIu64" nlb %"PRIu32""
 nvme_dev_create_sq(uint64_t addr, uint16_t sqid, uint16_t cqid, uint16_t qsize, uint16_t qflags) "create submission queue, addr=0x%"PRIx64", sqid=%"PRIu16", cqid=%"PRIu16", qsize=%"PRIu16", qflags=%"PRIu16""
 nvme_dev_create_cq(uint64_t addr, uint16_t cqid, uint16_t vector, uint16_t size, uint16_t qflags, int ien) "create completion queue, addr=0x%"PRIx64", cqid=%"PRIu16", vector=%"PRIu16", qsize=%"PRIu16", qflags=%"PRIu16", ien=%d"
 nvme_dev_del_sq(uint16_t qid) "deleting submission queue sqid=%"PRIu16""
@@ -98,7 +99,6 @@ nvme_dev_err_invalid_prplist_ent(uint64_t prplist) "PRP list entry is null or no
 nvme_dev_err_invalid_prp2_align(uint64_t prp2) "PRP2 is not page aligned: 0x%"PRIx64""
 nvme_dev_err_invalid_prp2_missing(void) "PRP2 is null and more data to be transferred"
 nvme_dev_err_invalid_prp(void) "invalid PRP"
-nvme_dev_err_invalid_ns(uint32_t ns, uint32_t limit) "invalid namespace %u not within 1-%u"
 nvme_dev_err_invalid_opc(uint8_t opc) "invalid opcode 0x%"PRIx8""
 nvme_dev_err_invalid_admin_opc(uint8_t opc) "invalid admin opcode 0x%"PRIx8""
 nvme_dev_err_invalid_lba_range(uint64_t start, uint64_t len, uint64_t limit) "Invalid LBA start=%"PRIu64" len=%"PRIu64" limit=%"PRIu64""
-- 
2.26.0
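
As a rough illustration of the controller/namespace split introduced by this
patch, a hypothetical command line (image files and ids made up, bus
assignment left implicit) could look like:

  -drive file=nvm1.img,if=none,id=nvm1
  -drive file=nvm2.img,if=none,id=nvm2
  -device nvme,serial=deadbeef
  -device nvme-ns,drive=nvm1,nsid=1
  -device nvme-ns,drive=nvm2,nsid=2

The legacy 'drive' property on the nvme controller keeps working for a single
namespace, but now emits the deprecation warning added above.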



^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v7 46/48] pci: allocate pci id for nvme
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (44 preceding siblings ...)
  2020-04-15  5:51 ` [PATCH v7 45/48] nvme: support multiple namespaces Klaus Jensen
@ 2020-04-15  5:51 ` Klaus Jensen
  2020-04-21  9:19   ` Gerd Hoffmann
  2020-04-15  5:51 ` [PATCH v7 47/48] nvme: change controller pci id Klaus Jensen
                   ` (3 subsequent siblings)
  49 siblings, 1 reply; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:51 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Gerd Hoffmann, Keith Busch, Javier Gonzalez,
	Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

The emulated nvme device (hw/block/nvme.c) is currently using an
internal Intel device id.

Prepare to change that by allocating a device id under the 1b36 (Red
Hat, Inc.) vendor id.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 MAINTAINERS            |  1 +
 docs/specs/nvme.txt    | 25 +++++++++++++++++++++++++
 docs/specs/pci-ids.txt |  1 +
 include/hw/pci/pci.h   |  1 +
 4 files changed, 28 insertions(+)
 create mode 100644 docs/specs/nvme.txt

diff --git a/MAINTAINERS b/MAINTAINERS
index 5f93e8c01d34..b4bbc58b668b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1736,6 +1736,7 @@ L: qemu-block@nongnu.org
 S: Supported
 F: hw/block/nvme*
 F: tests/qtest/nvme-test.c
+F: docs/specs/nvme.txt
 
 megasas
 M: Hannes Reinecke <hare@suse.com>
diff --git a/docs/specs/nvme.txt b/docs/specs/nvme.txt
new file mode 100644
index 000000000000..b51552cb5c3f
--- /dev/null
+++ b/docs/specs/nvme.txt
@@ -0,0 +1,25 @@
+NVM Express Controller
+======================
+
+The nvme device (-device nvme) emulates an NVM Express Controller.
+
+
+Reference Specifications
+------------------------
+
+The device currently implements most mandatory features of NVMe v1.3d, see
+
+  https://nvmexpress.org/resources/specifications/
+
+for the specification.
+
+
+Known issues
+------------
+
+* The device does not have any way of storing persistent state, so minor parts
+  of the implementation are in violation of the specification:
+    - The accounting numbers in the SMART/Health are reset across power cycles
+
+* Interrupt Coalescing is not supported and is disabled by default in violation
+  of the specification.
diff --git a/docs/specs/pci-ids.txt b/docs/specs/pci-ids.txt
index 4d53e5c7d9d5..abbdbca6be38 100644
--- a/docs/specs/pci-ids.txt
+++ b/docs/specs/pci-ids.txt
@@ -63,6 +63,7 @@ PCI devices (other than virtio):
 1b36:000b  PCIe Expander Bridge (-device pxb-pcie)
 1b36:000d  PCI xhci usb host adapter
 1b36:000f  mdpy (mdev sample device), linux/samples/vfio-mdev/mdpy.c
+1b36:0010  PCIe NVMe device (-device nvme)
 
 All these devices are documented in docs/specs.
 
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index da9057b8db97..92231885bc23 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -104,6 +104,7 @@ extern bool pci_available;
 #define PCI_DEVICE_ID_REDHAT_XHCI        0x000d
 #define PCI_DEVICE_ID_REDHAT_PCIE_BRIDGE 0x000e
 #define PCI_DEVICE_ID_REDHAT_MDPY        0x000f
+#define PCI_DEVICE_ID_REDHAT_NVME        0x0010
 #define PCI_DEVICE_ID_REDHAT_QXL         0x0100
 
 #define FMT_PCIBUS                      PRIx64
-- 
2.26.0



^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v7 47/48] nvme: change controller pci id
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (45 preceding siblings ...)
  2020-04-15  5:51 ` [PATCH v7 46/48] pci: allocate pci id for nvme Klaus Jensen
@ 2020-04-15  5:51 ` Klaus Jensen
  2020-04-15  5:51 ` [PATCH v7 48/48] nvme: make lba data size configurable Klaus Jensen
                   ` (2 subsequent siblings)
  49 siblings, 0 replies; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:51 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

There are two reasons for changing this:

  1. The nvme device currently uses an internal Intel device id.

  2. Since commits "nvme: fix write zeroes offset and count" and "nvme:
     support multiple namespaces" the controller device no longer has
     the quirks that the Linux kernel thinks it has.

     As the quirks are applied based on the pci vendor and device id, change
     the ids to get rid of the quirks.

To keep backward compatibility, add a new 'x-use-intel-id' parameter to
the nvme device that forces use of the Intel vendor and device id. This
is off by default, but a compat property turns it on for machine types
4.2 and older.
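
A hypothetical invocation that keeps the old Intel id on a newer machine
type (the serial value is made up) might look like:

  -device nvme,serial=deadbeef,x-use-intel-id=on

On machine types 4.2 and older the compat property applies the same setting
automatically.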

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 hw/block/nvme.c   | 13 +++++++++----
 hw/block/nvme.h   |  4 +++-
 hw/core/machine.c |  1 +
 3 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index e338d0893a70..40a400333828 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -2544,8 +2544,15 @@ static void nvme_init_pci(NvmeCtrl *n, PCIDevice *pci_dev)
 
     pci_conf[PCI_INTERRUPT_PIN] = 1;
     pci_config_set_prog_interface(pci_conf, 0x2);
-    pci_config_set_vendor_id(pci_conf, PCI_VENDOR_ID_INTEL);
-    pci_config_set_device_id(pci_conf, 0x5845);
+
+    if (n->params.use_intel_id) {
+        pci_config_set_vendor_id(pci_conf, PCI_VENDOR_ID_INTEL);
+        pci_config_set_device_id(pci_conf, 0x5846);
+    } else {
+        pci_config_set_vendor_id(pci_conf, PCI_VENDOR_ID_REDHAT);
+        pci_config_set_device_id(pci_conf, PCI_DEVICE_ID_REDHAT_NVME);
+    }
+
     pci_config_set_class(pci_conf, PCI_CLASS_STORAGE_EXPRESS);
     pcie_endpoint_cap_init(pci_dev, 0x80);
 
@@ -2727,8 +2734,6 @@ static void nvme_class_init(ObjectClass *oc, void *data)
     pc->realize = nvme_realize;
     pc->exit = nvme_exit;
     pc->class_id = PCI_CLASS_STORAGE_EXPRESS;
-    pc->vendor_id = PCI_VENDOR_ID_INTEL;
-    pc->device_id = 0x5845;
     pc->revision = 2;
 
     set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
diff --git a/hw/block/nvme.h b/hw/block/nvme.h
index f42c17651b7b..615a6ff5d13d 100644
--- a/hw/block/nvme.h
+++ b/hw/block/nvme.h
@@ -13,7 +13,8 @@
     DEFINE_PROP_UINT32("max_ioqpairs", _state, _props.max_ioqpairs, 64), \
     DEFINE_PROP_UINT8("aerl", _state, _props.aerl, 3), \
     DEFINE_PROP_UINT32("aer_max_queued", _state, _props.aer_max_queued, 64), \
-    DEFINE_PROP_UINT8("mdts", _state, _props.mdts, 7)
+    DEFINE_PROP_UINT8("mdts", _state, _props.mdts, 7), \
+    DEFINE_PROP_BOOL("x-use-intel-id", _state, _props.use_intel_id, false)
 
 typedef struct NvmeParams {
     char     *serial;
@@ -23,6 +24,7 @@ typedef struct NvmeParams {
     uint8_t  aerl;
     uint32_t aer_max_queued;
     uint8_t  mdts;
+    bool     use_intel_id;
 } NvmeParams;
 
 typedef struct NvmeAsyncEvent {
diff --git a/hw/core/machine.c b/hw/core/machine.c
index c1a444cb7558..de972a7e45dc 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -40,6 +40,7 @@ GlobalProperty hw_compat_4_2[] = {
     { "qxl", "revision", "4" },
     { "qxl-vga", "revision", "4" },
     { "fw_cfg", "acpi-mr-restore", "false" },
+    { "nvme", "x-use-intel-id", "on"},
 };
 const size_t hw_compat_4_2_len = G_N_ELEMENTS(hw_compat_4_2);
 
-- 
2.26.0



^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v7 48/48] nvme: make lba data size configurable
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (46 preceding siblings ...)
  2020-04-15  5:51 ` [PATCH v7 47/48] nvme: change controller pci id Klaus Jensen
@ 2020-04-15  5:51 ` Klaus Jensen
  2020-04-15  7:40   ` Philippe Mathieu-Daudé
  2020-04-15  7:17 ` [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces no-reply
  2020-04-15  8:02 ` no-reply
  49 siblings, 1 reply; 72+ messages in thread
From: Klaus Jensen @ 2020-04-15  5:51 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>
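
Add an 'lbads' parameter to the nvme-ns device so the LBA data size exposed
in the namespace's LBA format can be configured. The default stays at
BDRV_SECTOR_BITS (9, i.e. 512 byte blocks) and values from 9 to 12 are
accepted. A hypothetical invocation with 4096 byte blocks (drive id made up)
could be:

  -device nvme-ns,drive=nvm1,nsid=1,lbads=12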

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Acked-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 hw/block/nvme-ns.c | 7 ++++++-
 hw/block/nvme-ns.h | 4 +++-
 hw/block/nvme.c    | 1 +
 3 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c
index bd64d4a94632..d34b6b2439f1 100644
--- a/hw/block/nvme-ns.c
+++ b/hw/block/nvme-ns.c
@@ -18,7 +18,7 @@ static int nvme_ns_init(NvmeNamespace *ns)
 {
     NvmeIdNs *id_ns = &ns->id_ns;
 
-    id_ns->lbaf[0].ds = BDRV_SECTOR_BITS;
+    id_ns->lbaf[0].ds = ns->params.lbads;
     id_ns->nsze = cpu_to_le64(nvme_ns_nlbas(ns));
 
     /* no thin provisioning */
@@ -78,6 +78,11 @@ static int nvme_ns_check_constraints(NvmeNamespace *ns, Error **errp)
         return -1;
     }
 
+    if (ns->params.lbads < 9 || ns->params.lbads > 12) {
+        error_setg(errp, "unsupported lbads (supported: 9-12)");
+        return 1;
+    }
+
     return 0;
 }
 
diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h
index 3c3651d485d0..43b78f8b8d9c 100644
--- a/hw/block/nvme-ns.h
+++ b/hw/block/nvme-ns.h
@@ -7,10 +7,12 @@
 
 #define DEFINE_NVME_NS_PROPERTIES(_state, _props) \
     DEFINE_PROP_DRIVE("drive", _state, blk), \
-    DEFINE_PROP_UINT32("nsid", _state, _props.nsid, 0)
+    DEFINE_PROP_UINT32("nsid", _state, _props.nsid, 0), \
+    DEFINE_PROP_UINT8("lbads", _state, _props.lbads, BDRV_SECTOR_BITS)
 
 typedef struct NvmeNamespaceParams {
     uint32_t nsid;
+    uint8_t  lbads;
 } NvmeNamespaceParams;
 
 typedef struct NvmeNamespace {
diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 40a400333828..dd2759a4ce2e 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -2690,6 +2690,7 @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp)
     if (n->namespace.blk) {
         ns = &n->namespace;
         ns->params.nsid = 1;
+        ns->params.lbads = BDRV_SECTOR_BITS;
 
         if (nvme_ns_setup(n, ns, errp)) {
             return;
-- 
2.26.0



^ permalink raw reply related	[flat|nested] 72+ messages in thread

* Re: [PATCH v7 02/48] nvme: remove superfluous breaks
  2020-04-15  5:50 ` [PATCH v7 02/48] nvme: remove superfluous breaks Klaus Jensen
@ 2020-04-15  6:57   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 72+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-04-15  6:57 UTC (permalink / raw)
  To: Klaus Jensen, qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Keith Busch, Javier Gonzalez, Maxim Levitsky

On 4/15/20 7:50 AM, Klaus Jensen wrote:
> From: Klaus Jensen <k.jensen@samsung.com>
> 
> These break statements were left over when commit 3036a626e9ef ("nvme:
> add Get/Set Feature Timestamp support") was merged.
> 
> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> Acked-by: Keith Busch <kbusch@kernel.org>
> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
> ---
>   hw/block/nvme.c | 4 ----
>   1 file changed, 4 deletions(-)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index 01e18fb9eb1f..da0e8af42823 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -788,7 +788,6 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>           break;
>       case NVME_TIMESTAMP:
>           return nvme_get_feature_timestamp(n, cmd);
> -        break;
>       default:
>           trace_nvme_dev_err_invalid_getfeat(dw10);
>           return NVME_INVALID_FIELD | NVME_DNR;
> @@ -832,11 +831,8 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>           req->cqe.result =
>               cpu_to_le32((n->num_queues - 2) | ((n->num_queues - 2) << 16));
>           break;
> -
>       case NVME_TIMESTAMP:
>           return nvme_set_feature_timestamp(n, cmd);
> -        break;
> -
>       default:
>           trace_nvme_dev_err_invalid_setfeat(dw10);
>           return NVME_INVALID_FIELD | NVME_DNR;
> 

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v7 03/48] nvme: move device parameters to separate struct
  2020-04-15  5:50 ` [PATCH v7 03/48] nvme: move device parameters to separate struct Klaus Jensen
@ 2020-04-15  6:58   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 72+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-04-15  6:58 UTC (permalink / raw)
  To: Klaus Jensen, qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Keith Busch, Javier Gonzalez, Maxim Levitsky

On 4/15/20 7:50 AM, Klaus Jensen wrote:
> From: Klaus Jensen <k.jensen@samsung.com>
> 
> Move device configuration parameters to separate struct to make it
> explicit what is configurable and what is set internally.
> 
> Signed-off-by: Klaus Jensen <klaus.jensen@cnexlabs.com>
> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> Acked-by: Keith Busch <kbusch@kernel.org>
> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
> ---
>   hw/block/nvme.c | 44 ++++++++++++++++++++++----------------------
>   hw/block/nvme.h | 16 +++++++++++++---
>   2 files changed, 35 insertions(+), 25 deletions(-)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index da0e8af42823..249f759f076e 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -64,12 +64,12 @@ static void nvme_addr_read(NvmeCtrl *n, hwaddr addr, void *buf, int size)
>   
>   static int nvme_check_sqid(NvmeCtrl *n, uint16_t sqid)
>   {
> -    return sqid < n->num_queues && n->sq[sqid] != NULL ? 0 : -1;
> +    return sqid < n->params.num_queues && n->sq[sqid] != NULL ? 0 : -1;
>   }
>   
>   static int nvme_check_cqid(NvmeCtrl *n, uint16_t cqid)
>   {
> -    return cqid < n->num_queues && n->cq[cqid] != NULL ? 0 : -1;
> +    return cqid < n->params.num_queues && n->cq[cqid] != NULL ? 0 : -1;
>   }
>   
>   static void nvme_inc_cq_tail(NvmeCQueue *cq)
> @@ -631,7 +631,7 @@ static uint16_t nvme_create_cq(NvmeCtrl *n, NvmeCmd *cmd)
>           trace_nvme_dev_err_invalid_create_cq_addr(prp1);
>           return NVME_INVALID_FIELD | NVME_DNR;
>       }
> -    if (unlikely(vector > n->num_queues)) {
> +    if (unlikely(vector > n->params.num_queues)) {
>           trace_nvme_dev_err_invalid_create_cq_vector(vector);
>           return NVME_INVALID_IRQ_VECTOR | NVME_DNR;
>       }
> @@ -783,7 +783,8 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>           trace_nvme_dev_getfeat_vwcache(result ? "enabled" : "disabled");
>           break;
>       case NVME_NUMBER_OF_QUEUES:
> -        result = cpu_to_le32((n->num_queues - 2) | ((n->num_queues - 2) << 16));
> +        result = cpu_to_le32((n->params.num_queues - 2) |
> +                             ((n->params.num_queues - 2) << 16));
>           trace_nvme_dev_getfeat_numq(result);
>           break;
>       case NVME_TIMESTAMP:
> @@ -827,9 +828,10 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>       case NVME_NUMBER_OF_QUEUES:
>           trace_nvme_dev_setfeat_numq((dw11 & 0xFFFF) + 1,
>                                       ((dw11 >> 16) & 0xFFFF) + 1,
> -                                    n->num_queues - 1, n->num_queues - 1);
> -        req->cqe.result =
> -            cpu_to_le32((n->num_queues - 2) | ((n->num_queues - 2) << 16));
> +                                    n->params.num_queues - 1,
> +                                    n->params.num_queues - 1);
> +        req->cqe.result = cpu_to_le32((n->params.num_queues - 2) |
> +                                      ((n->params.num_queues - 2) << 16));
>           break;
>       case NVME_TIMESTAMP:
>           return nvme_set_feature_timestamp(n, cmd);
> @@ -900,12 +902,12 @@ static void nvme_clear_ctrl(NvmeCtrl *n)
>   
>       blk_drain(n->conf.blk);
>   
> -    for (i = 0; i < n->num_queues; i++) {
> +    for (i = 0; i < n->params.num_queues; i++) {
>           if (n->sq[i] != NULL) {
>               nvme_free_sq(n->sq[i], n);
>           }
>       }
> -    for (i = 0; i < n->num_queues; i++) {
> +    for (i = 0; i < n->params.num_queues; i++) {
>           if (n->cq[i] != NULL) {
>               nvme_free_cq(n->cq[i], n);
>           }
> @@ -1306,7 +1308,7 @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp)
>       int64_t bs_size;
>       uint8_t *pci_conf;
>   
> -    if (!n->num_queues) {
> +    if (!n->params.num_queues) {
>           error_setg(errp, "num_queues can't be zero");
>           return;
>       }
> @@ -1322,7 +1324,7 @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp)
>           return;
>       }
>   
> -    if (!n->serial) {
> +    if (!n->params.serial) {
>           error_setg(errp, "serial property not set");
>           return;
>       }
> @@ -1339,25 +1341,25 @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp)
>       pcie_endpoint_cap_init(pci_dev, 0x80);
>   
>       n->num_namespaces = 1;
> -    n->reg_size = pow2ceil(0x1004 + 2 * (n->num_queues + 1) * 4);
> +    n->reg_size = pow2ceil(0x1004 + 2 * (n->params.num_queues + 1) * 4);
>       n->ns_size = bs_size / (uint64_t)n->num_namespaces;
>   
>       n->namespaces = g_new0(NvmeNamespace, n->num_namespaces);
> -    n->sq = g_new0(NvmeSQueue *, n->num_queues);
> -    n->cq = g_new0(NvmeCQueue *, n->num_queues);
> +    n->sq = g_new0(NvmeSQueue *, n->params.num_queues);
> +    n->cq = g_new0(NvmeCQueue *, n->params.num_queues);
>   
>       memory_region_init_io(&n->iomem, OBJECT(n), &nvme_mmio_ops, n,
>                             "nvme", n->reg_size);
>       pci_register_bar(pci_dev, 0,
>           PCI_BASE_ADDRESS_SPACE_MEMORY | PCI_BASE_ADDRESS_MEM_TYPE_64,
>           &n->iomem);
> -    msix_init_exclusive_bar(pci_dev, n->num_queues, 4, NULL);
> +    msix_init_exclusive_bar(pci_dev, n->params.num_queues, 4, NULL);
>   
>       id->vid = cpu_to_le16(pci_get_word(pci_conf + PCI_VENDOR_ID));
>       id->ssvid = cpu_to_le16(pci_get_word(pci_conf + PCI_SUBSYSTEM_VENDOR_ID));
>       strpadcpy((char *)id->mn, sizeof(id->mn), "QEMU NVMe Ctrl", ' ');
>       strpadcpy((char *)id->fr, sizeof(id->fr), "1.0", ' ');
> -    strpadcpy((char *)id->sn, sizeof(id->sn), n->serial, ' ');
> +    strpadcpy((char *)id->sn, sizeof(id->sn), n->params.serial, ' ');
>       id->rab = 6;
>       id->ieee[0] = 0x00;
>       id->ieee[1] = 0x02;
> @@ -1386,7 +1388,7 @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp)
>       n->bar.vs = 0x00010200;
>       n->bar.intmc = n->bar.intms = 0;
>   
> -    if (n->cmb_size_mb) {
> +    if (n->params.cmb_size_mb) {
>   
>           NVME_CMBLOC_SET_BIR(n->bar.cmbloc, 2);
>           NVME_CMBLOC_SET_OFST(n->bar.cmbloc, 0);
> @@ -1397,7 +1399,7 @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp)
>           NVME_CMBSZ_SET_RDS(n->bar.cmbsz, 1);
>           NVME_CMBSZ_SET_WDS(n->bar.cmbsz, 1);
>           NVME_CMBSZ_SET_SZU(n->bar.cmbsz, 2); /* MBs */
> -        NVME_CMBSZ_SET_SZ(n->bar.cmbsz, n->cmb_size_mb);
> +        NVME_CMBSZ_SET_SZ(n->bar.cmbsz, n->params.cmb_size_mb);
>   
>           n->cmbloc = n->bar.cmbloc;
>           n->cmbsz = n->bar.cmbsz;
> @@ -1436,7 +1438,7 @@ static void nvme_exit(PCIDevice *pci_dev)
>       g_free(n->cq);
>       g_free(n->sq);
>   
> -    if (n->cmb_size_mb) {
> +    if (n->params.cmb_size_mb) {
>           g_free(n->cmbuf);
>       }
>       msix_uninit_exclusive_bar(pci_dev);
> @@ -1444,9 +1446,7 @@ static void nvme_exit(PCIDevice *pci_dev)
>   
>   static Property nvme_props[] = {
>       DEFINE_BLOCK_PROPERTIES(NvmeCtrl, conf),
> -    DEFINE_PROP_STRING("serial", NvmeCtrl, serial),
> -    DEFINE_PROP_UINT32("cmb_size_mb", NvmeCtrl, cmb_size_mb, 0),
> -    DEFINE_PROP_UINT32("num_queues", NvmeCtrl, num_queues, 64),
> +    DEFINE_NVME_PROPERTIES(NvmeCtrl, params),
>       DEFINE_PROP_END_OF_LIST(),
>   };
>   
> diff --git a/hw/block/nvme.h b/hw/block/nvme.h
> index 557194ee1954..9957c4a200e2 100644
> --- a/hw/block/nvme.h
> +++ b/hw/block/nvme.h
> @@ -1,7 +1,19 @@
>   #ifndef HW_NVME_H
>   #define HW_NVME_H
> +
>   #include "block/nvme.h"
>   
> +#define DEFINE_NVME_PROPERTIES(_state, _props) \
> +    DEFINE_PROP_STRING("serial", _state, _props.serial), \
> +    DEFINE_PROP_UINT32("cmb_size_mb", _state, _props.cmb_size_mb, 0), \
> +    DEFINE_PROP_UINT32("num_queues", _state, _props.num_queues, 64)
> +
> +typedef struct NvmeParams {
> +    char     *serial;
> +    uint32_t num_queues;
> +    uint32_t cmb_size_mb;
> +} NvmeParams;
> +
>   typedef struct NvmeAsyncEvent {
>       QSIMPLEQ_ENTRY(NvmeAsyncEvent) entry;
>       NvmeAerResult result;
> @@ -63,6 +75,7 @@ typedef struct NvmeCtrl {
>       MemoryRegion ctrl_mem;
>       NvmeBar      bar;
>       BlockConf    conf;
> +    NvmeParams   params;
>   
>       uint32_t    page_size;
>       uint16_t    page_bits;
> @@ -71,10 +84,8 @@ typedef struct NvmeCtrl {
>       uint16_t    sqe_size;
>       uint32_t    reg_size;
>       uint32_t    num_namespaces;
> -    uint32_t    num_queues;
>       uint32_t    max_q_ents;
>       uint64_t    ns_size;
> -    uint32_t    cmb_size_mb;
>       uint32_t    cmbsz;
>       uint32_t    cmbloc;
>       uint8_t     *cmbuf;
> @@ -82,7 +93,6 @@ typedef struct NvmeCtrl {
>       uint64_t    host_timestamp;                 /* Timestamp sent by the host */
>       uint64_t    timestamp_set_qemu_clock_ms;    /* QEMU clock time */
>   
> -    char            *serial;
>       NvmeNamespace   *namespaces;
>       NvmeSQueue      **sq;
>       NvmeCQueue      **cq;
> 

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v7 05/48] nvme: use constants in identify
  2020-04-15  5:50 ` [PATCH v7 05/48] nvme: use constants in identify Klaus Jensen
@ 2020-04-15  7:01   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 72+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-04-15  7:01 UTC (permalink / raw)
  To: Klaus Jensen, qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Keith Busch, Javier Gonzalez, Maxim Levitsky

On 4/15/20 7:50 AM, Klaus Jensen wrote:
> From: Klaus Jensen <k.jensen@samsung.com>
> 
> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
> ---
>   hw/block/nvme.c | 8 ++++----
>   1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index 088668f28bae..622103c42d0a 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -679,7 +679,7 @@ static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeIdentify *c)
>   
>   static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeIdentify *c)
>   {
> -    static const int data_len = 4 * KiB;
> +    static const int data_len = NVME_IDENTIFY_DATA_SIZE;
>       uint32_t min_nsid = le32_to_cpu(c->nsid);
>       uint64_t prp1 = le64_to_cpu(c->prp1);
>       uint64_t prp2 = le64_to_cpu(c->prp2);
> @@ -709,11 +709,11 @@ static uint16_t nvme_identify(NvmeCtrl *n, NvmeCmd *cmd)
>       NvmeIdentify *c = (NvmeIdentify *)cmd;
>   
>       switch (le32_to_cpu(c->cns)) {
> -    case 0x00:
> +    case NVME_ID_CNS_NS:
>           return nvme_identify_ns(n, c);
> -    case 0x01:
> +    case NVME_ID_CNS_CTRL:
>           return nvme_identify_ctrl(n, c);
> -    case 0x02:
> +    case NVME_ID_CNS_NS_ACTIVE_LIST:
>           return nvme_identify_nslist(n, c);
>       default:

Ah, NVME_ID_CNS_NS_DESCR_LIST is currently invalid.

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>

>           trace_nvme_dev_err_invalid_identify_cns(le32_to_cpu(c->cns));
> 



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v7 06/48] nvme: refactor nvme_addr_read
  2020-04-15  5:50 ` [PATCH v7 06/48] nvme: refactor nvme_addr_read Klaus Jensen
@ 2020-04-15  7:03   ` Philippe Mathieu-Daudé
  2020-04-15  7:46     ` Klaus Birkelund Jensen
  0 siblings, 1 reply; 72+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-04-15  7:03 UTC (permalink / raw)
  To: Klaus Jensen, qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Keith Busch, Javier Gonzalez, Maxim Levitsky

On 4/15/20 7:50 AM, Klaus Jensen wrote:
> From: Klaus Jensen <k.jensen@samsung.com>
> 
> Pull the controller memory buffer check to its own function. The check
> will be used on its own in later patches.
> 
> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> Acked-by: Keith Busch <kbusch@kernel.org>
> ---
>   hw/block/nvme.c | 16 ++++++++++++----
>   1 file changed, 12 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index 622103c42d0a..02d3dde90842 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -52,14 +52,22 @@
>   
>   static void nvme_process_sq(void *opaque);
>   
> +static inline bool nvme_addr_is_cmb(NvmeCtrl *n, hwaddr addr)

'inline' not really necessary here.

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>

> +{
> +    hwaddr low = n->ctrl_mem.addr;
> +    hwaddr hi  = n->ctrl_mem.addr + int128_get64(n->ctrl_mem.size);
> +
> +    return addr >= low && addr < hi;
> +}
> +
>   static void nvme_addr_read(NvmeCtrl *n, hwaddr addr, void *buf, int size)
>   {
> -    if (n->cmbsz && addr >= n->ctrl_mem.addr &&
> -                addr < (n->ctrl_mem.addr + int128_get64(n->ctrl_mem.size))) {
> +    if (n->cmbsz && nvme_addr_is_cmb(n, addr)) {
>           memcpy(buf, (void *)&n->cmbuf[addr - n->ctrl_mem.addr], size);
> -    } else {
> -        pci_dma_read(&n->parent_obj, addr, buf, size);
> +        return;
>       }
> +
> +    pci_dma_read(&n->parent_obj, addr, buf, size);
>   }
>   
>   static int nvme_check_sqid(NvmeCtrl *n, uint16_t sqid)
> 



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v7 10/48] nvme: remove redundant cmbloc/cmbsz members
  2020-04-15  5:51 ` [PATCH v7 10/48] nvme: remove redundant cmbloc/cmbsz members Klaus Jensen
@ 2020-04-15  7:10   ` Philippe Mathieu-Daudé
  2020-04-15  7:19     ` Klaus Birkelund Jensen
  0 siblings, 1 reply; 72+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-04-15  7:10 UTC (permalink / raw)
  To: Klaus Jensen, qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Keith Busch, Javier Gonzalez, Maxim Levitsky

On 4/15/20 7:51 AM, Klaus Jensen wrote:
> From: Klaus Jensen <k.jensen@samsung.com>
> 
> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> ---
>   hw/block/nvme.c | 7 ++-----
>   hw/block/nvme.h | 2 --
>   2 files changed, 2 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index f45909dad480..123539a5d0ae 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -63,7 +63,7 @@ static inline bool nvme_addr_is_cmb(NvmeCtrl *n, hwaddr addr)
>   
>   static void nvme_addr_read(NvmeCtrl *n, hwaddr addr, void *buf, int size)
>   {
> -    if (n->cmbsz && nvme_addr_is_cmb(n, addr)) {
> +    if (n->bar.cmbsz && nvme_addr_is_cmb(n, addr)) {
>           memcpy(buf, (void *)&n->cmbuf[addr - n->ctrl_mem.addr], size);
>           return;
>       }
> @@ -157,7 +157,7 @@ static uint16_t nvme_map_prp(QEMUSGList *qsg, QEMUIOVector *iov, uint64_t prp1,
>       if (unlikely(!prp1)) {
>           trace_nvme_dev_err_invalid_prp();
>           return NVME_INVALID_FIELD | NVME_DNR;
> -    } else if (n->cmbsz && prp1 >= n->ctrl_mem.addr &&
> +    } else if (n->bar.cmbsz && prp1 >= n->ctrl_mem.addr &&
>                  prp1 < n->ctrl_mem.addr + int128_get64(n->ctrl_mem.size)) {
>           qsg->nsg = 0;
>           qemu_iovec_init(iov, num_prps);
> @@ -1443,9 +1443,6 @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp)
>           NVME_CMBSZ_SET_SZU(n->bar.cmbsz, 2); /* MBs */
>           NVME_CMBSZ_SET_SZ(n->bar.cmbsz, n->params.cmb_size_mb);
>   
> -        n->cmbloc = n->bar.cmbloc;
> -        n->cmbsz = n->bar.cmbsz;
> -
>           n->cmbuf = g_malloc0(NVME_CMBSZ_GETSIZE(n->bar.cmbsz));
>           memory_region_init_io(&n->ctrl_mem, OBJECT(n), &nvme_cmb_ops, n,
>                                 "nvme-cmb", NVME_CMBSZ_GETSIZE(n->bar.cmbsz));
> diff --git a/hw/block/nvme.h b/hw/block/nvme.h
> index 98f5b9479244..ad1786953be9 100644
> --- a/hw/block/nvme.h
> +++ b/hw/block/nvme.h
> @@ -88,8 +88,6 @@ typedef struct NvmeCtrl {
>       uint32_t    num_namespaces;
>       uint32_t    max_q_ents;
>       uint64_t    ns_size;
> -    uint32_t    cmbsz;
> -    uint32_t    cmbloc;
>       uint8_t     *cmbuf;
>       uint64_t    irq_status;
>       uint64_t    host_timestamp;                 /* Timestamp sent by the host */
> 

"hw/block/nvme.h" should not pull in "block/nvme.h", both should include 
a common "hw/block/nvme_spec.h" (or a better name). That's not related to
this patch, though.

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v7 11/48] nvme: refactor device realization
  2020-04-15  5:51 ` [PATCH v7 11/48] nvme: refactor device realization Klaus Jensen
@ 2020-04-15  7:14   ` Philippe Mathieu-Daudé
  2020-04-15  7:25     ` Klaus Birkelund Jensen
  0 siblings, 1 reply; 72+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-04-15  7:14 UTC (permalink / raw)
  To: Klaus Jensen, qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Keith Busch, Javier Gonzalez, Maxim Levitsky

Hi Klaus,

On 4/15/20 7:51 AM, Klaus Jensen wrote:
> From: Klaus Jensen <k.jensen@samsung.com>
> 
> This patch splits up nvme_realize into multiple individual functions,
> each initializing a different subset of the device.
> 
> Signed-off-by: Klaus Jensen <klaus.jensen@cnexlabs.com>
> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> Acked-by: Keith Busch <kbusch@kernel.org>
> ---
>   hw/block/nvme.c | 178 +++++++++++++++++++++++++++++++-----------------
>   hw/block/nvme.h |  21 ++++++
>   2 files changed, 136 insertions(+), 63 deletions(-)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index 123539a5d0ae..d1c42ee4765c 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -44,6 +44,8 @@
>   #include "trace.h"
>   #include "nvme.h"
>   
> +#define NVME_CMB_BIR 2
> +
>   #define NVME_GUEST_ERR(trace, fmt, ...) \
>       do { \
>           (trace_##trace)(__VA_ARGS__); \
> @@ -1322,73 +1324,112 @@ static const MemoryRegionOps nvme_cmb_ops = {
>       },
>   };
>   
> -static void nvme_realize(PCIDevice *pci_dev, Error **errp)
> +static int nvme_check_constraints(NvmeCtrl *n, Error **errp)
>   {
> -    NvmeCtrl *n = NVME(pci_dev);
> -    NvmeIdCtrl *id = &n->id_ctrl;
> +    NvmeParams *params = &n->params;
>   
> -    int i;
> -    int64_t bs_size;
> -    uint8_t *pci_conf;
> -
> -    if (n->params.num_queues) {
> +    if (params->num_queues) {
>           warn_report("nvme: num_queues is deprecated; please use max_ioqpairs "
>                       "instead");
>   
> -        n->params.max_ioqpairs = n->params.num_queues - 1;
> +        params->max_ioqpairs = params->num_queues - 1;
>       }
>   
> -    if (!n->params.max_ioqpairs) {
> -        error_setg(errp, "max_ioqpairs can't be less than 1");
> +    if (params->max_ioqpairs < 1 ||
> +        params->max_ioqpairs > PCI_MSIX_FLAGS_QSIZE) {
> +        error_setg(errp, "nvme: max_ioqpairs must be between 1 and %d",
> +                   PCI_MSIX_FLAGS_QSIZE);
> +        return -1;
>       }
>   
>       if (!n->conf.blk) {
> -        error_setg(errp, "drive property not set");
> -        return;
> +        error_setg(errp, "nvme: block backend not configured");
> +        return -1;
>       }
>   
> -    bs_size = blk_getlength(n->conf.blk);
> -    if (bs_size < 0) {
> -        error_setg(errp, "could not get backing file size");
> -        return;
> +    if (!params->serial) {
> +        error_setg(errp, "nvme: serial not configured");
> +        return -1;
>       }
>   
> -    if (!n->params.serial) {
> -        error_setg(errp, "serial property not set");
> -        return;
> -    }
> +    return 0;
> +}
> +
> +static int nvme_init_blk(NvmeCtrl *n, Error **errp)
> +{
>       blkconf_blocksizes(&n->conf);
>       if (!blkconf_apply_backend_options(&n->conf, blk_is_read_only(n->conf.blk),
>                                          false, errp)) {
> -        return;
> +        return -1;
>       }
>   
> -    pci_conf = pci_dev->config;
> -    pci_conf[PCI_INTERRUPT_PIN] = 1;
> -    pci_config_set_prog_interface(pci_dev->config, 0x2);
> -    pci_config_set_class(pci_dev->config, PCI_CLASS_STORAGE_EXPRESS);
> -    pcie_endpoint_cap_init(pci_dev, 0x80);
> +    return 0;
> +}
>   
> +static void nvme_init_state(NvmeCtrl *n)
> +{
>       n->num_namespaces = 1;
>       n->reg_size = pow2ceil(0x1008 + 2 * (n->params.max_ioqpairs) * 4);
> -    n->ns_size = bs_size / (uint64_t)n->num_namespaces;
> -
>       n->namespaces = g_new0(NvmeNamespace, n->num_namespaces);
>       n->sq = g_new0(NvmeSQueue *, n->params.max_ioqpairs + 1);
>       n->cq = g_new0(NvmeCQueue *, n->params.max_ioqpairs + 1);
> +}
>   
> -    memory_region_init_io(&n->iomem, OBJECT(n), &nvme_mmio_ops, n,
> -                          "nvme", n->reg_size);
> -    pci_register_bar(pci_dev, 0,
> -        PCI_BASE_ADDRESS_SPACE_MEMORY | PCI_BASE_ADDRESS_MEM_TYPE_64,
> -        &n->iomem);
> +static void nvme_init_cmb(NvmeCtrl *n, PCIDevice *pci_dev)
> +{
> +    NVME_CMBLOC_SET_BIR(n->bar.cmbloc, NVME_CMB_BIR);
> +    NVME_CMBLOC_SET_OFST(n->bar.cmbloc, 0);
> +
> +    NVME_CMBSZ_SET_SQS(n->bar.cmbsz, 1);
> +    NVME_CMBSZ_SET_CQS(n->bar.cmbsz, 0);
> +    NVME_CMBSZ_SET_LISTS(n->bar.cmbsz, 0);
> +    NVME_CMBSZ_SET_RDS(n->bar.cmbsz, 1);
> +    NVME_CMBSZ_SET_WDS(n->bar.cmbsz, 1);
> +    NVME_CMBSZ_SET_SZU(n->bar.cmbsz, 2);
> +    NVME_CMBSZ_SET_SZ(n->bar.cmbsz, n->params.cmb_size_mb);
> +
> +    n->cmbuf = g_malloc0(NVME_CMBSZ_GETSIZE(n->bar.cmbsz));
> +    memory_region_init_io(&n->ctrl_mem, OBJECT(n), &nvme_cmb_ops, n,
> +                          "nvme-cmb", NVME_CMBSZ_GETSIZE(n->bar.cmbsz));
> +    pci_register_bar(pci_dev, NVME_CMBLOC_BIR(n->bar.cmbloc),
> +                     PCI_BASE_ADDRESS_SPACE_MEMORY |
> +                     PCI_BASE_ADDRESS_MEM_TYPE_64 |
> +                     PCI_BASE_ADDRESS_MEM_PREFETCH, &n->ctrl_mem);
> +}
> +
> +static void nvme_init_pci(NvmeCtrl *n, PCIDevice *pci_dev)
> +{
> +    uint8_t *pci_conf = pci_dev->config;
> +
> +    pci_conf[PCI_INTERRUPT_PIN] = 1;
> +    pci_config_set_prog_interface(pci_conf, 0x2);
> +    pci_config_set_vendor_id(pci_conf, PCI_VENDOR_ID_INTEL);
> +    pci_config_set_device_id(pci_conf, 0x5845);
> +    pci_config_set_class(pci_conf, PCI_CLASS_STORAGE_EXPRESS);
> +    pcie_endpoint_cap_init(pci_dev, 0x80);
> +
> +    memory_region_init_io(&n->iomem, OBJECT(n), &nvme_mmio_ops, n, "nvme",
> +                          n->reg_size);
> +    pci_register_bar(pci_dev, 0, PCI_BASE_ADDRESS_SPACE_MEMORY |
> +                     PCI_BASE_ADDRESS_MEM_TYPE_64, &n->iomem);
>       msix_init_exclusive_bar(pci_dev, n->params.max_ioqpairs + 1, 4, NULL);
>   
> +    if (n->params.cmb_size_mb) {
> +        nvme_init_cmb(n, pci_dev);
> +    }
> +}
> +
> +static void nvme_init_ctrl(NvmeCtrl *n)
> +{
> +    NvmeIdCtrl *id = &n->id_ctrl;
> +    NvmeParams *params = &n->params;
> +    uint8_t *pci_conf = n->parent_obj.config;
> +
>       id->vid = cpu_to_le16(pci_get_word(pci_conf + PCI_VENDOR_ID));
>       id->ssvid = cpu_to_le16(pci_get_word(pci_conf + PCI_SUBSYSTEM_VENDOR_ID));
>       strpadcpy((char *)id->mn, sizeof(id->mn), "QEMU NVMe Ctrl", ' ');
>       strpadcpy((char *)id->fr, sizeof(id->fr), "1.0", ' ');
> -    strpadcpy((char *)id->sn, sizeof(id->sn), n->params.serial, ' ');
> +    strpadcpy((char *)id->sn, sizeof(id->sn), params->serial, ' ');
>       id->rab = 6;
>       id->ieee[0] = 0x00;
>       id->ieee[1] = 0x02;
> @@ -1429,43 +1470,54 @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp)
>   
>       n->bar.vs = 0x00010200;
>       n->bar.intmc = n->bar.intms = 0;
> +}
>   
> -    if (n->params.cmb_size_mb) {
> +static int nvme_init_namespace(NvmeCtrl *n, NvmeNamespace *ns, Error **errp)
> +{
> +    int64_t bs_size;
> +    NvmeIdNs *id_ns = &ns->id_ns;
>   
> -        NVME_CMBLOC_SET_BIR(n->bar.cmbloc, 2);
> -        NVME_CMBLOC_SET_OFST(n->bar.cmbloc, 0);
> +    bs_size = blk_getlength(n->conf.blk);
> +    if (bs_size < 0) {
> +        error_setg_errno(errp, -bs_size, "blk_getlength");
> +        return -1;
> +    }
>   
> -        NVME_CMBSZ_SET_SQS(n->bar.cmbsz, 1);
> -        NVME_CMBSZ_SET_CQS(n->bar.cmbsz, 0);
> -        NVME_CMBSZ_SET_LISTS(n->bar.cmbsz, 0);
> -        NVME_CMBSZ_SET_RDS(n->bar.cmbsz, 1);
> -        NVME_CMBSZ_SET_WDS(n->bar.cmbsz, 1);
> -        NVME_CMBSZ_SET_SZU(n->bar.cmbsz, 2); /* MBs */
> -        NVME_CMBSZ_SET_SZ(n->bar.cmbsz, n->params.cmb_size_mb);
> +    id_ns->lbaf[0].ds = BDRV_SECTOR_BITS;
> +    n->ns_size = bs_size;
>   
> -        n->cmbuf = g_malloc0(NVME_CMBSZ_GETSIZE(n->bar.cmbsz));
> -        memory_region_init_io(&n->ctrl_mem, OBJECT(n), &nvme_cmb_ops, n,
> -                              "nvme-cmb", NVME_CMBSZ_GETSIZE(n->bar.cmbsz));
> -        pci_register_bar(pci_dev, NVME_CMBLOC_BIR(n->bar.cmbloc),
> -            PCI_BASE_ADDRESS_SPACE_MEMORY | PCI_BASE_ADDRESS_MEM_TYPE_64 |
> -            PCI_BASE_ADDRESS_MEM_PREFETCH, &n->ctrl_mem);
> +    id_ns->nsze = cpu_to_le64(nvme_ns_nlbas(n, ns));
>   
> +    /* no thin provisioning */
> +    id_ns->ncap = id_ns->nsze;
> +    id_ns->nuse = id_ns->ncap;
> +
> +    return 0;
> +}
> +
> +static void nvme_realize(PCIDevice *pci_dev, Error **errp)
> +{
> +    NvmeCtrl *n = NVME(pci_dev);
> +    int i;
> +
> +    if (nvme_check_constraints(n, errp)) {
> +        return;
> +    }
> +
> +    nvme_init_state(n);
> +
> +    if (nvme_init_blk(n, errp)) {
> +        return;
>       }
>   
>       for (i = 0; i < n->num_namespaces; i++) {
> -        NvmeNamespace *ns = &n->namespaces[i];
> -        NvmeIdNs *id_ns = &ns->id_ns;
> -        id_ns->nsfeat = 0;
> -        id_ns->nlbaf = 0;
> -        id_ns->flbas = 0;
> -        id_ns->mc = 0;
> -        id_ns->dpc = 0;
> -        id_ns->dps = 0;
> -        id_ns->lbaf[0].ds = BDRV_SECTOR_BITS;
> -        id_ns->ncap  = id_ns->nuse = id_ns->nsze =
> -            cpu_to_le64(n->ns_size >>
> -                id_ns->lbaf[NVME_ID_NS_FLBAS_INDEX(ns->id_ns.flbas)].ds);
> +        if (nvme_init_namespace(n, &n->namespaces[i], errp)) {
> +            return;
> +        }
>       }
> +
> +    nvme_init_pci(n, pci_dev);
> +    nvme_init_ctrl(n);

This patch is a pain to review... Could you split it? I'd use one 
trivial patch for each function extracted from nvme_realize().

>   }
>   
>   static void nvme_exit(PCIDevice *pci_dev)
> diff --git a/hw/block/nvme.h b/hw/block/nvme.h
> index ad1786953be9..b7c465560eea 100644
> --- a/hw/block/nvme.h
> +++ b/hw/block/nvme.h
> @@ -67,6 +67,22 @@ typedef struct NvmeNamespace {
>       NvmeIdNs        id_ns;
>   } NvmeNamespace;
>   
> +static inline NvmeLBAF *nvme_ns_lbaf(NvmeNamespace *ns)
> +{
> +    NvmeIdNs *id_ns = &ns->id_ns;
> +    return &id_ns->lbaf[NVME_ID_NS_FLBAS_INDEX(id_ns->flbas)];
> +}
> +
> +static inline uint8_t nvme_ns_lbads(NvmeNamespace *ns)
> +{
> +    return nvme_ns_lbaf(ns)->ds;
> +}
> +
> +static inline size_t nvme_ns_lbads_bytes(NvmeNamespace *ns)
> +{
> +    return 1 << nvme_ns_lbads(ns);
> +}
> +
>   #define TYPE_NVME "nvme"
>   #define NVME(obj) \
>           OBJECT_CHECK(NvmeCtrl, (obj), TYPE_NVME)
> @@ -101,4 +117,9 @@ typedef struct NvmeCtrl {
>       NvmeIdCtrl      id_ctrl;
>   } NvmeCtrl;
>   
> +static inline uint64_t nvme_ns_nlbas(NvmeCtrl *n, NvmeNamespace *ns)
> +{
> +    return n->ns_size >> nvme_ns_lbads(ns);
> +}
> +
>   #endif /* HW_NVME_H */
> 



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (47 preceding siblings ...)
  2020-04-15  5:51 ` [PATCH v7 48/48] nvme: make lba data size configurable Klaus Jensen
@ 2020-04-15  7:17 ` no-reply
  2020-04-15  8:02 ` no-reply
  49 siblings, 0 replies; 72+ messages in thread
From: no-reply @ 2020-04-15  7:17 UTC (permalink / raw)
  To: its
  Cc: kwolf, beata.michalska, qemu-block, k.jensen, qemu-devel, mreitz,
	kbusch, its, javier.gonz, mlevitsk

Patchew URL: https://patchew.org/QEMU/20200415055140.466900-1-its@irrelevant.dk/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Subject: [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces
Message-id: 20200415055140.466900-1-its@irrelevant.dk
Type: series

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

From https://github.com/patchew-project/qemu
 - [tag update]      patchew/20200415055140.466900-1-its@irrelevant.dk -> patchew/20200415055140.466900-1-its@irrelevant.dk
Switched to a new branch 'test'
075a008 nvme: make lba data size configurable
47f0589 nvme: change controller pci id
d5526c6 pci: allocate pci id for nvme
c945ecc nvme: support multiple namespaces
88ff4af nvme: refactor identify active namespace id list
05eb730 nvme: add support for sgl bit bucket descriptor
ccd3495 nvme: add support for scatter gather lists
525413d nvme: harden cmb access
c893248 nvme: handle dma errors
a1dcb81 pci: pass along the return value of dma_memory_rw
68e505d nvme: use preallocated qsg/iov in nvme_dma_prp
5778b7c nvme: add nvme_check_rw helper
bddb627 nvme: allow multiple aios per command
450162a nvme: remove NvmeCmd parameter
5b8060c nvme: refactor NvmeRequest
a4c50e4 nvme: be consistent about zeros vs zeroes
d60e3c2 nvme: add check for mdts
465e28d nvme: refactor request bounds checking
34b728f nvme: verify validity of prp lists in the cmb
c944bc7 nvme: add request mapping helper
29cf5ab nvme: pass request along for tracing
dfeb326 nvme: refactor dma read/write
be63c8d nvme: remove redundant has_sg member
31c0f74 nvme: replace dma_acct with blk_acct equivalent
c237462 nvme: add mapping helpers
5edac1b nvme: memset preallocated requests structures
a715b8e nvme: bump supported version to v1.3
1c57d98 nvme: provide the mandatory subnqn field
67a5dd5 nvme: enforce valid queue creation sequence
2e97095 nvme: support identify namespace descriptor list
057a489 nvme: add log specific field to trace events
65f95e3 nvme: make sure ncqr and nsqr is valid
3311059 nvme: additional tracing
3b35916 nvme: add missing mandatory features
9afe18e nvme: add support for the asynchronous event request command
642509b nvme: add support for the get log page command
d97b33d nvme: add temperature threshold feature
c6ca9fc nvme: refactor device realization
2786713 nvme: remove redundant cmbloc/cmbsz members
d9de9d6 nvme: add max_ioqpairs device parameter
5407887 nvme: fix pci doorbell size calculation
fc4bdcc nvme: add support for the abort command
a80e981 nvme: refactor nvme_addr_read
141dde9 nvme: use constants in identify
0d293d8 nvme: bump spec data structures to v1.3
d102a99 nvme: move device parameters to separate struct
21a7d88 nvme: remove superfluous breaks
e3e06ce nvme: rename trace events to nvme_dev

=== OUTPUT BEGIN ===
1/48 Checking commit e3e06ce2a1c6 (nvme: rename trace events to nvme_dev)
2/48 Checking commit 21a7d885cf07 (nvme: remove superfluous breaks)
3/48 Checking commit d102a99b902c (nvme: move device parameters to separate struct)
ERROR: Macros with complex values should be enclosed in parenthesis
#180: FILE: hw/block/nvme.h:6:
+#define DEFINE_NVME_PROPERTIES(_state, _props) \
+    DEFINE_PROP_STRING("serial", _state, _props.serial), \
+    DEFINE_PROP_UINT32("cmb_size_mb", _state, _props.cmb_size_mb, 0), \
+    DEFINE_PROP_UINT32("num_queues", _state, _props.num_queues, 64)

total: 1 errors, 0 warnings, 181 lines checked

Patch 3/48 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

4/48 Checking commit 0d293d85aad5 (nvme: bump spec data structures to v1.3)
5/48 Checking commit 141dde95470b (nvme: use constants in identify)
6/48 Checking commit a80e981b44f5 (nvme: refactor nvme_addr_read)
7/48 Checking commit fc4bdcc8d99b (nvme: add support for the abort command)
8/48 Checking commit 5407887ccd40 (nvme: fix pci doorbell size calculation)
9/48 Checking commit d9de9d6bea9b (nvme: add max_ioqpairs device parameter)
10/48 Checking commit 27867136760a (nvme: remove redundant cmbloc/cmbsz members)
11/48 Checking commit c6ca9fc771c8 (nvme: refactor device realization)
12/48 Checking commit d97b33d714fd (nvme: add temperature threshold feature)
13/48 Checking commit 642509b88fa2 (nvme: add support for the get log page command)
14/48 Checking commit 9afe18e6c92e (nvme: add support for the asynchronous event request command)
15/48 Checking commit 3b359168c402 (nvme: add missing mandatory features)
16/48 Checking commit 3311059a2d10 (nvme: additional tracing)
17/48 Checking commit 65f95e32eac2 (nvme: make sure ncqr and nsqr is valid)
18/48 Checking commit 057a489e82f5 (nvme: add log specific field to trace events)
19/48 Checking commit 2e970952379b (nvme: support identify namespace descriptor list)
20/48 Checking commit 67a5dd510a3c (nvme: enforce valid queue creation sequence)
21/48 Checking commit 1c57d98f9cd3 (nvme: provide the mandatory subnqn field)
22/48 Checking commit a715b8ebca1b (nvme: bump supported version to v1.3)
23/48 Checking commit 5edac1bbd208 (nvme: memset preallocated requests structures)
24/48 Checking commit c237462ada4c (nvme: add mapping helpers)
25/48 Checking commit 31c0f741a2d9 (nvme: replace dma_acct with blk_acct equivalent)
26/48 Checking commit be63c8de71b9 (nvme: remove redundant has_sg member)
27/48 Checking commit dfeb3263df77 (nvme: refactor dma read/write)
28/48 Checking commit 29cf5ab19b8e (nvme: pass request along for tracing)
29/48 Checking commit c944bc709579 (nvme: add request mapping helper)
30/48 Checking commit 34b728fd35d6 (nvme: verify validity of prp lists in the cmb)
31/48 Checking commit 465e28dd32a2 (nvme: refactor request bounds checking)
32/48 Checking commit d60e3c234634 (nvme: add check for mdts)
33/48 Checking commit a4c50e475f1d (nvme: be consistent about zeros vs zeroes)
34/48 Checking commit 5b8060cc19a7 (nvme: refactor NvmeRequest)
35/48 Checking commit 450162a9af9e (nvme: remove NvmeCmd parameter)
36/48 Checking commit bddb6275ddec (nvme: allow multiple aios per command)
37/48 Checking commit 5778b7c92f9a (nvme: add nvme_check_rw helper)
38/48 Checking commit 68e505d6710c (nvme: use preallocated qsg/iov in nvme_dma_prp)
39/48 Checking commit a1dcb81b1fc4 (pci: pass along the return value of dma_memory_rw)
40/48 Checking commit c8932482ab54 (nvme: handle dma errors)
41/48 Checking commit 525413d1e0af (nvme: harden cmb access)
42/48 Checking commit ccd349591856 (nvme: add support for scatter gather lists)
43/48 Checking commit 05eb73006c5c (nvme: add support for sgl bit bucket descriptor)
44/48 Checking commit 88ff4af65551 (nvme: refactor identify active namespace id list)
45/48 Checking commit c945ecc719bd (nvme: support multiple namespaces)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#43: 
new file mode 100644

ERROR: Macros with complex values should be enclosed in parenthesis
#218: FILE: hw/block/nvme-ns.h:8:
+#define DEFINE_NVME_NS_PROPERTIES(_state, _props) \
+    DEFINE_PROP_DRIVE("drive", _state, blk), \
+    DEFINE_PROP_UINT32("nsid", _state, _props.nsid, 0)

total: 1 errors, 1 warnings, 841 lines checked

Patch 45/48 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

46/48 Checking commit d5526c6f3499 (pci: allocate pci id for nvme)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#32: 
new file mode 100644

total: 0 errors, 1 warnings, 46 lines checked

Patch 46/48 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
47/48 Checking commit 47f0589e3647 (nvme: change controller pci id)
48/48 Checking commit 075a008ea8e4 (nvme: make lba data size configurable)
=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/20200415055140.466900-1-its@irrelevant.dk/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com


* Re: [PATCH v7 12/48] nvme: add temperature threshold feature
  2020-04-15  5:51 ` [PATCH v7 12/48] nvme: add temperature threshold feature Klaus Jensen
@ 2020-04-15  7:19   ` Philippe Mathieu-Daudé
  2020-04-15  7:24     ` Klaus Birkelund Jensen
  0 siblings, 1 reply; 72+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-04-15  7:19 UTC (permalink / raw)
  To: Klaus Jensen, qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Keith Busch, Javier Gonzalez, Maxim Levitsky

On 4/15/20 7:51 AM, Klaus Jensen wrote:
> From: Klaus Jensen <k.jensen@samsung.com>
> 
> It might seem wierd to implement this feature for an emulated device,

'weird'

> but it is mandatory to support and the feature is useful for testing
> asynchronous event request support, which will be added in a later
> patch.

Which patch? I can't find how you set the temperature in this series.

> 
> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> Acked-by: Keith Busch <kbusch@kernel.org>
> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
> ---
>   hw/block/nvme.c      | 48 ++++++++++++++++++++++++++++++++++++++++++++
>   hw/block/nvme.h      |  1 +
>   include/block/nvme.h |  8 +++++++-
>   3 files changed, 56 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index d1c42ee4765c..e777cc9075c1 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -45,6 +45,9 @@
>   #include "nvme.h"
>   
>   #define NVME_CMB_BIR 2
> +#define NVME_TEMPERATURE 0x143
> +#define NVME_TEMPERATURE_WARNING 0x157
> +#define NVME_TEMPERATURE_CRITICAL 0x175
>   
>   #define NVME_GUEST_ERR(trace, fmt, ...) \
>       do { \
> @@ -798,9 +801,31 @@ static uint16_t nvme_get_feature_timestamp(NvmeCtrl *n, NvmeCmd *cmd)
>   static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>   {
>       uint32_t dw10 = le32_to_cpu(cmd->cdw10);
> +    uint32_t dw11 = le32_to_cpu(cmd->cdw11);
>       uint32_t result;
>   
>       switch (dw10) {
> +    case NVME_TEMPERATURE_THRESHOLD:
> +        result = 0;
> +
> +        /*
> +         * The controller only implements the Composite Temperature sensor, so
> +         * return 0 for all other sensors.
> +         */
> +        if (NVME_TEMP_TMPSEL(dw11) != NVME_TEMP_TMPSEL_COMPOSITE) {
> +            break;
> +        }
> +
> +        switch (NVME_TEMP_THSEL(dw11)) {
> +        case NVME_TEMP_THSEL_OVER:
> +            result = cpu_to_le16(n->features.temp_thresh_hi);
> +            break;
> +        case NVME_TEMP_THSEL_UNDER:
> +            result = cpu_to_le16(n->features.temp_thresh_low);
> +            break;
> +        }
> +
> +        break;
>       case NVME_VOLATILE_WRITE_CACHE:
>           result = blk_enable_write_cache(n->conf.blk);
>           trace_nvme_dev_getfeat_vwcache(result ? "enabled" : "disabled");
> @@ -845,6 +870,23 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>       uint32_t dw11 = le32_to_cpu(cmd->cdw11);
>   
>       switch (dw10) {
> +    case NVME_TEMPERATURE_THRESHOLD:
> +        if (NVME_TEMP_TMPSEL(dw11) != NVME_TEMP_TMPSEL_COMPOSITE) {
> +            break;
> +        }
> +
> +        switch (NVME_TEMP_THSEL(dw11)) {
> +        case NVME_TEMP_THSEL_OVER:
> +            n->features.temp_thresh_hi = NVME_TEMP_TMPTH(dw11);
> +            break;
> +        case NVME_TEMP_THSEL_UNDER:
> +            n->features.temp_thresh_low = NVME_TEMP_TMPTH(dw11);
> +            break;
> +        default:
> +            return NVME_INVALID_FIELD | NVME_DNR;
> +        }
> +
> +        break;
>       case NVME_VOLATILE_WRITE_CACHE:
>           blk_set_enable_write_cache(n->conf.blk, dw11 & 1);
>           break;
> @@ -1373,6 +1415,7 @@ static void nvme_init_state(NvmeCtrl *n)
>       n->namespaces = g_new0(NvmeNamespace, n->num_namespaces);
>       n->sq = g_new0(NvmeSQueue *, n->params.max_ioqpairs + 1);
>       n->cq = g_new0(NvmeCQueue *, n->params.max_ioqpairs + 1);
> +    n->features.temp_thresh_hi = NVME_TEMPERATURE_WARNING;
>   }
>   
>   static void nvme_init_cmb(NvmeCtrl *n, PCIDevice *pci_dev)
> @@ -1450,6 +1493,11 @@ static void nvme_init_ctrl(NvmeCtrl *n)
>       id->acl = 3;
>       id->frmw = 7 << 1;
>       id->lpa = 1 << 0;
> +
> +    /* recommended default value (~70 C) */
> +    id->wctemp = cpu_to_le16(NVME_TEMPERATURE_WARNING);
> +    id->cctemp = cpu_to_le16(NVME_TEMPERATURE_CRITICAL);
> +
>       id->sqes = (0x6 << 4) | 0x6;
>       id->cqes = (0x4 << 4) | 0x4;
>       id->nn = cpu_to_le32(n->num_namespaces);
> diff --git a/hw/block/nvme.h b/hw/block/nvme.h
> index b7c465560eea..807c4ad19dcc 100644
> --- a/hw/block/nvme.h
> +++ b/hw/block/nvme.h
> @@ -115,6 +115,7 @@ typedef struct NvmeCtrl {
>       NvmeSQueue      admin_sq;
>       NvmeCQueue      admin_cq;
>       NvmeIdCtrl      id_ctrl;
> +    NvmeFeatureVal  features;
>   } NvmeCtrl;
>   
>   static inline uint64_t nvme_ns_nlbas(NvmeCtrl *n, NvmeNamespace *ns)
> diff --git a/include/block/nvme.h b/include/block/nvme.h
> index b30744068d46..a0519814ecec 100644
> --- a/include/block/nvme.h
> +++ b/include/block/nvme.h
> @@ -688,7 +688,13 @@ enum NvmeIdCtrlOncs {
>   typedef struct NvmeFeatureVal {
>       uint32_t    arbitration;
>       uint32_t    power_mgmt;
> -    uint32_t    temp_thresh;
> +    union {
> +        struct {
> +            uint16_t temp_thresh_hi;
> +            uint16_t temp_thresh_low;
> +        };
> +        uint32_t temp_thresh;
> +    };
>       uint32_t    err_rec;
>       uint32_t    volatile_wc;
>       uint32_t    num_queues;
> 




* Re: [PATCH v7 10/48] nvme: remove redundant cmbloc/cmbsz members
  2020-04-15  7:10   ` Philippe Mathieu-Daudé
@ 2020-04-15  7:19     ` Klaus Birkelund Jensen
  2020-04-15  7:48       ` Philippe Mathieu-Daudé
  0 siblings, 1 reply; 72+ messages in thread
From: Klaus Birkelund Jensen @ 2020-04-15  7:19 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: Kevin Wolf, Beata Michalska, qemu-block, Klaus Jensen,
	qemu-devel, Max Reitz, Keith Busch, Javier Gonzalez,
	Maxim Levitsky

On Apr 15 09:10, Philippe Mathieu-Daudé wrote:
> 
> "hw/block/nvme.h" should not pull in "block/nvme.h", both should include a
> common "hw/block/nvme_spec.h" (or better named). Not related to this patch
> although.
> 

Hmm. It does pull in "include/block/nvme.h", which is basically the
"nvme_spec.h" you are talking about. That file holds all the spec-related
structs that are shared between the VFIO-based nvme driver
(block/nvme.c) and the emulated device (hw/block/nvme.c).

Isn't that what is intended?
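
To spell out the layout (a simplified sketch; the comments just paraphrase
the roles described above):

    /* hw/block/nvme.h - state of the emulated controller */
    #include "block/nvme.h"    /* shared, spec-defined structs and constants */

    /* block/nvme.c - the VFIO-based userspace NVMe driver */
    #include "block/nvme.h"    /* pulls in the same shared definitions */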



* Re: [PATCH v7 12/48] nvme: add temperature threshold feature
  2020-04-15  7:19   ` Philippe Mathieu-Daudé
@ 2020-04-15  7:24     ` Klaus Birkelund Jensen
  2020-04-15  7:28       ` Klaus Birkelund Jensen
  0 siblings, 1 reply; 72+ messages in thread
From: Klaus Birkelund Jensen @ 2020-04-15  7:24 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: Kevin Wolf, Beata Michalska, qemu-block, Klaus Jensen,
	qemu-devel, Max Reitz, Keith Busch, Javier Gonzalez,
	Maxim Levitsky

On Apr 15 09:19, Philippe Mathieu-Daudé wrote:
> On 4/15/20 7:51 AM, Klaus Jensen wrote:
> > From: Klaus Jensen <k.jensen@samsung.com>
> > 
> > It might seem wierd to implement this feature for an emulated device,
> 
> 'weird'

Thanks, fixed :)

> 
> > but it is mandatory to support and the feature is useful for testing
> > asynchronous event request support, which will be added in a later
> > patch.
> 
> Which patch? I can't find how you set the temperature in this series.
> 

The temperature cannot be changed, but the thresholds can with the Set
Features command (and that can then trigger AERs). That is added in
"nvme: add temperature threshold feature" and "nvme: add support for the
asynchronous event request command" respectively.

There is a test in SPDK that does this.
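
To make the mechanism concrete, here is a minimal host-side sketch of how
cdw11 for the Temperature Threshold feature (FID 0x04) can be composed,
assuming the NVMe 1.3 field layout (TMPTH in bits 15:0, TMPSEL in bits
19:16, THSEL in bits 21:20); the helper name is purely illustrative:

    #include <stdint.h>

    static uint32_t temp_thresh_cdw11(uint16_t tmpth, uint8_t tmpsel,
                                      uint8_t thsel)
    {
        /* TMPTH (in Kelvin) | TMPSEL (sensor select) | THSEL (over/under) */
        return (uint32_t)tmpth |
               ((uint32_t)(tmpsel & 0xf) << 16) |
               ((uint32_t)(thsel & 0x3) << 20);
    }

Setting the composite (TMPSEL = 0) over-temperature (THSEL = 0) threshold
below the device's reported temperature of 0x143, e.g.
temp_thresh_cdw11(0x142, 0, 0), is what lets a test trip the threshold and,
once the AER patch is applied, observe the resulting asynchronous event.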




* Re: [PATCH v7 11/48] nvme: refactor device realization
  2020-04-15  7:14   ` Philippe Mathieu-Daudé
@ 2020-04-15  7:25     ` Klaus Birkelund Jensen
  2020-04-15  7:55       ` Philippe Mathieu-Daudé
  0 siblings, 1 reply; 72+ messages in thread
From: Klaus Birkelund Jensen @ 2020-04-15  7:25 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: Kevin Wolf, Beata Michalska, qemu-block, Klaus Jensen,
	qemu-devel, Max Reitz, Keith Busch, Javier Gonzalez,
	Maxim Levitsky

On Apr 15 09:14, Philippe Mathieu-Daudé wrote:
> Hi Klaus,
> 
> This patch is a pain to review... Could you split it? I'd use one trivial
> patch for each function extracted from nvme_realize().
> 

Understood, I will split it up!



* Re: [PATCH v7 40/48] nvme: handle dma errors
  2020-04-15  5:51 ` [PATCH v7 40/48] nvme: handle dma errors Klaus Jensen
@ 2020-04-15  7:26   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 72+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-04-15  7:26 UTC (permalink / raw)
  To: Klaus Jensen, qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Keith Busch, Javier Gonzalez, Maxim Levitsky

On 4/15/20 7:51 AM, Klaus Jensen wrote:
> From: Klaus Jensen <k.jensen@samsung.com>
> 
> Handling DMA errors gracefully is required for the device to pass the
> block/011 test ("disable PCI device while doing I/O") in the blktests
> suite.
> 
> With this patch the device passes the test by retrying "critical"
> transfers (posting of completion entries and processing of submission
> queue entries).
> 
> If DMA errors occur at any other point in the execution of the command
> (say, while mapping the PRPs), the command is aborted with a Data
> Transfer Error status code.
> 
> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> Acked-by: Keith Busch <kbusch@kernel.org>
> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
> ---
>   hw/block/nvme.c       | 45 ++++++++++++++++++++++++++++++++-----------
>   hw/block/trace-events |  2 ++
>   include/block/nvme.h  |  2 +-
>   3 files changed, 37 insertions(+), 12 deletions(-)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index eb15a0bd3cf9..6dcd9c4b4cd0 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -74,14 +74,14 @@ static inline bool nvme_addr_is_cmb(NvmeCtrl *n, hwaddr addr)
>       return addr >= low && addr < hi;
>   }
>   
> -static void nvme_addr_read(NvmeCtrl *n, hwaddr addr, void *buf, int size)
> +static int nvme_addr_read(NvmeCtrl *n, hwaddr addr, void *buf, int size)
>   {
>       if (n->bar.cmbsz && nvme_addr_is_cmb(n, addr)) {
>           memcpy(buf, nvme_addr_to_cmb(n, addr), size);
> -        return;
> +        return 0;
>       }
>   
> -    pci_dma_read(&n->parent_obj, addr, buf, size);
> +    return pci_dma_read(&n->parent_obj, addr, buf, size);
>   }
>   
>   static int nvme_check_sqid(NvmeCtrl *n, uint16_t sqid)
> @@ -185,7 +185,7 @@ static uint16_t nvme_map_addr_cmb(NvmeCtrl *n, QEMUIOVector *iov, hwaddr addr,
>       }
>   
>       if (!nvme_addr_is_cmb(n, addr) || !nvme_addr_is_cmb(n, addr + len - 1)) {
> -        return NVME_DATA_TRAS_ERROR;
> +        return NVME_DATA_TRANSFER_ERROR;
>       }
>   
>       qemu_iovec_add(iov, nvme_addr_to_cmb(n, addr), len);
> @@ -238,6 +238,7 @@ static uint16_t nvme_map_prp(NvmeCtrl *n, QEMUSGList *qsg, QEMUIOVector *iov,
>       int num_prps = (len >> n->page_bits) + 1;
>       uint16_t status;
>       bool prp_list_in_cmb = false;
> +    int ret;
>   
>       trace_nvme_dev_map_prp(nvme_cid(req), trans_len, len, prp1, prp2,
>                              num_prps);
> @@ -277,7 +278,12 @@ static uint16_t nvme_map_prp(NvmeCtrl *n, QEMUSGList *qsg, QEMUIOVector *iov,
>   
>               nents = (len + n->page_size - 1) >> n->page_bits;
>               prp_trans = MIN(n->max_prp_ents, nents) * sizeof(uint64_t);
> -            nvme_addr_read(n, prp2, (void *)prp_list, prp_trans);
> +            ret = nvme_addr_read(n, prp2, (void *)prp_list, prp_trans);
> +            if (ret) {
> +                trace_nvme_dev_err_addr_read(prp2);
> +                status = NVME_DATA_TRANSFER_ERROR;
> +                goto unmap;
> +            }
>               while (len != 0) {
>                   uint64_t prp_ent = le64_to_cpu(prp_list[i]);
>   
> @@ -296,8 +302,13 @@ static uint16_t nvme_map_prp(NvmeCtrl *n, QEMUSGList *qsg, QEMUIOVector *iov,
>                       i = 0;
>                       nents = (len + n->page_size - 1) >> n->page_bits;
>                       prp_trans = MIN(n->max_prp_ents, nents) * sizeof(uint64_t);
> -                    nvme_addr_read(n, prp_ent, (void *)prp_list,
> -                        prp_trans);
> +                    ret = nvme_addr_read(n, prp_ent, (void *)prp_list,
> +                                         prp_trans);
> +                    if (ret) {
> +                        trace_nvme_dev_err_addr_read(prp_ent);
> +                        status = NVME_DATA_TRANSFER_ERROR;
> +                        goto unmap;
> +                    }
>                       prp_ent = le64_to_cpu(prp_list[i]);
>                   }
>   
> @@ -502,6 +513,7 @@ static void nvme_post_cqes(void *opaque)
>       NvmeCQueue *cq = opaque;
>       NvmeCtrl *n = cq->ctrl;
>       NvmeRequest *req, *next;
> +    int ret;
>   
>       QTAILQ_FOREACH_SAFE(req, &cq->req_list, entry, next) {
>           NvmeSQueue *sq;
> @@ -511,15 +523,21 @@ static void nvme_post_cqes(void *opaque)
>               break;
>           }
>   
> -        QTAILQ_REMOVE(&cq->req_list, req, entry);
>           sq = req->sq;
>           req->cqe.status = cpu_to_le16((req->status << 1) | cq->phase);
>           req->cqe.sq_id = cpu_to_le16(sq->sqid);
>           req->cqe.sq_head = cpu_to_le16(sq->head);
>           addr = cq->dma_addr + cq->tail * n->cqe_size;
> +        ret = pci_dma_write(&n->parent_obj, addr, (void *)&req->cqe,
> +                            sizeof(req->cqe));
> +        if (ret) {
> +            trace_nvme_dev_err_addr_write(addr);
> +            timer_mod(cq->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) +
> +                      500 * SCALE_MS);

Could you refactor the timer_mod() calls into a better-named helper?
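
For illustration, one possible shape for such a helper (the name here is
just a placeholder, and it assumes the same includes nvme.c already uses):

    static void nvme_retry_later(QEMUTimer *timer)
    {
        /* re-arm the queue timer so the failed DMA access is retried */
        timer_mod(timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) +
                  500 * SCALE_MS);
    }

with the two call sites then becoming nvme_retry_later(cq->timer) and
nvme_retry_later(sq->timer).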

> +            break;
> +        }
> +        QTAILQ_REMOVE(&cq->req_list, req, entry);
>           nvme_inc_cq_tail(cq);
> -        pci_dma_write(&n->parent_obj, addr, (void *)&req->cqe,
> -            sizeof(req->cqe));
>           nvme_req_clear(req);
>           QTAILQ_INSERT_TAIL(&sq->req_list, req, entry);
>       }
> @@ -1664,7 +1682,12 @@ static void nvme_process_sq(void *opaque)
>   
>       while (!(nvme_sq_empty(sq) || QTAILQ_EMPTY(&sq->req_list))) {
>           addr = sq->dma_addr + sq->head * n->sqe_size;
> -        nvme_addr_read(n, addr, (void *)&cmd, sizeof(cmd));
> +        if (nvme_addr_read(n, addr, (void *)&cmd, sizeof(cmd))) {
> +            trace_nvme_dev_err_addr_read(addr);
> +            timer_mod(sq->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) +
> +                      500 * SCALE_MS);

Ditto.

> +            break;
> +        }
>           nvme_inc_sq_head(sq);
>   
>           req = QTAILQ_FIRST(&sq->req_list);
> diff --git a/hw/block/trace-events b/hw/block/trace-events
> index 7c277a2999c0..75bde5e676a5 100644
> --- a/hw/block/trace-events
> +++ b/hw/block/trace-events
> @@ -87,6 +87,8 @@ nvme_dev_mmio_doorbell_sq(uint16_t sqid, uint16_t new_tail) "cqid %"PRIu16" new_
>   # nvme traces for error conditions
>   nvme_dev_err_mdts(uint16_t cid, size_t mdts, size_t len) "cid %"PRIu16" mdts %"PRIu64" len %"PRIu64""
>   nvme_dev_err_aio(uint16_t cid, void *aio, const char *blkname, uint64_t offset, const char *opc, void *req, uint16_t status) "cid %"PRIu16" aio %p blk \"%s\" offset %"PRIu64" opc \"%s\" req %p status 0x%"PRIx16""
> +nvme_dev_err_addr_read(uint64_t addr) "addr 0x%"PRIx64""
> +nvme_dev_err_addr_write(uint64_t addr) "addr 0x%"PRIx64""
>   nvme_dev_err_invalid_dma(void) "PRP/SGL is too small for transfer size"
>   nvme_dev_err_invalid_prplist_ent(uint64_t prplist) "PRP list entry is null or not page aligned: 0x%"PRIx64""
>   nvme_dev_err_invalid_prp2_align(uint64_t prp2) "PRP2 is not page aligned: 0x%"PRIx64""
> diff --git a/include/block/nvme.h b/include/block/nvme.h
> index c4c669e32fc4..03bee32c27c4 100644
> --- a/include/block/nvme.h
> +++ b/include/block/nvme.h
> @@ -457,7 +457,7 @@ enum NvmeStatusCodes {
>       NVME_INVALID_OPCODE         = 0x0001,
>       NVME_INVALID_FIELD          = 0x0002,
>       NVME_CID_CONFLICT           = 0x0003,
> -    NVME_DATA_TRAS_ERROR        = 0x0004,
> +    NVME_DATA_TRANSFER_ERROR    = 0x0004,
>       NVME_POWER_LOSS_ABORT       = 0x0005,
>       NVME_INTERNAL_DEV_ERROR     = 0x0006,
>       NVME_CMD_ABORT_REQ          = 0x0007,
> 




* Re: [PATCH v7 12/48] nvme: add temperature threshold feature
  2020-04-15  7:24     ` Klaus Birkelund Jensen
@ 2020-04-15  7:28       ` Klaus Birkelund Jensen
  2020-04-15  7:45         ` Philippe Mathieu-Daudé
  0 siblings, 1 reply; 72+ messages in thread
From: Klaus Birkelund Jensen @ 2020-04-15  7:28 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: Kevin Wolf, Beata Michalska, qemu-block, Klaus Jensen,
	qemu-devel, Max Reitz, Keith Busch, Javier Gonzalez,
	Maxim Levitsky

On Apr 15 09:24, Klaus Birkelund Jensen wrote:
> On Apr 15 09:19, Philippe Mathieu-Daudé wrote:
> > On 4/15/20 7:51 AM, Klaus Jensen wrote:
> > > From: Klaus Jensen <k.jensen@samsung.com>
> > > 
> > > It might seem wierd to implement this feature for an emulated device,
> > 
> > 'weird'
> 
> Thanks, fixed :)
> 
> > 
> > > but it is mandatory to support and the feature is useful for testing
> > > asynchronous event request support, which will be added in a later
> > > patch.
> > 
> > Which patch? I can't find how you set the temperature in this series.
> > 
> 
> The temperature cannot be changed, but the thresholds can with the Set
> Features command (and that can then trigger AERs). That is added in
> "nvme: add temperature threshold feature" and "nvme: add support for the
> asynchronous event request command" respectively.
> 
> There is a test in SPDK that does this.
> 

Oh, I think I misunderstood you.

No, setting the temperature was moved to the "nvme: add support for the
get log page command" patch since that is the patch that actually uses
it. This was on request by Maxim in an earlier review.



* Re: [PATCH v7 45/48] nvme: support multiple namespaces
  2020-04-15  5:51 ` [PATCH v7 45/48] nvme: support multiple namespaces Klaus Jensen
@ 2020-04-15  7:38   ` Philippe Mathieu-Daudé
  2020-04-15  8:02     ` Klaus Birkelund Jensen
  0 siblings, 1 reply; 72+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-04-15  7:38 UTC (permalink / raw)
  To: Klaus Jensen, qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Keith Busch, Javier Gonzalez, Maxim Levitsky

On 4/15/20 7:51 AM, Klaus Jensen wrote:
> From: Klaus Jensen <k.jensen@samsung.com>
> 
> This adds support for multiple namespaces by introducing a new 'nvme-ns'
> device model. The nvme device creates a bus named from the device name
> ('id'). The nvme-ns devices then connect to this and registers
> themselves with the nvme device.
> 
> This changes how an nvme device is created. Example with two namespaces:
> 
>    -drive file=nvme0n1.img,if=none,id=disk1
>    -drive file=nvme0n2.img,if=none,id=disk2
>    -device nvme,serial=deadbeef,id=nvme0
>    -device nvme-ns,drive=disk1,bus=nvme0,nsid=1
>    -device nvme-ns,drive=disk2,bus=nvme0,nsid=2
> 
> The drive property is kept on the nvme device to keep the change
> backward compatible, but the property is now optional. Specifying a
> drive for the nvme device will always create the namespace with nsid 1.
> 
> Signed-off-by: Klaus Jensen <klaus.jensen@cnexlabs.com>
> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> Reviewed-by: Keith Busch <kbusch@kernel.org>
> ---
>   hw/block/Makefile.objs |   2 +-
>   hw/block/nvme-ns.c     | 157 +++++++++++++++++++++++++++
>   hw/block/nvme-ns.h     |  60 +++++++++++
>   hw/block/nvme.c        | 233 +++++++++++++++++++++++++++--------------
>   hw/block/nvme.h        |  47 ++++-----
>   hw/block/trace-events  |   8 +-
>   6 files changed, 396 insertions(+), 111 deletions(-)
>   create mode 100644 hw/block/nvme-ns.c
>   create mode 100644 hw/block/nvme-ns.h
> 
> diff --git a/hw/block/Makefile.objs b/hw/block/Makefile.objs
> index 4b4a2b338dc4..d9141d6a4b9b 100644
> --- a/hw/block/Makefile.objs
> +++ b/hw/block/Makefile.objs
> @@ -7,7 +7,7 @@ common-obj-$(CONFIG_PFLASH_CFI02) += pflash_cfi02.o
>   common-obj-$(CONFIG_XEN) += xen-block.o
>   common-obj-$(CONFIG_ECC) += ecc.o
>   common-obj-$(CONFIG_ONENAND) += onenand.o
> -common-obj-$(CONFIG_NVME_PCI) += nvme.o
> +common-obj-$(CONFIG_NVME_PCI) += nvme.o nvme-ns.o
>   common-obj-$(CONFIG_SWIM) += swim.o
>   
>   common-obj-$(CONFIG_SH4) += tc58128.o
> diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c
> new file mode 100644
> index 000000000000..bd64d4a94632
> --- /dev/null
> +++ b/hw/block/nvme-ns.c
> @@ -0,0 +1,157 @@

Missing copyright + license.

> +#include "qemu/osdep.h"
> +#include "qemu/units.h"
> +#include "qemu/cutils.h"
> +#include "qemu/log.h"
> +#include "hw/block/block.h"
> +#include "hw/pci/pci.h"
> +#include "sysemu/sysemu.h"
> +#include "sysemu/block-backend.h"
> +#include "qapi/error.h"
> +
> +#include "hw/qdev-properties.h"
> +#include "hw/qdev-core.h"
> +
> +#include "nvme.h"
> +#include "nvme-ns.h"
> +
> +static int nvme_ns_init(NvmeNamespace *ns)
> +{
> +    NvmeIdNs *id_ns = &ns->id_ns;
> +
> +    id_ns->lbaf[0].ds = BDRV_SECTOR_BITS;
> +    id_ns->nsze = cpu_to_le64(nvme_ns_nlbas(ns));
> +
> +    /* no thin provisioning */
> +    id_ns->ncap = id_ns->nsze;
> +    id_ns->nuse = id_ns->ncap;
> +
> +    return 0;
> +}
> +
> +static int nvme_ns_init_blk(NvmeCtrl *n, NvmeNamespace *ns, NvmeIdCtrl *id,
> +                            Error **errp)
> +{
> +    uint64_t perm, shared_perm;
> +
> +    Error *local_err = NULL;
> +    int ret;
> +
> +    perm = BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE;
> +    shared_perm = BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED |
> +        BLK_PERM_GRAPH_MOD;
> +
> +    ret = blk_set_perm(ns->blk, perm, shared_perm, &local_err);
> +    if (ret) {
> +        error_propagate_prepend(errp, local_err,
> +                                "could not set block permissions: ");
> +        return ret;
> +    }
> +
> +    ns->size = blk_getlength(ns->blk);
> +    if (ns->size < 0) {
> +        error_setg_errno(errp, -ns->size, "could not get blockdev size");
> +        return -1;
> +    }
> +
> +    switch (n->conf.wce) {
> +    case ON_OFF_AUTO_ON:
> +        n->features.volatile_wc = 1;
> +        break;
> +    case ON_OFF_AUTO_OFF:
> +        n->features.volatile_wc = 0;

Missing 'break'?

> +    case ON_OFF_AUTO_AUTO:
> +        n->features.volatile_wc = blk_enable_write_cache(ns->blk);
> +        break;
> +    default:
> +        abort();
> +    }
> +
> +    blk_set_enable_write_cache(ns->blk, n->features.volatile_wc);
> +
> +    return 0;
> +}
> +
> +static int nvme_ns_check_constraints(NvmeNamespace *ns, Error **errp)
> +{
> +    if (!ns->blk) {
> +        error_setg(errp, "block backend not configured");
> +        return -1;
> +    }
> +
> +    return 0;
> +}
> +
> +int nvme_ns_setup(NvmeCtrl *n, NvmeNamespace *ns, Error **errp)
> +{
> +    if (nvme_ns_check_constraints(ns, errp)) {
> +        return -1;
> +    }
> +
> +    if (nvme_ns_init_blk(n, ns, &n->id_ctrl, errp)) {
> +        return -1;
> +    }
> +
> +    nvme_ns_init(ns);
> +    if (nvme_register_namespace(n, ns, errp)) {
> +        return -1;
> +    }
> +
> +    return 0;
> +}
> +
> +static void nvme_ns_realize(DeviceState *dev, Error **errp)
> +{
> +    NvmeNamespace *ns = NVME_NS(dev);
> +    BusState *s = qdev_get_parent_bus(dev);
> +    NvmeCtrl *n = NVME(s->parent);
> +    Error *local_err = NULL;
> +
> +    if (nvme_ns_setup(n, ns, &local_err)) {
> +        error_propagate_prepend(errp, local_err,
> +                                "could not setup namespace: ");
> +        return;
> +    }
> +}
> +
> +static Property nvme_ns_props[] = {
> +    DEFINE_NVME_NS_PROPERTIES(NvmeNamespace, params),
> +    DEFINE_PROP_END_OF_LIST(),
> +};
> +
> +static void nvme_ns_class_init(ObjectClass *oc, void *data)
> +{
> +    DeviceClass *dc = DEVICE_CLASS(oc);
> +
> +    set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
> +
> +    dc->bus_type = TYPE_NVME_BUS;
> +    dc->realize = nvme_ns_realize;
> +    device_class_set_props(dc, nvme_ns_props);
> +    dc->desc = "virtual nvme namespace";

"Virtual NVMe namespace"?

> +}
> +
> +static void nvme_ns_instance_init(Object *obj)
> +{
> +    NvmeNamespace *ns = NVME_NS(obj);
> +    char *bootindex = g_strdup_printf("/namespace@%d,0", ns->params.nsid);
> +
> +    device_add_bootindex_property(obj, &ns->bootindex, "bootindex",
> +                                  bootindex, DEVICE(obj), &error_abort);
> +
> +    g_free(bootindex);
> +}
> +
> +static const TypeInfo nvme_ns_info = {
> +    .name = TYPE_NVME_NS,
> +    .parent = TYPE_DEVICE,
> +    .class_init = nvme_ns_class_init,
> +    .instance_size = sizeof(NvmeNamespace),
> +    .instance_init = nvme_ns_instance_init,
> +};
> +
> +static void nvme_ns_register_types(void)
> +{
> +    type_register_static(&nvme_ns_info);
> +}
> +
> +type_init(nvme_ns_register_types)
> diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h
> new file mode 100644
> index 000000000000..3c3651d485d0
> --- /dev/null
> +++ b/hw/block/nvme-ns.h
> @@ -0,0 +1,60 @@

Missing copyright + license.

> +#ifndef NVME_NS_H
> +#define NVME_NS_H
> +
> +#define TYPE_NVME_NS "nvme-ns"
> +#define NVME_NS(obj) \
> +    OBJECT_CHECK(NvmeNamespace, (obj), TYPE_NVME_NS)
> +
> +#define DEFINE_NVME_NS_PROPERTIES(_state, _props) \
> +    DEFINE_PROP_DRIVE("drive", _state, blk), \
> +    DEFINE_PROP_UINT32("nsid", _state, _props.nsid, 0)
> +
> +typedef struct NvmeNamespaceParams {
> +    uint32_t nsid;
> +} NvmeNamespaceParams;
> +
> +typedef struct NvmeNamespace {
> +    DeviceState  parent_obj;
> +    BlockBackend *blk;
> +    int32_t      bootindex;
> +    int64_t      size;
> +
> +    NvmeIdNs            id_ns;
> +    NvmeNamespaceParams params;
> +} NvmeNamespace;
> +
> +static inline uint32_t nvme_nsid(NvmeNamespace *ns)
> +{
> +    if (ns) {
> +        return ns->params.nsid;
> +    }
> +
> +    return -1;

This case doesn't seem possible.

Maybe:

       assert(ns);
       return ns->params.nsid;

> +}
> +
> +static inline NvmeLBAF *nvme_ns_lbaf(NvmeNamespace *ns)
> +{
> +    NvmeIdNs *id_ns = &ns->id_ns;
> +    return &id_ns->lbaf[NVME_ID_NS_FLBAS_INDEX(id_ns->flbas)];
> +}
> +
> +static inline uint8_t nvme_ns_lbads(NvmeNamespace *ns)
> +{
> +    return nvme_ns_lbaf(ns)->ds;
> +}
> +
> +static inline size_t nvme_ns_lbads_bytes(NvmeNamespace *ns)
> +{
> +    return 1 << nvme_ns_lbads(ns);
> +}
> +
> +static inline uint64_t nvme_ns_nlbas(NvmeNamespace *ns)
> +{
> +    return ns->size >> nvme_ns_lbads(ns);
> +}
> +
> +typedef struct NvmeCtrl NvmeCtrl;
> +
> +int nvme_ns_setup(NvmeCtrl *n, NvmeNamespace *ns, Error **errp);
> +
> +#endif /* NVME_NS_H */
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index 05a6fa334a70..e338d0893a70 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -17,10 +17,11 @@
>   /**
>    * Usage: add options:
>    *      -drive file=<file>,if=none,id=<drive_id>
> - *      -device nvme,drive=<drive_id>,serial=<serial>,id=<id[optional]>, \
> + *      -device nvme,serial=<serial>,id=<bus_name>, \
>    *              cmb_size_mb=<cmb_size_mb[optional]>, \
>    *              max_ioqpairs=<N[optional]>, \
>    *              mdts=<mdts[optional]>
> + *      -device nvme-ns,drive=<drive_id>,bus=bus_name,nsid=1
>    *
>    * Note cmb_size_mb denotes size of CMB in MB. CMB is assumed to be at
>    * offset 0 in BAR2 and supports only WDS, RDS and SQS for now.
> @@ -44,6 +45,7 @@
>   #include "qemu/cutils.h"
>   #include "trace.h"
>   #include "nvme.h"
> +#include "nvme-ns.h"
>   
>   #define NVME_SPEC_VER 0x00010300
>   #define NVME_CMB_BIR 2
> @@ -89,6 +91,11 @@ static int nvme_addr_read(NvmeCtrl *n, hwaddr addr, void *buf, int size)
>       return pci_dma_read(&n->parent_obj, addr, buf, size);
>   }
>   
> +static uint16_t nvme_nsid_valid(NvmeCtrl *n, uint32_t nsid)
> +{
> +    return nsid && nsid <= n->num_namespaces;
> +}
> +
>   static int nvme_check_sqid(NvmeCtrl *n, uint16_t sqid)
>   {
>       return sqid < n->params.max_ioqpairs + 1 && n->sq[sqid] != NULL ? 0 : -1;
> @@ -949,11 +956,12 @@ static uint16_t nvme_check_rw(NvmeCtrl *n, NvmeRequest *req)
>   
>   static void nvme_rw_cb(NvmeRequest *req, void *opaque)
>   {
> +    NvmeNamespace *ns = req->ns;
>       NvmeSQueue *sq = req->sq;
>       NvmeCtrl *n = sq->ctrl;
>       NvmeCQueue *cq = n->cq[sq->cqid];
>   
> -    trace_nvme_dev_rw_cb(nvme_cid(req));
> +    trace_nvme_dev_rw_cb(nvme_cid(req), nvme_nsid(ns));
>   
>       nvme_enqueue_req_completion(cq, req);
>   }
> @@ -1036,10 +1044,11 @@ static void nvme_aio_cb(void *opaque, int ret)
>   
>   static uint16_t nvme_flush(NvmeCtrl *n, NvmeRequest *req)
>   {
> +    NvmeNamespace *ns = req->ns;
>       NvmeAIO *aio = g_new0(NvmeAIO, 1);
>   
>       *aio = (NvmeAIO) {
> -        .blk = n->conf.blk,
> +        .blk = ns->blk,
>           .req = req,
>       };
>   
> @@ -1063,11 +1072,12 @@ static uint16_t nvme_write_zeroes(NvmeCtrl *n, NvmeRequest *req)
>       req->slba = le64_to_cpu(rw->slba);
>       req->nlb  = le16_to_cpu(rw->nlb) + 1;
>   
> -    trace_nvme_dev_write_zeroes(nvme_cid(req), req->slba, req->nlb);
> +    trace_nvme_dev_write_zeroes(nvme_cid(req), nvme_nsid(ns), req->slba,
> +                                req->nlb);
>   
>       status = nvme_check_bounds(n, req->slba, req->nlb, req);
>       if (status) {
> -        return status;
> +        goto invalid;
>       }
>   
>       offset = req->slba << nvme_ns_lbads(ns);
> @@ -1076,7 +1086,7 @@ static uint16_t nvme_write_zeroes(NvmeCtrl *n, NvmeRequest *req)
>       aio = g_new0(NvmeAIO, 1);
>   
>       *aio = (NvmeAIO) {
> -        .blk = n->conf.blk,
> +        .blk = ns->blk,
>           .offset = offset,
>           .len = count,
>           .req = req,
> @@ -1086,6 +1096,10 @@ static uint16_t nvme_write_zeroes(NvmeCtrl *n, NvmeRequest *req)
>       nvme_submit_aio(aio);
>   
>       return NVME_NO_COMPLETE;
> +
> +invalid:
> +    block_acct_invalid(blk_get_stats(ns->blk), BLOCK_ACCT_WRITE);
> +    return status;
>   }
>   
>   static uint16_t nvme_rw(NvmeCtrl *n, NvmeRequest *req)
> @@ -1100,11 +1114,11 @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeRequest *req)
>   
>       req->nlb  = le16_to_cpu(rw->nlb) + 1;
>       req->slba = le64_to_cpu(rw->slba);
> -
>       len = req->nlb << nvme_ns_lbads(ns);
>   
> -    trace_nvme_dev_rw(nvme_req_is_write(req) ? "write" : "read", req->nlb,
> -                      req->nlb << nvme_ns_lbads(req->ns), req->slba);
> +    trace_nvme_dev_rw(nvme_cid(req), nvme_req_is_write(req) ? "write" : "read",
> +                      nvme_nsid(ns), req->nlb, req->nlb << nvme_ns_lbads(ns),
> +                      req->slba);
>   
>       status = nvme_check_rw(n, req);
>       if (status) {
> @@ -1116,13 +1130,13 @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeRequest *req)
>           goto invalid;
>       }
>   
> -    nvme_rw_aio(n->conf.blk, req->slba << nvme_ns_lbads(ns), req);
> +    nvme_rw_aio(ns->blk, req->slba << nvme_ns_lbads(ns), req);
>       nvme_req_set_cb(req, nvme_rw_cb, NULL);
>   
>       return NVME_NO_COMPLETE;
>   
>   invalid:
> -    block_acct_invalid(blk_get_stats(n->conf.blk), acct);
> +    block_acct_invalid(blk_get_stats(ns->blk), acct);
>       return status;
>   }
>   
> @@ -1133,12 +1147,15 @@ static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeRequest *req)
>       trace_nvme_dev_io_cmd(nvme_cid(req), nsid, le16_to_cpu(req->sq->sqid),
>                             req->cmd.opcode);
>   
> -    if (unlikely(nsid == 0 || nsid > n->num_namespaces)) {
> -        trace_nvme_dev_err_invalid_ns(nsid, n->num_namespaces);
> +    if (!nvme_nsid_valid(n, nsid)) {
>           return NVME_INVALID_NSID | NVME_DNR;
>       }
>   
> -    req->ns = &n->namespaces[nsid - 1];
> +    req->ns = nvme_ns(n, nsid);
> +    if (unlikely(!req->ns)) {
> +        return NVME_INVALID_FIELD | NVME_DNR;
> +    }
> +
>       switch (req->cmd.opcode) {
>       case NVME_CMD_FLUSH:
>           return nvme_flush(n, req);
> @@ -1283,18 +1300,24 @@ static uint16_t nvme_smart_info(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
>       uint64_t units_read = 0, units_written = 0;
>       uint64_t read_commands = 0, write_commands = 0;
>       NvmeSmartLog smart;
> -    BlockAcctStats *s;
>   
>       if (nsid && nsid != 0xffffffff) {
>           return NVME_INVALID_FIELD | NVME_DNR;
>       }
>   
> -    s = blk_get_stats(n->conf.blk);
> +    for (int i = 1; i <= n->num_namespaces; i++) {
> +        NvmeNamespace *ns = nvme_ns(n, i);
> +        if (!ns) {
> +            continue;
> +        }
>   
> -    units_read = s->nr_bytes[BLOCK_ACCT_READ] >> BDRV_SECTOR_BITS;
> -    units_written = s->nr_bytes[BLOCK_ACCT_WRITE] >> BDRV_SECTOR_BITS;
> -    read_commands = s->nr_ops[BLOCK_ACCT_READ];
> -    write_commands = s->nr_ops[BLOCK_ACCT_WRITE];
> +        BlockAcctStats *s = blk_get_stats(ns->blk);
> +
> +        units_read += s->nr_bytes[BLOCK_ACCT_READ] >> BDRV_SECTOR_BITS;
> +        units_written += s->nr_bytes[BLOCK_ACCT_WRITE] >> BDRV_SECTOR_BITS;
> +        read_commands += s->nr_ops[BLOCK_ACCT_READ];
> +        write_commands += s->nr_ops[BLOCK_ACCT_WRITE];
> +    }
>   
>       if (off > sizeof(smart)) {
>           return NVME_INVALID_FIELD | NVME_DNR;
> @@ -1527,18 +1550,23 @@ static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req)
>   {
>       NvmeNamespace *ns;
>       NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
> +    NvmeIdNs *id_ns, inactive = { 0 };
>       uint32_t nsid = le32_to_cpu(c->nsid);
>   
>       trace_nvme_dev_identify_ns(nsid);
>   
> -    if (unlikely(nsid == 0 || nsid > n->num_namespaces)) {
> -        trace_nvme_dev_err_invalid_ns(nsid, n->num_namespaces);
> +    if (!nvme_nsid_valid(n, nsid)) {
>           return NVME_INVALID_NSID | NVME_DNR;
>       }
>   
> -    ns = &n->namespaces[nsid - 1];
> +    ns = nvme_ns(n, nsid);
> +    if (unlikely(!ns)) {
> +        id_ns = &inactive;
> +    } else {
> +        id_ns = &ns->id_ns;
> +    }
>   
> -    return nvme_dma(n, (uint8_t *)&ns->id_ns, sizeof(ns->id_ns),
> +    return nvme_dma(n, (uint8_t *)id_ns, sizeof(NvmeIdNs),
>                       DMA_DIRECTION_FROM_DEVICE, req);
>   }
>   
> @@ -1555,7 +1583,7 @@ static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req)
>   
>       list = g_malloc0(data_len);
>       for (int i = 1; i <= n->num_namespaces; i++) {
> -        if (i <= min_nsid) {
> +        if (i <= min_nsid || !nvme_ns(n, i)) {
>               continue;
>           }
>           list[j++] = cpu_to_le32(i);
> @@ -1573,7 +1601,6 @@ static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req)
>   {
>       NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
>       uint32_t nsid = le32_to_cpu(c->nsid);
> -
>       uint8_t list[NVME_IDENTIFY_DATA_SIZE];
>   
>       struct data {
> @@ -1587,11 +1614,14 @@ static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req)
>   
>       trace_nvme_dev_identify_ns_descr_list(nsid);
>   
> -    if (unlikely(nsid == 0 || nsid > n->num_namespaces)) {
> -        trace_nvme_dev_err_invalid_ns(nsid, n->num_namespaces);
> +    if (!nvme_nsid_valid(n, nsid)) {
>           return NVME_INVALID_NSID | NVME_DNR;
>       }
>   
> +    if (unlikely(!nvme_ns(n, nsid))) {
> +        return NVME_INVALID_FIELD | NVME_DNR;
> +    }
> +
>       /*
>        * Because the NGUID and EUI64 fields are 0 in the Identify Namespace data
>        * structure, a Namespace UUID (nidt = 0x3) must be reported in the
> @@ -1726,7 +1756,7 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeRequest *req)
>           result = cpu_to_le32(n->features.err_rec);
>           break;
>       case NVME_VOLATILE_WRITE_CACHE:
> -        result = cpu_to_le32(blk_enable_write_cache(n->conf.blk));
> +        result = cpu_to_le32(n->features.volatile_wc);
>           trace_nvme_dev_getfeat_vwcache(result ? "enabled" : "disabled");
>           break;
>       case NVME_NUMBER_OF_QUEUES:
> @@ -1779,6 +1809,8 @@ static uint16_t nvme_set_feature_timestamp(NvmeCtrl *n, NvmeRequest *req)
>   
>   static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeRequest *req)
>   {
> +    NvmeNamespace *ns;
> +
>       NvmeCmd *cmd = &req->cmd;
>       uint32_t dw10 = le32_to_cpu(cmd->cdw10);
>       uint32_t dw11 = le32_to_cpu(cmd->cdw11);
> @@ -1812,12 +1844,23 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeRequest *req)
>   
>           break;
>       case NVME_VOLATILE_WRITE_CACHE:
> -        if (blk_enable_write_cache(n->conf.blk)) {
> -            blk_flush(n->conf.blk);
> +        n->features.volatile_wc = dw11;
> +
> +        for (int i = 1; i <= n->num_namespaces; i++) {
> +            ns = nvme_ns(n, i);
> +            if (!ns) {
> +                continue;
> +            }
> +
> +            if (blk_enable_write_cache(ns->blk)) {
> +                blk_flush(ns->blk);
> +            }
> +
> +            blk_set_enable_write_cache(ns->blk, dw11 & 1);
>           }
>   
> -        blk_set_enable_write_cache(n->conf.blk, dw11 & 1);
>           break;
> +
>       case NVME_NUMBER_OF_QUEUES:
>           if (n->qs_created) {
>               return NVME_CMD_SEQ_ERROR | NVME_DNR;
> @@ -1946,9 +1989,17 @@ static void nvme_process_sq(void *opaque)
>   
>   static void nvme_clear_ctrl(NvmeCtrl *n)
>   {
> +    NvmeNamespace *ns;
>       int i;
>   
> -    blk_drain(n->conf.blk);
> +    for (i = 1; i <= n->num_namespaces; i++) {
> +        ns = nvme_ns(n, i);
> +        if (!ns) {
> +            continue;
> +        }
> +
> +        blk_drain(ns->blk);
> +    }
>   
>       for (i = 0; i < n->params.max_ioqpairs + 1; i++) {
>           if (n->sq[i] != NULL) {
> @@ -1971,7 +2022,15 @@ static void nvme_clear_ctrl(NvmeCtrl *n)
>       n->outstanding_aers = 0;
>       n->qs_created = false;
>   
> -    blk_flush(n->conf.blk);
> +    for (i = 1; i <= n->num_namespaces; i++) {
> +        ns = nvme_ns(n, i);
> +        if (!ns) {
> +            continue;
> +        }
> +
> +        blk_flush(ns->blk);
> +    }
> +
>       n->bar.cc = 0;
>   }
>   
> @@ -2406,6 +2465,11 @@ static int nvme_check_constraints(NvmeCtrl *n, Error **errp)
>           params->max_ioqpairs = params->num_queues - 1;
>       }
>   
> +    if (n->namespace.blk) {
> +        warn_report("nvme: drive is deprecated; please use an nvme-ns device "
> +                    "instead");
> +    }
> +
>       if (params->max_ioqpairs < 1 ||
>           params->max_ioqpairs > PCI_MSIX_FLAGS_QSIZE) {
>           error_setg(errp, "nvme: max_ioqpairs must be between 1 and %d",
> @@ -2413,11 +2477,6 @@ static int nvme_check_constraints(NvmeCtrl *n, Error **errp)
>           return -1;
>       }
>   
> -    if (!n->conf.blk) {
> -        error_setg(errp, "nvme: block backend not configured");
> -        return -1;
> -    }
> -
>       if (!params->serial) {
>           error_setg(errp, "nvme: serial not configured");
>           return -1;
> @@ -2426,22 +2485,10 @@ static int nvme_check_constraints(NvmeCtrl *n, Error **errp)
>       return 0;
>   }
>   
> -static int nvme_init_blk(NvmeCtrl *n, Error **errp)
> -{
> -    blkconf_blocksizes(&n->conf);
> -    if (!blkconf_apply_backend_options(&n->conf, blk_is_read_only(n->conf.blk),
> -                                       false, errp)) {
> -        return -1;
> -    }
> -
> -    return 0;
> -}
> -
>   static void nvme_init_state(NvmeCtrl *n)
>   {
> -    n->num_namespaces = 1;
> +    n->num_namespaces = NVME_MAX_NAMESPACES;
>       n->reg_size = pow2ceil(0x1008 + 2 * (n->params.max_ioqpairs) * 4);
> -    n->namespaces = g_new0(NvmeNamespace, n->num_namespaces);
>       n->sq = g_new0(NvmeSQueue *, n->params.max_ioqpairs + 1);
>       n->cq = g_new0(NvmeCQueue *, n->params.max_ioqpairs + 1);
>       n->temperature = NVME_TEMPERATURE;
> @@ -2557,6 +2604,7 @@ static void nvme_init_ctrl(NvmeCtrl *n)
>       id->nn = cpu_to_le32(n->num_namespaces);
>       id->oncs = cpu_to_le16(NVME_ONCS_WRITE_ZEROES | NVME_ONCS_TIMESTAMP);
>   
> +    id->vwc = 0x1;
>       id->sgls = cpu_to_le32(NVME_CTRL_SGLS_SUPPORTED_NO_ALIGNMENT |
>                              NVME_CTRL_SGLS_BITBUCKET);
>   
> @@ -2566,9 +2614,6 @@ static void nvme_init_ctrl(NvmeCtrl *n)
>       id->psd[0].mp = cpu_to_le16(0x9c4);
>       id->psd[0].enlat = cpu_to_le32(0x10);
>       id->psd[0].exlat = cpu_to_le32(0x4);
> -    if (blk_enable_write_cache(n->conf.blk)) {
> -        id->vwc = 1;
> -    }
>   
>       n->bar.cap = 0;
>       NVME_CAP_SET_MQES(n->bar.cap, 0x7ff);
> @@ -2581,25 +2626,39 @@ static void nvme_init_ctrl(NvmeCtrl *n)
>       n->bar.intmc = n->bar.intms = 0;
>   }
>   
> -static int nvme_init_namespace(NvmeCtrl *n, NvmeNamespace *ns, Error **errp)
> +int nvme_register_namespace(NvmeCtrl *n, NvmeNamespace *ns, Error **errp)
>   {
> -    int64_t bs_size;
> -    NvmeIdNs *id_ns = &ns->id_ns;
> +    uint32_t nsid = nvme_nsid(ns);
>   
> -    bs_size = blk_getlength(n->conf.blk);
> -    if (bs_size < 0) {
> -        error_setg_errno(errp, -bs_size, "blk_getlength");
> +    if (nsid > NVME_MAX_NAMESPACES) {
> +        error_setg(errp, "invalid namespace id (must be between 0 and %d)",
> +                   NVME_MAX_NAMESPACES);
>           return -1;
>       }
>   
> -    id_ns->lbaf[0].ds = BDRV_SECTOR_BITS;
> -    n->ns_size = bs_size;
> +    if (!nsid) {
> +        for (int i = 1; i <= n->num_namespaces; i++) {
> +            NvmeNamespace *ns = nvme_ns(n, i);
> +            if (!ns) {
> +                nsid = i;
> +                break;
> +            }
> +        }
>   
> -    id_ns->nsze = cpu_to_le64(nvme_ns_nlbas(n, ns));
> +        if (!nsid) {
> +            error_setg(errp, "no free namespace id");
> +            return -1;
> +        }
> +    } else {
> +        if (n->namespaces[nsid - 1]) {
> +            error_setg(errp, "namespace id '%d' is already in use", nsid);
> +            return -1;
> +        }
> +    }
>   
> -    /* no thin provisioning */
> -    id_ns->ncap = id_ns->nsze;
> -    id_ns->nuse = id_ns->ncap;
> +    trace_nvme_dev_register_namespace(nsid);
> +
> +    n->namespaces[nsid - 1] = ns;
>   
>       return 0;
>   }
> @@ -2607,26 +2666,28 @@ static int nvme_init_namespace(NvmeCtrl *n, NvmeNamespace *ns, Error **errp)
>   static void nvme_realize(PCIDevice *pci_dev, Error **errp)
>   {
>       NvmeCtrl *n = NVME(pci_dev);
> -    int i;
> +    NvmeNamespace *ns;
>   
>       if (nvme_check_constraints(n, errp)) {
>           return;
>       }
>   
> +    qbus_create_inplace(&n->bus, sizeof(NvmeBus), TYPE_NVME_BUS,
> +                        &pci_dev->qdev, n->parent_obj.qdev.id);
> +
>       nvme_init_state(n);
> +    nvme_init_pci(n, pci_dev);
> +    nvme_init_ctrl(n);

Argh you move these functions again... Why not put them in the correct 
place when you refactor nvme_realize()?

>   
> -    if (nvme_init_blk(n, errp)) {
> -        return;
> -    }
> +    /* setup a namespace if the controller drive property was given */
> +    if (n->namespace.blk) {
> +        ns = &n->namespace;
> +        ns->params.nsid = 1;
>   
> -    for (i = 0; i < n->num_namespaces; i++) {
> -        if (nvme_init_namespace(n, &n->namespaces[i], errp)) {
> +        if (nvme_ns_setup(n, ns, errp)) {
>               return;
>           }
>       }
> -
> -    nvme_init_pci(n, pci_dev);
> -    nvme_init_ctrl(n);
>   }
>   
>   static void nvme_exit(PCIDevice *pci_dev)
> @@ -2647,7 +2708,8 @@ static void nvme_exit(PCIDevice *pci_dev)
>   }
>   
>   static Property nvme_props[] = {
> -    DEFINE_BLOCK_PROPERTIES(NvmeCtrl, conf),
> +    DEFINE_BLOCK_PROPERTIES_BASE(NvmeCtrl, conf), \
> +    DEFINE_PROP_DRIVE("drive", NvmeCtrl, namespace.blk), \
>       DEFINE_NVME_PROPERTIES(NvmeCtrl, params),
>       DEFINE_PROP_END_OF_LIST(),
>   };
> @@ -2679,26 +2741,35 @@ static void nvme_instance_init(Object *obj)
>   {
>       NvmeCtrl *s = NVME(obj);
>   
> -    device_add_bootindex_property(obj, &s->conf.bootindex,
> -                                  "bootindex", "/namespace@1,0",
> -                                  DEVICE(obj), &error_abort);
> +    if (s->namespace.blk) {
> +        device_add_bootindex_property(obj, &s->conf.bootindex,
> +                                      "bootindex", "/namespace@1,0",
> +                                      DEVICE(obj), &error_abort);
> +    }
>   }
>   
>   static const TypeInfo nvme_info = {
>       .name          = TYPE_NVME,
>       .parent        = TYPE_PCI_DEVICE,
>       .instance_size = sizeof(NvmeCtrl),
> -    .class_init    = nvme_class_init,
>       .instance_init = nvme_instance_init,
> +    .class_init    = nvme_class_init,
>       .interfaces = (InterfaceInfo[]) {
>           { INTERFACE_PCIE_DEVICE },
>           { }
>       },
>   };
>   
> +static const TypeInfo nvme_bus_info = {
> +    .name = TYPE_NVME_BUS,
> +    .parent = TYPE_BUS,
> +    .instance_size = sizeof(NvmeBus),
> +};
> +
>   static void nvme_register_types(void)
>   {
>       type_register_static(&nvme_info);
> +    type_register_static(&nvme_bus_info);
>   }
>   
>   type_init(nvme_register_types)
> diff --git a/hw/block/nvme.h b/hw/block/nvme.h
> index ed4a5ce4121d..f42c17651b7b 100644
> --- a/hw/block/nvme.h
> +++ b/hw/block/nvme.h
> @@ -2,6 +2,9 @@
>   #define HW_NVME_H
>   
>   #include "block/nvme.h"
> +#include "nvme-ns.h"
> +
> +#define NVME_MAX_NAMESPACES 256
>   
>   #define DEFINE_NVME_PROPERTIES(_state, _props) \
>       DEFINE_PROP_STRING("serial", _state, _props.serial), \
> @@ -93,26 +96,6 @@ typedef struct NvmeCQueue {
>       QTAILQ_HEAD(, NvmeRequest) req_list;
>   } NvmeCQueue;
>   
> -typedef struct NvmeNamespace {
> -    NvmeIdNs        id_ns;
> -} NvmeNamespace;
> -
> -static inline NvmeLBAF *nvme_ns_lbaf(NvmeNamespace *ns)
> -{
> -    NvmeIdNs *id_ns = &ns->id_ns;
> -    return &id_ns->lbaf[NVME_ID_NS_FLBAS_INDEX(id_ns->flbas)];
> -}
> -
> -static inline uint8_t nvme_ns_lbads(NvmeNamespace *ns)
> -{
> -    return nvme_ns_lbaf(ns)->ds;
> -}
> -
> -static inline size_t nvme_ns_lbads_bytes(NvmeNamespace *ns)
> -{
> -    return 1 << nvme_ns_lbads(ns);
> -}
> -
>   typedef enum NvmeAIOOp {
>       NVME_AIO_OPC_NONE         = 0x0,
>       NVME_AIO_OPC_FLUSH        = 0x1,
> @@ -167,6 +150,13 @@ static inline bool nvme_req_is_write(NvmeRequest *req)
>       }
>   }
>   
> +#define TYPE_NVME_BUS "nvme-bus"
> +#define NVME_BUS(obj) OBJECT_CHECK(NvmeBus, (obj), TYPE_NVME_BUS)
> +
> +typedef struct NvmeBus {
> +    BusState parent_bus;
> +} NvmeBus;
> +
>   #define TYPE_NVME "nvme"
>   #define NVME(obj) \
>           OBJECT_CHECK(NvmeCtrl, (obj), TYPE_NVME)
> @@ -176,8 +166,9 @@ typedef struct NvmeCtrl {
>       MemoryRegion iomem;
>       MemoryRegion ctrl_mem;
>       NvmeBar      bar;
> -    BlockConf    conf;
>       NvmeParams   params;
> +    NvmeBus      bus;
> +    BlockConf    conf;
>   
>       bool        qs_created;
>       uint32_t    page_size;
> @@ -188,7 +179,6 @@ typedef struct NvmeCtrl {
>       uint32_t    reg_size;
>       uint32_t    num_namespaces;
>       uint32_t    max_q_ents;
> -    uint64_t    ns_size;
>       uint8_t     outstanding_aers;
>       uint8_t     *cmbuf;
>       uint64_t    irq_status;
> @@ -202,7 +192,8 @@ typedef struct NvmeCtrl {
>       QTAILQ_HEAD(, NvmeAsyncEvent) aer_queue;
>       int         aer_queued;
>   
> -    NvmeNamespace   *namespaces;
> +    NvmeNamespace   namespace;
> +    NvmeNamespace   *namespaces[NVME_MAX_NAMESPACES];
>       NvmeSQueue      **sq;
>       NvmeCQueue      **cq;
>       NvmeSQueue      admin_sq;
> @@ -211,9 +202,13 @@ typedef struct NvmeCtrl {
>       NvmeFeatureVal  features;
>   } NvmeCtrl;
>   
> -static inline uint64_t nvme_ns_nlbas(NvmeCtrl *n, NvmeNamespace *ns)
> +static inline NvmeNamespace *nvme_ns(NvmeCtrl *n, uint32_t nsid)
>   {
> -    return n->ns_size >> nvme_ns_lbads(ns);
> +    if (!nsid || nsid > n->num_namespaces) {
> +        return NULL;
> +    }
> +
> +    return n->namespaces[nsid - 1];
>   }
>   
>   static inline uint16_t nvme_cid(NvmeRequest *req)
> @@ -225,4 +220,6 @@ static inline uint16_t nvme_cid(NvmeRequest *req)
>       return 0xffff;
>   }
>   
> +int nvme_register_namespace(NvmeCtrl *n, NvmeNamespace *ns, Error **errp);
> +
>   #endif /* HW_NVME_H */
> diff --git a/hw/block/trace-events b/hw/block/trace-events
> index accbb04fe396..8fec87260555 100644
> --- a/hw/block/trace-events
> +++ b/hw/block/trace-events
> @@ -29,6 +29,7 @@ hd_geometry_guess(void *blk, uint32_t cyls, uint32_t heads, uint32_t secs, int t
>   
>   # nvme.c
>   # nvme traces for successful events
> +nvme_dev_register_namespace(uint32_t nsid) "nsid %"PRIu32""
>   nvme_dev_irq_msix(uint32_t vector) "raising MSI-X IRQ vector %u"
>   nvme_dev_irq_pin(void) "pulsing IRQ pin"
>   nvme_dev_irq_masked(void) "IRQ is masked"
> @@ -39,9 +40,9 @@ nvme_dev_req_add_aio(uint16_t cid, void *aio, const char *blkname, uint64_t offs
>   nvme_dev_aio_cb(uint16_t cid, void *aio, const char *blkname, uint64_t offset, const char *opc, void *req) "cid %"PRIu16" aio %p blk \"%s\" offset %"PRIu64" opc \"%s\" req %p"
>   nvme_dev_io_cmd(uint16_t cid, uint32_t nsid, uint16_t sqid, uint8_t opcode) "cid %"PRIu16" nsid %"PRIu32" sqid %"PRIu16" opc 0x%"PRIx8""
>   nvme_dev_admin_cmd(uint16_t cid, uint16_t sqid, uint8_t opcode) "cid %"PRIu16" sqid %"PRIu16" opc 0x%"PRIx8""
> -nvme_dev_rw(const char *verb, uint32_t blk_count, uint64_t byte_count, uint64_t lba) "%s %"PRIu32" blocks (%"PRIu64" bytes) from LBA %"PRIu64""
> -nvme_dev_rw_cb(uint16_t cid) "cid %"PRIu16""
> -nvme_dev_write_zeroes(uint16_t cid, uint64_t slba, uint32_t nlb) "cid %"PRIu16" slba %"PRIu64" nlb %"PRIu32""
> +nvme_dev_rw(uint16_t cid, const char *verb, uint32_t nsid, uint32_t nlb, uint64_t count, uint64_t lba) "cid %"PRIu16" %s nsid %"PRIu32" nlb %"PRIu32" count %"PRIu64" lba 0x%"PRIx64""
> +nvme_dev_rw_cb(uint16_t cid, uint32_t nsid) "cid %"PRIu16" nsid %"PRIu32""
> +nvme_dev_write_zeroes(uint16_t cid, uint32_t nsid, uint64_t slba, uint32_t nlb) "cid %"PRIu16" nsid %"PRIu32" slba %"PRIu64" nlb %"PRIu32""
>   nvme_dev_create_sq(uint64_t addr, uint16_t sqid, uint16_t cqid, uint16_t qsize, uint16_t qflags) "create submission queue, addr=0x%"PRIx64", sqid=%"PRIu16", cqid=%"PRIu16", qsize=%"PRIu16", qflags=%"PRIu16""
>   nvme_dev_create_cq(uint64_t addr, uint16_t cqid, uint16_t vector, uint16_t size, uint16_t qflags, int ien) "create completion queue, addr=0x%"PRIx64", cqid=%"PRIu16", vector=%"PRIu16", qsize=%"PRIu16", qflags=%"PRIu16", ien=%d"
>   nvme_dev_del_sq(uint16_t qid) "deleting submission queue sqid=%"PRIu16""
> @@ -98,7 +99,6 @@ nvme_dev_err_invalid_prplist_ent(uint64_t prplist) "PRP list entry is null or no
>   nvme_dev_err_invalid_prp2_align(uint64_t prp2) "PRP2 is not page aligned: 0x%"PRIx64""
>   nvme_dev_err_invalid_prp2_missing(void) "PRP2 is null and more data to be transferred"
>   nvme_dev_err_invalid_prp(void) "invalid PRP"
> -nvme_dev_err_invalid_ns(uint32_t ns, uint32_t limit) "invalid namespace %u not within 1-%u"
>   nvme_dev_err_invalid_opc(uint8_t opc) "invalid opcode 0x%"PRIx8""
>   nvme_dev_err_invalid_admin_opc(uint8_t opc) "invalid admin opcode 0x%"PRIx8""
>   nvme_dev_err_invalid_lba_range(uint64_t start, uint64_t len, uint64_t limit) "Invalid LBA start=%"PRIu64" len=%"PRIu64" limit=%"PRIu64""
> 




* Re: [PATCH v7 48/48] nvme: make lba data size configurable
  2020-04-15  5:51 ` [PATCH v7 48/48] nvme: make lba data size configurable Klaus Jensen
@ 2020-04-15  7:40   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 72+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-04-15  7:40 UTC (permalink / raw)
  To: Klaus Jensen, qemu-block
  Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel, Max Reitz,
	Keith Busch, Javier Gonzalez, Maxim Levitsky

On 4/15/20 7:51 AM, Klaus Jensen wrote:
> From: Klaus Jensen <k.jensen@samsung.com>
> 
> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> Acked-by: Keith Busch <kbusch@kernel.org>
> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>

> ---
>   hw/block/nvme-ns.c | 7 ++++++-
>   hw/block/nvme-ns.h | 4 +++-
>   hw/block/nvme.c    | 1 +
>   3 files changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c
> index bd64d4a94632..d34b6b2439f1 100644
> --- a/hw/block/nvme-ns.c
> +++ b/hw/block/nvme-ns.c
> @@ -18,7 +18,7 @@ static int nvme_ns_init(NvmeNamespace *ns)
>   {
>       NvmeIdNs *id_ns = &ns->id_ns;
>   
> -    id_ns->lbaf[0].ds = BDRV_SECTOR_BITS;
> +    id_ns->lbaf[0].ds = ns->params.lbads;
>       id_ns->nsze = cpu_to_le64(nvme_ns_nlbas(ns));
>   
>       /* no thin provisioning */
> @@ -78,6 +78,11 @@ static int nvme_ns_check_constraints(NvmeNamespace *ns, Error **errp)
>           return -1;
>       }
>   
> +    if (ns->params.lbads < 9 || ns->params.lbads > 12) {
> +        error_setg(errp, "unsupported lbads (supported: 9-12)");
> +        return 1;
> +    }
> +
>       return 0;
>   }
>   
> diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h
> index 3c3651d485d0..43b78f8b8d9c 100644
> --- a/hw/block/nvme-ns.h
> +++ b/hw/block/nvme-ns.h
> @@ -7,10 +7,12 @@
>   
>   #define DEFINE_NVME_NS_PROPERTIES(_state, _props) \
>       DEFINE_PROP_DRIVE("drive", _state, blk), \
> -    DEFINE_PROP_UINT32("nsid", _state, _props.nsid, 0)
> +    DEFINE_PROP_UINT32("nsid", _state, _props.nsid, 0), \
> +    DEFINE_PROP_UINT8("lbads", _state, _props.lbads, BDRV_SECTOR_BITS)
>   
>   typedef struct NvmeNamespaceParams {
>       uint32_t nsid;
> +    uint8_t  lbads;
>   } NvmeNamespaceParams;
>   
>   typedef struct NvmeNamespace {
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index 40a400333828..dd2759a4ce2e 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -2690,6 +2690,7 @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp)
>       if (n->namespace.blk) {
>           ns = &n->namespace;
>           ns->params.nsid = 1;
> +        ns->params.lbads = BDRV_SECTOR_BITS;
>   
>           if (nvme_ns_setup(n, ns, errp)) {
>               return;
> 
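
(Aside for readers skimming the diff: the new `lbads` parameter is the base-2
logarithm of the LBA data size, so the 9-12 constraint above corresponds to
logical block sizes of 512 to 4096 bytes. A minimal sketch of that
relationship, with made-up helper names and not code from the patch:

    #include <stdint.h>

    /* LBADS is log2 of the LBA data size: 9 -> 512 B, 12 -> 4096 B. */
    static inline uint64_t lba_data_size(uint8_t lbads)
    {
        return UINT64_C(1) << lbads;
    }

    /* Number of logical blocks that fit in a backing image of `bytes` bytes. */
    static inline uint64_t num_lbas(uint64_t bytes, uint8_t lbads)
    {
        return bytes >> lbads;
    }

With the default of BDRV_SECTOR_BITS (9), behaviour is unchanged from before
the patch.)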




* Re: [PATCH v7 12/48] nvme: add temperature threshold feature
  2020-04-15  7:28       ` Klaus Birkelund Jensen
@ 2020-04-15  7:45         ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 72+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-04-15  7:45 UTC (permalink / raw)
  To: Klaus Birkelund Jensen
  Cc: Kevin Wolf, Beata Michalska, qemu-block, Klaus Jensen,
	qemu-devel, Max Reitz, Keith Busch, Javier Gonzalez,
	Maxim Levitsky

On 4/15/20 9:28 AM, Klaus Birkelund Jensen wrote:
> On Apr 15 09:24, Klaus Birkelund Jensen wrote:
>> On Apr 15 09:19, Philippe Mathieu-Daudé wrote:
>>> On 4/15/20 7:51 AM, Klaus Jensen wrote:
>>>> From: Klaus Jensen <k.jensen@samsung.com>
>>>>
>>>> It might seem wierd to implement this feature for an emulated device,
>>>
>>> 'weird'
>>
>> Thanks, fixed :)
>>
>>>
>>>> but it is mandatory to support and the feature is useful for testing
>>>> asynchronous event request support, which will be added in a later
>>>> patch.
>>>
>>> Which patch? I can't find how you set the temperature in this series.
>>>
>>
>> The temperature cannot be changed, but the thresholds can with the Set
>> Features command (and that can then trigger AERs). That is added in
>> "nvme: add temperature threshold feature" and "nvme: add support for the
>> asynchronous event request command" respectively.
>>
>> There is a test in SPDK that does this.
>>
> 
> Oh, I think I misunderstood you.
> 
> No, setting the temperature was moved to the "nvme: add support for the
> get log page command" patch since that is the patch that actually uses
> it. This was on request by Maxim in an earlier review.

I was expecting to see a QMP command to modify the device temperature at 
runtime.
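
(Context for the exchange above, not from the thread itself: in NVMe 1.3 the
Temperature Threshold feature is feature id 0x04 and Set Features carries the
threshold in Dword 11. The sketch below, with an invented helper name, reflects
my reading of the spec's field layout and is only an illustration:

    #include <stdint.h>

    /* Dword 11 of Set Features / Temperature Threshold (NVMe 1.3):
     * bits 15:0  TMPTH  - threshold in Kelvin
     * bits 19:16 TMPSEL - temperature sensor select
     * bits 21:20 THSEL  - 0 = over-temperature, 1 = under-temperature
     */
    static inline uint32_t temp_thresh_dw11(uint16_t kelvin, uint8_t tmpsel,
                                            uint8_t thsel)
    {
        return (uint32_t)kelvin |
               ((uint32_t)(tmpsel & 0xf) << 16) |
               ((uint32_t)(thsel & 0x3) << 20);
    }

Moving such a threshold past the device's fixed temperature is what lets the
SPDK test mentioned above provoke an asynchronous event.)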




* Re: [PATCH v7 06/48] nvme: refactor nvme_addr_read
  2020-04-15  7:03   ` Philippe Mathieu-Daudé
@ 2020-04-15  7:46     ` Klaus Birkelund Jensen
  0 siblings, 0 replies; 72+ messages in thread
From: Klaus Birkelund Jensen @ 2020-04-15  7:46 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: Kevin Wolf, Beata Michalska, qemu-block, Klaus Jensen,
	qemu-devel, Max Reitz, Keith Busch, Javier Gonzalez,
	Maxim Levitsky

On Apr 15 09:03, Philippe Mathieu-Daudé wrote:
> On 4/15/20 7:50 AM, Klaus Jensen wrote:
> > From: Klaus Jensen <k.jensen@samsung.com>
> > 
> > Pull the controller memory buffer check to its own function. The check
> > will be used on its own in later patches.
> > 
> > Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> > Acked-by: Keith Busch <kbusch@kernel.org>
> > ---
> >   hw/block/nvme.c | 16 ++++++++++++----
> >   1 file changed, 12 insertions(+), 4 deletions(-)
> > 
> > diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> > index 622103c42d0a..02d3dde90842 100644
> > --- a/hw/block/nvme.c
> > +++ b/hw/block/nvme.c
> > @@ -52,14 +52,22 @@
> >   static void nvme_process_sq(void *opaque);
> > +static inline bool nvme_addr_is_cmb(NvmeCtrl *n, hwaddr addr)
> 
> 'inline' not really necessary here.
> 

Fixed.
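
(For context, the helper whose signature is quoted above is essentially a
bounds check of a guest address against the controller memory buffer window.
A simplified sketch with made-up names, not the actual QEMU code:

    #include <stdbool.h>
    #include <stdint.h>

    /* Does `addr` fall inside the CMB window [base, base + size)? */
    static bool addr_in_cmb(uint64_t addr, uint64_t cmb_base, uint64_t cmb_size)
    {
        return addr >= cmb_base && addr < cmb_base + cmb_size;
    }

Having the check as a standalone predicate is what allows later patches to
reuse it outside of nvme_addr_read().)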




* Re: [PATCH v7 10/48] nvme: remove redundant cmbloc/cmbsz members
  2020-04-15  7:19     ` Klaus Birkelund Jensen
@ 2020-04-15  7:48       ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 72+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-04-15  7:48 UTC (permalink / raw)
  To: Klaus Birkelund Jensen
  Cc: Kevin Wolf, Beata Michalska, qemu-block, Klaus Jensen,
	qemu-devel, Max Reitz, Keith Busch, Javier Gonzalez,
	Maxim Levitsky

On 4/15/20 9:19 AM, Klaus Birkelund Jensen wrote:
> On Apr 15 09:10, Philippe Mathieu-Daudé wrote:
>>
>> "hw/block/nvme.h" should not pull in "block/nvme.h", both should include a
>> common "hw/block/nvme_spec.h" (or better named). Not related to this patch
>> although.
>>
> 
> Hmm. It does pull in the "include/block/nvme.h" which is basically the
> "nvme_spec.h" you are talking about. That file holds all spec related
> structs that are shared between the VFIO based nvme driver
> (block/nvme.c) and the emulated device (hw/block/nvme.c).
> 
> Isn't that what is intended?

You are correct, "include/block/nvme.h" contains the spec definitions. I 
was expecting to see (QEMU) block-related structures in it, but there aren't any.




* Re: [PATCH v7 11/48] nvme: refactor device realization
  2020-04-15  7:25     ` Klaus Birkelund Jensen
@ 2020-04-15  7:55       ` Philippe Mathieu-Daudé
  2020-04-15  8:18         ` Klaus Birkelund Jensen
  0 siblings, 1 reply; 72+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-04-15  7:55 UTC (permalink / raw)
  To: Klaus Birkelund Jensen
  Cc: Kevin Wolf, Beata Michalska, qemu-block, Klaus Jensen,
	qemu-devel, Max Reitz, Keith Busch, Javier Gonzalez,
	Maxim Levitsky

On 4/15/20 9:25 AM, Klaus Birkelund Jensen wrote:
> On Apr 15 09:14, Philippe Mathieu-Daudé wrote:
>> Hi Klaus,
>>
>> This patch is a pain to review... Could you split it? I'd use one trivial
>> patch for each function extracted from nvme_realize().
>>
> 
> Understood, I will split it up!

Thanks, that will help the review.

As this series is quite big, I recommend you to split it, so part of it 
can get merged quicker and you don't have to carry tons of patches that 
scare reviewers/maintainers.

Suggestions:

- 1: cleanups/refactors
- 2: support v1.3
- 3: more refactors, strengthening code
- 4: improve DMA & S/G
- 5: support for multiple NS
- 6: tests for multiple NS feature
- 7: tests bus unplug/replug (idea)

Or less :)




* Re: [PATCH v7 45/48] nvme: support multiple namespaces
  2020-04-15  7:38   ` Philippe Mathieu-Daudé
@ 2020-04-15  8:02     ` Klaus Birkelund Jensen
  0 siblings, 0 replies; 72+ messages in thread
From: Klaus Birkelund Jensen @ 2020-04-15  8:02 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: Kevin Wolf, Beata Michalska, qemu-block, Klaus Jensen,
	qemu-devel, Max Reitz, Keith Busch, Javier Gonzalez,
	Maxim Levitsky

On Apr 15 09:38, Philippe Mathieu-Daudé wrote:
> On 4/15/20 7:51 AM, Klaus Jensen wrote:
> > From: Klaus Jensen <k.jensen@samsung.com>
> > 
> > This adds support for multiple namespaces by introducing a new 'nvme-ns'
> > device model. The nvme device creates a bus named from the device name
> > ('id'). The nvme-ns devices then connect to this and register
> > themselves with the nvme device.
> > 
> > This changes how an nvme device is created. Example with two namespaces:
> > 
> >    -drive file=nvme0n1.img,if=none,id=disk1
> >    -drive file=nvme0n2.img,if=none,id=disk2
> >    -device nvme,serial=deadbeef,id=nvme0
> >    -device nvme-ns,drive=disk1,bus=nvme0,nsid=1
> >    -device nvme-ns,drive=disk2,bus=nvme0,nsid=2
> > 
> > The drive property is kept on the nvme device to keep the change
> > backward compatible, but the property is now optional. Specifying a
> > drive for the nvme device will always create the namespace with nsid 1.
> > 
> > Signed-off-by: Klaus Jensen <klaus.jensen@cnexlabs.com>
> > Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> > Reviewed-by: Keith Busch <kbusch@kernel.org>
> > ---
> >   hw/block/Makefile.objs |   2 +-
> >   hw/block/nvme-ns.c     | 157 +++++++++++++++++++++++++++
> >   hw/block/nvme-ns.h     |  60 +++++++++++
> >   hw/block/nvme.c        | 233 +++++++++++++++++++++++++++--------------
> >   hw/block/nvme.h        |  47 ++++-----
> >   hw/block/trace-events  |   8 +-
> >   6 files changed, 396 insertions(+), 111 deletions(-)
> >   create mode 100644 hw/block/nvme-ns.c
> >   create mode 100644 hw/block/nvme-ns.h
> > 
> > diff --git a/hw/block/Makefile.objs b/hw/block/Makefile.objs
> > index 4b4a2b338dc4..d9141d6a4b9b 100644
> > --- a/hw/block/Makefile.objs
> > +++ b/hw/block/Makefile.objs
> > @@ -7,7 +7,7 @@ common-obj-$(CONFIG_PFLASH_CFI02) += pflash_cfi02.o
> >   common-obj-$(CONFIG_XEN) += xen-block.o
> >   common-obj-$(CONFIG_ECC) += ecc.o
> >   common-obj-$(CONFIG_ONENAND) += onenand.o
> > -common-obj-$(CONFIG_NVME_PCI) += nvme.o
> > +common-obj-$(CONFIG_NVME_PCI) += nvme.o nvme-ns.o
> >   common-obj-$(CONFIG_SWIM) += swim.o
> >   common-obj-$(CONFIG_SH4) += tc58128.o
> > diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c
> > new file mode 100644
> > index 000000000000..bd64d4a94632
> > --- /dev/null
> > +++ b/hw/block/nvme-ns.c
> > @@ -0,0 +1,157 @@
> 
> Missing copyright + license.
> 

Fixed.

> > +
> > +    switch (n->conf.wce) {
> > +    case ON_OFF_AUTO_ON:
> > +        n->features.volatile_wc = 1;
> > +        break;
> > +    case ON_OFF_AUTO_OFF:
> > +        n->features.volatile_wc = 0;
> 
> Missing 'break'?
> 

Ouch. Fixed.

> > +    case ON_OFF_AUTO_AUTO:
> > +        n->features.volatile_wc = blk_enable_write_cache(ns->blk);
> > +        break;
> > +    default:
> > +        abort();
> > +    }
> > +
> > +    blk_set_enable_write_cache(ns->blk, n->features.volatile_wc);
> > +
> > +    return 0;
> > +}
> > +
> > +static int nvme_ns_check_constraints(NvmeNamespace *ns, Error **errp)
> > +{
> > +    if (!ns->blk) {
> > +        error_setg(errp, "block backend not configured");
> > +        return -1;
> > +    }
> > +
> > +    return 0;
> > +}
> > +
> > +int nvme_ns_setup(NvmeCtrl *n, NvmeNamespace *ns, Error **errp)
> > +{
> > +    if (nvme_ns_check_constraints(ns, errp)) {
> > +        return -1;
> > +    }
> > +
> > +    if (nvme_ns_init_blk(n, ns, &n->id_ctrl, errp)) {
> > +        return -1;
> > +    }
> > +
> > +    nvme_ns_init(ns);
> > +    if (nvme_register_namespace(n, ns, errp)) {
> > +        return -1;
> > +    }
> > +
> > +    return 0;
> > +}
> > +
> > +static void nvme_ns_realize(DeviceState *dev, Error **errp)
> > +{
> > +    NvmeNamespace *ns = NVME_NS(dev);
> > +    BusState *s = qdev_get_parent_bus(dev);
> > +    NvmeCtrl *n = NVME(s->parent);
> > +    Error *local_err = NULL;
> > +
> > +    if (nvme_ns_setup(n, ns, &local_err)) {
> > +        error_propagate_prepend(errp, local_err,
> > +                                "could not setup namespace: ");
> > +        return;
> > +    }
> > +}
> > +
> > +static Property nvme_ns_props[] = {
> > +    DEFINE_NVME_NS_PROPERTIES(NvmeNamespace, params),
> > +    DEFINE_PROP_END_OF_LIST(),
> > +};
> > +
> > +static void nvme_ns_class_init(ObjectClass *oc, void *data)
> > +{
> > +    DeviceClass *dc = DEVICE_CLASS(oc);
> > +
> > +    set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
> > +
> > +    dc->bus_type = TYPE_NVME_BUS;
> > +    dc->realize = nvme_ns_realize;
> > +    device_class_set_props(dc, nvme_ns_props);
> > +    dc->desc = "virtual nvme namespace";
> 
> "Virtual NVMe namespace"?
> 

Fixed.

> > +}
> > +
> > +static void nvme_ns_instance_init(Object *obj)
> > +{
> > +    NvmeNamespace *ns = NVME_NS(obj);
> > +    char *bootindex = g_strdup_printf("/namespace@%d,0", ns->params.nsid);
> > +
> > +    device_add_bootindex_property(obj, &ns->bootindex, "bootindex",
> > +                                  bootindex, DEVICE(obj), &error_abort);
> > +
> > +    g_free(bootindex);
> > +}
> > +
> > +static const TypeInfo nvme_ns_info = {
> > +    .name = TYPE_NVME_NS,
> > +    .parent = TYPE_DEVICE,
> > +    .class_init = nvme_ns_class_init,
> > +    .instance_size = sizeof(NvmeNamespace),
> > +    .instance_init = nvme_ns_instance_init,
> > +};
> > +
> > +static void nvme_ns_register_types(void)
> > +{
> > +    type_register_static(&nvme_ns_info);
> > +}
> > +
> > +type_init(nvme_ns_register_types)
> > diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h
> > new file mode 100644
> > index 000000000000..3c3651d485d0
> > --- /dev/null
> > +++ b/hw/block/nvme-ns.h
> > @@ -0,0 +1,60 @@
> 
> Missing copyright + license.
> 

Fixed.

> > +#ifndef NVME_NS_H
> > +#define NVME_NS_H
> > +
> > +#define TYPE_NVME_NS "nvme-ns"
> > +#define NVME_NS(obj) \
> > +    OBJECT_CHECK(NvmeNamespace, (obj), TYPE_NVME_NS)
> > +
> > +#define DEFINE_NVME_NS_PROPERTIES(_state, _props) \
> > +    DEFINE_PROP_DRIVE("drive", _state, blk), \
> > +    DEFINE_PROP_UINT32("nsid", _state, _props.nsid, 0)
> > +
> > +typedef struct NvmeNamespaceParams {
> > +    uint32_t nsid;
> > +} NvmeNamespaceParams;
> > +
> > +typedef struct NvmeNamespace {
> > +    DeviceState  parent_obj;
> > +    BlockBackend *blk;
> > +    int32_t      bootindex;
> > +    int64_t      size;
> > +
> > +    NvmeIdNs            id_ns;
> > +    NvmeNamespaceParams params;
> > +} NvmeNamespace;
> > +
> > +static inline uint32_t nvme_nsid(NvmeNamespace *ns)
> > +{
> > +    if (ns) {
> > +        return ns->params.nsid;
> > +    }
> > +
> > +    return -1;
> 
> This case doesn't seem possible.
> 
> Maybe:
> 
>       assert(ns);
>       return ns->params.nsid;
> 

I have some later patches where this might happen in certain traces.
That's why I'm using a sentinel value. But I will see if I can clean it
up instead.

> > @@ -2607,26 +2666,28 @@ static int nvme_init_namespace(NvmeCtrl *n, NvmeNamespace *ns, Error **errp)
> >   static void nvme_realize(PCIDevice *pci_dev, Error **errp)
> >   {
> >       NvmeCtrl *n = NVME(pci_dev);
> > -    int i;
> > +    NvmeNamespace *ns;
> >       if (nvme_check_constraints(n, errp)) {
> >           return;
> >       }
> > +    qbus_create_inplace(&n->bus, sizeof(NvmeBus), TYPE_NVME_BUS,
> > +                        &pci_dev->qdev, n->parent_obj.qdev.id);
> > +
> >       nvme_init_state(n);
> > +    nvme_init_pci(n, pci_dev);
> > +    nvme_init_ctrl(n);
> 
> Argh you move these functions again... Why not put them in the correct place
> when you refactor nvme_realize()?
> 

"Argh" :) Yeah, I will do an effort to make this better!

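(For readers following the missing-break exchange above: with the fall-through
fixed, the write cache negotiation from the quoted hunk presumably ends up
along these lines; a sketch of the corrected fragment, not the final code:

    switch (n->conf.wce) {
    case ON_OFF_AUTO_ON:
        n->features.volatile_wc = 1;
        break;
    case ON_OFF_AUTO_OFF:
        n->features.volatile_wc = 0;
        break;          /* the break that was missing */
    case ON_OFF_AUTO_AUTO:
        n->features.volatile_wc = blk_enable_write_cache(ns->blk);
        break;
    default:
        abort();
    }

    blk_set_enable_write_cache(ns->blk, n->features.volatile_wc);

Without that break, an explicit wce=off would fall through to the AUTO case
and silently re-enable the cache whenever the backend has it enabled.)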



* Re: [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces
  2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
                   ` (48 preceding siblings ...)
  2020-04-15  7:17 ` [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces no-reply
@ 2020-04-15  8:02 ` no-reply
  49 siblings, 0 replies; 72+ messages in thread
From: no-reply @ 2020-04-15  8:02 UTC (permalink / raw)
  To: its
  Cc: kwolf, beata.michalska, qemu-block, k.jensen, qemu-devel, mreitz,
	kbusch, its, javier.gonz, mlevitsk

Patchew URL: https://patchew.org/QEMU/20200415055140.466900-1-its@irrelevant.dk/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Subject: [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces
Message-id: 20200415055140.466900-1-its@irrelevant.dk
Type: series

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 - [tag update]      patchew/20200412210954.32313-1-f4bug@amsat.org -> patchew/20200412210954.32313-1-f4bug@amsat.org
 - [tag update]      patchew/20200412223619.11284-1-f4bug@amsat.org -> patchew/20200412223619.11284-1-f4bug@amsat.org
 - [tag update]      patchew/20200415055140.466900-1-its@irrelevant.dk -> patchew/20200415055140.466900-1-its@irrelevant.dk
Switched to a new branch 'test'
4ab23cf nvme: make lba data size configurable
4c8a056 nvme: change controller pci id
506f543 pci: allocate pci id for nvme
95623e7 nvme: support multiple namespaces
d2da14b nvme: refactor identify active namespace id list
bbac04c nvme: add support for sgl bit bucket descriptor
f6c40e0 nvme: add support for scatter gather lists
591ccdf nvme: harden cmb access
c3ccf89 nvme: handle dma errors
75febcd pci: pass along the return value of dma_memory_rw
05048e8 nvme: use preallocated qsg/iov in nvme_dma_prp
9a939bd nvme: add nvme_check_rw helper
d1ff466 nvme: allow multiple aios per command
5dc3864 nvme: remove NvmeCmd parameter
0d6fc35 nvme: refactor NvmeRequest
a77654d nvme: be consistent about zeros vs zeroes
9f7ec64 nvme: add check for mdts
0f0481c nvme: refactor request bounds checking
a9208e4 nvme: verify validity of prp lists in the cmb
19b6a4b nvme: add request mapping helper
bad14d6 nvme: pass request along for tracing
3a9fc50 nvme: refactor dma read/write
e3e7e57 nvme: remove redundant has_sg member
98e643a nvme: replace dma_acct with blk_acct equivalent
8e2a196 nvme: add mapping helpers
6c8bfdf nvme: memset preallocated requests structures
2ded511 nvme: bump supported version to v1.3
f4ea0f2 nvme: provide the mandatory subnqn field
8bf580e nvme: enforce valid queue creation sequence
452a2ff nvme: support identify namespace descriptor list
9e8fd3d nvme: add log specific field to trace events
c9a9106 nvme: make sure ncqr and nsqr is valid
2b53bfa nvme: additional tracing
a212237 nvme: add missing mandatory features
a9a43cf nvme: add support for the asynchronous event request command
f5dcf98 nvme: add support for the get log page command
ce62c52 nvme: add temperature threshold feature
656c989 nvme: refactor device realization
509c349 nvme: remove redundant cmbloc/cmbsz members
42e29cd nvme: add max_ioqpairs device parameter
20e6f29 nvme: fix pci doorbell size calculation
acd84e3 nvme: add support for the abort command
eb96029 nvme: refactor nvme_addr_read
075de60 nvme: use constants in identify
16e4f0d nvme: bump spec data structures to v1.3
5ca5eeb nvme: move device parameters to separate struct
bc6fa5c nvme: remove superfluous breaks
ecbfef5 nvme: rename trace events to nvme_dev

=== OUTPUT BEGIN ===
1/48 Checking commit ecbfef51f105 (nvme: rename trace events to nvme_dev)
2/48 Checking commit bc6fa5cc4955 (nvme: remove superfluous breaks)
3/48 Checking commit 5ca5eeb5d92c (nvme: move device parameters to separate struct)
ERROR: Macros with complex values should be enclosed in parenthesis
#180: FILE: hw/block/nvme.h:6:
+#define DEFINE_NVME_PROPERTIES(_state, _props) \
+    DEFINE_PROP_STRING("serial", _state, _props.serial), \
+    DEFINE_PROP_UINT32("cmb_size_mb", _state, _props.cmb_size_mb, 0), \
+    DEFINE_PROP_UINT32("num_queues", _state, _props.num_queues, 64)

total: 1 errors, 0 warnings, 181 lines checked

Patch 3/48 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

4/48 Checking commit 16e4f0dd48eb (nvme: bump spec data structures to v1.3)
5/48 Checking commit 075de60b9c58 (nvme: use constants in identify)
6/48 Checking commit eb9602977cf4 (nvme: refactor nvme_addr_read)
7/48 Checking commit acd84e3bbf0d (nvme: add support for the abort command)
8/48 Checking commit 20e6f29fe81d (nvme: fix pci doorbell size calculation)
9/48 Checking commit 42e29cd4ba89 (nvme: add max_ioqpairs device parameter)
10/48 Checking commit 509c349c8755 (nvme: remove redundant cmbloc/cmbsz members)
11/48 Checking commit 656c98958103 (nvme: refactor device realization)
12/48 Checking commit ce62c5295b18 (nvme: add temperature threshold feature)
13/48 Checking commit f5dcf988ef1a (nvme: add support for the get log page command)
14/48 Checking commit a9a43cf3d965 (nvme: add support for the asynchronous event request command)
15/48 Checking commit a21223712932 (nvme: add missing mandatory features)
16/48 Checking commit 2b53bfa79af6 (nvme: additional tracing)
17/48 Checking commit c9a91061dbeb (nvme: make sure ncqr and nsqr is valid)
18/48 Checking commit 9e8fd3ddf0bc (nvme: add log specific field to trace events)
19/48 Checking commit 452a2ffe63ee (nvme: support identify namespace descriptor list)
20/48 Checking commit 8bf580e70657 (nvme: enforce valid queue creation sequence)
21/48 Checking commit f4ea0f21000e (nvme: provide the mandatory subnqn field)
22/48 Checking commit 2ded5117428e (nvme: bump supported version to v1.3)
23/48 Checking commit 6c8bfdfdedac (nvme: memset preallocated requests structures)
24/48 Checking commit 8e2a196c482a (nvme: add mapping helpers)
25/48 Checking commit 98e643a6ff9c (nvme: replace dma_acct with blk_acct equivalent)
26/48 Checking commit e3e7e5712935 (nvme: remove redundant has_sg member)
27/48 Checking commit 3a9fc50dd3d8 (nvme: refactor dma read/write)
28/48 Checking commit bad14d6db88c (nvme: pass request along for tracing)
29/48 Checking commit 19b6a4bc26d6 (nvme: add request mapping helper)
30/48 Checking commit a9208e457a15 (nvme: verify validity of prp lists in the cmb)
31/48 Checking commit 0f0481c13a90 (nvme: refactor request bounds checking)
32/48 Checking commit 9f7ec646dfb3 (nvme: add check for mdts)
33/48 Checking commit a77654d0651b (nvme: be consistent about zeros vs zeroes)
34/48 Checking commit 0d6fc3519806 (nvme: refactor NvmeRequest)
35/48 Checking commit 5dc386494d66 (nvme: remove NvmeCmd parameter)
36/48 Checking commit d1ff466e6519 (nvme: allow multiple aios per command)
37/48 Checking commit 9a939bdbbe7a (nvme: add nvme_check_rw helper)
38/48 Checking commit 05048e848b16 (nvme: use preallocated qsg/iov in nvme_dma_prp)
39/48 Checking commit 75febcd68a46 (pci: pass along the return value of dma_memory_rw)
40/48 Checking commit c3ccf898265d (nvme: handle dma errors)
41/48 Checking commit 591ccdf3c944 (nvme: harden cmb access)
42/48 Checking commit f6c40e0d1671 (nvme: add support for scatter gather lists)
43/48 Checking commit bbac04c4a9c3 (nvme: add support for sgl bit bucket descriptor)
44/48 Checking commit d2da14b49e99 (nvme: refactor identify active namespace id list)
45/48 Checking commit 95623e7d16d1 (nvme: support multiple namespaces)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#43: 
new file mode 100644

ERROR: Macros with complex values should be enclosed in parenthesis
#218: FILE: hw/block/nvme-ns.h:8:
+#define DEFINE_NVME_NS_PROPERTIES(_state, _props) \
+    DEFINE_PROP_DRIVE("drive", _state, blk), \
+    DEFINE_PROP_UINT32("nsid", _state, _props.nsid, 0)

total: 1 errors, 1 warnings, 841 lines checked

Patch 45/48 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

46/48 Checking commit 506f54399c6a (pci: allocate pci id for nvme)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#32: 
new file mode 100644

total: 0 errors, 1 warnings, 46 lines checked

Patch 46/48 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
47/48 Checking commit 4c8a056f5893 (nvme: change controller pci id)
48/48 Checking commit 4ab23cf5a34f (nvme: make lba data size configurable)
=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/20200415055140.466900-1-its@irrelevant.dk/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com
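
(A note on the two checkpatch errors in the log above: both macros expand to a
comma-separated run of entries inside a Property[] initializer, as in the
nvme_ns_props array quoted earlier in the thread, so wrapping the expansion in
parentheses would not compile. A reduced illustration, not QEMU code:

    struct entry { int value; };

    /* Expands to two array-initializer entries; parenthesising the
     * macro body would turn it into an invalid expression. */
    #define LIST_TWO_ENTRIES(a, b) \
        { .value = (a) },          \
        { .value = (b) }

    static struct entry entries[] = {
        LIST_TWO_ENTRIES(1, 2),
        { .value = 3 },
    };

Hence the warning cannot simply be silenced by adding parentheses here.)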


* Re: [PATCH v7 11/48] nvme: refactor device realization
  2020-04-15  7:55       ` Philippe Mathieu-Daudé
@ 2020-04-15  8:18         ` Klaus Birkelund Jensen
  0 siblings, 0 replies; 72+ messages in thread
From: Klaus Birkelund Jensen @ 2020-04-15  8:18 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: Kevin Wolf, Beata Michalska, qemu-block, Klaus Jensen,
	qemu-devel, Max Reitz, Keith Busch, Javier Gonzalez

On Apr 15 09:55, Philippe Mathieu-Daudé wrote:
> On 4/15/20 9:25 AM, Klaus Birkelund Jensen wrote:
> > On Apr 15 09:14, Philippe Mathieu-Daudé wrote:
> > > Hi Klaus,
> > > 
> > > This patch is a pain to review... Could you split it? I'd use one trivial
> > > patch for each function extracted from nvme_realize().
> > > 
> > 
> > Understood, I will split it up!
> 
> Thanks, that will help the review.
> 
> As this series is quite big, I recommend you to split it, so part of it can
> get merged quicker and you don't have to carry tons of patches that scare
> reviewers/maintainers.
> 
> Suggestions:
> 
> - 1: cleanups/refactors
> - 2: support v1.3
> - 3: more refactors, strengthening code
> - 4: improve DMA & S/G
> - 5: support for multiple NS
> - 6: tests for multiple NS feature
> - 7: tests bus unplug/replug (idea)
> 
> Or less :)
> 

Okay, good idea. Thanks.



* Re: [PATCH v7 46/48] pci: allocate pci id for nvme
  2020-04-15  5:51 ` [PATCH v7 46/48] pci: allocate pci id for nvme Klaus Jensen
@ 2020-04-21  9:19   ` Gerd Hoffmann
  0 siblings, 0 replies; 72+ messages in thread
From: Gerd Hoffmann @ 2020-04-21  9:19 UTC (permalink / raw)
  To: Klaus Jensen
  Cc: Kevin Wolf, Beata Michalska, qemu-block, Klaus Jensen,
	qemu-devel, Max Reitz, Keith Busch, Javier Gonzalez,
	Maxim Levitsky

On Wed, Apr 15, 2020 at 07:51:38AM +0200, Klaus Jensen wrote:
> From: Klaus Jensen <k.jensen@samsung.com>
> 
> The emulated nvme device (hw/block/nvme.c) is currently using an
> internal Intel device id.
> 
> Prepare to change that by allocating a device id under the 1b36 (Red
> Hat, Inc.) vendor id.
> 
> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> Cc: Gerd Hoffmann <kraxel@redhat.com>
> Acked-by: Keith Busch <kbusch@kernel.org>
> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Acked-by: Gerd Hoffmann <kraxel@redhat.com>




end of thread, other threads:[~2020-04-21  9:20 UTC | newest]

Thread overview: 72+ messages
2020-04-15  5:50 [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces Klaus Jensen
2020-04-15  5:50 ` [PATCH v7 01/48] nvme: rename trace events to nvme_dev Klaus Jensen
2020-04-15  5:50 ` [PATCH v7 02/48] nvme: remove superfluous breaks Klaus Jensen
2020-04-15  6:57   ` Philippe Mathieu-Daudé
2020-04-15  5:50 ` [PATCH v7 03/48] nvme: move device parameters to separate struct Klaus Jensen
2020-04-15  6:58   ` Philippe Mathieu-Daudé
2020-04-15  5:50 ` [PATCH v7 04/48] nvme: bump spec data structures to v1.3 Klaus Jensen
2020-04-15  5:50 ` [PATCH v7 05/48] nvme: use constants in identify Klaus Jensen
2020-04-15  7:01   ` Philippe Mathieu-Daudé
2020-04-15  5:50 ` [PATCH v7 06/48] nvme: refactor nvme_addr_read Klaus Jensen
2020-04-15  7:03   ` Philippe Mathieu-Daudé
2020-04-15  7:46     ` Klaus Birkelund Jensen
2020-04-15  5:50 ` [PATCH v7 07/48] nvme: add support for the abort command Klaus Jensen
2020-04-15  5:51 ` [PATCH v7 08/48] nvme: fix pci doorbell size calculation Klaus Jensen
2020-04-15  5:51 ` [PATCH v7 09/48] nvme: add max_ioqpairs device parameter Klaus Jensen
2020-04-15  5:51 ` [PATCH v7 10/48] nvme: remove redundant cmbloc/cmbsz members Klaus Jensen
2020-04-15  7:10   ` Philippe Mathieu-Daudé
2020-04-15  7:19     ` Klaus Birkelund Jensen
2020-04-15  7:48       ` Philippe Mathieu-Daudé
2020-04-15  5:51 ` [PATCH v7 11/48] nvme: refactor device realization Klaus Jensen
2020-04-15  7:14   ` Philippe Mathieu-Daudé
2020-04-15  7:25     ` Klaus Birkelund Jensen
2020-04-15  7:55       ` Philippe Mathieu-Daudé
2020-04-15  8:18         ` Klaus Birkelund Jensen
2020-04-15  5:51 ` [PATCH v7 12/48] nvme: add temperature threshold feature Klaus Jensen
2020-04-15  7:19   ` Philippe Mathieu-Daudé
2020-04-15  7:24     ` Klaus Birkelund Jensen
2020-04-15  7:28       ` Klaus Birkelund Jensen
2020-04-15  7:45         ` Philippe Mathieu-Daudé
2020-04-15  5:51 ` [PATCH v7 13/48] nvme: add support for the get log page command Klaus Jensen
2020-04-15  5:51 ` [PATCH v7 14/48] nvme: add support for the asynchronous event request command Klaus Jensen
2020-04-15  5:51 ` [PATCH v7 15/48] nvme: add missing mandatory features Klaus Jensen
2020-04-15  5:51 ` [PATCH v7 16/48] nvme: additional tracing Klaus Jensen
2020-04-15  5:51 ` [PATCH v7 17/48] nvme: make sure ncqr and nsqr is valid Klaus Jensen
2020-04-15  5:51 ` [PATCH v7 18/48] nvme: add log specific field to trace events Klaus Jensen
2020-04-15  5:51 ` [PATCH v7 19/48] nvme: support identify namespace descriptor list Klaus Jensen
2020-04-15  5:51 ` [PATCH v7 20/48] nvme: enforce valid queue creation sequence Klaus Jensen
2020-04-15  5:51 ` [PATCH v7 21/48] nvme: provide the mandatory subnqn field Klaus Jensen
2020-04-15  5:51 ` [PATCH v7 22/48] nvme: bump supported version to v1.3 Klaus Jensen
2020-04-15  5:51 ` [PATCH v7 23/48] nvme: memset preallocated requests structures Klaus Jensen
2020-04-15  5:51 ` [PATCH v7 24/48] nvme: add mapping helpers Klaus Jensen
2020-04-15  5:51 ` [PATCH v7 25/48] nvme: replace dma_acct with blk_acct equivalent Klaus Jensen
2020-04-15  5:51 ` [PATCH v7 26/48] nvme: remove redundant has_sg member Klaus Jensen
2020-04-15  5:51 ` [PATCH v7 27/48] nvme: refactor dma read/write Klaus Jensen
2020-04-15  5:51 ` [PATCH v7 28/48] nvme: pass request along for tracing Klaus Jensen
2020-04-15  5:51 ` [PATCH v7 29/48] nvme: add request mapping helper Klaus Jensen
2020-04-15  5:51 ` [PATCH v7 30/48] nvme: verify validity of prp lists in the cmb Klaus Jensen
2020-04-15  5:51 ` [PATCH v7 31/48] nvme: refactor request bounds checking Klaus Jensen
2020-04-15  5:51 ` [PATCH v7 32/48] nvme: add check for mdts Klaus Jensen
2020-04-15  5:51 ` [PATCH v7 33/48] nvme: be consistent about zeros vs zeroes Klaus Jensen
2020-04-15  5:51 ` [PATCH v7 34/48] nvme: refactor NvmeRequest Klaus Jensen
2020-04-15  5:51 ` [PATCH v7 35/48] nvme: remove NvmeCmd parameter Klaus Jensen
2020-04-15  5:51 ` [PATCH v7 36/48] nvme: allow multiple aios per command Klaus Jensen
2020-04-15  5:51 ` [PATCH v7 37/48] nvme: add nvme_check_rw helper Klaus Jensen
2020-04-15  5:51 ` [PATCH v7 38/48] nvme: use preallocated qsg/iov in nvme_dma_prp Klaus Jensen
2020-04-15  5:51 ` [PATCH v7 39/48] pci: pass along the return value of dma_memory_rw Klaus Jensen
2020-04-15  5:51 ` [PATCH v7 40/48] nvme: handle dma errors Klaus Jensen
2020-04-15  7:26   ` Philippe Mathieu-Daudé
2020-04-15  5:51 ` [PATCH v7 41/48] nvme: harden cmb access Klaus Jensen
2020-04-15  5:51 ` [PATCH v7 42/48] nvme: add support for scatter gather lists Klaus Jensen
2020-04-15  5:51 ` [PATCH v7 43/48] nvme: add support for sgl bit bucket descriptor Klaus Jensen
2020-04-15  5:51 ` [PATCH v7 44/48] nvme: refactor identify active namespace id list Klaus Jensen
2020-04-15  5:51 ` [PATCH v7 45/48] nvme: support multiple namespaces Klaus Jensen
2020-04-15  7:38   ` Philippe Mathieu-Daudé
2020-04-15  8:02     ` Klaus Birkelund Jensen
2020-04-15  5:51 ` [PATCH v7 46/48] pci: allocate pci id for nvme Klaus Jensen
2020-04-21  9:19   ` Gerd Hoffmann
2020-04-15  5:51 ` [PATCH v7 47/48] nvme: change controller pci id Klaus Jensen
2020-04-15  5:51 ` [PATCH v7 48/48] nvme: make lba data size configurable Klaus Jensen
2020-04-15  7:40   ` Philippe Mathieu-Daudé
2020-04-15  7:17 ` [PATCH v7 00/48] nvme: support NVMe v1.3d, SGLs and multiple namespaces no-reply
2020-04-15  8:02 ` no-reply
