* [PATCH v3 00/18] hw/block/nvme: bump to v1.3
@ 2020-07-06  6:12 Klaus Jensen
  2020-07-06  6:12 ` [PATCH v3 01/18] hw/block/nvme: bump spec data structures " Klaus Jensen
                   ` (18 more replies)
  0 siblings, 19 replies; 60+ messages in thread
From: Klaus Jensen @ 2020-07-06  6:12 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Dmitry Fomichev, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky,
	Philippe Mathieu-Daudé

From: Klaus Jensen <k.jensen@samsung.com>

This adds mandatory features of NVM Express v1.3 to the emulated NVMe
device.


v3:
  * hw/block/nvme: additional tracing
    - Reverse logic in nvme_cid(). (Philippe)
    - Move nvme_cid() and nvme_sqid() to source file. (Philippe)
  * hw/block/nvme: fix missing endian conversion
    - Move this patch to very early in the series and fix the bug properly as
      suggested by Philippe. Then let the change trickle down through
      the series. (Philippe)
  * hw/block/nvme: add remaining mandatory controller parameters
    - Move the nvme_feature_{support,default} arrays to the source file.
      (Philippe)
    - Add a NVME_FID_MAX constant. (Philippe)
  * hw/block/nvme: support the get/set features select and save fields
    - Move the nvme_feature_cap array to the source file. (Philippe)
  * hw/block/nvme: reject invalid nsid values in active namespace id list
    - Rework the condition and add a comment and reference to the spec.
      (Philippe)
  * hw/block/nvme: provide the mandatory subnqn field
    - Change to use strpadcpy(). (Philippe)

  Had to clear some R-b's due to functional changes.

  Missing review: 2, 3, 7, 12, 16, 17


v2:
  * hw/block/nvme: bump spec data structures to v1.3
    - Shorten some constants. (Dmitry)
  * hw/block/nvme: add temperature threshold feature
    - Remove unused temp_thresh member. (Dmitry)
  * hw/block/nvme: add support for the get log page command
    - Change the temperature field in the NvmeSmartLog struct to be a
      uint16_t and handle weird alignment by adding QEMU_PACKED to the
      struct. (Dmitry)
  * hw/block/nvme: add remaining mandatory controller parameters
    - Fix spelling. (Dmitry)
  * hw/block/nvme: support the get/set features select and save fields
    - Fix bad logic causing temperature thresholds to always report
      defaults. (Dmitry)
  * hw/block/nvme: reject invalid nsid values in active namespace id list
    - Added patch; reject the 0xfffffffe and 0xffffffff nsid values.


$ git-backport-diff -u for-master/bump-to-v1.3-v2 -r upstream/master... -S
Key:
[----] : patches are identical
[####] : number of functional differences between upstream/downstream patch
[down] : patch is downstream-only
The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively

001/18:[----] [--] 'hw/block/nvme: bump spec data structures to v1.3'
002/18:[0008] [FC] 'hw/block/nvme: fix missing endian conversion'
003/18:[0028] [FC] 'hw/block/nvme: additional tracing'
004/18:[----] [--] 'hw/block/nvme: add support for the abort command'
005/18:[0004] [FC] 'hw/block/nvme: add temperature threshold feature'
006/18:[----] [--] 'hw/block/nvme: mark fw slot 1 as read-only'
007/18:[----] [--] 'hw/block/nvme: add support for the get log page command'
008/18:[0002] [FC] 'hw/block/nvme: add support for the asynchronous event request command'
009/18:[----] [--] 'hw/block/nvme: move NvmeFeatureVal into hw/block/nvme.h'
010/18:[----] [--] 'hw/block/nvme: flush write cache when disabled'
011/18:[0044] [FC] 'hw/block/nvme: add remaining mandatory controller parameters'
012/18:[0024] [FC] 'hw/block/nvme: support the get/set features select and save fields'
013/18:[----] [--] 'hw/block/nvme: make sure ncqr and nsqr is valid'
014/18:[----] [--] 'hw/block/nvme: support identify namespace descriptor list'
015/18:[0008] [FC] 'hw/block/nvme: reject invalid nsid values in active namespace id list'
016/18:[----] [--] 'hw/block/nvme: enforce valid queue creation sequence'
017/18:[0006] [FC] 'hw/block/nvme: provide the mandatory subnqn field'
018/18:[----] [--] 'hw/block/nvme: bump supported version to v1.3'


Klaus Jensen (18):
  hw/block/nvme: bump spec data structures to v1.3
  hw/block/nvme: fix missing endian conversion
  hw/block/nvme: additional tracing
  hw/block/nvme: add support for the abort command
  hw/block/nvme: add temperature threshold feature
  hw/block/nvme: mark fw slot 1 as read-only
  hw/block/nvme: add support for the get log page command
  hw/block/nvme: add support for the asynchronous event request command
  hw/block/nvme: move NvmeFeatureVal into hw/block/nvme.h
  hw/block/nvme: flush write cache when disabled
  hw/block/nvme: add remaining mandatory controller parameters
  hw/block/nvme: support the get/set features select and save fields
  hw/block/nvme: make sure ncqr and nsqr is valid
  hw/block/nvme: support identify namespace descriptor list
  hw/block/nvme: reject invalid nsid values in active namespace id list
  hw/block/nvme: enforce valid queue creation sequence
  hw/block/nvme: provide the mandatory subnqn field
  hw/block/nvme: bump supported version to v1.3

 block/nvme.c          |  18 +-
 hw/block/nvme.c       | 676 ++++++++++++++++++++++++++++++++++++++++--
 hw/block/nvme.h       |  22 +-
 hw/block/trace-events |  27 +-
 include/block/nvme.h  | 225 +++++++++++---
 5 files changed, 892 insertions(+), 76 deletions(-)

-- 
2.27.0




* [PATCH v3 01/18] hw/block/nvme: bump spec data structures to v1.3
  2020-07-06  6:12 [PATCH v3 00/18] hw/block/nvme: bump to v1.3 Klaus Jensen
@ 2020-07-06  6:12 ` Klaus Jensen
  2020-07-08 19:19   ` Dmitry Fomichev
  2020-07-06  6:12 ` [PATCH v3 02/18] hw/block/nvme: fix missing endian conversion Klaus Jensen
                   ` (17 subsequent siblings)
  18 siblings, 1 reply; 60+ messages in thread
From: Klaus Jensen @ 2020-07-06  6:12 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Fam Zheng, Dmitry Fomichev, Klaus Jensen, qemu-devel,
	Max Reitz, Klaus Jensen, Keith Busch, Javier Gonzalez,
	Maxim Levitsky, Philippe Mathieu-Daudé

From: Klaus Jensen <k.jensen@samsung.com>

Add missing fields in the Identify Controller and Identify Namespace
data structures to bring them in line with NVMe v1.3.

This also adds data structures and defines for SGL support which
requires a couple of trivial changes to the nvme block driver as well.
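
As a rough illustration of how the new flags byte and the data pointer
union fit together, here is a minimal standalone sketch. The names are
local stand-ins (no NVME_ prefix) that mirror the definitions added to
include/block/nvme.h; cmd_uses_sgl() is a hypothetical helper and is not
part of this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Local mirror of the SGL descriptor layout added by this patch. */
typedef struct SglDescriptor {
    uint64_t addr;
    uint32_t len;
    uint8_t  rsvd[3];
    uint8_t  type;
} SglDescriptor;

/* The PRP pair and the SGL descriptor overlay the same 16 bytes. */
typedef union CmdDptr {
    struct {
        uint64_t prp1;
        uint64_t prp2;
    };
    SglDescriptor sgl;
} CmdDptr;

static_assert(sizeof(SglDescriptor) == 16, "SGL descriptor must be 16 bytes");
static_assert(sizeof(CmdDptr) == 16, "data pointer must stay 16 bytes");

/* Same bit positions as the macros in the patch, with arguments parenthesized. */
#define CMD_FLAGS_FUSE(flags) ((flags) & 0x3)
#define CMD_FLAGS_PSDT(flags) (((flags) >> 6) & 0x3)
#define SGL_TYPE(type)        (((type) >> 4) & 0xf)
#define SGL_SUBTYPE(type)     ((type) & 0xf)

enum {
    PSDT_PRP                 = 0x0,
    PSDT_SGL_MPTR_CONTIGUOUS = 0x1,
    PSDT_SGL_MPTR_SGL        = 0x2,
};

/* Hypothetical helper: nonzero if dptr should be interpreted as an SGL. */
static int cmd_uses_sgl(uint8_t flags)
{
    return CMD_FLAGS_PSDT(flags) != PSDT_PRP;
}
```

Because the union keeps the command at the same size, existing PRP users
only need the s/prp1/dptr.prp1/ style rename seen in the diff below.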

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Acked-by: Fam Zheng <fam@euphon.net>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 block/nvme.c         |  18 ++---
 hw/block/nvme.c      |  12 ++--
 include/block/nvme.h | 156 ++++++++++++++++++++++++++++++++++++++-----
 3 files changed, 154 insertions(+), 32 deletions(-)

diff --git a/block/nvme.c b/block/nvme.c
index 374e26891573..c1c4c07ac6cc 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -518,7 +518,7 @@ static void nvme_identify(BlockDriverState *bs, int namespace, Error **errp)
         error_setg(errp, "Cannot map buffer for DMA");
         goto out;
     }
-    cmd.prp1 = cpu_to_le64(iova);
+    cmd.dptr.prp1 = cpu_to_le64(iova);
 
     if (nvme_cmd_sync(bs, s->queues[0], &cmd)) {
         error_setg(errp, "Failed to identify controller");
@@ -629,7 +629,7 @@ static bool nvme_add_io_queue(BlockDriverState *bs, Error **errp)
     }
     cmd = (NvmeCmd) {
         .opcode = NVME_ADM_CMD_CREATE_CQ,
-        .prp1 = cpu_to_le64(q->cq.iova),
+        .dptr.prp1 = cpu_to_le64(q->cq.iova),
         .cdw10 = cpu_to_le32(((queue_size - 1) << 16) | (n & 0xFFFF)),
         .cdw11 = cpu_to_le32(0x3),
     };
@@ -640,7 +640,7 @@ static bool nvme_add_io_queue(BlockDriverState *bs, Error **errp)
     }
     cmd = (NvmeCmd) {
         .opcode = NVME_ADM_CMD_CREATE_SQ,
-        .prp1 = cpu_to_le64(q->sq.iova),
+        .dptr.prp1 = cpu_to_le64(q->sq.iova),
         .cdw10 = cpu_to_le32(((queue_size - 1) << 16) | (n & 0xFFFF)),
         .cdw11 = cpu_to_le32(0x1 | (n << 16)),
     };
@@ -988,16 +988,16 @@ try_map:
     case 0:
         abort();
     case 1:
-        cmd->prp1 = pagelist[0];
-        cmd->prp2 = 0;
+        cmd->dptr.prp1 = pagelist[0];
+        cmd->dptr.prp2 = 0;
         break;
     case 2:
-        cmd->prp1 = pagelist[0];
-        cmd->prp2 = pagelist[1];
+        cmd->dptr.prp1 = pagelist[0];
+        cmd->dptr.prp2 = pagelist[1];
         break;
     default:
-        cmd->prp1 = pagelist[0];
-        cmd->prp2 = cpu_to_le64(req->prp_list_iova + sizeof(uint64_t));
+        cmd->dptr.prp1 = pagelist[0];
+        cmd->dptr.prp2 = cpu_to_le64(req->prp_list_iova + sizeof(uint64_t));
         break;
     }
     trace_nvme_cmd_map_qiov(s, cmd, req, qiov, entries);
diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 1aee042d4cb2..71b388aa0e20 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -397,8 +397,8 @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeNamespace *ns, NvmeCmd *cmd,
     NvmeRwCmd *rw = (NvmeRwCmd *)cmd;
     uint32_t nlb  = le32_to_cpu(rw->nlb) + 1;
     uint64_t slba = le64_to_cpu(rw->slba);
-    uint64_t prp1 = le64_to_cpu(rw->prp1);
-    uint64_t prp2 = le64_to_cpu(rw->prp2);
+    uint64_t prp1 = le64_to_cpu(rw->dptr.prp1);
+    uint64_t prp2 = le64_to_cpu(rw->dptr.prp2);
 
     uint8_t lba_index  = NVME_ID_NS_FLBAS_INDEX(ns->id_ns.flbas);
     uint8_t data_shift = ns->id_ns.lbaf[lba_index].ds;
@@ -795,8 +795,8 @@ static inline uint64_t nvme_get_timestamp(const NvmeCtrl *n)
 
 static uint16_t nvme_get_feature_timestamp(NvmeCtrl *n, NvmeCmd *cmd)
 {
-    uint64_t prp1 = le64_to_cpu(cmd->prp1);
-    uint64_t prp2 = le64_to_cpu(cmd->prp2);
+    uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
+    uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
 
     uint64_t timestamp = nvme_get_timestamp(n);
 
@@ -834,8 +834,8 @@ static uint16_t nvme_set_feature_timestamp(NvmeCtrl *n, NvmeCmd *cmd)
 {
     uint16_t ret;
     uint64_t timestamp;
-    uint64_t prp1 = le64_to_cpu(cmd->prp1);
-    uint64_t prp2 = le64_to_cpu(cmd->prp2);
+    uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
+    uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
 
     ret = nvme_dma_write_prp(n, (uint8_t *)&timestamp,
                                 sizeof(timestamp), prp1, prp2);
diff --git a/include/block/nvme.h b/include/block/nvme.h
index 1720ee1d5158..2a80d2a7ed89 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -377,15 +377,53 @@ enum NvmePmrmscMask {
 #define NVME_PMRMSC_SET_CBA(pmrmsc, val)   \
     (pmrmsc |= (uint64_t)(val & PMRMSC_CBA_MASK) << PMRMSC_CBA_SHIFT)
 
+enum NvmeSglDescriptorType {
+    NVME_SGL_DESCR_TYPE_DATA_BLOCK          = 0x0,
+    NVME_SGL_DESCR_TYPE_BIT_BUCKET          = 0x1,
+    NVME_SGL_DESCR_TYPE_SEGMENT             = 0x2,
+    NVME_SGL_DESCR_TYPE_LAST_SEGMENT        = 0x3,
+    NVME_SGL_DESCR_TYPE_KEYED_DATA_BLOCK    = 0x4,
+
+    NVME_SGL_DESCR_TYPE_VENDOR_SPECIFIC     = 0xf,
+};
+
+enum NvmeSglDescriptorSubtype {
+    NVME_SGL_DESCR_SUBTYPE_ADDRESS = 0x0,
+};
+
+typedef struct NvmeSglDescriptor {
+    uint64_t addr;
+    uint32_t len;
+    uint8_t  rsvd[3];
+    uint8_t  type;
+} NvmeSglDescriptor;
+
+#define NVME_SGL_TYPE(type)     ((type >> 4) & 0xf)
+#define NVME_SGL_SUBTYPE(type)  (type & 0xf)
+
+typedef union NvmeCmdDptr {
+    struct {
+        uint64_t    prp1;
+        uint64_t    prp2;
+    };
+
+    NvmeSglDescriptor sgl;
+} NvmeCmdDptr;
+
+enum NvmePsdt {
+    PSDT_PRP                 = 0x0,
+    PSDT_SGL_MPTR_CONTIGUOUS = 0x1,
+    PSDT_SGL_MPTR_SGL        = 0x2,
+};
+
 typedef struct NvmeCmd {
     uint8_t     opcode;
-    uint8_t     fuse;
+    uint8_t     flags;
     uint16_t    cid;
     uint32_t    nsid;
     uint64_t    res1;
     uint64_t    mptr;
-    uint64_t    prp1;
-    uint64_t    prp2;
+    NvmeCmdDptr dptr;
     uint32_t    cdw10;
     uint32_t    cdw11;
     uint32_t    cdw12;
@@ -394,6 +432,9 @@ typedef struct NvmeCmd {
     uint32_t    cdw15;
 } NvmeCmd;
 
+#define NVME_CMD_FLAGS_FUSE(flags) (flags & 0x3)
+#define NVME_CMD_FLAGS_PSDT(flags) ((flags >> 6) & 0x3)
+
 enum NvmeAdminCommands {
     NVME_ADM_CMD_DELETE_SQ      = 0x00,
     NVME_ADM_CMD_CREATE_SQ      = 0x01,
@@ -493,8 +534,7 @@ typedef struct NvmeRwCmd {
     uint32_t    nsid;
     uint64_t    rsvd2;
     uint64_t    mptr;
-    uint64_t    prp1;
-    uint64_t    prp2;
+    NvmeCmdDptr dptr;
     uint64_t    slba;
     uint16_t    nlb;
     uint16_t    control;
@@ -534,8 +574,7 @@ typedef struct NvmeDsmCmd {
     uint16_t    cid;
     uint32_t    nsid;
     uint64_t    rsvd2[2];
-    uint64_t    prp1;
-    uint64_t    prp2;
+    NvmeCmdDptr dptr;
     uint32_t    nr;
     uint32_t    attributes;
     uint32_t    rsvd12[4];
@@ -599,6 +638,12 @@ enum NvmeStatusCodes {
     NVME_CMD_ABORT_MISSING_FUSE = 0x000a,
     NVME_INVALID_NSID           = 0x000b,
     NVME_CMD_SEQ_ERROR          = 0x000c,
+    NVME_INVALID_SGL_SEG_DESCR  = 0x000d,
+    NVME_INVALID_NUM_SGL_DESCRS = 0x000e,
+    NVME_DATA_SGL_LEN_INVALID   = 0x000f,
+    NVME_MD_SGL_LEN_INVALID     = 0x0010,
+    NVME_SGL_DESCR_TYPE_INVALID = 0x0011,
+    NVME_INVALID_USE_OF_CMB     = 0x0012,
     NVME_LBA_RANGE              = 0x0080,
     NVME_CAP_EXCEEDED           = 0x0081,
     NVME_NS_NOT_READY           = 0x0082,
@@ -687,7 +732,7 @@ enum NvmeSmartWarn {
     NVME_SMART_FAILED_VOLATILE_MEDIA  = 1 << 4,
 };
 
-enum LogIdentifier {
+enum NvmeLogIdentifier {
     NVME_LOG_ERROR_INFO     = 0x01,
     NVME_LOG_SMART_INFO     = 0x02,
     NVME_LOG_FW_SLOT_INFO   = 0x03,
@@ -711,6 +756,7 @@ enum {
     NVME_ID_CNS_NS             = 0x0,
     NVME_ID_CNS_CTRL           = 0x1,
     NVME_ID_CNS_NS_ACTIVE_LIST = 0x2,
+    NVME_ID_CNS_NS_DESCR_LIST  = 0x3,
 };
 
 typedef struct NvmeIdCtrl {
@@ -723,7 +769,15 @@ typedef struct NvmeIdCtrl {
     uint8_t     ieee[3];
     uint8_t     cmic;
     uint8_t     mdts;
-    uint8_t     rsvd255[178];
+    uint16_t    cntlid;
+    uint32_t    ver;
+    uint32_t    rtd3r;
+    uint32_t    rtd3e;
+    uint32_t    oaes;
+    uint32_t    ctratt;
+    uint8_t     rsvd100[12];
+    uint8_t     fguid[16];
+    uint8_t     rsvd128[128];
     uint16_t    oacs;
     uint8_t     acl;
     uint8_t     aerl;
@@ -731,10 +785,28 @@ typedef struct NvmeIdCtrl {
     uint8_t     lpa;
     uint8_t     elpe;
     uint8_t     npss;
-    uint8_t     rsvd511[248];
+    uint8_t     avscc;
+    uint8_t     apsta;
+    uint16_t    wctemp;
+    uint16_t    cctemp;
+    uint16_t    mtfa;
+    uint32_t    hmpre;
+    uint32_t    hmmin;
+    uint8_t     tnvmcap[16];
+    uint8_t     unvmcap[16];
+    uint32_t    rpmbs;
+    uint16_t    edstt;
+    uint8_t     dsto;
+    uint8_t     fwug;
+    uint16_t    kas;
+    uint16_t    hctma;
+    uint16_t    mntmt;
+    uint16_t    mxtmt;
+    uint32_t    sanicap;
+    uint8_t     rsvd332[180];
     uint8_t     sqes;
     uint8_t     cqes;
-    uint16_t    rsvd515;
+    uint16_t    maxcmd;
     uint32_t    nn;
     uint16_t    oncs;
     uint16_t    fuses;
@@ -742,8 +814,14 @@ typedef struct NvmeIdCtrl {
     uint8_t     vwc;
     uint16_t    awun;
     uint16_t    awupf;
-    uint8_t     rsvd703[174];
-    uint8_t     rsvd2047[1344];
+    uint8_t     nvscc;
+    uint8_t     rsvd531;
+    uint16_t    acwu;
+    uint8_t     rsvd534[2];
+    uint32_t    sgls;
+    uint8_t     rsvd540[228];
+    uint8_t     subnqn[256];
+    uint8_t     rsvd1024[1024];
     NvmePSD     psd[32];
     uint8_t     vs[1024];
 } NvmeIdCtrl;
@@ -769,6 +847,16 @@ enum NvmeIdCtrlOncs {
 #define NVME_CTRL_CQES_MIN(cqes) ((cqes) & 0xf)
 #define NVME_CTRL_CQES_MAX(cqes) (((cqes) >> 4) & 0xf)
 
+#define NVME_CTRL_SGLS_SUPPORT_MASK        (0x3 <<  0)
+#define NVME_CTRL_SGLS_SUPPORT_NO_ALIGN    (0x1 <<  0)
+#define NVME_CTRL_SGLS_SUPPORT_DWORD_ALIGN (0x1 <<  1)
+#define NVME_CTRL_SGLS_KEYED               (0x1 <<  2)
+#define NVME_CTRL_SGLS_BITBUCKET           (0x1 << 16)
+#define NVME_CTRL_SGLS_MPTR_CONTIGUOUS     (0x1 << 17)
+#define NVME_CTRL_SGLS_EXCESS_LENGTH       (0x1 << 18)
+#define NVME_CTRL_SGLS_MPTR_SGL            (0x1 << 19)
+#define NVME_CTRL_SGLS_ADDR_OFFSET         (0x1 << 20)
+
 typedef struct NvmeFeatureVal {
     uint32_t    arbitration;
     uint32_t    power_mgmt;
@@ -791,6 +879,15 @@ typedef struct NvmeFeatureVal {
 #define NVME_INTC_THR(intc)     (intc & 0xff)
 #define NVME_INTC_TIME(intc)    ((intc >> 8) & 0xff)
 
+#define NVME_TEMP_THSEL(temp)  ((temp >> 20) & 0x3)
+#define NVME_TEMP_THSEL_OVER   0x0
+#define NVME_TEMP_THSEL_UNDER  0x1
+
+#define NVME_TEMP_TMPSEL(temp)     ((temp >> 16) & 0xf)
+#define NVME_TEMP_TMPSEL_COMPOSITE 0x0
+
+#define NVME_TEMP_TMPTH(temp) ((temp >>  0) & 0xffff)
+
 enum NvmeFeatureIds {
     NVME_ARBITRATION                = 0x1,
     NVME_POWER_MANAGEMENT           = 0x2,
@@ -833,18 +930,43 @@ typedef struct NvmeIdNs {
     uint8_t     mc;
     uint8_t     dpc;
     uint8_t     dps;
-
     uint8_t     nmic;
     uint8_t     rescap;
     uint8_t     fpi;
     uint8_t     dlfeat;
-
-    uint8_t     res34[94];
+    uint16_t    nawun;
+    uint16_t    nawupf;
+    uint16_t    nacwu;
+    uint16_t    nabsn;
+    uint16_t    nabo;
+    uint16_t    nabspf;
+    uint16_t    noiob;
+    uint8_t     nvmcap[16];
+    uint8_t     rsvd64[40];
+    uint8_t     nguid[16];
+    uint64_t    eui64;
     NvmeLBAF    lbaf[16];
-    uint8_t     res192[192];
+    uint8_t     rsvd192[192];
     uint8_t     vs[3712];
 } NvmeIdNs;
 
+typedef struct NvmeIdNsDescr {
+    uint8_t nidt;
+    uint8_t nidl;
+    uint8_t rsvd2[2];
+} NvmeIdNsDescr;
+
+enum {
+    NVME_NIDT_EUI64_LEN =  8,
+    NVME_NIDT_NGUID_LEN = 16,
+    NVME_NIDT_UUID_LEN  = 16,
+};
+
+enum NvmeNsIdentifierType {
+    NVME_NIDT_EUI64 = 0x1,
+    NVME_NIDT_NGUID = 0x2,
+    NVME_NIDT_UUID  = 0x3,
+};
 
 /*Deallocate Logical Block Features*/
 #define NVME_ID_NS_DLFEAT_GUARD_CRC(dlfeat)       ((dlfeat) & 0x10)
-- 
2.27.0




* [PATCH v3 02/18] hw/block/nvme: fix missing endian conversion
  2020-07-06  6:12 [PATCH v3 00/18] hw/block/nvme: bump to v1.3 Klaus Jensen
  2020-07-06  6:12 ` [PATCH v3 01/18] hw/block/nvme: bump spec data structures " Klaus Jensen
@ 2020-07-06  6:12 ` Klaus Jensen
  2020-07-06  9:50   ` Philippe Mathieu-Daudé
                     ` (2 more replies)
  2020-07-06  6:12 ` [PATCH v3 03/18] hw/block/nvme: additional tracing Klaus Jensen
                   ` (16 subsequent siblings)
  18 siblings, 3 replies; 60+ messages in thread
From: Klaus Jensen @ 2020-07-06  6:12 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Dmitry Fomichev, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky,
	Philippe Mathieu-Daudé

From: Klaus Jensen <k.jensen@samsung.com>

Fix a missing cpu_to_le32() conversion by moving the conversion to just
before returning, so that it applies to every feature value.
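
The pattern, sketched standalone: compute every result in host byte
order and convert exactly once at the single exit point. to_le32() is a
hypothetical stand-in for cpu_to_le32(), and numq_result() and
cqe_result() are illustrative names only; none of them exist in the
device code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for cpu_to_le32(): swap only on big-endian hosts. */
static uint32_t to_le32(uint32_t v)
{
    const uint16_t probe = 1;

    if (*(const uint8_t *)&probe == 1) {
        return v; /* little-endian host, nothing to do */
    }

    return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8) |
           ((v & 0x00ff0000u) >> 8)  | ((v & 0xff000000u) >> 24);
}

/* Number of Queues value in host order: (n - 1) in both 16-bit halves. */
static uint32_t numq_result(uint32_t max_ioqpairs)
{
    assert(max_ioqpairs >= 1 && max_ioqpairs <= 0x10000);
    return (max_ioqpairs - 1) | ((max_ioqpairs - 1) << 16);
}

/* Single conversion point, applied to whatever the switch produced. */
static uint32_t cqe_result(uint32_t host_value)
{
    return to_le32(host_value);
}
```

Keeping the value in host order until the end also means the trace call
logs the same number on little-endian and big-endian hosts.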

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Suggested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
---
 hw/block/nvme.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 71b388aa0e20..766cd5b33bb1 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -815,8 +815,8 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
         trace_pci_nvme_getfeat_vwcache(result ? "enabled" : "disabled");
         break;
     case NVME_NUMBER_OF_QUEUES:
-        result = cpu_to_le32((n->params.max_ioqpairs - 1) |
-                             ((n->params.max_ioqpairs - 1) << 16));
+        result = (n->params.max_ioqpairs - 1) |
+            ((n->params.max_ioqpairs - 1) << 16);
         trace_pci_nvme_getfeat_numq(result);
         break;
     case NVME_TIMESTAMP:
@@ -826,7 +826,7 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
         return NVME_INVALID_FIELD | NVME_DNR;
     }
 
-    req->cqe.result = result;
+    req->cqe.result = cpu_to_le32(result);
     return NVME_SUCCESS;
 }
 
-- 
2.27.0




* [PATCH v3 03/18] hw/block/nvme: additional tracing
  2020-07-06  6:12 [PATCH v3 00/18] hw/block/nvme: bump to v1.3 Klaus Jensen
  2020-07-06  6:12 ` [PATCH v3 01/18] hw/block/nvme: bump spec data structures " Klaus Jensen
  2020-07-06  6:12 ` [PATCH v3 02/18] hw/block/nvme: fix missing endian conversion Klaus Jensen
@ 2020-07-06  6:12 ` Klaus Jensen
  2020-07-06  9:50   ` Philippe Mathieu-Daudé
                     ` (2 more replies)
  2020-07-06  6:12 ` [PATCH v3 04/18] hw/block/nvme: add support for the abort command Klaus Jensen
                   ` (15 subsequent siblings)
  18 siblings, 3 replies; 60+ messages in thread
From: Klaus Jensen @ 2020-07-06  6:12 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Dmitry Fomichev, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky,
	Philippe Mathieu-Daudé

From: Klaus Jensen <k.jensen@samsung.com>

Add various additional tracing and streamline nvme_identify_ns and
nvme_identify_nslist (they do not need to repeat the command; it is
already in the trace name).

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
---
 hw/block/nvme.c       | 33 +++++++++++++++++++++++++++++++++
 hw/block/trace-events | 13 +++++++++++--
 2 files changed, 44 insertions(+), 2 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 766cd5b33bb1..09ef54d771c4 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -69,6 +69,20 @@
 
 static void nvme_process_sq(void *opaque);
 
+static uint16_t nvme_cid(NvmeRequest *req)
+{
+    if (!req) {
+        return 0xffff;
+    }
+
+    return le16_to_cpu(req->cqe.cid);
+}
+
+static uint16_t nvme_sqid(NvmeRequest *req)
+{
+    return le16_to_cpu(req->sq->sqid);
+}
+
 static bool nvme_addr_is_cmb(NvmeCtrl *n, hwaddr addr)
 {
     hwaddr low = n->ctrl_mem.addr;
@@ -331,6 +345,8 @@ static void nvme_post_cqes(void *opaque)
 static void nvme_enqueue_req_completion(NvmeCQueue *cq, NvmeRequest *req)
 {
     assert(cq->cqid == req->sq->cqid);
+    trace_pci_nvme_enqueue_req_completion(nvme_cid(req), cq->cqid,
+                                          req->status);
     QTAILQ_REMOVE(&req->sq->out_req_list, req, entry);
     QTAILQ_INSERT_TAIL(&cq->req_list, req, entry);
     timer_mod(cq->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + 500);
@@ -343,6 +359,8 @@ static void nvme_rw_cb(void *opaque, int ret)
     NvmeCtrl *n = sq->ctrl;
     NvmeCQueue *cq = n->cq[sq->cqid];
 
+    trace_pci_nvme_rw_cb(nvme_cid(req));
+
     if (!ret) {
         block_acct_done(blk_get_stats(n->conf.blk), &req->acct);
         req->status = NVME_SUCCESS;
@@ -378,6 +396,8 @@ static uint16_t nvme_write_zeros(NvmeCtrl *n, NvmeNamespace *ns, NvmeCmd *cmd,
     uint64_t offset = slba << data_shift;
     uint32_t count = nlb << data_shift;
 
+    trace_pci_nvme_write_zeroes(nvme_cid(req), slba, nlb);
+
     if (unlikely(slba + nlb > ns->id_ns.nsze)) {
         trace_pci_nvme_err_invalid_lba_range(slba, nlb, ns->id_ns.nsze);
         return NVME_LBA_RANGE | NVME_DNR;
@@ -445,6 +465,8 @@ static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
     NvmeNamespace *ns;
     uint32_t nsid = le32_to_cpu(cmd->nsid);
 
+    trace_pci_nvme_io_cmd(nvme_cid(req), nsid, nvme_sqid(req), cmd->opcode);
+
     if (unlikely(nsid == 0 || nsid > n->num_namespaces)) {
         trace_pci_nvme_err_invalid_ns(nsid, n->num_namespaces);
         return NVME_INVALID_NSID | NVME_DNR;
@@ -876,6 +898,8 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
 
 static uint16_t nvme_admin_cmd(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
 {
+    trace_pci_nvme_admin_cmd(nvme_cid(req), nvme_sqid(req), cmd->opcode);
+
     switch (cmd->opcode) {
     case NVME_ADM_CMD_DELETE_SQ:
         return nvme_del_sq(n, cmd);
@@ -1204,6 +1228,8 @@ static uint64_t nvme_mmio_read(void *opaque, hwaddr addr, unsigned size)
     uint8_t *ptr = (uint8_t *)&n->bar;
     uint64_t val = 0;
 
+    trace_pci_nvme_mmio_read(addr);
+
     if (unlikely(addr & (sizeof(uint32_t) - 1))) {
         NVME_GUEST_ERR(pci_nvme_ub_mmiord_misaligned32,
                        "MMIO read not 32-bit aligned,"
@@ -1273,6 +1299,8 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
             return;
         }
 
+        trace_pci_nvme_mmio_doorbell_cq(cq->cqid, new_head);
+
         start_sqs = nvme_cq_full(cq) ? 1 : 0;
         cq->head = new_head;
         if (start_sqs) {
@@ -1311,6 +1339,8 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
             return;
         }
 
+        trace_pci_nvme_mmio_doorbell_sq(sq->sqid, new_tail);
+
         sq->tail = new_tail;
         timer_mod(sq->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + 500);
     }
@@ -1320,6 +1350,9 @@ static void nvme_mmio_write(void *opaque, hwaddr addr, uint64_t data,
     unsigned size)
 {
     NvmeCtrl *n = (NvmeCtrl *)opaque;
+
+    trace_pci_nvme_mmio_write(addr, data);
+
     if (addr < sizeof(n->bar)) {
         nvme_write_bar(n, addr, data, size);
     } else if (addr >= 0x1000) {
diff --git a/hw/block/trace-events b/hw/block/trace-events
index 958fcc5508d1..c40c0d2e4b28 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -33,19 +33,28 @@ pci_nvme_irq_msix(uint32_t vector) "raising MSI-X IRQ vector %u"
 pci_nvme_irq_pin(void) "pulsing IRQ pin"
 pci_nvme_irq_masked(void) "IRQ is masked"
 pci_nvme_dma_read(uint64_t prp1, uint64_t prp2) "DMA read, prp1=0x%"PRIx64" prp2=0x%"PRIx64""
+pci_nvme_io_cmd(uint16_t cid, uint32_t nsid, uint16_t sqid, uint8_t opcode) "cid %"PRIu16" nsid %"PRIu32" sqid %"PRIu16" opc 0x%"PRIx8""
+pci_nvme_admin_cmd(uint16_t cid, uint16_t sqid, uint8_t opcode) "cid %"PRIu16" sqid %"PRIu16" opc 0x%"PRIx8""
 pci_nvme_rw(const char *verb, uint32_t blk_count, uint64_t byte_count, uint64_t lba) "%s %"PRIu32" blocks (%"PRIu64" bytes) from LBA %"PRIu64""
+pci_nvme_rw_cb(uint16_t cid) "cid %"PRIu16""
+pci_nvme_write_zeroes(uint16_t cid, uint64_t slba, uint32_t nlb) "cid %"PRIu16" slba %"PRIu64" nlb %"PRIu32""
 pci_nvme_create_sq(uint64_t addr, uint16_t sqid, uint16_t cqid, uint16_t qsize, uint16_t qflags) "create submission queue, addr=0x%"PRIx64", sqid=%"PRIu16", cqid=%"PRIu16", qsize=%"PRIu16", qflags=%"PRIu16""
 pci_nvme_create_cq(uint64_t addr, uint16_t cqid, uint16_t vector, uint16_t size, uint16_t qflags, int ien) "create completion queue, addr=0x%"PRIx64", cqid=%"PRIu16", vector=%"PRIu16", qsize=%"PRIu16", qflags=%"PRIu16", ien=%d"
 pci_nvme_del_sq(uint16_t qid) "deleting submission queue sqid=%"PRIu16""
 pci_nvme_del_cq(uint16_t cqid) "deleted completion queue, cqid=%"PRIu16""
 pci_nvme_identify_ctrl(void) "identify controller"
-pci_nvme_identify_ns(uint16_t ns) "identify namespace, nsid=%"PRIu16""
-pci_nvme_identify_nslist(uint16_t ns) "identify namespace list, nsid=%"PRIu16""
+pci_nvme_identify_ns(uint32_t ns) "nsid %"PRIu32""
+pci_nvme_identify_nslist(uint32_t ns) "nsid %"PRIu32""
 pci_nvme_getfeat_vwcache(const char* result) "get feature volatile write cache, result=%s"
 pci_nvme_getfeat_numq(int result) "get feature number of queues, result=%d"
 pci_nvme_setfeat_numq(int reqcq, int reqsq, int gotcq, int gotsq) "requested cq_count=%d sq_count=%d, responding with cq_count=%d sq_count=%d"
 pci_nvme_setfeat_timestamp(uint64_t ts) "set feature timestamp = 0x%"PRIx64""
 pci_nvme_getfeat_timestamp(uint64_t ts) "get feature timestamp = 0x%"PRIx64""
+pci_nvme_enqueue_req_completion(uint16_t cid, uint16_t cqid, uint16_t status) "cid %"PRIu16" cqid %"PRIu16" status 0x%"PRIx16""
+pci_nvme_mmio_read(uint64_t addr) "addr 0x%"PRIx64""
+pci_nvme_mmio_write(uint64_t addr, uint64_t data) "addr 0x%"PRIx64" data 0x%"PRIx64""
+pci_nvme_mmio_doorbell_cq(uint16_t cqid, uint16_t new_head) "cqid %"PRIu16" new_head %"PRIu16""
+pci_nvme_mmio_doorbell_sq(uint16_t sqid, uint16_t new_tail) "sqid %"PRIu16" new_tail %"PRIu16""
 pci_nvme_mmio_intm_set(uint64_t data, uint64_t new_mask) "wrote MMIO, interrupt mask set, data=0x%"PRIx64", new_mask=0x%"PRIx64""
 pci_nvme_mmio_intm_clr(uint64_t data, uint64_t new_mask) "wrote MMIO, interrupt mask clr, data=0x%"PRIx64", new_mask=0x%"PRIx64""
 pci_nvme_mmio_cfg(uint64_t data) "wrote MMIO, config controller config=0x%"PRIx64""
-- 
2.27.0




* [PATCH v3 04/18] hw/block/nvme: add support for the abort command
  2020-07-06  6:12 [PATCH v3 00/18] hw/block/nvme: bump to v1.3 Klaus Jensen
                   ` (2 preceding siblings ...)
  2020-07-06  6:12 ` [PATCH v3 03/18] hw/block/nvme: additional tracing Klaus Jensen
@ 2020-07-06  6:12 ` Klaus Jensen
  2020-07-06  6:12 ` [PATCH v3 05/18] hw/block/nvme: add temperature threshold feature Klaus Jensen
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 60+ messages in thread
From: Klaus Jensen @ 2020-07-06  6:12 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Dmitry Fomichev, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky,
	Philippe Mathieu-Daudé

From: Klaus Jensen <k.jensen@samsung.com>

Required for compliance with NVMe revision 1.3d. See NVM Express 1.3d,
Section 5.1 ("Abort command").

The Abort command is a best effort command; for now, the device always
fails to abort the given command.
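
A minimal sketch of that best-effort behaviour, outside of QEMU: bit 0
of the completion result is set to 1 to report "command not aborted",
and only the submission queue id is validated. The status constants, the
AbortOutcome type and the num_sqs bound check are assumptions made for
this illustration; the device uses nvme_check_sqid() instead.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative status values, local to this sketch. */
#define STATUS_SUCCESS       0x0000
#define STATUS_INVALID_FIELD 0x0002
#define STATUS_DNR           0x4000

typedef struct AbortOutcome {
    uint16_t status; /* completion status */
    uint32_t result; /* completion dword 0; bit 0 set means "not aborted" */
} AbortOutcome;

/*
 * cdw10 carries the target submission queue id in its low 16 bits.
 * num_sqs is a simplified validity bound (assumption for this sketch).
 */
static AbortOutcome abort_best_effort(uint32_t cdw10, uint16_t num_sqs)
{
    uint16_t sqid = cdw10 & 0xffff;
    AbortOutcome out = { .status = STATUS_SUCCESS, .result = 1 };

    assert(num_sqs > 0);

    if (sqid >= num_sqs) {
        out.status = STATUS_INVALID_FIELD | STATUS_DNR;
    }

    return out;
}
```

A host therefore sees a successful Abort completion whose result says the
target command was left running, which is an allowed outcome for a
best-effort command.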

Signed-off-by: Klaus Jensen <klaus.jensen@cnexlabs.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Acked-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
---
 hw/block/nvme.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 09ef54d771c4..415d3b036897 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -775,6 +775,18 @@ static uint16_t nvme_identify(NvmeCtrl *n, NvmeCmd *cmd)
     }
 }
 
+static uint16_t nvme_abort(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
+{
+    uint16_t sqid = le32_to_cpu(cmd->cdw10) & 0xffff;
+
+    req->cqe.result = 1;
+    if (nvme_check_sqid(n, sqid)) {
+        return NVME_INVALID_FIELD | NVME_DNR;
+    }
+
+    return NVME_SUCCESS;
+}
+
 static inline void nvme_set_timestamp(NvmeCtrl *n, uint64_t ts)
 {
     trace_pci_nvme_setfeat_timestamp(ts);
@@ -911,6 +923,8 @@ static uint16_t nvme_admin_cmd(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
         return nvme_create_cq(n, cmd);
     case NVME_ADM_CMD_IDENTIFY:
         return nvme_identify(n, cmd);
+    case NVME_ADM_CMD_ABORT:
+        return nvme_abort(n, cmd, req);
     case NVME_ADM_CMD_SET_FEATURES:
         return nvme_set_feature(n, cmd, req);
     case NVME_ADM_CMD_GET_FEATURES:
@@ -1596,6 +1610,19 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
     id->ieee[1] = 0x02;
     id->ieee[2] = 0xb3;
     id->oacs = cpu_to_le16(0);
+
+    /*
+     * Because the controller always completes the Abort command immediately,
+     * there can never be more than one concurrently executing Abort command,
+     * so this value is never used for anything. Note that there can easily be
+     * many Abort commands in the queues, but they are not considered
+     * "executing" until processed by nvme_abort.
+     *
+     * The specification recommends a value of 3 for Abort Command Limit (four
+     * concurrently outstanding Abort commands), so let's use that though it is
+     * inconsequential.
+     */
+    id->acl = 3;
     id->frmw = 7 << 1;
     id->lpa = 1 << 0;
     id->sqes = (0x6 << 4) | 0x6;
-- 
2.27.0




* [PATCH v3 05/18] hw/block/nvme: add temperature threshold feature
  2020-07-06  6:12 [PATCH v3 00/18] hw/block/nvme: bump to v1.3 Klaus Jensen
                   ` (3 preceding siblings ...)
  2020-07-06  6:12 ` [PATCH v3 04/18] hw/block/nvme: add support for the abort command Klaus Jensen
@ 2020-07-06  6:12 ` Klaus Jensen
  2020-07-08 19:24   ` Dmitry Fomichev
  2020-07-06  6:12 ` [PATCH v3 06/18] hw/block/nvme: mark fw slot 1 as read-only Klaus Jensen
                   ` (13 subsequent siblings)
  18 siblings, 1 reply; 60+ messages in thread
From: Klaus Jensen @ 2020-07-06  6:12 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Dmitry Fomichev, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky,
	Philippe Mathieu-Daudé

From: Klaus Jensen <k.jensen@samsung.com>

Implementing this feature for an emulated device might seem odd, but
support for it is mandatory, and it is useful for testing asynchronous
event request support, which is added in a later patch.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Acked-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 hw/block/nvme.c      | 48 ++++++++++++++++++++++++++++++++++++++++++++
 hw/block/nvme.h      |  1 +
 include/block/nvme.h |  5 ++++-
 3 files changed, 53 insertions(+), 1 deletion(-)
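
For illustration only, not part of the commit: the temperature constants
in this patch are in Kelvin, which is why 0x157 (343 K) is commented as
"~70 C" further down. The sketch below shows that conversion and how the
three Dword 11 fields of the Temperature Threshold feature could be
unpacked. The helper names are hypothetical, and the bit positions (TMPTH
in bits 15:0, TMPSEL in bits 19:16, THSEL in bits 21:20) are assumed from
NVMe 1.3; the NVME_TEMP_* macros the patch relies on are defined elsewhere
in the series and may be written differently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; bit positions assumed from NVMe 1.3 (Feature 04h). */
static inline uint16_t temp_tmpth(uint32_t dw11)  { return dw11 & 0xffff; }
static inline uint8_t  temp_tmpsel(uint32_t dw11) { return (dw11 >> 16) & 0xf; }
static inline uint8_t  temp_thsel(uint32_t dw11)  { return (dw11 >> 20) & 0x3; }

/* Build a Dword 11 value from its three fields. */
static inline uint32_t temp_dw11(uint16_t tmpth, uint8_t tmpsel, uint8_t thsel)
{
    assert(tmpsel <= 0xf && thsel <= 0x3);
    return (uint32_t)tmpth | ((uint32_t)tmpsel << 16) | ((uint32_t)thsel << 20);
}

/* Kelvin to whole degrees Celsius, using 273 rather than 273.15. */
static inline int kelvin_to_celsius(uint16_t kelvin)
{
    return (int)kelvin - 273;
}
```

With these helpers, the three constants in the patch (0x143, 0x157 and
0x175) come out as 50, 70 and 100 degrees Celsius.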

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 415d3b036897..a330ccf91620 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -59,6 +59,9 @@
 #define NVME_DB_SIZE  4
 #define NVME_CMB_BIR 2
 #define NVME_PMR_BIR 2
+#define NVME_TEMPERATURE 0x143
+#define NVME_TEMPERATURE_WARNING 0x157
+#define NVME_TEMPERATURE_CRITICAL 0x175
 
 #define NVME_GUEST_ERR(trace, fmt, ...) \
     do { \
@@ -841,9 +844,31 @@ static uint16_t nvme_get_feature_timestamp(NvmeCtrl *n, NvmeCmd *cmd)
 static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
 {
     uint32_t dw10 = le32_to_cpu(cmd->cdw10);
+    uint32_t dw11 = le32_to_cpu(cmd->cdw11);
     uint32_t result;
 
     switch (dw10) {
+    case NVME_TEMPERATURE_THRESHOLD:
+        result = 0;
+
+        /*
+         * The controller only implements the Composite Temperature sensor, so
+         * return 0 for all other sensors.
+         */
+        if (NVME_TEMP_TMPSEL(dw11) != NVME_TEMP_TMPSEL_COMPOSITE) {
+            break;
+        }
+
+        switch (NVME_TEMP_THSEL(dw11)) {
+        case NVME_TEMP_THSEL_OVER:
+            result = n->features.temp_thresh_hi;
+            break;
+        case NVME_TEMP_THSEL_UNDER:
+            result = n->features.temp_thresh_low;
+            break;
+        }
+
+        break;
     case NVME_VOLATILE_WRITE_CACHE:
         result = blk_enable_write_cache(n->conf.blk);
         trace_pci_nvme_getfeat_vwcache(result ? "enabled" : "disabled");
@@ -888,6 +913,23 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
     uint32_t dw11 = le32_to_cpu(cmd->cdw11);
 
     switch (dw10) {
+    case NVME_TEMPERATURE_THRESHOLD:
+        if (NVME_TEMP_TMPSEL(dw11) != NVME_TEMP_TMPSEL_COMPOSITE) {
+            break;
+        }
+
+        switch (NVME_TEMP_THSEL(dw11)) {
+        case NVME_TEMP_THSEL_OVER:
+            n->features.temp_thresh_hi = NVME_TEMP_TMPTH(dw11);
+            break;
+        case NVME_TEMP_THSEL_UNDER:
+            n->features.temp_thresh_low = NVME_TEMP_TMPTH(dw11);
+            break;
+        default:
+            return NVME_INVALID_FIELD | NVME_DNR;
+        }
+
+        break;
     case NVME_VOLATILE_WRITE_CACHE:
         blk_set_enable_write_cache(n->conf.blk, dw11 & 1);
         break;
@@ -1468,6 +1510,7 @@ static void nvme_init_state(NvmeCtrl *n)
     n->namespaces = g_new0(NvmeNamespace, n->num_namespaces);
     n->sq = g_new0(NvmeSQueue *, n->params.max_ioqpairs + 1);
     n->cq = g_new0(NvmeCQueue *, n->params.max_ioqpairs + 1);
+    n->features.temp_thresh_hi = NVME_TEMPERATURE_WARNING;
 }
 
 static void nvme_init_blk(NvmeCtrl *n, Error **errp)
@@ -1625,6 +1668,11 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
     id->acl = 3;
     id->frmw = 7 << 1;
     id->lpa = 1 << 0;
+
+    /* recommended default value (~70 C) */
+    id->wctemp = cpu_to_le16(NVME_TEMPERATURE_WARNING);
+    id->cctemp = cpu_to_le16(NVME_TEMPERATURE_CRITICAL);
+
     id->sqes = (0x6 << 4) | 0x6;
     id->cqes = (0x4 << 4) | 0x4;
     id->nn = cpu_to_le32(n->num_namespaces);
diff --git a/hw/block/nvme.h b/hw/block/nvme.h
index 1d30c0bca283..e3a2c907e210 100644
--- a/hw/block/nvme.h
+++ b/hw/block/nvme.h
@@ -107,6 +107,7 @@ typedef struct NvmeCtrl {
     NvmeSQueue      admin_sq;
     NvmeCQueue      admin_cq;
     NvmeIdCtrl      id_ctrl;
+    NvmeFeatureVal  features;
 } NvmeCtrl;
 
 /* calculate the number of LBAs that the namespace can accomodate */
diff --git a/include/block/nvme.h b/include/block/nvme.h
index 2a80d2a7ed89..d2c457695b38 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -860,7 +860,10 @@ enum NvmeIdCtrlOncs {
 typedef struct NvmeFeatureVal {
     uint32_t    arbitration;
     uint32_t    power_mgmt;
-    uint32_t    temp_thresh;
+    struct {
+        uint16_t temp_thresh_hi;
+        uint16_t temp_thresh_low;
+    };
     uint32_t    err_rec;
     uint32_t    volatile_wc;
     uint32_t    num_queues;
-- 
2.27.0




* [PATCH v3 06/18] hw/block/nvme: mark fw slot 1 as read-only
  2020-07-06  6:12 [PATCH v3 00/18] hw/block/nvme: bump to v1.3 Klaus Jensen
                   ` (4 preceding siblings ...)
  2020-07-06  6:12 ` [PATCH v3 05/18] hw/block/nvme: add temperature threshold feature Klaus Jensen
@ 2020-07-06  6:12 ` Klaus Jensen
  2020-07-29  9:14   ` Maxim Levitsky
  2020-07-06  6:12 ` [PATCH v3 07/18] hw/block/nvme: add support for the get log page command Klaus Jensen
                   ` (12 subsequent siblings)
  18 siblings, 1 reply; 60+ messages in thread
From: Klaus Jensen @ 2020-07-06  6:12 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Dmitry Fomichev, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky,
	Philippe Mathieu-Daudé

From: Klaus Jensen <k.jensen@samsung.com>

Mark firmware slot 1 as read-only and only support that slot.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
---
 hw/block/nvme.c      | 3 ++-
 include/block/nvme.h | 4 ++++
 2 files changed, 6 insertions(+), 1 deletion(-)
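
For illustration only, not part of the commit: a small sketch of how the
FRMW byte changed here is laid out. It assumes the NVMe 1.3 Identify
Controller layout, where bit 0 marks the first firmware slot as read-only
and bits 3:1 hold the number of slots. The function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Pack the slot count (1..7) and the "slot 1 read-only" flag into FRMW. */
static inline uint8_t frmw_encode(uint8_t num_slots, bool slot1_ro)
{
    assert(num_slots >= 1 && num_slots <= 7);
    return (uint8_t)((num_slots << 1) | (slot1_ro ? 1 : 0));
}

/* Bits 3:1: number of firmware slots. */
static inline uint8_t frmw_num_slots(uint8_t frmw)
{
    return (frmw >> 1) & 0x7;
}

/* Bit 0: first firmware slot is read-only. */
static inline bool frmw_slot1_ro(uint8_t frmw)
{
    return (frmw & 0x1) != 0;
}
```

Under that layout, the old value (7 << 1) advertised seven writable slots,
while the new value encodes a single read-only slot.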

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index a330ccf91620..b6bc75eb61a2 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -62,6 +62,7 @@
 #define NVME_TEMPERATURE 0x143
 #define NVME_TEMPERATURE_WARNING 0x157
 #define NVME_TEMPERATURE_CRITICAL 0x175
+#define NVME_NUM_FW_SLOTS 1
 
 #define NVME_GUEST_ERR(trace, fmt, ...) \
     do { \
@@ -1666,7 +1667,7 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
      * inconsequential.
      */
     id->acl = 3;
-    id->frmw = 7 << 1;
+    id->frmw = (NVME_NUM_FW_SLOTS << 1) | NVME_FRMW_SLOT1_RO;
     id->lpa = 1 << 0;
 
     /* recommended default value (~70 C) */
diff --git a/include/block/nvme.h b/include/block/nvme.h
index d2c457695b38..d639e8bbee92 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -842,6 +842,10 @@ enum NvmeIdCtrlOncs {
     NVME_ONCS_TIMESTAMP     = 1 << 6,
 };
 
+enum NvmeIdCtrlFrmw {
+    NVME_FRMW_SLOT1_RO = 1 << 0,
+};
+
 #define NVME_CTRL_SQES_MIN(sqes) ((sqes) & 0xf)
 #define NVME_CTRL_SQES_MAX(sqes) (((sqes) >> 4) & 0xf)
 #define NVME_CTRL_CQES_MIN(cqes) ((cqes) & 0xf)
-- 
2.27.0




* [PATCH v3 07/18] hw/block/nvme: add support for the get log page command
  2020-07-06  6:12 [PATCH v3 00/18] hw/block/nvme: bump to v1.3 Klaus Jensen
                   ` (5 preceding siblings ...)
  2020-07-06  6:12 ` [PATCH v3 06/18] hw/block/nvme: mark fw slot 1 as read-only Klaus Jensen
@ 2020-07-06  6:12 ` Klaus Jensen
  2020-07-08 19:22   ` Dmitry Fomichev
                     ` (2 more replies)
  2020-07-06  6:12 ` [PATCH v3 08/18] hw/block/nvme: add support for the asynchronous event request command Klaus Jensen
                   ` (11 subsequent siblings)
  18 siblings, 3 replies; 60+ messages in thread
From: Klaus Jensen @ 2020-07-06  6:12 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Dmitry Fomichev, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky,
	Philippe Mathieu-Daudé

From: Klaus Jensen <k.jensen@samsung.com>

Add support for the Get Log Page command and basic implementations of
the mandatory Error Information, SMART / Health Information and Firmware
Slot Information log pages.

In violation of the specification, the SMART / Health Information log
page does not persist information over the lifetime of the controller
because the device has no place to store such persistent state.

Note that the LPA field in the Identify Controller data structure
intentionally has bit 0 cleared because there is no namespace specific
information in the SMART / Health information log page.

Required for compliance with NVMe revision 1.3d. See NVM Express 1.3d,
Section 5.14 ("Get Log Page command").

Signed-off-by: Klaus Jensen <klaus.jensen@cnexlabs.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Acked-by: Keith Busch <kbusch@kernel.org>
---
 hw/block/nvme.c       | 140 +++++++++++++++++++++++++++++++++++++++++-
 hw/block/nvme.h       |   2 +
 hw/block/trace-events |   2 +
 include/block/nvme.h  |   8 ++-
 4 files changed, 149 insertions(+), 3 deletions(-)
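
For illustration only, not part of the commit: a standalone sketch of the
length and offset arithmetic that nvme_get_log() performs on command
dwords 10 to 13, plus the clamping each log page handler applies. The type
and function names are hypothetical. Unlike the patch, the sketch does the
length computation in 64 bits, so the largest NUMD value does not wrap;
treat that as a choice made for this example, not a description of the
patch.

```c
#include <assert.h>
#include <stdint.h>

typedef struct LogXfer {
    uint64_t len;   /* requested transfer length in bytes */
    uint64_t off;   /* byte offset into the log page */
} LogXfer;

static inline LogXfer log_xfer(uint32_t dw10, uint32_t dw11,
                               uint32_t dw12, uint32_t dw13)
{
    uint64_t numdl = dw10 >> 16;       /* NUMDL: lower 16 bits of dword count */
    uint64_t numdu = dw11 & 0xffff;    /* NUMDU: upper 16 bits of dword count */
    LogXfer x;

    /* NUMD is zero-based and counts dwords, hence "+ 1" and "<< 2". */
    x.len = (((numdu << 16) | numdl) + 1) << 2;
    x.off = ((uint64_t)dw13 << 32) | dw12;
    return x;
}

/* Bytes to transfer: whatever remains of the page, capped by the buffer. */
static inline uint64_t log_trans_len(uint64_t page_size, uint64_t off,
                                     uint64_t buf_len)
{
    assert(off <= page_size);   /* the handlers reject off > page size */
    uint64_t remaining = page_size - off;
    return remaining < buf_len ? remaining : buf_len;
}
```

For example, NUMDL = 0x7f with NUMDU = 0 requests 128 dwords, i.e. 512
bytes.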

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index b6bc75eb61a2..7cb3787638f6 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -606,6 +606,140 @@ static uint16_t nvme_create_sq(NvmeCtrl *n, NvmeCmd *cmd)
     return NVME_SUCCESS;
 }
 
+static uint16_t nvme_smart_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
+                                uint64_t off, NvmeRequest *req)
+{
+    uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
+    uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
+    uint32_t nsid = le32_to_cpu(cmd->nsid);
+
+    uint32_t trans_len;
+    time_t current_ms;
+    uint64_t units_read = 0, units_written = 0;
+    uint64_t read_commands = 0, write_commands = 0;
+    NvmeSmartLog smart;
+    BlockAcctStats *s;
+
+    if (nsid && nsid != 0xffffffff) {
+        return NVME_INVALID_FIELD | NVME_DNR;
+    }
+
+    s = blk_get_stats(n->conf.blk);
+
+    units_read = s->nr_bytes[BLOCK_ACCT_READ] >> BDRV_SECTOR_BITS;
+    units_written = s->nr_bytes[BLOCK_ACCT_WRITE] >> BDRV_SECTOR_BITS;
+    read_commands = s->nr_ops[BLOCK_ACCT_READ];
+    write_commands = s->nr_ops[BLOCK_ACCT_WRITE];
+
+    if (off > sizeof(smart)) {
+        return NVME_INVALID_FIELD | NVME_DNR;
+    }
+
+    trans_len = MIN(sizeof(smart) - off, buf_len);
+
+    memset(&smart, 0x0, sizeof(smart));
+
+    smart.data_units_read[0] = cpu_to_le64(units_read / 1000);
+    smart.data_units_written[0] = cpu_to_le64(units_written / 1000);
+    smart.host_read_commands[0] = cpu_to_le64(read_commands);
+    smart.host_write_commands[0] = cpu_to_le64(write_commands);
+
+    smart.temperature = cpu_to_le16(n->temperature);
+
+    if ((n->temperature >= n->features.temp_thresh_hi) ||
+        (n->temperature <= n->features.temp_thresh_low)) {
+        smart.critical_warning |= NVME_SMART_TEMPERATURE;
+    }
+
+    current_ms = qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL);
+    smart.power_on_hours[0] =
+        cpu_to_le64((((current_ms - n->starttime_ms) / 1000) / 60) / 60);
+
+    return nvme_dma_read_prp(n, (uint8_t *) &smart + off, trans_len, prp1,
+                             prp2);
+}
+
+static uint16_t nvme_fw_log_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
+                                 uint64_t off, NvmeRequest *req)
+{
+    uint32_t trans_len;
+    uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
+    uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
+    NvmeFwSlotInfoLog fw_log = {
+        .afi = 0x1,
+    };
+
+    strpadcpy((char *)&fw_log.frs1, sizeof(fw_log.frs1), "1.0", ' ');
+
+    if (off > sizeof(fw_log)) {
+        return NVME_INVALID_FIELD | NVME_DNR;
+    }
+
+    trans_len = MIN(sizeof(fw_log) - off, buf_len);
+
+    return nvme_dma_read_prp(n, (uint8_t *) &fw_log + off, trans_len, prp1,
+                             prp2);
+}
+
+static uint16_t nvme_error_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
+                                uint64_t off, NvmeRequest *req)
+{
+    uint32_t trans_len;
+    uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
+    uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
+    NvmeErrorLog errlog;
+
+    if (off > sizeof(errlog)) {
+        return NVME_INVALID_FIELD | NVME_DNR;
+    }
+
+    memset(&errlog, 0x0, sizeof(errlog));
+
+    trans_len = MIN(sizeof(errlog) - off, buf_len);
+
+    return nvme_dma_read_prp(n, (uint8_t *)&errlog, trans_len, prp1, prp2);
+}
+
+static uint16_t nvme_get_log(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
+{
+    uint32_t dw10 = le32_to_cpu(cmd->cdw10);
+    uint32_t dw11 = le32_to_cpu(cmd->cdw11);
+    uint32_t dw12 = le32_to_cpu(cmd->cdw12);
+    uint32_t dw13 = le32_to_cpu(cmd->cdw13);
+    uint8_t  lid = dw10 & 0xff;
+    uint8_t  lsp = (dw10 >> 8) & 0xf;
+    uint8_t  rae = (dw10 >> 15) & 0x1;
+    uint32_t numdl, numdu;
+    uint64_t off, lpol, lpou;
+    size_t   len;
+
+    numdl = (dw10 >> 16);
+    numdu = (dw11 & 0xffff);
+    lpol = dw12;
+    lpou = dw13;
+
+    len = (((numdu << 16) | numdl) + 1) << 2;
+    off = (lpou << 32ULL) | lpol;
+
+    if (off & 0x3) {
+        return NVME_INVALID_FIELD | NVME_DNR;
+    }
+
+    trace_pci_nvme_get_log(nvme_cid(req), lid, lsp, rae, len, off);
+
+    switch (lid) {
+    case NVME_LOG_ERROR_INFO:
+        return nvme_error_info(n, cmd, len, off, req);
+    case NVME_LOG_SMART_INFO:
+        return nvme_smart_info(n, cmd, len, off, req);
+    case NVME_LOG_FW_SLOT_INFO:
+        return nvme_fw_log_info(n, cmd, len, off, req);
+    default:
+        trace_pci_nvme_err_invalid_log_page(nvme_cid(req), lid);
+        return NVME_INVALID_FIELD | NVME_DNR;
+    }
+}
+
 static void nvme_free_cq(NvmeCQueue *cq, NvmeCtrl *n)
 {
     n->cq[cq->cqid] = NULL;
@@ -960,6 +1094,8 @@ static uint16_t nvme_admin_cmd(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
         return nvme_del_sq(n, cmd);
     case NVME_ADM_CMD_CREATE_SQ:
         return nvme_create_sq(n, cmd);
+    case NVME_ADM_CMD_GET_LOG_PAGE:
+        return nvme_get_log(n, cmd, req);
     case NVME_ADM_CMD_DELETE_CQ:
         return nvme_del_cq(n, cmd);
     case NVME_ADM_CMD_CREATE_CQ:
@@ -1511,7 +1647,9 @@ static void nvme_init_state(NvmeCtrl *n)
     n->namespaces = g_new0(NvmeNamespace, n->num_namespaces);
     n->sq = g_new0(NvmeSQueue *, n->params.max_ioqpairs + 1);
     n->cq = g_new0(NvmeCQueue *, n->params.max_ioqpairs + 1);
+    n->temperature = NVME_TEMPERATURE;
     n->features.temp_thresh_hi = NVME_TEMPERATURE_WARNING;
+    n->starttime_ms = qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL);
 }
 
 static void nvme_init_blk(NvmeCtrl *n, Error **errp)
@@ -1668,7 +1806,7 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
      */
     id->acl = 3;
     id->frmw = (NVME_NUM_FW_SLOTS << 1) | NVME_FRMW_SLOT1_RO;
-    id->lpa = 1 << 0;
+    id->lpa = NVME_LPA_EXTENDED;
 
     /* recommended default value (~70 C) */
     id->wctemp = cpu_to_le16(NVME_TEMPERATURE_WARNING);
diff --git a/hw/block/nvme.h b/hw/block/nvme.h
index e3a2c907e210..8228978e93de 100644
--- a/hw/block/nvme.h
+++ b/hw/block/nvme.h
@@ -98,6 +98,8 @@ typedef struct NvmeCtrl {
     uint32_t    irq_status;
     uint64_t    host_timestamp;                 /* Timestamp sent by the host */
     uint64_t    timestamp_set_qemu_clock_ms;    /* QEMU clock time */
+    uint64_t    starttime_ms;
+    uint16_t    temperature;
 
     HostMemoryBackend *pmrdev;
 
diff --git a/hw/block/trace-events b/hw/block/trace-events
index c40c0d2e4b28..3330d74e48db 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -45,6 +45,7 @@ pci_nvme_del_cq(uint16_t cqid) "deleted completion queue, cqid=%"PRIu16""
 pci_nvme_identify_ctrl(void) "identify controller"
 pci_nvme_identify_ns(uint32_t ns) "nsid %"PRIu32""
 pci_nvme_identify_nslist(uint32_t ns) "nsid %"PRIu32""
+pci_nvme_get_log(uint16_t cid, uint8_t lid, uint8_t lsp, uint8_t rae, uint32_t len, uint64_t off) "cid %"PRIu16" lid 0x%"PRIx8" lsp 0x%"PRIx8" rae 0x%"PRIx8" len %"PRIu32" off %"PRIu64""
 pci_nvme_getfeat_vwcache(const char* result) "get feature volatile write cache, result=%s"
 pci_nvme_getfeat_numq(int result) "get feature number of queues, result=%d"
 pci_nvme_setfeat_numq(int reqcq, int reqsq, int gotcq, int gotsq) "requested cq_count=%d sq_count=%d, responding with cq_count=%d sq_count=%d"
@@ -94,6 +95,7 @@ pci_nvme_err_invalid_create_cq_qflags(uint16_t qflags) "failed creating completi
 pci_nvme_err_invalid_identify_cns(uint16_t cns) "identify, invalid cns=0x%"PRIx16""
 pci_nvme_err_invalid_getfeat(int dw10) "invalid get features, dw10=0x%"PRIx32""
 pci_nvme_err_invalid_setfeat(uint32_t dw10) "invalid set features, dw10=0x%"PRIx32""
+pci_nvme_err_invalid_log_page(uint16_t cid, uint16_t lid) "cid %"PRIu16" lid 0x%"PRIx16""
 pci_nvme_err_startfail_cq(void) "nvme_start_ctrl failed because there are non-admin completion queues"
 pci_nvme_err_startfail_sq(void) "nvme_start_ctrl failed because there are non-admin submission queues"
 pci_nvme_err_startfail_nbarasq(void) "nvme_start_ctrl failed because the admin submission queue address is null"
diff --git a/include/block/nvme.h b/include/block/nvme.h
index d639e8bbee92..49ce97ae1ab4 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -704,9 +704,9 @@ typedef struct NvmeErrorLog {
     uint8_t     resv[35];
 } NvmeErrorLog;
 
-typedef struct NvmeSmartLog {
+typedef struct QEMU_PACKED NvmeSmartLog {
     uint8_t     critical_warning;
-    uint8_t     temperature[2];
+    uint16_t    temperature;
     uint8_t     available_spare;
     uint8_t     available_spare_threshold;
     uint8_t     percentage_used;
@@ -846,6 +846,10 @@ enum NvmeIdCtrlFrmw {
     NVME_FRMW_SLOT1_RO = 1 << 0,
 };
 
+enum NvmeIdCtrlLpa {
+    NVME_LPA_EXTENDED = 1 << 2,
+};
+
 #define NVME_CTRL_SQES_MIN(sqes) ((sqes) & 0xf)
 #define NVME_CTRL_SQES_MAX(sqes) (((sqes) >> 4) & 0xf)
 #define NVME_CTRL_CQES_MIN(cqes) ((cqes) & 0xf)
-- 
2.27.0




* [PATCH v3 08/18] hw/block/nvme: add support for the asynchronous event request command
  2020-07-06  6:12 [PATCH v3 00/18] hw/block/nvme: bump to v1.3 Klaus Jensen
                   ` (6 preceding siblings ...)
  2020-07-06  6:12 ` [PATCH v3 07/18] hw/block/nvme: add support for the get log page command Klaus Jensen
@ 2020-07-06  6:12 ` Klaus Jensen
  2020-07-29 10:43   ` Maxim Levitsky
  2020-07-06  6:12 ` [PATCH v3 09/18] hw/block/nvme: move NvmeFeatureVal into hw/block/nvme.h Klaus Jensen
                   ` (10 subsequent siblings)
  18 siblings, 1 reply; 60+ messages in thread
From: Klaus Jensen @ 2020-07-06  6:12 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Dmitry Fomichev, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky,
	Philippe Mathieu-Daudé

From: Klaus Jensen <k.jensen@samsung.com>

Add support for the Asynchronous Event Request command. Required for
compliance with NVMe revision 1.3d. See NVM Express 1.3d, Section 5.2
("Asynchronous Event Request command").

Mostly imported from Keith's qemu-nvme tree, modified to cap the number
of queued events (controllable with the aer_max_queued device
parameter). The spec states that the controller *should* retain
events, so this is a best-effort implementation: events that arrive
while the queue is full are dropped.

Signed-off-by: Klaus Jensen <klaus.jensen@cnexlabs.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Acked-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
---
 hw/block/nvme.c       | 180 ++++++++++++++++++++++++++++++++++++++++--
 hw/block/nvme.h       |  10 ++-
 hw/block/trace-events |   9 +++
 include/block/nvme.h  |   8 +-
 4 files changed, 198 insertions(+), 9 deletions(-)
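
For illustration only, not part of the commit: a simplified model of the
queue-and-mask behaviour implemented by nvme_process_aers(),
nvme_enqueue_event() and nvme_clear_events() below. It replaces the QTAILQ
with a small fixed array and only counts completions instead of posting
CQEs. All names and the queue size are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_QUEUED 8   /* stands in for the aer_max_queued parameter */

typedef struct AerModel {
    uint8_t queue[MAX_QUEUED];  /* pending event types, oldest first */
    int     queued;
    int     outstanding;        /* AER commands waiting for an event */
    uint8_t mask;               /* one bit per event type */
    int     posted;             /* completions posted so far */
} AerModel;

static void aer_process(AerModel *m)
{
    int i = 0;

    /* Stop as soon as there is no AER command left to complete. */
    while (i < m->queued && m->outstanding > 0) {
        uint8_t type = m->queue[i];

        /* Masked: an event of this type was posted but not yet cleared. */
        if (m->mask & (1u << type)) {
            i++;
            continue;
        }

        /* Remove entry i while keeping the remaining order. */
        for (int j = i; j < m->queued - 1; j++) {
            m->queue[j] = m->queue[j + 1];
        }
        m->queued--;

        m->mask = (uint8_t)(m->mask | (1u << type));
        m->outstanding--;
        m->posted++;
    }
}

static bool aer_enqueue(AerModel *m, uint8_t type)
{
    assert(type < 8);
    if (m->queued == MAX_QUEUED) {
        return false;           /* queue full: the event is dropped */
    }
    m->queue[m->queued++] = type;
    aer_process(m);
    return true;
}

/* Host submits an Asynchronous Event Request command. */
static void aer_submit(AerModel *m)
{
    m->outstanding++;
    aer_process(m);
}

/* Host reads the matching log page with RAE cleared. */
static void aer_clear(AerModel *m, uint8_t type)
{
    assert(type < 8);
    m->mask = (uint8_t)(m->mask & ~(1u << type));
    aer_process(m);
}
```

In this model, a second event of an already-posted type stays queued until
the host clears that type, after which it completes the next outstanding
AER command.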

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 7cb3787638f6..80c7285bc1cf 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -356,6 +356,85 @@ static void nvme_enqueue_req_completion(NvmeCQueue *cq, NvmeRequest *req)
     timer_mod(cq->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + 500);
 }
 
+static void nvme_process_aers(void *opaque)
+{
+    NvmeCtrl *n = opaque;
+    NvmeAsyncEvent *event, *next;
+
+    trace_pci_nvme_process_aers(n->aer_queued);
+
+    QTAILQ_FOREACH_SAFE(event, &n->aer_queue, entry, next) {
+        NvmeRequest *req;
+        NvmeAerResult *result;
+
+        /* can't post cqe if there is nothing to complete */
+        if (!n->outstanding_aers) {
+            trace_pci_nvme_no_outstanding_aers();
+            break;
+        }
+
+        /* ignore if masked (cqe posted, but event not cleared) */
+        if (n->aer_mask & (1 << event->result.event_type)) {
+            trace_pci_nvme_aer_masked(event->result.event_type, n->aer_mask);
+            continue;
+        }
+
+        QTAILQ_REMOVE(&n->aer_queue, event, entry);
+        n->aer_queued--;
+
+        n->aer_mask |= 1 << event->result.event_type;
+        n->outstanding_aers--;
+
+        req = n->aer_reqs[n->outstanding_aers];
+
+        result = (NvmeAerResult *) &req->cqe.result;
+        result->event_type = event->result.event_type;
+        result->event_info = event->result.event_info;
+        result->log_page = event->result.log_page;
+        g_free(event);
+
+        req->status = NVME_SUCCESS;
+
+        trace_pci_nvme_aer_post_cqe(result->event_type, result->event_info,
+                                    result->log_page);
+
+        nvme_enqueue_req_completion(&n->admin_cq, req);
+    }
+}
+
+static void nvme_enqueue_event(NvmeCtrl *n, uint8_t event_type,
+                               uint8_t event_info, uint8_t log_page)
+{
+    NvmeAsyncEvent *event;
+
+    trace_pci_nvme_enqueue_event(event_type, event_info, log_page);
+
+    if (n->aer_queued == n->params.aer_max_queued) {
+        trace_pci_nvme_enqueue_event_noqueue(n->aer_queued);
+        return;
+    }
+
+    event = g_new(NvmeAsyncEvent, 1);
+    event->result = (NvmeAerResult) {
+        .event_type = event_type,
+        .event_info = event_info,
+        .log_page   = log_page,
+    };
+
+    QTAILQ_INSERT_TAIL(&n->aer_queue, event, entry);
+    n->aer_queued++;
+
+    nvme_process_aers(n);
+}
+
+static void nvme_clear_events(NvmeCtrl *n, uint8_t event_type)
+{
+    n->aer_mask &= ~(1 << event_type);
+    if (!QTAILQ_EMPTY(&n->aer_queue)) {
+        nvme_process_aers(n);
+    }
+}
+
 static void nvme_rw_cb(void *opaque, int ret)
 {
     NvmeRequest *req = opaque;
@@ -606,8 +685,9 @@ static uint16_t nvme_create_sq(NvmeCtrl *n, NvmeCmd *cmd)
     return NVME_SUCCESS;
 }
 
-static uint16_t nvme_smart_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
-                                uint64_t off, NvmeRequest *req)
+static uint16_t nvme_smart_info(NvmeCtrl *n, NvmeCmd *cmd, uint8_t rae,
+                                uint32_t buf_len, uint64_t off,
+                                NvmeRequest *req)
 {
     uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
     uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
@@ -655,6 +735,10 @@ static uint16_t nvme_smart_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
     smart.power_on_hours[0] =
         cpu_to_le64((((current_ms - n->starttime_ms) / 1000) / 60) / 60);
 
+    if (!rae) {
+        nvme_clear_events(n, NVME_AER_TYPE_SMART);
+    }
+
     return nvme_dma_read_prp(n, (uint8_t *) &smart + off, trans_len, prp1,
                              prp2);
 }
@@ -681,14 +765,19 @@ static uint16_t nvme_fw_log_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
                              prp2);
 }
 
-static uint16_t nvme_error_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
-                                uint64_t off, NvmeRequest *req)
+static uint16_t nvme_error_info(NvmeCtrl *n, NvmeCmd *cmd, uint8_t rae,
+                                uint32_t buf_len, uint64_t off,
+                                NvmeRequest *req)
 {
     uint32_t trans_len;
     uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
     uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
     NvmeErrorLog errlog;
 
+    if (!rae) {
+        nvme_clear_events(n, NVME_AER_TYPE_ERROR);
+    }
+
     if (off > sizeof(errlog)) {
         return NVME_INVALID_FIELD | NVME_DNR;
     }
@@ -729,9 +818,9 @@ static uint16_t nvme_get_log(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
 
     switch (lid) {
     case NVME_LOG_ERROR_INFO:
-        return nvme_error_info(n, cmd, len, off, req);
+        return nvme_error_info(n, cmd, rae, len, off, req);
     case NVME_LOG_SMART_INFO:
-        return nvme_smart_info(n, cmd, len, off, req);
+        return nvme_smart_info(n, cmd, rae, len, off, req);
     case NVME_LOG_FW_SLOT_INFO:
         return nvme_fw_log_info(n, cmd, len, off, req);
     default:
@@ -1013,6 +1102,9 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
             ((n->params.max_ioqpairs - 1) << 16);
         trace_pci_nvme_getfeat_numq(result);
         break;
+    case NVME_ASYNCHRONOUS_EVENT_CONF:
+        result = n->features.async_config;
+        break;
     case NVME_TIMESTAMP:
         return nvme_get_feature_timestamp(n, cmd);
     default:
@@ -1064,6 +1156,14 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
             return NVME_INVALID_FIELD | NVME_DNR;
         }
 
+        if (((n->temperature >= n->features.temp_thresh_hi) ||
+            (n->temperature <= n->features.temp_thresh_low)) &&
+            NVME_AEC_SMART(n->features.async_config) & NVME_SMART_TEMPERATURE) {
+            nvme_enqueue_event(n, NVME_AER_TYPE_SMART,
+                               NVME_AER_INFO_SMART_TEMP_THRESH,
+                               NVME_LOG_SMART_INFO);
+        }
+
         break;
     case NVME_VOLATILE_WRITE_CACHE:
         blk_set_enable_write_cache(n->conf.blk, dw11 & 1);
@@ -1076,6 +1176,9 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
         req->cqe.result = cpu_to_le32((n->params.max_ioqpairs - 1) |
                                       ((n->params.max_ioqpairs - 1) << 16));
         break;
+    case NVME_ASYNCHRONOUS_EVENT_CONF:
+        n->features.async_config = dw11;
+        break;
     case NVME_TIMESTAMP:
         return nvme_set_feature_timestamp(n, cmd);
     default:
@@ -1085,6 +1188,25 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
     return NVME_SUCCESS;
 }
 
+static uint16_t nvme_aer(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
+{
+    trace_pci_nvme_aer(nvme_cid(req));
+
+    if (n->outstanding_aers > n->params.aerl) {
+        trace_pci_nvme_aer_aerl_exceeded();
+        return NVME_AER_LIMIT_EXCEEDED;
+    }
+
+    n->aer_reqs[n->outstanding_aers] = req;
+    n->outstanding_aers++;
+
+    if (!QTAILQ_EMPTY(&n->aer_queue)) {
+        nvme_process_aers(n);
+    }
+
+    return NVME_NO_COMPLETE;
+}
+
 static uint16_t nvme_admin_cmd(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
 {
     trace_pci_nvme_admin_cmd(nvme_cid(req), nvme_sqid(req), cmd->opcode);
@@ -1108,6 +1230,8 @@ static uint16_t nvme_admin_cmd(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
         return nvme_set_feature(n, cmd, req);
     case NVME_ADM_CMD_GET_FEATURES:
         return nvme_get_feature(n, cmd, req);
+    case NVME_ADM_CMD_ASYNC_EV_REQ:
+        return nvme_aer(n, cmd, req);
     default:
         trace_pci_nvme_err_invalid_admin_opc(cmd->opcode);
         return NVME_INVALID_OPCODE | NVME_DNR;
@@ -1162,6 +1286,15 @@ static void nvme_clear_ctrl(NvmeCtrl *n)
         }
     }
 
+    while (!QTAILQ_EMPTY(&n->aer_queue)) {
+        NvmeAsyncEvent *event = QTAILQ_FIRST(&n->aer_queue);
+        QTAILQ_REMOVE(&n->aer_queue, event, entry);
+        g_free(event);
+    }
+
+    n->aer_queued = 0;
+    n->outstanding_aers = 0;
+
     blk_flush(n->conf.blk);
     n->bar.cc = 0;
 }
@@ -1258,6 +1391,8 @@ static int nvme_start_ctrl(NvmeCtrl *n)
 
     nvme_set_timestamp(n, 0ULL);
 
+    QTAILQ_INIT(&n->aer_queue);
+
     return 0;
 }
 
@@ -1479,6 +1614,13 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
                            "completion queue doorbell write"
                            " for nonexistent queue,"
                            " sqid=%"PRIu32", ignoring", qid);
+
+            if (n->outstanding_aers) {
+                nvme_enqueue_event(n, NVME_AER_TYPE_ERROR,
+                                   NVME_AER_INFO_ERR_INVALID_DB_REGISTER,
+                                   NVME_LOG_ERROR_INFO);
+            }
+
             return;
         }
 
@@ -1489,6 +1631,13 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
                            " beyond queue size, sqid=%"PRIu32","
                            " new_head=%"PRIu16", ignoring",
                            qid, new_head);
+
+            if (n->outstanding_aers) {
+                nvme_enqueue_event(n, NVME_AER_TYPE_ERROR,
+                                   NVME_AER_INFO_ERR_INVALID_DB_VALUE,
+                                   NVME_LOG_ERROR_INFO);
+            }
+
             return;
         }
 
@@ -1519,6 +1668,13 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
                            "submission queue doorbell write"
                            " for nonexistent queue,"
                            " sqid=%"PRIu32", ignoring", qid);
+
+            if (n->outstanding_aers) {
+                nvme_enqueue_event(n, NVME_AER_TYPE_ERROR,
+                                   NVME_AER_INFO_ERR_INVALID_DB_REGISTER,
+                                   NVME_LOG_ERROR_INFO);
+            }
+
             return;
         }
 
@@ -1529,6 +1685,13 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
                            " beyond queue size, sqid=%"PRIu32","
                            " new_tail=%"PRIu16", ignoring",
                            qid, new_tail);
+
+            if (n->outstanding_aers) {
+                nvme_enqueue_event(n, NVME_AER_TYPE_ERROR,
+                                   NVME_AER_INFO_ERR_INVALID_DB_VALUE,
+                                   NVME_LOG_ERROR_INFO);
+            }
+
             return;
         }
 
@@ -1650,6 +1813,7 @@ static void nvme_init_state(NvmeCtrl *n)
     n->temperature = NVME_TEMPERATURE;
     n->features.temp_thresh_hi = NVME_TEMPERATURE_WARNING;
     n->starttime_ms = qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL);
+    n->aer_reqs = g_new0(NvmeRequest *, n->params.aerl + 1);
 }
 
 static void nvme_init_blk(NvmeCtrl *n, Error **errp)
@@ -1805,6 +1969,7 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
      * inconsequential.
      */
     id->acl = 3;
+    id->aerl = n->params.aerl;
     id->frmw = (NVME_NUM_FW_SLOTS << 1) | NVME_FRMW_SLOT1_RO;
     id->lpa = NVME_LPA_EXTENDED;
 
@@ -1879,6 +2044,7 @@ static void nvme_exit(PCIDevice *pci_dev)
     g_free(n->namespaces);
     g_free(n->cq);
     g_free(n->sq);
+    g_free(n->aer_reqs);
 
     if (n->params.cmb_size_mb) {
         g_free(n->cmbuf);
@@ -1899,6 +2065,8 @@ static Property nvme_props[] = {
     DEFINE_PROP_UINT32("num_queues", NvmeCtrl, params.num_queues, 0),
     DEFINE_PROP_UINT32("max_ioqpairs", NvmeCtrl, params.max_ioqpairs, 64),
     DEFINE_PROP_UINT16("msix_qsize", NvmeCtrl, params.msix_qsize, 65),
+    DEFINE_PROP_UINT8("aerl", NvmeCtrl, params.aerl, 3),
+    DEFINE_PROP_UINT32("aer_max_queued", NvmeCtrl, params.aer_max_queued, 64),
     DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/hw/block/nvme.h b/hw/block/nvme.h
index 8228978e93de..1837233617bb 100644
--- a/hw/block/nvme.h
+++ b/hw/block/nvme.h
@@ -9,10 +9,12 @@ typedef struct NvmeParams {
     uint32_t max_ioqpairs;
     uint16_t msix_qsize;
     uint32_t cmb_size_mb;
+    uint8_t  aerl;
+    uint32_t aer_max_queued;
 } NvmeParams;
 
 typedef struct NvmeAsyncEvent {
-    QSIMPLEQ_ENTRY(NvmeAsyncEvent) entry;
+    QTAILQ_ENTRY(NvmeAsyncEvent) entry;
     NvmeAerResult result;
 } NvmeAsyncEvent;
 
@@ -94,6 +96,7 @@ typedef struct NvmeCtrl {
     uint32_t    num_namespaces;
     uint32_t    max_q_ents;
     uint64_t    ns_size;
+    uint8_t     outstanding_aers;
     uint8_t     *cmbuf;
     uint32_t    irq_status;
     uint64_t    host_timestamp;                 /* Timestamp sent by the host */
@@ -103,6 +106,11 @@ typedef struct NvmeCtrl {
 
     HostMemoryBackend *pmrdev;
 
+    uint8_t     aer_mask;
+    NvmeRequest **aer_reqs;
+    QTAILQ_HEAD(, NvmeAsyncEvent) aer_queue;
+    int         aer_queued;
+
     NvmeNamespace   *namespaces;
     NvmeSQueue      **sq;
     NvmeCQueue      **cq;
diff --git a/hw/block/trace-events b/hw/block/trace-events
index 3330d74e48db..091af16ca7d7 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -51,6 +51,15 @@ pci_nvme_getfeat_numq(int result) "get feature number of queues, result=%d"
 pci_nvme_setfeat_numq(int reqcq, int reqsq, int gotcq, int gotsq) "requested cq_count=%d sq_count=%d, responding with cq_count=%d sq_count=%d"
 pci_nvme_setfeat_timestamp(uint64_t ts) "set feature timestamp = 0x%"PRIx64""
 pci_nvme_getfeat_timestamp(uint64_t ts) "get feature timestamp = 0x%"PRIx64""
+pci_nvme_process_aers(int queued) "queued %d"
+pci_nvme_aer(uint16_t cid) "cid %"PRIu16""
+pci_nvme_aer_aerl_exceeded(void) "aerl exceeded"
+pci_nvme_aer_masked(uint8_t type, uint8_t mask) "type 0x%"PRIx8" mask 0x%"PRIx8""
+pci_nvme_aer_post_cqe(uint8_t typ, uint8_t info, uint8_t log_page) "type 0x%"PRIx8" info 0x%"PRIx8" lid 0x%"PRIx8""
+pci_nvme_enqueue_event(uint8_t typ, uint8_t info, uint8_t log_page) "type 0x%"PRIx8" info 0x%"PRIx8" lid 0x%"PRIx8""
+pci_nvme_enqueue_event_noqueue(int queued) "queued %d"
+pci_nvme_enqueue_event_masked(uint8_t typ) "type 0x%"PRIx8""
+pci_nvme_no_outstanding_aers(void) "ignoring event; no outstanding AERs"
 pci_nvme_enqueue_req_completion(uint16_t cid, uint16_t cqid, uint16_t status) "cid %"PRIu16" cqid %"PRIu16" status 0x%"PRIx16""
 pci_nvme_mmio_read(uint64_t addr) "addr 0x%"PRIx64""
 pci_nvme_mmio_write(uint64_t addr, uint64_t data) "addr 0x%"PRIx64" data 0x%"PRIx64""
diff --git a/include/block/nvme.h b/include/block/nvme.h
index 49ce97ae1ab4..2101292ed5e8 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -597,8 +597,8 @@ enum NvmeAsyncEventRequest {
     NVME_AER_TYPE_SMART                     = 1,
     NVME_AER_TYPE_IO_SPECIFIC               = 6,
     NVME_AER_TYPE_VENDOR_SPECIFIC           = 7,
-    NVME_AER_INFO_ERR_INVALID_SQ            = 0,
-    NVME_AER_INFO_ERR_INVALID_DB            = 1,
+    NVME_AER_INFO_ERR_INVALID_DB_REGISTER   = 0,
+    NVME_AER_INFO_ERR_INVALID_DB_VALUE      = 1,
     NVME_AER_INFO_ERR_DIAG_FAIL             = 2,
     NVME_AER_INFO_ERR_PERS_INTERNAL_ERR     = 3,
     NVME_AER_INFO_ERR_TRANS_INTERNAL_ERR    = 4,
@@ -899,6 +899,10 @@ typedef struct NvmeFeatureVal {
 
 #define NVME_TEMP_TMPTH(temp) ((temp >>  0) & 0xffff)
 
+#define NVME_AEC_SMART(aec)         (aec & 0xff)
+#define NVME_AEC_NS_ATTR(aec)       ((aec >> 8) & 0x1)
+#define NVME_AEC_FW_ACTIVATION(aec) ((aec >> 9) & 0x1)
+
 enum NvmeFeatureIds {
     NVME_ARBITRATION                = 0x1,
     NVME_POWER_MANAGEMENT           = 0x2,
-- 
2.27.0




* [PATCH v3 09/18] hw/block/nvme: move NvmeFeatureVal into hw/block/nvme.h
  2020-07-06  6:12 [PATCH v3 00/18] hw/block/nvme: bump to v1.3 Klaus Jensen
                   ` (7 preceding siblings ...)
  2020-07-06  6:12 ` [PATCH v3 08/18] hw/block/nvme: add support for the asynchronous event request command Klaus Jensen
@ 2020-07-06  6:12 ` Klaus Jensen
  2020-07-29 10:46   ` Maxim Levitsky
  2020-07-06  6:12 ` [PATCH v3 10/18] hw/block/nvme: flush write cache when disabled Klaus Jensen
                   ` (9 subsequent siblings)
  18 siblings, 1 reply; 60+ messages in thread
From: Klaus Jensen @ 2020-07-06  6:12 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Dmitry Fomichev, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky,
	Philippe Mathieu-Daudé

From: Klaus Jensen <k.jensen@samsung.com>

The NvmeFeatureVal structure does not belong with the spec-related data
structures in include/block/nvme.h, which is shared between the
block-level nvme driver and the emulated nvme device.

Move it into the nvme device-specific header file, since the emulated
device is the only user of the structure. Also, remove the unused members.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
---
 hw/block/nvme.h      |  8 ++++++++
 include/block/nvme.h | 17 -----------------
 2 files changed, 8 insertions(+), 17 deletions(-)

diff --git a/hw/block/nvme.h b/hw/block/nvme.h
index 1837233617bb..b93067c9e4a1 100644
--- a/hw/block/nvme.h
+++ b/hw/block/nvme.h
@@ -79,6 +79,14 @@ static inline uint8_t nvme_ns_lbads(NvmeNamespace *ns)
 #define NVME(obj) \
         OBJECT_CHECK(NvmeCtrl, (obj), TYPE_NVME)
 
+typedef struct NvmeFeatureVal {
+    struct {
+        uint16_t temp_thresh_hi;
+        uint16_t temp_thresh_low;
+    };
+    uint32_t    async_config;
+} NvmeFeatureVal;
+
 typedef struct NvmeCtrl {
     PCIDevice    parent_obj;
     MemoryRegion iomem;
diff --git a/include/block/nvme.h b/include/block/nvme.h
index 2101292ed5e8..0dce15af6bcf 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -865,23 +865,6 @@ enum NvmeIdCtrlLpa {
 #define NVME_CTRL_SGLS_MPTR_SGL            (0x1 << 19)
 #define NVME_CTRL_SGLS_ADDR_OFFSET         (0x1 << 20)
 
-typedef struct NvmeFeatureVal {
-    uint32_t    arbitration;
-    uint32_t    power_mgmt;
-    struct {
-        uint16_t temp_thresh_hi;
-        uint16_t temp_thresh_low;
-    };
-    uint32_t    err_rec;
-    uint32_t    volatile_wc;
-    uint32_t    num_queues;
-    uint32_t    int_coalescing;
-    uint32_t    *int_vector_config;
-    uint32_t    write_atomicity;
-    uint32_t    async_config;
-    uint32_t    sw_prog_marker;
-} NvmeFeatureVal;
-
 #define NVME_ARB_AB(arb)    (arb & 0x7)
 #define NVME_ARB_LPW(arb)   ((arb >> 8) & 0xff)
 #define NVME_ARB_MPW(arb)   ((arb >> 16) & 0xff)
-- 
2.27.0




* [PATCH v3 10/18] hw/block/nvme: flush write cache when disabled
  2020-07-06  6:12 [PATCH v3 00/18] hw/block/nvme: bump to v1.3 Klaus Jensen
                   ` (8 preceding siblings ...)
  2020-07-06  6:12 ` [PATCH v3 09/18] hw/block/nvme: move NvmeFeatureVal into hw/block/nvme.h Klaus Jensen
@ 2020-07-06  6:12 ` Klaus Jensen
  2020-07-29 11:03   ` Maxim Levitsky
  2020-07-06  6:12 ` [PATCH v3 11/18] hw/block/nvme: add remaining mandatory controller parameters Klaus Jensen
                   ` (8 subsequent siblings)
  18 siblings, 1 reply; 60+ messages in thread
From: Klaus Jensen @ 2020-07-06  6:12 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Dmitry Fomichev, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky,
	Philippe Mathieu-Daudé

From: Klaus Jensen <k.jensen@samsung.com>

If a Set Features command disables the write cache while it is
currently enabled, flush it before disabling it.
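
The transition this describes can be sketched outside of QEMU roughly as
follows. This is only an illustration under stated assumptions:
MockBackend and mock_set_write_cache() are hypothetical stand-ins, not
the BlockBackend API used by the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a block backend; not QEMU's BlockBackend. */
typedef struct MockBackend {
    bool wce;          /* is the write cache currently enabled? */
    unsigned flushes;  /* number of flushes issued so far */
} MockBackend;

/* Apply bit 0 (WCE) of Set Features dword 11 to the mock backend. */
static void mock_set_write_cache(MockBackend *b, uint32_t dw11)
{
    bool enable = dw11 & 0x1;

    /* Flush only on an enabled -> disabled transition. */
    if (!enable && b->wce) {
        b->flushes++;
    }

    b->wce = enable;
}
```

Disabling an already disabled cache, or enabling the cache, issues no
flush in this sketch.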

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
---
 hw/block/nvme.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 80c7285bc1cf..8fce2ebf69e7 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -1166,6 +1166,10 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
 
         break;
     case NVME_VOLATILE_WRITE_CACHE:
+        if (!(dw11 & 0x1) && blk_enable_write_cache(n->conf.blk)) {
+            blk_flush(n->conf.blk);
+        }
+
         blk_set_enable_write_cache(n->conf.blk, dw11 & 1);
         break;
     case NVME_NUMBER_OF_QUEUES:
-- 
2.27.0




* [PATCH v3 11/18] hw/block/nvme: add remaining mandatory controller parameters
  2020-07-06  6:12 [PATCH v3 00/18] hw/block/nvme: bump to v1.3 Klaus Jensen
                   ` (9 preceding siblings ...)
  2020-07-06  6:12 ` [PATCH v3 10/18] hw/block/nvme: flush write cache when disabled Klaus Jensen
@ 2020-07-06  6:12 ` Klaus Jensen
  2020-07-29 11:31   ` Maxim Levitsky
  2020-07-06  6:12 ` [PATCH v3 12/18] hw/block/nvme: support the get/set features select and save fields Klaus Jensen
                   ` (7 subsequent siblings)
  18 siblings, 1 reply; 60+ messages in thread
From: Klaus Jensen @ 2020-07-06  6:12 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Dmitry Fomichev, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky,
	Philippe Mathieu-Daudé

From: Klaus Jensen <k.jensen@samsung.com>

Add support for any remaining mandatory controller operating parameters
(features).
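
As a rough illustration of the lookup-table approach taken in the diff
below, a support table indexed by Feature Identifier might look like
this. The FID_* names and feature_supported() are hypothetical and only
mirror a subset of the real table.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of feature identifiers, for illustration only. */
enum {
    FID_ARBITRATION = 0x1,
    FID_TIMESTAMP   = 0xe,
    FID_MAX         = 0x100,
};

/* Entries not named in the designated initializer default to false. */
static const bool feature_support[FID_MAX] = {
    [FID_ARBITRATION] = true,
    [FID_TIMESTAMP]   = true,
};

/* The FID is the low byte of dword 10, so it always indexes in range. */
static bool feature_supported(uint32_t dw10)
{
    uint8_t fid = dw10 & 0xff;

    return feature_support[fid];
}
```

Sizing the table to 0x100 entries means an 8-bit FID can never index out
of bounds, so no separate range check is needed.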

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
---
 hw/block/nvme.c       | 56 ++++++++++++++++++++++++++++++++++++++-----
 hw/block/trace-events |  2 ++
 include/block/nvme.h  | 10 +++++++-
 3 files changed, 61 insertions(+), 7 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 8fce2ebf69e7..2d85e853403f 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -71,6 +71,20 @@
             " in %s: " fmt "\n", __func__, ## __VA_ARGS__); \
     } while (0)
 
+static const bool nvme_feature_support[NVME_FID_MAX] = {
+    [NVME_ARBITRATION]              = true,
+    [NVME_POWER_MANAGEMENT]         = true,
+    [NVME_TEMPERATURE_THRESHOLD]    = true,
+    [NVME_ERROR_RECOVERY]           = true,
+    [NVME_VOLATILE_WRITE_CACHE]     = true,
+    [NVME_NUMBER_OF_QUEUES]         = true,
+    [NVME_INTERRUPT_COALESCING]     = true,
+    [NVME_INTERRUPT_VECTOR_CONF]    = true,
+    [NVME_WRITE_ATOMICITY]          = true,
+    [NVME_ASYNCHRONOUS_EVENT_CONF]  = true,
+    [NVME_TIMESTAMP]                = true,
+};
+
 static void nvme_process_sq(void *opaque);
 
 static uint16_t nvme_cid(NvmeRequest *req)
@@ -1070,8 +1084,20 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
     uint32_t dw10 = le32_to_cpu(cmd->cdw10);
     uint32_t dw11 = le32_to_cpu(cmd->cdw11);
     uint32_t result;
+    uint8_t fid = NVME_GETSETFEAT_FID(dw10);
+    uint16_t iv;
 
-    switch (dw10) {
+    static const uint32_t nvme_feature_default[NVME_FID_MAX] = {
+        [NVME_ARBITRATION] = NVME_ARB_AB_NOLIMIT,
+    };
+
+    trace_pci_nvme_getfeat(nvme_cid(req), fid, dw11);
+
+    if (!nvme_feature_support[fid]) {
+        return NVME_INVALID_FIELD | NVME_DNR;
+    }
+
+    switch (fid) {
     case NVME_TEMPERATURE_THRESHOLD:
         result = 0;
 
@@ -1101,6 +1127,18 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
         result = (n->params.max_ioqpairs - 1) |
             ((n->params.max_ioqpairs - 1) << 16);
         trace_pci_nvme_getfeat_numq(result);
+        break;
+    case NVME_INTERRUPT_VECTOR_CONF:
+        iv = dw11 & 0xffff;
+        if (iv >= n->params.max_ioqpairs + 1) {
+            return NVME_INVALID_FIELD | NVME_DNR;
+        }
+
+        result = iv;
+        if (iv == n->admin_cq.vector) {
+            result |= NVME_INTVC_NOCOALESCING;
+        }
+
         break;
     case NVME_ASYNCHRONOUS_EVENT_CONF:
         result = n->features.async_config;
@@ -1108,8 +1146,8 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
     case NVME_TIMESTAMP:
         return nvme_get_feature_timestamp(n, cmd);
     default:
-        trace_pci_nvme_err_invalid_getfeat(dw10);
-        return NVME_INVALID_FIELD | NVME_DNR;
+        result = nvme_feature_default[fid];
+        break;
     }
 
     req->cqe.result = cpu_to_le32(result);
@@ -1138,8 +1176,15 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
 {
     uint32_t dw10 = le32_to_cpu(cmd->cdw10);
     uint32_t dw11 = le32_to_cpu(cmd->cdw11);
+    uint8_t fid = NVME_GETSETFEAT_FID(dw10);
 
-    switch (dw10) {
+    trace_pci_nvme_setfeat(nvme_cid(req), fid, dw11);
+
+    if (!nvme_feature_support[fid]) {
+        return NVME_INVALID_FIELD | NVME_DNR;
+    }
+
+    switch (fid) {
     case NVME_TEMPERATURE_THRESHOLD:
         if (NVME_TEMP_TMPSEL(dw11) != NVME_TEMP_TMPSEL_COMPOSITE) {
             break;
@@ -1186,8 +1231,7 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
     case NVME_TIMESTAMP:
         return nvme_set_feature_timestamp(n, cmd);
     default:
-        trace_pci_nvme_err_invalid_setfeat(dw10);
-        return NVME_INVALID_FIELD | NVME_DNR;
+        return NVME_FEAT_NOT_CHANGEABLE | NVME_DNR;
     }
     return NVME_SUCCESS;
 }
diff --git a/hw/block/trace-events b/hw/block/trace-events
index 091af16ca7d7..42e62f4649f8 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -46,6 +46,8 @@ pci_nvme_identify_ctrl(void) "identify controller"
 pci_nvme_identify_ns(uint32_t ns) "nsid %"PRIu32""
 pci_nvme_identify_nslist(uint32_t ns) "nsid %"PRIu32""
 pci_nvme_get_log(uint16_t cid, uint8_t lid, uint8_t lsp, uint8_t rae, uint32_t len, uint64_t off) "cid %"PRIu16" lid 0x%"PRIx8" lsp 0x%"PRIx8" rae 0x%"PRIx8" len %"PRIu32" off %"PRIu64""
+pci_nvme_getfeat(uint16_t cid, uint8_t fid, uint32_t cdw11) "cid %"PRIu16" fid 0x%"PRIx8" cdw11 0x%"PRIx32""
+pci_nvme_setfeat(uint16_t cid, uint8_t fid, uint32_t cdw11) "cid %"PRIu16" fid 0x%"PRIx8" cdw11 0x%"PRIx32""
 pci_nvme_getfeat_vwcache(const char* result) "get feature volatile write cache, result=%s"
 pci_nvme_getfeat_numq(int result) "get feature number of queues, result=%d"
 pci_nvme_setfeat_numq(int reqcq, int reqsq, int gotcq, int gotsq) "requested cq_count=%d sq_count=%d, responding with cq_count=%d sq_count=%d"
diff --git a/include/block/nvme.h b/include/block/nvme.h
index 0dce15af6bcf..cd396111b2f5 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -662,6 +662,7 @@ enum NvmeStatusCodes {
     NVME_FW_REQ_RESET           = 0x010b,
     NVME_INVALID_QUEUE_DEL      = 0x010c,
     NVME_FID_NOT_SAVEABLE       = 0x010d,
+    NVME_FEAT_NOT_CHANGEABLE    = 0x010e,
     NVME_FID_NOT_NSID_SPEC      = 0x010f,
     NVME_FW_REQ_SUSYSTEM_RESET  = 0x0110,
     NVME_CONFLICTING_ATTRS      = 0x0180,
@@ -866,6 +867,7 @@ enum NvmeIdCtrlLpa {
 #define NVME_CTRL_SGLS_ADDR_OFFSET         (0x1 << 20)
 
 #define NVME_ARB_AB(arb)    (arb & 0x7)
+#define NVME_ARB_AB_NOLIMIT 0x7
 #define NVME_ARB_LPW(arb)   ((arb >> 8) & 0xff)
 #define NVME_ARB_MPW(arb)   ((arb >> 16) & 0xff)
 #define NVME_ARB_HPW(arb)   ((arb >> 24) & 0xff)
@@ -873,6 +875,8 @@ enum NvmeIdCtrlLpa {
 #define NVME_INTC_THR(intc)     (intc & 0xff)
 #define NVME_INTC_TIME(intc)    ((intc >> 8) & 0xff)
 
+#define NVME_INTVC_NOCOALESCING (0x1 << 16)
+
 #define NVME_TEMP_THSEL(temp)  ((temp >> 20) & 0x3)
 #define NVME_TEMP_THSEL_OVER   0x0
 #define NVME_TEMP_THSEL_UNDER  0x1
@@ -899,9 +903,13 @@ enum NvmeFeatureIds {
     NVME_WRITE_ATOMICITY            = 0xa,
     NVME_ASYNCHRONOUS_EVENT_CONF    = 0xb,
     NVME_TIMESTAMP                  = 0xe,
-    NVME_SOFTWARE_PROGRESS_MARKER   = 0x80
+    NVME_SOFTWARE_PROGRESS_MARKER   = 0x80,
+    NVME_FID_MAX                    = 0x100,
 };
 
+#define NVME_GETSETFEAT_FID_MASK 0xff
+#define NVME_GETSETFEAT_FID(dw10) (dw10 & NVME_GETSETFEAT_FID_MASK)
+
 typedef struct NvmeRangeType {
     uint8_t     type;
     uint8_t     attributes;
-- 
2.27.0




* [PATCH v3 12/18] hw/block/nvme: support the get/set features select and save fields
  2020-07-06  6:12 [PATCH v3 00/18] hw/block/nvme: bump to v1.3 Klaus Jensen
                   ` (10 preceding siblings ...)
  2020-07-06  6:12 ` [PATCH v3 11/18] hw/block/nvme: add remaining mandatory controller parameters Klaus Jensen
@ 2020-07-06  6:12 ` Klaus Jensen
  2020-07-08 19:25   ` Dmitry Fomichev
  2020-07-29 13:17   ` Maxim Levitsky
  2020-07-06  6:12 ` [PATCH v3 13/18] hw/block/nvme: make sure ncqr and nsqr is valid Klaus Jensen
                   ` (6 subsequent siblings)
  18 siblings, 2 replies; 60+ messages in thread
From: Klaus Jensen @ 2020-07-06  6:12 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Dmitry Fomichev, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky,
	Philippe Mathieu-Daudé

From: Klaus Jensen <k.jensen@samsung.com>

Since the device does not have any persistent state storage, no
features are "saveable" and setting the Save (SV) field in any Set
Features command will result in a Feature Identifier Not Saveable status
code.

Similarly, if the Select (SEL) field is set to request saved values, the
device will (as it should) return the default values instead.

Since this also introduces "Supported Capabilities", the nsid field is
now also checked for validity with respect to the feature being read or
set.
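
For orientation, the fields decoded from dword 10 can be sketched as
below. FeatDw10 and decode_dw10() are hypothetical helpers, not part of
the patch. Note that SEL is only meaningful for Get Features and SV only
for Set Features; the sketch decodes both from the same dword purely for
illustration.

```c
#include <stdint.h>

/* Hypothetical container for the fields of Get/Set Features dword 10. */
typedef struct FeatDw10 {
    uint8_t fid;   /* bits 7:0, Feature Identifier */
    uint8_t sel;   /* bits 10:8, Select (Get Features) */
    uint8_t save;  /* bit 31, Save (Set Features) */
} FeatDw10;

static FeatDw10 decode_dw10(uint32_t dw10)
{
    FeatDw10 d = {
        .fid  = dw10 & 0xff,
        .sel  = (dw10 >> 8) & 0x7,
        .save = (dw10 >> 31) & 0x1,
    };

    return d;
}
```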

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
---
 hw/block/nvme.c       | 103 +++++++++++++++++++++++++++++++++++++-----
 hw/block/trace-events |   4 +-
 include/block/nvme.h  |  27 ++++++++++-
 3 files changed, 119 insertions(+), 15 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 2d85e853403f..df8b786e4875 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -85,6 +85,14 @@ static const bool nvme_feature_support[NVME_FID_MAX] = {
     [NVME_TIMESTAMP]                = true,
 };
 
+static const uint32_t nvme_feature_cap[NVME_FID_MAX] = {
+    [NVME_TEMPERATURE_THRESHOLD]    = NVME_FEAT_CAP_CHANGE,
+    [NVME_VOLATILE_WRITE_CACHE]     = NVME_FEAT_CAP_CHANGE,
+    [NVME_NUMBER_OF_QUEUES]         = NVME_FEAT_CAP_CHANGE,
+    [NVME_ASYNCHRONOUS_EVENT_CONF]  = NVME_FEAT_CAP_CHANGE,
+    [NVME_TIMESTAMP]                = NVME_FEAT_CAP_CHANGE,
+};
+
 static void nvme_process_sq(void *opaque);
 
 static uint16_t nvme_cid(NvmeRequest *req)
@@ -1083,20 +1091,47 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
 {
     uint32_t dw10 = le32_to_cpu(cmd->cdw10);
     uint32_t dw11 = le32_to_cpu(cmd->cdw11);
+    uint32_t nsid = le32_to_cpu(cmd->nsid);
     uint32_t result;
     uint8_t fid = NVME_GETSETFEAT_FID(dw10);
+    NvmeGetFeatureSelect sel = NVME_GETFEAT_SELECT(dw10);
     uint16_t iv;
 
     static const uint32_t nvme_feature_default[NVME_FID_MAX] = {
         [NVME_ARBITRATION] = NVME_ARB_AB_NOLIMIT,
     };
 
-    trace_pci_nvme_getfeat(nvme_cid(req), fid, dw11);
+    trace_pci_nvme_getfeat(nvme_cid(req), fid, sel, dw11);
 
     if (!nvme_feature_support[fid]) {
         return NVME_INVALID_FIELD | NVME_DNR;
     }
 
+    if (nvme_feature_cap[fid] & NVME_FEAT_CAP_NS) {
+        if (!nsid || nsid > n->num_namespaces) {
+            /*
+             * The Reservation Notification Mask and Reservation Persistence
+             * features require a status code of Invalid Field in Command when
+             * NSID is 0xFFFFFFFF. Since the device does not support those
+             * features we can always return Invalid Namespace or Format as we
+             * should do for all other features.
+             */
+            return NVME_INVALID_NSID | NVME_DNR;
+        }
+    }
+
+    switch (sel) {
+    case NVME_GETFEAT_SELECT_CURRENT:
+        break;
+    case NVME_GETFEAT_SELECT_SAVED:
+        /* no features are saveable by the controller; fallthrough */
+    case NVME_GETFEAT_SELECT_DEFAULT:
+        goto defaults;
+    case NVME_GETFEAT_SELECT_CAP:
+        result = nvme_feature_cap[fid];
+        goto out;
+    }
+
     switch (fid) {
     case NVME_TEMPERATURE_THRESHOLD:
         result = 0;
@@ -1106,22 +1141,45 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
          * return 0 for all other sensors.
          */
         if (NVME_TEMP_TMPSEL(dw11) != NVME_TEMP_TMPSEL_COMPOSITE) {
-            break;
+            goto out;
         }
 
         switch (NVME_TEMP_THSEL(dw11)) {
         case NVME_TEMP_THSEL_OVER:
             result = n->features.temp_thresh_hi;
-            break;
+            goto out;
         case NVME_TEMP_THSEL_UNDER:
             result = n->features.temp_thresh_low;
-            break;
+            goto out;
         }
 
-        break;
+        return NVME_INVALID_FIELD | NVME_DNR;
     case NVME_VOLATILE_WRITE_CACHE:
         result = blk_enable_write_cache(n->conf.blk);
         trace_pci_nvme_getfeat_vwcache(result ? "enabled" : "disabled");
+        goto out;
+    case NVME_ASYNCHRONOUS_EVENT_CONF:
+        result = n->features.async_config;
+        goto out;
+    case NVME_TIMESTAMP:
+        return nvme_get_feature_timestamp(n, cmd);
+    default:
+        break;
+    }
+
+defaults:
+    switch (fid) {
+    case NVME_TEMPERATURE_THRESHOLD:
+        result = 0;
+
+        if (NVME_TEMP_TMPSEL(dw11) != NVME_TEMP_TMPSEL_COMPOSITE) {
+            break;
+        }
+
+        if (NVME_TEMP_THSEL(dw11) == NVME_TEMP_THSEL_OVER) {
+            result = NVME_TEMPERATURE_WARNING;
+        }
+
         break;
     case NVME_NUMBER_OF_QUEUES:
         result = (n->params.max_ioqpairs - 1) |
@@ -1140,16 +1198,12 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
         }
 
         break;
-    case NVME_ASYNCHRONOUS_EVENT_CONF:
-        result = n->features.async_config;
-        break;
-    case NVME_TIMESTAMP:
-        return nvme_get_feature_timestamp(n, cmd);
     default:
         result = nvme_feature_default[fid];
         break;
     }
 
+out:
     req->cqe.result = cpu_to_le32(result);
     return NVME_SUCCESS;
 }
@@ -1176,14 +1230,37 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
 {
     uint32_t dw10 = le32_to_cpu(cmd->cdw10);
     uint32_t dw11 = le32_to_cpu(cmd->cdw11);
+    uint32_t nsid = le32_to_cpu(cmd->nsid);
     uint8_t fid = NVME_GETSETFEAT_FID(dw10);
+    uint8_t save = NVME_SETFEAT_SAVE(dw10);
 
-    trace_pci_nvme_setfeat(nvme_cid(req), fid, dw11);
+    trace_pci_nvme_setfeat(nvme_cid(req), fid, save, dw11);
+
+    if (save) {
+        return NVME_FID_NOT_SAVEABLE | NVME_DNR;
+    }
 
     if (!nvme_feature_support[fid]) {
         return NVME_INVALID_FIELD | NVME_DNR;
     }
 
+    if (nvme_feature_cap[fid] & NVME_FEAT_CAP_NS) {
+        if (!nsid || (nsid != NVME_NSID_BROADCAST &&
+                      nsid > n->num_namespaces)) {
+            return NVME_INVALID_NSID | NVME_DNR;
+        }
+    } else if (nsid && nsid != NVME_NSID_BROADCAST) {
+        if (nsid > n->num_namespaces) {
+            return NVME_INVALID_NSID | NVME_DNR;
+        }
+
+        return NVME_FEAT_NOT_NS_SPEC | NVME_DNR;
+    }
+
+    if (!(nvme_feature_cap[fid] & NVME_FEAT_CAP_CHANGE)) {
+        return NVME_FEAT_NOT_CHANGEABLE | NVME_DNR;
+    }
+
     switch (fid) {
     case NVME_TEMPERATURE_THRESHOLD:
         if (NVME_TEMP_TMPSEL(dw11) != NVME_TEMP_TMPSEL_COMPOSITE) {
@@ -2028,7 +2105,9 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
     id->sqes = (0x6 << 4) | 0x6;
     id->cqes = (0x4 << 4) | 0x4;
     id->nn = cpu_to_le32(n->num_namespaces);
-    id->oncs = cpu_to_le16(NVME_ONCS_WRITE_ZEROS | NVME_ONCS_TIMESTAMP);
+    id->oncs = cpu_to_le16(NVME_ONCS_WRITE_ZEROS | NVME_ONCS_TIMESTAMP |
+                           NVME_ONCS_FEATURES);
+
     id->psd[0].mp = cpu_to_le16(0x9c4);
     id->psd[0].enlat = cpu_to_le32(0x10);
     id->psd[0].exlat = cpu_to_le32(0x4);
diff --git a/hw/block/trace-events b/hw/block/trace-events
index 42e62f4649f8..4a4ef34071df 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -46,8 +46,8 @@ pci_nvme_identify_ctrl(void) "identify controller"
 pci_nvme_identify_ns(uint32_t ns) "nsid %"PRIu32""
 pci_nvme_identify_nslist(uint32_t ns) "nsid %"PRIu32""
 pci_nvme_get_log(uint16_t cid, uint8_t lid, uint8_t lsp, uint8_t rae, uint32_t len, uint64_t off) "cid %"PRIu16" lid 0x%"PRIx8" lsp 0x%"PRIx8" rae 0x%"PRIx8" len %"PRIu32" off %"PRIu64""
-pci_nvme_getfeat(uint16_t cid, uint8_t fid, uint32_t cdw11) "cid %"PRIu16" fid 0x%"PRIx8" cdw11 0x%"PRIx32""
-pci_nvme_setfeat(uint16_t cid, uint8_t fid, uint32_t cdw11) "cid %"PRIu16" fid 0x%"PRIx8" cdw11 0x%"PRIx32""
+pci_nvme_getfeat(uint16_t cid, uint8_t fid, uint8_t sel, uint32_t cdw11) "cid %"PRIu16" fid 0x%"PRIx8" sel 0x%"PRIx8" cdw11 0x%"PRIx32""
+pci_nvme_setfeat(uint16_t cid, uint8_t fid, uint8_t save, uint32_t cdw11) "cid %"PRIu16" fid 0x%"PRIx8" save 0x%"PRIx8" cdw11 0x%"PRIx32""
 pci_nvme_getfeat_vwcache(const char* result) "get feature volatile write cache, result=%s"
 pci_nvme_getfeat_numq(int result) "get feature number of queues, result=%d"
 pci_nvme_setfeat_numq(int reqcq, int reqsq, int gotcq, int gotsq) "requested cq_count=%d sq_count=%d, responding with cq_count=%d sq_count=%d"
diff --git a/include/block/nvme.h b/include/block/nvme.h
index cd396111b2f5..179e20a01477 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -663,7 +663,7 @@ enum NvmeStatusCodes {
     NVME_INVALID_QUEUE_DEL      = 0x010c,
     NVME_FID_NOT_SAVEABLE       = 0x010d,
     NVME_FEAT_NOT_CHANGEABLE    = 0x010e,
-    NVME_FID_NOT_NSID_SPEC      = 0x010f,
+    NVME_FEAT_NOT_NS_SPEC       = 0x010f,
     NVME_FW_REQ_SUSYSTEM_RESET  = 0x0110,
     NVME_CONFLICTING_ATTRS      = 0x0180,
     NVME_INVALID_PROT_INFO      = 0x0181,
@@ -907,9 +907,32 @@ enum NvmeFeatureIds {
     NVME_FID_MAX                    = 0x100,
 };
 
+typedef enum NvmeFeatureCap {
+    NVME_FEAT_CAP_SAVE      = 1 << 0,
+    NVME_FEAT_CAP_NS        = 1 << 1,
+    NVME_FEAT_CAP_CHANGE    = 1 << 2,
+} NvmeFeatureCap;
+
+typedef enum NvmeGetFeatureSelect {
+    NVME_GETFEAT_SELECT_CURRENT = 0x0,
+    NVME_GETFEAT_SELECT_DEFAULT = 0x1,
+    NVME_GETFEAT_SELECT_SAVED   = 0x2,
+    NVME_GETFEAT_SELECT_CAP     = 0x3,
+} NvmeGetFeatureSelect;
+
 #define NVME_GETSETFEAT_FID_MASK 0xff
 #define NVME_GETSETFEAT_FID(dw10) (dw10 & NVME_GETSETFEAT_FID_MASK)
 
+#define NVME_GETFEAT_SELECT_SHIFT 8
+#define NVME_GETFEAT_SELECT_MASK  0x7
+#define NVME_GETFEAT_SELECT(dw10) \
+    ((dw10 >> NVME_GETFEAT_SELECT_SHIFT) & NVME_GETFEAT_SELECT_MASK)
+
+#define NVME_SETFEAT_SAVE_SHIFT 31
+#define NVME_SETFEAT_SAVE_MASK  0x1
+#define NVME_SETFEAT_SAVE(dw10) \
+    ((dw10 >> NVME_SETFEAT_SAVE_SHIFT) & NVME_SETFEAT_SAVE_MASK)
+
 typedef struct NvmeRangeType {
     uint8_t     type;
     uint8_t     attributes;
@@ -926,6 +949,8 @@ typedef struct NvmeLBAF {
     uint8_t     rp;
 } NvmeLBAF;
 
+#define NVME_NSID_BROADCAST 0xffffffff
+
 typedef struct NvmeIdNs {
     uint64_t    nsze;
     uint64_t    ncap;
-- 
2.27.0




* [PATCH v3 13/18] hw/block/nvme: make sure ncqr and nsqr is valid
  2020-07-06  6:12 [PATCH v3 00/18] hw/block/nvme: bump to v1.3 Klaus Jensen
                   ` (11 preceding siblings ...)
  2020-07-06  6:12 ` [PATCH v3 12/18] hw/block/nvme: support the get/set features select and save fields Klaus Jensen
@ 2020-07-06  6:12 ` Klaus Jensen
  2020-07-06  6:12 ` [PATCH v3 14/18] hw/block/nvme: support identify namespace descriptor list Klaus Jensen
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 60+ messages in thread
From: Klaus Jensen @ 2020-07-06  6:12 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Dmitry Fomichev, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky,
	Philippe Mathieu-Daudé

From: Klaus Jensen <k.jensen@samsung.com>

0xffff is not an allowed value for NCQR and NSQR in Set Features on
Number of Queues.
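
A minimal standalone sketch of the check, assuming the usual layout of
dword 11 for this feature (NSQR in bits 15:0, NCQR in bits 31:16);
numq_fields_valid() is a hypothetical helper name:

```c
#include <stdbool.h>
#include <stdint.h>

/* Return false if either 0's based queue count is the reserved 0xffff. */
static bool numq_fields_valid(uint32_t dw11)
{
    uint16_t nsqr = dw11 & 0xffff;          /* submission queues requested */
    uint16_t ncqr = (dw11 >> 16) & 0xffff;  /* completion queues requested */

    return nsqr != 0xffff && ncqr != 0xffff;
}
```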

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Acked-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
---
 hw/block/nvme.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index df8b786e4875..37e4fd8dfce1 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -1295,6 +1295,14 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
         blk_set_enable_write_cache(n->conf.blk, dw11 & 1);
         break;
     case NVME_NUMBER_OF_QUEUES:
+        /*
+         * NVMe v1.3, Section 5.21.1.7: 0xffff is not an allowed value for NCQR
+         * and NSQR.
+         */
+        if ((dw11 & 0xffff) == 0xffff || ((dw11 >> 16) & 0xffff) == 0xffff) {
+            return NVME_INVALID_FIELD | NVME_DNR;
+        }
+
         trace_pci_nvme_setfeat_numq((dw11 & 0xFFFF) + 1,
                                     ((dw11 >> 16) & 0xFFFF) + 1,
                                     n->params.max_ioqpairs,
-- 
2.27.0




* [PATCH v3 14/18] hw/block/nvme: support identify namespace descriptor list
  2020-07-06  6:12 [PATCH v3 00/18] hw/block/nvme: bump to v1.3 Klaus Jensen
                   ` (12 preceding siblings ...)
  2020-07-06  6:12 ` [PATCH v3 13/18] hw/block/nvme: make sure ncqr and nsqr is valid Klaus Jensen
@ 2020-07-06  6:12 ` Klaus Jensen
  2020-07-29 13:25   ` Maxim Levitsky
  2020-07-06  6:13 ` [PATCH v3 15/18] hw/block/nvme: reject invalid nsid values in active namespace id list Klaus Jensen
                   ` (4 subsequent siblings)
  18 siblings, 1 reply; 60+ messages in thread
From: Klaus Jensen @ 2020-07-06  6:12 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Dmitry Fomichev, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky,
	Philippe Mathieu-Daudé

From: Klaus Jensen <k.jensen@samsung.com>

Since we are not providing the NGUID or EUI64 fields, we must support
the Namespace UUID. We do not have any way of storing a persistent
unique identifier, so conjure up a UUID that is just the namespace id.
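
To make the resulting identifier concrete, the following sketch builds
the same kind of 16-byte value by storing the namespace id big-endian in
the first four bytes and zeroing the rest. nsid_to_uuid() is a
hypothetical helper; the patch itself uses stl_be_p() on a zeroed buffer.

```c
#include <stdint.h>
#include <string.h>

/* Fill a 16-byte UUID with the nsid, big-endian, in bytes 0..3. */
static void nsid_to_uuid(uint32_t nsid, uint8_t uuid[16])
{
    memset(uuid, 0, 16);

    uuid[0] = (uint8_t)(nsid >> 24);
    uuid[1] = (uint8_t)(nsid >> 16);
    uuid[2] = (uint8_t)(nsid >> 8);
    uuid[3] = (uint8_t)nsid;
}
```

Such a value is stable across restarts but is not a well-formed random
or time-based UUID; it is only unique among namespaces of one device.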

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
---
 hw/block/nvme.c       | 41 +++++++++++++++++++++++++++++++++++++++++
 hw/block/trace-events |  1 +
 2 files changed, 42 insertions(+)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 37e4fd8dfce1..fc58f3d76530 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -1007,6 +1007,45 @@ static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeIdentify *c)
     return ret;
 }
 
+static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeIdentify *c)
+{
+    uint32_t nsid = le32_to_cpu(c->nsid);
+    uint64_t prp1 = le64_to_cpu(c->prp1);
+    uint64_t prp2 = le64_to_cpu(c->prp2);
+
+    uint8_t list[NVME_IDENTIFY_DATA_SIZE];
+
+    struct data {
+        struct {
+            NvmeIdNsDescr hdr;
+            uint8_t v[16];
+        } uuid;
+    };
+
+    struct data *ns_descrs = (struct data *)list;
+
+    trace_pci_nvme_identify_ns_descr_list(nsid);
+
+    if (unlikely(nsid == 0 || nsid > n->num_namespaces)) {
+        trace_pci_nvme_err_invalid_ns(nsid, n->num_namespaces);
+        return NVME_INVALID_NSID | NVME_DNR;
+    }
+
+    memset(list, 0x0, sizeof(list));
+
+    /*
+     * Because the NGUID and EUI64 fields are 0 in the Identify Namespace data
+     * structure, a Namespace UUID (nidt = 0x3) must be reported in the
+     * Namespace Identification Descriptor. Add a very basic Namespace UUID
+     * here.
+     */
+    ns_descrs->uuid.hdr.nidt = NVME_NIDT_UUID;
+    ns_descrs->uuid.hdr.nidl = NVME_NIDT_UUID_LEN;
+    stl_be_p(&ns_descrs->uuid.v, nsid);
+
+    return nvme_dma_read_prp(n, list, NVME_IDENTIFY_DATA_SIZE, prp1, prp2);
+}
+
 static uint16_t nvme_identify(NvmeCtrl *n, NvmeCmd *cmd)
 {
     NvmeIdentify *c = (NvmeIdentify *)cmd;
@@ -1018,6 +1057,8 @@ static uint16_t nvme_identify(NvmeCtrl *n, NvmeCmd *cmd)
         return nvme_identify_ctrl(n, c);
     case NVME_ID_CNS_NS_ACTIVE_LIST:
         return nvme_identify_nslist(n, c);
+    case NVME_ID_CNS_NS_DESCR_LIST:
+        return nvme_identify_ns_descr_list(n, c);
     default:
         trace_pci_nvme_err_invalid_identify_cns(le32_to_cpu(c->cns));
         return NVME_INVALID_FIELD | NVME_DNR;
diff --git a/hw/block/trace-events b/hw/block/trace-events
index 4a4ef34071df..7b7303cab1dd 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -45,6 +45,7 @@ pci_nvme_del_cq(uint16_t cqid) "deleted completion queue, cqid=%"PRIu16""
 pci_nvme_identify_ctrl(void) "identify controller"
 pci_nvme_identify_ns(uint32_t ns) "nsid %"PRIu32""
 pci_nvme_identify_nslist(uint32_t ns) "nsid %"PRIu32""
+pci_nvme_identify_ns_descr_list(uint32_t ns) "nsid %"PRIu32""
 pci_nvme_get_log(uint16_t cid, uint8_t lid, uint8_t lsp, uint8_t rae, uint32_t len, uint64_t off) "cid %"PRIu16" lid 0x%"PRIx8" lsp 0x%"PRIx8" rae 0x%"PRIx8" len %"PRIu32" off %"PRIu64""
 pci_nvme_getfeat(uint16_t cid, uint8_t fid, uint8_t sel, uint32_t cdw11) "cid %"PRIu16" fid 0x%"PRIx8" sel 0x%"PRIx8" cdw11 0x%"PRIx32""
 pci_nvme_setfeat(uint16_t cid, uint8_t fid, uint8_t save, uint32_t cdw11) "cid %"PRIu16" fid 0x%"PRIx8" save 0x%"PRIx8" cdw11 0x%"PRIx32""
-- 
2.27.0
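
For readers of the archive, the descriptor layout produced by
nvme_identify_ns_descr_list() can be sketched in isolation. This is a
minimal illustration, not the QEMU code: build_uuid_descr() and
store_be32() are hypothetical names, store_be32() stands in for QEMU's
stl_be_p(), and the two constants mirror NVME_NIDT_UUID and
NVME_NIDT_UUID_LEN from patch 01.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for NVME_NIDT_UUID and NVME_NIDT_UUID_LEN. */
enum { NIDT_UUID = 0x3, NIDT_UUID_LEN = 16 };

/* Size of the descriptor header: NIDT, NIDL and two reserved bytes. */
enum { DESCR_HDR_LEN = 4 };

/* Store a 32-bit value in big-endian byte order (assumed stl_be_p() analogue). */
static inline void store_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/*
 * Zero the buffer and write a single UUID descriptor at its start.
 * Only the first four identifier bytes are set (to the NSID); the
 * remaining twelve stay zero. Returns the descriptor size in bytes.
 */
static inline size_t build_uuid_descr(uint8_t *buf, size_t len, uint32_t nsid)
{
    assert(len >= DESCR_HDR_LEN + NIDT_UUID_LEN);

    memset(buf, 0, len);
    buf[0] = NIDT_UUID;       /* nidt */
    buf[1] = NIDT_UUID_LEN;   /* nidl */
    store_be32(&buf[DESCR_HDR_LEN], nsid);

    return DESCR_HDR_LEN + NIDT_UUID_LEN;
}
```

For nsid 1 this yields the bytes 03 10 00 00 followed by 00 00 00 01 and
twelve zero bytes, with the rest of the buffer left zeroed.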




* [PATCH v3 15/18] hw/block/nvme: reject invalid nsid values in active namespace id list
  2020-07-06  6:12 [PATCH v3 00/18] hw/block/nvme: bump to v1.3 Klaus Jensen
                   ` (13 preceding siblings ...)
  2020-07-06  6:12 ` [PATCH v3 14/18] hw/block/nvme: support identify namespace descriptor list Klaus Jensen
@ 2020-07-06  6:13 ` Klaus Jensen
  2020-07-06  9:47   ` Philippe Mathieu-Daudé
                     ` (2 more replies)
  2020-07-06  6:13 ` [PATCH v3 16/18] hw/block/nvme: enforce valid queue creation sequence Klaus Jensen
                   ` (3 subsequent siblings)
  18 siblings, 3 replies; 60+ messages in thread
From: Klaus Jensen @ 2020-07-06  6:13 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Dmitry Fomichev, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky,
	Philippe Mathieu-Daudé

From: Klaus Jensen <k.jensen@samsung.com>

Reject the nsid broadcast value (0xffffffff) and 0xfffffffe in the
Active Namespace ID list.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
---
 hw/block/nvme.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index fc58f3d76530..af39126cd8d1 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -992,6 +992,16 @@ static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeIdentify *c)
 
     trace_pci_nvme_identify_nslist(min_nsid);
 
+    /*
+     * Both 0xffffffff (NVME_NSID_BROADCAST) and 0xfffffffe are invalid values
+     * since the Active Namespace ID List should return namespaces with ids
+     * *higher* than the NSID specified in the command. This is also specified
+     * in the spec (NVM Express v1.3d, Section 5.15.4).
+     */
+    if (min_nsid >= NVME_NSID_BROADCAST - 1) {
+        return NVME_INVALID_NSID | NVME_DNR;
+    }
+
     list = g_malloc0(data_len);
     for (i = 0; i < n->num_namespaces; i++) {
         if (i < min_nsid) {
-- 
2.27.0
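
The boundary condition in the hunk above can be checked on its own. A
minimal sketch, assuming NVME_NSID_BROADCAST is 0xffffffff as stated in
the commit message; NSID_BROADCAST and nslist_min_nsid_valid() are
hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror NVME_NSID_BROADCAST. */
#define NSID_BROADCAST 0xffffffffu

/* The comparison below rejects exactly the two highest NSID values. */
static_assert(NSID_BROADCAST - 1 == 0xfffffffeu,
              "threshold is the value just below broadcast");

/*
 * The Active Namespace ID list returns ids higher than the NSID given
 * in the command, so no valid id can follow 0xfffffffe or 0xffffffff.
 */
static inline bool nslist_min_nsid_valid(uint32_t min_nsid)
{
    return min_nsid < NSID_BROADCAST - 1;
}
```

With this predicate, 0 and 0xfffffffd are accepted while 0xfffffffe and
0xffffffff are rejected, matching the `>=` test in the patch.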




* [PATCH v3 16/18] hw/block/nvme: enforce valid queue creation sequence
  2020-07-06  6:12 [PATCH v3 00/18] hw/block/nvme: bump to v1.3 Klaus Jensen
                   ` (14 preceding siblings ...)
  2020-07-06  6:13 ` [PATCH v3 15/18] hw/block/nvme: reject invalid nsid values in active namespace id list Klaus Jensen
@ 2020-07-06  6:13 ` Klaus Jensen
  2020-07-06  6:13 ` [PATCH v3 17/18] hw/block/nvme: provide the mandatory subnqn field Klaus Jensen
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 60+ messages in thread
From: Klaus Jensen @ 2020-07-06  6:13 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Dmitry Fomichev, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky,
	Philippe Mathieu-Daudé

From: Klaus Jensen <k.jensen@samsung.com>

Return Command Sequence Error if Set Features (Number of Queues) is
issued after queues have been created.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
---
 hw/block/nvme.c | 12 ++++++++++++
 hw/block/nvme.h |  1 +
 2 files changed, 13 insertions(+)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index af39126cd8d1..07d58aa945f2 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -946,6 +946,13 @@ static uint16_t nvme_create_cq(NvmeCtrl *n, NvmeCmd *cmd)
     cq = g_malloc0(sizeof(*cq));
     nvme_init_cq(cq, n, prp1, cqid, vector, qsize + 1,
         NVME_CQ_FLAGS_IEN(qflags));
+
+    /*
+     * It is only required to set qs_created when creating a completion queue;
+     * creating a submission queue without a matching completion queue will
+     * fail.
+     */
+    n->qs_created = true;
     return NVME_SUCCESS;
 }
 
@@ -1346,6 +1353,10 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
         blk_set_enable_write_cache(n->conf.blk, dw11 & 1);
         break;
     case NVME_NUMBER_OF_QUEUES:
+        if (n->qs_created) {
+            return NVME_CMD_SEQ_ERROR | NVME_DNR;
+        }
+
         /*
          * NVMe v1.3, Section 5.21.1.7: 0xffff is not an allowed value for NCQR
          * and NSQR.
@@ -1478,6 +1489,7 @@ static void nvme_clear_ctrl(NvmeCtrl *n)
 
     n->aer_queued = 0;
     n->outstanding_aers = 0;
+    n->qs_created = false;
 
     blk_flush(n->conf.blk);
     n->bar.cc = 0;
diff --git a/hw/block/nvme.h b/hw/block/nvme.h
index b93067c9e4a1..0b6a8ae66559 100644
--- a/hw/block/nvme.h
+++ b/hw/block/nvme.h
@@ -95,6 +95,7 @@ typedef struct NvmeCtrl {
     BlockConf    conf;
     NvmeParams   params;
 
+    bool        qs_created;
     uint32_t    page_size;
     uint16_t    page_bits;
     uint16_t    max_prp_ents;
-- 
2.27.0




* [PATCH v3 17/18] hw/block/nvme: provide the mandatory subnqn field
  2020-07-06  6:12 [PATCH v3 00/18] hw/block/nvme: bump to v1.3 Klaus Jensen
                   ` (15 preceding siblings ...)
  2020-07-06  6:13 ` [PATCH v3 16/18] hw/block/nvme: enforce valid queue creation sequence Klaus Jensen
@ 2020-07-06  6:13 ` Klaus Jensen
  2020-07-06  9:47   ` Philippe Mathieu-Daudé
                     ` (2 more replies)
  2020-07-06  6:13 ` [PATCH v3 18/18] hw/block/nvme: bump supported version to v1.3 Klaus Jensen
  2020-07-20  9:13 ` [PATCH v3 00/18] hw/block/nvme: bump " Klaus Jensen
  18 siblings, 3 replies; 60+ messages in thread
From: Klaus Jensen @ 2020-07-06  6:13 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Dmitry Fomichev, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky,
	Philippe Mathieu-Daudé

From: Klaus Jensen <k.jensen@samsung.com>

The SUBNQN field is mandatory in NVM Express 1.3.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
---
 hw/block/nvme.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 07d58aa945f2..e3984157926b 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -2141,6 +2141,7 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
 {
     NvmeIdCtrl *id = &n->id_ctrl;
     uint8_t *pci_conf = pci_dev->config;
+    char *subnqn;
 
     id->vid = cpu_to_le16(pci_get_word(pci_conf + PCI_VENDOR_ID));
     id->ssvid = cpu_to_le16(pci_get_word(pci_conf + PCI_SUBSYSTEM_VENDOR_ID));
@@ -2179,6 +2180,10 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
     id->oncs = cpu_to_le16(NVME_ONCS_WRITE_ZEROS | NVME_ONCS_TIMESTAMP |
                            NVME_ONCS_FEATURES);
 
+    subnqn = g_strdup_printf("nqn.2019-08.org.qemu:%s", n->params.serial);
+    strpadcpy((char *)id->subnqn, sizeof(id->subnqn), subnqn, '\0');
+    g_free(subnqn);
+
     id->psd[0].mp = cpu_to_le16(0x9c4);
     id->psd[0].enlat = cpu_to_le32(0x10);
     id->psd[0].exlat = cpu_to_le32(0x4);
-- 
2.27.0
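
How the 256-byte SUBNQN field ends up populated can be sketched without
the QEMU and GLib helpers. This is an illustration under assumptions:
pad_copy() is a hypothetical stand-in that is assumed to behave like
QEMU's strpadcpy() (copy up to the buffer size, fill the remainder with
the pad byte), build_subnqn() is a hypothetical wrapper, and snprintf()
replaces g_strdup_printf().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Size of the subnqn field in NvmeIdCtrl. */
#define SUBNQN_LEN 256

/*
 * Copy at most dst_len bytes of src into dst and fill whatever is left
 * with pad. No terminator is added if src fills the whole buffer.
 */
static inline void pad_copy(char *dst, size_t dst_len, const char *src,
                            char pad)
{
    size_t n = strlen(src);

    assert(dst_len > 0);
    if (n > dst_len) {
        n = dst_len;
    }
    memcpy(dst, src, n);
    memset(dst + n, pad, dst_len - n);
}

/* Format the NQN from the serial and NUL-pad it to the field size. */
static inline void build_subnqn(char out[SUBNQN_LEN], const char *serial)
{
    char tmp[SUBNQN_LEN + 1];

    snprintf(tmp, sizeof(tmp), "nqn.2019-08.org.qemu:%s", serial);
    pad_copy(out, SUBNQN_LEN, tmp, '\0');
}
```

For a serial of "deadbeef" the field holds
"nqn.2019-08.org.qemu:deadbeef" followed by NUL bytes up to offset 255.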




* [PATCH v3 18/18] hw/block/nvme: bump supported version to v1.3
  2020-07-06  6:12 [PATCH v3 00/18] hw/block/nvme: bump to v1.3 Klaus Jensen
                   ` (16 preceding siblings ...)
  2020-07-06  6:13 ` [PATCH v3 17/18] hw/block/nvme: provide the mandatory subnqn field Klaus Jensen
@ 2020-07-06  6:13 ` Klaus Jensen
  2020-07-20  9:13 ` [PATCH v3 00/18] hw/block/nvme: bump " Klaus Jensen
  18 siblings, 0 replies; 60+ messages in thread
From: Klaus Jensen @ 2020-07-06  6:13 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Dmitry Fomichev, Klaus Jensen, qemu-devel, Max Reitz,
	Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky,
	Philippe Mathieu-Daudé

From: Klaus Jensen <k.jensen@samsung.com>

Bump the supported NVM Express version to v1.3.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
---
 hw/block/nvme.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index e3984157926b..eda3fedb84e3 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -57,6 +57,7 @@
 #define NVME_MAX_IOQPAIRS 0xffff
 #define NVME_REG_SIZE 0x1000
 #define NVME_DB_SIZE  4
+#define NVME_SPEC_VER 0x00010300
 #define NVME_CMB_BIR 2
 #define NVME_PMR_BIR 2
 #define NVME_TEMPERATURE 0x143
@@ -2152,6 +2153,7 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
     id->ieee[0] = 0x00;
     id->ieee[1] = 0x02;
     id->ieee[2] = 0xb3;
+    id->ver = cpu_to_le32(NVME_SPEC_VER);
     id->oacs = cpu_to_le16(0);
 
     /*
@@ -2198,7 +2200,7 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
     NVME_CAP_SET_CSS(n->bar.cap, 1);
     NVME_CAP_SET_MPSMAX(n->bar.cap, 4);
 
-    n->bar.vs = 0x00010200;
+    n->bar.vs = NVME_SPEC_VER;
     n->bar.intmc = n->bar.intms = 0;
 }
 
-- 
2.27.0
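
The value 0x00010300 follows the layout of the NVMe version register:
major version in bits 31:16, minor version in bits 15:8 and tertiary
version in bits 7:0. A minimal sketch of that encoding follows; the
VS_VERSION macro and the vs_*() accessors are hypothetical helpers and
are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Compose a version value from major, minor and tertiary numbers. */
#define VS_VERSION(mjr, mnr, ter) \
    (((uint32_t)(mjr) << 16) | ((uint32_t)(mnr) << 8) | (uint32_t)(ter))

/* The old and new values used by the patch. */
static_assert(VS_VERSION(1, 2, 0) == 0x00010200u, "v1.2 encoding");
static_assert(VS_VERSION(1, 3, 0) == 0x00010300u, "v1.3 encoding");

/* Extract the individual fields again. */
static inline uint16_t vs_major(uint32_t vs)
{
    return (uint16_t)(vs >> 16);
}

static inline uint8_t vs_minor(uint32_t vs)
{
    return (uint8_t)((vs >> 8) & 0xff);
}

static inline uint8_t vs_tertiary(uint32_t vs)
{
    return (uint8_t)(vs & 0xff);
}
```

Sharing one constant between the VS register and the Identify
Controller VER field, as the patch does with NVME_SPEC_VER, keeps the
two from drifting apart.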




* Re: [PATCH v3 17/18] hw/block/nvme: provide the mandatory subnqn field
  2020-07-06  6:13 ` [PATCH v3 17/18] hw/block/nvme: provide the mandatory subnqn field Klaus Jensen
@ 2020-07-06  9:47   ` Philippe Mathieu-Daudé
  2020-07-08 19:26   ` Dmitry Fomichev
  2020-07-29 13:34   ` Maxim Levitsky
  2 siblings, 0 replies; 60+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-07-06  9:47 UTC (permalink / raw)
  To: Klaus Jensen, qemu-block
  Cc: Kevin Wolf, Dmitry Fomichev, Klaus Jensen, qemu-devel, Max Reitz,
	Keith Busch, Javier Gonzalez, Maxim Levitsky

On 7/6/20 8:13 AM, Klaus Jensen wrote:
> From: Klaus Jensen <k.jensen@samsung.com>
> 
> The SUBNQN field is mandatory in NVM Express 1.3.
> 
> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> ---
>  hw/block/nvme.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index 07d58aa945f2..e3984157926b 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -2141,6 +2141,7 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
>  {
>      NvmeIdCtrl *id = &n->id_ctrl;
>      uint8_t *pci_conf = pci_dev->config;
> +    char *subnqn;
>  
>      id->vid = cpu_to_le16(pci_get_word(pci_conf + PCI_VENDOR_ID));
>      id->ssvid = cpu_to_le16(pci_get_word(pci_conf + PCI_SUBSYSTEM_VENDOR_ID));
> @@ -2179,6 +2180,10 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
>      id->oncs = cpu_to_le16(NVME_ONCS_WRITE_ZEROS | NVME_ONCS_TIMESTAMP |
>                             NVME_ONCS_FEATURES);
>  
> +    subnqn = g_strdup_printf("nqn.2019-08.org.qemu:%s", n->params.serial);
> +    strpadcpy((char *)id->subnqn, sizeof(id->subnqn), subnqn, '\0');
> +    g_free(subnqn);
> +
>      id->psd[0].mp = cpu_to_le16(0x9c4);
>      id->psd[0].enlat = cpu_to_le32(0x10);
>      id->psd[0].exlat = cpu_to_le32(0x4);
> 

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>




* Re: [PATCH v3 15/18] hw/block/nvme: reject invalid nsid values in active namespace id list
  2020-07-06  6:13 ` [PATCH v3 15/18] hw/block/nvme: reject invalid nsid values in active namespace id list Klaus Jensen
@ 2020-07-06  9:47   ` Philippe Mathieu-Daudé
  2020-07-08 19:26   ` Dmitry Fomichev
  2020-07-29 13:27   ` Maxim Levitsky
  2 siblings, 0 replies; 60+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-07-06  9:47 UTC (permalink / raw)
  To: Klaus Jensen, qemu-block
  Cc: Kevin Wolf, Dmitry Fomichev, Klaus Jensen, qemu-devel, Max Reitz,
	Keith Busch, Javier Gonzalez, Maxim Levitsky

On 7/6/20 8:13 AM, Klaus Jensen wrote:
> From: Klaus Jensen <k.jensen@samsung.com>
> 
> Reject the nsid broadcast value (0xffffffff) and 0xfffffffe in the
> Active Namespace ID list.
> 
> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> ---
>  hw/block/nvme.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index fc58f3d76530..af39126cd8d1 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -992,6 +992,16 @@ static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeIdentify *c)
>  
>      trace_pci_nvme_identify_nslist(min_nsid);
>  
> +    /*
> +     * Both 0xffffffff (NVME_NSID_BROADCAST) and 0xfffffffe are invalid values
> +     * since the Active Namespace ID List should return namespaces with ids
> +     * *higher* than the NSID specified in the command. This is also specified
> +     * in the spec (NVM Express v1.3d, Section 5.15.4).
> +     */
> +    if (min_nsid >= NVME_NSID_BROADCAST - 1) {
> +        return NVME_INVALID_NSID | NVME_DNR;
> +    }
> +
>      list = g_malloc0(data_len);
>      for (i = 0; i < n->num_namespaces; i++) {
>          if (i < min_nsid) {
> 

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>




* Re: [PATCH v3 03/18] hw/block/nvme: additional tracing
  2020-07-06  6:12 ` [PATCH v3 03/18] hw/block/nvme: additional tracing Klaus Jensen
@ 2020-07-06  9:50   ` Philippe Mathieu-Daudé
  2020-07-08 19:21   ` Dmitry Fomichev
  2020-07-29  8:52   ` Maxim Levitsky
  2 siblings, 0 replies; 60+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-07-06  9:50 UTC (permalink / raw)
  To: Klaus Jensen, qemu-block
  Cc: Kevin Wolf, Dmitry Fomichev, Klaus Jensen, qemu-devel, Max Reitz,
	Keith Busch, Javier Gonzalez, Maxim Levitsky

On 7/6/20 8:12 AM, Klaus Jensen wrote:
> From: Klaus Jensen <k.jensen@samsung.com>
> 
> Add various additional tracing and streamline nvme_identify_ns and
> nvme_identify_nslist (they do not need to repeat the command, it is
> already in the trace name).
> 
> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> ---
>  hw/block/nvme.c       | 33 +++++++++++++++++++++++++++++++++
>  hw/block/trace-events | 13 +++++++++++--
>  2 files changed, 44 insertions(+), 2 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>




* Re: [PATCH v3 02/18] hw/block/nvme: fix missing endian conversion
  2020-07-06  6:12 ` [PATCH v3 02/18] hw/block/nvme: fix missing endian conversion Klaus Jensen
@ 2020-07-06  9:50   ` Philippe Mathieu-Daudé
  2020-07-08 19:20   ` Dmitry Fomichev
  2020-07-29  8:49   ` Maxim Levitsky
  2 siblings, 0 replies; 60+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-07-06  9:50 UTC (permalink / raw)
  To: Klaus Jensen, qemu-block
  Cc: Kevin Wolf, Dmitry Fomichev, Klaus Jensen, qemu-devel, Max Reitz,
	Keith Busch, Javier Gonzalez, Maxim Levitsky

On 7/6/20 8:12 AM, Klaus Jensen wrote:
> From: Klaus Jensen <k.jensen@samsung.com>
> 
> Fix a missing cpu_to conversion by moving conversion to just before
> returning instead.
> 
> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> Suggested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> ---
>  hw/block/nvme.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index 71b388aa0e20..766cd5b33bb1 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -815,8 +815,8 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>          trace_pci_nvme_getfeat_vwcache(result ? "enabled" : "disabled");
>          break;
>      case NVME_NUMBER_OF_QUEUES:
> -        result = cpu_to_le32((n->params.max_ioqpairs - 1) |
> -                             ((n->params.max_ioqpairs - 1) << 16));
> +        result = (n->params.max_ioqpairs - 1) |
> +            ((n->params.max_ioqpairs - 1) << 16);
>          trace_pci_nvme_getfeat_numq(result);
>          break;
>      case NVME_TIMESTAMP:
> @@ -826,7 +826,7 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>          return NVME_INVALID_FIELD | NVME_DNR;
>      }
>  
> -    req->cqe.result = result;
> +    req->cqe.result = cpu_to_le32(result);
>      return NVME_SUCCESS;
>  }
>  
> 

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
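
For readers of the archive, the pattern in the quoted patch (compute the
result in host byte order, trace it, convert once when storing it in the
completion entry) can be sketched as below. This is an illustration
only: numq_result() and put_le32() are hypothetical names, and
put_le32() is a byte-wise stand-in for cpu_to_le32() followed by a
store.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pack the zero-based queue counts in host byte order: the same value
 * goes into bits 15:0 and bits 31:16, as in the quoted hunk.
 */
static inline uint32_t numq_result(uint32_t max_ioqpairs)
{
    assert(max_ioqpairs >= 1 && max_ioqpairs <= 0xffff);

    uint32_t n = max_ioqpairs - 1;

    return n | (n << 16);
}

/* Serialize a host-order value as little-endian bytes, on any host. */
static inline void put_le32(uint8_t out[4], uint32_t v)
{
    out[0] = (uint8_t)v;
    out[1] = (uint8_t)(v >> 8);
    out[2] = (uint8_t)(v >> 16);
    out[3] = (uint8_t)(v >> 24);
}
```

With 64 queue pairs the host-order result is 0x003f003f, which is the
value a trace point would see, and the stored bytes are 3f 00 3f 00.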




* Re: [PATCH v3 01/18] hw/block/nvme: bump spec data structures to v1.3
  2020-07-06  6:12 ` [PATCH v3 01/18] hw/block/nvme: bump spec data structures " Klaus Jensen
@ 2020-07-08 19:19   ` Dmitry Fomichev
  2020-07-08 21:24     ` Klaus Jensen
  0 siblings, 1 reply; 60+ messages in thread
From: Dmitry Fomichev @ 2020-07-08 19:19 UTC (permalink / raw)
  To: its, qemu-block
  Cc: fam, kwolf, k.jensen, qemu-devel, mlevitsk, kbusch, javier.gonz,
	mreitz, philmd

Looks good with a small nit (see below),

Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>

On Mon, 2020-07-06 at 08:12 +0200, Klaus Jensen wrote:
> From: Klaus Jensen <k.jensen@samsung.com>
> 
> Add missing fields in the Identify Controller and Identify Namespace
> data structures to bring them in line with NVMe v1.3.
> 
> This also adds data structures and defines for SGL support which
> requires a couple of trivial changes to the nvme block driver as well.
> 
> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> Acked-by: Fam Zheng <fam@euphon.net>
> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
> ---
>  block/nvme.c         |  18 ++---
>  hw/block/nvme.c      |  12 ++--
>  include/block/nvme.h | 156 ++++++++++++++++++++++++++++++++++++++-----
>  3 files changed, 154 insertions(+), 32 deletions(-)
> 
> diff --git a/block/nvme.c b/block/nvme.c
> index 374e26891573..c1c4c07ac6cc 100644
> --- a/block/nvme.c
> +++ b/block/nvme.c
> @@ -518,7 +518,7 @@ static void nvme_identify(BlockDriverState *bs, int namespace, Error **errp)
>          error_setg(errp, "Cannot map buffer for DMA");
>          goto out;
>      }
> -    cmd.prp1 = cpu_to_le64(iova);
> +    cmd.dptr.prp1 = cpu_to_le64(iova);
>  
>      if (nvme_cmd_sync(bs, s->queues[0], &cmd)) {
>          error_setg(errp, "Failed to identify controller");
> @@ -629,7 +629,7 @@ static bool nvme_add_io_queue(BlockDriverState *bs, Error **errp)
>      }
>      cmd = (NvmeCmd) {
>          .opcode = NVME_ADM_CMD_CREATE_CQ,
> -        .prp1 = cpu_to_le64(q->cq.iova),
> +        .dptr.prp1 = cpu_to_le64(q->cq.iova),
>          .cdw10 = cpu_to_le32(((queue_size - 1) << 16) | (n & 0xFFFF)),
>          .cdw11 = cpu_to_le32(0x3),
>      };
> @@ -640,7 +640,7 @@ static bool nvme_add_io_queue(BlockDriverState *bs, Error **errp)
>      }
>      cmd = (NvmeCmd) {
>          .opcode = NVME_ADM_CMD_CREATE_SQ,
> -        .prp1 = cpu_to_le64(q->sq.iova),
> +        .dptr.prp1 = cpu_to_le64(q->sq.iova),
>          .cdw10 = cpu_to_le32(((queue_size - 1) << 16) | (n & 0xFFFF)),
>          .cdw11 = cpu_to_le32(0x1 | (n << 16)),
>      };
> @@ -988,16 +988,16 @@ try_map:
>      case 0:
>          abort();
>      case 1:
> -        cmd->prp1 = pagelist[0];
> -        cmd->prp2 = 0;
> +        cmd->dptr.prp1 = pagelist[0];
> +        cmd->dptr.prp2 = 0;
>          break;
>      case 2:
> -        cmd->prp1 = pagelist[0];
> -        cmd->prp2 = pagelist[1];
> +        cmd->dptr.prp1 = pagelist[0];
> +        cmd->dptr.prp2 = pagelist[1];
>          break;
>      default:
> -        cmd->prp1 = pagelist[0];
> -        cmd->prp2 = cpu_to_le64(req->prp_list_iova + sizeof(uint64_t));
> +        cmd->dptr.prp1 = pagelist[0];
> +        cmd->dptr.prp2 = cpu_to_le64(req->prp_list_iova + sizeof(uint64_t));
>          break;
>      }
>      trace_nvme_cmd_map_qiov(s, cmd, req, qiov, entries);
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index 1aee042d4cb2..71b388aa0e20 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -397,8 +397,8 @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeNamespace *ns, NvmeCmd *cmd,
>      NvmeRwCmd *rw = (NvmeRwCmd *)cmd;
>      uint32_t nlb  = le32_to_cpu(rw->nlb) + 1;
>      uint64_t slba = le64_to_cpu(rw->slba);
> -    uint64_t prp1 = le64_to_cpu(rw->prp1);
> -    uint64_t prp2 = le64_to_cpu(rw->prp2);
> +    uint64_t prp1 = le64_to_cpu(rw->dptr.prp1);
> +    uint64_t prp2 = le64_to_cpu(rw->dptr.prp2);
>  
>      uint8_t lba_index  = NVME_ID_NS_FLBAS_INDEX(ns->id_ns.flbas);
>      uint8_t data_shift = ns->id_ns.lbaf[lba_index].ds;
> @@ -795,8 +795,8 @@ static inline uint64_t nvme_get_timestamp(const NvmeCtrl *n)
>  
>  static uint16_t nvme_get_feature_timestamp(NvmeCtrl *n, NvmeCmd *cmd)
>  {
> -    uint64_t prp1 = le64_to_cpu(cmd->prp1);
> -    uint64_t prp2 = le64_to_cpu(cmd->prp2);
> +    uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
> +    uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
>  
>      uint64_t timestamp = nvme_get_timestamp(n);
>  
> @@ -834,8 +834,8 @@ static uint16_t nvme_set_feature_timestamp(NvmeCtrl *n, NvmeCmd *cmd)
>  {
>      uint16_t ret;
>      uint64_t timestamp;
> -    uint64_t prp1 = le64_to_cpu(cmd->prp1);
> -    uint64_t prp2 = le64_to_cpu(cmd->prp2);
> +    uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
> +    uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
>  
>      ret = nvme_dma_write_prp(n, (uint8_t *)&timestamp,
>                                  sizeof(timestamp), prp1, prp2);
> diff --git a/include/block/nvme.h b/include/block/nvme.h
> index 1720ee1d5158..2a80d2a7ed89 100644
> --- a/include/block/nvme.h
> +++ b/include/block/nvme.h
> @@ -377,15 +377,53 @@ enum NvmePmrmscMask {
>  #define NVME_PMRMSC_SET_CBA(pmrmsc, val)   \
>      (pmrmsc |= (uint64_t)(val & PMRMSC_CBA_MASK) << PMRMSC_CBA_SHIFT)
>  
> +enum NvmeSglDescriptorType {
> +    NVME_SGL_DESCR_TYPE_DATA_BLOCK          = 0x0,
> +    NVME_SGL_DESCR_TYPE_BIT_BUCKET          = 0x1,
> +    NVME_SGL_DESCR_TYPE_SEGMENT             = 0x2,
> +    NVME_SGL_DESCR_TYPE_LAST_SEGMENT        = 0x3,
> +    NVME_SGL_DESCR_TYPE_KEYED_DATA_BLOCK    = 0x4,
> +
> +    NVME_SGL_DESCR_TYPE_VENDOR_SPECIFIC     = 0xf,
> +};
> +
> +enum NvmeSglDescriptorSubtype {
> +    NVME_SGL_DESCR_SUBTYPE_ADDRESS = 0x0,
> +};
> +
> +typedef struct NvmeSglDescriptor {
> +    uint64_t addr;
> +    uint32_t len;
> +    uint8_t  rsvd[3];
> +    uint8_t  type;
> +} NvmeSglDescriptor;
> +
> +#define NVME_SGL_TYPE(type)     ((type >> 4) & 0xf)
> +#define NVME_SGL_SUBTYPE(type)  (type & 0xf)
> +
> +typedef union NvmeCmdDptr {
> +    struct {
> +        uint64_t    prp1;
> +        uint64_t    prp2;
> +    };
> +
> +    NvmeSglDescriptor sgl;
> +} NvmeCmdDptr;
> +
> +enum NvmePsdt {
> +    PSDT_PRP                 = 0x0,
> +    PSDT_SGL_MPTR_CONTIGUOUS = 0x1,
> +    PSDT_SGL_MPTR_SGL        = 0x2,
> +};
> +
>  typedef struct NvmeCmd {
>      uint8_t     opcode;
> -    uint8_t     fuse;
> +    uint8_t     flags;
>      uint16_t    cid;
>      uint32_t    nsid;
>      uint64_t    res1;
>      uint64_t    mptr;
> -    uint64_t    prp1;
> -    uint64_t    prp2;
> +    NvmeCmdDptr dptr;
>      uint32_t    cdw10;
>      uint32_t    cdw11;
>      uint32_t    cdw12;
> @@ -394,6 +432,9 @@ typedef struct NvmeCmd {
>      uint32_t    cdw15;
>  } NvmeCmd;
>  
> +#define NVME_CMD_FLAGS_FUSE(flags) (flags & 0x3)
> +#define NVME_CMD_FLAGS_PSDT(flags) ((flags >> 6) & 0x3)
> +
>  enum NvmeAdminCommands {
>      NVME_ADM_CMD_DELETE_SQ      = 0x00,
>      NVME_ADM_CMD_CREATE_SQ      = 0x01,
> @@ -493,8 +534,7 @@ typedef struct NvmeRwCmd {
>      uint32_t    nsid;
>      uint64_t    rsvd2;
>      uint64_t    mptr;
> -    uint64_t    prp1;
> -    uint64_t    prp2;
> +    NvmeCmdDptr dptr;
>      uint64_t    slba;
>      uint16_t    nlb;
>      uint16_t    control;
> @@ -534,8 +574,7 @@ typedef struct NvmeDsmCmd {
>      uint16_t    cid;
>      uint32_t    nsid;
>      uint64_t    rsvd2[2];
> -    uint64_t    prp1;
> -    uint64_t    prp2;
> +    NvmeCmdDptr dptr;
>      uint32_t    nr;
>      uint32_t    attributes;
>      uint32_t    rsvd12[4];
> @@ -599,6 +638,12 @@ enum NvmeStatusCodes {
>      NVME_CMD_ABORT_MISSING_FUSE = 0x000a,
>      NVME_INVALID_NSID           = 0x000b,
>      NVME_CMD_SEQ_ERROR          = 0x000c,
> +    NVME_INVALID_SGL_SEG_DESCR  = 0x000d,
> +    NVME_INVALID_NUM_SGL_DESCRS = 0x000e,
> +    NVME_DATA_SGL_LEN_INVALID   = 0x000f,
> +    NVME_MD_SGL_LEN_INVALID     = 0x0010,
> +    NVME_SGL_DESCR_TYPE_INVALID = 0x0011,
> +    NVME_INVALID_USE_OF_CMB     = 0x0012,
>      NVME_LBA_RANGE              = 0x0080,
>      NVME_CAP_EXCEEDED           = 0x0081,
>      NVME_NS_NOT_READY           = 0x0082,
> @@ -687,7 +732,7 @@ enum NvmeSmartWarn {
>      NVME_SMART_FAILED_VOLATILE_MEDIA  = 1 << 4,
>  };
>  
> -enum LogIdentifier {
> +enum NvmeLogIdentifier {
>      NVME_LOG_ERROR_INFO     = 0x01,
>      NVME_LOG_SMART_INFO     = 0x02,
>      NVME_LOG_FW_SLOT_INFO   = 0x03,
> @@ -711,6 +756,7 @@ enum {
>      NVME_ID_CNS_NS             = 0x0,
>      NVME_ID_CNS_CTRL           = 0x1,
>      NVME_ID_CNS_NS_ACTIVE_LIST = 0x2,
> +    NVME_ID_CNS_NS_DESCR_LIST  = 0x3,
>  };
>  
>  typedef struct NvmeIdCtrl {
> @@ -723,7 +769,15 @@ typedef struct NvmeIdCtrl {
>      uint8_t     ieee[3];
>      uint8_t     cmic;
>      uint8_t     mdts;
> -    uint8_t     rsvd255[178];
> +    uint16_t    cntlid;
> +    uint32_t    ver;
> +    uint32_t    rtd3r;
> +    uint32_t    rtd3e;
> +    uint32_t    oaes;
> +    uint32_t    ctratt;
> +    uint8_t     rsvd100[12];
> +    uint8_t     fguid[16];
> +    uint8_t     rsvd128[128];
>      uint16_t    oacs;
>      uint8_t     acl;
>      uint8_t     aerl;
> @@ -731,10 +785,28 @@ typedef struct NvmeIdCtrl {
>      uint8_t     lpa;
>      uint8_t     elpe;
>      uint8_t     npss;
> -    uint8_t     rsvd511[248];
> +    uint8_t     avscc;
> +    uint8_t     apsta;
> +    uint16_t    wctemp;
> +    uint16_t    cctemp;
> +    uint16_t    mtfa;
> +    uint32_t    hmpre;
> +    uint32_t    hmmin;
> +    uint8_t     tnvmcap[16];
> +    uint8_t     unvmcap[16];
> +    uint32_t    rpmbs;
> +    uint16_t    edstt;
> +    uint8_t     dsto;
> +    uint8_t     fwug;
> +    uint16_t    kas;
> +    uint16_t    hctma;
> +    uint16_t    mntmt;
> +    uint16_t    mxtmt;
> +    uint32_t    sanicap;
> +    uint8_t     rsvd332[180];
>      uint8_t     sqes;
>      uint8_t     cqes;
> -    uint16_t    rsvd515;
> +    uint16_t    maxcmd;
>      uint32_t    nn;
>      uint16_t    oncs;
>      uint16_t    fuses;
> @@ -742,8 +814,14 @@ typedef struct NvmeIdCtrl {
>      uint8_t     vwc;
>      uint16_t    awun;
>      uint16_t    awupf;
> -    uint8_t     rsvd703[174];
> -    uint8_t     rsvd2047[1344];
> +    uint8_t     nvscc;
> +    uint8_t     rsvd531;
> +    uint16_t    acwu;
> +    uint8_t     rsvd534[2];
> +    uint32_t    sgls;
> +    uint8_t     rsvd540[228];
> +    uint8_t     subnqn[256];
> +    uint8_t     rsvd1024[1024];
>      NvmePSD     psd[32];
>      uint8_t     vs[1024];
>  } NvmeIdCtrl;
> @@ -769,6 +847,16 @@ enum NvmeIdCtrlOncs {
>  #define NVME_CTRL_CQES_MIN(cqes) ((cqes) & 0xf)
>  #define NVME_CTRL_CQES_MAX(cqes) (((cqes) >> 4) & 0xf)
>  
> +#define NVME_CTRL_SGLS_SUPPORT_MASK        (0x3 <<  0)
> +#define NVME_CTRL_SGLS_SUPPORT_NO_ALIGN    (0x1 <<  0)
> +#define NVME_CTRL_SGLS_SUPPORT_DWORD_ALIGN (0x1 <<  1)
> +#define NVME_CTRL_SGLS_KEYED               (0x1 <<  2)
> +#define NVME_CTRL_SGLS_BITBUCKET           (0x1 << 16)
> +#define NVME_CTRL_SGLS_MPTR_CONTIGUOUS     (0x1 << 17)
> +#define NVME_CTRL_SGLS_EXCESS_LENGTH       (0x1 << 18)
> +#define NVME_CTRL_SGLS_MPTR_SGL            (0x1 << 19)
> +#define NVME_CTRL_SGLS_ADDR_OFFSET         (0x1 << 20)
> +
>  typedef struct NvmeFeatureVal {
>      uint32_t    arbitration;
>      uint32_t    power_mgmt;
> @@ -791,6 +879,15 @@ typedef struct NvmeFeatureVal {
>  #define NVME_INTC_THR(intc)     (intc & 0xff)
>  #define NVME_INTC_TIME(intc)    ((intc >> 8) & 0xff)
>  
> +#define NVME_TEMP_THSEL(temp)  ((temp >> 20) & 0x3)
> +#define NVME_TEMP_THSEL_OVER   0x0
> +#define NVME_TEMP_THSEL_UNDER  0x1
> +
> +#define NVME_TEMP_TMPSEL(temp)     ((temp >> 16) & 0xf)
> +#define NVME_TEMP_TMPSEL_COMPOSITE 0x0
> +
> +#define NVME_TEMP_TMPTH(temp) ((temp >>  0) & 0xffff)

Nit: there is an extra space after "temp >>" in NVME_TEMP_TMPTH above.

> +
>  enum NvmeFeatureIds {
>      NVME_ARBITRATION                = 0x1,
>      NVME_POWER_MANAGEMENT           = 0x2,
> @@ -833,18 +930,43 @@ typedef struct NvmeIdNs {
>      uint8_t     mc;
>      uint8_t     dpc;
>      uint8_t     dps;
> -
>      uint8_t     nmic;
>      uint8_t     rescap;
>      uint8_t     fpi;
>      uint8_t     dlfeat;
> -
> -    uint8_t     res34[94];
> +    uint16_t    nawun;
> +    uint16_t    nawupf;
> +    uint16_t    nacwu;
> +    uint16_t    nabsn;
> +    uint16_t    nabo;
> +    uint16_t    nabspf;
> +    uint16_t    noiob;
> +    uint8_t     nvmcap[16];
> +    uint8_t     rsvd64[40];
> +    uint8_t     nguid[16];
> +    uint64_t    eui64;
>      NvmeLBAF    lbaf[16];
> -    uint8_t     res192[192];
> +    uint8_t     rsvd192[192];
>      uint8_t     vs[3712];
>  } NvmeIdNs;
>  
> +typedef struct NvmeIdNsDescr {
> +    uint8_t nidt;
> +    uint8_t nidl;
> +    uint8_t rsvd2[2];
> +} NvmeIdNsDescr;
> +
> +enum {
> +    NVME_NIDT_EUI64_LEN =  8,
> +    NVME_NIDT_NGUID_LEN = 16,
> +    NVME_NIDT_UUID_LEN  = 16,
> +};
> +
> +enum NvmeNsIdentifierType {
> +    NVME_NIDT_EUI64 = 0x1,
> +    NVME_NIDT_NGUID = 0x2,
> +    NVME_NIDT_UUID  = 0x3,
> +};
>  
>  /*Deallocate Logical Block Features*/
>  #define NVME_ID_NS_DLFEAT_GUARD_CRC(dlfeat)       ((dlfeat) & 0x10)


* Re: [PATCH v3 02/18] hw/block/nvme: fix missing endian conversion
  2020-07-06  6:12 ` [PATCH v3 02/18] hw/block/nvme: fix missing endian conversion Klaus Jensen
  2020-07-06  9:50   ` Philippe Mathieu-Daudé
@ 2020-07-08 19:20   ` Dmitry Fomichev
  2020-07-29  8:49   ` Maxim Levitsky
  2 siblings, 0 replies; 60+ messages in thread
From: Dmitry Fomichev @ 2020-07-08 19:20 UTC (permalink / raw)
  To: its, qemu-block
  Cc: kwolf, k.jensen, qemu-devel, mlevitsk, kbusch, javier.gonz,
	mreitz, philmd

Looks good,

Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>

On Mon, 2020-07-06 at 08:12 +0200, Klaus Jensen wrote:
> From: Klaus Jensen <k.jensen@samsung.com>
> 
> Fix a missing cpu_to conversion by moving conversion to just before
> returning instead.
> 
> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> Suggested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> ---
>  hw/block/nvme.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index 71b388aa0e20..766cd5b33bb1 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -815,8 +815,8 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>          trace_pci_nvme_getfeat_vwcache(result ? "enabled" : "disabled");
>          break;
>      case NVME_NUMBER_OF_QUEUES:
> -        result = cpu_to_le32((n->params.max_ioqpairs - 1) |
> -                             ((n->params.max_ioqpairs - 1) << 16));
> +        result = (n->params.max_ioqpairs - 1) |
> +            ((n->params.max_ioqpairs - 1) << 16);
>          trace_pci_nvme_getfeat_numq(result);
>          break;
>      case NVME_TIMESTAMP:
> @@ -826,7 +826,7 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>          return NVME_INVALID_FIELD | NVME_DNR;
>      }
>  
> -    req->cqe.result = result;
> +    req->cqe.result = cpu_to_le32(result);
>      return NVME_SUCCESS;
>  }
>  


* Re: [PATCH v3 03/18] hw/block/nvme: additional tracing
  2020-07-06  6:12 ` [PATCH v3 03/18] hw/block/nvme: additional tracing Klaus Jensen
  2020-07-06  9:50   ` Philippe Mathieu-Daudé
@ 2020-07-08 19:21   ` Dmitry Fomichev
  2020-07-29  8:52   ` Maxim Levitsky
  2 siblings, 0 replies; 60+ messages in thread
From: Dmitry Fomichev @ 2020-07-08 19:21 UTC (permalink / raw)
  To: its, qemu-block
  Cc: kwolf, k.jensen, qemu-devel, mlevitsk, kbusch, javier.gonz,
	mreitz, philmd

Looks good,

Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>

On Mon, 2020-07-06 at 08:12 +0200, Klaus Jensen wrote:
> From: Klaus Jensen <k.jensen@samsung.com>
> 
> Add various additional tracing and streamline nvme_identify_ns and
> nvme_identify_nslist (they do not need to repeat the command, it is
> already in the trace name).
> 
> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> ---
>  hw/block/nvme.c       | 33 +++++++++++++++++++++++++++++++++
>  hw/block/trace-events | 13 +++++++++++--
>  2 files changed, 44 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index 766cd5b33bb1..09ef54d771c4 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -69,6 +69,20 @@
>  
>  static void nvme_process_sq(void *opaque);
>  
> +static uint16_t nvme_cid(NvmeRequest *req)
> +{
> +    if (!req) {
> +        return 0xffff;
> +    }
> +
> +    return le16_to_cpu(req->cqe.cid);
> +}
> +
> +static uint16_t nvme_sqid(NvmeRequest *req)
> +{
> +    return le16_to_cpu(req->sq->sqid);
> +}
> +
>  static bool nvme_addr_is_cmb(NvmeCtrl *n, hwaddr addr)
>  {
>      hwaddr low = n->ctrl_mem.addr;
> @@ -331,6 +345,8 @@ static void nvme_post_cqes(void *opaque)
>  static void nvme_enqueue_req_completion(NvmeCQueue *cq, NvmeRequest *req)
>  {
>      assert(cq->cqid == req->sq->cqid);
> +    trace_pci_nvme_enqueue_req_completion(nvme_cid(req), cq->cqid,
> +                                          req->status);
>      QTAILQ_REMOVE(&req->sq->out_req_list, req, entry);
>      QTAILQ_INSERT_TAIL(&cq->req_list, req, entry);
>      timer_mod(cq->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + 500);
> @@ -343,6 +359,8 @@ static void nvme_rw_cb(void *opaque, int ret)
>      NvmeCtrl *n = sq->ctrl;
>      NvmeCQueue *cq = n->cq[sq->cqid];
>  
> +    trace_pci_nvme_rw_cb(nvme_cid(req));
> +
>      if (!ret) {
>          block_acct_done(blk_get_stats(n->conf.blk), &req->acct);
>          req->status = NVME_SUCCESS;
> @@ -378,6 +396,8 @@ static uint16_t nvme_write_zeros(NvmeCtrl *n, NvmeNamespace *ns, NvmeCmd *cmd,
>      uint64_t offset = slba << data_shift;
>      uint32_t count = nlb << data_shift;
>  
> +    trace_pci_nvme_write_zeroes(nvme_cid(req), slba, nlb);
> +
>      if (unlikely(slba + nlb > ns->id_ns.nsze)) {
>          trace_pci_nvme_err_invalid_lba_range(slba, nlb, ns->id_ns.nsze);
>          return NVME_LBA_RANGE | NVME_DNR;
> @@ -445,6 +465,8 @@ static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>      NvmeNamespace *ns;
>      uint32_t nsid = le32_to_cpu(cmd->nsid);
>  
> +    trace_pci_nvme_io_cmd(nvme_cid(req), nsid, nvme_sqid(req), cmd->opcode);
> +
>      if (unlikely(nsid == 0 || nsid > n->num_namespaces)) {
>          trace_pci_nvme_err_invalid_ns(nsid, n->num_namespaces);
>          return NVME_INVALID_NSID | NVME_DNR;
> @@ -876,6 +898,8 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>  
>  static uint16_t nvme_admin_cmd(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>  {
> +    trace_pci_nvme_admin_cmd(nvme_cid(req), nvme_sqid(req), cmd->opcode);
> +
>      switch (cmd->opcode) {
>      case NVME_ADM_CMD_DELETE_SQ:
>          return nvme_del_sq(n, cmd);
> @@ -1204,6 +1228,8 @@ static uint64_t nvme_mmio_read(void *opaque, hwaddr addr, unsigned size)
>      uint8_t *ptr = (uint8_t *)&n->bar;
>      uint64_t val = 0;
>  
> +    trace_pci_nvme_mmio_read(addr);
> +
>      if (unlikely(addr & (sizeof(uint32_t) - 1))) {
>          NVME_GUEST_ERR(pci_nvme_ub_mmiord_misaligned32,
>                         "MMIO read not 32-bit aligned,"
> @@ -1273,6 +1299,8 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
>              return;
>          }
>  
> +        trace_pci_nvme_mmio_doorbell_cq(cq->cqid, new_head);
> +
>          start_sqs = nvme_cq_full(cq) ? 1 : 0;
>          cq->head = new_head;
>          if (start_sqs) {
> @@ -1311,6 +1339,8 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
>              return;
>          }
>  
> +        trace_pci_nvme_mmio_doorbell_sq(sq->sqid, new_tail);
> +
>          sq->tail = new_tail;
>          timer_mod(sq->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + 500);
>      }
> @@ -1320,6 +1350,9 @@ static void nvme_mmio_write(void *opaque, hwaddr addr, uint64_t data,
>      unsigned size)
>  {
>      NvmeCtrl *n = (NvmeCtrl *)opaque;
> +
> +    trace_pci_nvme_mmio_write(addr, data);
> +
>      if (addr < sizeof(n->bar)) {
>          nvme_write_bar(n, addr, data, size);
>      } else if (addr >= 0x1000) {
> diff --git a/hw/block/trace-events b/hw/block/trace-events
> index 958fcc5508d1..c40c0d2e4b28 100644
> --- a/hw/block/trace-events
> +++ b/hw/block/trace-events
> @@ -33,19 +33,28 @@ pci_nvme_irq_msix(uint32_t vector) "raising MSI-X IRQ vector %u"
>  pci_nvme_irq_pin(void) "pulsing IRQ pin"
>  pci_nvme_irq_masked(void) "IRQ is masked"
>  pci_nvme_dma_read(uint64_t prp1, uint64_t prp2) "DMA read, prp1=0x%"PRIx64" prp2=0x%"PRIx64""
> +pci_nvme_io_cmd(uint16_t cid, uint32_t nsid, uint16_t sqid, uint8_t opcode) "cid %"PRIu16" nsid %"PRIu32" sqid %"PRIu16" opc 0x%"PRIx8""
> +pci_nvme_admin_cmd(uint16_t cid, uint16_t sqid, uint8_t opcode) "cid %"PRIu16" sqid %"PRIu16" opc 0x%"PRIx8""
>  pci_nvme_rw(const char *verb, uint32_t blk_count, uint64_t byte_count, uint64_t lba) "%s %"PRIu32" blocks (%"PRIu64" bytes) from LBA %"PRIu64""
> +pci_nvme_rw_cb(uint16_t cid) "cid %"PRIu16""
> +pci_nvme_write_zeroes(uint16_t cid, uint64_t slba, uint32_t nlb) "cid %"PRIu16" slba %"PRIu64" nlb %"PRIu32""
>  pci_nvme_create_sq(uint64_t addr, uint16_t sqid, uint16_t cqid, uint16_t qsize, uint16_t qflags) "create submission queue, addr=0x%"PRIx64", sqid=%"PRIu16", cqid=%"PRIu16", qsize=%"PRIu16", qflags=%"PRIu16""
>  pci_nvme_create_cq(uint64_t addr, uint16_t cqid, uint16_t vector, uint16_t size, uint16_t qflags, int ien) "create completion queue, addr=0x%"PRIx64", cqid=%"PRIu16", vector=%"PRIu16", qsize=%"PRIu16", qflags=%"PRIu16", ien=%d"
>  pci_nvme_del_sq(uint16_t qid) "deleting submission queue sqid=%"PRIu16""
>  pci_nvme_del_cq(uint16_t cqid) "deleted completion queue, cqid=%"PRIu16""
>  pci_nvme_identify_ctrl(void) "identify controller"
> -pci_nvme_identify_ns(uint16_t ns) "identify namespace, nsid=%"PRIu16""
> -pci_nvme_identify_nslist(uint16_t ns) "identify namespace list, nsid=%"PRIu16""
> +pci_nvme_identify_ns(uint32_t ns) "nsid %"PRIu32""
> +pci_nvme_identify_nslist(uint32_t ns) "nsid %"PRIu32""
>  pci_nvme_getfeat_vwcache(const char* result) "get feature volatile write cache, result=%s"
>  pci_nvme_getfeat_numq(int result) "get feature number of queues, result=%d"
>  pci_nvme_setfeat_numq(int reqcq, int reqsq, int gotcq, int gotsq) "requested cq_count=%d sq_count=%d, responding with cq_count=%d sq_count=%d"
>  pci_nvme_setfeat_timestamp(uint64_t ts) "set feature timestamp = 0x%"PRIx64""
>  pci_nvme_getfeat_timestamp(uint64_t ts) "get feature timestamp = 0x%"PRIx64""
> +pci_nvme_enqueue_req_completion(uint16_t cid, uint16_t cqid, uint16_t status) "cid %"PRIu16" cqid %"PRIu16" status 0x%"PRIx16""
> +pci_nvme_mmio_read(uint64_t addr) "addr 0x%"PRIx64""
> +pci_nvme_mmio_write(uint64_t addr, uint64_t data) "addr 0x%"PRIx64" data 0x%"PRIx64""
> +pci_nvme_mmio_doorbell_cq(uint16_t cqid, uint16_t new_head) "cqid %"PRIu16" new_head %"PRIu16""
> +pci_nvme_mmio_doorbell_sq(uint16_t sqid, uint16_t new_tail) "sqid %"PRIu16" new_tail %"PRIu16""
>  pci_nvme_mmio_intm_set(uint64_t data, uint64_t new_mask) "wrote MMIO, interrupt mask set, data=0x%"PRIx64", new_mask=0x%"PRIx64""
>  pci_nvme_mmio_intm_clr(uint64_t data, uint64_t new_mask) "wrote MMIO, interrupt mask clr, data=0x%"PRIx64", new_mask=0x%"PRIx64""
>  pci_nvme_mmio_cfg(uint64_t data) "wrote MMIO, config controller config=0x%"PRIx64""


* Re: [PATCH v3 07/18] hw/block/nvme: add support for the get log page command
  2020-07-06  6:12 ` [PATCH v3 07/18] hw/block/nvme: add support for the get log page command Klaus Jensen
@ 2020-07-08 19:22   ` Dmitry Fomichev
  2020-07-29 10:24   ` Maxim Levitsky
  2020-09-29 13:11   ` Peter Maydell
  2 siblings, 0 replies; 60+ messages in thread
From: Dmitry Fomichev @ 2020-07-08 19:22 UTC (permalink / raw)
  To: its, qemu-block
  Cc: kwolf, k.jensen, qemu-devel, mlevitsk, kbusch, javier.gonz,
	mreitz, philmd

Looks good,

Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>

On Mon, 2020-07-06 at 08:12 +0200, Klaus Jensen wrote:
> From: Klaus Jensen <k.jensen@samsung.com>
> 
> Add support for the Get Log Page command and basic implementations of
> the mandatory Error Information, SMART / Health Information and Firmware
> Slot Information log pages.
> 
> In violation of the specification, the SMART / Health Information log
> page does not persist information over the lifetime of the controller
> because the device has no place to store such persistent state.
> 
> Note that the LPA field in the Identify Controller data structure
> intentionally has bit 0 cleared because there is no namespace specific
> information in the SMART / Health information log page.
> 
> Required for compliance with NVMe revision 1.3d. See NVM Express 1.3d,
> Section 5.14 ("Get Log Page command").
> 
> Signed-off-by: Klaus Jensen <klaus.jensen@cnexlabs.com>
> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> Acked-by: Keith Busch <kbusch@kernel.org>
> ---
>  hw/block/nvme.c       | 140 +++++++++++++++++++++++++++++++++++++++++-
>  hw/block/nvme.h       |   2 +
>  hw/block/trace-events |   2 +
>  include/block/nvme.h  |   8 ++-
>  4 files changed, 149 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index b6bc75eb61a2..7cb3787638f6 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -606,6 +606,140 @@ static uint16_t nvme_create_sq(NvmeCtrl *n, NvmeCmd *cmd)
>      return NVME_SUCCESS;
>  }
>  
> +static uint16_t nvme_smart_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
> +                                uint64_t off, NvmeRequest *req)
> +{
> +    uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
> +    uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
> +    uint32_t nsid = le32_to_cpu(cmd->nsid);
> +
> +    uint32_t trans_len;
> +    time_t current_ms;
> +    uint64_t units_read = 0, units_written = 0;
> +    uint64_t read_commands = 0, write_commands = 0;
> +    NvmeSmartLog smart;
> +    BlockAcctStats *s;
> +
> +    if (nsid && nsid != 0xffffffff) {
> +        return NVME_INVALID_FIELD | NVME_DNR;
> +    }
> +
> +    s = blk_get_stats(n->conf.blk);
> +
> +    units_read = s->nr_bytes[BLOCK_ACCT_READ] >> BDRV_SECTOR_BITS;
> +    units_written = s->nr_bytes[BLOCK_ACCT_WRITE] >> BDRV_SECTOR_BITS;
> +    read_commands = s->nr_ops[BLOCK_ACCT_READ];
> +    write_commands = s->nr_ops[BLOCK_ACCT_WRITE];
> +
> +    if (off > sizeof(smart)) {
> +        return NVME_INVALID_FIELD | NVME_DNR;
> +    }
> +
> +    trans_len = MIN(sizeof(smart) - off, buf_len);
> +
> +    memset(&smart, 0x0, sizeof(smart));
> +
> +    smart.data_units_read[0] = cpu_to_le64(units_read / 1000);
> +    smart.data_units_written[0] = cpu_to_le64(units_written / 1000);
> +    smart.host_read_commands[0] = cpu_to_le64(read_commands);
> +    smart.host_write_commands[0] = cpu_to_le64(write_commands);
> +
> +    smart.temperature = cpu_to_le16(n->temperature);
> +
> +    if ((n->temperature >= n->features.temp_thresh_hi) ||
> +        (n->temperature <= n->features.temp_thresh_low)) {
> +        smart.critical_warning |= NVME_SMART_TEMPERATURE;
> +    }
> +
> +    current_ms = qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL);
> +    smart.power_on_hours[0] =
> +        cpu_to_le64((((current_ms - n->starttime_ms) / 1000) / 60) / 60);
> +
> +    return nvme_dma_read_prp(n, (uint8_t *) &smart + off, trans_len, prp1,
> +                             prp2);
> +}
> +
> +static uint16_t nvme_fw_log_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
> +                                 uint64_t off, NvmeRequest *req)
> +{
> +    uint32_t trans_len;
> +    uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
> +    uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
> +    NvmeFwSlotInfoLog fw_log = {
> +        .afi = 0x1,
> +    };
> +
> +    strpadcpy((char *)&fw_log.frs1, sizeof(fw_log.frs1), "1.0", ' ');
> +
> +    if (off > sizeof(fw_log)) {
> +        return NVME_INVALID_FIELD | NVME_DNR;
> +    }
> +
> +    trans_len = MIN(sizeof(fw_log) - off, buf_len);
> +
> +    return nvme_dma_read_prp(n, (uint8_t *) &fw_log + off, trans_len, prp1,
> +                             prp2);
> +}
> +
> +static uint16_t nvme_error_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
> +                                uint64_t off, NvmeRequest *req)
> +{
> +    uint32_t trans_len;
> +    uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
> +    uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
> +    NvmeErrorLog errlog;
> +
> +    if (off > sizeof(errlog)) {
> +        return NVME_INVALID_FIELD | NVME_DNR;
> +    }
> +
> +    memset(&errlog, 0x0, sizeof(errlog));
> +
> +    trans_len = MIN(sizeof(errlog) - off, buf_len);
> +
> +    return nvme_dma_read_prp(n, (uint8_t *)&errlog, trans_len, prp1, prp2);
> +}
> +
> +static uint16_t nvme_get_log(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
> +{
> +    uint32_t dw10 = le32_to_cpu(cmd->cdw10);
> +    uint32_t dw11 = le32_to_cpu(cmd->cdw11);
> +    uint32_t dw12 = le32_to_cpu(cmd->cdw12);
> +    uint32_t dw13 = le32_to_cpu(cmd->cdw13);
> +    uint8_t  lid = dw10 & 0xff;
> +    uint8_t  lsp = (dw10 >> 8) & 0xf;
> +    uint8_t  rae = (dw10 >> 15) & 0x1;
> +    uint32_t numdl, numdu;
> +    uint64_t off, lpol, lpou;
> +    size_t   len;
> +
> +    numdl = (dw10 >> 16);
> +    numdu = (dw11 & 0xffff);
> +    lpol = dw12;
> +    lpou = dw13;
> +
> +    len = (((numdu << 16) | numdl) + 1) << 2;
> +    off = (lpou << 32ULL) | lpol;
> +
> +    if (off & 0x3) {
> +        return NVME_INVALID_FIELD | NVME_DNR;
> +    }
> +
> +    trace_pci_nvme_get_log(nvme_cid(req), lid, lsp, rae, len, off);
> +
> +    switch (lid) {
> +    case NVME_LOG_ERROR_INFO:
> +        return nvme_error_info(n, cmd, len, off, req);
> +    case NVME_LOG_SMART_INFO:
> +        return nvme_smart_info(n, cmd, len, off, req);
> +    case NVME_LOG_FW_SLOT_INFO:
> +        return nvme_fw_log_info(n, cmd, len, off, req);
> +    default:
> +        trace_pci_nvme_err_invalid_log_page(nvme_cid(req), lid);
> +        return NVME_INVALID_FIELD | NVME_DNR;
> +    }
> +}
> +
>  static void nvme_free_cq(NvmeCQueue *cq, NvmeCtrl *n)
>  {
>      n->cq[cq->cqid] = NULL;
> @@ -960,6 +1094,8 @@ static uint16_t nvme_admin_cmd(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>          return nvme_del_sq(n, cmd);
>      case NVME_ADM_CMD_CREATE_SQ:
>          return nvme_create_sq(n, cmd);
> +    case NVME_ADM_CMD_GET_LOG_PAGE:
> +        return nvme_get_log(n, cmd, req);
>      case NVME_ADM_CMD_DELETE_CQ:
>          return nvme_del_cq(n, cmd);
>      case NVME_ADM_CMD_CREATE_CQ:
> @@ -1511,7 +1647,9 @@ static void nvme_init_state(NvmeCtrl *n)
>      n->namespaces = g_new0(NvmeNamespace, n->num_namespaces);
>      n->sq = g_new0(NvmeSQueue *, n->params.max_ioqpairs + 1);
>      n->cq = g_new0(NvmeCQueue *, n->params.max_ioqpairs + 1);
> +    n->temperature = NVME_TEMPERATURE;
>      n->features.temp_thresh_hi = NVME_TEMPERATURE_WARNING;
> +    n->starttime_ms = qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL);
>  }
>  
>  static void nvme_init_blk(NvmeCtrl *n, Error **errp)
> @@ -1668,7 +1806,7 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
>       */
>      id->acl = 3;
>      id->frmw = (NVME_NUM_FW_SLOTS << 1) | NVME_FRMW_SLOT1_RO;
> -    id->lpa = 1 << 0;
> +    id->lpa = NVME_LPA_EXTENDED;
>  
>      /* recommended default value (~70 C) */
>      id->wctemp = cpu_to_le16(NVME_TEMPERATURE_WARNING);
> diff --git a/hw/block/nvme.h b/hw/block/nvme.h
> index e3a2c907e210..8228978e93de 100644
> --- a/hw/block/nvme.h
> +++ b/hw/block/nvme.h
> @@ -98,6 +98,8 @@ typedef struct NvmeCtrl {
>      uint32_t    irq_status;
>      uint64_t    host_timestamp;                 /* Timestamp sent by the host */
>      uint64_t    timestamp_set_qemu_clock_ms;    /* QEMU clock time */
> +    uint64_t    starttime_ms;
> +    uint16_t    temperature;
>  
>      HostMemoryBackend *pmrdev;
>  
> diff --git a/hw/block/trace-events b/hw/block/trace-events
> index c40c0d2e4b28..3330d74e48db 100644
> --- a/hw/block/trace-events
> +++ b/hw/block/trace-events
> @@ -45,6 +45,7 @@ pci_nvme_del_cq(uint16_t cqid) "deleted completion queue, cqid=%"PRIu16""
>  pci_nvme_identify_ctrl(void) "identify controller"
>  pci_nvme_identify_ns(uint32_t ns) "nsid %"PRIu32""
>  pci_nvme_identify_nslist(uint32_t ns) "nsid %"PRIu32""
> +pci_nvme_get_log(uint16_t cid, uint8_t lid, uint8_t lsp, uint8_t rae, uint32_t len, uint64_t off) "cid %"PRIu16" lid 0x%"PRIx8" lsp 0x%"PRIx8" rae 0x%"PRIx8" len %"PRIu32" off %"PRIu64""
>  pci_nvme_getfeat_vwcache(const char* result) "get feature volatile write cache, result=%s"
>  pci_nvme_getfeat_numq(int result) "get feature number of queues, result=%d"
>  pci_nvme_setfeat_numq(int reqcq, int reqsq, int gotcq, int gotsq) "requested cq_count=%d sq_count=%d, responding with cq_count=%d sq_count=%d"
> @@ -94,6 +95,7 @@ pci_nvme_err_invalid_create_cq_qflags(uint16_t qflags) "failed creating completi
>  pci_nvme_err_invalid_identify_cns(uint16_t cns) "identify, invalid cns=0x%"PRIx16""
>  pci_nvme_err_invalid_getfeat(int dw10) "invalid get features, dw10=0x%"PRIx32""
>  pci_nvme_err_invalid_setfeat(uint32_t dw10) "invalid set features, dw10=0x%"PRIx32""
> +pci_nvme_err_invalid_log_page(uint16_t cid, uint16_t lid) "cid %"PRIu16" lid 0x%"PRIx16""
>  pci_nvme_err_startfail_cq(void) "nvme_start_ctrl failed because there are non-admin completion queues"
>  pci_nvme_err_startfail_sq(void) "nvme_start_ctrl failed because there are non-admin submission queues"
>  pci_nvme_err_startfail_nbarasq(void) "nvme_start_ctrl failed because the admin submission queue address is null"
> diff --git a/include/block/nvme.h b/include/block/nvme.h
> index d639e8bbee92..49ce97ae1ab4 100644
> --- a/include/block/nvme.h
> +++ b/include/block/nvme.h
> @@ -704,9 +704,9 @@ typedef struct NvmeErrorLog {
>      uint8_t     resv[35];
>  } NvmeErrorLog;
>  
> -typedef struct NvmeSmartLog {
> +typedef struct QEMU_PACKED NvmeSmartLog {
>      uint8_t     critical_warning;
> -    uint8_t     temperature[2];
> +    uint16_t    temperature;
>      uint8_t     available_spare;
>      uint8_t     available_spare_threshold;
>      uint8_t     percentage_used;
> @@ -846,6 +846,10 @@ enum NvmeIdCtrlFrmw {
>      NVME_FRMW_SLOT1_RO = 1 << 0,
>  };
>  
> +enum NvmeIdCtrlLpa {
> +    NVME_LPA_EXTENDED = 1 << 2,
> +};
> +
>  #define NVME_CTRL_SQES_MIN(sqes) ((sqes) & 0xf)
>  #define NVME_CTRL_SQES_MAX(sqes) (((sqes) >> 4) & 0xf)
>  #define NVME_CTRL_CQES_MIN(cqes) ((cqes) & 0xf)
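
The length and offset arithmetic in nvme_get_log() above can be sketched in
isolation as follows. The field positions are taken from the quoted patch;
the sketch_* names and the SketchGetLog struct are hypothetical. Unlike the
patch, the sketch does the length computation in 64 bits so that the "+ 1"
cannot wrap; that is a choice made for this example, not a claim about the
patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the decoded Get Log Page fields. */
typedef struct SketchGetLog {
    uint8_t  lid;   /* log page identifier, dw10 bits 7:0 */
    uint8_t  lsp;   /* log specific field, dw10 bits 11:8 */
    uint8_t  rae;   /* retain asynchronous event, dw10 bit 15 */
    uint64_t len;   /* transfer length in bytes */
    uint64_t off;   /* byte offset into the log page */
} SketchGetLog;

/* Returns false when the offset is not dword aligned. */
static bool sketch_decode_get_log(uint32_t dw10, uint32_t dw11,
                                  uint32_t dw12, uint32_t dw13,
                                  SketchGetLog *out)
{
    uint32_t numdl = dw10 >> 16;
    uint32_t numdu = dw11 & 0xffff;
    /* NUMD is a zero-based dword count split across two command dwords. */
    uint64_t numd = ((uint64_t)numdu << 16) | numdl;

    out->lid = dw10 & 0xff;
    out->lsp = (dw10 >> 8) & 0xf;
    out->rae = (dw10 >> 15) & 0x1;
    out->len = (numd + 1) << 2;
    out->off = ((uint64_t)dw13 << 32) | dw12;

    return (out->off & 0x3) == 0;
}

/* Bytes to transfer: whatever remains of the log page, capped by the buffer. */
static uint64_t sketch_trans_len(uint64_t log_size, uint64_t off,
                                 uint64_t buf_len)
{
    uint64_t remaining;

    assert(off <= log_size);
    remaining = log_size - off;
    return remaining < buf_len ? remaining : buf_len;
}
```

For example, dw10 = 0x007f0002 selects log page 0x02 with NUMDL = 0x7f, that
is 128 dwords or 512 bytes.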


* Re: [PATCH v3 05/18] hw/block/nvme: add temperature threshold feature
  2020-07-06  6:12 ` [PATCH v3 05/18] hw/block/nvme: add temperature threshold feature Klaus Jensen
@ 2020-07-08 19:24   ` Dmitry Fomichev
  0 siblings, 0 replies; 60+ messages in thread
From: Dmitry Fomichev @ 2020-07-08 19:24 UTC (permalink / raw)
  To: its, qemu-block
  Cc: kwolf, k.jensen, qemu-devel, mlevitsk, kbusch, javier.gonz,
	mreitz, philmd

Looks good,

Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>

On Mon, 2020-07-06 at 08:12 +0200, Klaus Jensen wrote:
> From: Klaus Jensen <k.jensen@samsung.com>
> 
> It might seem weird to implement this feature for an emulated device,
> but it is mandatory to support and the feature is useful for testing
> asynchronous event request support, which will be added in a later
> patch.
> 
> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> Acked-by: Keith Busch <kbusch@kernel.org>
> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
> ---
>  hw/block/nvme.c      | 48 ++++++++++++++++++++++++++++++++++++++++++++
>  hw/block/nvme.h      |  1 +
>  include/block/nvme.h |  5 ++++-
>  3 files changed, 53 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index 415d3b036897..a330ccf91620 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -59,6 +59,9 @@
>  #define NVME_DB_SIZE  4
>  #define NVME_CMB_BIR 2
>  #define NVME_PMR_BIR 2
> +#define NVME_TEMPERATURE 0x143
> +#define NVME_TEMPERATURE_WARNING 0x157
> +#define NVME_TEMPERATURE_CRITICAL 0x175
>  
>  #define NVME_GUEST_ERR(trace, fmt, ...) \
>      do { \
> @@ -841,9 +844,31 @@ static uint16_t nvme_get_feature_timestamp(NvmeCtrl *n, NvmeCmd *cmd)
>  static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>  {
>      uint32_t dw10 = le32_to_cpu(cmd->cdw10);
> +    uint32_t dw11 = le32_to_cpu(cmd->cdw11);
>      uint32_t result;
>  
>      switch (dw10) {
> +    case NVME_TEMPERATURE_THRESHOLD:
> +        result = 0;
> +
> +        /*
> +         * The controller only implements the Composite Temperature sensor, so
> +         * return 0 for all other sensors.
> +         */
> +        if (NVME_TEMP_TMPSEL(dw11) != NVME_TEMP_TMPSEL_COMPOSITE) {
> +            break;
> +        }
> +
> +        switch (NVME_TEMP_THSEL(dw11)) {
> +        case NVME_TEMP_THSEL_OVER:
> +            result = n->features.temp_thresh_hi;
> +            break;
> +        case NVME_TEMP_THSEL_UNDER:
> +            result = n->features.temp_thresh_low;
> +            break;
> +        }
> +
> +        break;
>      case NVME_VOLATILE_WRITE_CACHE:
>          result = blk_enable_write_cache(n->conf.blk);
>          trace_pci_nvme_getfeat_vwcache(result ? "enabled" : "disabled");
> @@ -888,6 +913,23 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>      uint32_t dw11 = le32_to_cpu(cmd->cdw11);
>  
>      switch (dw10) {
> +    case NVME_TEMPERATURE_THRESHOLD:
> +        if (NVME_TEMP_TMPSEL(dw11) != NVME_TEMP_TMPSEL_COMPOSITE) {
> +            break;
> +        }
> +
> +        switch (NVME_TEMP_THSEL(dw11)) {
> +        case NVME_TEMP_THSEL_OVER:
> +            n->features.temp_thresh_hi = NVME_TEMP_TMPTH(dw11);
> +            break;
> +        case NVME_TEMP_THSEL_UNDER:
> +            n->features.temp_thresh_low = NVME_TEMP_TMPTH(dw11);
> +            break;
> +        default:
> +            return NVME_INVALID_FIELD | NVME_DNR;
> +        }
> +
> +        break;
>      case NVME_VOLATILE_WRITE_CACHE:
>          blk_set_enable_write_cache(n->conf.blk, dw11 & 1);
>          break;
> @@ -1468,6 +1510,7 @@ static void nvme_init_state(NvmeCtrl *n)
>      n->namespaces = g_new0(NvmeNamespace, n->num_namespaces);
>      n->sq = g_new0(NvmeSQueue *, n->params.max_ioqpairs + 1);
>      n->cq = g_new0(NvmeCQueue *, n->params.max_ioqpairs + 1);
> +    n->features.temp_thresh_hi = NVME_TEMPERATURE_WARNING;
>  }
>  
>  static void nvme_init_blk(NvmeCtrl *n, Error **errp)
> @@ -1625,6 +1668,11 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
>      id->acl = 3;
>      id->frmw = 7 << 1;
>      id->lpa = 1 << 0;
> +
> +    /* recommended default value (~70 C) */
> +    id->wctemp = cpu_to_le16(NVME_TEMPERATURE_WARNING);
> +    id->cctemp = cpu_to_le16(NVME_TEMPERATURE_CRITICAL);
> +
>      id->sqes = (0x6 << 4) | 0x6;
>      id->cqes = (0x4 << 4) | 0x4;
>      id->nn = cpu_to_le32(n->num_namespaces);
> diff --git a/hw/block/nvme.h b/hw/block/nvme.h
> index 1d30c0bca283..e3a2c907e210 100644
> --- a/hw/block/nvme.h
> +++ b/hw/block/nvme.h
> @@ -107,6 +107,7 @@ typedef struct NvmeCtrl {
>      NvmeSQueue      admin_sq;
>      NvmeCQueue      admin_cq;
>      NvmeIdCtrl      id_ctrl;
> +    NvmeFeatureVal  features;
>  } NvmeCtrl;
>  
>  /* calculate the number of LBAs that the namespace can accomodate */
> diff --git a/include/block/nvme.h b/include/block/nvme.h
> index 2a80d2a7ed89..d2c457695b38 100644
> --- a/include/block/nvme.h
> +++ b/include/block/nvme.h
> @@ -860,7 +860,10 @@ enum NvmeIdCtrlOncs {
>  typedef struct NvmeFeatureVal {
>      uint32_t    arbitration;
>      uint32_t    power_mgmt;
> -    uint32_t    temp_thresh;
> +    struct {
> +        uint16_t temp_thresh_hi;
> +        uint16_t temp_thresh_low;
> +    };
>      uint32_t    err_rec;
>      uint32_t    volatile_wc;
>      uint32_t    num_queues;
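
A small sketch of the Temperature Threshold handling in the patch above. The
sketch_* helpers are hypothetical; the CDW11 field positions (TMPTH in bits
15:0, TMPSEL in bits 19:16, THSEL in bits 21:20) follow the NVMe 1.3 feature
layout and are assumed to match the NVME_TEMP_* macros, whose definitions
are not shown in this message. Temperatures are in Kelvin, and the
conversion uses an integer offset of 273.

```c
#include <assert.h>
#include <stdint.h>

/* Constants copied from the patch; values are in Kelvin. */
#define SKETCH_TEMPERATURE          0x143
#define SKETCH_TEMPERATURE_WARNING  0x157
#define SKETCH_TEMPERATURE_CRITICAL 0x175

/* Hypothetical mirror of the over/under threshold pair. */
typedef struct SketchThresh {
    uint16_t hi;
    uint16_t low;
} SketchThresh;

static int sketch_kelvin_to_celsius(uint16_t kelvin)
{
    return (int)kelvin - 273;
}

static uint16_t sketch_tmpth(uint32_t dw11)  { return dw11 & 0xffff; }
static uint8_t  sketch_tmpsel(uint32_t dw11) { return (dw11 >> 16) & 0xf; }
static uint8_t  sketch_thsel(uint32_t dw11)  { return (dw11 >> 20) & 0x3; }

/*
 * Mirrors the Set Features case in the patch: sensors other than the
 * composite one (TMPSEL 0) are silently ignored, THSEL 0 sets the over
 * threshold, THSEL 1 the under threshold, anything else is rejected.
 * Returns 0 on success and -1 for a reserved THSEL value.
 */
static int sketch_set_temp_thresh(SketchThresh *t, uint32_t dw11)
{
    assert(t);

    if (sketch_tmpsel(dw11) != 0) {
        return 0;
    }

    switch (sketch_thsel(dw11)) {
    case 0:
        t->hi = sketch_tmpth(dw11);
        return 0;
    case 1:
        t->low = sketch_tmpth(dw11);
        return 0;
    default:
        return -1;
    }
}
```

With these constants, 0x143 is 323 K (50 C), 0x157 is 343 K (70 C, matching
the "~70 C" comment in the patch) and 0x175 is 373 K (100 C).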


* Re: [PATCH v3 12/18] hw/block/nvme: support the get/set features select and save fields
  2020-07-06  6:12 ` [PATCH v3 12/18] hw/block/nvme: support the get/set features select and save fields Klaus Jensen
@ 2020-07-08 19:25   ` Dmitry Fomichev
  2020-07-29 13:17   ` Maxim Levitsky
  1 sibling, 0 replies; 60+ messages in thread
From: Dmitry Fomichev @ 2020-07-08 19:25 UTC (permalink / raw)
  To: its, qemu-block
  Cc: kwolf, k.jensen, qemu-devel, mlevitsk, kbusch, javier.gonz,
	mreitz, philmd

Looks good,

Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
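
The Select and Save handling described in the quoted commit message can be
modelled roughly as below. This is only a sketch: the sketch_* names and
enums are hypothetical, and the bit positions (FID in CDW10 bits 7:0, SEL in
bits 10:8, SV in bit 31) follow the NVMe 1.3 Get/Set Features layout and are
assumed to match the NVME_GETSETFEAT_FID, NVME_GETFEAT_SELECT and
NVME_SETFEAT_SAVE macros, whose definitions are not shown in this message.

```c
#include <assert.h>
#include <stdint.h>

/* Select (SEL) values of the Get Features command. */
enum {
    SKETCH_SEL_CURRENT = 0,
    SKETCH_SEL_DEFAULT = 1,
    SKETCH_SEL_SAVED   = 2,
    SKETCH_SEL_CAP     = 3,
};

/* Which value a Get Features command ends up reporting. */
typedef enum SketchSource {
    SKETCH_SRC_CURRENT,
    SKETCH_SRC_DEFAULT,
    SKETCH_SRC_CAP,
} SketchSource;

static uint8_t sketch_fid(uint32_t dw10)  { return dw10 & 0xff; }
static uint8_t sketch_sel(uint32_t dw10)  { return (dw10 >> 8) & 0x7; }
static uint8_t sketch_save(uint32_t dw10) { return (dw10 >> 31) & 0x1; }

/*
 * Nothing is saveable, so a request for the saved value is answered with
 * the default value. Any SEL value not listed here is treated as a
 * request for the current value, as in the patch.
 */
static SketchSource sketch_source(uint8_t sel)
{
    assert(sel <= 0x7);

    switch (sel) {
    case SKETCH_SEL_SAVED:
        /* fall through: saved values do not exist */
    case SKETCH_SEL_DEFAULT:
        return SKETCH_SRC_DEFAULT;
    case SKETCH_SEL_CAP:
        return SKETCH_SRC_CAP;
    default:
        return SKETCH_SRC_CURRENT;
    }
}
```

On the Set Features side the patch rejects any command with the Save bit set
before looking at the feature itself, so sketch_save() returning 1 would map
to the Feature Identifier Not Saveable status.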

On Mon, 2020-07-06 at 08:12 +0200, Klaus Jensen wrote:
> From: Klaus Jensen <k.jensen@samsung.com>
> 
> Since the device does not have any persistent state storage, no
> features are "saveable" and setting the Save (SV) field in any Set
> Features command will result in a Feature Identifier Not Saveable status
> code.
> 
> Similarly, if the Select (SEL) field is set to request saved values, the
> device will (as it should) return the default values instead.
> 
> Since this also introduces "Supported Capabilities", the nsid field is
> now also checked for validity with respect to the feature being read or set.
> 
> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> ---
>  hw/block/nvme.c       | 103 +++++++++++++++++++++++++++++++++++++-----
>  hw/block/trace-events |   4 +-
>  include/block/nvme.h  |  27 ++++++++++-
>  3 files changed, 119 insertions(+), 15 deletions(-)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index 2d85e853403f..df8b786e4875 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -85,6 +85,14 @@ static const bool nvme_feature_support[NVME_FID_MAX] = {
>      [NVME_TIMESTAMP]                = true,
>  };
>  
> +static const uint32_t nvme_feature_cap[NVME_FID_MAX] = {
> +    [NVME_TEMPERATURE_THRESHOLD]    = NVME_FEAT_CAP_CHANGE,
> +    [NVME_VOLATILE_WRITE_CACHE]     = NVME_FEAT_CAP_CHANGE,
> +    [NVME_NUMBER_OF_QUEUES]         = NVME_FEAT_CAP_CHANGE,
> +    [NVME_ASYNCHRONOUS_EVENT_CONF]  = NVME_FEAT_CAP_CHANGE,
> +    [NVME_TIMESTAMP]                = NVME_FEAT_CAP_CHANGE,
> +};
> +
>  static void nvme_process_sq(void *opaque);
>  
>  static uint16_t nvme_cid(NvmeRequest *req)
> @@ -1083,20 +1091,47 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>  {
>      uint32_t dw10 = le32_to_cpu(cmd->cdw10);
>      uint32_t dw11 = le32_to_cpu(cmd->cdw11);
> +    uint32_t nsid = le32_to_cpu(cmd->nsid);
>      uint32_t result;
>      uint8_t fid = NVME_GETSETFEAT_FID(dw10);
> +    NvmeGetFeatureSelect sel = NVME_GETFEAT_SELECT(dw10);
>      uint16_t iv;
>  
>      static const uint32_t nvme_feature_default[NVME_FID_MAX] = {
>          [NVME_ARBITRATION] = NVME_ARB_AB_NOLIMIT,
>      };
>  
> -    trace_pci_nvme_getfeat(nvme_cid(req), fid, dw11);
> +    trace_pci_nvme_getfeat(nvme_cid(req), fid, sel, dw11);
>  
>      if (!nvme_feature_support[fid]) {
>          return NVME_INVALID_FIELD | NVME_DNR;
>      }
>  
> +    if (nvme_feature_cap[fid] & NVME_FEAT_CAP_NS) {
> +        if (!nsid || nsid > n->num_namespaces) {
> +            /*
> +             * The Reservation Notification Mask and Reservation Persistence
> +             * features require a status code of Invalid Field in Command when
> +             * NSID is 0xFFFFFFFF. Since the device does not support those
> +             * features we can always return Invalid Namespace or Format as we
> +             * should do for all other features.
> +             */
> +            return NVME_INVALID_NSID | NVME_DNR;
> +        }
> +    }
> +
> +    switch (sel) {
> +    case NVME_GETFEAT_SELECT_CURRENT:
> +        break;
> +    case NVME_GETFEAT_SELECT_SAVED:
> +        /* no features are saveable by the controller; fallthrough */
> +    case NVME_GETFEAT_SELECT_DEFAULT:
> +        goto defaults;
> +    case NVME_GETFEAT_SELECT_CAP:
> +        result = nvme_feature_cap[fid];
> +        goto out;
> +    }
> +
>      switch (fid) {
>      case NVME_TEMPERATURE_THRESHOLD:
>          result = 0;
> @@ -1106,22 +1141,45 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>           * return 0 for all other sensors.
>           */
>          if (NVME_TEMP_TMPSEL(dw11) != NVME_TEMP_TMPSEL_COMPOSITE) {
> -            break;
> +            goto out;
>          }
>  
>          switch (NVME_TEMP_THSEL(dw11)) {
>          case NVME_TEMP_THSEL_OVER:
>              result = n->features.temp_thresh_hi;
> -            break;
> +            goto out;
>          case NVME_TEMP_THSEL_UNDER:
>              result = n->features.temp_thresh_low;
> -            break;
> +            goto out;
>          }
>  
> -        break;
> +        return NVME_INVALID_FIELD | NVME_DNR;
>      case NVME_VOLATILE_WRITE_CACHE:
>          result = blk_enable_write_cache(n->conf.blk);
>          trace_pci_nvme_getfeat_vwcache(result ? "enabled" : "disabled");
> +        goto out;
> +    case NVME_ASYNCHRONOUS_EVENT_CONF:
> +        result = n->features.async_config;
> +        goto out;
> +    case NVME_TIMESTAMP:
> +        return nvme_get_feature_timestamp(n, cmd);
> +    default:
> +        break;
> +    }
> +
> +defaults:
> +    switch (fid) {
> +    case NVME_TEMPERATURE_THRESHOLD:
> +        result = 0;
> +
> +        if (NVME_TEMP_TMPSEL(dw11) != NVME_TEMP_TMPSEL_COMPOSITE) {
> +            break;
> +        }
> +
> +        if (NVME_TEMP_THSEL(dw11) == NVME_TEMP_THSEL_OVER) {
> +            result = NVME_TEMPERATURE_WARNING;
> +        }
> +
>          break;
>      case NVME_NUMBER_OF_QUEUES:
>          result = (n->params.max_ioqpairs - 1) |
> @@ -1140,16 +1198,12 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>          }
>  
>          break;
> -    case NVME_ASYNCHRONOUS_EVENT_CONF:
> -        result = n->features.async_config;
> -        break;
> -    case NVME_TIMESTAMP:
> -        return nvme_get_feature_timestamp(n, cmd);
>      default:
>          result = nvme_feature_default[fid];
>          break;
>      }
>  
> +out:
>      req->cqe.result = cpu_to_le32(result);
>      return NVME_SUCCESS;
>  }
> @@ -1176,14 +1230,37 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>  {
>      uint32_t dw10 = le32_to_cpu(cmd->cdw10);
>      uint32_t dw11 = le32_to_cpu(cmd->cdw11);
> +    uint32_t nsid = le32_to_cpu(cmd->nsid);
>      uint8_t fid = NVME_GETSETFEAT_FID(dw10);
> +    uint8_t save = NVME_SETFEAT_SAVE(dw10);
>  
> -    trace_pci_nvme_setfeat(nvme_cid(req), fid, dw11);
> +    trace_pci_nvme_setfeat(nvme_cid(req), fid, save, dw11);
> +
> +    if (save) {
> +        return NVME_FID_NOT_SAVEABLE | NVME_DNR;
> +    }
>  
>      if (!nvme_feature_support[fid]) {
>          return NVME_INVALID_FIELD | NVME_DNR;
>      }
>  
> +    if (nvme_feature_cap[fid] & NVME_FEAT_CAP_NS) {
> +        if (!nsid || (nsid != NVME_NSID_BROADCAST &&
> +                      nsid > n->num_namespaces)) {
> +            return NVME_INVALID_NSID | NVME_DNR;
> +        }
> +    } else if (nsid && nsid != NVME_NSID_BROADCAST) {
> +        if (nsid > n->num_namespaces) {
> +            return NVME_INVALID_NSID | NVME_DNR;
> +        }
> +
> +        return NVME_FEAT_NOT_NS_SPEC | NVME_DNR;
> +    }
> +
> +    if (!(nvme_feature_cap[fid] & NVME_FEAT_CAP_CHANGE)) {
> +        return NVME_FEAT_NOT_CHANGEABLE | NVME_DNR;
> +    }
> +
>      switch (fid) {
>      case NVME_TEMPERATURE_THRESHOLD:
>          if (NVME_TEMP_TMPSEL(dw11) != NVME_TEMP_TMPSEL_COMPOSITE) {
> @@ -2028,7 +2105,9 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
>      id->sqes = (0x6 << 4) | 0x6;
>      id->cqes = (0x4 << 4) | 0x4;
>      id->nn = cpu_to_le32(n->num_namespaces);
> -    id->oncs = cpu_to_le16(NVME_ONCS_WRITE_ZEROS | NVME_ONCS_TIMESTAMP);
> +    id->oncs = cpu_to_le16(NVME_ONCS_WRITE_ZEROS | NVME_ONCS_TIMESTAMP |
> +                           NVME_ONCS_FEATURES);
> +
>      id->psd[0].mp = cpu_to_le16(0x9c4);
>      id->psd[0].enlat = cpu_to_le32(0x10);
>      id->psd[0].exlat = cpu_to_le32(0x4);
> diff --git a/hw/block/trace-events b/hw/block/trace-events
> index 42e62f4649f8..4a4ef34071df 100644
> --- a/hw/block/trace-events
> +++ b/hw/block/trace-events
> @@ -46,8 +46,8 @@ pci_nvme_identify_ctrl(void) "identify controller"
>  pci_nvme_identify_ns(uint32_t ns) "nsid %"PRIu32""
>  pci_nvme_identify_nslist(uint32_t ns) "nsid %"PRIu32""
>  pci_nvme_get_log(uint16_t cid, uint8_t lid, uint8_t lsp, uint8_t rae, uint32_t len, uint64_t off) "cid %"PRIu16" lid 0x%"PRIx8" lsp 0x%"PRIx8" rae 0x%"PRIx8" len %"PRIu32" off %"PRIu64""
> -pci_nvme_getfeat(uint16_t cid, uint8_t fid, uint32_t cdw11) "cid %"PRIu16" fid 0x%"PRIx8" cdw11 0x%"PRIx32""
> -pci_nvme_setfeat(uint16_t cid, uint8_t fid, uint32_t cdw11) "cid %"PRIu16" fid 0x%"PRIx8" cdw11 0x%"PRIx32""
> +pci_nvme_getfeat(uint16_t cid, uint8_t fid, uint8_t sel, uint32_t cdw11) "cid %"PRIu16" fid 0x%"PRIx8" sel 0x%"PRIx8" cdw11 0x%"PRIx32""
> +pci_nvme_setfeat(uint16_t cid, uint8_t fid, uint8_t save, uint32_t cdw11) "cid %"PRIu16" fid 0x%"PRIx8" save 0x%"PRIx8" cdw11 0x%"PRIx32""
>  pci_nvme_getfeat_vwcache(const char* result) "get feature volatile write cache, result=%s"
>  pci_nvme_getfeat_numq(int result) "get feature number of queues, result=%d"
>  pci_nvme_setfeat_numq(int reqcq, int reqsq, int gotcq, int gotsq) "requested cq_count=%d sq_count=%d, responding with cq_count=%d sq_count=%d"
> diff --git a/include/block/nvme.h b/include/block/nvme.h
> index cd396111b2f5..179e20a01477 100644
> --- a/include/block/nvme.h
> +++ b/include/block/nvme.h
> @@ -663,7 +663,7 @@ enum NvmeStatusCodes {
>      NVME_INVALID_QUEUE_DEL      = 0x010c,
>      NVME_FID_NOT_SAVEABLE       = 0x010d,
>      NVME_FEAT_NOT_CHANGEABLE    = 0x010e,
> -    NVME_FID_NOT_NSID_SPEC      = 0x010f,
> +    NVME_FEAT_NOT_NS_SPEC       = 0x010f,
>      NVME_FW_REQ_SUSYSTEM_RESET  = 0x0110,
>      NVME_CONFLICTING_ATTRS      = 0x0180,
>      NVME_INVALID_PROT_INFO      = 0x0181,
> @@ -907,9 +907,32 @@ enum NvmeFeatureIds {
>      NVME_FID_MAX                    = 0x100,
>  };
>  
> +typedef enum NvmeFeatureCap {
> +    NVME_FEAT_CAP_SAVE      = 1 << 0,
> +    NVME_FEAT_CAP_NS        = 1 << 1,
> +    NVME_FEAT_CAP_CHANGE    = 1 << 2,
> +} NvmeFeatureCap;
> +
> +typedef enum NvmeGetFeatureSelect {
> +    NVME_GETFEAT_SELECT_CURRENT = 0x0,
> +    NVME_GETFEAT_SELECT_DEFAULT = 0x1,
> +    NVME_GETFEAT_SELECT_SAVED   = 0x2,
> +    NVME_GETFEAT_SELECT_CAP     = 0x3,
> +} NvmeGetFeatureSelect;
> +
>  #define NVME_GETSETFEAT_FID_MASK 0xff
>  #define NVME_GETSETFEAT_FID(dw10) (dw10 & NVME_GETSETFEAT_FID_MASK)
>  
> +#define NVME_GETFEAT_SELECT_SHIFT 8
> +#define NVME_GETFEAT_SELECT_MASK  0x7
> +#define NVME_GETFEAT_SELECT(dw10) \
> +    ((dw10 >> NVME_GETFEAT_SELECT_SHIFT) & NVME_GETFEAT_SELECT_MASK)
> +
> +#define NVME_SETFEAT_SAVE_SHIFT 31
> +#define NVME_SETFEAT_SAVE_MASK  0x1
> +#define NVME_SETFEAT_SAVE(dw10) \
> +    ((dw10 >> NVME_SETFEAT_SAVE_SHIFT) & NVME_SETFEAT_SAVE_MASK)
> +
>  typedef struct NvmeRangeType {
>      uint8_t     type;
>      uint8_t     attributes;
> @@ -926,6 +949,8 @@ typedef struct NvmeLBAF {
>      uint8_t     rp;
>  } NvmeLBAF;
>  
> +#define NVME_NSID_BROADCAST 0xffffffff
> +
>  typedef struct NvmeIdNs {
>      uint64_t    nsze;
>      uint64_t    ncap;
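
The namespace ID checks that the quoted hunk adds to nvme_set_feature() can be read as a small decision table. The following is a minimal sketch of that table only, not the device code: the enum and the function name check_setfeat_nsid() are hypothetical and exist purely for illustration, while the conditions mirror the ones in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define NSID_BROADCAST 0xffffffffu  /* NVME_NSID_BROADCAST in the patch */

/* Hypothetical outcome codes; the patch returns NVMe status codes instead. */
enum nsid_check {
    NSID_OK,
    NSID_INVALID,      /* patch: NVME_INVALID_NSID | NVME_DNR */
    NSID_NOT_NS_SPEC,  /* patch: NVME_FEAT_NOT_NS_SPEC | NVME_DNR */
};

/*
 * ns_specific corresponds to (nvme_feature_cap[fid] & NVME_FEAT_CAP_NS).
 * Assumes namespaces are numbered 1..num_namespaces, as in the patch.
 */
static enum nsid_check check_setfeat_nsid(bool ns_specific, uint32_t nsid,
                                          uint32_t num_namespaces)
{
    bool out_of_range = nsid != NSID_BROADCAST && nsid > num_namespaces;

    if (ns_specific) {
        /* A namespace-specific feature needs a real NSID or broadcast. */
        return (nsid == 0 || out_of_range) ? NSID_INVALID : NSID_OK;
    }

    /* Controller-scoped feature: only 0 and broadcast are accepted. */
    if (nsid == 0 || nsid == NSID_BROADCAST) {
        return NSID_OK;
    }

    return out_of_range ? NSID_INVALID : NSID_NOT_NS_SPEC;
}
```

With a single namespace, check_setfeat_nsid(false, 1, 1) yields NSID_NOT_NS_SPEC, while check_setfeat_nsid(false, 2, 1) yields NSID_INVALID, which matches the order of the checks in the hunk.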


* Re: [PATCH v3 15/18] hw/block/nvme: reject invalid nsid values in active namespace id list
  2020-07-06  6:13 ` [PATCH v3 15/18] hw/block/nvme: reject invalid nsid values in active namespace id list Klaus Jensen
  2020-07-06  9:47   ` Philippe Mathieu-Daudé
@ 2020-07-08 19:26   ` Dmitry Fomichev
  2020-07-29 13:27   ` Maxim Levitsky
  2 siblings, 0 replies; 60+ messages in thread
From: Dmitry Fomichev @ 2020-07-08 19:26 UTC (permalink / raw)
  To: its, qemu-block
  Cc: kwolf, k.jensen, qemu-devel, mlevitsk, kbusch, javier.gonz,
	mreitz, philmd

Looks good,

Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>

On Mon, 2020-07-06 at 08:13 +0200, Klaus Jensen wrote:
> From: Klaus Jensen <k.jensen@samsung.com>
> 
> Reject the nsid broadcast value (0xffffffff) and 0xfffffffe in the
> Active Namespace ID list.
> 
> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> ---
>  hw/block/nvme.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index fc58f3d76530..af39126cd8d1 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -992,6 +992,16 @@ static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeIdentify *c)
>  
>      trace_pci_nvme_identify_nslist(min_nsid);
>  
> +    /*
> +     * Both 0xffffffff (NVME_NSID_BROADCAST) and 0xfffffffe are invalid values
> +     * since the Active Namespace ID List should return namespaces with ids
> +     * *higher* than the NSID specified in the command. This is also specified
> +     * in the spec (NVM Express v1.3d, Section 5.15.4).
> +     */
> +    if (min_nsid >= NVME_NSID_BROADCAST - 1) {
> +        return NVME_INVALID_NSID | NVME_DNR;
> +    }
> +
>      list = g_malloc0(data_len);
>      for (i = 0; i < n->num_namespaces; i++) {
>          if (i < min_nsid) {
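
To illustrate why 0xfffffffe and 0xffffffff are rejected, here is a minimal, self-contained sketch of building an Active Namespace ID list. It is not the QEMU function: build_active_nslist() is a hypothetical name, the sketch assumes every namespace from 1 to num_namespaces is active, and it stores host-endian values where the device code would apply cpu_to_le32().

```c
#include <stddef.h>
#include <stdint.h>

#define NSID_BROADCAST 0xffffffffu  /* NVME_NSID_BROADCAST in the patch */

/*
 * Fill `list` with the NSIDs that are strictly greater than min_nsid.
 * Returns the number of entries written, or -1 if min_nsid is one of the
 * two values for which no higher valid NSID can exist.
 */
static int build_active_nslist(uint32_t min_nsid, uint32_t num_namespaces,
                               uint32_t *list, size_t max_entries)
{
    size_t count = 0;

    /* Same condition as the patch: rejects 0xfffffffe and 0xffffffff. */
    if (min_nsid >= NSID_BROADCAST - 1) {
        return -1;
    }

    /* 64-bit loop variable so that nsid++ cannot wrap around. */
    for (uint64_t nsid = (uint64_t)min_nsid + 1;
         nsid <= num_namespaces && count < max_entries; nsid++) {
        list[count++] = (uint32_t)nsid;
    }

    return (int)count;
}
```

The list only ever contains IDs above the one given in the command, so a starting point of 0xfffffffe could at best return the broadcast value, which is not a namespace.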


* Re: [PATCH v3 17/18] hw/block/nvme: provide the mandatory subnqn field
  2020-07-06  6:13 ` [PATCH v3 17/18] hw/block/nvme: provide the mandatory subnqn field Klaus Jensen
  2020-07-06  9:47   ` Philippe Mathieu-Daudé
@ 2020-07-08 19:26   ` Dmitry Fomichev
  2020-07-29 13:34   ` Maxim Levitsky
  2 siblings, 0 replies; 60+ messages in thread
From: Dmitry Fomichev @ 2020-07-08 19:26 UTC (permalink / raw)
  To: its, qemu-block
  Cc: kwolf, k.jensen, qemu-devel, mlevitsk, kbusch, javier.gonz,
	mreitz, philmd

Looks good,

Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>

On Mon, 2020-07-06 at 08:13 +0200, Klaus Jensen wrote:
> From: Klaus Jensen <k.jensen@samsung.com>
> 
> The SUBNQN field is mandatory in NVM Express 1.3.
> 
> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> ---
>  hw/block/nvme.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index 07d58aa945f2..e3984157926b 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -2141,6 +2141,7 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
>  {
>      NvmeIdCtrl *id = &n->id_ctrl;
>      uint8_t *pci_conf = pci_dev->config;
> +    char *subnqn;
>  
>      id->vid = cpu_to_le16(pci_get_word(pci_conf + PCI_VENDOR_ID));
>      id->ssvid = cpu_to_le16(pci_get_word(pci_conf + PCI_SUBSYSTEM_VENDOR_ID));
> @@ -2179,6 +2180,10 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
>      id->oncs = cpu_to_le16(NVME_ONCS_WRITE_ZEROS | NVME_ONCS_TIMESTAMP |
>                             NVME_ONCS_FEATURES);
>  
> +    subnqn = g_strdup_printf("nqn.2019-08.org.qemu:%s", n->params.serial);
> +    strpadcpy((char *)id->subnqn, sizeof(id->subnqn), subnqn, '\0');
> +    g_free(subnqn);
> +
>      id->psd[0].mp = cpu_to_le16(0x9c4);
>      id->psd[0].enlat = cpu_to_le32(0x10);
>      id->psd[0].exlat = cpu_to_le32(0x4);
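
The padding behaviour the patch relies on can be sketched outside of QEMU. In the sketch below, pad_copy() is a stand-in written for illustration that is assumed to behave like QEMU's strpadcpy(), format_subnqn() is a hypothetical helper, and the 256-byte length is the size of the SUBNQN field in the Identify Controller data structure.

```c
#include <stdio.h>
#include <string.h>

#define SUBNQN_LEN 256  /* size of the SUBNQN field in Identify Controller */

/*
 * Stand-in for strpadcpy(): copy at most buf_size bytes of str and fill
 * the rest of buf with pad. The result is not NUL-terminated when str is
 * at least buf_size bytes long.
 */
static void pad_copy(char *buf, size_t buf_size, const char *str, char pad)
{
    size_t len = 0;

    while (len < buf_size && str[len] != '\0') {
        len++;
    }

    memcpy(buf, str, len);
    memset(buf + len, pad, buf_size - len);
}

/* Build the NQN from a serial string and NUL-pad it to the field size. */
static void format_subnqn(char out[SUBNQN_LEN], const char *serial)
{
    char tmp[SUBNQN_LEN];

    /* snprintf() truncates overly long serials and always terminates. */
    snprintf(tmp, sizeof(tmp), "nqn.2019-08.org.qemu:%s", serial);
    pad_copy(out, SUBNQN_LEN, tmp, '\0');
}
```

Padding with '\0' means the unused tail of the field is zeroed rather than left with whatever was in the buffer before.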


* Re: [PATCH v3 01/18] hw/block/nvme: bump spec data structures to v1.3
  2020-07-08 19:19   ` Dmitry Fomichev
@ 2020-07-08 21:24     ` Klaus Jensen
  2020-07-08 21:47       ` Dmitry Fomichev
  0 siblings, 1 reply; 60+ messages in thread
From: Klaus Jensen @ 2020-07-08 21:24 UTC (permalink / raw)
  To: Dmitry Fomichev
  Cc: fam, kwolf, qemu-block, k.jensen, qemu-devel, mreitz, kbusch,
	javier.gonz, mlevitsk, philmd

On Jul  8 19:19, Dmitry Fomichev wrote:
> Looks good with a small nit (see below),
> 
> Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
> 
> > 
> On Mon, 2020-07-06 at 08:12 +0200, Klaus Jensen wrote:
> > +#define NVME_TEMP_TMPTH(temp) ((temp >>  0) & 0xffff)
> 
> There is an extra space after temp >>
> 

Good catch! I won't repost for this ;) - but I'll fix it and add it in
the qemu-nvme tree.



* RE: [PATCH v3 01/18] hw/block/nvme: bump spec data structures to v1.3
  2020-07-08 21:24     ` Klaus Jensen
@ 2020-07-08 21:47       ` Dmitry Fomichev
  2020-07-09  6:17         ` Klaus Jensen
  0 siblings, 1 reply; 60+ messages in thread
From: Dmitry Fomichev @ 2020-07-08 21:47 UTC (permalink / raw)
  To: Klaus Jensen
  Cc: fam, kwolf, qemu-block, k.jensen, qemu-devel, mreitz, kbusch,
	javier.gonz, mlevitsk, philmd


> -----Original Message-----
> From: Klaus Jensen <its@irrelevant.dk>
> Sent: Wednesday, July 8, 2020 5:24 PM
> To: Dmitry Fomichev <Dmitry.Fomichev@wdc.com>
> Cc: qemu-block@nongnu.org; qemu-devel@nongnu.org; fam@euphon.net;
> javier.gonz@samsung.com; kwolf@redhat.com; mreitz@redhat.com;
> mlevitsk@redhat.com; philmd@redhat.com; kbusch@kernel.org;
> k.jensen@samsung.com
> Subject: Re: [PATCH v3 01/18] hw/block/nvme: bump spec data structures to
> v1.3
> 
> On Jul  8 19:19, Dmitry Fomichev wrote:
> > Looks good with a small nit (see below),
> >
> > Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
> >
> > >
> > On Mon, 2020-07-06 at 08:12 +0200, Klaus Jensen wrote:
> > > +#define NVME_TEMP_TMPTH(temp) ((temp >>  0) & 0xffff)
> >
> > There is an extra space after temp >>
> >
> 
> Good catch! I won't repost for this ;) - but I'll fix it and add it in
> the qemu-nvme tree.

Yes, no need to repost :) Thanks for reviewing our ZNS series! I am working
on addressing your comments and I am also starting to review your
"AIO and address mapping refactoring" patchset.

Cheers,
Dmitry


* Re: [PATCH v3 01/18] hw/block/nvme: bump spec data structures to v1.3
  2020-07-08 21:47       ` Dmitry Fomichev
@ 2020-07-09  6:17         ` Klaus Jensen
  0 siblings, 0 replies; 60+ messages in thread
From: Klaus Jensen @ 2020-07-09  6:17 UTC (permalink / raw)
  To: Dmitry Fomichev
  Cc: fam, kwolf, qemu-block, k.jensen, qemu-devel, mreitz, kbusch,
	javier.gonz, mlevitsk, philmd

On Jul  8 21:47, Dmitry Fomichev wrote:
> 
> > -----Original Message-----
> > From: Klaus Jensen <its@irrelevant.dk>
> > Sent: Wednesday, July 8, 2020 5:24 PM
> > To: Dmitry Fomichev <Dmitry.Fomichev@wdc.com>
> > Cc: qemu-block@nongnu.org; qemu-devel@nongnu.org; fam@euphon.net;
> > javier.gonz@samsung.com; kwolf@redhat.com; mreitz@redhat.com;
> > mlevitsk@redhat.com; philmd@redhat.com; kbusch@kernel.org;
> > k.jensen@samsung.com
> > Subject: Re: [PATCH v3 01/18] hw/block/nvme: bump spec data structures to
> > v1.3
> > 
> > On Jul  8 19:19, Dmitry Fomichev wrote:
> > > Looks good with a small nit (see below),
> > >
> > > Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
> > >
> > > >
> > > On Mon, 2020-07-06 at 08:12 +0200, Klaus Jensen wrote:
> > > > +#define NVME_TEMP_TMPTH(temp) ((temp >>  0) & 0xffff)
> > >
> > > There is an extra space after temp >>
> > >
> > 
> > Good catch! I won't repost for this ;) - but I'll fix it and add it in
> > the qemu-nvme tree.
> 
> Yes, no need to repost :) Thanks for reviewing our ZNS series! I am working
> on addressing your comments and I am also starting to review your
> "AIO and address mapping refactoring" patchset.
> 

I expect this patchset to be merged into nvme-next today, so a v2 of that
set is on the way.



* Re: [PATCH v3 00/18] hw/block/nvme: bump to v1.3
  2020-07-06  6:12 [PATCH v3 00/18] hw/block/nvme: bump to v1.3 Klaus Jensen
                   ` (17 preceding siblings ...)
  2020-07-06  6:13 ` [PATCH v3 18/18] hw/block/nvme: bump supported version to v1.3 Klaus Jensen
@ 2020-07-20  9:13 ` Klaus Jensen
  18 siblings, 0 replies; 60+ messages in thread
From: Klaus Jensen @ 2020-07-20  9:13 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Dmitry Fomichev, Klaus Jensen, qemu-devel, Max Reitz,
	Keith Busch, Philippe Mathieu-Daudé

On Jul  6 08:12, Klaus Jensen wrote:
> From: Klaus Jensen <k.jensen@samsung.com>
> 
> This adds mandatory features of NVM Express v1.3 to the emulated NVMe
> device.
> 
> 
> v3:
>   * hw/block/nvme: additional tracing
>     - Reverse logic in nvme_cid(). (Philippe)
>     - Move nvme_cid() and nvme_sqid() to source file. (Philippe)
>   * hw/block/nvme: fix missing endian conversion
>     - Move this patch to very early in the series and fix the bug properly as
>       suggested by Philippe. Then let the change trickle down through
>       the series. (Philippe)
>   * hw/block/nvme: add remaining mandatory controller parameters
>     - Move the nvme_feature_{support,default} arrays to the source file.
>       (Philippe)
>     - Add a NVME_FID_MAX constant. (Philippe)
>   * hw/block/nvme: support the get/set features select and save fields
>     - Move the nvme_feature_cap array to the source file. (Philippe)
>   * hw/block/nvme: reject invalid nsid values in active namespace id list
>     - Rework the condition and add a comment and reference to the spec.
>       (Philippe)
>   * hw/block/nvme: provide the mandatory subnqn field
>     - Change to use strpadcpy(). (Philippe)
> 
>   Had to clear some R-b's due to functional changes.
> 
>   Missing review: 2, 3, 7, 12, 16, 17
> 
> 
> v2:
>   * hw/block/nvme: bump spec data structures to v1.3
>     - Shorten some constants. (Dmitry)
>   * hw/block/nvme: add temperature threshold feature
>     - Remove unused temp_thresh member. (Dmitry)
>   * hw/block/nvme: add support for the get log page command
>     - Change the temperature field in the NvmeSmartLog struct to be an
>       uint16_t and handle weird alignment by adding QEMU_PACKED to the
>       struct. (Dmitry)
>   * hw/block/nvme: add remaining mandatory controller parameters
>     - Fix spelling. (Dmitry)
>   * hw/block/nvme: support the get/set features select and save fields
>     - Fix bad logic causing temperature thresholds to always report
>       defaults. (Dmitry)
>   * hw/block/nvme: reject invalid nsid values in active namespace id list
>     - Added patch; reject the 0xfffffffe and 0xffffffff nsid values.
> 
> 
> $ git-backport-diff -u for-master/bump-to-v1.3-v2 -r upstream/master... -S
> Key:
> [----] : patches are identical
> [####] : number of functional differences between upstream/downstream patch
> [down] : patch is downstream-only
> The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively
> 
> 001/18:[----] [--] 'hw/block/nvme: bump spec data structures to v1.3'
> 002/18:[0008] [FC] 'hw/block/nvme: fix missing endian conversion'
> 003/18:[0028] [FC] 'hw/block/nvme: additional tracing'
> 004/18:[----] [--] 'hw/block/nvme: add support for the abort command'
> 005/18:[0004] [FC] 'hw/block/nvme: add temperature threshold feature'
> 006/18:[----] [--] 'hw/block/nvme: mark fw slot 1 as read-only'
> 007/18:[----] [--] 'hw/block/nvme: add support for the get log page command'
> 008/18:[0002] [FC] 'hw/block/nvme: add support for the asynchronous event request command'
> 009/18:[----] [--] 'hw/block/nvme: move NvmeFeatureVal into hw/block/nvme.h'
> 010/18:[----] [--] 'hw/block/nvme: flush write cache when disabled'
> 011/18:[0044] [FC] 'hw/block/nvme: add remaining mandatory controller parameters'
> 012/18:[0024] [FC] 'hw/block/nvme: support the get/set features select and save fields'
> 013/18:[----] [--] 'hw/block/nvme: make sure ncqr and nsqr is valid'
> 014/18:[----] [--] 'hw/block/nvme: support identify namespace descriptor list'
> 015/18:[0008] [FC] 'hw/block/nvme: reject invalid nsid values in active namespace id list'
> 016/18:[----] [--] 'hw/block/nvme: enforce valid queue creation sequence'
> 017/18:[0006] [FC] 'hw/block/nvme: provide the mandatory subnqn field'
> 018/18:[----] [--] 'hw/block/nvme: bump supported version to v1.3'
> 
> 
> Klaus Jensen (18):
>   hw/block/nvme: bump spec data structures to v1.3
>   hw/block/nvme: fix missing endian conversion
>   hw/block/nvme: additional tracing
>   hw/block/nvme: add support for the abort command
>   hw/block/nvme: add temperature threshold feature
>   hw/block/nvme: mark fw slot 1 as read-only
>   hw/block/nvme: add support for the get log page command
>   hw/block/nvme: add support for the asynchronous event request command
>   hw/block/nvme: move NvmeFeatureVal into hw/block/nvme.h
>   hw/block/nvme: flush write cache when disabled
>   hw/block/nvme: add remaining mandatory controller parameters
>   hw/block/nvme: support the get/set features select and save fields
>   hw/block/nvme: make sure ncqr and nsqr is valid
>   hw/block/nvme: support identify namespace descriptor list
>   hw/block/nvme: reject invalid nsid values in active namespace id list
>   hw/block/nvme: enforce valid queue creation sequence
>   hw/block/nvme: provide the mandatory subnqn field
>   hw/block/nvme: bump supported version to v1.3
> 
>  block/nvme.c          |  18 +-
>  hw/block/nvme.c       | 676 ++++++++++++++++++++++++++++++++++++++++--
>  hw/block/nvme.h       |  22 +-
>  hw/block/trace-events |  27 +-
>  include/block/nvme.h  | 225 +++++++++++---
>  5 files changed, 892 insertions(+), 76 deletions(-)
> 
> -- 
> 2.27.0
> 
> 

Thanks for the reviews everyone, applied to nvme-next.


Klaus



* Re: [PATCH v3 02/18] hw/block/nvme: fix missing endian conversion
  2020-07-06  6:12 ` [PATCH v3 02/18] hw/block/nvme: fix missing endian conversion Klaus Jensen
  2020-07-06  9:50   ` Philippe Mathieu-Daudé
  2020-07-08 19:20   ` Dmitry Fomichev
@ 2020-07-29  8:49   ` Maxim Levitsky
  2 siblings, 0 replies; 60+ messages in thread
From: Maxim Levitsky @ 2020-07-29  8:49 UTC (permalink / raw)
  To: Klaus Jensen, qemu-block
  Cc: Kevin Wolf, Dmitry Fomichev, Klaus Jensen, qemu-devel, Max Reitz,
	Keith Busch, Javier Gonzalez, Philippe Mathieu-Daudé

On Mon, 2020-07-06 at 08:12 +0200, Klaus Jensen wrote:
> From: Klaus Jensen <k.jensen@samsung.com>
> 
> Fix a missing cpu_to conversion by moving conversion to just before
> returning instead.
> 
> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> Suggested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> ---
>  hw/block/nvme.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index 71b388aa0e20..766cd5b33bb1 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -815,8 +815,8 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>          trace_pci_nvme_getfeat_vwcache(result ? "enabled" : "disabled");
>          break;
>      case NVME_NUMBER_OF_QUEUES:
> -        result = cpu_to_le32((n->params.max_ioqpairs - 1) |
> -                             ((n->params.max_ioqpairs - 1) << 16));
> +        result = (n->params.max_ioqpairs - 1) |
> +            ((n->params.max_ioqpairs - 1) << 16);
>          trace_pci_nvme_getfeat_numq(result);
>          break;
>      case NVME_TIMESTAMP:
> @@ -826,7 +826,7 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>          return NVME_INVALID_FIELD | NVME_DNR;
>      }
>  
> -    req->cqe.result = result;
> +    req->cqe.result = cpu_to_le32(result);
>      return NVME_SUCCESS;
>  }
>  
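
The pattern the fix settles on, building the value in host byte order and converting exactly once when it is stored, can be sketched in portable C. Both function names below are hypothetical; store_le32() stands in for the effect of cpu_to_le32() followed by a store, and numq_result() assumes max_ioqpairs is at least 1 and fits in 16 bits.

```c
#include <stdint.h>

/*
 * Number of Queues result in host byte order: the 0's based number of
 * submission queues in bits 15:0 and of completion queues in bits 31:16.
 */
static uint32_t numq_result(uint32_t max_ioqpairs)
{
    uint32_t n = max_ioqpairs - 1;

    return n | (n << 16);
}

/* Write v as four little-endian bytes, independent of host endianness. */
static void store_le32(uint8_t out[4], uint32_t v)
{
    out[0] = (uint8_t)(v & 0xff);
    out[1] = (uint8_t)((v >> 8) & 0xff);
    out[2] = (uint8_t)((v >> 16) & 0xff);
    out[3] = (uint8_t)((v >> 24) & 0xff);
}
```

Keeping the intermediate value in host order also means it can be passed to tracing or compared against constants without a second conversion.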
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky




* Re: [PATCH v3 03/18] hw/block/nvme: additional tracing
  2020-07-06  6:12 ` [PATCH v3 03/18] hw/block/nvme: additional tracing Klaus Jensen
  2020-07-06  9:50   ` Philippe Mathieu-Daudé
  2020-07-08 19:21   ` Dmitry Fomichev
@ 2020-07-29  8:52   ` Maxim Levitsky
  2 siblings, 0 replies; 60+ messages in thread
From: Maxim Levitsky @ 2020-07-29  8:52 UTC (permalink / raw)
  To: Klaus Jensen, qemu-block
  Cc: Kevin Wolf, Dmitry Fomichev, Klaus Jensen, qemu-devel, Max Reitz,
	Keith Busch, Javier Gonzalez, Philippe Mathieu-Daudé

On Mon, 2020-07-06 at 08:12 +0200, Klaus Jensen wrote:
> From: Klaus Jensen <k.jensen@samsung.com>
> 
> Add various additional tracing and streamline nvme_identify_ns and
> nvme_identify_nslist (they do not need to repeat the command, it is
> already in the trace name).
> 
> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> ---
>  hw/block/nvme.c       | 33 +++++++++++++++++++++++++++++++++
>  hw/block/trace-events | 13 +++++++++++--
>  2 files changed, 44 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index 766cd5b33bb1..09ef54d771c4 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -69,6 +69,20 @@
>  
>  static void nvme_process_sq(void *opaque);
>  
> +static uint16_t nvme_cid(NvmeRequest *req)
> +{
> +    if (!req) {
> +        return 0xffff;
> +    }
> +
> +    return le16_to_cpu(req->cqe.cid);
> +}
> +
> +static uint16_t nvme_sqid(NvmeRequest *req)
> +{
> +    return le16_to_cpu(req->sq->sqid);
> +}
> +
>  static bool nvme_addr_is_cmb(NvmeCtrl *n, hwaddr addr)
>  {
>      hwaddr low = n->ctrl_mem.addr;
> @@ -331,6 +345,8 @@ static void nvme_post_cqes(void *opaque)
>  static void nvme_enqueue_req_completion(NvmeCQueue *cq, NvmeRequest *req)
>  {
>      assert(cq->cqid == req->sq->cqid);
> +    trace_pci_nvme_enqueue_req_completion(nvme_cid(req), cq->cqid,
> +                                          req->status);
>      QTAILQ_REMOVE(&req->sq->out_req_list, req, entry);
>      QTAILQ_INSERT_TAIL(&cq->req_list, req, entry);
>      timer_mod(cq->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + 500);
> @@ -343,6 +359,8 @@ static void nvme_rw_cb(void *opaque, int ret)
>      NvmeCtrl *n = sq->ctrl;
>      NvmeCQueue *cq = n->cq[sq->cqid];
>  
> +    trace_pci_nvme_rw_cb(nvme_cid(req));
> +
>      if (!ret) {
>          block_acct_done(blk_get_stats(n->conf.blk), &req->acct);
>          req->status = NVME_SUCCESS;
> @@ -378,6 +396,8 @@ static uint16_t nvme_write_zeros(NvmeCtrl *n, NvmeNamespace *ns, NvmeCmd *cmd,
>      uint64_t offset = slba << data_shift;
>      uint32_t count = nlb << data_shift;
>  
> +    trace_pci_nvme_write_zeroes(nvme_cid(req), slba, nlb);
> +
>      if (unlikely(slba + nlb > ns->id_ns.nsze)) {
>          trace_pci_nvme_err_invalid_lba_range(slba, nlb, ns->id_ns.nsze);
>          return NVME_LBA_RANGE | NVME_DNR;
> @@ -445,6 +465,8 @@ static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>      NvmeNamespace *ns;
>      uint32_t nsid = le32_to_cpu(cmd->nsid);
>  
> +    trace_pci_nvme_io_cmd(nvme_cid(req), nsid, nvme_sqid(req), cmd->opcode);
> +
>      if (unlikely(nsid == 0 || nsid > n->num_namespaces)) {
>          trace_pci_nvme_err_invalid_ns(nsid, n->num_namespaces);
>          return NVME_INVALID_NSID | NVME_DNR;
> @@ -876,6 +898,8 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>  
>  static uint16_t nvme_admin_cmd(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>  {
> +    trace_pci_nvme_admin_cmd(nvme_cid(req), nvme_sqid(req), cmd->opcode);
> +
>      switch (cmd->opcode) {
>      case NVME_ADM_CMD_DELETE_SQ:
>          return nvme_del_sq(n, cmd);
> @@ -1204,6 +1228,8 @@ static uint64_t nvme_mmio_read(void *opaque, hwaddr addr, unsigned size)
>      uint8_t *ptr = (uint8_t *)&n->bar;
>      uint64_t val = 0;
>  
> +    trace_pci_nvme_mmio_read(addr);
> +
>      if (unlikely(addr & (sizeof(uint32_t) - 1))) {
>          NVME_GUEST_ERR(pci_nvme_ub_mmiord_misaligned32,
>                         "MMIO read not 32-bit aligned,"
> @@ -1273,6 +1299,8 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
>              return;
>          }
>  
> +        trace_pci_nvme_mmio_doorbell_cq(cq->cqid, new_head);
> +
>          start_sqs = nvme_cq_full(cq) ? 1 : 0;
>          cq->head = new_head;
>          if (start_sqs) {
> @@ -1311,6 +1339,8 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
>              return;
>          }
>  
> +        trace_pci_nvme_mmio_doorbell_sq(sq->sqid, new_tail);
> +
>          sq->tail = new_tail;
>          timer_mod(sq->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + 500);
>      }
> @@ -1320,6 +1350,9 @@ static void nvme_mmio_write(void *opaque, hwaddr addr, uint64_t data,
>      unsigned size)
>  {
>      NvmeCtrl *n = (NvmeCtrl *)opaque;
> +
> +    trace_pci_nvme_mmio_write(addr, data);
> +
>      if (addr < sizeof(n->bar)) {
>          nvme_write_bar(n, addr, data, size);
>      } else if (addr >= 0x1000) {
> diff --git a/hw/block/trace-events b/hw/block/trace-events
> index 958fcc5508d1..c40c0d2e4b28 100644
> --- a/hw/block/trace-events
> +++ b/hw/block/trace-events
> @@ -33,19 +33,28 @@ pci_nvme_irq_msix(uint32_t vector) "raising MSI-X IRQ vector %u"
>  pci_nvme_irq_pin(void) "pulsing IRQ pin"
>  pci_nvme_irq_masked(void) "IRQ is masked"
>  pci_nvme_dma_read(uint64_t prp1, uint64_t prp2) "DMA read, prp1=0x%"PRIx64" prp2=0x%"PRIx64""
> +pci_nvme_io_cmd(uint16_t cid, uint32_t nsid, uint16_t sqid, uint8_t opcode) "cid %"PRIu16" nsid %"PRIu32" sqid %"PRIu16" opc 0x%"PRIx8""
> +pci_nvme_admin_cmd(uint16_t cid, uint16_t sqid, uint8_t opcode) "cid %"PRIu16" sqid %"PRIu16" opc 0x%"PRIx8""
>  pci_nvme_rw(const char *verb, uint32_t blk_count, uint64_t byte_count, uint64_t lba) "%s %"PRIu32" blocks (%"PRIu64" bytes) from LBA %"PRIu64""
> +pci_nvme_rw_cb(uint16_t cid) "cid %"PRIu16""
> +pci_nvme_write_zeroes(uint16_t cid, uint64_t slba, uint32_t nlb) "cid %"PRIu16" slba %"PRIu64" nlb %"PRIu32""
>  pci_nvme_create_sq(uint64_t addr, uint16_t sqid, uint16_t cqid, uint16_t qsize, uint16_t qflags) "create submission queue, addr=0x%"PRIx64", sqid=%"PRIu16", cqid=%"PRIu16", qsize=%"PRIu16", qflags=%"PRIu16""
>  pci_nvme_create_cq(uint64_t addr, uint16_t cqid, uint16_t vector, uint16_t size, uint16_t qflags, int ien) "create completion queue, addr=0x%"PRIx64", cqid=%"PRIu16", vector=%"PRIu16", qsize=%"PRIu16", qflags=%"PRIu16", ien=%d"
>  pci_nvme_del_sq(uint16_t qid) "deleting submission queue sqid=%"PRIu16""
>  pci_nvme_del_cq(uint16_t cqid) "deleted completion queue, cqid=%"PRIu16""
>  pci_nvme_identify_ctrl(void) "identify controller"
> -pci_nvme_identify_ns(uint16_t ns) "identify namespace, nsid=%"PRIu16""
> -pci_nvme_identify_nslist(uint16_t ns) "identify namespace list, nsid=%"PRIu16""
> +pci_nvme_identify_ns(uint32_t ns) "nsid %"PRIu32""
> +pci_nvme_identify_nslist(uint32_t ns) "nsid %"PRIu32""
>  pci_nvme_getfeat_vwcache(const char* result) "get feature volatile write cache, result=%s"
>  pci_nvme_getfeat_numq(int result) "get feature number of queues, result=%d"
>  pci_nvme_setfeat_numq(int reqcq, int reqsq, int gotcq, int gotsq) "requested cq_count=%d sq_count=%d, responding with cq_count=%d sq_count=%d"
>  pci_nvme_setfeat_timestamp(uint64_t ts) "set feature timestamp = 0x%"PRIx64""
>  pci_nvme_getfeat_timestamp(uint64_t ts) "get feature timestamp = 0x%"PRIx64""
> +pci_nvme_enqueue_req_completion(uint16_t cid, uint16_t cqid, uint16_t status) "cid %"PRIu16" cqid %"PRIu16" status 0x%"PRIx16""
> +pci_nvme_mmio_read(uint64_t addr) "addr 0x%"PRIx64""
> +pci_nvme_mmio_write(uint64_t addr, uint64_t data) "addr 0x%"PRIx64" data 0x%"PRIx64""
> +pci_nvme_mmio_doorbell_cq(uint16_t cqid, uint16_t new_head) "cqid %"PRIu16" new_head %"PRIu16""
> +pci_nvme_mmio_doorbell_sq(uint16_t sqid, uint16_t new_tail) "sqid %"PRIu16" new_tail %"PRIu16""
>  pci_nvme_mmio_intm_set(uint64_t data, uint64_t new_mask) "wrote MMIO, interrupt mask set, data=0x%"PRIx64", new_mask=0x%"PRIx64""
>  pci_nvme_mmio_intm_clr(uint64_t data, uint64_t new_mask) "wrote MMIO, interrupt mask clr, data=0x%"PRIx64", new_mask=0x%"PRIx64""
>  pci_nvme_mmio_cfg(uint64_t data) "wrote MMIO, config controller config=0x%"PRIx64""
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky




* Re: [PATCH v3 06/18] hw/block/nvme: mark fw slot 1 as read-only
  2020-07-06  6:12 ` [PATCH v3 06/18] hw/block/nvme: mark fw slot 1 as read-only Klaus Jensen
@ 2020-07-29  9:14   ` Maxim Levitsky
  0 siblings, 0 replies; 60+ messages in thread
From: Maxim Levitsky @ 2020-07-29  9:14 UTC (permalink / raw)
  To: Klaus Jensen, qemu-block
  Cc: Kevin Wolf, Dmitry Fomichev, Klaus Jensen, qemu-devel, Max Reitz,
	Keith Busch, Javier Gonzalez, Philippe Mathieu-Daudé

On Mon, 2020-07-06 at 08:12 +0200, Klaus Jensen wrote:
> From: Klaus Jensen <k.jensen@samsung.com>
> 
> Mark firmware slot 1 as read-only and only support that slot.
> 
> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
> ---
>  hw/block/nvme.c      | 3 ++-
>  include/block/nvme.h | 4 ++++
>  2 files changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index a330ccf91620..b6bc75eb61a2 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -62,6 +62,7 @@
>  #define NVME_TEMPERATURE 0x143
>  #define NVME_TEMPERATURE_WARNING 0x157
>  #define NVME_TEMPERATURE_CRITICAL 0x175
> +#define NVME_NUM_FW_SLOTS 1
>  
>  #define NVME_GUEST_ERR(trace, fmt, ...) \
>      do { \
> @@ -1666,7 +1667,7 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
>       * inconsequential.
>       */
>      id->acl = 3;
> -    id->frmw = 7 << 1;
> +    id->frmw = (NVME_NUM_FW_SLOTS << 1) | NVME_FRMW_SLOT1_RO;

Off-topic:
The spec did not make this field a 0's based value, yet it still requires at
least one slot. Just to keep us entertained, I guess :-)

>      id->lpa = 1 << 0;
>  
>      /* recommended default value (~70 C) */
> diff --git a/include/block/nvme.h b/include/block/nvme.h
> index d2c457695b38..d639e8bbee92 100644
> --- a/include/block/nvme.h
> +++ b/include/block/nvme.h
> @@ -842,6 +842,10 @@ enum NvmeIdCtrlOncs {
>      NVME_ONCS_TIMESTAMP     = 1 << 6,
>  };
>  
> +enum NvmeIdCtrlFrmw {
> +    NVME_FRMW_SLOT1_RO = 1 << 0,
> +};
> +
>  #define NVME_CTRL_SQES_MIN(sqes) ((sqes) & 0xf)
>  #define NVME_CTRL_SQES_MAX(sqes) (((sqes) >> 4) & 0xf)
>  #define NVME_CTRL_CQES_MIN(cqes) ((cqes) & 0xf)
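
For reference, the layout of the FRMW byte used in the hunk above can be sketched as follows. The helper names are hypothetical and only the two fields touched by the patch are modelled: bit 0 marks the first slot as read-only and bits 3:1 hold the number of firmware slots, which is not 0's based.

```c
#include <stdbool.h>
#include <stdint.h>

#define FRMW_SLOT1_RO        (1u << 0)  /* NVME_FRMW_SLOT1_RO in the patch */
#define FRMW_NUM_SLOTS_SHIFT 1
#define FRMW_NUM_SLOTS_MASK  0x7u

/* Pack the slot count (1..7) and the read-only flag into the FRMW byte. */
static uint8_t frmw_encode(unsigned num_slots, bool slot1_ro)
{
    return (uint8_t)(((num_slots & FRMW_NUM_SLOTS_MASK) << FRMW_NUM_SLOTS_SHIFT) |
                     (slot1_ro ? FRMW_SLOT1_RO : 0u));
}

/* Extract the number of firmware slots from the FRMW byte. */
static unsigned frmw_num_slots(uint8_t frmw)
{
    return (frmw >> FRMW_NUM_SLOTS_SHIFT) & FRMW_NUM_SLOTS_MASK;
}

/* True if the first firmware slot is advertised as read-only. */
static bool frmw_slot1_ro(uint8_t frmw)
{
    return (frmw & FRMW_SLOT1_RO) != 0;
}
```

Under this layout the old value, 7 << 1, advertised seven writable slots, whereas the new value encodes a single read-only slot as 0x03.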

Looks good,

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky




* Re: [PATCH v3 07/18] hw/block/nvme: add support for the get log page command
  2020-07-06  6:12 ` [PATCH v3 07/18] hw/block/nvme: add support for the get log page command Klaus Jensen
  2020-07-08 19:22   ` Dmitry Fomichev
@ 2020-07-29 10:24   ` Maxim Levitsky
  2020-07-29 11:44     ` Klaus Jensen
  2020-09-29 13:11   ` Peter Maydell
  2 siblings, 1 reply; 60+ messages in thread
From: Maxim Levitsky @ 2020-07-29 10:24 UTC (permalink / raw)
  To: Klaus Jensen, qemu-block
  Cc: Kevin Wolf, Dmitry Fomichev, Klaus Jensen, qemu-devel, Max Reitz,
	Keith Busch, Javier Gonzalez, Philippe Mathieu-Daudé

On Mon, 2020-07-06 at 08:12 +0200, Klaus Jensen wrote:
> From: Klaus Jensen <k.jensen@samsung.com>
> 
> Add support for the Get Log Page command and basic implementations of
> the mandatory Error Information, SMART / Health Information and Firmware
> Slot Information log pages.
> 
> In violation of the specification, the SMART / Health Information log
> page does not persist information over the lifetime of the controller
> because the device has no place to store such persistent state.
> 
> Note that the LPA field in the Identify Controller data structure
> intentionally has bit 0 cleared because there is no namespace specific
> information in the SMART / Health information log page.
> 
> Required for compliance with NVMe revision 1.3d. See NVM Express 1.3d,
> Section 5.14 ("Get Log Page command").
> 
> Signed-off-by: Klaus Jensen <klaus.jensen@cnexlabs.com>
> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> Acked-by: Keith Busch <kbusch@kernel.org>
> ---
>  hw/block/nvme.c       | 140 +++++++++++++++++++++++++++++++++++++++++-
>  hw/block/nvme.h       |   2 +
>  hw/block/trace-events |   2 +
>  include/block/nvme.h  |   8 ++-
>  4 files changed, 149 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index b6bc75eb61a2..7cb3787638f6 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -606,6 +606,140 @@ static uint16_t nvme_create_sq(NvmeCtrl *n, NvmeCmd *cmd)
>      return NVME_SUCCESS;
>  }
>  
> +static uint16_t nvme_smart_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
> +                                uint64_t off, NvmeRequest *req)
> +{
> +    uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
> +    uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
> +    uint32_t nsid = le32_to_cpu(cmd->nsid);
> +
> +    uint32_t trans_len;
> +    time_t current_ms;
> +    uint64_t units_read = 0, units_written = 0;
> +    uint64_t read_commands = 0, write_commands = 0;
> +    NvmeSmartLog smart;
> +    BlockAcctStats *s;
> +
> +    if (nsid && nsid != 0xffffffff) {
> +        return NVME_INVALID_FIELD | NVME_DNR;
> +    }
Correct.
> +
> +    s = blk_get_stats(n->conf.blk);
> +
> +    units_read = s->nr_bytes[BLOCK_ACCT_READ] >> BDRV_SECTOR_BITS;
> +    units_written = s->nr_bytes[BLOCK_ACCT_WRITE] >> BDRV_SECTOR_BITS;
> +    read_commands = s->nr_ops[BLOCK_ACCT_READ];
> +    write_commands = s->nr_ops[BLOCK_ACCT_WRITE];
> +
> +    if (off > sizeof(smart)) {
> +        return NVME_INVALID_FIELD | NVME_DNR;
> +    }
> +
> +    trans_len = MIN(sizeof(smart) - off, buf_len);
> +
> +    memset(&smart, 0x0, sizeof(smart));
> +
> +    smart.data_units_read[0] = cpu_to_le64(units_read / 1000);
> +    smart.data_units_written[0] = cpu_to_le64(units_written / 1000);
Tiny nitpick: the spec asks for the value to be rounded up.
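
A minimal sketch of the rounding, assuming the byte counters have already been converted to 512-byte units as the patch does. `data_units_rounded_up` is a hypothetical helper name, not something in the patch; inside QEMU the existing DIV_ROUND_UP() macro could probably serve the same purpose:

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: convert a count of 512-byte
 * units into the "thousands of units" value reported in the SMART log,
 * rounding up instead of truncating.
 */
static inline uint64_t data_units_rounded_up(uint64_t units_512b)
{
    /* Written this way to avoid overflowing units_512b + 999. */
    return units_512b / 1000 + (units_512b % 1000 != 0);
}
```

With this, a single 512-byte unit read is reported as 1 rather than 0.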

> +    smart.host_read_commands[0] = cpu_to_le64(read_commands);
> +    smart.host_write_commands[0] = cpu_to_le64(write_commands);
> +
> +    smart.temperature = cpu_to_le16(n->temperature);
> +
> +    if ((n->temperature >= n->features.temp_thresh_hi) ||
> +        (n->temperature <= n->features.temp_thresh_low)) {
> +        smart.critical_warning |= NVME_SMART_TEMPERATURE;
> +    }
> +
> +    current_ms = qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL);
> +    smart.power_on_hours[0] =
> +        cpu_to_le64((((current_ms - n->starttime_ms) / 1000) / 60) / 60);
> +
> +    return nvme_dma_read_prp(n, (uint8_t *) &smart + off, trans_len, prp1,
> +                             prp2);
> +}
> +
> +static uint16_t nvme_fw_log_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
> +                                 uint64_t off, NvmeRequest *req)
> +{
> +    uint32_t trans_len;
> +    uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
> +    uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
> +    NvmeFwSlotInfoLog fw_log = {
> +        .afi = 0x1,
> +    };
> +
> +    strpadcpy((char *)&fw_log.frs1, sizeof(fw_log.frs1), "1.0", ' ');

I always thought that the firmware log could just be zeroed out, but this is correct
now that I have checked the spec again.
> +
> +    if (off > sizeof(fw_log)) {
> +        return NVME_INVALID_FIELD | NVME_DNR;
> +    }
> +
> +    trans_len = MIN(sizeof(fw_log) - off, buf_len);
> +
> +    return nvme_dma_read_prp(n, (uint8_t *) &fw_log + off, trans_len, prp1,
> +                             prp2);
> +}
> +
> +static uint16_t nvme_error_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
> +                                uint64_t off, NvmeRequest *req)
> +{
> +    uint32_t trans_len;
> +    uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
> +    uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
> +    NvmeErrorLog errlog;
> +
> +    if (off > sizeof(errlog)) {
> +        return NVME_INVALID_FIELD | NVME_DNR;
> +    }
> +
> +    memset(&errlog, 0x0, sizeof(errlog));
> +
> +    trans_len = MIN(sizeof(errlog) - off, buf_len);
> +
> +    return nvme_dma_read_prp(n, (uint8_t *)&errlog, trans_len, prp1, prp2);
Looks good.
> +}
> +
> +static uint16_t nvme_get_log(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
> +{
> +    uint32_t dw10 = le32_to_cpu(cmd->cdw10);
> +    uint32_t dw11 = le32_to_cpu(cmd->cdw11);
> +    uint32_t dw12 = le32_to_cpu(cmd->cdw12);
> +    uint32_t dw13 = le32_to_cpu(cmd->cdw13);
> +    uint8_t  lid = dw10 & 0xff;
> +    uint8_t  lsp = (dw10 >> 8) & 0xf;
> +    uint8_t  rae = (dw10 >> 15) & 0x1;
> +    uint32_t numdl, numdu;
> +    uint64_t off, lpol, lpou;
> +    size_t   len;
> +
Nitpick: don't we want to check NSID=0 || NSID=0xFFFFFFFF here too?
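
For illustration only, the check in question could be factored as below. `nsid_is_controller_scope` is a made-up name, and whether such a check belongs in nvme_get_log() for every log page is exactly the open question:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if the NSID addresses the controller as a
 * whole (0) or is the broadcast value (0xffffffff), mirroring the
 * condition used in nvme_smart_info().
 */
static inline bool nsid_is_controller_scope(uint32_t nsid)
{
    return nsid == 0 || nsid == 0xffffffff;
}
```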

> +    numdl = (dw10 >> 16);
> +    numdu = (dw11 & 0xffff);
> +    lpol = dw12;
> +    lpou = dw13;
> +
> +    len = (((numdu << 16) | numdl) + 1) << 2;
> +    off = (lpou << 32ULL) | lpol;
> +
> +    if (off & 0x3) {
> +        return NVME_INVALID_FIELD | NVME_DNR;
> +    }
Looks OK
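
As a worked example of the decoding above, here is a standalone sketch with invented helper names. It uses 64-bit intermediates, which is a choice of this sketch rather than what the patch does:

```c
#include <stdint.h>

/*
 * Transfer length in bytes from NUMDL (dw10 bits 31:16) and NUMDU
 * (dw11 bits 15:0). The combined dword count is zero-based, hence + 1.
 */
static inline uint64_t get_log_len(uint32_t dw10, uint32_t dw11)
{
    uint32_t numdl = dw10 >> 16;
    uint32_t numdu = dw11 & 0xffff;
    uint64_t numd = ((uint64_t)numdu << 16) | numdl;

    return (numd + 1) << 2;
}

/* Byte offset from LPOL (dw12) and LPOU (dw13). */
static inline uint64_t get_log_off(uint32_t dw12, uint32_t dw13)
{
    return ((uint64_t)dw13 << 32) | dw12;
}
```

For instance, NUMDL = 0x7f with NUMDU = 0 requests 128 dwords, i.e. 512 bytes, and an offset with either of the low two bits set fails the dword-alignment check.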
> +
> +    trace_pci_nvme_get_log(nvme_cid(req), lid, lsp, rae, len, off);
> +
> +    switch (lid) {
> +    case NVME_LOG_ERROR_INFO:
> +        return nvme_error_info(n, cmd, len, off, req);
> +    case NVME_LOG_SMART_INFO:
> +        return nvme_smart_info(n, cmd, len, off, req);
> +    case NVME_LOG_FW_SLOT_INFO:
> +        return nvme_fw_log_info(n, cmd, len, off, req);
> +    default:
> +        trace_pci_nvme_err_invalid_log_page(nvme_cid(req), lid);
> +        return NVME_INVALID_FIELD | NVME_DNR;
> +    }
> +}
> +
>  static void nvme_free_cq(NvmeCQueue *cq, NvmeCtrl *n)
>  {
>      n->cq[cq->cqid] = NULL;
> @@ -960,6 +1094,8 @@ static uint16_t nvme_admin_cmd(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>          return nvme_del_sq(n, cmd);
>      case NVME_ADM_CMD_CREATE_SQ:
>          return nvme_create_sq(n, cmd);
> +    case NVME_ADM_CMD_GET_LOG_PAGE:
> +        return nvme_get_log(n, cmd, req);
>      case NVME_ADM_CMD_DELETE_CQ:
>          return nvme_del_cq(n, cmd);
>      case NVME_ADM_CMD_CREATE_CQ:
> @@ -1511,7 +1647,9 @@ static void nvme_init_state(NvmeCtrl *n)
>      n->namespaces = g_new0(NvmeNamespace, n->num_namespaces);
>      n->sq = g_new0(NvmeSQueue *, n->params.max_ioqpairs + 1);
>      n->cq = g_new0(NvmeCQueue *, n->params.max_ioqpairs + 1);
> +    n->temperature = NVME_TEMPERATURE;
>      n->features.temp_thresh_hi = NVME_TEMPERATURE_WARNING;
> +    n->starttime_ms = qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL);
>  }
>  
>  static void nvme_init_blk(NvmeCtrl *n, Error **errp)
> @@ -1668,7 +1806,7 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
>       */
>      id->acl = 3;
>      id->frmw = (NVME_NUM_FW_SLOTS << 1) | NVME_FRMW_SLOT1_RO;
> -    id->lpa = 1 << 0;
> +    id->lpa = NVME_LPA_EXTENDED;
>  
>      /* recommended default value (~70 C) */
>      id->wctemp = cpu_to_le16(NVME_TEMPERATURE_WARNING);
> diff --git a/hw/block/nvme.h b/hw/block/nvme.h
> index e3a2c907e210..8228978e93de 100644
> --- a/hw/block/nvme.h
> +++ b/hw/block/nvme.h
> @@ -98,6 +98,8 @@ typedef struct NvmeCtrl {
>      uint32_t    irq_status;
>      uint64_t    host_timestamp;                 /* Timestamp sent by the host */
>      uint64_t    timestamp_set_qemu_clock_ms;    /* QEMU clock time */
> +    uint64_t    starttime_ms;
> +    uint16_t    temperature;
>  
>      HostMemoryBackend *pmrdev;
>  
> diff --git a/hw/block/trace-events b/hw/block/trace-events
> index c40c0d2e4b28..3330d74e48db 100644
> --- a/hw/block/trace-events
> +++ b/hw/block/trace-events
> @@ -45,6 +45,7 @@ pci_nvme_del_cq(uint16_t cqid) "deleted completion queue, cqid=%"PRIu16""
>  pci_nvme_identify_ctrl(void) "identify controller"
>  pci_nvme_identify_ns(uint32_t ns) "nsid %"PRIu32""
>  pci_nvme_identify_nslist(uint32_t ns) "nsid %"PRIu32""
> +pci_nvme_get_log(uint16_t cid, uint8_t lid, uint8_t lsp, uint8_t rae, uint32_t len, uint64_t off) "cid %"PRIu16" lid 0x%"PRIx8" lsp 0x%"PRIx8" rae 0x%"PRIx8" len %"PRIu32" off %"PRIu64""
>  pci_nvme_getfeat_vwcache(const char* result) "get feature volatile write cache, result=%s"
>  pci_nvme_getfeat_numq(int result) "get feature number of queues, result=%d"
>  pci_nvme_setfeat_numq(int reqcq, int reqsq, int gotcq, int gotsq) "requested cq_count=%d sq_count=%d, responding with cq_count=%d sq_count=%d"
> @@ -94,6 +95,7 @@ pci_nvme_err_invalid_create_cq_qflags(uint16_t qflags) "failed creating completi
>  pci_nvme_err_invalid_identify_cns(uint16_t cns) "identify, invalid cns=0x%"PRIx16""
>  pci_nvme_err_invalid_getfeat(int dw10) "invalid get features, dw10=0x%"PRIx32""
>  pci_nvme_err_invalid_setfeat(uint32_t dw10) "invalid set features, dw10=0x%"PRIx32""
> +pci_nvme_err_invalid_log_page(uint16_t cid, uint16_t lid) "cid %"PRIu16" lid 0x%"PRIx16""
>  pci_nvme_err_startfail_cq(void) "nvme_start_ctrl failed because there are non-admin completion queues"
>  pci_nvme_err_startfail_sq(void) "nvme_start_ctrl failed because there are non-admin submission queues"
>  pci_nvme_err_startfail_nbarasq(void) "nvme_start_ctrl failed because the admin submission queue address is null"
> diff --git a/include/block/nvme.h b/include/block/nvme.h
> index d639e8bbee92..49ce97ae1ab4 100644
> --- a/include/block/nvme.h
> +++ b/include/block/nvme.h
> @@ -704,9 +704,9 @@ typedef struct NvmeErrorLog {
>      uint8_t     resv[35];
>  } NvmeErrorLog;
>  
> -typedef struct NvmeSmartLog {
> +typedef struct QEMU_PACKED NvmeSmartLog {
>      uint8_t     critical_warning;
> -    uint8_t     temperature[2];
> +    uint16_t    temperature;
>      uint8_t     available_spare;
>      uint8_t     available_spare_threshold;
>      uint8_t     percentage_used;
> @@ -846,6 +846,10 @@ enum NvmeIdCtrlFrmw {
>      NVME_FRMW_SLOT1_RO = 1 << 0,
>  };
>  
> +enum NvmeIdCtrlLpa {
> +    NVME_LPA_EXTENDED = 1 << 2,
> +};
> +
>  #define NVME_CTRL_SQES_MIN(sqes) ((sqes) & 0xf)
>  #define NVME_CTRL_SQES_MAX(sqes) (((sqes) >> 4) & 0xf)
>  #define NVME_CTRL_CQES_MIN(cqes) ((cqes) & 0xf)

Other than a few nitpicks that don't matter much,

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky





* Re: [PATCH v3 08/18] hw/block/nvme: add support for the asynchronous event request command
  2020-07-06  6:12 ` [PATCH v3 08/18] hw/block/nvme: add support for the asynchronous event request command Klaus Jensen
@ 2020-07-29 10:43   ` Maxim Levitsky
  2020-07-29 13:37     ` Klaus Jensen
  0 siblings, 1 reply; 60+ messages in thread
From: Maxim Levitsky @ 2020-07-29 10:43 UTC (permalink / raw)
  To: Klaus Jensen, qemu-block
  Cc: Kevin Wolf, Dmitry Fomichev, Klaus Jensen, qemu-devel, Max Reitz,
	Keith Busch, Javier Gonzalez, Philippe Mathieu-Daudé

On Mon, 2020-07-06 at 08:12 +0200, Klaus Jensen wrote:
> From: Klaus Jensen <k.jensen@samsung.com>
> 
> Add support for the Asynchronous Event Request command. Required for
> compliance with NVMe revision 1.3d. See NVM Express 1.3d, Section 5.2
> ("Asynchronous Event Request command").
> 
> Mostly imported from Keith's qemu-nvme tree. Modified with a max number
> of queued events (controllable with the aer_max_queued device
> parameter). The spec states that the controller *should* retain
> events, so we do best effort here.
> 
> Signed-off-by: Klaus Jensen <klaus.jensen@cnexlabs.com>
> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> Acked-by: Keith Busch <kbusch@kernel.org>
> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
> Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
> ---
>  hw/block/nvme.c       | 180 ++++++++++++++++++++++++++++++++++++++++--
>  hw/block/nvme.h       |  10 ++-
>  hw/block/trace-events |   9 +++
>  include/block/nvme.h  |   8 +-
>  4 files changed, 198 insertions(+), 9 deletions(-)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index 7cb3787638f6..80c7285bc1cf 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -356,6 +356,85 @@ static void nvme_enqueue_req_completion(NvmeCQueue *cq, NvmeRequest *req)
>      timer_mod(cq->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + 500);
>  }
>  
> +static void nvme_process_aers(void *opaque)
> +{
> +    NvmeCtrl *n = opaque;
> +    NvmeAsyncEvent *event, *next;
> +
> +    trace_pci_nvme_process_aers(n->aer_queued);
> +
> +    QTAILQ_FOREACH_SAFE(event, &n->aer_queue, entry, next) {
> +        NvmeRequest *req;
> +        NvmeAerResult *result;
> +
> +        /* can't post cqe if there is nothing to complete */
> +        if (!n->outstanding_aers) {
> +            trace_pci_nvme_no_outstanding_aers();
> +            break;
> +        }
> +
> +        /* ignore if masked (cqe posted, but event not cleared) */
> +        if (n->aer_mask & (1 << event->result.event_type)) {
> +            trace_pci_nvme_aer_masked(event->result.event_type, n->aer_mask);
> +            continue;
> +        }
> +
> +        QTAILQ_REMOVE(&n->aer_queue, event, entry);
> +        n->aer_queued--;
> +
> +        n->aer_mask |= 1 << event->result.event_type;
> +        n->outstanding_aers--;
> +
> +        req = n->aer_reqs[n->outstanding_aers];
> +
> +        result = (NvmeAerResult *) &req->cqe.result;
> +        result->event_type = event->result.event_type;
> +        result->event_info = event->result.event_info;
> +        result->log_page = event->result.log_page;
> +        g_free(event);
> +
> +        req->status = NVME_SUCCESS;
> +
> +        trace_pci_nvme_aer_post_cqe(result->event_type, result->event_info,
> +                                    result->log_page);
> +
> +        nvme_enqueue_req_completion(&n->admin_cq, req);
> +    }
> +}
> +
> +static void nvme_enqueue_event(NvmeCtrl *n, uint8_t event_type,
> +                               uint8_t event_info, uint8_t log_page)
> +{
> +    NvmeAsyncEvent *event;
> +
> +    trace_pci_nvme_enqueue_event(event_type, event_info, log_page);
> +
> +    if (n->aer_queued == n->params.aer_max_queued) {
> +        trace_pci_nvme_enqueue_event_noqueue(n->aer_queued);
> +        return;
> +    }
> +
> +    event = g_new(NvmeAsyncEvent, 1);
> +    event->result = (NvmeAerResult) {
> +        .event_type = event_type,
> +        .event_info = event_info,
> +        .log_page   = log_page,
> +    };
> +
> +    QTAILQ_INSERT_TAIL(&n->aer_queue, event, entry);
> +    n->aer_queued++;
> +
> +    nvme_process_aers(n);
> +}
> +
> +static void nvme_clear_events(NvmeCtrl *n, uint8_t event_type)
> +{
> +    n->aer_mask &= ~(1 << event_type);
> +    if (!QTAILQ_EMPTY(&n->aer_queue)) {
> +        nvme_process_aers(n);
> +    }
> +}
> +
>  static void nvme_rw_cb(void *opaque, int ret)
>  {
>      NvmeRequest *req = opaque;
> @@ -606,8 +685,9 @@ static uint16_t nvme_create_sq(NvmeCtrl *n, NvmeCmd *cmd)
>      return NVME_SUCCESS;
>  }
>  
> -static uint16_t nvme_smart_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
> -                                uint64_t off, NvmeRequest *req)
> +static uint16_t nvme_smart_info(NvmeCtrl *n, NvmeCmd *cmd, uint8_t rae,
> +                                uint32_t buf_len, uint64_t off,
> +                                NvmeRequest *req)
>  {
>      uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
>      uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
> @@ -655,6 +735,10 @@ static uint16_t nvme_smart_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
>      smart.power_on_hours[0] =
>          cpu_to_le64((((current_ms - n->starttime_ms) / 1000) / 60) / 60);
>  
> +    if (!rae) {
> +        nvme_clear_events(n, NVME_AER_TYPE_SMART);
> +    }
> +
>      return nvme_dma_read_prp(n, (uint8_t *) &smart + off, trans_len, prp1,
>                               prp2);
>  }
> @@ -681,14 +765,19 @@ static uint16_t nvme_fw_log_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
>                               prp2);
>  }
>  
> -static uint16_t nvme_error_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
> -                                uint64_t off, NvmeRequest *req)
> +static uint16_t nvme_error_info(NvmeCtrl *n, NvmeCmd *cmd, uint8_t rae,
> +                                uint32_t buf_len, uint64_t off,
> +                                NvmeRequest *req)
>  {
>      uint32_t trans_len;
>      uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
>      uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
>      NvmeErrorLog errlog;
>  
> +    if (!rae) {
> +        nvme_clear_events(n, NVME_AER_TYPE_ERROR);
> +    }
> +
>      if (off > sizeof(errlog)) {
>          return NVME_INVALID_FIELD | NVME_DNR;
>      }
> @@ -729,9 +818,9 @@ static uint16_t nvme_get_log(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>  
>      switch (lid) {
>      case NVME_LOG_ERROR_INFO:
> -        return nvme_error_info(n, cmd, len, off, req);
> +        return nvme_error_info(n, cmd, rae, len, off, req);
>      case NVME_LOG_SMART_INFO:
> -        return nvme_smart_info(n, cmd, len, off, req);
> +        return nvme_smart_info(n, cmd, rae, len, off, req);
>      case NVME_LOG_FW_SLOT_INFO:
>          return nvme_fw_log_info(n, cmd, len, off, req);
>      default:
> @@ -1013,6 +1102,9 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>              ((n->params.max_ioqpairs - 1) << 16);
>          trace_pci_nvme_getfeat_numq(result);
>          break;
> +    case NVME_ASYNCHRONOUS_EVENT_CONF:
> +        result = n->features.async_config;
> +        break;
>      case NVME_TIMESTAMP:
>          return nvme_get_feature_timestamp(n, cmd);
>      default:
> @@ -1064,6 +1156,14 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>              return NVME_INVALID_FIELD | NVME_DNR;
>          }
>  
> +        if (((n->temperature >= n->features.temp_thresh_hi) ||
> +            (n->temperature <= n->features.temp_thresh_low)) &&
> +            NVME_AEC_SMART(n->features.async_config) & NVME_SMART_TEMPERATURE) {
> +            nvme_enqueue_event(n, NVME_AER_TYPE_SMART,
> +                               NVME_AER_INFO_SMART_TEMP_THRESH,
> +                               NVME_LOG_SMART_INFO);
> +        }
> +
>          break;
>      case NVME_VOLATILE_WRITE_CACHE:
>          blk_set_enable_write_cache(n->conf.blk, dw11 & 1);
> @@ -1076,6 +1176,9 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>          req->cqe.result = cpu_to_le32((n->params.max_ioqpairs - 1) |
>                                        ((n->params.max_ioqpairs - 1) << 16));
>          break;
> +    case NVME_ASYNCHRONOUS_EVENT_CONF:
> +        n->features.async_config = dw11;
> +        break;
>      case NVME_TIMESTAMP:
>          return nvme_set_feature_timestamp(n, cmd);
>      default:
> @@ -1085,6 +1188,25 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>      return NVME_SUCCESS;
>  }
>  
> +static uint16_t nvme_aer(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
> +{
> +    trace_pci_nvme_aer(nvme_cid(req));
> +
> +    if (n->outstanding_aers > n->params.aerl) {
> +        trace_pci_nvme_aer_aerl_exceeded();
> +        return NVME_AER_LIMIT_EXCEEDED;
> +    }
> +
> +    n->aer_reqs[n->outstanding_aers] = req;
> +    n->outstanding_aers++;
> +
> +    if (!QTAILQ_EMPTY(&n->aer_queue)) {
> +        nvme_process_aers(n);
> +    }
> +
> +    return NVME_NO_COMPLETE;
> +}

Looks good so far

> +
>  static uint16_t nvme_admin_cmd(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>  {
>      trace_pci_nvme_admin_cmd(nvme_cid(req), nvme_sqid(req), cmd->opcode);
> @@ -1108,6 +1230,8 @@ static uint16_t nvme_admin_cmd(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>          return nvme_set_feature(n, cmd, req);
>      case NVME_ADM_CMD_GET_FEATURES:
>          return nvme_get_feature(n, cmd, req);
> +    case NVME_ADM_CMD_ASYNC_EV_REQ:
> +        return nvme_aer(n, cmd, req);
>      default:
>          trace_pci_nvme_err_invalid_admin_opc(cmd->opcode);
>          return NVME_INVALID_OPCODE | NVME_DNR;
> @@ -1162,6 +1286,15 @@ static void nvme_clear_ctrl(NvmeCtrl *n)
>          }
>      }
>  
> +    while (!QTAILQ_EMPTY(&n->aer_queue)) {
> +        NvmeAsyncEvent *event = QTAILQ_FIRST(&n->aer_queue);
> +        QTAILQ_REMOVE(&n->aer_queue, event, entry);
> +        g_free(event);
> +    }
> +
> +    n->aer_queued = 0;
> +    n->outstanding_aers = 0;
> +
>      blk_flush(n->conf.blk);
>      n->bar.cc = 0;
>  }
> @@ -1258,6 +1391,8 @@ static int nvme_start_ctrl(NvmeCtrl *n)
>  
>      nvme_set_timestamp(n, 0ULL);
>  
> +    QTAILQ_INIT(&n->aer_queue);
> +
>      return 0;
>  }
>  
> @@ -1479,6 +1614,13 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
>                             "completion queue doorbell write"
>                             " for nonexistent queue,"
>                             " sqid=%"PRIu32", ignoring", qid);
> +
> +            if (n->outstanding_aers) {
> +                nvme_enqueue_event(n, NVME_AER_TYPE_ERROR,
> +                                   NVME_AER_INFO_ERR_INVALID_DB_REGISTER,
> +                                   NVME_LOG_ERROR_INFO);
> +            }
To be honest, I would move the check for outstanding AERs into nvme_enqueue_event.

Also, the logic seems a bit off. The code checks that we have outstanding AER requests,
but we do have an internal AER queue for exactly this situation.
It seems that SMART events are generated without this check, while ERROR events are generated
only when outstanding AERs exist.
Could you explain? I have probably forgotten something from the spec, which I haven't read for a long time.
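
To make the question concrete, here is a toy model in which an event is retained in the internal queue whether or not an AER is outstanding, which is how the SMART path behaves in the patch. This is not the patch's code: all `toy_` names are invented, the queue is a small fixed array, and event masking is left out.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_QUEUED 4   /* stands in for aer_max_queued */

typedef struct ToyCtrl {
    uint8_t events[TOY_MAX_QUEUED]; /* FIFO of pending event types */
    int queued;                     /* events waiting for an AER */
    int outstanding_aers;           /* AER commands posted by the host */
    int completed;                  /* AERs completed so far */
} ToyCtrl;

/* Complete AERs while there is both a queued event and a posted AER. */
static void toy_process(ToyCtrl *c)
{
    while (c->queued > 0 && c->outstanding_aers > 0) {
        /* Drop the oldest event from the FIFO. */
        for (int i = 1; i < c->queued; i++) {
            c->events[i - 1] = c->events[i];
        }
        c->queued--;
        c->outstanding_aers--;
        c->completed++;
    }
}

/* Queue an event even when no AER is outstanding; drop only when full. */
static bool toy_enqueue_event(ToyCtrl *c, uint8_t type)
{
    if (c->queued == TOY_MAX_QUEUED) {
        return false;
    }
    c->events[c->queued++] = type;
    toy_process(c);
    return true;
}

/* Host posts an AER; a retained event, if any, completes it at once. */
static void toy_post_aer(ToyCtrl *c)
{
    c->outstanding_aers++;
    toy_process(c);
}
```

In this model an error event raised before the host posts an AER is delivered as soon as one arrives, whereas the `if (n->outstanding_aers)` guard in nvme_process_db() would discard it.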


> +
>              return;
>          }
>  
> @@ -1489,6 +1631,13 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
>                             " beyond queue size, sqid=%"PRIu32","
>                             " new_head=%"PRIu16", ignoring",
>                             qid, new_head);
> +
> +            if (n->outstanding_aers) {
> +                nvme_enqueue_event(n, NVME_AER_TYPE_ERROR,
> +                                   NVME_AER_INFO_ERR_INVALID_DB_VALUE,
> +                                   NVME_LOG_ERROR_INFO);
> +            }
> +
>              return;
>          }
>  
> @@ -1519,6 +1668,13 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
>                             "submission queue doorbell write"
>                             " for nonexistent queue,"
>                             " sqid=%"PRIu32", ignoring", qid);
> +
> +            if (n->outstanding_aers) {
> +                nvme_enqueue_event(n, NVME_AER_TYPE_ERROR,
> +                                   NVME_AER_INFO_ERR_INVALID_DB_REGISTER,
> +                                   NVME_LOG_ERROR_INFO);
> +            }
> +
>              return;
>          }
>  
> @@ -1529,6 +1685,13 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
>                             " beyond queue size, sqid=%"PRIu32","
>                             " new_tail=%"PRIu16", ignoring",
>                             qid, new_tail);
> +
> +            if (n->outstanding_aers) {
> +                nvme_enqueue_event(n, NVME_AER_TYPE_ERROR,
> +                                   NVME_AER_INFO_ERR_INVALID_DB_VALUE,
> +                                   NVME_LOG_ERROR_INFO);
> +            }
> +
>              return;
>          }
>  
> @@ -1650,6 +1813,7 @@ static void nvme_init_state(NvmeCtrl *n)
>      n->temperature = NVME_TEMPERATURE;
>      n->features.temp_thresh_hi = NVME_TEMPERATURE_WARNING;
>      n->starttime_ms = qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL);
> +    n->aer_reqs = g_new0(NvmeRequest *, n->params.aerl + 1);
>  }
>  
>  static void nvme_init_blk(NvmeCtrl *n, Error **errp)
> @@ -1805,6 +1969,7 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
>       * inconsequential.
>       */
>      id->acl = 3;
> +    id->aerl = n->params.aerl;
The name is a tiny bit unclear. I know that this is from the spec, but still.

>      id->frmw = (NVME_NUM_FW_SLOTS << 1) | NVME_FRMW_SLOT1_RO;
>      id->lpa = NVME_LPA_EXTENDED;
>  
> @@ -1879,6 +2044,7 @@ static void nvme_exit(PCIDevice *pci_dev)
>      g_free(n->namespaces);
>      g_free(n->cq);
>      g_free(n->sq);
> +    g_free(n->aer_reqs);
>  
>      if (n->params.cmb_size_mb) {
>          g_free(n->cmbuf);
> @@ -1899,6 +2065,8 @@ static Property nvme_props[] = {
>      DEFINE_PROP_UINT32("num_queues", NvmeCtrl, params.num_queues, 0),
>      DEFINE_PROP_UINT32("max_ioqpairs", NvmeCtrl, params.max_ioqpairs, 64),
>      DEFINE_PROP_UINT16("msix_qsize", NvmeCtrl, params.msix_qsize, 65),
> +    DEFINE_PROP_UINT8("aerl", NvmeCtrl, params.aerl, 3),
So this is the number of AERs that we allow the user to have outstanding.

> +    DEFINE_PROP_UINT32("aer_max_queued", NvmeCtrl, params.aer_max_queued, 64),
And this is the number of AERs that we keep in our internal AER queue until the user posts an AER so that we
can complete it.
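
One detail behind the first of these two knobs: AERL in the Identify Controller data is a 0's based value, so the default of 3 allows four AERs to be outstanding, which matches the `aerl + 1` allocation of `aer_reqs` in the patch. A small sketch with invented helper names:

```c
#include <stdbool.h>
#include <stdint.h>

/* AERL is 0's based: aerl = 3 means up to 4 outstanding AERs. */
static inline unsigned aer_slots(uint8_t aerl)
{
    return (unsigned)aerl + 1;
}

/* Mirrors the limit check in nvme_aer(): reject once all slots are used. */
static inline bool aer_limit_exceeded(unsigned outstanding, uint8_t aerl)
{
    return outstanding > aerl;
}
```

With aerl = 3, a fourth AER (three already outstanding) is accepted and a fifth is rejected.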

>      DEFINE_PROP_END_OF_LIST(),
>  };
>  
> diff --git a/hw/block/nvme.h b/hw/block/nvme.h
> index 8228978e93de..1837233617bb 100644
> --- a/hw/block/nvme.h
> +++ b/hw/block/nvme.h
> @@ -9,10 +9,12 @@ typedef struct NvmeParams {
>      uint32_t max_ioqpairs;
>      uint16_t msix_qsize;
>      uint32_t cmb_size_mb;
> +    uint8_t  aerl;
> +    uint32_t aer_max_queued;
>  } NvmeParams;
>  
>  typedef struct NvmeAsyncEvent {
> -    QSIMPLEQ_ENTRY(NvmeAsyncEvent) entry;
> +    QTAILQ_ENTRY(NvmeAsyncEvent) entry;
>      NvmeAerResult result;
>  } NvmeAsyncEvent;
>  
> @@ -94,6 +96,7 @@ typedef struct NvmeCtrl {
>      uint32_t    num_namespaces;
>      uint32_t    max_q_ents;
>      uint64_t    ns_size;
> +    uint8_t     outstanding_aers;
>      uint8_t     *cmbuf;
>      uint32_t    irq_status;
>      uint64_t    host_timestamp;                 /* Timestamp sent by the host */
> @@ -103,6 +106,11 @@ typedef struct NvmeCtrl {
>  
>      HostMemoryBackend *pmrdev;
>  
> +    uint8_t     aer_mask;
> +    NvmeRequest **aer_reqs;
> +    QTAILQ_HEAD(, NvmeAsyncEvent) aer_queue;
> +    int         aer_queued;
> +
>      NvmeNamespace   *namespaces;
>      NvmeSQueue      **sq;
>      NvmeCQueue      **cq;
> diff --git a/hw/block/trace-events b/hw/block/trace-events
> index 3330d74e48db..091af16ca7d7 100644
> --- a/hw/block/trace-events
> +++ b/hw/block/trace-events
> @@ -51,6 +51,15 @@ pci_nvme_getfeat_numq(int result) "get feature number of queues, result=%d"
>  pci_nvme_setfeat_numq(int reqcq, int reqsq, int gotcq, int gotsq) "requested cq_count=%d sq_count=%d, responding with cq_count=%d sq_count=%d"
>  pci_nvme_setfeat_timestamp(uint64_t ts) "set feature timestamp = 0x%"PRIx64""
>  pci_nvme_getfeat_timestamp(uint64_t ts) "get feature timestamp = 0x%"PRIx64""
> +pci_nvme_process_aers(int queued) "queued %d"
> +pci_nvme_aer(uint16_t cid) "cid %"PRIu16""
> +pci_nvme_aer_aerl_exceeded(void) "aerl exceeded"
> +pci_nvme_aer_masked(uint8_t type, uint8_t mask) "type 0x%"PRIx8" mask 0x%"PRIx8""
> +pci_nvme_aer_post_cqe(uint8_t typ, uint8_t info, uint8_t log_page) "type 0x%"PRIx8" info 0x%"PRIx8" lid 0x%"PRIx8""
> +pci_nvme_enqueue_event(uint8_t typ, uint8_t info, uint8_t log_page) "type 0x%"PRIx8" info 0x%"PRIx8" lid 0x%"PRIx8""
> +pci_nvme_enqueue_event_noqueue(int queued) "queued %d"
> +pci_nvme_enqueue_event_masked(uint8_t typ) "type 0x%"PRIx8""
> +pci_nvme_no_outstanding_aers(void) "ignoring event; no outstanding AERs"
>  pci_nvme_enqueue_req_completion(uint16_t cid, uint16_t cqid, uint16_t status) "cid %"PRIu16" cqid %"PRIu16" status 0x%"PRIx16""
>  pci_nvme_mmio_read(uint64_t addr) "addr 0x%"PRIx64""
>  pci_nvme_mmio_write(uint64_t addr, uint64_t data) "addr 0x%"PRIx64" data 0x%"PRIx64""
> diff --git a/include/block/nvme.h b/include/block/nvme.h
> index 49ce97ae1ab4..2101292ed5e8 100644
> --- a/include/block/nvme.h
> +++ b/include/block/nvme.h
> @@ -597,8 +597,8 @@ enum NvmeAsyncEventRequest {
>      NVME_AER_TYPE_SMART                     = 1,
>      NVME_AER_TYPE_IO_SPECIFIC               = 6,
>      NVME_AER_TYPE_VENDOR_SPECIFIC           = 7,
> -    NVME_AER_INFO_ERR_INVALID_SQ            = 0,
> -    NVME_AER_INFO_ERR_INVALID_DB            = 1,
> +    NVME_AER_INFO_ERR_INVALID_DB_REGISTER   = 0,
> +    NVME_AER_INFO_ERR_INVALID_DB_VALUE      = 1,
>      NVME_AER_INFO_ERR_DIAG_FAIL             = 2,
>      NVME_AER_INFO_ERR_PERS_INTERNAL_ERR     = 3,
>      NVME_AER_INFO_ERR_TRANS_INTERNAL_ERR    = 4,
> @@ -899,6 +899,10 @@ typedef struct NvmeFeatureVal {
>  
>  #define NVME_TEMP_TMPTH(temp) ((temp >>  0) & 0xffff)
>  
> +#define NVME_AEC_SMART(aec)         (aec & 0xff)
> +#define NVME_AEC_NS_ATTR(aec)       ((aec >> 8) & 0x1)
> +#define NVME_AEC_FW_ACTIVATION(aec) ((aec >> 9) & 0x1)
> +
>  enum NvmeFeatureIds {
>      NVME_ARBITRATION                = 0x1,
>      NVME_POWER_MANAGEMENT           = 0x2,


Best regards,
	Maxim Levitsky




* Re: [PATCH v3 09/18] hw/block/nvme: move NvmeFeatureVal into hw/block/nvme.h
  2020-07-06  6:12 ` [PATCH v3 09/18] hw/block/nvme: move NvmeFeatureVal into hw/block/nvme.h Klaus Jensen
@ 2020-07-29 10:46   ` Maxim Levitsky
  0 siblings, 0 replies; 60+ messages in thread
From: Maxim Levitsky @ 2020-07-29 10:46 UTC (permalink / raw)
  To: Klaus Jensen, qemu-block
  Cc: Kevin Wolf, Dmitry Fomichev, Klaus Jensen, qemu-devel, Max Reitz,
	Keith Busch, Javier Gonzalez, Philippe Mathieu-Daudé

On Mon, 2020-07-06 at 08:12 +0200, Klaus Jensen wrote:
> From: Klaus Jensen <k.jensen@samsung.com>
> 
> The NvmeFeatureVal does not belong with the spec-related data structures
> in include/block/nvme.h that is shared between the block-level nvme
> driver and the emulated nvme device.
> 
> Move it into the nvme device specific header file as it is the only
> user of the structure. Also, remove the unused members.

Agree.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky
> 
> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
> ---
>  hw/block/nvme.h      |  8 ++++++++
>  include/block/nvme.h | 17 -----------------
>  2 files changed, 8 insertions(+), 17 deletions(-)
> 
> diff --git a/hw/block/nvme.h b/hw/block/nvme.h
> index 1837233617bb..b93067c9e4a1 100644
> --- a/hw/block/nvme.h
> +++ b/hw/block/nvme.h
> @@ -79,6 +79,14 @@ static inline uint8_t nvme_ns_lbads(NvmeNamespace *ns)
>  #define NVME(obj) \
>          OBJECT_CHECK(NvmeCtrl, (obj), TYPE_NVME)
>  
> +typedef struct NvmeFeatureVal {
> +    struct {
> +        uint16_t temp_thresh_hi;
> +        uint16_t temp_thresh_low;
> +    };
> +    uint32_t    async_config;
> +} NvmeFeatureVal;
> +
>  typedef struct NvmeCtrl {
>      PCIDevice    parent_obj;
>      MemoryRegion iomem;
> diff --git a/include/block/nvme.h b/include/block/nvme.h
> index 2101292ed5e8..0dce15af6bcf 100644
> --- a/include/block/nvme.h
> +++ b/include/block/nvme.h
> @@ -865,23 +865,6 @@ enum NvmeIdCtrlLpa {
>  #define NVME_CTRL_SGLS_MPTR_SGL            (0x1 << 19)
>  #define NVME_CTRL_SGLS_ADDR_OFFSET         (0x1 << 20)
>  
> -typedef struct NvmeFeatureVal {
> -    uint32_t    arbitration;
> -    uint32_t    power_mgmt;
> -    struct {
> -        uint16_t temp_thresh_hi;
> -        uint16_t temp_thresh_low;
> -    };
> -    uint32_t    err_rec;
> -    uint32_t    volatile_wc;
> -    uint32_t    num_queues;
> -    uint32_t    int_coalescing;
> -    uint32_t    *int_vector_config;
> -    uint32_t    write_atomicity;
> -    uint32_t    async_config;
> -    uint32_t    sw_prog_marker;
> -} NvmeFeatureVal;
> -
>  #define NVME_ARB_AB(arb)    (arb & 0x7)
>  #define NVME_ARB_LPW(arb)   ((arb >> 8) & 0xff)
>  #define NVME_ARB_MPW(arb)   ((arb >> 16) & 0xff)





* Re: [PATCH v3 10/18] hw/block/nvme: flush write cache when disabled
  2020-07-06  6:12 ` [PATCH v3 10/18] hw/block/nvme: flush write cache when disabled Klaus Jensen
@ 2020-07-29 11:03   ` Maxim Levitsky
  0 siblings, 0 replies; 60+ messages in thread
From: Maxim Levitsky @ 2020-07-29 11:03 UTC (permalink / raw)
  To: Klaus Jensen, qemu-block
  Cc: Kevin Wolf, Dmitry Fomichev, Klaus Jensen, qemu-devel, Max Reitz,
	Keith Busch, Javier Gonzalez, Philippe Mathieu-Daudé

On Mon, 2020-07-06 at 08:12 +0200, Klaus Jensen wrote:
> From: Klaus Jensen <k.jensen@samsung.com>
> 
> If the write cache is disabled with a Set Features command, flush it if
> currently enabled.
> 
> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
> ---
>  hw/block/nvme.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index 80c7285bc1cf..8fce2ebf69e7 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -1166,6 +1166,10 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>  
>          break;
>      case NVME_VOLATILE_WRITE_CACHE:
> +        if (!(dw11 & 0x1) && blk_enable_write_cache(n->conf.blk)) {
> +            blk_flush(n->conf.blk);
> +        }
> +
>          blk_set_enable_write_cache(n->conf.blk, dw11 & 1);
>          break;
>      case NVME_NUMBER_OF_QUEUES:
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky
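
The ordering in this patch (flush first, then flip the setting) can be
illustrated with a small standalone model. FakeBlk and
set_volatile_write_cache() are invented for this sketch and only stand in
for the block backend and the NVME_VOLATILE_WRITE_CACHE case; the counter
replaces the real blk_flush() call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the block backend state. */
typedef struct FakeBlk {
    bool write_cache_enabled;
    unsigned flush_count;       /* incremented where blk_flush() would run */
} FakeBlk;

/* Models the Set Features handling of the Volatile Write Cache feature. */
static inline void set_volatile_write_cache(FakeBlk *blk, uint32_t dw11)
{
    bool enable = dw11 & 0x1;   /* WCE is bit 0 of CDW11 */

    assert(blk != NULL);

    /* Only a transition from enabled to disabled needs a flush. */
    if (!enable && blk->write_cache_enabled) {
        blk->flush_count++;
    }

    blk->write_cache_enabled = enable;
}
```

Disabling a cache that is already disabled does not flush again, which
matches the condition in the hunk above.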




* Re: [PATCH v3 11/18] hw/block/nvme: add remaining mandatory controller parameters
  2020-07-06  6:12 ` [PATCH v3 11/18] hw/block/nvme: add remaining mandatory controller parameters Klaus Jensen
@ 2020-07-29 11:31   ` Maxim Levitsky
  0 siblings, 0 replies; 60+ messages in thread
From: Maxim Levitsky @ 2020-07-29 11:31 UTC (permalink / raw)
  To: Klaus Jensen, qemu-block
  Cc: Kevin Wolf, Dmitry Fomichev, Klaus Jensen, qemu-devel, Max Reitz,
	Keith Busch, Javier Gonzalez, Philippe Mathieu-Daudé

On Mon, 2020-07-06 at 08:12 +0200, Klaus Jensen wrote:
> From: Klaus Jensen <k.jensen@samsung.com>
> 
> Add support for any remaining mandatory controller operating parameters
> (features).
> 
> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
> ---
>  hw/block/nvme.c       | 56 ++++++++++++++++++++++++++++++++++++++-----
>  hw/block/trace-events |  2 ++
>  include/block/nvme.h  | 10 +++++++-
>  3 files changed, 61 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index 8fce2ebf69e7..2d85e853403f 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -71,6 +71,20 @@
>              " in %s: " fmt "\n", __func__, ## __VA_ARGS__); \
>      } while (0)
>  
> +static const bool nvme_feature_support[NVME_FID_MAX] = {
> +    [NVME_ARBITRATION]              = true,
> +    [NVME_POWER_MANAGEMENT]         = true,
> +    [NVME_TEMPERATURE_THRESHOLD]    = true,
> +    [NVME_ERROR_RECOVERY]           = true,
> +    [NVME_VOLATILE_WRITE_CACHE]     = true,
> +    [NVME_NUMBER_OF_QUEUES]         = true,
> +    [NVME_INTERRUPT_COALESCING]     = true,
> +    [NVME_INTERRUPT_VECTOR_CONF]    = true,
> +    [NVME_WRITE_ATOMICITY]          = true,
> +    [NVME_ASYNCHRONOUS_EVENT_CONF]  = true,
> +    [NVME_TIMESTAMP]                = true,
I checked the spec and the mandatory features are all here.

> +};
> +
>  static void nvme_process_sq(void *opaque);
>  
>  static uint16_t nvme_cid(NvmeRequest *req)
> @@ -1070,8 +1084,20 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>      uint32_t dw10 = le32_to_cpu(cmd->cdw10);
>      uint32_t dw11 = le32_to_cpu(cmd->cdw11);
>      uint32_t result;
> +    uint8_t fid = NVME_GETSETFEAT_FID(dw10);
> +    uint16_t iv;
>  
> -    switch (dw10) {
> +    static const uint32_t nvme_feature_default[NVME_FID_MAX] = {
> +        [NVME_ARBITRATION] = NVME_ARB_AB_NOLIMIT,
> +    };
Nice idea!
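
For readers who have not seen the pattern, the sparse-table approach used by
nvme_feature_support and nvme_feature_default can be sketched in isolation.
This is a minimal standalone illustration, not the QEMU code: the FID_* names
and the helper functions are made up for the example; only the 0xff mask, the
0x100 table size, the 0x7 arbitration default and the timestamp FID 0xe are
taken from the patch (0x1 for arbitration is the value from the spec).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for this sketch; the values mirror the patch. */
enum {
    FID_ARBITRATION = 0x1,
    FID_TIMESTAMP   = 0xe,
    FID_MAX         = 0x100,
};

#define FID_MASK        0xff
#define ARB_AB_NOLIMIT  0x7

/* The mask must cover exactly the table, or lookups could go out of bounds. */
static_assert(FID_MASK + 1 == FID_MAX, "FID mask and table size disagree");

/* Entries that are not named are zero-initialized, i.e. unsupported. */
static const bool feature_support[FID_MAX] = {
    [FID_ARBITRATION] = true,
    [FID_TIMESTAMP]   = true,
};

/* Entries that are not named default to 0. */
static const uint32_t feature_default[FID_MAX] = {
    [FID_ARBITRATION] = ARB_AB_NOLIMIT,
};

/* Extract the 8-bit FID from CDW10; the result always indexes in bounds. */
static inline uint8_t fid_from_dw10(uint32_t dw10)
{
    return dw10 & FID_MASK;
}

static inline bool feature_is_supported(uint32_t dw10)
{
    return feature_support[fid_from_dw10(dw10)];
}

static inline uint32_t feature_default_value(uint32_t dw10)
{
    return feature_default[fid_from_dw10(dw10)];
}
```

Because the FID is masked to 8 bits and the tables have 0x100 entries, no
separate bounds check is needed before indexing.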

> +
> +    trace_pci_nvme_getfeat(nvme_cid(req), fid, dw11);
> +
> +    if (!nvme_feature_support[fid]) {
> +        return NVME_INVALID_FIELD | NVME_DNR;
> +    }
> +
> +    switch (fid) {
>      case NVME_TEMPERATURE_THRESHOLD:
>          result = 0;
>  
> @@ -1101,6 +1127,18 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>          result = (n->params.max_ioqpairs - 1) |
>              ((n->params.max_ioqpairs - 1) << 16);
>          trace_pci_nvme_getfeat_numq(result);
> +        break;
> +    case NVME_INTERRUPT_VECTOR_CONF:
> +        iv = dw11 & 0xffff;
> +        if (iv >= n->params.max_ioqpairs + 1) {
> +            return NVME_INVALID_FIELD | NVME_DNR;
> +        }
> +
> +        result = iv;
> +        if (iv == n->admin_cq.vector) {
> +            result |= NVME_INTVC_NOCOALESCING;
> +        }
I wonder if this is needed, but it doesn't hurt to have it.
The spec is not clear about this.

> +
>          break;
>      case NVME_ASYNCHRONOUS_EVENT_CONF:
>          result = n->features.async_config;
> @@ -1108,8 +1146,8 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>      case NVME_TIMESTAMP:
>          return nvme_get_feature_timestamp(n, cmd);
>      default:
> -        trace_pci_nvme_err_invalid_getfeat(dw10);
> -        return NVME_INVALID_FIELD | NVME_DNR;
> +        result = nvme_feature_default[fid];
> +        break;
>      }
>  
>      req->cqe.result = cpu_to_le32(result);
> @@ -1138,8 +1176,15 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>  {
>      uint32_t dw10 = le32_to_cpu(cmd->cdw10);
>      uint32_t dw11 = le32_to_cpu(cmd->cdw11);
> +    uint8_t fid = NVME_GETSETFEAT_FID(dw10);
>  
> -    switch (dw10) {
> +    trace_pci_nvme_setfeat(nvme_cid(req), fid, dw11);
> +
> +    if (!nvme_feature_support[fid]) {
> +        return NVME_INVALID_FIELD | NVME_DNR;
> +    }
> +
> +    switch (fid) {
>      case NVME_TEMPERATURE_THRESHOLD:
>          if (NVME_TEMP_TMPSEL(dw11) != NVME_TEMP_TMPSEL_COMPOSITE) {
>              break;
> @@ -1186,8 +1231,7 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>      case NVME_TIMESTAMP:
>          return nvme_set_feature_timestamp(n, cmd);
>      default:
> -        trace_pci_nvme_err_invalid_setfeat(dw10);
> -        return NVME_INVALID_FIELD | NVME_DNR;
> +        return NVME_FEAT_NOT_CHANGEABLE | NVME_DNR;
>      }
>      return NVME_SUCCESS;
>  }
> diff --git a/hw/block/trace-events b/hw/block/trace-events
> index 091af16ca7d7..42e62f4649f8 100644
> --- a/hw/block/trace-events
> +++ b/hw/block/trace-events
> @@ -46,6 +46,8 @@ pci_nvme_identify_ctrl(void) "identify controller"
>  pci_nvme_identify_ns(uint32_t ns) "nsid %"PRIu32""
>  pci_nvme_identify_nslist(uint32_t ns) "nsid %"PRIu32""
>  pci_nvme_get_log(uint16_t cid, uint8_t lid, uint8_t lsp, uint8_t rae, uint32_t len, uint64_t off) "cid %"PRIu16" lid 0x%"PRIx8" lsp 0x%"PRIx8" rae 0x%"PRIx8" len %"PRIu32" off %"PRIu64""
> +pci_nvme_getfeat(uint16_t cid, uint8_t fid, uint32_t cdw11) "cid %"PRIu16" fid 0x%"PRIx8" cdw11 0x%"PRIx32""
> +pci_nvme_setfeat(uint16_t cid, uint8_t fid, uint32_t cdw11) "cid %"PRIu16" fid 0x%"PRIx8" cdw11 0x%"PRIx32""
>  pci_nvme_getfeat_vwcache(const char* result) "get feature volatile write cache, result=%s"
>  pci_nvme_getfeat_numq(int result) "get feature number of queues, result=%d"
>  pci_nvme_setfeat_numq(int reqcq, int reqsq, int gotcq, int gotsq) "requested cq_count=%d sq_count=%d, responding with cq_count=%d sq_count=%d"
> diff --git a/include/block/nvme.h b/include/block/nvme.h
> index 0dce15af6bcf..cd396111b2f5 100644
> --- a/include/block/nvme.h
> +++ b/include/block/nvme.h
> @@ -662,6 +662,7 @@ enum NvmeStatusCodes {
>      NVME_FW_REQ_RESET           = 0x010b,
>      NVME_INVALID_QUEUE_DEL      = 0x010c,
>      NVME_FID_NOT_SAVEABLE       = 0x010d,
> +    NVME_FEAT_NOT_CHANGEABLE    = 0x010e,
>      NVME_FID_NOT_NSID_SPEC      = 0x010f,
>      NVME_FW_REQ_SUSYSTEM_RESET  = 0x0110,
>      NVME_CONFLICTING_ATTRS      = 0x0180,
> @@ -866,6 +867,7 @@ enum NvmeIdCtrlLpa {
>  #define NVME_CTRL_SGLS_ADDR_OFFSET         (0x1 << 20)
>  
>  #define NVME_ARB_AB(arb)    (arb & 0x7)
> +#define NVME_ARB_AB_NOLIMIT 0x7
>  #define NVME_ARB_LPW(arb)   ((arb >> 8) & 0xff)
>  #define NVME_ARB_MPW(arb)   ((arb >> 16) & 0xff)
>  #define NVME_ARB_HPW(arb)   ((arb >> 24) & 0xff)
> @@ -873,6 +875,8 @@ enum NvmeIdCtrlLpa {
>  #define NVME_INTC_THR(intc)     (intc & 0xff)
>  #define NVME_INTC_TIME(intc)    ((intc >> 8) & 0xff)
>  
> +#define NVME_INTVC_NOCOALESCING (0x1 << 16)
> +
>  #define NVME_TEMP_THSEL(temp)  ((temp >> 20) & 0x3)
>  #define NVME_TEMP_THSEL_OVER   0x0
>  #define NVME_TEMP_THSEL_UNDER  0x1
> @@ -899,9 +903,13 @@ enum NvmeFeatureIds {
>      NVME_WRITE_ATOMICITY            = 0xa,
>      NVME_ASYNCHRONOUS_EVENT_CONF    = 0xb,
>      NVME_TIMESTAMP                  = 0xe,
> -    NVME_SOFTWARE_PROGRESS_MARKER   = 0x80
> +    NVME_SOFTWARE_PROGRESS_MARKER   = 0x80,
> +    NVME_FID_MAX                    = 0x100,
>  };
>  
> +#define NVME_GETSETFEAT_FID_MASK 0xff
> +#define NVME_GETSETFEAT_FID(dw10) (dw10 & NVME_GETSETFEAT_FID_MASK)
> +
>  typedef struct NvmeRangeType {
>      uint8_t     type;
>      uint8_t     attributes;
Looks good,
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>


Best regards,
	Maxim Levitsky




* Re: [PATCH v3 07/18] hw/block/nvme: add support for the get log page command
  2020-07-29 10:24   ` Maxim Levitsky
@ 2020-07-29 11:44     ` Klaus Jensen
  2020-07-29 18:35       ` Maxim Levitsky
  0 siblings, 1 reply; 60+ messages in thread
From: Klaus Jensen @ 2020-07-29 11:44 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: Kevin Wolf, qemu-block, Klaus Jensen, qemu-devel, Max Reitz,
	Keith Busch, Javier Gonzalez, Philippe Mathieu-Daudé

On Jul 29 13:24, Maxim Levitsky wrote:
> On Mon, 2020-07-06 at 08:12 +0200, Klaus Jensen wrote:
> > From: Klaus Jensen <k.jensen@samsung.com>
> > 
> > Add support for the Get Log Page command and basic implementations of
> > the mandatory Error Information, SMART / Health Information and Firmware
> > Slot Information log pages.
> > 
> > In violation of the specification, the SMART / Health Information log
> > page does not persist information over the lifetime of the controller
> > because the device has no place to store such persistent state.
> > 
> > Note that the LPA field in the Identify Controller data structure
> > intentionally has bit 0 cleared because there is no namespace specific
> > information in the SMART / Health information log page.
> > 
> > Required for compliance with NVMe revision 1.3d. See NVM Express 1.3d,
> > Section 5.14 ("Get Log Page command").
> > 
> > Signed-off-by: Klaus Jensen <klaus.jensen@cnexlabs.com>
> > Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> > Acked-by: Keith Busch <kbusch@kernel.org>
> > ---
> >  hw/block/nvme.c       | 140 +++++++++++++++++++++++++++++++++++++++++-
> >  hw/block/nvme.h       |   2 +
> >  hw/block/trace-events |   2 +
> >  include/block/nvme.h  |   8 ++-
> >  4 files changed, 149 insertions(+), 3 deletions(-)
> > 
> > diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> > index b6bc75eb61a2..7cb3787638f6 100644
> > --- a/hw/block/nvme.c
> > +++ b/hw/block/nvme.c
> > @@ -606,6 +606,140 @@ static uint16_t nvme_create_sq(NvmeCtrl *n, NvmeCmd *cmd)
> >      return NVME_SUCCESS;
> >  }
> >  
> > +static uint16_t nvme_smart_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
> > +                                uint64_t off, NvmeRequest *req)
> > +{
> > +    uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
> > +    uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
> > +    uint32_t nsid = le32_to_cpu(cmd->nsid);
> > +
> > +    uint32_t trans_len;
> > +    time_t current_ms;
> > +    uint64_t units_read = 0, units_written = 0;
> > +    uint64_t read_commands = 0, write_commands = 0;
> > +    NvmeSmartLog smart;
> > +    BlockAcctStats *s;
> > +
> > +    if (nsid && nsid != 0xffffffff) {
> > +        return NVME_INVALID_FIELD | NVME_DNR;
> > +    }
> Correct.
> > +
> > +    s = blk_get_stats(n->conf.blk);
> > +
> > +    units_read = s->nr_bytes[BLOCK_ACCT_READ] >> BDRV_SECTOR_BITS;
> > +    units_written = s->nr_bytes[BLOCK_ACCT_WRITE] >> BDRV_SECTOR_BITS;
> > +    read_commands = s->nr_ops[BLOCK_ACCT_READ];
> > +    write_commands = s->nr_ops[BLOCK_ACCT_WRITE];
> > +
> > +    if (off > sizeof(smart)) {
> > +        return NVME_INVALID_FIELD | NVME_DNR;
> > +    }
> > +
> > +    trans_len = MIN(sizeof(smart) - off, buf_len);
> > +
> > +    memset(&smart, 0x0, sizeof(smart));
> > +
> > +    smart.data_units_read[0] = cpu_to_le64(units_read / 1000);
> > +    smart.data_units_written[0] = cpu_to_le64(units_written / 1000);
> Tiny nitpick - the spec asks the value to be rounded up
> 

Ouch. You are correct. I'll swap that for a DIV_ROUND_UP.
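
To illustrate the difference, here is a small standalone sketch.
DIV_ROUND_UP is redefined locally so the snippet compiles on its own (QEMU
already provides an equivalent macro), and data_units() is a hypothetical
helper, not a function in the patch. It converts a byte count to the
"thousands of 512-byte units, rounded up" form that the spec asks for.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in so the sketch is self-contained. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

#define SECTOR_BITS 9   /* 512-byte units, like BDRV_SECTOR_BITS in the patch */

static_assert((1 << SECTOR_BITS) == 512, "data units are counted in 512 bytes");

/* Hypothetical helper: bytes -> thousands of 512-byte units, rounded up. */
static inline uint64_t data_units(uint64_t nr_bytes)
{
    uint64_t sectors = nr_bytes >> SECTOR_BITS;

    return DIV_ROUND_UP(sectors, 1000);
}
```

With the plain division in v3, a device that has read a single sector would
report 0 data units; with the round-up it reports 1.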

> > +static uint16_t nvme_get_log(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
> > +{
> > +    uint32_t dw10 = le32_to_cpu(cmd->cdw10);
> > +    uint32_t dw11 = le32_to_cpu(cmd->cdw11);
> > +    uint32_t dw12 = le32_to_cpu(cmd->cdw12);
> > +    uint32_t dw13 = le32_to_cpu(cmd->cdw13);
> > +    uint8_t  lid = dw10 & 0xff;
> > +    uint8_t  lsp = (dw10 >> 8) & 0xf;
> > +    uint8_t  rae = (dw10 >> 15) & 0x1;
> > +    uint32_t numdl, numdu;
> > +    uint64_t off, lpol, lpou;
> > +    size_t   len;
> > +
> Nitpick: don't we want to check NSID=0 || NSID=0xFFFFFFFF here too?
> 

The spec lists Get Log Page with "Yes" under "Namespace Identifier Used"
but no log pages in v1.3 or v1.4 are namespace specific so we expect
NSID to always be 0 or 0xffffffff. But, there are TPs that have
namespace specific log pages (e.g. TP 4053 Zoned Namespaces). So, it is
not invalid to have NSID set to something.

So, I think we have to defer handling of NSID values to the individual
log pages (like we do for the SMART page).
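
Roughly, the structure I have in mind looks like the sketch below. It is a
standalone model, not the patch: the function names are invented, and
LID_PER_NS is a made-up log page in the vendor specific range that stands in
for any future namespace specific page. The status values are the ones from
the spec.

```c
#include <assert.h>
#include <stdint.h>

#define NSID_BROADCAST 0xffffffffu

static_assert(NSID_BROADCAST == UINT32_MAX, "broadcast NSID is all ones");

/* Status values per the NVMe spec; the names are local to this sketch. */
enum {
    ST_SUCCESS        = 0x0000,
    ST_INVALID_FIELD  = 0x0002,
    ST_INVALID_NSID   = 0x000b,
    ST_INVALID_LOG_ID = 0x0109,
};

enum {
    LID_SMART  = 0x02,
    LID_PER_NS = 0xc0,  /* hypothetical namespace specific log page */
};

/* Controller-wide page: only NSID 0 or the broadcast value is accepted. */
static inline uint16_t smart_log(uint32_t nsid)
{
    if (nsid && nsid != NSID_BROADCAST) {
        return ST_INVALID_FIELD;
    }
    return ST_SUCCESS;
}

/* Namespace specific page: requires a valid, existing namespace. */
static inline uint16_t per_ns_log(uint32_t nsid, uint32_t num_namespaces)
{
    if (!nsid || nsid > num_namespaces) {
        return ST_INVALID_NSID;
    }
    return ST_SUCCESS;
}

/* The dispatcher does no NSID checking; each page applies its own rule. */
static inline uint16_t get_log(uint8_t lid, uint32_t nsid,
                               uint32_t num_namespaces)
{
    switch (lid) {
    case LID_SMART:
        return smart_log(nsid);
    case LID_PER_NS:
        return per_ns_log(nsid, num_namespaces);
    default:
        return ST_INVALID_LOG_ID;
    }
}
```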



* Re: [PATCH v3 12/18] hw/block/nvme: support the get/set features select and save fields
  2020-07-06  6:12 ` [PATCH v3 12/18] hw/block/nvme: support the get/set features select and save fields Klaus Jensen
  2020-07-08 19:25   ` Dmitry Fomichev
@ 2020-07-29 13:17   ` Maxim Levitsky
  2020-07-29 13:48     ` Klaus Jensen
  1 sibling, 1 reply; 60+ messages in thread
From: Maxim Levitsky @ 2020-07-29 13:17 UTC (permalink / raw)
  To: Klaus Jensen, qemu-block
  Cc: Kevin Wolf, Dmitry Fomichev, Klaus Jensen, qemu-devel, Max Reitz,
	Keith Busch, Javier Gonzalez, Philippe Mathieu-Daudé

On Mon, 2020-07-06 at 08:12 +0200, Klaus Jensen wrote:
> From: Klaus Jensen <k.jensen@samsung.com>
> 
> Since the device does not have any persistent state storage, no
> features are "saveable" and setting the Save (SV) field in any Set
> Features command will result in a Feature Identifier Not Saveable status
> code.
> 
> Similarly, if the Select (SEL) field is set to request saved values, the
> devices will (as it should) return the default values instead.
> 
> Since this also introduces "Supported Capabilities", the nsid field is
> now also checked for validity wrt. the feature being get/set'ed.
> 
> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> ---
>  hw/block/nvme.c       | 103 +++++++++++++++++++++++++++++++++++++-----
>  hw/block/trace-events |   4 +-
>  include/block/nvme.h  |  27 ++++++++++-
>  3 files changed, 119 insertions(+), 15 deletions(-)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index 2d85e853403f..df8b786e4875 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -85,6 +85,14 @@ static const bool nvme_feature_support[NVME_FID_MAX] = {
>      [NVME_TIMESTAMP]                = true,
>  };
>  
> +static const uint32_t nvme_feature_cap[NVME_FID_MAX] = {
> +    [NVME_TEMPERATURE_THRESHOLD]    = NVME_FEAT_CAP_CHANGE,
> +    [NVME_VOLATILE_WRITE_CACHE]     = NVME_FEAT_CAP_CHANGE,
> +    [NVME_NUMBER_OF_QUEUES]         = NVME_FEAT_CAP_CHANGE,
> +    [NVME_ASYNCHRONOUS_EVENT_CONF]  = NVME_FEAT_CAP_CHANGE,
> +    [NVME_TIMESTAMP]                = NVME_FEAT_CAP_CHANGE,
> +};
> +
>  static void nvme_process_sq(void *opaque);
>  
>  static uint16_t nvme_cid(NvmeRequest *req)
> @@ -1083,20 +1091,47 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>  {
>      uint32_t dw10 = le32_to_cpu(cmd->cdw10);
>      uint32_t dw11 = le32_to_cpu(cmd->cdw11);
> +    uint32_t nsid = le32_to_cpu(cmd->nsid);
>      uint32_t result;
>      uint8_t fid = NVME_GETSETFEAT_FID(dw10);
> +    NvmeGetFeatureSelect sel = NVME_GETFEAT_SELECT(dw10);
>      uint16_t iv;
>  
>      static const uint32_t nvme_feature_default[NVME_FID_MAX] = {
>          [NVME_ARBITRATION] = NVME_ARB_AB_NOLIMIT,
>      };
>  
> -    trace_pci_nvme_getfeat(nvme_cid(req), fid, dw11);
> +    trace_pci_nvme_getfeat(nvme_cid(req), fid, sel, dw11);
>  
>      if (!nvme_feature_support[fid]) {
>          return NVME_INVALID_FIELD | NVME_DNR;
>      }
>  
> +    if (nvme_feature_cap[fid] & NVME_FEAT_CAP_NS) {
> +        if (!nsid || nsid > n->num_namespaces) {
> +            /*
> +             * The Reservation Notification Mask and Reservation Persistence
> +             * features require a status code of Invalid Field in Command when
> +             * NSID is 0xFFFFFFFF. Since the device does not support those
> +             * features we can always return Invalid Namespace or Format as we
> +             * should do for all other features.
> +             */
> +            return NVME_INVALID_NSID | NVME_DNR;
> +        }
> +    }
> +
> +    switch (sel) {
> +    case NVME_GETFEAT_SELECT_CURRENT:
> +        break;
> +    case NVME_GETFEAT_SELECT_SAVED:
> +        /* no features are saveable by the controller; fallthrough */
> +    case NVME_GETFEAT_SELECT_DEFAULT:
> +        goto defaults;

I hate to say it, but while I have nothing against using 'goto' (unlike some people I have met),
in this particular case it feels like it would be better to have a separate function for
defaults, or even a separate function per feature that returns the current/default/saved/whatever
value. The latter would keep each feature self-contained in its own function.

But on the other hand, I see that you fall back to defaults for unchangeable features, which does make
sense. In other words, I don't have a strong opinion against using goto here after all.

When the feature code gains more features in the future (pun intended), you will probably have to split it
as I suggest, to keep code complexity low.
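
A minimal sketch of the per-feature split, to make the suggestion concrete.
Everything here is hypothetical (the Ctrl struct, the Select enum, the
handler names and the 0x157 default); it is only meant to show the shape,
with one getter per feature that handles current/default/saved itself and a
table that dispatches on the FID.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { SEL_CURRENT, SEL_DEFAULT, SEL_SAVED } Select;

/* Hypothetical controller state, reduced to two features. */
typedef struct Ctrl {
    uint32_t temp_thresh_hi;
    uint32_t async_config;
} Ctrl;

typedef uint32_t (*FeatureGet)(const Ctrl *c, Select sel);

enum { FID_TEMP_THRESH = 0x4, FID_ASYNC_CONF = 0xb, FID_MAX = 0x100 };

#define TEMP_THRESH_DEFAULT 0x157   /* illustrative default value */

static uint32_t get_temp_thresh(const Ctrl *c, Select sel)
{
    /* Nothing is saveable, so SAVED reads back the default. */
    return sel == SEL_CURRENT ? c->temp_thresh_hi : TEMP_THRESH_DEFAULT;
}

static uint32_t get_async_config(const Ctrl *c, Select sel)
{
    return sel == SEL_CURRENT ? c->async_config : 0;
}

/* One self-contained handler per feature; unnamed entries are NULL. */
static const FeatureGet feature_get[FID_MAX] = {
    [FID_TEMP_THRESH] = get_temp_thresh,
    [FID_ASYNC_CONF]  = get_async_config,
};

/* Returns 0 on success, -1 if the feature has no handler. */
static inline int get_feature(const Ctrl *c, uint8_t fid, Select sel,
                              uint32_t *result)
{
    FeatureGet fn = feature_get[fid];

    assert(c != NULL && result != NULL);

    if (!fn) {
        return -1;
    }

    *result = fn(c, sel);
    return 0;
}
```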

> +    case NVME_GETFEAT_SELECT_CAP:
> +        result = nvme_feature_cap[fid];
> +        goto out;
> +    }
> +
>      switch (fid) {
>      case NVME_TEMPERATURE_THRESHOLD:
>          result = 0;
> @@ -1106,22 +1141,45 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>           * return 0 for all other sensors.
>           */
>          if (NVME_TEMP_TMPSEL(dw11) != NVME_TEMP_TMPSEL_COMPOSITE) {
> -            break;
> +            goto out;
>          }
>  
>          switch (NVME_TEMP_THSEL(dw11)) {
>          case NVME_TEMP_THSEL_OVER:
>              result = n->features.temp_thresh_hi;
> -            break;
> +            goto out;
>          case NVME_TEMP_THSEL_UNDER:
>              result = n->features.temp_thresh_low;
> -            break;
> +            goto out;
>          }
>  
> -        break;
> +        return NVME_INVALID_FIELD | NVME_DNR;
>      case NVME_VOLATILE_WRITE_CACHE:
>          result = blk_enable_write_cache(n->conf.blk);
>          trace_pci_nvme_getfeat_vwcache(result ? "enabled" : "disabled");
> +        goto out;
> +    case NVME_ASYNCHRONOUS_EVENT_CONF:
> +        result = n->features.async_config;
> +        goto out;
> +    case NVME_TIMESTAMP:
> +        return nvme_get_feature_timestamp(n, cmd);
> +    default:
> +        break;
> +    }
> +
> +defaults:
> +    switch (fid) {
> +    case NVME_TEMPERATURE_THRESHOLD:
> +        result = 0;
> +
> +        if (NVME_TEMP_TMPSEL(dw11) != NVME_TEMP_TMPSEL_COMPOSITE) {
> +            break;
> +        }
> +
> +        if (NVME_TEMP_THSEL(dw11) == NVME_TEMP_THSEL_OVER) {
> +            result = NVME_TEMPERATURE_WARNING;
> +        }
> +
>          break;
>      case NVME_NUMBER_OF_QUEUES:
>          result = (n->params.max_ioqpairs - 1) |
> @@ -1140,16 +1198,12 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>          }
>  
>          break;
> -    case NVME_ASYNCHRONOUS_EVENT_CONF:
> -        result = n->features.async_config;
> -        break;
> -    case NVME_TIMESTAMP:
> -        return nvme_get_feature_timestamp(n, cmd);
>      default:
>          result = nvme_feature_default[fid];
>          break;
>      }
>  
> +out:
>      req->cqe.result = cpu_to_le32(result);
>      return NVME_SUCCESS;
>  }
> @@ -1176,14 +1230,37 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>  {
>      uint32_t dw10 = le32_to_cpu(cmd->cdw10);
>      uint32_t dw11 = le32_to_cpu(cmd->cdw11);
> +    uint32_t nsid = le32_to_cpu(cmd->nsid);
>      uint8_t fid = NVME_GETSETFEAT_FID(dw10);
> +    uint8_t save = NVME_SETFEAT_SAVE(dw10);
>  
> -    trace_pci_nvme_setfeat(nvme_cid(req), fid, dw11);
> +    trace_pci_nvme_setfeat(nvme_cid(req), fid, save, dw11);
> +
> +    if (save) {
> +        return NVME_FID_NOT_SAVEABLE | NVME_DNR;
> +    }
Good.
>  
>      if (!nvme_feature_support[fid]) {
>          return NVME_INVALID_FIELD | NVME_DNR;
>      }
>  
> +    if (nvme_feature_cap[fid] & NVME_FEAT_CAP_NS) {
> +        if (!nsid || (nsid != NVME_NSID_BROADCAST &&
> +                      nsid > n->num_namespaces)) {
> +            return NVME_INVALID_NSID | NVME_DNR;
> +        }
> +    } else if (nsid && nsid != NVME_NSID_BROADCAST) {
> +        if (nsid > n->num_namespaces) {
> +            return NVME_INVALID_NSID | NVME_DNR;
> +        }
> +
> +        return NVME_FEAT_NOT_NS_SPEC | NVME_DNR;
> +    }
> +
> +    if (!(nvme_feature_cap[fid] & NVME_FEAT_CAP_CHANGE)) {
> +        return NVME_FEAT_NOT_CHANGEABLE | NVME_DNR;
> +    }
> +
>      switch (fid) {
>      case NVME_TEMPERATURE_THRESHOLD:
>          if (NVME_TEMP_TMPSEL(dw11) != NVME_TEMP_TMPSEL_COMPOSITE) {
> @@ -2028,7 +2105,9 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
>      id->sqes = (0x6 << 4) | 0x6;
>      id->cqes = (0x4 << 4) | 0x4;
>      id->nn = cpu_to_le32(n->num_namespaces);
> -    id->oncs = cpu_to_le16(NVME_ONCS_WRITE_ZEROS | NVME_ONCS_TIMESTAMP);
> +    id->oncs = cpu_to_le16(NVME_ONCS_WRITE_ZEROS | NVME_ONCS_TIMESTAMP |
> +                           NVME_ONCS_FEATURES);
OK.
> +
>      id->psd[0].mp = cpu_to_le16(0x9c4);
>      id->psd[0].enlat = cpu_to_le32(0x10);
>      id->psd[0].exlat = cpu_to_le32(0x4);
> diff --git a/hw/block/trace-events b/hw/block/trace-events
> index 42e62f4649f8..4a4ef34071df 100644
> --- a/hw/block/trace-events
> +++ b/hw/block/trace-events
> @@ -46,8 +46,8 @@ pci_nvme_identify_ctrl(void) "identify controller"
>  pci_nvme_identify_ns(uint32_t ns) "nsid %"PRIu32""
>  pci_nvme_identify_nslist(uint32_t ns) "nsid %"PRIu32""
>  pci_nvme_get_log(uint16_t cid, uint8_t lid, uint8_t lsp, uint8_t rae, uint32_t len, uint64_t off) "cid %"PRIu16" lid 0x%"PRIx8" lsp 0x%"PRIx8" rae 0x%"PRIx8" len %"PRIu32" off %"PRIu64""
> -pci_nvme_getfeat(uint16_t cid, uint8_t fid, uint32_t cdw11) "cid %"PRIu16" fid 0x%"PRIx8" cdw11 0x%"PRIx32""
> -pci_nvme_setfeat(uint16_t cid, uint8_t fid, uint32_t cdw11) "cid %"PRIu16" fid 0x%"PRIx8" cdw11 0x%"PRIx32""
> +pci_nvme_getfeat(uint16_t cid, uint8_t fid, uint8_t sel, uint32_t cdw11) "cid %"PRIu16" fid 0x%"PRIx8" sel 0x%"PRIx8" cdw11 0x%"PRIx32""
> +pci_nvme_setfeat(uint16_t cid, uint8_t fid, uint8_t save, uint32_t cdw11) "cid %"PRIu16" fid 0x%"PRIx8" save 0x%"PRIx8" cdw11 0x%"PRIx32""
>  pci_nvme_getfeat_vwcache(const char* result) "get feature volatile write cache, result=%s"
>  pci_nvme_getfeat_numq(int result) "get feature number of queues, result=%d"
>  pci_nvme_setfeat_numq(int reqcq, int reqsq, int gotcq, int gotsq) "requested cq_count=%d sq_count=%d, responding with cq_count=%d sq_count=%d"
> diff --git a/include/block/nvme.h b/include/block/nvme.h
> index cd396111b2f5..179e20a01477 100644
> --- a/include/block/nvme.h
> +++ b/include/block/nvme.h
> @@ -663,7 +663,7 @@ enum NvmeStatusCodes {
>      NVME_INVALID_QUEUE_DEL      = 0x010c,
>      NVME_FID_NOT_SAVEABLE       = 0x010d,
>      NVME_FEAT_NOT_CHANGEABLE    = 0x010e,
> -    NVME_FID_NOT_NSID_SPEC      = 0x010f,
> +    NVME_FEAT_NOT_NS_SPEC       = 0x010f,
>      NVME_FW_REQ_SUSYSTEM_RESET  = 0x0110,
>      NVME_CONFLICTING_ATTRS      = 0x0180,
>      NVME_INVALID_PROT_INFO      = 0x0181,
> @@ -907,9 +907,32 @@ enum NvmeFeatureIds {
>      NVME_FID_MAX                    = 0x100,
>  };
>  
> +typedef enum NvmeFeatureCap {
> +    NVME_FEAT_CAP_SAVE      = 1 << 0,
> +    NVME_FEAT_CAP_NS        = 1 << 1,
> +    NVME_FEAT_CAP_CHANGE    = 1 << 2,
> +} NvmeFeatureCap;
> +
> +typedef enum NvmeGetFeatureSelect {
> +    NVME_GETFEAT_SELECT_CURRENT = 0x0,
> +    NVME_GETFEAT_SELECT_DEFAULT = 0x1,
> +    NVME_GETFEAT_SELECT_SAVED   = 0x2,
> +    NVME_GETFEAT_SELECT_CAP     = 0x3,
> +} NvmeGetFeatureSelect;
> +
>  #define NVME_GETSETFEAT_FID_MASK 0xff
>  #define NVME_GETSETFEAT_FID(dw10) (dw10 & NVME_GETSETFEAT_FID_MASK)
>  
> +#define NVME_GETFEAT_SELECT_SHIFT 8
> +#define NVME_GETFEAT_SELECT_MASK  0x7
> +#define NVME_GETFEAT_SELECT(dw10) \
> +    ((dw10 >> NVME_GETFEAT_SELECT_SHIFT) & NVME_GETFEAT_SELECT_MASK)
> +
> +#define NVME_SETFEAT_SAVE_SHIFT 31
> +#define NVME_SETFEAT_SAVE_MASK  0x1
> +#define NVME_SETFEAT_SAVE(dw10) \
> +    ((dw10 >> NVME_SETFEAT_SAVE_SHIFT) & NVME_SETFEAT_SAVE_MASK)

OK.
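
As a quick check of the bit positions, the three field extractions can be
tried on a sample value. The macro names below are shortened for this sketch
and are not the ones in the header. One difference: the argument is
parenthesized here, which the macros in the patch do not do; that only
matters if a caller passes an expression such as `a | b`.

```c
#include <assert.h>
#include <stdint.h>

/* Same masks and shifts as in the patch, with the argument parenthesized. */
#define GETSETFEAT_FID(dw10)  ((dw10) & 0xff)
#define GETFEAT_SELECT(dw10)  (((dw10) >> 8) & 0x7)
#define SETFEAT_SAVE(dw10)    (((dw10) >> 31) & 0x1)

/* Compile-time checks on a sample CDW10 value. */
static_assert(GETSETFEAT_FID(0x8000030bu) == 0x0b, "FID is bits 7:0");
static_assert(GETFEAT_SELECT(0x8000030bu) == 0x3, "SEL is bits 10:8");
static_assert(SETFEAT_SAVE(0x8000030bu) == 0x1, "SV is bit 31");

/* Runtime wrapper, in case a function is more convenient than a macro. */
static inline uint8_t decode_fid(uint32_t dw10)
{
    return GETSETFEAT_FID(dw10);
}
```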
> +
>  typedef struct NvmeRangeType {
>      uint8_t     type;
>      uint8_t     attributes;
> @@ -926,6 +949,8 @@ typedef struct NvmeLBAF {
>      uint8_t     rp;
>  } NvmeLBAF;
>  
> +#define NVME_NSID_BROADCAST 0xffffffff

Cool. You will probably want to eventually go over the code and
change all places that use the raw number to use the define.
(No need to do this now.)

> +
>  typedef struct NvmeIdNs {
>      uint64_t    nsze;
>      uint64_t    ncap;

Overall looks OK, other than the nitpick about the goto, so

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky
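
For reference, the order of the Set Features checks in this patch can be
modelled as a single standalone function. FeatureInfo and
set_feature_precheck() are invented for this sketch; the three
command-specific status values are the ones in the header diff above, while
Invalid Field (0x2) and Invalid Namespace or Format (0xb) are taken from the
spec. The DNR bit is left out for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NSID_BROADCAST 0xffffffffu

enum {
    ST_SUCCESS             = 0x0000,
    ST_INVALID_FIELD       = 0x0002,
    ST_INVALID_NSID        = 0x000b,
    ST_FID_NOT_SAVEABLE    = 0x010d,
    ST_FEAT_NOT_CHANGEABLE = 0x010e,
    ST_FEAT_NOT_NS_SPEC    = 0x010f,
};

enum { CAP_SAVE = 1 << 0, CAP_NS = 1 << 1, CAP_CHANGE = 1 << 2 };

/* Hypothetical per-feature description. */
typedef struct FeatureInfo {
    bool supported;
    uint32_t cap;
} FeatureInfo;

static inline uint16_t set_feature_precheck(const FeatureInfo *f, bool save,
                                            uint32_t nsid,
                                            uint32_t num_namespaces)
{
    assert(f != NULL);

    if (save) {
        return ST_FID_NOT_SAVEABLE;     /* no persistent state at all */
    }

    if (!f->supported) {
        return ST_INVALID_FIELD;
    }

    if (f->cap & CAP_NS) {
        if (!nsid || (nsid != NSID_BROADCAST && nsid > num_namespaces)) {
            return ST_INVALID_NSID;
        }
    } else if (nsid && nsid != NSID_BROADCAST) {
        if (nsid > num_namespaces) {
            return ST_INVALID_NSID;
        }
        return ST_FEAT_NOT_NS_SPEC;
    }

    if (!(f->cap & CAP_CHANGE)) {
        return ST_FEAT_NOT_CHANGEABLE;
    }

    return ST_SUCCESS;
}
```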




* Re: [PATCH v3 14/18] hw/block/nvme: support identify namespace descriptor list
  2020-07-06  6:12 ` [PATCH v3 14/18] hw/block/nvme: support identify namespace descriptor list Klaus Jensen
@ 2020-07-29 13:25   ` Maxim Levitsky
  0 siblings, 0 replies; 60+ messages in thread
From: Maxim Levitsky @ 2020-07-29 13:25 UTC (permalink / raw)
  To: Klaus Jensen, qemu-block
  Cc: Kevin Wolf, Dmitry Fomichev, Klaus Jensen, qemu-devel, Max Reitz,
	Keith Busch, Javier Gonzalez, Philippe Mathieu-Daudé

On Mon, 2020-07-06 at 08:12 +0200, Klaus Jensen wrote:
> From: Klaus Jensen <k.jensen@samsung.com>
> 
> Since we are not providing the NGUID or EUI64 fields, we must support
> the Namespace UUID. We do not have any way of storing a persistent
> unique identifier, so conjure up a UUID that is just the namespace id.
> 
> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
> ---
>  hw/block/nvme.c       | 41 +++++++++++++++++++++++++++++++++++++++++
>  hw/block/trace-events |  1 +
>  2 files changed, 42 insertions(+)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index 37e4fd8dfce1..fc58f3d76530 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -1007,6 +1007,45 @@ static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeIdentify *c)
>      return ret;
>  }
>  
> +static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeIdentify *c)
> +{
> +    uint32_t nsid = le32_to_cpu(c->nsid);
> +    uint64_t prp1 = le64_to_cpu(c->prp1);
> +    uint64_t prp2 = le64_to_cpu(c->prp2);
> +
> +    uint8_t list[NVME_IDENTIFY_DATA_SIZE];
> +
> +    struct data {
> +        struct {
> +            NvmeIdNsDescr hdr;
> +            uint8_t v[16];
> +        } uuid;
> +    };
> +
> +    struct data *ns_descrs = (struct data *)list;
> +
> +    trace_pci_nvme_identify_ns_descr_list(nsid);
> +
> +    if (unlikely(nsid == 0 || nsid > n->num_namespaces)) {
> +        trace_pci_nvme_err_invalid_ns(nsid, n->num_namespaces);
> +        return NVME_INVALID_NSID | NVME_DNR;
> +    }
> +
> +    memset(list, 0x0, sizeof(list));
> +
> +    /*
> +     * Because the NGUID and EUI64 fields are 0 in the Identify Namespace data
> +     * structure, a Namespace UUID (nidt = 0x3) must be reported in the
> +     * Namespace Identification Descriptor. Add a very basic Namespace UUID
> +     * here.
> +     */
> +    ns_descrs->uuid.hdr.nidt = NVME_NIDT_UUID;
> +    ns_descrs->uuid.hdr.nidl = NVME_NIDT_UUID_LEN;
> +    stl_be_p(&ns_descrs->uuid.v, nsid);
> +
> +    return nvme_dma_read_prp(n, list, NVME_IDENTIFY_DATA_SIZE, prp1, prp2);
> +}
> +
>  static uint16_t nvme_identify(NvmeCtrl *n, NvmeCmd *cmd)
>  {
>      NvmeIdentify *c = (NvmeIdentify *)cmd;
> @@ -1018,6 +1057,8 @@ static uint16_t nvme_identify(NvmeCtrl *n, NvmeCmd *cmd)
>          return nvme_identify_ctrl(n, c);
>      case NVME_ID_CNS_NS_ACTIVE_LIST:
>          return nvme_identify_nslist(n, c);
> +    case NVME_ID_CNS_NS_DESCR_LIST:
> +        return nvme_identify_ns_descr_list(n, c);
>      default:
>          trace_pci_nvme_err_invalid_identify_cns(le32_to_cpu(c->cns));
>          return NVME_INVALID_FIELD | NVME_DNR;
> diff --git a/hw/block/trace-events b/hw/block/trace-events
> index 4a4ef34071df..7b7303cab1dd 100644
> --- a/hw/block/trace-events
> +++ b/hw/block/trace-events
> @@ -45,6 +45,7 @@ pci_nvme_del_cq(uint16_t cqid) "deleted completion queue, cqid=%"PRIu16""
>  pci_nvme_identify_ctrl(void) "identify controller"
>  pci_nvme_identify_ns(uint32_t ns) "nsid %"PRIu32""
>  pci_nvme_identify_nslist(uint32_t ns) "nsid %"PRIu32""
> +pci_nvme_identify_ns_descr_list(uint32_t ns) "nsid %"PRIu32""
>  pci_nvme_get_log(uint16_t cid, uint8_t lid, uint8_t lsp, uint8_t rae, uint32_t len, uint64_t off) "cid %"PRIu16" lid 0x%"PRIx8" lsp 0x%"PRIx8" rae 0x%"PRIx8" len %"PRIu32" off %"PRIu64""
>  pci_nvme_getfeat(uint16_t cid, uint8_t fid, uint8_t sel, uint32_t cdw11) "cid %"PRIu16" fid 0x%"PRIx8" sel 0x%"PRIx8" cdw11 0x%"PRIx32""
>  pci_nvme_setfeat(uint16_t cid, uint8_t fid, uint8_t save, uint32_t cdw11) "cid %"PRIu16" fid 0x%"PRIx8" save 0x%"PRIx8" cdw11 0x%"PRIx32""

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky
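
For reference, the buffer that the patch builds can be reproduced outside of
QEMU. The sketch below is an approximation under stated assumptions:
NsDescrUuid, store_be32() and build_ns_descr_list() are local names, with
store_be32() standing in for stl_be_p(). It shows that the "UUID" ends up as
the namespace id in big-endian order in the first four bytes, followed by
zeroes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NIDT_UUID      0x3
#define NIDT_UUID_LEN  16

/* 4-byte descriptor header followed by the identifier itself. */
typedef struct NsDescrUuid {
    uint8_t nidt;
    uint8_t nidl;
    uint8_t rsvd[2];
    uint8_t v[NIDT_UUID_LEN];
} NsDescrUuid;

/* Local stand-in for stl_be_p(): store a 32-bit value big-endian. */
static inline void store_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Fill a zeroed list with a single Namespace UUID descriptor. */
static inline void build_ns_descr_list(uint8_t *list, size_t len,
                                       uint32_t nsid)
{
    NsDescrUuid d = { .nidt = NIDT_UUID, .nidl = NIDT_UUID_LEN };

    assert(list != NULL && len >= sizeof(d));

    store_be32(d.v, nsid);

    memset(list, 0, len);
    memcpy(list, &d, sizeof(d));
}
```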





* Re: [PATCH v3 15/18] hw/block/nvme: reject invalid nsid values in active namespace id list
  2020-07-06  6:13 ` [PATCH v3 15/18] hw/block/nvme: reject invalid nsid values in active namespace id list Klaus Jensen
  2020-07-06  9:47   ` Philippe Mathieu-Daudé
  2020-07-08 19:26   ` Dmitry Fomichev
@ 2020-07-29 13:27   ` Maxim Levitsky
  2 siblings, 0 replies; 60+ messages in thread
From: Maxim Levitsky @ 2020-07-29 13:27 UTC (permalink / raw)
  To: Klaus Jensen, qemu-block
  Cc: Kevin Wolf, Dmitry Fomichev, Klaus Jensen, qemu-devel, Max Reitz,
	Keith Busch, Javier Gonzalez, Philippe Mathieu-Daudé

On Mon, 2020-07-06 at 08:13 +0200, Klaus Jensen wrote:
> From: Klaus Jensen <k.jensen@samsung.com>
> 
> Reject the nsid broadcast value (0xffffffff) and 0xfffffffe in the
> Active Namespace ID list.
> 
> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> ---
>  hw/block/nvme.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index fc58f3d76530..af39126cd8d1 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -992,6 +992,16 @@ static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeIdentify *c)
>  
>      trace_pci_nvme_identify_nslist(min_nsid);
>  
> +    /*
> +     * Both 0xffffffff (NVME_NSID_BROADCAST) and 0xfffffffe are invalid values
> +     * since the Active Namespace ID List should return namespaces with ids
> +     * *higher* than the NSID specified in the command. This is also specified
> +     * in the spec (NVM Express v1.3d, Section 5.15.4).
> +     */
> +    if (min_nsid >= NVME_NSID_BROADCAST - 1) {
> +        return NVME_INVALID_NSID | NVME_DNR;
> +    }
> +
>      list = g_malloc0(data_len);
>      for (i = 0; i < n->num_namespaces; i++) {
>          if (i < min_nsid) {
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky
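
The boundary in the new check is easy to get wrong by one, so here it is as
a standalone predicate. nslist_min_nsid_valid() is a hypothetical helper for
this sketch; the constant matches NVME_NSID_BROADCAST from the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NSID_BROADCAST 0xffffffffu

static_assert(NSID_BROADCAST - 1 == 0xfffffffeu, "second rejected value");

/*
 * The list returns namespace ids *higher* than the one in the command, so
 * neither 0xfffffffe nor 0xffffffff can ever yield an entry.
 */
static inline bool nslist_min_nsid_valid(uint32_t min_nsid)
{
    return min_nsid < NSID_BROADCAST - 1;
}
```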




* Re: [PATCH v3 17/18] hw/block/nvme: provide the mandatory subnqn field
  2020-07-06  6:13 ` [PATCH v3 17/18] hw/block/nvme: provide the mandatory subnqn field Klaus Jensen
  2020-07-06  9:47   ` Philippe Mathieu-Daudé
  2020-07-08 19:26   ` Dmitry Fomichev
@ 2020-07-29 13:34   ` Maxim Levitsky
  2 siblings, 0 replies; 60+ messages in thread
From: Maxim Levitsky @ 2020-07-29 13:34 UTC (permalink / raw)
  To: Klaus Jensen, qemu-block
  Cc: Kevin Wolf, Dmitry Fomichev, Klaus Jensen, qemu-devel, Max Reitz,
	Keith Busch, Javier Gonzalez, Philippe Mathieu-Daudé

On Mon, 2020-07-06 at 08:13 +0200, Klaus Jensen wrote:
> From: Klaus Jensen <k.jensen@samsung.com>
> 
> The SUBNQN field is mandatory in NVM Express 1.3.
> 
> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> ---
>  hw/block/nvme.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index 07d58aa945f2..e3984157926b 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -2141,6 +2141,7 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
>  {
>      NvmeIdCtrl *id = &n->id_ctrl;
>      uint8_t *pci_conf = pci_dev->config;
> +    char *subnqn;
>  
>      id->vid = cpu_to_le16(pci_get_word(pci_conf + PCI_VENDOR_ID));
>      id->ssvid = cpu_to_le16(pci_get_word(pci_conf + PCI_SUBSYSTEM_VENDOR_ID));
> @@ -2179,6 +2180,10 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
>      id->oncs = cpu_to_le16(NVME_ONCS_WRITE_ZEROS | NVME_ONCS_TIMESTAMP |
>                             NVME_ONCS_FEATURES);
>  
> +    subnqn = g_strdup_printf("nqn.2019-08.org.qemu:%s", n->params.serial);
> +    strpadcpy((char *)id->subnqn, sizeof(id->subnqn), subnqn, '\0');
> +    g_free(subnqn);
> +
>      id->psd[0].mp = cpu_to_le16(0x9c4);
>      id->psd[0].enlat = cpu_to_le32(0x10);
>      id->psd[0].exlat = cpu_to_le32(0x4);

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
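
As a side note on why strpadcpy() fits here: the SUBNQN field is a fixed-size
byte array, so the unused tail has to be filled rather than left as whatever
was there before. A minimal sketch of that behaviour, assuming a local
stand-in named pad_copy() (hypothetical, written for this example and not
QEMU's implementation):

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for a padded copy: copy at most buf_size bytes of
 * str into buf and fill the remainder with pad. The result is only
 * NUL-terminated if str is shorter than buf_size and pad is '\0'.
 */
static void pad_copy(char *buf, size_t buf_size, const char *str, char pad)
{
    size_t len = strlen(str);

    if (len > buf_size) {
        len = buf_size;
    }
    memcpy(buf, str, len);
    memset(buf + len, pad, buf_size - len);
}
```

With a '\0' pad, a serial that is shorter than the field leaves the rest of
the field zeroed, and an overlong one is silently truncated.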

Best regards,
	Maxim Levitsky





* Re: [PATCH v3 08/18] hw/block/nvme: add support for the asynchronous event request command
  2020-07-29 10:43   ` Maxim Levitsky
@ 2020-07-29 13:37     ` Klaus Jensen
  2020-07-29 18:45       ` Maxim Levitsky
  0 siblings, 1 reply; 60+ messages in thread
From: Klaus Jensen @ 2020-07-29 13:37 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: Kevin Wolf, qemu-block, Klaus Jensen, qemu-devel, Max Reitz,
	Keith Busch, Javier Gonzalez, Philippe Mathieu-Daudé

On Jul 29 13:43, Maxim Levitsky wrote:
> On Mon, 2020-07-06 at 08:12 +0200, Klaus Jensen wrote:
> > From: Klaus Jensen <k.jensen@samsung.com>
> > 
> > Add support for the Asynchronous Event Request command. Required for
> > compliance with NVMe revision 1.3d. See NVM Express 1.3d, Section 5.2
> > ("Asynchronous Event Request command").
> > 
> > Mostly imported from Keith's qemu-nvme tree. Modified with a max number
> > of queued events (controllable with the aer_max_queued device
> > parameter). The spec states that the controller *should* retain
> > events, so we do best effort here.
> > 
> > Signed-off-by: Klaus Jensen <klaus.jensen@cnexlabs.com>
> > Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> > Acked-by: Keith Busch <kbusch@kernel.org>
> > Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
> > Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
> > ---
> >  hw/block/nvme.c       | 180 ++++++++++++++++++++++++++++++++++++++++--
> >  hw/block/nvme.h       |  10 ++-
> >  hw/block/trace-events |   9 +++
> >  include/block/nvme.h  |   8 +-
> >  4 files changed, 198 insertions(+), 9 deletions(-)
> > 
> > diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> > index 7cb3787638f6..80c7285bc1cf 100644
> > --- a/hw/block/nvme.c
> > +++ b/hw/block/nvme.c
> > @@ -356,6 +356,85 @@ static void nvme_enqueue_req_completion(NvmeCQueue *cq, NvmeRequest *req)
> >      timer_mod(cq->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + 500);
> >  }
> >  
> > +static void nvme_process_aers(void *opaque)
> > +{
> > +    NvmeCtrl *n = opaque;
> > +    NvmeAsyncEvent *event, *next;
> > +
> > +    trace_pci_nvme_process_aers(n->aer_queued);
> > +
> > +    QTAILQ_FOREACH_SAFE(event, &n->aer_queue, entry, next) {
> > +        NvmeRequest *req;
> > +        NvmeAerResult *result;
> > +
> > +        /* can't post cqe if there is nothing to complete */
> > +        if (!n->outstanding_aers) {
> > +            trace_pci_nvme_no_outstanding_aers();
> > +            break;
> > +        }
> > +
> > +        /* ignore if masked (cqe posted, but event not cleared) */
> > +        if (n->aer_mask & (1 << event->result.event_type)) {
> > +            trace_pci_nvme_aer_masked(event->result.event_type, n->aer_mask);
> > +            continue;
> > +        }
> > +
> > +        QTAILQ_REMOVE(&n->aer_queue, event, entry);
> > +        n->aer_queued--;
> > +
> > +        n->aer_mask |= 1 << event->result.event_type;
> > +        n->outstanding_aers--;
> > +
> > +        req = n->aer_reqs[n->outstanding_aers];
> > +
> > +        result = (NvmeAerResult *) &req->cqe.result;
> > +        result->event_type = event->result.event_type;
> > +        result->event_info = event->result.event_info;
> > +        result->log_page = event->result.log_page;
> > +        g_free(event);
> > +
> > +        req->status = NVME_SUCCESS;
> > +
> > +        trace_pci_nvme_aer_post_cqe(result->event_type, result->event_info,
> > +                                    result->log_page);
> > +
> > +        nvme_enqueue_req_completion(&n->admin_cq, req);
> > +    }
> > +}
> > +
> > +static void nvme_enqueue_event(NvmeCtrl *n, uint8_t event_type,
> > +                               uint8_t event_info, uint8_t log_page)
> > +{
> > +    NvmeAsyncEvent *event;
> > +
> > +    trace_pci_nvme_enqueue_event(event_type, event_info, log_page);
> > +
> > +    if (n->aer_queued == n->params.aer_max_queued) {
> > +        trace_pci_nvme_enqueue_event_noqueue(n->aer_queued);
> > +        return;
> > +    }
> > +
> > +    event = g_new(NvmeAsyncEvent, 1);
> > +    event->result = (NvmeAerResult) {
> > +        .event_type = event_type,
> > +        .event_info = event_info,
> > +        .log_page   = log_page,
> > +    };
> > +
> > +    QTAILQ_INSERT_TAIL(&n->aer_queue, event, entry);
> > +    n->aer_queued++;
> > +
> > +    nvme_process_aers(n);
> > +}
> > +
> > +static void nvme_clear_events(NvmeCtrl *n, uint8_t event_type)
> > +{
> > +    n->aer_mask &= ~(1 << event_type);
> > +    if (!QTAILQ_EMPTY(&n->aer_queue)) {
> > +        nvme_process_aers(n);
> > +    }
> > +}
> > +
> >  static void nvme_rw_cb(void *opaque, int ret)
> >  {
> >      NvmeRequest *req = opaque;
> > @@ -606,8 +685,9 @@ static uint16_t nvme_create_sq(NvmeCtrl *n, NvmeCmd *cmd)
> >      return NVME_SUCCESS;
> >  }
> >  
> > -static uint16_t nvme_smart_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
> > -                                uint64_t off, NvmeRequest *req)
> > +static uint16_t nvme_smart_info(NvmeCtrl *n, NvmeCmd *cmd, uint8_t rae,
> > +                                uint32_t buf_len, uint64_t off,
> > +                                NvmeRequest *req)
> >  {
> >      uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
> >      uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
> > @@ -655,6 +735,10 @@ static uint16_t nvme_smart_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
> >      smart.power_on_hours[0] =
> >          cpu_to_le64((((current_ms - n->starttime_ms) / 1000) / 60) / 60);
> >  
> > +    if (!rae) {
> > +        nvme_clear_events(n, NVME_AER_TYPE_SMART);
> > +    }
> > +
> >      return nvme_dma_read_prp(n, (uint8_t *) &smart + off, trans_len, prp1,
> >                               prp2);
> >  }
> > @@ -681,14 +765,19 @@ static uint16_t nvme_fw_log_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
> >                               prp2);
> >  }
> >  
> > -static uint16_t nvme_error_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
> > -                                uint64_t off, NvmeRequest *req)
> > +static uint16_t nvme_error_info(NvmeCtrl *n, NvmeCmd *cmd, uint8_t rae,
> > +                                uint32_t buf_len, uint64_t off,
> > +                                NvmeRequest *req)
> >  {
> >      uint32_t trans_len;
> >      uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
> >      uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
> >      NvmeErrorLog errlog;
> >  
> > +    if (!rae) {
> > +        nvme_clear_events(n, NVME_AER_TYPE_ERROR);
> > +    }
> > +
> >      if (off > sizeof(errlog)) {
> >          return NVME_INVALID_FIELD | NVME_DNR;
> >      }
> > @@ -729,9 +818,9 @@ static uint16_t nvme_get_log(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
> >  
> >      switch (lid) {
> >      case NVME_LOG_ERROR_INFO:
> > -        return nvme_error_info(n, cmd, len, off, req);
> > +        return nvme_error_info(n, cmd, rae, len, off, req);
> >      case NVME_LOG_SMART_INFO:
> > -        return nvme_smart_info(n, cmd, len, off, req);
> > +        return nvme_smart_info(n, cmd, rae, len, off, req);
> >      case NVME_LOG_FW_SLOT_INFO:
> >          return nvme_fw_log_info(n, cmd, len, off, req);
> >      default:
> > @@ -1013,6 +1102,9 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
> >              ((n->params.max_ioqpairs - 1) << 16);
> >          trace_pci_nvme_getfeat_numq(result);
> >          break;
> > +    case NVME_ASYNCHRONOUS_EVENT_CONF:
> > +        result = n->features.async_config;
> > +        break;
> >      case NVME_TIMESTAMP:
> >          return nvme_get_feature_timestamp(n, cmd);
> >      default:
> > @@ -1064,6 +1156,14 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
> >              return NVME_INVALID_FIELD | NVME_DNR;
> >          }
> >  
> > +        if (((n->temperature >= n->features.temp_thresh_hi) ||
> > +            (n->temperature <= n->features.temp_thresh_low)) &&
> > +            NVME_AEC_SMART(n->features.async_config) & NVME_SMART_TEMPERATURE) {
> > +            nvme_enqueue_event(n, NVME_AER_TYPE_SMART,
> > +                               NVME_AER_INFO_SMART_TEMP_THRESH,
> > +                               NVME_LOG_SMART_INFO);
> > +        }
> > +
> >          break;
> >      case NVME_VOLATILE_WRITE_CACHE:
> >          blk_set_enable_write_cache(n->conf.blk, dw11 & 1);
> > @@ -1076,6 +1176,9 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
> >          req->cqe.result = cpu_to_le32((n->params.max_ioqpairs - 1) |
> >                                        ((n->params.max_ioqpairs - 1) << 16));
> >          break;
> > +    case NVME_ASYNCHRONOUS_EVENT_CONF:
> > +        n->features.async_config = dw11;
> > +        break;
> >      case NVME_TIMESTAMP:
> >          return nvme_set_feature_timestamp(n, cmd);
> >      default:
> > @@ -1085,6 +1188,25 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
> >      return NVME_SUCCESS;
> >  }
> >  
> > +static uint16_t nvme_aer(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
> > +{
> > +    trace_pci_nvme_aer(nvme_cid(req));
> > +
> > +    if (n->outstanding_aers > n->params.aerl) {
> > +        trace_pci_nvme_aer_aerl_exceeded();
> > +        return NVME_AER_LIMIT_EXCEEDED;
> > +    }
> > +
> > +    n->aer_reqs[n->outstanding_aers] = req;
> > +    n->outstanding_aers++;
> > +
> > +    if (!QTAILQ_EMPTY(&n->aer_queue)) {
> > +        nvme_process_aers(n);
> > +    }
> > +
> > +    return NVME_NO_COMPLETE;
> > +}
> 
> Looks good so far
> 
> > +
> >  static uint16_t nvme_admin_cmd(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
> >  {
> >      trace_pci_nvme_admin_cmd(nvme_cid(req), nvme_sqid(req), cmd->opcode);
> > @@ -1108,6 +1230,8 @@ static uint16_t nvme_admin_cmd(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
> >          return nvme_set_feature(n, cmd, req);
> >      case NVME_ADM_CMD_GET_FEATURES:
> >          return nvme_get_feature(n, cmd, req);
> > +    case NVME_ADM_CMD_ASYNC_EV_REQ:
> > +        return nvme_aer(n, cmd, req);
> >      default:
> >          trace_pci_nvme_err_invalid_admin_opc(cmd->opcode);
> >          return NVME_INVALID_OPCODE | NVME_DNR;
> > @@ -1162,6 +1286,15 @@ static void nvme_clear_ctrl(NvmeCtrl *n)
> >          }
> >      }
> >  
> > +    while (!QTAILQ_EMPTY(&n->aer_queue)) {
> > +        NvmeAsyncEvent *event = QTAILQ_FIRST(&n->aer_queue);
> > +        QTAILQ_REMOVE(&n->aer_queue, event, entry);
> > +        g_free(event);
> > +    }
> > +
> > +    n->aer_queued = 0;
> > +    n->outstanding_aers = 0;
> > +
> >      blk_flush(n->conf.blk);
> >      n->bar.cc = 0;
> >  }
> > @@ -1258,6 +1391,8 @@ static int nvme_start_ctrl(NvmeCtrl *n)
> >  
> >      nvme_set_timestamp(n, 0ULL);
> >  
> > +    QTAILQ_INIT(&n->aer_queue);
> > +
> >      return 0;
> >  }
> >  
> > @@ -1479,6 +1614,13 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
> >                             "completion queue doorbell write"
> >                             " for nonexistent queue,"
> >                             " sqid=%"PRIu32", ignoring", qid);
> > +
> > +            if (n->outstanding_aers) {
> > +                nvme_enqueue_event(n, NVME_AER_TYPE_ERROR,
> > +                                   NVME_AER_INFO_ERR_INVALID_DB_REGISTER,
> > +                                   NVME_LOG_ERROR_INFO);
> > +            }
> To be honest I would move the check for outstanding AERs to nvme_enqueue_event.
> 
> Also the logic seems a bit off. The code checks that we have outstanding AER requests,
> however we do have an internal AER queue for this situation.
> It seems that SMART events are generated without this check but ERROR events only when
> outstanding AERs exist.
> Could you explain? I have probably forgotten something from the spec, which I haven't read for a long time.
> 

I'm pretty sure this has been mentioned before, but I can't find it
anywhere, maybe it was an internal review...

Anyway, I'm interpreting the AER logic as a special case for doorbell writes:

NVM Express v1.3d, Section 4.1 states: "If host software writes an
invalid value to the Submission Queue Tail Doorbell or Completion Queue
Head Doorbell register and an Asynchronous Event Request command is
outstanding, then an asynchronous event is posted to the Admin
Completion Queue with a status code of Invalid Doorbell Write Value."
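
To make the distinction concrete, here is a minimal sketch of that
interpretation. The enum and the function name are hypothetical and exist
only for this example; the patch open-codes the check at each doorbell error
site instead.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical event sources, named for this sketch only. */
enum event_source {
    EVENT_SOURCE_INVALID_DOORBELL,
    EVENT_SOURCE_SMART,
};

/*
 * Decide whether an event should be handed to the internal event queue.
 * Invalid doorbell writes are only reported when an AER is outstanding;
 * other events are queued regardless and wait for the next AER.
 */
static bool should_enqueue_event(enum event_source src,
                                 uint32_t outstanding_aers)
{
    if (src == EVENT_SOURCE_INVALID_DOORBELL) {
        return outstanding_aers > 0;
    }
    return true;
}
```

So an invalid doorbell write with no AER outstanding is simply dropped,
while a SMART event in the same situation still lands in the queue.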

> 
> > +
> >              return;
> >          }
> >  
> > @@ -1489,6 +1631,13 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
> >                             " beyond queue size, sqid=%"PRIu32","
> >                             " new_head=%"PRIu16", ignoring",
> >                             qid, new_head);
> > +
> > +            if (n->outstanding_aers) {
> > +                nvme_enqueue_event(n, NVME_AER_TYPE_ERROR,
> > +                                   NVME_AER_INFO_ERR_INVALID_DB_VALUE,
> > +                                   NVME_LOG_ERROR_INFO);
> > +            }
> > +
> >              return;
> >          }
> >  
> > @@ -1519,6 +1668,13 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
> >                             "submission queue doorbell write"
> >                             " for nonexistent queue,"
> >                             " sqid=%"PRIu32", ignoring", qid);
> > +
> > +            if (n->outstanding_aers) {
> > +                nvme_enqueue_event(n, NVME_AER_TYPE_ERROR,
> > +                                   NVME_AER_INFO_ERR_INVALID_DB_REGISTER,
> > +                                   NVME_LOG_ERROR_INFO);
> > +            }
> > +
> >              return;
> >          }
> >  
> > @@ -1529,6 +1685,13 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
> >                             " beyond queue size, sqid=%"PRIu32","
> >                             " new_tail=%"PRIu16", ignoring",
> >                             qid, new_tail);
> > +
> > +            if (n->outstanding_aers) {
> > +                nvme_enqueue_event(n, NVME_AER_TYPE_ERROR,
> > +                                   NVME_AER_INFO_ERR_INVALID_DB_VALUE,
> > +                                   NVME_LOG_ERROR_INFO);
> > +            }
> > +
> >              return;
> >          }
> >  
> > @@ -1650,6 +1813,7 @@ static void nvme_init_state(NvmeCtrl *n)
> >      n->temperature = NVME_TEMPERATURE;
> >      n->features.temp_thresh_hi = NVME_TEMPERATURE_WARNING;
> >      n->starttime_ms = qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL);
> > +    n->aer_reqs = g_new0(NvmeRequest *, n->params.aerl + 1);
> >  }
> >  
> >  static void nvme_init_blk(NvmeCtrl *n, Error **errp)
> > @@ -1805,6 +1969,7 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
> >       * inconsequential.
> >       */
> >      id->acl = 3;
> > +    id->aerl = n->params.aerl;
> Name a tiny bit unclear. I know that this is from the spec but still.
> 

Yes I know, but I really prefer the spec names if possible (makes it
easy to look them up).

> >      id->frmw = (NVME_NUM_FW_SLOTS << 1) | NVME_FRMW_SLOT1_RO;
> >      id->lpa = NVME_LPA_EXTENDED;
> >  
> > @@ -1879,6 +2044,7 @@ static void nvme_exit(PCIDevice *pci_dev)
> >      g_free(n->namespaces);
> >      g_free(n->cq);
> >      g_free(n->sq);
> > +    g_free(n->aer_reqs);
> >  
> >      if (n->params.cmb_size_mb) {
> >          g_free(n->cmbuf);
> > @@ -1899,6 +2065,8 @@ static Property nvme_props[] = {
> >      DEFINE_PROP_UINT32("num_queues", NvmeCtrl, params.num_queues, 0),
> >      DEFINE_PROP_UINT32("max_ioqpairs", NvmeCtrl, params.max_ioqpairs, 64),
> >      DEFINE_PROP_UINT16("msix_qsize", NvmeCtrl, params.msix_qsize, 65),
> > +    DEFINE_PROP_UINT8("aerl", NvmeCtrl, params.aerl, 3),
> So this is the number of AERs that we allow the user to have outstanding

Yeah, and per the spec, 0's based.
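
A small sketch of what 0's based means for the limit check and for the size
of the request array. The helper names are hypothetical; the comparisons
mirror the ones in nvme_aer() and nvme_init_state() above.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * AERL is 0's based: a value of 3 permits 4 outstanding AERs. A new AER is
 * rejected only when the count already exceeds the raw AERL value.
 */
static bool aer_limit_exceeded(uint32_t outstanding_aers, uint8_t aerl)
{
    return outstanding_aers > aerl;
}

/* Number of request slots needed to hold every permitted AER. */
static uint32_t aer_slots(uint8_t aerl)
{
    return (uint32_t)aerl + 1;
}
```

With the default aerl=3, the check passes for counts 0 through 3 and the
array has four slots, which is why the allocation uses aerl + 1.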

> 
> > +    DEFINE_PROP_UINT32("aer_max_queued", NvmeCtrl, params.aer_max_queued, 64),
> And this is the number of AERs that we keep in our internal AER queue until the user posts an AER so that we
> can complete it.
> 

Correct.
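
Roughly, the interplay between the two parameters looks like the sketch
below. It is a deliberately simplified model: the struct and function names
are hypothetical, only counters are tracked, and event masking is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model: only the counters, no event payloads. */
struct aer_state {
    uint32_t queued;       /* events waiting in the internal queue */
    uint32_t max_queued;   /* models the aer_max_queued parameter */
    uint32_t outstanding;  /* AERs submitted by the host, not yet completed */
};

/* Queue an event; returns false if the queue is full and it is dropped. */
static bool aer_enqueue(struct aer_state *s)
{
    if (s->queued == s->max_queued) {
        return false;
    }
    s->queued++;
    return true;
}

/* Complete as many outstanding AERs as there are queued events. */
static uint32_t aer_process(struct aer_state *s)
{
    uint32_t completed = 0;

    while (s->queued && s->outstanding) {
        s->queued--;
        s->outstanding--;
        completed++;
    }
    return completed;
}
```

Events beyond max_queued are lost, which is the "best effort" part of the
commit message.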





* Re: [PATCH v3 12/18] hw/block/nvme: support the get/set features select and save fields
  2020-07-29 13:17   ` Maxim Levitsky
@ 2020-07-29 13:48     ` Klaus Jensen
  2020-07-29 18:47       ` Maxim Levitsky
  0 siblings, 1 reply; 60+ messages in thread
From: Klaus Jensen @ 2020-07-29 13:48 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: Kevin Wolf, qemu-block, Klaus Jensen, qemu-devel, Max Reitz,
	Keith Busch, Javier Gonzalez, Philippe Mathieu-Daudé

On Jul 29 16:17, Maxim Levitsky wrote:
> On Mon, 2020-07-06 at 08:12 +0200, Klaus Jensen wrote:
> > From: Klaus Jensen <k.jensen@samsung.com>
> > 
> > Since the device does not have any persistent state storage, no
> > features are "saveable" and setting the Save (SV) field in any Set
> > Features command will result in a Feature Identifier Not Saveable status
> > code.
> > 
> > Similarly, if the Select (SEL) field is set to request saved values, the
> > device will (as it should) return the default values instead.
> > 
> > Since this also introduces "Supported Capabilities", the nsid field is
> > now also checked for validity wrt. the feature being get/set'ed.
> > 
> > Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> > ---
> >  hw/block/nvme.c       | 103 +++++++++++++++++++++++++++++++++++++-----
> >  hw/block/trace-events |   4 +-
> >  include/block/nvme.h  |  27 ++++++++++-
> >  3 files changed, 119 insertions(+), 15 deletions(-)
> > 
> > diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> > index 2d85e853403f..df8b786e4875 100644
> > --- a/hw/block/nvme.c
> > +++ b/hw/block/nvme.c
> > @@ -1083,20 +1091,47 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
> >  {
> >      uint32_t dw10 = le32_to_cpu(cmd->cdw10);
> >      uint32_t dw11 = le32_to_cpu(cmd->cdw11);
> > +    uint32_t nsid = le32_to_cpu(cmd->nsid);
> >      uint32_t result;
> >      uint8_t fid = NVME_GETSETFEAT_FID(dw10);
> > +    NvmeGetFeatureSelect sel = NVME_GETFEAT_SELECT(dw10);
> >      uint16_t iv;
> >  
> >      static const uint32_t nvme_feature_default[NVME_FID_MAX] = {
> >          [NVME_ARBITRATION] = NVME_ARB_AB_NOLIMIT,
> >      };
> >  
> > -    trace_pci_nvme_getfeat(nvme_cid(req), fid, dw11);
> > +    trace_pci_nvme_getfeat(nvme_cid(req), fid, sel, dw11);
> >  
> >      if (!nvme_feature_support[fid]) {
> >          return NVME_INVALID_FIELD | NVME_DNR;
> >      }
> >  
> > +    if (nvme_feature_cap[fid] & NVME_FEAT_CAP_NS) {
> > +        if (!nsid || nsid > n->num_namespaces) {
> > +            /*
> > +             * The Reservation Notification Mask and Reservation Persistence
> > +             * features require a status code of Invalid Field in Command when
> > +             * NSID is 0xFFFFFFFF. Since the device does not support those
> > +             * features we can always return Invalid Namespace or Format as we
> > +             * should do for all other features.
> > +             */
> > +            return NVME_INVALID_NSID | NVME_DNR;
> > +        }
> > +    }
> > +
> > +    switch (sel) {
> > +    case NVME_GETFEAT_SELECT_CURRENT:
> > +        break;
> > +    case NVME_GETFEAT_SELECT_SAVED:
> > +        /* no features are saveable by the controller; fallthrough */
> > +    case NVME_GETFEAT_SELECT_DEFAULT:
> > +        goto defaults;
> 
> I hate to say it, but while I have nothing against using 'goto' (unlike some types I've met),
> in this particular case it feels like it would be better to have a separate function for
> defaults, or even a separate function per feature that returns the current/default/saved/whatever
> value. The latter would allow each feature to be self-contained in its own function.
> 
> But on the other hand I see that you fall back to defaults for unchangeable features, which does make
> sense. In other words, I don't have a strong opinion against using goto here after all.
> 
> When the feature code gets more features in the future (pun intended) you will probably have to split it,
> like I suggest, to keep code complexity low.
> 

Argh... I know you are right.

Since you are "accepting" the current state with your R-b and it already
carries one from Dmitry I think I'll let this stay for now, but I will
fix this in a follow up patch for sure.
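
For the record, the per-feature shape suggested above could look something
like this. It is only a sketch of the idea, not the follow-up itself: the
enum, the struct and temp_feature_get() are hypothetical names made up for
this example.

```c
#include <stdint.h>

/* Hypothetical names for this sketch. */
enum feat_select {
    FEAT_SELECT_CURRENT = 0,
    FEAT_SELECT_DEFAULT = 1,
    FEAT_SELECT_SAVED   = 2,
};

struct temp_feature {
    uint16_t current;  /* value set by the host */
    uint16_t def;      /* power-on default */
};

/*
 * Per-feature getter: everything needed to answer a Get Features command for
 * this feature lives in one function. Nothing is saveable, so a request for
 * the saved value returns the default.
 */
static uint32_t temp_feature_get(const struct temp_feature *f,
                                 enum feat_select sel)
{
    switch (sel) {
    case FEAT_SELECT_SAVED:
    case FEAT_SELECT_DEFAULT:
        return f->def;
    case FEAT_SELECT_CURRENT:
    default:
        return f->current;
    }
}
```

The saved-to-default fallthrough is kept, but it no longer needs a goto
shared across all features.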

> > @@ -926,6 +949,8 @@ typedef struct NvmeLBAF {
> >      uint8_t     rp;
> >  } NvmeLBAF;
> >  
> > +#define NVME_NSID_BROADCAST 0xffffffff
> 
> Cool, you probably want eventually to go over code and
> change all places that use the number to the define.
> (No need to do this now)
> 

True. Noted :)



* Re: [PATCH v3 07/18] hw/block/nvme: add support for the get log page command
  2020-07-29 11:44     ` Klaus Jensen
@ 2020-07-29 18:35       ` Maxim Levitsky
  0 siblings, 0 replies; 60+ messages in thread
From: Maxim Levitsky @ 2020-07-29 18:35 UTC (permalink / raw)
  To: Klaus Jensen
  Cc: Kevin Wolf, qemu-block, Klaus Jensen, qemu-devel, Max Reitz,
	Keith Busch, Javier Gonzalez, Philippe Mathieu-Daudé

On Wed, 2020-07-29 at 13:44 +0200, Klaus Jensen wrote:
> On Jul 29 13:24, Maxim Levitsky wrote:
> > On Mon, 2020-07-06 at 08:12 +0200, Klaus Jensen wrote:
> > > From: Klaus Jensen <k.jensen@samsung.com>
> > > 
> > > Add support for the Get Log Page command and basic implementations of
> > > the mandatory Error Information, SMART / Health Information and Firmware
> > > Slot Information log pages.
> > > 
> > > In violation of the specification, the SMART / Health Information log
> > > page does not persist information over the lifetime of the controller
> > > because the device has no place to store such persistent state.
> > > 
> > > Note that the LPA field in the Identify Controller data structure
> > > intentionally has bit 0 cleared because there is no namespace specific
> > > information in the SMART / Health information log page.
> > > 
> > > Required for compliance with NVMe revision 1.3d. See NVM Express 1.3d,
> > > Section 5.14 ("Get Log Page command").
> > > 
> > > Signed-off-by: Klaus Jensen <klaus.jensen@cnexlabs.com>
> > > Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> > > Acked-by: Keith Busch <kbusch@kernel.org>
> > > ---
> > >  hw/block/nvme.c       | 140 +++++++++++++++++++++++++++++++++++++++++-
> > >  hw/block/nvme.h       |   2 +
> > >  hw/block/trace-events |   2 +
> > >  include/block/nvme.h  |   8 ++-
> > >  4 files changed, 149 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> > > index b6bc75eb61a2..7cb3787638f6 100644
> > > --- a/hw/block/nvme.c
> > > +++ b/hw/block/nvme.c
> > > @@ -606,6 +606,140 @@ static uint16_t nvme_create_sq(NvmeCtrl *n, NvmeCmd *cmd)
> > >      return NVME_SUCCESS;
> > >  }
> > >  
> > > +static uint16_t nvme_smart_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
> > > +                                uint64_t off, NvmeRequest *req)
> > > +{
> > > +    uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
> > > +    uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
> > > +    uint32_t nsid = le32_to_cpu(cmd->nsid);
> > > +
> > > +    uint32_t trans_len;
> > > +    time_t current_ms;
> > > +    uint64_t units_read = 0, units_written = 0;
> > > +    uint64_t read_commands = 0, write_commands = 0;
> > > +    NvmeSmartLog smart;
> > > +    BlockAcctStats *s;
> > > +
> > > +    if (nsid && nsid != 0xffffffff) {
> > > +        return NVME_INVALID_FIELD | NVME_DNR;
> > > +    }
> > Correct.
> > > +
> > > +    s = blk_get_stats(n->conf.blk);
> > > +
> > > +    units_read = s->nr_bytes[BLOCK_ACCT_READ] >> BDRV_SECTOR_BITS;
> > > +    units_written = s->nr_bytes[BLOCK_ACCT_WRITE] >> BDRV_SECTOR_BITS;
> > > +    read_commands = s->nr_ops[BLOCK_ACCT_READ];
> > > +    write_commands = s->nr_ops[BLOCK_ACCT_WRITE];
> > > +
> > > +    if (off > sizeof(smart)) {
> > > +        return NVME_INVALID_FIELD | NVME_DNR;
> > > +    }
> > > +
> > > +    trans_len = MIN(sizeof(smart) - off, buf_len);
> > > +
> > > +    memset(&smart, 0x0, sizeof(smart));
> > > +
> > > +    smart.data_units_read[0] = cpu_to_le64(units_read / 1000);
> > > +    smart.data_units_written[0] = cpu_to_le64(units_written / 1000);
> > Tiny nitpick - the spec asks the value to be rounded up
> > 
> 
> Ouch. You are correct. I'll swap that for a DIV_ROUND_UP.
Not a big deal though, as these values don't matter much to anybody since we don't have
a way of storing them permanently.
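
Just to spell out the difference, a minimal sketch. The macro is defined
locally so the snippet stands alone (QEMU already provides DIV_ROUND_UP),
and sectors_to_data_units() is a hypothetical helper name.

```c
#include <stdint.h>

/* Local definition for the sketch; QEMU provides an equivalent macro. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Convert a count of 512-byte sectors into SMART data units (thousands of
 * sectors), rounding up so that any partial unit is reported.
 */
static uint64_t sectors_to_data_units(uint64_t sectors)
{
    return DIV_ROUND_UP(sectors, 1000);
}
```

With plain division a device that has read 999 sectors reports 0 data
units; with the rounded-up version it reports 1.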

> 
> > > +static uint16_t nvme_get_log(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
> > > +{
> > > +    uint32_t dw10 = le32_to_cpu(cmd->cdw10);
> > > +    uint32_t dw11 = le32_to_cpu(cmd->cdw11);
> > > +    uint32_t dw12 = le32_to_cpu(cmd->cdw12);
> > > +    uint32_t dw13 = le32_to_cpu(cmd->cdw13);
> > > +    uint8_t  lid = dw10 & 0xff;
> > > +    uint8_t  lsp = (dw10 >> 8) & 0xf;
> > > +    uint8_t  rae = (dw10 >> 15) & 0x1;
> > > +    uint32_t numdl, numdu;
> > > +    uint64_t off, lpol, lpou;
> > > +    size_t   len;
> > > +
> > Nitpick: don't we want to check NSID=0 || NSID=0xFFFFFFFF here too?
> > 
> 
> The spec lists Get Log Page with "Yes" under "Namespace Identifier Used"
> but no log pages in v1.3 or v1.4 are namespace specific so we expect
> NSID to always be 0 or 0xffffffff. But, there are TPs that have
> namespace specific log pages (i.e. TP 4053 Zoned Namespaces). So, it is
> not invalid to have NSID set to something.
> 
> So, I think we have to defer handling of NSID values to the individual
> log pages (like we do for the SMART page).
Ah, OK.
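
A rough sketch of what deferring the NSID check to the individual log pages
means in practice. Everything here is hypothetical and named for the example
only, including the status values and the made-up namespace specific log
page identifier.

```c
#include <stdint.h>

#define NSID_BROADCAST 0xffffffffu

/* Hypothetical status and log page values for this sketch. */
enum { STATUS_OK = 0, STATUS_INVALID_FIELD = 1 };
enum { LOG_SMART = 0x02, LOG_PER_NS = 0xc0 };

/* The SMART page is not namespace specific: accept 0 or broadcast only. */
static int smart_log_check_nsid(uint32_t nsid)
{
    if (nsid && nsid != NSID_BROADCAST) {
        return STATUS_INVALID_FIELD;
    }
    return STATUS_OK;
}

/* A namespace specific page would instead require a real namespace. */
static int per_ns_log_check_nsid(uint32_t nsid, uint32_t num_namespaces)
{
    if (!nsid || nsid > num_namespaces) {
        return STATUS_INVALID_FIELD;
    }
    return STATUS_OK;
}

/* The dispatcher does no NSID validation of its own. */
static int get_log_check_nsid(uint8_t lid, uint32_t nsid,
                              uint32_t num_namespaces)
{
    switch (lid) {
    case LOG_SMART:
        return smart_log_check_nsid(nsid);
    case LOG_PER_NS:
        return per_ns_log_check_nsid(nsid, num_namespaces);
    default:
        return STATUS_INVALID_FIELD;
    }
}
```

A check in the dispatcher itself would have to be relaxed again as soon as
the first namespace specific page shows up.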

> 


Best regards,
	Maxim Levitsky




* Re: [PATCH v3 08/18] hw/block/nvme: add support for the asynchronous event request command
  2020-07-29 13:37     ` Klaus Jensen
@ 2020-07-29 18:45       ` Maxim Levitsky
  2020-07-29 20:08         ` Klaus Jensen
  0 siblings, 1 reply; 60+ messages in thread
From: Maxim Levitsky @ 2020-07-29 18:45 UTC (permalink / raw)
  To: Klaus Jensen
  Cc: Kevin Wolf, qemu-block, Klaus Jensen, qemu-devel, Max Reitz,
	Keith Busch, Javier Gonzalez, Philippe Mathieu-Daudé

On Wed, 2020-07-29 at 15:37 +0200, Klaus Jensen wrote:
> On Jul 29 13:43, Maxim Levitsky wrote:
> > On Mon, 2020-07-06 at 08:12 +0200, Klaus Jensen wrote:
> > > From: Klaus Jensen <k.jensen@samsung.com>
> > > 
> > > Add support for the Asynchronous Event Request command. Required for
> > > compliance with NVMe revision 1.3d. See NVM Express 1.3d, Section 5.2
> > > ("Asynchronous Event Request command").
> > > 
> > > Mostly imported from Keith's qemu-nvme tree. Modified with a max number
> > > of queued events (controllable with the aer_max_queued device
> > > parameter). The spec states that the controller *should* retain
> > > events, so we do best effort here.
> > > 
> > > Signed-off-by: Klaus Jensen <klaus.jensen@cnexlabs.com>
> > > Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> > > Acked-by: Keith Busch <kbusch@kernel.org>
> > > Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
> > > Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
> > > ---
> > >  hw/block/nvme.c       | 180 ++++++++++++++++++++++++++++++++++++++++--
> > >  hw/block/nvme.h       |  10 ++-
> > >  hw/block/trace-events |   9 +++
> > >  include/block/nvme.h  |   8 +-
> > >  4 files changed, 198 insertions(+), 9 deletions(-)
> > > 
> > > diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> > > index 7cb3787638f6..80c7285bc1cf 100644
> > > --- a/hw/block/nvme.c
> > > +++ b/hw/block/nvme.c
> > > @@ -356,6 +356,85 @@ static void nvme_enqueue_req_completion(NvmeCQueue *cq, NvmeRequest *req)
> > >      timer_mod(cq->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + 500);
> > >  }
> > >  
> > > +static void nvme_process_aers(void *opaque)
> > > +{
> > > +    NvmeCtrl *n = opaque;
> > > +    NvmeAsyncEvent *event, *next;
> > > +
> > > +    trace_pci_nvme_process_aers(n->aer_queued);
> > > +
> > > +    QTAILQ_FOREACH_SAFE(event, &n->aer_queue, entry, next) {
> > > +        NvmeRequest *req;
> > > +        NvmeAerResult *result;
> > > +
> > > +        /* can't post cqe if there is nothing to complete */
> > > +        if (!n->outstanding_aers) {
> > > +            trace_pci_nvme_no_outstanding_aers();
> > > +            break;
> > > +        }
> > > +
> > > +        /* ignore if masked (cqe posted, but event not cleared) */
> > > +        if (n->aer_mask & (1 << event->result.event_type)) {
> > > +            trace_pci_nvme_aer_masked(event->result.event_type, n->aer_mask);
> > > +            continue;
> > > +        }
> > > +
> > > +        QTAILQ_REMOVE(&n->aer_queue, event, entry);
> > > +        n->aer_queued--;
> > > +
> > > +        n->aer_mask |= 1 << event->result.event_type;
> > > +        n->outstanding_aers--;
> > > +
> > > +        req = n->aer_reqs[n->outstanding_aers];
> > > +
> > > +        result = (NvmeAerResult *) &req->cqe.result;
> > > +        result->event_type = event->result.event_type;
> > > +        result->event_info = event->result.event_info;
> > > +        result->log_page = event->result.log_page;
> > > +        g_free(event);
> > > +
> > > +        req->status = NVME_SUCCESS;
> > > +
> > > +        trace_pci_nvme_aer_post_cqe(result->event_type, result->event_info,
> > > +                                    result->log_page);
> > > +
> > > +        nvme_enqueue_req_completion(&n->admin_cq, req);
> > > +    }
> > > +}
> > > +
> > > +static void nvme_enqueue_event(NvmeCtrl *n, uint8_t event_type,
> > > +                               uint8_t event_info, uint8_t log_page)
> > > +{
> > > +    NvmeAsyncEvent *event;
> > > +
> > > +    trace_pci_nvme_enqueue_event(event_type, event_info, log_page);
> > > +
> > > +    if (n->aer_queued == n->params.aer_max_queued) {
> > > +        trace_pci_nvme_enqueue_event_noqueue(n->aer_queued);
> > > +        return;
> > > +    }
> > > +
> > > +    event = g_new(NvmeAsyncEvent, 1);
> > > +    event->result = (NvmeAerResult) {
> > > +        .event_type = event_type,
> > > +        .event_info = event_info,
> > > +        .log_page   = log_page,
> > > +    };
> > > +
> > > +    QTAILQ_INSERT_TAIL(&n->aer_queue, event, entry);
> > > +    n->aer_queued++;
> > > +
> > > +    nvme_process_aers(n);
> > > +}
> > > +
> > > +static void nvme_clear_events(NvmeCtrl *n, uint8_t event_type)
> > > +{
> > > +    n->aer_mask &= ~(1 << event_type);
> > > +    if (!QTAILQ_EMPTY(&n->aer_queue)) {
> > > +        nvme_process_aers(n);
> > > +    }
> > > +}
> > > +
> > >  static void nvme_rw_cb(void *opaque, int ret)
> > >  {
> > >      NvmeRequest *req = opaque;
> > > @@ -606,8 +685,9 @@ static uint16_t nvme_create_sq(NvmeCtrl *n, NvmeCmd *cmd)
> > >      return NVME_SUCCESS;
> > >  }
> > >  
> > > -static uint16_t nvme_smart_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
> > > -                                uint64_t off, NvmeRequest *req)
> > > +static uint16_t nvme_smart_info(NvmeCtrl *n, NvmeCmd *cmd, uint8_t rae,
> > > +                                uint32_t buf_len, uint64_t off,
> > > +                                NvmeRequest *req)
> > >  {
> > >      uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
> > >      uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
> > > @@ -655,6 +735,10 @@ static uint16_t nvme_smart_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
> > >      smart.power_on_hours[0] =
> > >          cpu_to_le64((((current_ms - n->starttime_ms) / 1000) / 60) / 60);
> > >  
> > > +    if (!rae) {
> > > +        nvme_clear_events(n, NVME_AER_TYPE_SMART);
> > > +    }
> > > +
> > >      return nvme_dma_read_prp(n, (uint8_t *) &smart + off, trans_len, prp1,
> > >                               prp2);
> > >  }
> > > @@ -681,14 +765,19 @@ static uint16_t nvme_fw_log_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
> > >                               prp2);
> > >  }
> > >  
> > > -static uint16_t nvme_error_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
> > > -                                uint64_t off, NvmeRequest *req)
> > > +static uint16_t nvme_error_info(NvmeCtrl *n, NvmeCmd *cmd, uint8_t rae,
> > > +                                uint32_t buf_len, uint64_t off,
> > > +                                NvmeRequest *req)
> > >  {
> > >      uint32_t trans_len;
> > >      uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
> > >      uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
> > >      NvmeErrorLog errlog;
> > >  
> > > +    if (!rae) {
> > > +        nvme_clear_events(n, NVME_AER_TYPE_ERROR);
> > > +    }
> > > +
> > >      if (off > sizeof(errlog)) {
> > >          return NVME_INVALID_FIELD | NVME_DNR;
> > >      }
> > > @@ -729,9 +818,9 @@ static uint16_t nvme_get_log(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
> > >  
> > >      switch (lid) {
> > >      case NVME_LOG_ERROR_INFO:
> > > -        return nvme_error_info(n, cmd, len, off, req);
> > > +        return nvme_error_info(n, cmd, rae, len, off, req);
> > >      case NVME_LOG_SMART_INFO:
> > > -        return nvme_smart_info(n, cmd, len, off, req);
> > > +        return nvme_smart_info(n, cmd, rae, len, off, req);
> > >      case NVME_LOG_FW_SLOT_INFO:
> > >          return nvme_fw_log_info(n, cmd, len, off, req);
> > >      default:
> > > @@ -1013,6 +1102,9 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
> > >              ((n->params.max_ioqpairs - 1) << 16);
> > >          trace_pci_nvme_getfeat_numq(result);
> > >          break;
> > > +    case NVME_ASYNCHRONOUS_EVENT_CONF:
> > > +        result = n->features.async_config;
> > > +        break;
> > >      case NVME_TIMESTAMP:
> > >          return nvme_get_feature_timestamp(n, cmd);
> > >      default:
> > > @@ -1064,6 +1156,14 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
> > >              return NVME_INVALID_FIELD | NVME_DNR;
> > >          }
> > >  
> > > +        if (((n->temperature >= n->features.temp_thresh_hi) ||
> > > +            (n->temperature <= n->features.temp_thresh_low)) &&
> > > +            NVME_AEC_SMART(n->features.async_config) & NVME_SMART_TEMPERATURE) {
> > > +            nvme_enqueue_event(n, NVME_AER_TYPE_SMART,
> > > +                               NVME_AER_INFO_SMART_TEMP_THRESH,
> > > +                               NVME_LOG_SMART_INFO);
> > > +        }
> > > +
> > >          break;
> > >      case NVME_VOLATILE_WRITE_CACHE:
> > >          blk_set_enable_write_cache(n->conf.blk, dw11 & 1);
> > > @@ -1076,6 +1176,9 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
> > >          req->cqe.result = cpu_to_le32((n->params.max_ioqpairs - 1) |
> > >                                        ((n->params.max_ioqpairs - 1) << 16));
> > >          break;
> > > +    case NVME_ASYNCHRONOUS_EVENT_CONF:
> > > +        n->features.async_config = dw11;
> > > +        break;
> > >      case NVME_TIMESTAMP:
> > >          return nvme_set_feature_timestamp(n, cmd);
> > >      default:
> > > @@ -1085,6 +1188,25 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
> > >      return NVME_SUCCESS;
> > >  }
> > >  
> > > +static uint16_t nvme_aer(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
> > > +{
> > > +    trace_pci_nvme_aer(nvme_cid(req));
> > > +
> > > +    if (n->outstanding_aers > n->params.aerl) {
> > > +        trace_pci_nvme_aer_aerl_exceeded();
> > > +        return NVME_AER_LIMIT_EXCEEDED;
> > > +    }
> > > +
> > > +    n->aer_reqs[n->outstanding_aers] = req;
> > > +    n->outstanding_aers++;
> > > +
> > > +    if (!QTAILQ_EMPTY(&n->aer_queue)) {
> > > +        nvme_process_aers(n);
> > > +    }
> > > +
> > > +    return NVME_NO_COMPLETE;
> > > +}
> > 
> > Looks good so far
> > 
> > > +
> > >  static uint16_t nvme_admin_cmd(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
> > >  {
> > >      trace_pci_nvme_admin_cmd(nvme_cid(req), nvme_sqid(req), cmd->opcode);
> > > @@ -1108,6 +1230,8 @@ static uint16_t nvme_admin_cmd(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
> > >          return nvme_set_feature(n, cmd, req);
> > >      case NVME_ADM_CMD_GET_FEATURES:
> > >          return nvme_get_feature(n, cmd, req);
> > > +    case NVME_ADM_CMD_ASYNC_EV_REQ:
> > > +        return nvme_aer(n, cmd, req);
> > >      default:
> > >          trace_pci_nvme_err_invalid_admin_opc(cmd->opcode);
> > >          return NVME_INVALID_OPCODE | NVME_DNR;
> > > @@ -1162,6 +1286,15 @@ static void nvme_clear_ctrl(NvmeCtrl *n)
> > >          }
> > >      }
> > >  
> > > +    while (!QTAILQ_EMPTY(&n->aer_queue)) {
> > > +        NvmeAsyncEvent *event = QTAILQ_FIRST(&n->aer_queue);
> > > +        QTAILQ_REMOVE(&n->aer_queue, event, entry);
> > > +        g_free(event);
> > > +    }
> > > +
> > > +    n->aer_queued = 0;
> > > +    n->outstanding_aers = 0;
> > > +
> > >      blk_flush(n->conf.blk);
> > >      n->bar.cc = 0;
> > >  }
> > > @@ -1258,6 +1391,8 @@ static int nvme_start_ctrl(NvmeCtrl *n)
> > >  
> > >      nvme_set_timestamp(n, 0ULL);
> > >  
> > > +    QTAILQ_INIT(&n->aer_queue);
> > > +
> > >      return 0;
> > >  }
> > >  
> > > @@ -1479,6 +1614,13 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
> > >                             "completion queue doorbell write"
> > >                             " for nonexistent queue,"
> > >                             " sqid=%"PRIu32", ignoring", qid);
> > > +
> > > +            if (n->outstanding_aers) {
> > > +                nvme_enqueue_event(n, NVME_AER_TYPE_ERROR,
> > > +                                   NVME_AER_INFO_ERR_INVALID_DB_REGISTER,
> > > +                                   NVME_LOG_ERROR_INFO);
> > > +            }
> > To be honest, I would move the check for outstanding AERs into nvme_enqueue_event.
> > 
> > Also, the logic seems a bit off. The code checks that we have outstanding AER requests,
> > but we do have an internal AER queue for exactly this situation.
> > It seems that SMART events are generated without this check, but ERROR events only when
> > outstanding AERs exist.
> > Could you explain? I have probably forgotten something from the spec, which I haven't read for a long time.
> > 
> 
> I'm pretty sure this has been mentioned before, but I can't find it
> anywhere, maybe it was an internal review...
> 
> Anyway, I'm interpreting the AER logic as a special case for doorbell writes:
> 
> NVM Express v1.3d, Section 4.1 states: "If host software writes an
> invalid value to the Submission Queue Tail Doorbell or Completion Queue
> Head Doorbell register and an Asynchronous Event Request command is
> outstanding, then an asynchronous event is posted to the Admin
> Completion Queue with a status code of Invalid Doorbell Write Value."
Ah, they do indeed mention this. So let it stay like that; it probably doesn't
really matter anyway.
If you respin the patches, you could add the above as a comment explaining why this is done,
to avoid confusion.
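
For illustration, a minimal sketch of how the doorbell special case and the
requested comment could be folded into one helper. The SketchCtrl type and the
sketch_* names are hypothetical stand-ins, not the QEMU types, and the sketch
only counts queued events instead of posting completions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, stripped-down stand-in for NvmeCtrl; not the QEMU type. */
typedef struct {
    uint32_t outstanding_aers;  /* AER commands the host has posted */
    uint32_t aer_queued;        /* events waiting in the internal queue */
    uint32_t aer_max_queued;    /* cap on the internal queue */
} SketchCtrl;

/* Models the queue-full check of nvme_enqueue_event(); nothing is completed. */
static bool sketch_enqueue_event(SketchCtrl *n)
{
    if (n->aer_queued == n->aer_max_queued) {
        return false;
    }
    n->aer_queued++;
    return true;
}

/*
 * NVM Express v1.3d, Section 4.1: an invalid doorbell write only results in
 * an asynchronous event if an Asynchronous Event Request command is
 * outstanding. Unlike other events, it is not queued for later delivery.
 */
static bool sketch_invalid_doorbell_event(SketchCtrl *n)
{
    assert(n != NULL);
    if (!n->outstanding_aers) {
        return false;
    }
    return sketch_enqueue_event(n);
}
```

Keeping the check in a doorbell-specific helper leaves the generic enqueue
path free to queue SMART events even when no AER is outstanding.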

> 
> > > +
> > >              return;
> > >          }
> > >  
> > > @@ -1489,6 +1631,13 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
> > >                             " beyond queue size, sqid=%"PRIu32","
> > >                             " new_head=%"PRIu16", ignoring",
> > >                             qid, new_head);
> > > +
> > > +            if (n->outstanding_aers) {
> > > +                nvme_enqueue_event(n, NVME_AER_TYPE_ERROR,
> > > +                                   NVME_AER_INFO_ERR_INVALID_DB_VALUE,
> > > +                                   NVME_LOG_ERROR_INFO);
> > > +            }
> > > +
> > >              return;
> > >          }
> > >  
> > > @@ -1519,6 +1668,13 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
> > >                             "submission queue doorbell write"
> > >                             " for nonexistent queue,"
> > >                             " sqid=%"PRIu32", ignoring", qid);
> > > +
> > > +            if (n->outstanding_aers) {
> > > +                nvme_enqueue_event(n, NVME_AER_TYPE_ERROR,
> > > +                                   NVME_AER_INFO_ERR_INVALID_DB_REGISTER,
> > > +                                   NVME_LOG_ERROR_INFO);
> > > +            }
> > > +
> > >              return;
> > >          }
> > >  
> > > @@ -1529,6 +1685,13 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
> > >                             " beyond queue size, sqid=%"PRIu32","
> > >                             " new_tail=%"PRIu16", ignoring",
> > >                             qid, new_tail);
> > > +
> > > +            if (n->outstanding_aers) {
> > > +                nvme_enqueue_event(n, NVME_AER_TYPE_ERROR,
> > > +                                   NVME_AER_INFO_ERR_INVALID_DB_VALUE,
> > > +                                   NVME_LOG_ERROR_INFO);
> > > +            }
> > > +
> > >              return;
> > >          }
> > >  
> > > @@ -1650,6 +1813,7 @@ static void nvme_init_state(NvmeCtrl *n)
> > >      n->temperature = NVME_TEMPERATURE;
> > >      n->features.temp_thresh_hi = NVME_TEMPERATURE_WARNING;
> > >      n->starttime_ms = qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL);
> > > +    n->aer_reqs = g_new0(NvmeRequest *, n->params.aerl + 1);
> > >  }
> > >  
> > >  static void nvme_init_blk(NvmeCtrl *n, Error **errp)
> > > @@ -1805,6 +1969,7 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
> > >       * inconsequential.
> > >       */
> > >      id->acl = 3;
> > > +    id->aerl = n->params.aerl;
> > Name a tiny bit unclear. I know that this is from the spec but still.
> > 
> 
> Yes I know, but I really prefer the spec names if possible (makes it
> easy to look them up).
I understand. I only complained a bit about that name being exposed to
the outside as a device property. It doesn't matter that much to me though.
> 
> > >      id->frmw = (NVME_NUM_FW_SLOTS << 1) | NVME_FRMW_SLOT1_RO;
> > >      id->lpa = NVME_LPA_EXTENDED;
> > >  
> > > @@ -1879,6 +2044,7 @@ static void nvme_exit(PCIDevice *pci_dev)
> > >      g_free(n->namespaces);
> > >      g_free(n->cq);
> > >      g_free(n->sq);
> > > +    g_free(n->aer_reqs);
> > >  
> > >      if (n->params.cmb_size_mb) {
> > >          g_free(n->cmbuf);
> > > @@ -1899,6 +2065,8 @@ static Property nvme_props[] = {
> > >      DEFINE_PROP_UINT32("num_queues", NvmeCtrl, params.num_queues, 0),
> > >      DEFINE_PROP_UINT32("max_ioqpairs", NvmeCtrl, params.max_ioqpairs, 64),
> > >      DEFINE_PROP_UINT16("msix_qsize", NvmeCtrl, params.msix_qsize, 65),
> > > +    DEFINE_PROP_UINT8("aerl", NvmeCtrl, params.aerl, 3),
> > So this is the number of AERs that we allow the user to have outstanding
> 
> Yeah, and per the spec, 0's based.
> 
> > > +    DEFINE_PROP_UINT32("aer_max_queued", NvmeCtrl, params.aer_max_queued, 64),
> > And this is the number of AERs that we keep in our internal AER queue until the user posts an AER so that we
> > can complete it.
> > 
> 
> Correct.

Yep - this is what I understood after examining the whole patch, but it is hard to understand this from the names themselves.
Maybe add a comment next to each property to at least make it easier for an advanced user (e.g. a user who reads the code)
to understand?

(I often end up reading source to understand various qemu device parameters).
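
To make the meaning of the two properties concrete, here is a small model of
the limits as I understand them from the patch. The SketchParams type and the
sketch_* helpers are made up for illustration; only the field names and the
limit check mirror the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the two properties; field names mirror the patch. */
typedef struct {
    uint8_t  aerl;             /* 0's based: aerl = 3 allows 4 outstanding AERs */
    uint32_t aer_max_queued;   /* events kept internally while no AER is posted */
    uint32_t outstanding_aers; /* AER commands currently posted by the host */
} SketchParams;

/* Number of AER commands the host may have outstanding at once. */
static uint32_t sketch_max_outstanding(const SketchParams *p)
{
    assert(p != NULL);
    return (uint32_t)p->aerl + 1;
}

/* Mirrors the limit check in nvme_aer(): accept while outstanding <= aerl. */
static bool sketch_post_aer(SketchParams *p)
{
    assert(p != NULL);
    if (p->outstanding_aers > p->aerl) {
        return false; /* the device would return NVME_AER_LIMIT_EXCEEDED */
    }
    p->outstanding_aers++;
    return true;
}

/* An event can be queued only while the internal queue is below its cap. */
static bool sketch_can_queue_event(const SketchParams *p, uint32_t queued)
{
    assert(p != NULL);
    return queued < p->aer_max_queued;
}
```

With the defaults from the patch (aerl=3, aer_max_queued=64), this model
accepts four AER commands and rejects the fifth.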

Best regards,
	Maxim Levitsky



^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v3 12/18] hw/block/nvme: support the get/set features select and save fields
  2020-07-29 13:48     ` Klaus Jensen
@ 2020-07-29 18:47       ` Maxim Levitsky
  0 siblings, 0 replies; 60+ messages in thread
From: Maxim Levitsky @ 2020-07-29 18:47 UTC (permalink / raw)
  To: Klaus Jensen
  Cc: Kevin Wolf, qemu-block, Klaus Jensen, qemu-devel, Max Reitz,
	Keith Busch, Javier Gonzalez, Philippe Mathieu-Daudé

On Wed, 2020-07-29 at 15:48 +0200, Klaus Jensen wrote:
> On Jul 29 16:17, Maxim Levitsky wrote:
> > On Mon, 2020-07-06 at 08:12 +0200, Klaus Jensen wrote:
> > > From: Klaus Jensen <k.jensen@samsung.com>
> > > 
> > > Since the device does not have any persistent state storage, no
> > > features are "saveable" and setting the Save (SV) field in any Set
> > > Features command will result in a Feature Identifier Not Saveable status
> > > code.
> > > 
> > > Similarly, if the Select (SEL) field is set to request saved values, the
> > > devices will (as it should) return the default values instead.
> > > 
> > > Since this also introduces "Supported Capabilities", the nsid field is
> > > now also checked for validity wrt. the feature being get/set'ed.
> > > 
> > > Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> > > ---
> > >  hw/block/nvme.c       | 103 +++++++++++++++++++++++++++++++++++++-----
> > >  hw/block/trace-events |   4 +-
> > >  include/block/nvme.h  |  27 ++++++++++-
> > >  3 files changed, 119 insertions(+), 15 deletions(-)
> > > 
> > > diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> > > index 2d85e853403f..df8b786e4875 100644
> > > --- a/hw/block/nvme.c
> > > +++ b/hw/block/nvme.c
> > > @@ -1083,20 +1091,47 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
> > >  {
> > >      uint32_t dw10 = le32_to_cpu(cmd->cdw10);
> > >      uint32_t dw11 = le32_to_cpu(cmd->cdw11);
> > > +    uint32_t nsid = le32_to_cpu(cmd->nsid);
> > >      uint32_t result;
> > >      uint8_t fid = NVME_GETSETFEAT_FID(dw10);
> > > +    NvmeGetFeatureSelect sel = NVME_GETFEAT_SELECT(dw10);
> > >      uint16_t iv;
> > >  
> > >      static const uint32_t nvme_feature_default[NVME_FID_MAX] = {
> > >          [NVME_ARBITRATION] = NVME_ARB_AB_NOLIMIT,
> > >      };
> > >  
> > > -    trace_pci_nvme_getfeat(nvme_cid(req), fid, dw11);
> > > +    trace_pci_nvme_getfeat(nvme_cid(req), fid, sel, dw11);
> > >  
> > >      if (!nvme_feature_support[fid]) {
> > >          return NVME_INVALID_FIELD | NVME_DNR;
> > >      }
> > >  
> > > +    if (nvme_feature_cap[fid] & NVME_FEAT_CAP_NS) {
> > > +        if (!nsid || nsid > n->num_namespaces) {
> > > +            /*
> > > +             * The Reservation Notification Mask and Reservation Persistence
> > > +             * features require a status code of Invalid Field in Command when
> > > +             * NSID is 0xFFFFFFFF. Since the device does not support those
> > > +             * features we can always return Invalid Namespace or Format as we
> > > +             * should do for all other features.
> > > +             */
> > > +            return NVME_INVALID_NSID | NVME_DNR;
> > > +        }
> > > +    }
> > > +
> > > +    switch (sel) {
> > > +    case NVME_GETFEAT_SELECT_CURRENT:
> > > +        break;
> > > +    case NVME_GETFEAT_SELECT_SAVED:
> > > +        /* no features are saveable by the controller; fallthrough */
> > > +    case NVME_GETFEAT_SELECT_DEFAULT:
> > > +        goto defaults;
> > 
> > > I hate to say it, but while I have nothing against using 'goto' (unlike some types I have met),
> > > in this particular case it feels like it would be better to have a separate function for
> > > defaults, or even to have a separate function per feature and have it return the current/default/saved/whatever
> > > value. The latter would allow each feature to be self-contained in its own function.
> > > 
> > > But on the other hand, I see that you fall back to defaults for unchangeable features, which does make
> > > sense. In other words, I don't have a strong opinion against using goto here after all.
> > > 
> > > When the feature code gets more features in the future (pun intended), you will probably have to split it
> > > like I suggest, to keep code complexity low.
> > 
> 
> Argh... I know you are right.
> 
> Since you are "accepting" the current state with your R-b and it already
> carries one from Dmitry I think I'll let this stay for now, but I will
> fix this in a follow up patch for sure.
Yep, this is exactly what I was thinking.
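
For the follow-up, something along these lines is roughly what I have in mind.
This is only a sketch: the SK_* ids, the default values and the sketch_* names
are invented for illustration and are not the identifiers or values used in
nvme.c:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature ids and selectors; the values are illustrative only. */
enum { SK_FID_ARBITRATION = 1, SK_FID_TEMP_THRESH = 4, SK_FID_MAX = 8 };
typedef enum { SK_SEL_CURRENT, SK_SEL_DEFAULT, SK_SEL_SAVED } SketchSelect;

typedef struct {
    uint32_t current[SK_FID_MAX]; /* current value per feature id */
} SketchFeatures;

/* Illustrative defaults; not the values used by the device. */
static const uint32_t sk_default[SK_FID_MAX] = {
    [SK_FID_ARBITRATION] = 7,
    [SK_FID_TEMP_THRESH] = 0x157,
};

/* All default lookups live in one function, so no goto is needed. */
static uint32_t sketch_feature_default(uint8_t fid)
{
    assert(fid < SK_FID_MAX);
    return sk_default[fid];
}

static uint32_t sketch_get_feature(const SketchFeatures *f, uint8_t fid,
                                   SketchSelect sel)
{
    assert(f != NULL && fid < SK_FID_MAX);

    switch (sel) {
    case SK_SEL_CURRENT:
        return f->current[fid];
    case SK_SEL_SAVED: /* nothing is saveable, so saved means default */
    case SK_SEL_DEFAULT:
    default:
        return sketch_feature_default(fid);
    }
}
```

A per-feature variant would replace the table lookup with one small function
per feature id, at the cost of a bit more boilerplate.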

Best regards,
	Maxim Levitsky

> 
> > > @@ -926,6 +949,8 @@ typedef struct NvmeLBAF {
> > >      uint8_t     rp;
> > >  } NvmeLBAF;
> > >  
> > > +#define NVME_NSID_BROADCAST 0xffffffff
> > 
> > Cool, you probably want eventually to go over code and
> > change all places that use the number to the define.
> > (No need to do this now)
> > 
> 
> True. Noted :)
> 




^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v3 08/18] hw/block/nvme: add support for the asynchronous event request command
  2020-07-29 18:45       ` Maxim Levitsky
@ 2020-07-29 20:08         ` Klaus Jensen
  2020-07-30  8:50           ` Maxim Levitsky
  0 siblings, 1 reply; 60+ messages in thread
From: Klaus Jensen @ 2020-07-29 20:08 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: Kevin Wolf, qemu-block, Klaus Jensen, qemu-devel, Max Reitz,
	Keith Busch, Javier Gonzalez, Philippe Mathieu-Daudé

On Jul 29 21:45, Maxim Levitsky wrote:
> On Wed, 2020-07-29 at 15:37 +0200, Klaus Jensen wrote:
> > On Jul 29 13:43, Maxim Levitsky wrote:
> > > On Mon, 2020-07-06 at 08:12 +0200, Klaus Jensen wrote:
> > > > +    DEFINE_PROP_UINT8("aerl", NvmeCtrl, params.aerl, 3),
> > > So this is number of AERs that we allow the user to be outstanding
> > 
> > Yeah, and per the spec, 0's based.
> > 
> > > > +    DEFINE_PROP_UINT32("aer_max_queued", NvmeCtrl, params.aer_max_queued, 64),
> > > And this is the number of AERs that we keep in our internal AER queue untill user posts and AER so that we
> > > can complete it.
> > > 
> > 
> > Correct.
> 
> Yep - this is what I understood after examining all of the patch, but from the names itself it is hard to understand this.
> Maybe a comment next to property to at least make it easier for advanced user (e.g user that reads code)
> to understand?
> 
> (I often end up reading source to understand various qemu device parameters).
> 

I should add this in docs/specs/nvme.txt (which shows up in one of my
next series when I add a new PCI id for the device). For now, I will add
it to the top of the file like the rest of the parameters.

Subsequent series contain a lot more additions of new parameters that
come directly from the spec, and to me it really only makes sense that
they share the spec names if they can.

We could consider having them under a "spec namespace"? So, say, we do
DEFINE_PROP_UINT("spec.aerl", ...)?


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v3 08/18] hw/block/nvme: add support for the asynchronous event request command
  2020-07-29 20:08         ` Klaus Jensen
@ 2020-07-30  8:50           ` Maxim Levitsky
  0 siblings, 0 replies; 60+ messages in thread
From: Maxim Levitsky @ 2020-07-30  8:50 UTC (permalink / raw)
  To: Klaus Jensen
  Cc: Kevin Wolf, qemu-block, Klaus Jensen, qemu-devel, Max Reitz,
	Keith Busch, Javier Gonzalez, Philippe Mathieu-Daudé

On Wed, 2020-07-29 at 22:08 +0200, Klaus Jensen wrote:
> On Jul 29 21:45, Maxim Levitsky wrote:
> > On Wed, 2020-07-29 at 15:37 +0200, Klaus Jensen wrote:
> > > On Jul 29 13:43, Maxim Levitsky wrote:
> > > > On Mon, 2020-07-06 at 08:12 +0200, Klaus Jensen wrote:
> > > > > +    DEFINE_PROP_UINT8("aerl", NvmeCtrl, params.aerl, 3),
> > > > So this is number of AERs that we allow the user to be outstanding
> > > 
> > > Yeah, and per the spec, 0's based.
> > > 
> > > > > +    DEFINE_PROP_UINT32("aer_max_queued", NvmeCtrl, params.aer_max_queued, 64),
> > > > And this is the number of AERs that we keep in our internal AER queue untill user posts and AER so that we
> > > > can complete it.
> > > > 
> > > 
> > > Correct.
> > 
> > Yep - this is what I understood after examining all of the patch, but from the names itself it is hard to understand this.
> > Maybe a comment next to property to at least make it easier for advanced user (e.g user that reads code)
> > to understand?
> > 
> > (I often end up reading source to understand various qemu device parameters).
> > 
> 
> I should add this in docs/specs/nvme.txt (which shows up in one of my
> next series when I add a new PCI id for the device). For now, I will add
> it to the top of the file like the rest of the parameters.
This is a good idea!
> 
> Subsequent series contains a lot more additions of new parameters that
> is directly from the spec and to me it really only makes sense that they
> share the names if they can.
> 
> We could consider having them under a "spec namespace"? So, say, we do
> DEFINE_PROP_UINT("spec.aerl", ...)?
I personally tend to think that it won't make it much more readable.

Best regards,
	Maxim Levitsky
> 




^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v3 07/18] hw/block/nvme: add support for the get log page command
  2020-07-06  6:12 ` [PATCH v3 07/18] hw/block/nvme: add support for the get log page command Klaus Jensen
  2020-07-08 19:22   ` Dmitry Fomichev
  2020-07-29 10:24   ` Maxim Levitsky
@ 2020-09-29 13:11   ` Peter Maydell
  2020-09-29 21:46     ` Klaus Jensen
  2 siblings, 1 reply; 60+ messages in thread
From: Peter Maydell @ 2020-09-29 13:11 UTC (permalink / raw)
  To: Klaus Jensen
  Cc: Kevin Wolf, Qemu-block, Dmitry Fomichev, Klaus Jensen,
	QEMU Developers, Max Reitz, Keith Busch, Javier Gonzalez,
	Maxim Levitsky, Philippe Mathieu-Daudé

On Mon, 6 Jul 2020 at 07:15, Klaus Jensen <its@irrelevant.dk> wrote:
>
> From: Klaus Jensen <k.jensen@samsung.com>
>
> Add support for the Get Log Page command and basic implementations of
> the mandatory Error Information, SMART / Health Information and Firmware
> Slot Information log pages.
>
> In violation of the specification, the SMART / Health Information log
> page does not persist information over the lifetime of the controller
> because the device has no place to store such persistent state.
>
> Note that the LPA field in the Identify Controller data structure
> intentionally has bit 0 cleared because there is no namespace specific
> information in the SMART / Health information log page.
>
> Required for compliance with NVMe revision 1.3d. See NVM Express 1.3d,
> Section 5.14 ("Get Log Page command").

Hi; Coverity reports a potential issue in this code
(CID 1432413):

> +static uint16_t nvme_smart_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
> +                                uint64_t off, NvmeRequest *req)
> +{
> +    uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
> +    uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
> +    uint32_t nsid = le32_to_cpu(cmd->nsid);
> +
> +    uint32_t trans_len;
> +    time_t current_ms;
> +    uint64_t units_read = 0, units_written = 0;
> +    uint64_t read_commands = 0, write_commands = 0;
> +    NvmeSmartLog smart;
> +    BlockAcctStats *s;
> +
> +    if (nsid && nsid != 0xffffffff) {
> +        return NVME_INVALID_FIELD | NVME_DNR;
> +    }
> +
> +    s = blk_get_stats(n->conf.blk);
> +
> +    units_read = s->nr_bytes[BLOCK_ACCT_READ] >> BDRV_SECTOR_BITS;
> +    units_written = s->nr_bytes[BLOCK_ACCT_WRITE] >> BDRV_SECTOR_BITS;
> +    read_commands = s->nr_ops[BLOCK_ACCT_READ];
> +    write_commands = s->nr_ops[BLOCK_ACCT_WRITE];
> +
> +    if (off > sizeof(smart)) {
> +        return NVME_INVALID_FIELD | NVME_DNR;
> +    }

Here we check for off > sizeof(smart), which means that we allow
off == sizeof(smart)...

> +
> +    trans_len = MIN(sizeof(smart) - off, buf_len);

> +    return nvme_dma_read_prp(n, (uint8_t *) &smart + off, trans_len, prp1,
> +                             prp2);

...in which case the pointer we pass to nvme_dma_read_prp() will
be off the end of the 'smart' object.

Now we are passing 0 as the trans_len, so I *think* this function
will not actually read the buffer (Coverity is not smart
enough to see this); so I could just close the Coverity issue as
a false-positive. But maybe there is a clearer-to-humans as well
as clearer-to-Coverity way to write this. What do you think?

> +static uint16_t nvme_fw_log_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
> +                                 uint64_t off, NvmeRequest *req)
> +{
> +    uint32_t trans_len;
> +    uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
> +    uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
> +    NvmeFwSlotInfoLog fw_log = {
> +        .afi = 0x1,
> +    };
> +
> +    strpadcpy((char *)&fw_log.frs1, sizeof(fw_log.frs1), "1.0", ' ');
> +
> +    if (off > sizeof(fw_log)) {
> +        return NVME_INVALID_FIELD | NVME_DNR;
> +    }
> +
> +    trans_len = MIN(sizeof(fw_log) - off, buf_len);
> +
> +    return nvme_dma_read_prp(n, (uint8_t *) &fw_log + off, trans_len, prp1,
> +                             prp2);

Coverity warns about the same structure here (CID 1432411).

thanks
-- PMM


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v3 07/18] hw/block/nvme: add support for the get log page command
  2020-09-29 13:11   ` Peter Maydell
@ 2020-09-29 21:46     ` Klaus Jensen
  2020-09-29 22:34       ` Keith Busch
  0 siblings, 1 reply; 60+ messages in thread
From: Klaus Jensen @ 2020-09-29 21:46 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Kevin Wolf, Qemu-block, Dmitry Fomichev, Klaus Jensen,
	QEMU Developers, Max Reitz, Keith Busch, Javier Gonzalez,
	Maxim Levitsky, Philippe Mathieu-Daudé

[-- Attachment #1: Type: text/plain, Size: 4525 bytes --]

On Sep 29 14:11, Peter Maydell wrote:
> On Mon, 6 Jul 2020 at 07:15, Klaus Jensen <its@irrelevant.dk> wrote:
> >
> > From: Klaus Jensen <k.jensen@samsung.com>
> >
> > Add support for the Get Log Page command and basic implementations of
> > the mandatory Error Information, SMART / Health Information and Firmware
> > Slot Information log pages.
> >
> > In violation of the specification, the SMART / Health Information log
> > page does not persist information over the lifetime of the controller
> > because the device has no place to store such persistent state.
> >
> > Note that the LPA field in the Identify Controller data structure
> > intentionally has bit 0 cleared because there is no namespace specific
> > information in the SMART / Health information log page.
> >
> > Required for compliance with NVMe revision 1.3d. See NVM Express 1.3d,
> > Section 5.14 ("Get Log Page command").
> 
> Hi; Coverity reports a potential issue in this code
> (CID 1432413):
> 
> > +static uint16_t nvme_smart_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
> > +                                uint64_t off, NvmeRequest *req)
> > +{
> > +    uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
> > +    uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
> > +    uint32_t nsid = le32_to_cpu(cmd->nsid);
> > +
> > +    uint32_t trans_len;
> > +    time_t current_ms;
> > +    uint64_t units_read = 0, units_written = 0;
> > +    uint64_t read_commands = 0, write_commands = 0;
> > +    NvmeSmartLog smart;
> > +    BlockAcctStats *s;
> > +
> > +    if (nsid && nsid != 0xffffffff) {
> > +        return NVME_INVALID_FIELD | NVME_DNR;
> > +    }
> > +
> > +    s = blk_get_stats(n->conf.blk);
> > +
> > +    units_read = s->nr_bytes[BLOCK_ACCT_READ] >> BDRV_SECTOR_BITS;
> > +    units_written = s->nr_bytes[BLOCK_ACCT_WRITE] >> BDRV_SECTOR_BITS;
> > +    read_commands = s->nr_ops[BLOCK_ACCT_READ];
> > +    write_commands = s->nr_ops[BLOCK_ACCT_WRITE];
> > +
> > +    if (off > sizeof(smart)) {
> > +        return NVME_INVALID_FIELD | NVME_DNR;
> > +    }
> 
> Here we check for off > sizeof(smart), which means that we allow
> off == sizeof(smart)...
> 
> > +
> > +    trans_len = MIN(sizeof(smart) - off, buf_len);
> 
> > +    return nvme_dma_read_prp(n, (uint8_t *) &smart + off, trans_len, prp1,
> > +                             prp2);
> 
> ...in which case the pointer we pass to nvme_dma_read_prp() will
> be off the end of the 'smart' object.
> 
> Now we are passing 0 as the trans_len, so I *think* this function
> will not actually read the buffer (Coverity is not smart
> enough to see this); so I could just close the Coverity issue as
> a false-positive. But maybe there is a clearer-to-humans as well
> as clearer-to-Coverity way to write this. What do you think ?
> 
> > +static uint16_t nvme_fw_log_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
> > +                                 uint64_t off, NvmeRequest *req)
> > +{
> > +    uint32_t trans_len;
> > +    uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
> > +    uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
> > +    NvmeFwSlotInfoLog fw_log = {
> > +        .afi = 0x1,
> > +    };
> > +
> > +    strpadcpy((char *)&fw_log.frs1, sizeof(fw_log.frs1), "1.0", ' ');
> > +
> > +    if (off > sizeof(fw_log)) {
> > +        return NVME_INVALID_FIELD | NVME_DNR;
> > +    }
> > +
> > +    trans_len = MIN(sizeof(fw_log) - off, buf_len);
> > +
> > +    return nvme_dma_read_prp(n, (uint8_t *) &fw_log + off, trans_len, prp1,
> > +                             prp2);
> 
> Coverity warns about the same structure here (CID 1432411).
> 
> thanks
> -- PMM

Hi Peter,

Thanks. This is somewhere in the middle of a bunch of patches I got
merged I think, commit 94a7897c41db? I just requested Coverity access.

What happens is that nvme_dma_read_prp will call into nvme_map_prp which
won't map anything because len is 0. This will cause the statically
allocated QEMUSGList and QEMUIOVector in the request to be
uninitialized. Returning from nvme_map_prp, nvme_dma_read_prp will
notice that req->qsg.nsg is zero so it will default to the iov and move
into qemu_iovec_{to,from}_buf(&req->iov, ...). In there we actually pass
the NULL struct iovec, but since there is a __builtin_constant_p(bytes)
condition at the end of it all, we never follow it.

Not "serious" I think, but definitely not good. We will of course fix
this up.

@keith, do you agree with my analysis?

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v3 07/18] hw/block/nvme: add support for the get log page command
  2020-09-29 21:46     ` Klaus Jensen
@ 2020-09-29 22:34       ` Keith Busch
  2020-09-29 22:42         ` Klaus Jensen
  0 siblings, 1 reply; 60+ messages in thread
From: Keith Busch @ 2020-09-29 22:34 UTC (permalink / raw)
  To: Klaus Jensen
  Cc: Kevin Wolf, Peter Maydell, Qemu-block, Dmitry Fomichev,
	Klaus Jensen, QEMU Developers, Max Reitz, Javier Gonzalez,
	Maxim Levitsky, Philippe Mathieu-Daudé

On Tue, Sep 29, 2020 at 11:46:00PM +0200, Klaus Jensen wrote:
> On Sep 29 14:11, Peter Maydell wrote:
> > > +static uint16_t nvme_fw_log_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
> > > +                                 uint64_t off, NvmeRequest *req)
> > > +{
> > > +    uint32_t trans_len;
> > > +    uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
> > > +    uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
> > > +    NvmeFwSlotInfoLog fw_log = {
> > > +        .afi = 0x1,
> > > +    };
> > > +
> > > +    strpadcpy((char *)&fw_log.frs1, sizeof(fw_log.frs1), "1.0", ' ');
> > > +
> > > +    if (off > sizeof(fw_log)) {
> > > +        return NVME_INVALID_FIELD | NVME_DNR;
> > > +    }
> > > +
> > > +    trans_len = MIN(sizeof(fw_log) - off, buf_len);
> > > +
> > > +    return nvme_dma_read_prp(n, (uint8_t *) &fw_log + off, trans_len, prp1,
> > > +                             prp2);
> > 
> > Coverity warns about the same structure here (CID 1432411).
> > 
> > thanks
> > -- PMM
> 
> Hi Peter,
> 
> Thanks. This is somewhere in the middle of a bunch of patches I got
> merged I think, commit 94a7897c41db? I just requested Coverity access.
> 
> What happens is that nvme_dma_read_prp will call into nvme_map_prp which
> won't map anything because len is 0. This will cause the statically
> allocated QEMUSGList and QEMUIOVector in the request to be
> uninitialized. Returning from nvme_map_prp, nvme_dma_read_prp will
> notice that req->qsg.nsg is zero so it will default to the iov and move
> into qemu_iovec_{to,from}_buf(&req->iov, ...). In there we actually pass
> the NULL struct iovec, but since there is a __builtin_constant_p(bytes)
> condition at the end of it all, we never follow it.
> 
> Not "serious" I think, but definitely not good. We will of course fix
> this up.
> 
> @keith, do you agree with my analysis?

Yeah, looks safe as-is, but we're missing out on returning the spec
required 'Invalid Field'.


* Re: [PATCH v3 07/18] hw/block/nvme: add support for the get log page command
  2020-09-29 22:34       ` Keith Busch
@ 2020-09-29 22:42         ` Klaus Jensen
  2020-09-29 22:57           ` Keith Busch
  0 siblings, 1 reply; 60+ messages in thread
From: Klaus Jensen @ 2020-09-29 22:42 UTC (permalink / raw)
  To: Keith Busch
  Cc: Kevin Wolf, Peter Maydell, Qemu-block, Dmitry Fomichev,
	Klaus Jensen, QEMU Developers, Max Reitz, Javier Gonzalez,
	Maxim Levitsky, Philippe Mathieu-Daudé

On Sep 29 15:34, Keith Busch wrote:
> On Tue, Sep 29, 2020 at 11:46:00PM +0200, Klaus Jensen wrote:
> > On Sep 29 14:11, Peter Maydell wrote:
> > > > +static uint16_t nvme_fw_log_info(NvmeCtrl *n, NvmeCmd *cmd, uint32_t buf_len,
> > > > +                                 uint64_t off, NvmeRequest *req)
> > > > +{
> > > > +    uint32_t trans_len;
> > > > +    uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1);
> > > > +    uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2);
> > > > +    NvmeFwSlotInfoLog fw_log = {
> > > > +        .afi = 0x1,
> > > > +    };
> > > > +
> > > > +    strpadcpy((char *)&fw_log.frs1, sizeof(fw_log.frs1), "1.0", ' ');
> > > > +
> > > > +    if (off > sizeof(fw_log)) {
> > > > +        return NVME_INVALID_FIELD | NVME_DNR;
> > > > +    }
> > > > +
> > > > +    trans_len = MIN(sizeof(fw_log) - off, buf_len);
> > > > +
> > > > +    return nvme_dma_read_prp(n, (uint8_t *) &fw_log + off, trans_len, prp1,
> > > > +                             prp2);
> > > 
> > > Coverity warns about the same structure here (CID 1432411).
> > > 
> > > thanks
> > > -- PMM
> > 
> > Hi Peter,
> > 
> > Thanks. This is somewhere in the middle of a bunch of patches I got
> > merged I think, commit 94a7897c41db? I just requested Coverity access.
> > 
> > What happens is that nvme_dma_read_prp will call into nvme_map_prp which
> > > won't map anything because len is 0. This will cause the statically
> > allocated QEMUSGList and QEMUIOVector in the request to be
> > uninitialized. Returning from nvme_map_prp, nvme_dma_read_prp will
> > notice that req->qsg.nsg is zero so it will default to the iov and move
> > into qemu_iovec_{to,from}_buf(&req->iov, ...). In there we actually pass
> > the NULL struct iovec, but since there is a __builtin_constant_p(bytes)
> > condition at the end of it all, we never follow it.
> > 
> > Not "serious" I think, but definitely not good. We will of course fix
> > this up.
> > 
> > @keith, do you agree with my analysis?
> 
> Yeah, looks safe as-is, but we're missing out on returning the spec
> required 'Invalid Field'.

I can't see where it says that we should do that? Invalid Field in
Command if offset is *greater* than the size of the log page.

Some dynamic log pages have side-effects of being read, so while this is
a super weird way of specifying that we want nothing returned, I think
it is valid?

* Re: [PATCH v3 07/18] hw/block/nvme: add support for the get log page command
  2020-09-29 22:42         ` Klaus Jensen
@ 2020-09-29 22:57           ` Keith Busch
  0 siblings, 0 replies; 60+ messages in thread
From: Keith Busch @ 2020-09-29 22:57 UTC (permalink / raw)
  To: Klaus Jensen
  Cc: Kevin Wolf, Peter Maydell, Qemu-block, Dmitry Fomichev,
	Klaus Jensen, QEMU Developers, Max Reitz, Javier Gonzalez,
	Maxim Levitsky, Philippe Mathieu-Daudé

On Wed, Sep 30, 2020 at 12:42:48AM +0200, Klaus Jensen wrote:
> On Sep 29 15:34, Keith Busch wrote:
> > Yeah, looks safe as-is, but we're missing out on returning the spec
> > required 'Invalid Field'.
> 
> I can't see where it says that we should do that? Invalid Field in
> Command if offset is *greater* than the size of the log page.
> 
> Some dynamic log pages have side-effects of being read, so while this is
> a super weird way of specifying that we want nothing returned, I think
> it is valid?

Eh, when the spec says "size of the log page", I assume they're using the
"zeroes based" definition for size as aligned with the NUMD field. So
512 is bigger than the size of the smart log occupying bytes 0-511.

But I guess there's room to see it the other way, so maybe it is a
way to request a no data log.
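
The two readings can be put side by side in a small standalone sketch.
This is only an illustration of the comparison being debated, not QEMU
code: rejects_literal(), rejects_zeroes_based() and
compare_readings_example() are hypothetical names, and the 512-byte log
size is taken from the smart log example above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Literal reading: Invalid Field only if the offset exceeds the byte count. */
bool rejects_literal(uint64_t off, uint64_t log_size)
{
    return off > log_size;
}

/*
 * "Zeroes based" reading: the size is treated as the last valid byte
 * index (log_size - 1), so off > log_size - 1, written as off >= log_size
 * to avoid underflow when log_size is 0.
 */
bool rejects_zeroes_based(uint64_t off, uint64_t log_size)
{
    return off >= log_size;
}

/* The readings differ only at off == log_size. */
void compare_readings_example(void)
{
    const uint64_t log_size = 512; /* bytes 0-511 */

    assert(!rejects_literal(511, log_size) && !rejects_zeroes_based(511, log_size));
    assert(!rejects_literal(512, log_size) &&  rejects_zeroes_based(512, log_size));
    assert( rejects_literal(513, log_size) &&  rejects_zeroes_based(513, log_size));
}
```

Under the literal reading an offset of 512 is accepted and yields a
zero-length transfer; under the zeroes based reading it is an Invalid
Field.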


Thread overview: 60+ messages
2020-07-06  6:12 [PATCH v3 00/18] hw/block/nvme: bump to v1.3 Klaus Jensen
2020-07-06  6:12 ` [PATCH v3 01/18] hw/block/nvme: bump spec data structures " Klaus Jensen
2020-07-08 19:19   ` Dmitry Fomichev
2020-07-08 21:24     ` Klaus Jensen
2020-07-08 21:47       ` Dmitry Fomichev
2020-07-09  6:17         ` Klaus Jensen
2020-07-06  6:12 ` [PATCH v3 02/18] hw/block/nvme: fix missing endian conversion Klaus Jensen
2020-07-06  9:50   ` Philippe Mathieu-Daudé
2020-07-08 19:20   ` Dmitry Fomichev
2020-07-29  8:49   ` Maxim Levitsky
2020-07-06  6:12 ` [PATCH v3 03/18] hw/block/nvme: additional tracing Klaus Jensen
2020-07-06  9:50   ` Philippe Mathieu-Daudé
2020-07-08 19:21   ` Dmitry Fomichev
2020-07-29  8:52   ` Maxim Levitsky
2020-07-06  6:12 ` [PATCH v3 04/18] hw/block/nvme: add support for the abort command Klaus Jensen
2020-07-06  6:12 ` [PATCH v3 05/18] hw/block/nvme: add temperature threshold feature Klaus Jensen
2020-07-08 19:24   ` Dmitry Fomichev
2020-07-06  6:12 ` [PATCH v3 06/18] hw/block/nvme: mark fw slot 1 as read-only Klaus Jensen
2020-07-29  9:14   ` Maxim Levitsky
2020-07-06  6:12 ` [PATCH v3 07/18] hw/block/nvme: add support for the get log page command Klaus Jensen
2020-07-08 19:22   ` Dmitry Fomichev
2020-07-29 10:24   ` Maxim Levitsky
2020-07-29 11:44     ` Klaus Jensen
2020-07-29 18:35       ` Maxim Levitsky
2020-09-29 13:11   ` Peter Maydell
2020-09-29 21:46     ` Klaus Jensen
2020-09-29 22:34       ` Keith Busch
2020-09-29 22:42         ` Klaus Jensen
2020-09-29 22:57           ` Keith Busch
2020-07-06  6:12 ` [PATCH v3 08/18] hw/block/nvme: add support for the asynchronous event request command Klaus Jensen
2020-07-29 10:43   ` Maxim Levitsky
2020-07-29 13:37     ` Klaus Jensen
2020-07-29 18:45       ` Maxim Levitsky
2020-07-29 20:08         ` Klaus Jensen
2020-07-30  8:50           ` Maxim Levitsky
2020-07-06  6:12 ` [PATCH v3 09/18] hw/block/nvme: move NvmeFeatureVal into hw/block/nvme.h Klaus Jensen
2020-07-29 10:46   ` Maxim Levitsky
2020-07-06  6:12 ` [PATCH v3 10/18] hw/block/nvme: flush write cache when disabled Klaus Jensen
2020-07-29 11:03   ` Maxim Levitsky
2020-07-06  6:12 ` [PATCH v3 11/18] hw/block/nvme: add remaining mandatory controller parameters Klaus Jensen
2020-07-29 11:31   ` Maxim Levitsky
2020-07-06  6:12 ` [PATCH v3 12/18] hw/block/nvme: support the get/set features select and save fields Klaus Jensen
2020-07-08 19:25   ` Dmitry Fomichev
2020-07-29 13:17   ` Maxim Levitsky
2020-07-29 13:48     ` Klaus Jensen
2020-07-29 18:47       ` Maxim Levitsky
2020-07-06  6:12 ` [PATCH v3 13/18] hw/block/nvme: make sure ncqr and nsqr is valid Klaus Jensen
2020-07-06  6:12 ` [PATCH v3 14/18] hw/block/nvme: support identify namespace descriptor list Klaus Jensen
2020-07-29 13:25   ` Maxim Levitsky
2020-07-06  6:13 ` [PATCH v3 15/18] hw/block/nvme: reject invalid nsid values in active namespace id list Klaus Jensen
2020-07-06  9:47   ` Philippe Mathieu-Daudé
2020-07-08 19:26   ` Dmitry Fomichev
2020-07-29 13:27   ` Maxim Levitsky
2020-07-06  6:13 ` [PATCH v3 16/18] hw/block/nvme: enforce valid queue creation sequence Klaus Jensen
2020-07-06  6:13 ` [PATCH v3 17/18] hw/block/nvme: provide the mandatory subnqn field Klaus Jensen
2020-07-06  9:47   ` Philippe Mathieu-Daudé
2020-07-08 19:26   ` Dmitry Fomichev
2020-07-29 13:34   ` Maxim Levitsky
2020-07-06  6:13 ` [PATCH v3 18/18] hw/block/nvme: bump supported version to v1.3 Klaus Jensen
2020-07-20  9:13 ` [PATCH v3 00/18] hw/block/nvme: bump " Klaus Jensen
