qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/6] block/nvme: Map doorbells pages write-only, remove magic from nvme_init
@ 2020-09-21 16:29 Philippe Mathieu-Daudé
  2020-09-21 16:29 ` [PATCH 1/6] util/vfio-helpers: Pass page protections to qemu_vfio_pci_map_bar() Philippe Mathieu-Daudé
                   ` (7 more replies)
  0 siblings, 8 replies; 14+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-09-21 16:29 UTC (permalink / raw)
  To: Stefan Hajnoczi, qemu-devel
  Cc: Kevin Wolf, Fam Zheng, Philippe Mathieu-Daudé,
	qemu-block, Max Reitz

Instead of mapping 8K of I/O + doorbells as RW during the whole
execution, maps I/O temporarly at init, and map doorbells WO.

Replace various magic values by slighly more explicit macros from
"block/nvme.h".

Philippe Mathieu-Daudé (6):
  util/vfio-helpers: Pass page protections to qemu_vfio_pci_map_bar()
  block/nvme: Map doorbells pages write-only
  block/nvme: Reduce I/O registers scope
  block/nvme: Drop NVMeRegs structure, directly use NvmeBar
  block/nvme: Use register definitions from 'block/nvme.h'
  block/nvme: Replace magic value by SCALE_MS definition

 include/qemu/vfio-helpers.h |  2 +-
 block/nvme.c                | 73 +++++++++++++++++++++----------------
 util/vfio-helpers.c         |  4 +-
 3 files changed, 44 insertions(+), 35 deletions(-)

-- 
2.26.2



^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/6] util/vfio-helpers: Pass page protections to qemu_vfio_pci_map_bar()
  2020-09-21 16:29 [PATCH 0/6] block/nvme: Map doorbells pages write-only, remove magic from nvme_init Philippe Mathieu-Daudé
@ 2020-09-21 16:29 ` Philippe Mathieu-Daudé
  2020-09-21 16:29 ` [PATCH 2/6] block/nvme: Map doorbells pages write-only Philippe Mathieu-Daudé
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-09-21 16:29 UTC (permalink / raw)
  To: Stefan Hajnoczi, qemu-devel
  Cc: Kevin Wolf, Fam Zheng, Philippe Mathieu-Daudé,
	qemu-block, Max Reitz

Pages are currently mapped READ/WRITE. To be able to use different
protections, add a new argument to qemu_vfio_pci_map_bar().

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
---
 include/qemu/vfio-helpers.h | 2 +-
 block/nvme.c                | 3 ++-
 util/vfio-helpers.c         | 4 ++--
 3 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/include/qemu/vfio-helpers.h b/include/qemu/vfio-helpers.h
index 1f057c2b9e4..4491c8e1a6e 100644
--- a/include/qemu/vfio-helpers.h
+++ b/include/qemu/vfio-helpers.h
@@ -22,7 +22,7 @@ int qemu_vfio_dma_map(QEMUVFIOState *s, void *host, size_t size,
 int qemu_vfio_dma_reset_temporary(QEMUVFIOState *s);
 void qemu_vfio_dma_unmap(QEMUVFIOState *s, void *host);
 void *qemu_vfio_pci_map_bar(QEMUVFIOState *s, int index,
-                            uint64_t offset, uint64_t size,
+                            uint64_t offset, uint64_t size, int prot,
                             Error **errp);
 void qemu_vfio_pci_unmap_bar(QEMUVFIOState *s, int index, void *bar,
                              uint64_t offset, uint64_t size);
diff --git a/block/nvme.c b/block/nvme.c
index f4f27b6da7d..5a4dc6a722a 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -712,7 +712,8 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
         goto out;
     }
 
-    s->regs = qemu_vfio_pci_map_bar(s->vfio, 0, 0, NVME_BAR_SIZE, errp);
+    s->regs = qemu_vfio_pci_map_bar(s->vfio, 0, 0, NVME_BAR_SIZE,
+                                    PROT_READ | PROT_WRITE, errp);
     if (!s->regs) {
         ret = -EINVAL;
         goto out;
diff --git a/util/vfio-helpers.c b/util/vfio-helpers.c
index 583bdfb36fc..9ac307e3d42 100644
--- a/util/vfio-helpers.c
+++ b/util/vfio-helpers.c
@@ -146,13 +146,13 @@ static int qemu_vfio_pci_init_bar(QEMUVFIOState *s, int index, Error **errp)
  * Map a PCI bar area.
  */
 void *qemu_vfio_pci_map_bar(QEMUVFIOState *s, int index,
-                            uint64_t offset, uint64_t size,
+                            uint64_t offset, uint64_t size, int prot,
                             Error **errp)
 {
     void *p;
     assert_bar_index_valid(s, index);
     p = mmap(NULL, MIN(size, s->bar_region_info[index].size - offset),
-             PROT_READ | PROT_WRITE, MAP_SHARED,
+             prot, MAP_SHARED,
              s->device, s->bar_region_info[index].offset + offset);
     if (p == MAP_FAILED) {
         error_setg_errno(errp, errno, "Failed to map BAR region");
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/6] block/nvme: Map doorbells pages write-only
  2020-09-21 16:29 [PATCH 0/6] block/nvme: Map doorbells pages write-only, remove magic from nvme_init Philippe Mathieu-Daudé
  2020-09-21 16:29 ` [PATCH 1/6] util/vfio-helpers: Pass page protections to qemu_vfio_pci_map_bar() Philippe Mathieu-Daudé
@ 2020-09-21 16:29 ` Philippe Mathieu-Daudé
  2020-09-22  8:18   ` Fam Zheng
  2020-09-21 16:29 ` [PATCH 3/6] block/nvme: Reduce I/O registers scope Philippe Mathieu-Daudé
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 14+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-09-21 16:29 UTC (permalink / raw)
  To: Stefan Hajnoczi, qemu-devel
  Cc: Kevin Wolf, Fam Zheng, Philippe Mathieu-Daudé,
	qemu-block, Max Reitz

Per the datasheet sections 3.1.13/3.1.14:
  "The host should not read the doorbell registers."

As we don't need read access, map the doorbells with write-only
permission. We keep a reference to this mapped address in the
BDRVNVMeState structure.

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
---
 block/nvme.c | 29 +++++++++++++++++++----------
 1 file changed, 19 insertions(+), 10 deletions(-)

diff --git a/block/nvme.c b/block/nvme.c
index 5a4dc6a722a..3c834da8fec 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -31,7 +31,7 @@
 #define NVME_SQ_ENTRY_BYTES 64
 #define NVME_CQ_ENTRY_BYTES 16
 #define NVME_QUEUE_SIZE 128
-#define NVME_BAR_SIZE 8192
+#define NVME_DOORBELL_SIZE 4096
 
 /*
  * We have to leave one slot empty as that is the full queue case where
@@ -84,10 +84,6 @@ typedef struct {
 /* Memory mapped registers */
 typedef volatile struct {
     NvmeBar ctrl;
-    struct {
-        uint32_t sq_tail;
-        uint32_t cq_head;
-    } doorbells[];
 } NVMeRegs;
 
 #define INDEX_ADMIN     0
@@ -103,6 +99,11 @@ struct BDRVNVMeState {
     AioContext *aio_context;
     QEMUVFIOState *vfio;
     NVMeRegs *regs;
+    /* Memory mapped registers */
+    volatile struct {
+        uint32_t sq_tail;
+        uint32_t cq_head;
+    } *doorbells;
     /* The submission/completion queue pairs.
      * [0]: admin queue.
      * [1..]: io queues.
@@ -247,14 +248,14 @@ static NVMeQueuePair *nvme_create_queue_pair(BDRVNVMeState *s,
         error_propagate(errp, local_err);
         goto fail;
     }
-    q->sq.doorbell = &s->regs->doorbells[idx * s->doorbell_scale].sq_tail;
+    q->sq.doorbell = &s->doorbells[idx * s->doorbell_scale].sq_tail;
 
     nvme_init_queue(s, &q->cq, size, NVME_CQ_ENTRY_BYTES, &local_err);
     if (local_err) {
         error_propagate(errp, local_err);
         goto fail;
     }
-    q->cq.doorbell = &s->regs->doorbells[idx * s->doorbell_scale].cq_head;
+    q->cq.doorbell = &s->doorbells[idx * s->doorbell_scale].cq_head;
 
     return q;
 fail:
@@ -712,13 +713,12 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
         goto out;
     }
 
-    s->regs = qemu_vfio_pci_map_bar(s->vfio, 0, 0, NVME_BAR_SIZE,
+    s->regs = qemu_vfio_pci_map_bar(s->vfio, 0, 0, sizeof(NvmeBar),
                                     PROT_READ | PROT_WRITE, errp);
     if (!s->regs) {
         ret = -EINVAL;
         goto out;
     }
-
     /* Perform initialize sequence as described in NVMe spec "7.6.1
      * Initialization". */
 
@@ -748,6 +748,13 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
         }
     }
 
+    s->doorbells = qemu_vfio_pci_map_bar(s->vfio, 0, sizeof(NvmeBar),
+                                         NVME_DOORBELL_SIZE, PROT_WRITE, errp);
+    if (!s->doorbells) {
+        ret = -EINVAL;
+        goto out;
+    }
+
     /* Set up admin queue. */
     s->queues = g_new(NVMeQueuePair *, 1);
     s->queues[INDEX_ADMIN] = nvme_create_queue_pair(s, aio_context, 0,
@@ -873,7 +880,9 @@ static void nvme_close(BlockDriverState *bs)
                            &s->irq_notifier[MSIX_SHARED_IRQ_IDX],
                            false, NULL, NULL);
     event_notifier_cleanup(&s->irq_notifier[MSIX_SHARED_IRQ_IDX]);
-    qemu_vfio_pci_unmap_bar(s->vfio, 0, (void *)s->regs, 0, NVME_BAR_SIZE);
+    qemu_vfio_pci_unmap_bar(s->vfio, 0, (void *)s->doorbells,
+                            sizeof(NvmeBar), NVME_DOORBELL_SIZE);
+    qemu_vfio_pci_unmap_bar(s->vfio, 0, (void *)s->regs, 0, sizeof(NvmeBar));
     qemu_vfio_close(s->vfio);
 
     g_free(s->device);
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 3/6] block/nvme: Reduce I/O registers scope
  2020-09-21 16:29 [PATCH 0/6] block/nvme: Map doorbells pages write-only, remove magic from nvme_init Philippe Mathieu-Daudé
  2020-09-21 16:29 ` [PATCH 1/6] util/vfio-helpers: Pass page protections to qemu_vfio_pci_map_bar() Philippe Mathieu-Daudé
  2020-09-21 16:29 ` [PATCH 2/6] block/nvme: Map doorbells pages write-only Philippe Mathieu-Daudé
@ 2020-09-21 16:29 ` Philippe Mathieu-Daudé
  2020-09-21 16:29 ` [PATCH 4/6] block/nvme: Drop NVMeRegs structure, directly use NvmeBar Philippe Mathieu-Daudé
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-09-21 16:29 UTC (permalink / raw)
  To: Stefan Hajnoczi, qemu-devel
  Cc: Kevin Wolf, Fam Zheng, Philippe Mathieu-Daudé,
	qemu-block, Max Reitz

We only access the I/O register in nvme_init().
Remove the reference in BDRVNVMeState and reduce its scope.

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
---
 block/nvme.c | 29 ++++++++++++++++-------------
 1 file changed, 16 insertions(+), 13 deletions(-)

diff --git a/block/nvme.c b/block/nvme.c
index 3c834da8fec..e517c7539ff 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -98,7 +98,6 @@ enum {
 struct BDRVNVMeState {
     AioContext *aio_context;
     QEMUVFIOState *vfio;
-    NVMeRegs *regs;
     /* Memory mapped registers */
     volatile struct {
         uint32_t sq_tail;
@@ -695,6 +694,7 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
     uint64_t timeout_ms;
     uint64_t deadline, now;
     Error *local_err = NULL;
+    NVMeRegs *regs;
 
     qemu_co_mutex_init(&s->dma_map_lock);
     qemu_co_queue_init(&s->dma_flush_queue);
@@ -713,16 +713,16 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
         goto out;
     }
 
-    s->regs = qemu_vfio_pci_map_bar(s->vfio, 0, 0, sizeof(NvmeBar),
-                                    PROT_READ | PROT_WRITE, errp);
-    if (!s->regs) {
+    regs = qemu_vfio_pci_map_bar(s->vfio, 0, 0, sizeof(NvmeBar),
+                                 PROT_READ | PROT_WRITE, errp);
+    if (!regs) {
         ret = -EINVAL;
         goto out;
     }
     /* Perform initialize sequence as described in NVMe spec "7.6.1
      * Initialization". */
 
-    cap = le64_to_cpu(s->regs->ctrl.cap);
+    cap = le64_to_cpu(regs->ctrl.cap);
     if (!(cap & (1ULL << 37))) {
         error_setg(errp, "Device doesn't support NVMe command set");
         ret = -EINVAL;
@@ -735,10 +735,10 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
     timeout_ms = MIN(500 * ((cap >> 24) & 0xFF), 30000);
 
     /* Reset device to get a clean state. */
-    s->regs->ctrl.cc = cpu_to_le32(le32_to_cpu(s->regs->ctrl.cc) & 0xFE);
+    regs->ctrl.cc = cpu_to_le32(le32_to_cpu(regs->ctrl.cc) & 0xFE);
     /* Wait for CSTS.RDY = 0. */
     deadline = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) + timeout_ms * SCALE_MS;
-    while (le32_to_cpu(s->regs->ctrl.csts) & 0x1) {
+    while (le32_to_cpu(regs->ctrl.csts) & 0x1) {
         if (qemu_clock_get_ns(QEMU_CLOCK_REALTIME) > deadline) {
             error_setg(errp, "Timeout while waiting for device to reset (%"
                              PRId64 " ms)",
@@ -766,18 +766,18 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
     }
     s->nr_queues = 1;
     QEMU_BUILD_BUG_ON(NVME_QUEUE_SIZE & 0xF000);
-    s->regs->ctrl.aqa = cpu_to_le32((NVME_QUEUE_SIZE << 16) | NVME_QUEUE_SIZE);
-    s->regs->ctrl.asq = cpu_to_le64(s->queues[INDEX_ADMIN]->sq.iova);
-    s->regs->ctrl.acq = cpu_to_le64(s->queues[INDEX_ADMIN]->cq.iova);
+    regs->ctrl.aqa = cpu_to_le32((NVME_QUEUE_SIZE << 16) | NVME_QUEUE_SIZE);
+    regs->ctrl.asq = cpu_to_le64(s->queues[INDEX_ADMIN]->sq.iova);
+    regs->ctrl.acq = cpu_to_le64(s->queues[INDEX_ADMIN]->cq.iova);
 
     /* After setting up all control registers we can enable device now. */
-    s->regs->ctrl.cc = cpu_to_le32((ctz32(NVME_CQ_ENTRY_BYTES) << 20) |
+    regs->ctrl.cc = cpu_to_le32((ctz32(NVME_CQ_ENTRY_BYTES) << 20) |
                               (ctz32(NVME_SQ_ENTRY_BYTES) << 16) |
                               0x1);
     /* Wait for CSTS.RDY = 1. */
     now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
     deadline = now + timeout_ms * 1000000;
-    while (!(le32_to_cpu(s->regs->ctrl.csts) & 0x1)) {
+    while (!(le32_to_cpu(regs->ctrl.csts) & 0x1)) {
         if (qemu_clock_get_ns(QEMU_CLOCK_REALTIME) > deadline) {
             error_setg(errp, "Timeout while waiting for device to start (%"
                              PRId64 " ms)",
@@ -808,6 +808,10 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
         ret = -EIO;
     }
 out:
+    if (regs) {
+        qemu_vfio_pci_unmap_bar(s->vfio, 0, (void *)regs, 0, sizeof(NvmeBar));
+    }
+
     /* Cleaning up is done in nvme_file_open() upon error. */
     return ret;
 }
@@ -882,7 +886,6 @@ static void nvme_close(BlockDriverState *bs)
     event_notifier_cleanup(&s->irq_notifier[MSIX_SHARED_IRQ_IDX]);
     qemu_vfio_pci_unmap_bar(s->vfio, 0, (void *)s->doorbells,
                             sizeof(NvmeBar), NVME_DOORBELL_SIZE);
-    qemu_vfio_pci_unmap_bar(s->vfio, 0, (void *)s->regs, 0, sizeof(NvmeBar));
     qemu_vfio_close(s->vfio);
 
     g_free(s->device);
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 4/6] block/nvme: Drop NVMeRegs structure, directly use NvmeBar
  2020-09-21 16:29 [PATCH 0/6] block/nvme: Map doorbells pages write-only, remove magic from nvme_init Philippe Mathieu-Daudé
                   ` (2 preceding siblings ...)
  2020-09-21 16:29 ` [PATCH 3/6] block/nvme: Reduce I/O registers scope Philippe Mathieu-Daudé
@ 2020-09-21 16:29 ` Philippe Mathieu-Daudé
  2020-09-21 16:29 ` [PATCH 5/6] block/nvme: Use register definitions from 'block/nvme.h' Philippe Mathieu-Daudé
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-09-21 16:29 UTC (permalink / raw)
  To: Stefan Hajnoczi, qemu-devel
  Cc: Kevin Wolf, Fam Zheng, Philippe Mathieu-Daudé,
	qemu-block, Max Reitz

NVMeRegs only contains NvmeBar. Simplify the code by using NvmeBar
directly.

This triggers a checkpatch.pl error:

  ERROR: Use of volatile is usually wrong, please add a comment
  #30: FILE: block/nvme.c:691:
  +    volatile NvmeBar *regs;

This is a false positive as in our case we are using I/O registers,
so the 'volatile' use is justified.

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
---
 block/nvme.c | 23 +++++++++--------------
 1 file changed, 9 insertions(+), 14 deletions(-)

diff --git a/block/nvme.c b/block/nvme.c
index e517c7539ff..83bae6b2f13 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -81,11 +81,6 @@ typedef struct {
     QEMUBH      *completion_bh;
 } NVMeQueuePair;
 
-/* Memory mapped registers */
-typedef volatile struct {
-    NvmeBar ctrl;
-} NVMeRegs;
-
 #define INDEX_ADMIN     0
 #define INDEX_IO(n)     (1 + n)
 
@@ -694,7 +689,7 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
     uint64_t timeout_ms;
     uint64_t deadline, now;
     Error *local_err = NULL;
-    NVMeRegs *regs;
+    volatile NvmeBar *regs;
 
     qemu_co_mutex_init(&s->dma_map_lock);
     qemu_co_queue_init(&s->dma_flush_queue);
@@ -722,7 +717,7 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
     /* Perform initialize sequence as described in NVMe spec "7.6.1
      * Initialization". */
 
-    cap = le64_to_cpu(regs->ctrl.cap);
+    cap = le64_to_cpu(regs->cap);
     if (!(cap & (1ULL << 37))) {
         error_setg(errp, "Device doesn't support NVMe command set");
         ret = -EINVAL;
@@ -735,10 +730,10 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
     timeout_ms = MIN(500 * ((cap >> 24) & 0xFF), 30000);
 
     /* Reset device to get a clean state. */
-    regs->ctrl.cc = cpu_to_le32(le32_to_cpu(regs->ctrl.cc) & 0xFE);
+    regs->cc = cpu_to_le32(le32_to_cpu(regs->cc) & 0xFE);
     /* Wait for CSTS.RDY = 0. */
     deadline = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) + timeout_ms * SCALE_MS;
-    while (le32_to_cpu(regs->ctrl.csts) & 0x1) {
+    while (le32_to_cpu(regs->csts) & 0x1) {
         if (qemu_clock_get_ns(QEMU_CLOCK_REALTIME) > deadline) {
             error_setg(errp, "Timeout while waiting for device to reset (%"
                              PRId64 " ms)",
@@ -766,18 +761,18 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
     }
     s->nr_queues = 1;
     QEMU_BUILD_BUG_ON(NVME_QUEUE_SIZE & 0xF000);
-    regs->ctrl.aqa = cpu_to_le32((NVME_QUEUE_SIZE << 16) | NVME_QUEUE_SIZE);
-    regs->ctrl.asq = cpu_to_le64(s->queues[INDEX_ADMIN]->sq.iova);
-    regs->ctrl.acq = cpu_to_le64(s->queues[INDEX_ADMIN]->cq.iova);
+    regs->aqa = cpu_to_le32((NVME_QUEUE_SIZE << 16) | NVME_QUEUE_SIZE);
+    regs->asq = cpu_to_le64(s->queues[INDEX_ADMIN]->sq.iova);
+    regs->acq = cpu_to_le64(s->queues[INDEX_ADMIN]->cq.iova);
 
     /* After setting up all control registers we can enable device now. */
-    regs->ctrl.cc = cpu_to_le32((ctz32(NVME_CQ_ENTRY_BYTES) << 20) |
+    regs->cc = cpu_to_le32((ctz32(NVME_CQ_ENTRY_BYTES) << 20) |
                               (ctz32(NVME_SQ_ENTRY_BYTES) << 16) |
                               0x1);
     /* Wait for CSTS.RDY = 1. */
     now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
     deadline = now + timeout_ms * 1000000;
-    while (!(le32_to_cpu(regs->ctrl.csts) & 0x1)) {
+    while (!(le32_to_cpu(regs->csts) & 0x1)) {
         if (qemu_clock_get_ns(QEMU_CLOCK_REALTIME) > deadline) {
             error_setg(errp, "Timeout while waiting for device to start (%"
                              PRId64 " ms)",
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 5/6] block/nvme: Use register definitions from 'block/nvme.h'
  2020-09-21 16:29 [PATCH 0/6] block/nvme: Map doorbells pages write-only, remove magic from nvme_init Philippe Mathieu-Daudé
                   ` (3 preceding siblings ...)
  2020-09-21 16:29 ` [PATCH 4/6] block/nvme: Drop NVMeRegs structure, directly use NvmeBar Philippe Mathieu-Daudé
@ 2020-09-21 16:29 ` Philippe Mathieu-Daudé
  2020-09-21 16:29 ` [PATCH 6/6] block/nvme: Replace magic value by SCALE_MS definition Philippe Mathieu-Daudé
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-09-21 16:29 UTC (permalink / raw)
  To: Stefan Hajnoczi, qemu-devel
  Cc: Kevin Wolf, Fam Zheng, Philippe Mathieu-Daudé,
	qemu-block, Max Reitz

Use the NVMe register definitions from "block/nvme.h" which
ease a bit reviewing the code while matching the datasheet.

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
---
 block/nvme.c | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/block/nvme.c b/block/nvme.c
index 83bae6b2f13..d18e897c7c9 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -718,22 +718,22 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
      * Initialization". */
 
     cap = le64_to_cpu(regs->cap);
-    if (!(cap & (1ULL << 37))) {
+    if (!NVME_CAP_CSS(cap)) {
         error_setg(errp, "Device doesn't support NVMe command set");
         ret = -EINVAL;
         goto out;
     }
 
-    s->page_size = MAX(4096, 1 << (12 + ((cap >> 48) & 0xF)));
-    s->doorbell_scale = (4 << (((cap >> 32) & 0xF))) / sizeof(uint32_t);
+    s->page_size = MAX(4096, 1 << NVME_CAP_MPSMIN(cap));
+    s->doorbell_scale = (4 << NVME_CAP_DSTRD(cap)) / sizeof(uint32_t);
     bs->bl.opt_mem_alignment = s->page_size;
-    timeout_ms = MIN(500 * ((cap >> 24) & 0xFF), 30000);
+    timeout_ms = MIN(500 * NVME_CAP_TO(cap), 30000);
 
     /* Reset device to get a clean state. */
     regs->cc = cpu_to_le32(le32_to_cpu(regs->cc) & 0xFE);
     /* Wait for CSTS.RDY = 0. */
     deadline = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) + timeout_ms * SCALE_MS;
-    while (le32_to_cpu(regs->csts) & 0x1) {
+    while (NVME_CSTS_RDY(le32_to_cpu(regs->csts))) {
         if (qemu_clock_get_ns(QEMU_CLOCK_REALTIME) > deadline) {
             error_setg(errp, "Timeout while waiting for device to reset (%"
                              PRId64 " ms)",
@@ -761,18 +761,19 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
     }
     s->nr_queues = 1;
     QEMU_BUILD_BUG_ON(NVME_QUEUE_SIZE & 0xF000);
-    regs->aqa = cpu_to_le32((NVME_QUEUE_SIZE << 16) | NVME_QUEUE_SIZE);
+    regs->aqa = cpu_to_le32((NVME_QUEUE_SIZE << AQA_ACQS_SHIFT) |
+                            (NVME_QUEUE_SIZE << AQA_ASQS_SHIFT));
     regs->asq = cpu_to_le64(s->queues[INDEX_ADMIN]->sq.iova);
     regs->acq = cpu_to_le64(s->queues[INDEX_ADMIN]->cq.iova);
 
     /* After setting up all control registers we can enable device now. */
-    regs->cc = cpu_to_le32((ctz32(NVME_CQ_ENTRY_BYTES) << 20) |
-                              (ctz32(NVME_SQ_ENTRY_BYTES) << 16) |
-                              0x1);
+    regs->cc = cpu_to_le32((ctz32(NVME_CQ_ENTRY_BYTES) << CC_IOCQES_SHIFT) |
+                           (ctz32(NVME_SQ_ENTRY_BYTES) << CC_IOSQES_SHIFT) |
+                           CC_EN_MASK);
     /* Wait for CSTS.RDY = 1. */
     now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
     deadline = now + timeout_ms * 1000000;
-    while (!(le32_to_cpu(regs->csts) & 0x1)) {
+    while (!NVME_CSTS_RDY(le32_to_cpu(regs->csts))) {
         if (qemu_clock_get_ns(QEMU_CLOCK_REALTIME) > deadline) {
             error_setg(errp, "Timeout while waiting for device to start (%"
                              PRId64 " ms)",
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 6/6] block/nvme: Replace magic value by SCALE_MS definition
  2020-09-21 16:29 [PATCH 0/6] block/nvme: Map doorbells pages write-only, remove magic from nvme_init Philippe Mathieu-Daudé
                   ` (4 preceding siblings ...)
  2020-09-21 16:29 ` [PATCH 5/6] block/nvme: Use register definitions from 'block/nvme.h' Philippe Mathieu-Daudé
@ 2020-09-21 16:29 ` Philippe Mathieu-Daudé
  2020-09-21 21:38 ` [PATCH 0/6] block/nvme: Map doorbells pages write-only, remove magic from nvme_init no-reply
  2020-09-21 21:56 ` no-reply
  7 siblings, 0 replies; 14+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-09-21 16:29 UTC (permalink / raw)
  To: Stefan Hajnoczi, qemu-devel
  Cc: Kevin Wolf, Fam Zheng, Philippe Mathieu-Daudé,
	qemu-block, Max Reitz

Use self-explicit SCALE_MS definition instead of magic value
(missed in similar commit e4f310fe7f5).

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
---
 block/nvme.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/nvme.c b/block/nvme.c
index d18e897c7c9..b3dab1c4610 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -772,7 +772,7 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
                            CC_EN_MASK);
     /* Wait for CSTS.RDY = 1. */
     now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
-    deadline = now + timeout_ms * 1000000;
+    deadline = now + timeout_ms * SCALE_MS;
     while (!NVME_CSTS_RDY(le32_to_cpu(regs->csts))) {
         if (qemu_clock_get_ns(QEMU_CLOCK_REALTIME) > deadline) {
             error_setg(errp, "Timeout while waiting for device to start (%"
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/6] block/nvme: Map doorbells pages write-only, remove magic from nvme_init
  2020-09-21 16:29 [PATCH 0/6] block/nvme: Map doorbells pages write-only, remove magic from nvme_init Philippe Mathieu-Daudé
                   ` (5 preceding siblings ...)
  2020-09-21 16:29 ` [PATCH 6/6] block/nvme: Replace magic value by SCALE_MS definition Philippe Mathieu-Daudé
@ 2020-09-21 21:38 ` no-reply
  2020-09-21 21:56 ` no-reply
  7 siblings, 0 replies; 14+ messages in thread
From: no-reply @ 2020-09-21 21:38 UTC (permalink / raw)
  To: philmd; +Cc: kwolf, fam, qemu-block, qemu-devel, mreitz, stefanha, philmd

Patchew URL: https://patchew.org/QEMU/20200921162949.553863-1-philmd@redhat.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20200921162949.553863-1-philmd@redhat.com
Subject: [PATCH 0/6] block/nvme: Map doorbells pages write-only, remove magic from nvme_init

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
b0ef8b7 block/nvme: Replace magic value by SCALE_MS definition
1792c4d block/nvme: Use register definitions from 'block/nvme.h'
352be0c block/nvme: Drop NVMeRegs structure, directly use NvmeBar
7aa184a block/nvme: Reduce I/O registers scope
4b1effc block/nvme: Map doorbells pages write-only
7ae4653 util/vfio-helpers: Pass page protections to qemu_vfio_pci_map_bar()

=== OUTPUT BEGIN ===
1/6 Checking commit 7ae46536369a (util/vfio-helpers: Pass page protections to qemu_vfio_pci_map_bar())
2/6 Checking commit 4b1effcfcd2b (block/nvme: Map doorbells pages write-only)
3/6 Checking commit 7aa184a756dc (block/nvme: Reduce I/O registers scope)
4/6 Checking commit 352be0c97869 (block/nvme: Drop NVMeRegs structure, directly use NvmeBar)
ERROR: Use of volatile is usually wrong, please add a comment
#43: FILE: block/nvme.c:692:
+    volatile NvmeBar *regs;

total: 1 errors, 0 warnings, 62 lines checked

Patch 4/6 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

5/6 Checking commit 1792c4da4bc2 (block/nvme: Use register definitions from 'block/nvme.h')
6/6 Checking commit b0ef8b776eb1 (block/nvme: Replace magic value by SCALE_MS definition)
=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/20200921162949.553863-1-philmd@redhat.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/6] block/nvme: Map doorbells pages write-only, remove magic from nvme_init
  2020-09-21 16:29 [PATCH 0/6] block/nvme: Map doorbells pages write-only, remove magic from nvme_init Philippe Mathieu-Daudé
                   ` (6 preceding siblings ...)
  2020-09-21 21:38 ` [PATCH 0/6] block/nvme: Map doorbells pages write-only, remove magic from nvme_init no-reply
@ 2020-09-21 21:56 ` no-reply
  7 siblings, 0 replies; 14+ messages in thread
From: no-reply @ 2020-09-21 21:56 UTC (permalink / raw)
  To: philmd; +Cc: kwolf, fam, qemu-block, qemu-devel, mreitz, stefanha, philmd

Patchew URL: https://patchew.org/QEMU/20200921162949.553863-1-philmd@redhat.com/



Hi,

This series failed the docker-quick@centos7 build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-centos7 V=1 NETWORK=1
time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
=== TEST SCRIPT END ===

C linker for the host machine: cc ld.bfd 2.27-43
Host machine cpu family: x86_64
Host machine cpu: x86_64
../src/meson.build:10: WARNING: Module unstable-keyval has no backwards or forwards compatibility and might not exist in future releases.
Program sh found: YES
Program python3 found: YES (/usr/bin/python3)
Configuring ninjatool using configuration
---
Compiling C object authz/libauthz.fa.p/listfile.c.o
Compiling C object authz/libauthz.fa.p/list.c.o
../src/block/nvme.c: In function 'nvme_file_open':
../src/block/nvme.c:807:8: error: 'regs' may be used uninitialized in this function [-Werror=maybe-uninitialized]
     if (regs) {
        ^
../src/block/nvme.c:692:23: note: 'regs' was declared here
     volatile NvmeBar *regs;
                       ^
cc1: all warnings being treated as errors
make: *** [libblock.fa.p/block_nvme.c.o] Error 1
make: *** Waiting for unfinished jobs....
Traceback (most recent call last):
  File "./tests/docker/docker.py", line 709, in <module>
---
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--rm', '--label', 'com.qemu.instance.uuid=99233811d86a42b1b27aa376bb85979d', '-u', '1001', '--security-opt', 'seccomp=unconfined', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-wu686lwl/src/docker-src.2020-09-21-17.52.21.21740:/var/tmp/qemu:z,ro', 'qemu/centos7', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit status 2.
filter=--filter=label=com.qemu.instance.uuid=99233811d86a42b1b27aa376bb85979d
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-wu686lwl/src'
make: *** [docker-run-test-quick@centos7] Error 2

real    3m46.350s
user    0m15.940s


The full log is available at
http://patchew.org/logs/20200921162949.553863-1-philmd@redhat.com/testing.docker-quick@centos7/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/6] block/nvme: Map doorbells pages write-only
  2020-09-21 16:29 ` [PATCH 2/6] block/nvme: Map doorbells pages write-only Philippe Mathieu-Daudé
@ 2020-09-22  8:18   ` Fam Zheng
  2020-09-22  8:41     ` Philippe Mathieu-Daudé
  0 siblings, 1 reply; 14+ messages in thread
From: Fam Zheng @ 2020-09-22  8:18 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé, Stefan Hajnoczi, qemu-devel
  Cc: Kevin Wolf, qemu-block, Max Reitz

On Mon, 2020-09-21 at 18:29 +0200, Philippe Mathieu-Daudé wrote:
> Per the datasheet sections 3.1.13/3.1.14:
>   "The host should not read the doorbell registers."
> 
> As we don't need read access, map the doorbells with write-only
> permission. We keep a reference to this mapped address in the
> BDRVNVMeState structure.

Besides looking more correct in access mode, is there any side effect
of WO mapping?

Fam

> 
> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> ---
>  block/nvme.c | 29 +++++++++++++++++++----------
>  1 file changed, 19 insertions(+), 10 deletions(-)
> 
> diff --git a/block/nvme.c b/block/nvme.c
> index 5a4dc6a722a..3c834da8fec 100644
> --- a/block/nvme.c
> +++ b/block/nvme.c
> @@ -31,7 +31,7 @@
>  #define NVME_SQ_ENTRY_BYTES 64
>  #define NVME_CQ_ENTRY_BYTES 16
>  #define NVME_QUEUE_SIZE 128
> -#define NVME_BAR_SIZE 8192
> +#define NVME_DOORBELL_SIZE 4096
>  
>  /*
>   * We have to leave one slot empty as that is the full queue case
> where
> @@ -84,10 +84,6 @@ typedef struct {
>  /* Memory mapped registers */
>  typedef volatile struct {
>      NvmeBar ctrl;
> -    struct {
> -        uint32_t sq_tail;
> -        uint32_t cq_head;
> -    } doorbells[];
>  } NVMeRegs;
>  
>  #define INDEX_ADMIN     0
> @@ -103,6 +99,11 @@ struct BDRVNVMeState {
>      AioContext *aio_context;
>      QEMUVFIOState *vfio;
>      NVMeRegs *regs;
> +    /* Memory mapped registers */
> +    volatile struct {
> +        uint32_t sq_tail;
> +        uint32_t cq_head;
> +    } *doorbells;
>      /* The submission/completion queue pairs.
>       * [0]: admin queue.
>       * [1..]: io queues.
> @@ -247,14 +248,14 @@ static NVMeQueuePair
> *nvme_create_queue_pair(BDRVNVMeState *s,
>          error_propagate(errp, local_err);
>          goto fail;
>      }
> -    q->sq.doorbell = &s->regs->doorbells[idx * s-
> >doorbell_scale].sq_tail;
> +    q->sq.doorbell = &s->doorbells[idx * s->doorbell_scale].sq_tail;
>  
>      nvme_init_queue(s, &q->cq, size, NVME_CQ_ENTRY_BYTES,
> &local_err);
>      if (local_err) {
>          error_propagate(errp, local_err);
>          goto fail;
>      }
> -    q->cq.doorbell = &s->regs->doorbells[idx * s-
> >doorbell_scale].cq_head;
> +    q->cq.doorbell = &s->doorbells[idx * s->doorbell_scale].cq_head;
>  
>      return q;
>  fail:
> @@ -712,13 +713,12 @@ static int nvme_init(BlockDriverState *bs,
> const char *device, int namespace,
>          goto out;
>      }
>  
> -    s->regs = qemu_vfio_pci_map_bar(s->vfio, 0, 0, NVME_BAR_SIZE,
> +    s->regs = qemu_vfio_pci_map_bar(s->vfio, 0, 0, sizeof(NvmeBar),
>                                      PROT_READ | PROT_WRITE, errp);
>      if (!s->regs) {
>          ret = -EINVAL;
>          goto out;
>      }
> -
>      /* Perform initialize sequence as described in NVMe spec "7.6.1
>       * Initialization". */
>  
> @@ -748,6 +748,13 @@ static int nvme_init(BlockDriverState *bs, const
> char *device, int namespace,
>          }
>      }
>  
> +    s->doorbells = qemu_vfio_pci_map_bar(s->vfio, 0,
> sizeof(NvmeBar),
> +                                         NVME_DOORBELL_SIZE,
> PROT_WRITE, errp);
> +    if (!s->doorbells) {
> +        ret = -EINVAL;
> +        goto out;
> +    }
> +
>      /* Set up admin queue. */
>      s->queues = g_new(NVMeQueuePair *, 1);
>      s->queues[INDEX_ADMIN] = nvme_create_queue_pair(s, aio_context,
> 0,
> @@ -873,7 +880,9 @@ static void nvme_close(BlockDriverState *bs)
>                             &s->irq_notifier[MSIX_SHARED_IRQ_IDX],
>                             false, NULL, NULL);
>      event_notifier_cleanup(&s->irq_notifier[MSIX_SHARED_IRQ_IDX]);
> -    qemu_vfio_pci_unmap_bar(s->vfio, 0, (void *)s->regs, 0,
> NVME_BAR_SIZE);
> +    qemu_vfio_pci_unmap_bar(s->vfio, 0, (void *)s->doorbells,
> +                            sizeof(NvmeBar), NVME_DOORBELL_SIZE);
> +    qemu_vfio_pci_unmap_bar(s->vfio, 0, (void *)s->regs, 0,
> sizeof(NvmeBar));
>      qemu_vfio_close(s->vfio);
>  
>      g_free(s->device);



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/6] block/nvme: Map doorbells pages write-only
  2020-09-22  8:18   ` Fam Zheng
@ 2020-09-22  8:41     ` Philippe Mathieu-Daudé
  2020-09-22  9:04       ` Paolo Bonzini
  2020-09-22  9:33       ` Fam Zheng
  0 siblings, 2 replies; 14+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-09-22  8:41 UTC (permalink / raw)
  To: Fam Zheng, Stefan Hajnoczi, qemu-devel
  Cc: Kevin Wolf, Paolo Bonzini, qemu-block, Max Reitz

Hi Fam,

+Paolo?

On 9/22/20 10:18 AM, Fam Zheng wrote:
> On Mon, 2020-09-21 at 18:29 +0200, Philippe Mathieu-Daudé wrote:
>> Per the datasheet sections 3.1.13/3.1.14:
>>   "The host should not read the doorbell registers."
>>
>> As we don't need read access, map the doorbells with write-only
>> permission. We keep a reference to this mapped address in the
>> BDRVNVMeState structure.
> 
> Besides looking more correct in access mode, is there any side effect
> of WO mapping?

TBH I don't have enough knowledge to answer this question.
I tested successfully on X86. I'm writing more tests.

> 
> Fam
> 
>>
>> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
>> ---
>>  block/nvme.c | 29 +++++++++++++++++++----------
>>  1 file changed, 19 insertions(+), 10 deletions(-)
>>
>> diff --git a/block/nvme.c b/block/nvme.c
>> index 5a4dc6a722a..3c834da8fec 100644
>> --- a/block/nvme.c
>> +++ b/block/nvme.c
>> @@ -31,7 +31,7 @@
>>  #define NVME_SQ_ENTRY_BYTES 64
>>  #define NVME_CQ_ENTRY_BYTES 16
>>  #define NVME_QUEUE_SIZE 128
>> -#define NVME_BAR_SIZE 8192
>> +#define NVME_DOORBELL_SIZE 4096
>>  
>>  /*
>>   * We have to leave one slot empty as that is the full queue case
>> where
>> @@ -84,10 +84,6 @@ typedef struct {
>>  /* Memory mapped registers */
>>  typedef volatile struct {
>>      NvmeBar ctrl;
>> -    struct {
>> -        uint32_t sq_tail;
>> -        uint32_t cq_head;
>> -    } doorbells[];
>>  } NVMeRegs;
>>  
>>  #define INDEX_ADMIN     0
>> @@ -103,6 +99,11 @@ struct BDRVNVMeState {
>>      AioContext *aio_context;
>>      QEMUVFIOState *vfio;
>>      NVMeRegs *regs;
>> +    /* Memory mapped registers */
>> +    volatile struct {
>> +        uint32_t sq_tail;
>> +        uint32_t cq_head;
>> +    } *doorbells;
>>      /* The submission/completion queue pairs.
>>       * [0]: admin queue.
>>       * [1..]: io queues.
>> @@ -247,14 +248,14 @@ static NVMeQueuePair
>> *nvme_create_queue_pair(BDRVNVMeState *s,
>>          error_propagate(errp, local_err);
>>          goto fail;
>>      }
>> -    q->sq.doorbell = &s->regs->doorbells[idx * s-
>>> doorbell_scale].sq_tail;
>> +    q->sq.doorbell = &s->doorbells[idx * s->doorbell_scale].sq_tail;
>>  
>>      nvme_init_queue(s, &q->cq, size, NVME_CQ_ENTRY_BYTES,
>> &local_err);
>>      if (local_err) {
>>          error_propagate(errp, local_err);
>>          goto fail;
>>      }
>> -    q->cq.doorbell = &s->regs->doorbells[idx * s-
>>> doorbell_scale].cq_head;
>> +    q->cq.doorbell = &s->doorbells[idx * s->doorbell_scale].cq_head;
>>  
>>      return q;
>>  fail:
>> @@ -712,13 +713,12 @@ static int nvme_init(BlockDriverState *bs,
>> const char *device, int namespace,
>>          goto out;
>>      }
>>  
>> -    s->regs = qemu_vfio_pci_map_bar(s->vfio, 0, 0, NVME_BAR_SIZE,
>> +    s->regs = qemu_vfio_pci_map_bar(s->vfio, 0, 0, sizeof(NvmeBar),
>>                                      PROT_READ | PROT_WRITE, errp);
>>      if (!s->regs) {
>>          ret = -EINVAL;
>>          goto out;
>>      }
>> -
>>      /* Perform initialize sequence as described in NVMe spec "7.6.1
>>       * Initialization". */
>>  
>> @@ -748,6 +748,13 @@ static int nvme_init(BlockDriverState *bs, const
>> char *device, int namespace,
>>          }
>>      }
>>  
>> +    s->doorbells = qemu_vfio_pci_map_bar(s->vfio, 0,
>> sizeof(NvmeBar),
>> +                                         NVME_DOORBELL_SIZE,
>> PROT_WRITE, errp);
>> +    if (!s->doorbells) {
>> +        ret = -EINVAL;
>> +        goto out;
>> +    }
>> +
>>      /* Set up admin queue. */
>>      s->queues = g_new(NVMeQueuePair *, 1);
>>      s->queues[INDEX_ADMIN] = nvme_create_queue_pair(s, aio_context,
>> 0,
>> @@ -873,7 +880,9 @@ static void nvme_close(BlockDriverState *bs)
>>                             &s->irq_notifier[MSIX_SHARED_IRQ_IDX],
>>                             false, NULL, NULL);
>>      event_notifier_cleanup(&s->irq_notifier[MSIX_SHARED_IRQ_IDX]);
>> -    qemu_vfio_pci_unmap_bar(s->vfio, 0, (void *)s->regs, 0,
>> NVME_BAR_SIZE);
>> +    qemu_vfio_pci_unmap_bar(s->vfio, 0, (void *)s->doorbells,
>> +                            sizeof(NvmeBar), NVME_DOORBELL_SIZE);
>> +    qemu_vfio_pci_unmap_bar(s->vfio, 0, (void *)s->regs, 0,
>> sizeof(NvmeBar));
>>      qemu_vfio_close(s->vfio);
>>  
>>      g_free(s->device);
> 



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/6] block/nvme: Map doorbells pages write-only
  2020-09-22  8:41     ` Philippe Mathieu-Daudé
@ 2020-09-22  9:04       ` Paolo Bonzini
  2020-09-22  9:36         ` Philippe Mathieu-Daudé
  2020-09-22  9:33       ` Fam Zheng
  1 sibling, 1 reply; 14+ messages in thread
From: Paolo Bonzini @ 2020-09-22  9:04 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé, Fam Zheng, Stefan Hajnoczi, qemu-devel
  Cc: Kevin Wolf, qemu-block, Max Reitz

On 22/09/20 10:41, Philippe Mathieu-Daudé wrote:
>> Besides looking more correct in access mode, is there any side effect
>> of WO mapping?
> TBH I don't have enough knowledge to answer this question.
> I tested successfully on X86. I'm writing more tests.

No problem with doing this, but PROT_WRITE does not work at all on x86.
:)  PROT_EXEC works if you have a machine with PKRU, but PROT_WRITE
silently becomes PROT_READ|PROT_WRITE because the processor does not
support it.

Paolo



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/6] block/nvme: Map doorbells pages write-only
  2020-09-22  8:41     ` Philippe Mathieu-Daudé
  2020-09-22  9:04       ` Paolo Bonzini
@ 2020-09-22  9:33       ` Fam Zheng
  1 sibling, 0 replies; 14+ messages in thread
From: Fam Zheng @ 2020-09-22  9:33 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé, Stefan Hajnoczi, qemu-devel
  Cc: Kevin Wolf, Paolo Bonzini, qemu-block, Max Reitz

On Tue, 2020-09-22 at 10:41 +0200, Philippe Mathieu-Daudé wrote:
> Hi Fam,
> 
> +Paolo?
> 
> On 9/22/20 10:18 AM, Fam Zheng wrote:
> > On Mon, 2020-09-21 at 18:29 +0200, Philippe Mathieu-Daudé wrote:
> > > Per the datasheet sections 3.1.13/3.1.14:
> > >   "The host should not read the doorbell registers."
> > > 
> > > As we don't need read access, map the doorbells with write-only
> > > permission. We keep a reference to this mapped address in the
> > > BDRVNVMeState structure.
> > 
> > Besides looking more correct in access mode, is there any side
> > effect
> > of WO mapping?
> 
> TBH I don't have enough knowledge to answer this question.
> I tested successfully on X86. I'm writing more tests.

The reason I'm asking is more because x86 pages are either RO or RW. So
I'd like to see if there's a practical reason behind this patch (I have
no idea about the effects on MTRR and/or IOMMU).

Fam

> 
> > 
> > Fam
> > 
> > > 
> > > Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> > > ---
> > >  block/nvme.c | 29 +++++++++++++++++++----------
> > >  1 file changed, 19 insertions(+), 10 deletions(-)
> > > 
> > > diff --git a/block/nvme.c b/block/nvme.c
> > > index 5a4dc6a722a..3c834da8fec 100644
> > > --- a/block/nvme.c
> > > +++ b/block/nvme.c
> > > @@ -31,7 +31,7 @@
> > >  #define NVME_SQ_ENTRY_BYTES 64
> > >  #define NVME_CQ_ENTRY_BYTES 16
> > >  #define NVME_QUEUE_SIZE 128
> > > -#define NVME_BAR_SIZE 8192
> > > +#define NVME_DOORBELL_SIZE 4096
> > >  
> > >  /*
> > >   * We have to leave one slot empty as that is the full queue
> > > case
> > > where
> > > @@ -84,10 +84,6 @@ typedef struct {
> > >  /* Memory mapped registers */
> > >  typedef volatile struct {
> > >      NvmeBar ctrl;
> > > -    struct {
> > > -        uint32_t sq_tail;
> > > -        uint32_t cq_head;
> > > -    } doorbells[];
> > >  } NVMeRegs;
> > >  
> > >  #define INDEX_ADMIN     0
> > > @@ -103,6 +99,11 @@ struct BDRVNVMeState {
> > >      AioContext *aio_context;
> > >      QEMUVFIOState *vfio;
> > >      NVMeRegs *regs;
> > > +    /* Memory mapped registers */
> > > +    volatile struct {
> > > +        uint32_t sq_tail;
> > > +        uint32_t cq_head;
> > > +    } *doorbells;
> > >      /* The submission/completion queue pairs.
> > >       * [0]: admin queue.
> > >       * [1..]: io queues.
> > > @@ -247,14 +248,14 @@ static NVMeQueuePair
> > > *nvme_create_queue_pair(BDRVNVMeState *s,
> > >          error_propagate(errp, local_err);
> > >          goto fail;
> > >      }
> > > -    q->sq.doorbell = &s->regs->doorbells[idx * s-
> > > > doorbell_scale].sq_tail;
> > > 
> > > +    q->sq.doorbell = &s->doorbells[idx * s-
> > > >doorbell_scale].sq_tail;
> > >  
> > >      nvme_init_queue(s, &q->cq, size, NVME_CQ_ENTRY_BYTES,
> > > &local_err);
> > >      if (local_err) {
> > >          error_propagate(errp, local_err);
> > >          goto fail;
> > >      }
> > > -    q->cq.doorbell = &s->regs->doorbells[idx * s-
> > > > doorbell_scale].cq_head;
> > > 
> > > +    q->cq.doorbell = &s->doorbells[idx * s-
> > > >doorbell_scale].cq_head;
> > >  
> > >      return q;
> > >  fail:
> > > @@ -712,13 +713,12 @@ static int nvme_init(BlockDriverState *bs,
> > > const char *device, int namespace,
> > >          goto out;
> > >      }
> > >  
> > > -    s->regs = qemu_vfio_pci_map_bar(s->vfio, 0, 0,
> > > NVME_BAR_SIZE,
> > > +    s->regs = qemu_vfio_pci_map_bar(s->vfio, 0, 0,
> > > sizeof(NvmeBar),
> > >                                      PROT_READ | PROT_WRITE,
> > > errp);
> > >      if (!s->regs) {
> > >          ret = -EINVAL;
> > >          goto out;
> > >      }
> > > -
> > >      /* Perform initialize sequence as described in NVMe spec
> > > "7.6.1
> > >       * Initialization". */
> > >  
> > > @@ -748,6 +748,13 @@ static int nvme_init(BlockDriverState *bs,
> > > const
> > > char *device, int namespace,
> > >          }
> > >      }
> > >  
> > > +    s->doorbells = qemu_vfio_pci_map_bar(s->vfio, 0,
> > > sizeof(NvmeBar),
> > > +                                         NVME_DOORBELL_SIZE,
> > > PROT_WRITE, errp);
> > > +    if (!s->doorbells) {
> > > +        ret = -EINVAL;
> > > +        goto out;
> > > +    }
> > > +
> > >      /* Set up admin queue. */
> > >      s->queues = g_new(NVMeQueuePair *, 1);
> > >      s->queues[INDEX_ADMIN] = nvme_create_queue_pair(s,
> > > aio_context,
> > > 0,
> > > @@ -873,7 +880,9 @@ static void nvme_close(BlockDriverState *bs)
> > >                             &s-
> > > >irq_notifier[MSIX_SHARED_IRQ_IDX],
> > >                             false, NULL, NULL);
> > >      event_notifier_cleanup(&s-
> > > >irq_notifier[MSIX_SHARED_IRQ_IDX]);
> > > -    qemu_vfio_pci_unmap_bar(s->vfio, 0, (void *)s->regs, 0,
> > > NVME_BAR_SIZE);
> > > +    qemu_vfio_pci_unmap_bar(s->vfio, 0, (void *)s->doorbells,
> > > +                            sizeof(NvmeBar),
> > > NVME_DOORBELL_SIZE);
> > > +    qemu_vfio_pci_unmap_bar(s->vfio, 0, (void *)s->regs, 0,
> > > sizeof(NvmeBar));
> > >      qemu_vfio_close(s->vfio);
> > >  
> > >      g_free(s->device);
> 
> 



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/6] block/nvme: Map doorbells pages write-only
  2020-09-22  9:04       ` Paolo Bonzini
@ 2020-09-22  9:36         ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 14+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-09-22  9:36 UTC (permalink / raw)
  To: Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, qemu-devel
  Cc: Kevin Wolf, Eric Auger, qemu-block, Max Reitz

On 9/22/20 11:04 AM, Paolo Bonzini wrote:
> On 22/09/20 10:41, Philippe Mathieu-Daudé wrote:
>>> Besides looking more correct in access mode, is there any side effect
>>> of WO mapping?
>> TBH I don't have enough knowledge to answer this question.
>> I tested successfully on X86. I'm writing more tests.
> 
> No problem with doing this, but PROT_WRITE does not work at all on x86.
> :)  PROT_EXEC works if you have a machine with PKRU, but PROT_WRITE
> silently becomes PROT_READ|PROT_WRITE because the processor does not
> support it.

Ah this is why it works the same way in my testing.

I'll run tests on ARM.

Thanks,

Phil.

> 
> Paolo
> 



^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2020-09-22  9:37 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-21 16:29 [PATCH 0/6] block/nvme: Map doorbells pages write-only, remove magic from nvme_init Philippe Mathieu-Daudé
2020-09-21 16:29 ` [PATCH 1/6] util/vfio-helpers: Pass page protections to qemu_vfio_pci_map_bar() Philippe Mathieu-Daudé
2020-09-21 16:29 ` [PATCH 2/6] block/nvme: Map doorbells pages write-only Philippe Mathieu-Daudé
2020-09-22  8:18   ` Fam Zheng
2020-09-22  8:41     ` Philippe Mathieu-Daudé
2020-09-22  9:04       ` Paolo Bonzini
2020-09-22  9:36         ` Philippe Mathieu-Daudé
2020-09-22  9:33       ` Fam Zheng
2020-09-21 16:29 ` [PATCH 3/6] block/nvme: Reduce I/O registers scope Philippe Mathieu-Daudé
2020-09-21 16:29 ` [PATCH 4/6] block/nvme: Drop NVMeRegs structure, directly use NvmeBar Philippe Mathieu-Daudé
2020-09-21 16:29 ` [PATCH 5/6] block/nvme: Use register definitions from 'block/nvme.h' Philippe Mathieu-Daudé
2020-09-21 16:29 ` [PATCH 6/6] block/nvme: Replace magic value by SCALE_MS definition Philippe Mathieu-Daudé
2020-09-21 21:38 ` [PATCH 0/6] block/nvme: Map doorbells pages write-only, remove magic from nvme_init no-reply
2020-09-21 21:56 ` no-reply

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).