* [PATCH 0/6] block/nvme: Map doorbells pages write-only, remove magic from nvme_init
@ 2020-09-21 16:29 Philippe Mathieu-Daudé
2020-09-21 16:29 ` [PATCH 1/6] util/vfio-helpers: Pass page protections to qemu_vfio_pci_map_bar() Philippe Mathieu-Daudé
` (7 more replies)
0 siblings, 8 replies; 14+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-09-21 16:29 UTC (permalink / raw)
To: Stefan Hajnoczi, qemu-devel
Cc: Kevin Wolf, Fam Zheng, Philippe Mathieu-Daudé,
qemu-block, Max Reitz
Instead of mapping the 8K of I/O registers + doorbells read/write for the
whole execution, map the I/O registers temporarily during init, and map the
doorbells write-only.
Replace various magic values with slightly more explicit macros from
"block/nvme.h".
Philippe Mathieu-Daudé (6):
util/vfio-helpers: Pass page protections to qemu_vfio_pci_map_bar()
block/nvme: Map doorbells pages write-only
block/nvme: Reduce I/O registers scope
block/nvme: Drop NVMeRegs structure, directly use NvmeBar
block/nvme: Use register definitions from 'block/nvme.h'
block/nvme: Replace magic value by SCALE_MS definition
include/qemu/vfio-helpers.h | 2 +-
block/nvme.c | 73 +++++++++++++++++++++----------------
util/vfio-helpers.c | 4 +-
3 files changed, 44 insertions(+), 35 deletions(-)
--
2.26.2
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 1/6] util/vfio-helpers: Pass page protections to qemu_vfio_pci_map_bar()
2020-09-21 16:29 [PATCH 0/6] block/nvme: Map doorbells pages write-only, remove magic from nvme_init Philippe Mathieu-Daudé
@ 2020-09-21 16:29 ` Philippe Mathieu-Daudé
2020-09-21 16:29 ` [PATCH 2/6] block/nvme: Map doorbells pages write-only Philippe Mathieu-Daudé
` (6 subsequent siblings)
7 siblings, 0 replies; 14+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-09-21 16:29 UTC (permalink / raw)
To: Stefan Hajnoczi, qemu-devel
Cc: Kevin Wolf, Fam Zheng, Philippe Mathieu-Daudé,
qemu-block, Max Reitz
Pages are currently mapped READ/WRITE. To be able to use different
protections, add a new argument to qemu_vfio_pci_map_bar().
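A caller-side sketch of the resulting interface (the read/write call is the
one from the diff below; the read-only variant is only a hypothetical
illustration of what the new argument allows):

    /* Existing behaviour, now spelled out explicitly by the caller: */
    s->regs = qemu_vfio_pci_map_bar(s->vfio, 0, 0, NVME_BAR_SIZE,
                                    PROT_READ | PROT_WRITE, errp);
    /* A hypothetical read-only mapping would simply pass PROT_READ: */
    void *ro = qemu_vfio_pci_map_bar(s->vfio, 0, 0, size, PROT_READ, errp);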
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
---
include/qemu/vfio-helpers.h | 2 +-
block/nvme.c | 3 ++-
util/vfio-helpers.c | 4 ++--
3 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/include/qemu/vfio-helpers.h b/include/qemu/vfio-helpers.h
index 1f057c2b9e4..4491c8e1a6e 100644
--- a/include/qemu/vfio-helpers.h
+++ b/include/qemu/vfio-helpers.h
@@ -22,7 +22,7 @@ int qemu_vfio_dma_map(QEMUVFIOState *s, void *host, size_t size,
int qemu_vfio_dma_reset_temporary(QEMUVFIOState *s);
void qemu_vfio_dma_unmap(QEMUVFIOState *s, void *host);
void *qemu_vfio_pci_map_bar(QEMUVFIOState *s, int index,
- uint64_t offset, uint64_t size,
+ uint64_t offset, uint64_t size, int prot,
Error **errp);
void qemu_vfio_pci_unmap_bar(QEMUVFIOState *s, int index, void *bar,
uint64_t offset, uint64_t size);
diff --git a/block/nvme.c b/block/nvme.c
index f4f27b6da7d..5a4dc6a722a 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -712,7 +712,8 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
goto out;
}
- s->regs = qemu_vfio_pci_map_bar(s->vfio, 0, 0, NVME_BAR_SIZE, errp);
+ s->regs = qemu_vfio_pci_map_bar(s->vfio, 0, 0, NVME_BAR_SIZE,
+ PROT_READ | PROT_WRITE, errp);
if (!s->regs) {
ret = -EINVAL;
goto out;
diff --git a/util/vfio-helpers.c b/util/vfio-helpers.c
index 583bdfb36fc..9ac307e3d42 100644
--- a/util/vfio-helpers.c
+++ b/util/vfio-helpers.c
@@ -146,13 +146,13 @@ static int qemu_vfio_pci_init_bar(QEMUVFIOState *s, int index, Error **errp)
* Map a PCI bar area.
*/
void *qemu_vfio_pci_map_bar(QEMUVFIOState *s, int index,
- uint64_t offset, uint64_t size,
+ uint64_t offset, uint64_t size, int prot,
Error **errp)
{
void *p;
assert_bar_index_valid(s, index);
p = mmap(NULL, MIN(size, s->bar_region_info[index].size - offset),
- PROT_READ | PROT_WRITE, MAP_SHARED,
+ prot, MAP_SHARED,
s->device, s->bar_region_info[index].offset + offset);
if (p == MAP_FAILED) {
error_setg_errno(errp, errno, "Failed to map BAR region");
--
2.26.2
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 2/6] block/nvme: Map doorbells pages write-only
2020-09-21 16:29 [PATCH 0/6] block/nvme: Map doorbells pages write-only, remove magic from nvme_init Philippe Mathieu-Daudé
2020-09-21 16:29 ` [PATCH 1/6] util/vfio-helpers: Pass page protections to qemu_vfio_pci_map_bar() Philippe Mathieu-Daudé
@ 2020-09-21 16:29 ` Philippe Mathieu-Daudé
2020-09-22 8:18 ` Fam Zheng
2020-09-21 16:29 ` [PATCH 3/6] block/nvme: Reduce I/O registers scope Philippe Mathieu-Daudé
` (5 subsequent siblings)
7 siblings, 1 reply; 14+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-09-21 16:29 UTC (permalink / raw)
To: Stefan Hajnoczi, qemu-devel
Cc: Kevin Wolf, Fam Zheng, Philippe Mathieu-Daudé,
qemu-block, Max Reitz
Per the datasheet sections 3.1.13/3.1.14:
"The host should not read the doorbell registers."
As we don't need read access, map the doorbells with write-only
permission. We keep a reference to this mapped address in the
BDRVNVMeState structure.
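The resulting BAR 0 layout (a sketch derived from the diff below; the
offsets follow sizeof(NvmeBar) and NVME_DOORBELL_SIZE):

    /*
     * BAR 0:
     *   [0, sizeof(NvmeBar))                        control registers,
     *                                               PROT_READ | PROT_WRITE
     *   [sizeof(NvmeBar), sizeof(NvmeBar) + 4096)   doorbells,
     *                                               PROT_WRITE only
     */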
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
---
block/nvme.c | 29 +++++++++++++++++++----------
1 file changed, 19 insertions(+), 10 deletions(-)
diff --git a/block/nvme.c b/block/nvme.c
index 5a4dc6a722a..3c834da8fec 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -31,7 +31,7 @@
#define NVME_SQ_ENTRY_BYTES 64
#define NVME_CQ_ENTRY_BYTES 16
#define NVME_QUEUE_SIZE 128
-#define NVME_BAR_SIZE 8192
+#define NVME_DOORBELL_SIZE 4096
/*
* We have to leave one slot empty as that is the full queue case where
@@ -84,10 +84,6 @@ typedef struct {
/* Memory mapped registers */
typedef volatile struct {
NvmeBar ctrl;
- struct {
- uint32_t sq_tail;
- uint32_t cq_head;
- } doorbells[];
} NVMeRegs;
#define INDEX_ADMIN 0
@@ -103,6 +99,11 @@ struct BDRVNVMeState {
AioContext *aio_context;
QEMUVFIOState *vfio;
NVMeRegs *regs;
+ /* Memory mapped registers */
+ volatile struct {
+ uint32_t sq_tail;
+ uint32_t cq_head;
+ } *doorbells;
/* The submission/completion queue pairs.
* [0]: admin queue.
* [1..]: io queues.
@@ -247,14 +248,14 @@ static NVMeQueuePair *nvme_create_queue_pair(BDRVNVMeState *s,
error_propagate(errp, local_err);
goto fail;
}
- q->sq.doorbell = &s->regs->doorbells[idx * s->doorbell_scale].sq_tail;
+ q->sq.doorbell = &s->doorbells[idx * s->doorbell_scale].sq_tail;
nvme_init_queue(s, &q->cq, size, NVME_CQ_ENTRY_BYTES, &local_err);
if (local_err) {
error_propagate(errp, local_err);
goto fail;
}
- q->cq.doorbell = &s->regs->doorbells[idx * s->doorbell_scale].cq_head;
+ q->cq.doorbell = &s->doorbells[idx * s->doorbell_scale].cq_head;
return q;
fail:
@@ -712,13 +713,12 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
goto out;
}
- s->regs = qemu_vfio_pci_map_bar(s->vfio, 0, 0, NVME_BAR_SIZE,
+ s->regs = qemu_vfio_pci_map_bar(s->vfio, 0, 0, sizeof(NvmeBar),
PROT_READ | PROT_WRITE, errp);
if (!s->regs) {
ret = -EINVAL;
goto out;
}
-
/* Perform initialize sequence as described in NVMe spec "7.6.1
* Initialization". */
@@ -748,6 +748,13 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
}
}
+ s->doorbells = qemu_vfio_pci_map_bar(s->vfio, 0, sizeof(NvmeBar),
+ NVME_DOORBELL_SIZE, PROT_WRITE, errp);
+ if (!s->doorbells) {
+ ret = -EINVAL;
+ goto out;
+ }
+
/* Set up admin queue. */
s->queues = g_new(NVMeQueuePair *, 1);
s->queues[INDEX_ADMIN] = nvme_create_queue_pair(s, aio_context, 0,
@@ -873,7 +880,9 @@ static void nvme_close(BlockDriverState *bs)
&s->irq_notifier[MSIX_SHARED_IRQ_IDX],
false, NULL, NULL);
event_notifier_cleanup(&s->irq_notifier[MSIX_SHARED_IRQ_IDX]);
- qemu_vfio_pci_unmap_bar(s->vfio, 0, (void *)s->regs, 0, NVME_BAR_SIZE);
+ qemu_vfio_pci_unmap_bar(s->vfio, 0, (void *)s->doorbells,
+ sizeof(NvmeBar), NVME_DOORBELL_SIZE);
+ qemu_vfio_pci_unmap_bar(s->vfio, 0, (void *)s->regs, 0, sizeof(NvmeBar));
qemu_vfio_close(s->vfio);
g_free(s->device);
--
2.26.2
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 3/6] block/nvme: Reduce I/O registers scope
2020-09-21 16:29 [PATCH 0/6] block/nvme: Map doorbells pages write-only, remove magic from nvme_init Philippe Mathieu-Daudé
2020-09-21 16:29 ` [PATCH 1/6] util/vfio-helpers: Pass page protections to qemu_vfio_pci_map_bar() Philippe Mathieu-Daudé
2020-09-21 16:29 ` [PATCH 2/6] block/nvme: Map doorbells pages write-only Philippe Mathieu-Daudé
@ 2020-09-21 16:29 ` Philippe Mathieu-Daudé
2020-09-21 16:29 ` [PATCH 4/6] block/nvme: Drop NVMeRegs structure, directly use NvmeBar Philippe Mathieu-Daudé
` (4 subsequent siblings)
7 siblings, 0 replies; 14+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-09-21 16:29 UTC (permalink / raw)
To: Stefan Hajnoczi, qemu-devel
Cc: Kevin Wolf, Fam Zheng, Philippe Mathieu-Daudé,
qemu-block, Max Reitz
We only access the I/O registers in nvme_init().
Remove the reference from BDRVNVMeState and reduce its scope to that
function.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
---
block/nvme.c | 29 ++++++++++++++++-------------
1 file changed, 16 insertions(+), 13 deletions(-)
diff --git a/block/nvme.c b/block/nvme.c
index 3c834da8fec..e517c7539ff 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -98,7 +98,6 @@ enum {
struct BDRVNVMeState {
AioContext *aio_context;
QEMUVFIOState *vfio;
- NVMeRegs *regs;
/* Memory mapped registers */
volatile struct {
uint32_t sq_tail;
@@ -695,6 +694,7 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
uint64_t timeout_ms;
uint64_t deadline, now;
Error *local_err = NULL;
+ NVMeRegs *regs;
qemu_co_mutex_init(&s->dma_map_lock);
qemu_co_queue_init(&s->dma_flush_queue);
@@ -713,16 +713,16 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
goto out;
}
- s->regs = qemu_vfio_pci_map_bar(s->vfio, 0, 0, sizeof(NvmeBar),
- PROT_READ | PROT_WRITE, errp);
- if (!s->regs) {
+ regs = qemu_vfio_pci_map_bar(s->vfio, 0, 0, sizeof(NvmeBar),
+ PROT_READ | PROT_WRITE, errp);
+ if (!regs) {
ret = -EINVAL;
goto out;
}
/* Perform initialize sequence as described in NVMe spec "7.6.1
* Initialization". */
- cap = le64_to_cpu(s->regs->ctrl.cap);
+ cap = le64_to_cpu(regs->ctrl.cap);
if (!(cap & (1ULL << 37))) {
error_setg(errp, "Device doesn't support NVMe command set");
ret = -EINVAL;
@@ -735,10 +735,10 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
timeout_ms = MIN(500 * ((cap >> 24) & 0xFF), 30000);
/* Reset device to get a clean state. */
- s->regs->ctrl.cc = cpu_to_le32(le32_to_cpu(s->regs->ctrl.cc) & 0xFE);
+ regs->ctrl.cc = cpu_to_le32(le32_to_cpu(regs->ctrl.cc) & 0xFE);
/* Wait for CSTS.RDY = 0. */
deadline = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) + timeout_ms * SCALE_MS;
- while (le32_to_cpu(s->regs->ctrl.csts) & 0x1) {
+ while (le32_to_cpu(regs->ctrl.csts) & 0x1) {
if (qemu_clock_get_ns(QEMU_CLOCK_REALTIME) > deadline) {
error_setg(errp, "Timeout while waiting for device to reset (%"
PRId64 " ms)",
@@ -766,18 +766,18 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
}
s->nr_queues = 1;
QEMU_BUILD_BUG_ON(NVME_QUEUE_SIZE & 0xF000);
- s->regs->ctrl.aqa = cpu_to_le32((NVME_QUEUE_SIZE << 16) | NVME_QUEUE_SIZE);
- s->regs->ctrl.asq = cpu_to_le64(s->queues[INDEX_ADMIN]->sq.iova);
- s->regs->ctrl.acq = cpu_to_le64(s->queues[INDEX_ADMIN]->cq.iova);
+ regs->ctrl.aqa = cpu_to_le32((NVME_QUEUE_SIZE << 16) | NVME_QUEUE_SIZE);
+ regs->ctrl.asq = cpu_to_le64(s->queues[INDEX_ADMIN]->sq.iova);
+ regs->ctrl.acq = cpu_to_le64(s->queues[INDEX_ADMIN]->cq.iova);
/* After setting up all control registers we can enable device now. */
- s->regs->ctrl.cc = cpu_to_le32((ctz32(NVME_CQ_ENTRY_BYTES) << 20) |
+ regs->ctrl.cc = cpu_to_le32((ctz32(NVME_CQ_ENTRY_BYTES) << 20) |
(ctz32(NVME_SQ_ENTRY_BYTES) << 16) |
0x1);
/* Wait for CSTS.RDY = 1. */
now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
deadline = now + timeout_ms * 1000000;
- while (!(le32_to_cpu(s->regs->ctrl.csts) & 0x1)) {
+ while (!(le32_to_cpu(regs->ctrl.csts) & 0x1)) {
if (qemu_clock_get_ns(QEMU_CLOCK_REALTIME) > deadline) {
error_setg(errp, "Timeout while waiting for device to start (%"
PRId64 " ms)",
@@ -808,6 +808,10 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
ret = -EIO;
}
out:
+ if (regs) {
+ qemu_vfio_pci_unmap_bar(s->vfio, 0, (void *)regs, 0, sizeof(NvmeBar));
+ }
+
/* Cleaning up is done in nvme_file_open() upon error. */
return ret;
}
@@ -882,7 +886,6 @@ static void nvme_close(BlockDriverState *bs)
event_notifier_cleanup(&s->irq_notifier[MSIX_SHARED_IRQ_IDX]);
qemu_vfio_pci_unmap_bar(s->vfio, 0, (void *)s->doorbells,
sizeof(NvmeBar), NVME_DOORBELL_SIZE);
- qemu_vfio_pci_unmap_bar(s->vfio, 0, (void *)s->regs, 0, sizeof(NvmeBar));
qemu_vfio_close(s->vfio);
g_free(s->device);
--
2.26.2
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 4/6] block/nvme: Drop NVMeRegs structure, directly use NvmeBar
2020-09-21 16:29 [PATCH 0/6] block/nvme: Map doorbells pages write-only, remove magic from nvme_init Philippe Mathieu-Daudé
` (2 preceding siblings ...)
2020-09-21 16:29 ` [PATCH 3/6] block/nvme: Reduce I/O registers scope Philippe Mathieu-Daudé
@ 2020-09-21 16:29 ` Philippe Mathieu-Daudé
2020-09-21 16:29 ` [PATCH 5/6] block/nvme: Use register definitions from 'block/nvme.h' Philippe Mathieu-Daudé
` (3 subsequent siblings)
7 siblings, 0 replies; 14+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-09-21 16:29 UTC (permalink / raw)
To: Stefan Hajnoczi, qemu-devel
Cc: Kevin Wolf, Fam Zheng, Philippe Mathieu-Daudé,
qemu-block, Max Reitz
NVMeRegs only contains NvmeBar. Simplify the code by using NvmeBar
directly.
This triggers a checkpatch.pl error:
ERROR: Use of volatile is usually wrong, please add a comment
#30: FILE: block/nvme.c:691:
+ volatile NvmeBar *regs;
This is a false positive: in our case we are accessing memory-mapped I/O
registers, so the 'volatile' use is justified.
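A minimal sketch of why the qualifier matters for the polling loops below
(illustrative only; map_device_register() is a hypothetical helper):

    /* A polling loop on a memory-mapped status register. */
    volatile uint32_t *csts = map_device_register();
    while (!(*csts & 0x1)) {
        /* With volatile, each iteration issues a real load from the
         * device. Without it, the compiler may legally cache the first
         * value and turn this into an infinite loop. */
    }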
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
---
block/nvme.c | 23 +++++++++--------------
1 file changed, 9 insertions(+), 14 deletions(-)
diff --git a/block/nvme.c b/block/nvme.c
index e517c7539ff..83bae6b2f13 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -81,11 +81,6 @@ typedef struct {
QEMUBH *completion_bh;
} NVMeQueuePair;
-/* Memory mapped registers */
-typedef volatile struct {
- NvmeBar ctrl;
-} NVMeRegs;
-
#define INDEX_ADMIN 0
#define INDEX_IO(n) (1 + n)
@@ -694,7 +689,7 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
uint64_t timeout_ms;
uint64_t deadline, now;
Error *local_err = NULL;
- NVMeRegs *regs;
+ volatile NvmeBar *regs;
qemu_co_mutex_init(&s->dma_map_lock);
qemu_co_queue_init(&s->dma_flush_queue);
@@ -722,7 +717,7 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
/* Perform initialize sequence as described in NVMe spec "7.6.1
* Initialization". */
- cap = le64_to_cpu(regs->ctrl.cap);
+ cap = le64_to_cpu(regs->cap);
if (!(cap & (1ULL << 37))) {
error_setg(errp, "Device doesn't support NVMe command set");
ret = -EINVAL;
@@ -735,10 +730,10 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
timeout_ms = MIN(500 * ((cap >> 24) & 0xFF), 30000);
/* Reset device to get a clean state. */
- regs->ctrl.cc = cpu_to_le32(le32_to_cpu(regs->ctrl.cc) & 0xFE);
+ regs->cc = cpu_to_le32(le32_to_cpu(regs->cc) & 0xFE);
/* Wait for CSTS.RDY = 0. */
deadline = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) + timeout_ms * SCALE_MS;
- while (le32_to_cpu(regs->ctrl.csts) & 0x1) {
+ while (le32_to_cpu(regs->csts) & 0x1) {
if (qemu_clock_get_ns(QEMU_CLOCK_REALTIME) > deadline) {
error_setg(errp, "Timeout while waiting for device to reset (%"
PRId64 " ms)",
@@ -766,18 +761,18 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
}
s->nr_queues = 1;
QEMU_BUILD_BUG_ON(NVME_QUEUE_SIZE & 0xF000);
- regs->ctrl.aqa = cpu_to_le32((NVME_QUEUE_SIZE << 16) | NVME_QUEUE_SIZE);
- regs->ctrl.asq = cpu_to_le64(s->queues[INDEX_ADMIN]->sq.iova);
- regs->ctrl.acq = cpu_to_le64(s->queues[INDEX_ADMIN]->cq.iova);
+ regs->aqa = cpu_to_le32((NVME_QUEUE_SIZE << 16) | NVME_QUEUE_SIZE);
+ regs->asq = cpu_to_le64(s->queues[INDEX_ADMIN]->sq.iova);
+ regs->acq = cpu_to_le64(s->queues[INDEX_ADMIN]->cq.iova);
/* After setting up all control registers we can enable device now. */
- regs->ctrl.cc = cpu_to_le32((ctz32(NVME_CQ_ENTRY_BYTES) << 20) |
+ regs->cc = cpu_to_le32((ctz32(NVME_CQ_ENTRY_BYTES) << 20) |
(ctz32(NVME_SQ_ENTRY_BYTES) << 16) |
0x1);
/* Wait for CSTS.RDY = 1. */
now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
deadline = now + timeout_ms * 1000000;
- while (!(le32_to_cpu(regs->ctrl.csts) & 0x1)) {
+ while (!(le32_to_cpu(regs->csts) & 0x1)) {
if (qemu_clock_get_ns(QEMU_CLOCK_REALTIME) > deadline) {
error_setg(errp, "Timeout while waiting for device to start (%"
PRId64 " ms)",
--
2.26.2
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 5/6] block/nvme: Use register definitions from 'block/nvme.h'
2020-09-21 16:29 [PATCH 0/6] block/nvme: Map doorbells pages write-only, remove magic from nvme_init Philippe Mathieu-Daudé
` (3 preceding siblings ...)
2020-09-21 16:29 ` [PATCH 4/6] block/nvme: Drop NVMeRegs structure, directly use NvmeBar Philippe Mathieu-Daudé
@ 2020-09-21 16:29 ` Philippe Mathieu-Daudé
2020-09-21 16:29 ` [PATCH 6/6] block/nvme: Replace magic value by SCALE_MS definition Philippe Mathieu-Daudé
` (2 subsequent siblings)
7 siblings, 0 replies; 14+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-09-21 16:29 UTC (permalink / raw)
To: Stefan Hajnoczi, qemu-devel
Cc: Kevin Wolf, Fam Zheng, Philippe Mathieu-Daudé,
qemu-block, Max Reitz
Use the NVMe register definitions from "block/nvme.h", which makes
reviewing the code against the datasheet a bit easier.
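As an example of what these helpers stand for (a sketch inferred from the
open-coded expression being replaced below; the real definitions live in
"block/nvme.h"):

    /* Roughly equivalent to the previous '(cap >> 24) & 0xFF': */
    #define CAP_TO_SHIFT      24
    #define CAP_TO_MASK       0xff
    #define NVME_CAP_TO(cap)  (((cap) >> CAP_TO_SHIFT) & CAP_TO_MASK)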
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
---
block/nvme.c | 21 +++++++++++----------
1 file changed, 11 insertions(+), 10 deletions(-)
diff --git a/block/nvme.c b/block/nvme.c
index 83bae6b2f13..d18e897c7c9 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -718,22 +718,22 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
* Initialization". */
cap = le64_to_cpu(regs->cap);
- if (!(cap & (1ULL << 37))) {
+ if (!NVME_CAP_CSS(cap)) {
error_setg(errp, "Device doesn't support NVMe command set");
ret = -EINVAL;
goto out;
}
- s->page_size = MAX(4096, 1 << (12 + ((cap >> 48) & 0xF)));
- s->doorbell_scale = (4 << (((cap >> 32) & 0xF))) / sizeof(uint32_t);
+ s->page_size = MAX(4096, 1 << NVME_CAP_MPSMIN(cap));
+ s->doorbell_scale = (4 << NVME_CAP_DSTRD(cap)) / sizeof(uint32_t);
bs->bl.opt_mem_alignment = s->page_size;
- timeout_ms = MIN(500 * ((cap >> 24) & 0xFF), 30000);
+ timeout_ms = MIN(500 * NVME_CAP_TO(cap), 30000);
/* Reset device to get a clean state. */
regs->cc = cpu_to_le32(le32_to_cpu(regs->cc) & 0xFE);
/* Wait for CSTS.RDY = 0. */
deadline = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) + timeout_ms * SCALE_MS;
- while (le32_to_cpu(regs->csts) & 0x1) {
+ while (NVME_CSTS_RDY(le32_to_cpu(regs->csts))) {
if (qemu_clock_get_ns(QEMU_CLOCK_REALTIME) > deadline) {
error_setg(errp, "Timeout while waiting for device to reset (%"
PRId64 " ms)",
@@ -761,18 +761,19 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
}
s->nr_queues = 1;
QEMU_BUILD_BUG_ON(NVME_QUEUE_SIZE & 0xF000);
- regs->aqa = cpu_to_le32((NVME_QUEUE_SIZE << 16) | NVME_QUEUE_SIZE);
+ regs->aqa = cpu_to_le32((NVME_QUEUE_SIZE << AQA_ACQS_SHIFT) |
+ (NVME_QUEUE_SIZE << AQA_ASQS_SHIFT));
regs->asq = cpu_to_le64(s->queues[INDEX_ADMIN]->sq.iova);
regs->acq = cpu_to_le64(s->queues[INDEX_ADMIN]->cq.iova);
/* After setting up all control registers we can enable device now. */
- regs->cc = cpu_to_le32((ctz32(NVME_CQ_ENTRY_BYTES) << 20) |
- (ctz32(NVME_SQ_ENTRY_BYTES) << 16) |
- 0x1);
+ regs->cc = cpu_to_le32((ctz32(NVME_CQ_ENTRY_BYTES) << CC_IOCQES_SHIFT) |
+ (ctz32(NVME_SQ_ENTRY_BYTES) << CC_IOSQES_SHIFT) |
+ CC_EN_MASK);
/* Wait for CSTS.RDY = 1. */
now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
deadline = now + timeout_ms * 1000000;
- while (!(le32_to_cpu(regs->csts) & 0x1)) {
+ while (!NVME_CSTS_RDY(le32_to_cpu(regs->csts))) {
if (qemu_clock_get_ns(QEMU_CLOCK_REALTIME) > deadline) {
error_setg(errp, "Timeout while waiting for device to start (%"
PRId64 " ms)",
--
2.26.2
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 6/6] block/nvme: Replace magic value by SCALE_MS definition
2020-09-21 16:29 [PATCH 0/6] block/nvme: Map doorbells pages write-only, remove magic from nvme_init Philippe Mathieu-Daudé
` (4 preceding siblings ...)
2020-09-21 16:29 ` [PATCH 5/6] block/nvme: Use register definitions from 'block/nvme.h' Philippe Mathieu-Daudé
@ 2020-09-21 16:29 ` Philippe Mathieu-Daudé
2020-09-21 21:38 ` [PATCH 0/6] block/nvme: Map doorbells pages write-only, remove magic from nvme_init no-reply
2020-09-21 21:56 ` no-reply
7 siblings, 0 replies; 14+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-09-21 16:29 UTC (permalink / raw)
To: Stefan Hajnoczi, qemu-devel
Cc: Kevin Wolf, Fam Zheng, Philippe Mathieu-Daudé,
qemu-block, Max Reitz
Use the self-explanatory SCALE_MS definition instead of a magic value
(missed in the similar commit e4f310fe7f5).
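The equivalence, for reference (SCALE_MS is QEMU's nanoseconds-per-millisecond
scale, as implied by the value it replaces below):

    deadline = now + timeout_ms * SCALE_MS;   /* SCALE_MS == 1000000 ns */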
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
---
block/nvme.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/block/nvme.c b/block/nvme.c
index d18e897c7c9..b3dab1c4610 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -772,7 +772,7 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
CC_EN_MASK);
/* Wait for CSTS.RDY = 1. */
now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
- deadline = now + timeout_ms * 1000000;
+ deadline = now + timeout_ms * SCALE_MS;
while (!NVME_CSTS_RDY(le32_to_cpu(regs->csts))) {
if (qemu_clock_get_ns(QEMU_CLOCK_REALTIME) > deadline) {
error_setg(errp, "Timeout while waiting for device to start (%"
--
2.26.2
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 0/6] block/nvme: Map doorbells pages write-only, remove magic from nvme_init
2020-09-21 16:29 [PATCH 0/6] block/nvme: Map doorbells pages write-only, remove magic from nvme_init Philippe Mathieu-Daudé
` (5 preceding siblings ...)
2020-09-21 16:29 ` [PATCH 6/6] block/nvme: Replace magic value by SCALE_MS definition Philippe Mathieu-Daudé
@ 2020-09-21 21:38 ` no-reply
2020-09-21 21:56 ` no-reply
7 siblings, 0 replies; 14+ messages in thread
From: no-reply @ 2020-09-21 21:38 UTC (permalink / raw)
To: philmd; +Cc: kwolf, fam, qemu-block, qemu-devel, mreitz, stefanha, philmd
Patchew URL: https://patchew.org/QEMU/20200921162949.553863-1-philmd@redhat.com/
Hi,
This series seems to have some coding style problems. See output below for
more information:
Type: series
Message-id: 20200921162949.553863-1-philmd@redhat.com
Subject: [PATCH 0/6] block/nvme: Map doorbells pages write-only, remove magic from nvme_init
=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===
Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
b0ef8b7 block/nvme: Replace magic value by SCALE_MS definition
1792c4d block/nvme: Use register definitions from 'block/nvme.h'
352be0c block/nvme: Drop NVMeRegs structure, directly use NvmeBar
7aa184a block/nvme: Reduce I/O registers scope
4b1effc block/nvme: Map doorbells pages write-only
7ae4653 util/vfio-helpers: Pass page protections to qemu_vfio_pci_map_bar()
=== OUTPUT BEGIN ===
1/6 Checking commit 7ae46536369a (util/vfio-helpers: Pass page protections to qemu_vfio_pci_map_bar())
2/6 Checking commit 4b1effcfcd2b (block/nvme: Map doorbells pages write-only)
3/6 Checking commit 7aa184a756dc (block/nvme: Reduce I/O registers scope)
4/6 Checking commit 352be0c97869 (block/nvme: Drop NVMeRegs structure, directly use NvmeBar)
ERROR: Use of volatile is usually wrong, please add a comment
#43: FILE: block/nvme.c:692:
+ volatile NvmeBar *regs;
total: 1 errors, 0 warnings, 62 lines checked
Patch 4/6 has style problems, please review. If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
5/6 Checking commit 1792c4da4bc2 (block/nvme: Use register definitions from 'block/nvme.h')
6/6 Checking commit b0ef8b776eb1 (block/nvme: Replace magic value by SCALE_MS definition)
=== OUTPUT END ===
Test command exited with code: 1
The full log is available at
http://patchew.org/logs/20200921162949.553863-1-philmd@redhat.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 0/6] block/nvme: Map doorbells pages write-only, remove magic from nvme_init
2020-09-21 16:29 [PATCH 0/6] block/nvme: Map doorbells pages write-only, remove magic from nvme_init Philippe Mathieu-Daudé
` (6 preceding siblings ...)
2020-09-21 21:38 ` [PATCH 0/6] block/nvme: Map doorbells pages write-only, remove magic from nvme_init no-reply
@ 2020-09-21 21:56 ` no-reply
7 siblings, 0 replies; 14+ messages in thread
From: no-reply @ 2020-09-21 21:56 UTC (permalink / raw)
To: philmd; +Cc: kwolf, fam, qemu-block, qemu-devel, mreitz, stefanha, philmd
Patchew URL: https://patchew.org/QEMU/20200921162949.553863-1-philmd@redhat.com/
Hi,
This series failed the docker-quick@centos7 build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.
=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-centos7 V=1 NETWORK=1
time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
=== TEST SCRIPT END ===
C linker for the host machine: cc ld.bfd 2.27-43
Host machine cpu family: x86_64
Host machine cpu: x86_64
../src/meson.build:10: WARNING: Module unstable-keyval has no backwards or forwards compatibility and might not exist in future releases.
Program sh found: YES
Program python3 found: YES (/usr/bin/python3)
Configuring ninjatool using configuration
---
Compiling C object authz/libauthz.fa.p/listfile.c.o
Compiling C object authz/libauthz.fa.p/list.c.o
../src/block/nvme.c: In function 'nvme_file_open':
../src/block/nvme.c:807:8: error: 'regs' may be used uninitialized in this function [-Werror=maybe-uninitialized]
if (regs) {
^
../src/block/nvme.c:692:23: note: 'regs' was declared here
volatile NvmeBar *regs;
^
cc1: all warnings being treated as errors
make: *** [libblock.fa.p/block_nvme.c.o] Error 1
make: *** Waiting for unfinished jobs....
Traceback (most recent call last):
File "./tests/docker/docker.py", line 709, in <module>
---
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--rm', '--label', 'com.qemu.instance.uuid=99233811d86a42b1b27aa376bb85979d', '-u', '1001', '--security-opt', 'seccomp=unconfined', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-wu686lwl/src/docker-src.2020-09-21-17.52.21.21740:/var/tmp/qemu:z,ro', 'qemu/centos7', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit status 2.
filter=--filter=label=com.qemu.instance.uuid=99233811d86a42b1b27aa376bb85979d
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-wu686lwl/src'
make: *** [docker-run-test-quick@centos7] Error 2
real 3m46.350s
user 0m15.940s
The full log is available at
http://patchew.org/logs/20200921162949.553863-1-philmd@redhat.com/testing.docker-quick@centos7/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/6] block/nvme: Map doorbells pages write-only
2020-09-21 16:29 ` [PATCH 2/6] block/nvme: Map doorbells pages write-only Philippe Mathieu-Daudé
@ 2020-09-22 8:18 ` Fam Zheng
2020-09-22 8:41 ` Philippe Mathieu-Daudé
0 siblings, 1 reply; 14+ messages in thread
From: Fam Zheng @ 2020-09-22 8:18 UTC (permalink / raw)
To: Philippe Mathieu-Daudé, Stefan Hajnoczi, qemu-devel
Cc: Kevin Wolf, qemu-block, Max Reitz
On Mon, 2020-09-21 at 18:29 +0200, Philippe Mathieu-Daudé wrote:
> Per the datasheet sections 3.1.13/3.1.14:
> "The host should not read the doorbell registers."
>
> As we don't need read access, map the doorbells with write-only
> permission. We keep a reference to this mapped address in the
> BDRVNVMeState structure.
Besides looking more correct in access mode, is there any side effect
of WO mapping?
Fam
>
> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> ---
> block/nvme.c | 29 +++++++++++++++++++----------
> 1 file changed, 19 insertions(+), 10 deletions(-)
>
> diff --git a/block/nvme.c b/block/nvme.c
> index 5a4dc6a722a..3c834da8fec 100644
> --- a/block/nvme.c
> +++ b/block/nvme.c
> @@ -31,7 +31,7 @@
> #define NVME_SQ_ENTRY_BYTES 64
> #define NVME_CQ_ENTRY_BYTES 16
> #define NVME_QUEUE_SIZE 128
> -#define NVME_BAR_SIZE 8192
> +#define NVME_DOORBELL_SIZE 4096
>
> /*
> * We have to leave one slot empty as that is the full queue case
> where
> @@ -84,10 +84,6 @@ typedef struct {
> /* Memory mapped registers */
> typedef volatile struct {
> NvmeBar ctrl;
> - struct {
> - uint32_t sq_tail;
> - uint32_t cq_head;
> - } doorbells[];
> } NVMeRegs;
>
> #define INDEX_ADMIN 0
> @@ -103,6 +99,11 @@ struct BDRVNVMeState {
> AioContext *aio_context;
> QEMUVFIOState *vfio;
> NVMeRegs *regs;
> + /* Memory mapped registers */
> + volatile struct {
> + uint32_t sq_tail;
> + uint32_t cq_head;
> + } *doorbells;
> /* The submission/completion queue pairs.
> * [0]: admin queue.
> * [1..]: io queues.
> @@ -247,14 +248,14 @@ static NVMeQueuePair
> *nvme_create_queue_pair(BDRVNVMeState *s,
> error_propagate(errp, local_err);
> goto fail;
> }
> - q->sq.doorbell = &s->regs->doorbells[idx * s-
> >doorbell_scale].sq_tail;
> + q->sq.doorbell = &s->doorbells[idx * s->doorbell_scale].sq_tail;
>
> nvme_init_queue(s, &q->cq, size, NVME_CQ_ENTRY_BYTES,
> &local_err);
> if (local_err) {
> error_propagate(errp, local_err);
> goto fail;
> }
> - q->cq.doorbell = &s->regs->doorbells[idx * s-
> >doorbell_scale].cq_head;
> + q->cq.doorbell = &s->doorbells[idx * s->doorbell_scale].cq_head;
>
> return q;
> fail:
> @@ -712,13 +713,12 @@ static int nvme_init(BlockDriverState *bs,
> const char *device, int namespace,
> goto out;
> }
>
> - s->regs = qemu_vfio_pci_map_bar(s->vfio, 0, 0, NVME_BAR_SIZE,
> + s->regs = qemu_vfio_pci_map_bar(s->vfio, 0, 0, sizeof(NvmeBar),
> PROT_READ | PROT_WRITE, errp);
> if (!s->regs) {
> ret = -EINVAL;
> goto out;
> }
> -
> /* Perform initialize sequence as described in NVMe spec "7.6.1
> * Initialization". */
>
> @@ -748,6 +748,13 @@ static int nvme_init(BlockDriverState *bs, const
> char *device, int namespace,
> }
> }
>
> + s->doorbells = qemu_vfio_pci_map_bar(s->vfio, 0,
> sizeof(NvmeBar),
> + NVME_DOORBELL_SIZE,
> PROT_WRITE, errp);
> + if (!s->doorbells) {
> + ret = -EINVAL;
> + goto out;
> + }
> +
> /* Set up admin queue. */
> s->queues = g_new(NVMeQueuePair *, 1);
> s->queues[INDEX_ADMIN] = nvme_create_queue_pair(s, aio_context,
> 0,
> @@ -873,7 +880,9 @@ static void nvme_close(BlockDriverState *bs)
> &s->irq_notifier[MSIX_SHARED_IRQ_IDX],
> false, NULL, NULL);
> event_notifier_cleanup(&s->irq_notifier[MSIX_SHARED_IRQ_IDX]);
> - qemu_vfio_pci_unmap_bar(s->vfio, 0, (void *)s->regs, 0,
> NVME_BAR_SIZE);
> + qemu_vfio_pci_unmap_bar(s->vfio, 0, (void *)s->doorbells,
> + sizeof(NvmeBar), NVME_DOORBELL_SIZE);
> + qemu_vfio_pci_unmap_bar(s->vfio, 0, (void *)s->regs, 0,
> sizeof(NvmeBar));
> qemu_vfio_close(s->vfio);
>
> g_free(s->device);
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/6] block/nvme: Map doorbells pages write-only
2020-09-22 8:18 ` Fam Zheng
@ 2020-09-22 8:41 ` Philippe Mathieu-Daudé
2020-09-22 9:04 ` Paolo Bonzini
2020-09-22 9:33 ` Fam Zheng
0 siblings, 2 replies; 14+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-09-22 8:41 UTC (permalink / raw)
To: Fam Zheng, Stefan Hajnoczi, qemu-devel
Cc: Kevin Wolf, Paolo Bonzini, qemu-block, Max Reitz
Hi Fam,
+Paolo?
On 9/22/20 10:18 AM, Fam Zheng wrote:
> On Mon, 2020-09-21 at 18:29 +0200, Philippe Mathieu-Daudé wrote:
>> Per the datasheet sections 3.1.13/3.1.14:
>> "The host should not read the doorbell registers."
>>
>> As we don't need read access, map the doorbells with write-only
>> permission. We keep a reference to this mapped address in the
>> BDRVNVMeState structure.
>
> Besides looking more correct in access mode, is there any side effect
> of WO mapping?
TBH I don't have enough knowledge to answer this question.
I tested successfully on X86. I'm writing more tests.
>
> Fam
>
>>
>> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
>> ---
>> block/nvme.c | 29 +++++++++++++++++++----------
>> 1 file changed, 19 insertions(+), 10 deletions(-)
>>
>> diff --git a/block/nvme.c b/block/nvme.c
>> index 5a4dc6a722a..3c834da8fec 100644
>> --- a/block/nvme.c
>> +++ b/block/nvme.c
>> @@ -31,7 +31,7 @@
>> #define NVME_SQ_ENTRY_BYTES 64
>> #define NVME_CQ_ENTRY_BYTES 16
>> #define NVME_QUEUE_SIZE 128
>> -#define NVME_BAR_SIZE 8192
>> +#define NVME_DOORBELL_SIZE 4096
>>
>> /*
>> * We have to leave one slot empty as that is the full queue case
>> where
>> @@ -84,10 +84,6 @@ typedef struct {
>> /* Memory mapped registers */
>> typedef volatile struct {
>> NvmeBar ctrl;
>> - struct {
>> - uint32_t sq_tail;
>> - uint32_t cq_head;
>> - } doorbells[];
>> } NVMeRegs;
>>
>> #define INDEX_ADMIN 0
>> @@ -103,6 +99,11 @@ struct BDRVNVMeState {
>> AioContext *aio_context;
>> QEMUVFIOState *vfio;
>> NVMeRegs *regs;
>> + /* Memory mapped registers */
>> + volatile struct {
>> + uint32_t sq_tail;
>> + uint32_t cq_head;
>> + } *doorbells;
>> /* The submission/completion queue pairs.
>> * [0]: admin queue.
>> * [1..]: io queues.
>> @@ -247,14 +248,14 @@ static NVMeQueuePair
>> *nvme_create_queue_pair(BDRVNVMeState *s,
>> error_propagate(errp, local_err);
>> goto fail;
>> }
>> - q->sq.doorbell = &s->regs->doorbells[idx * s-
>>> doorbell_scale].sq_tail;
>> + q->sq.doorbell = &s->doorbells[idx * s->doorbell_scale].sq_tail;
>>
>> nvme_init_queue(s, &q->cq, size, NVME_CQ_ENTRY_BYTES,
>> &local_err);
>> if (local_err) {
>> error_propagate(errp, local_err);
>> goto fail;
>> }
>> - q->cq.doorbell = &s->regs->doorbells[idx * s-
>>> doorbell_scale].cq_head;
>> + q->cq.doorbell = &s->doorbells[idx * s->doorbell_scale].cq_head;
>>
>> return q;
>> fail:
>> @@ -712,13 +713,12 @@ static int nvme_init(BlockDriverState *bs,
>> const char *device, int namespace,
>> goto out;
>> }
>>
>> - s->regs = qemu_vfio_pci_map_bar(s->vfio, 0, 0, NVME_BAR_SIZE,
>> + s->regs = qemu_vfio_pci_map_bar(s->vfio, 0, 0, sizeof(NvmeBar),
>> PROT_READ | PROT_WRITE, errp);
>> if (!s->regs) {
>> ret = -EINVAL;
>> goto out;
>> }
>> -
>> /* Perform initialize sequence as described in NVMe spec "7.6.1
>> * Initialization". */
>>
>> @@ -748,6 +748,13 @@ static int nvme_init(BlockDriverState *bs, const
>> char *device, int namespace,
>> }
>> }
>>
>> + s->doorbells = qemu_vfio_pci_map_bar(s->vfio, 0,
>> sizeof(NvmeBar),
>> + NVME_DOORBELL_SIZE,
>> PROT_WRITE, errp);
>> + if (!s->doorbells) {
>> + ret = -EINVAL;
>> + goto out;
>> + }
>> +
>> /* Set up admin queue. */
>> s->queues = g_new(NVMeQueuePair *, 1);
>> s->queues[INDEX_ADMIN] = nvme_create_queue_pair(s, aio_context,
>> 0,
>> @@ -873,7 +880,9 @@ static void nvme_close(BlockDriverState *bs)
>> &s->irq_notifier[MSIX_SHARED_IRQ_IDX],
>> false, NULL, NULL);
>> event_notifier_cleanup(&s->irq_notifier[MSIX_SHARED_IRQ_IDX]);
>> - qemu_vfio_pci_unmap_bar(s->vfio, 0, (void *)s->regs, 0,
>> NVME_BAR_SIZE);
>> + qemu_vfio_pci_unmap_bar(s->vfio, 0, (void *)s->doorbells,
>> + sizeof(NvmeBar), NVME_DOORBELL_SIZE);
>> + qemu_vfio_pci_unmap_bar(s->vfio, 0, (void *)s->regs, 0,
>> sizeof(NvmeBar));
>> qemu_vfio_close(s->vfio);
>>
>> g_free(s->device);
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/6] block/nvme: Map doorbells pages write-only
2020-09-22 8:41 ` Philippe Mathieu-Daudé
@ 2020-09-22 9:04 ` Paolo Bonzini
2020-09-22 9:36 ` Philippe Mathieu-Daudé
2020-09-22 9:33 ` Fam Zheng
1 sibling, 1 reply; 14+ messages in thread
From: Paolo Bonzini @ 2020-09-22 9:04 UTC (permalink / raw)
To: Philippe Mathieu-Daudé, Fam Zheng, Stefan Hajnoczi, qemu-devel
Cc: Kevin Wolf, qemu-block, Max Reitz
On 22/09/20 10:41, Philippe Mathieu-Daudé wrote:
>> Besides looking more correct in access mode, is there any side effect
>> of WO mapping?
> TBH I don't have enough knowledge to answer this question.
> I tested successfully on X86. I'm writing more tests.
No problem with doing this, but PROT_WRITE does not work at all on x86.
:) PROT_EXEC works if you have a machine with PKRU, but PROT_WRITE
silently becomes PROT_READ|PROT_WRITE because the processor does not
support it.
Paolo
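A quick way to observe this behaviour (a self-contained sketch, assuming an
x86-64 Linux host; illustrative only, not part of the patch):

    #include <stdio.h>
    #include <sys/mman.h>

    int main(void)
    {
        /* Ask for a write-only anonymous page. */
        unsigned char *p = mmap(NULL, 4096, PROT_WRITE,
                                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) {
            return 1;
        }
        p[0] = 0x42;
        /* On x86 this read succeeds: the page tables cannot express
         * write-only, so PROT_WRITE silently implies PROT_READ. */
        printf("read back: 0x%x\n", p[0]);
        return 0;
    }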
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/6] block/nvme: Map doorbells pages write-only
2020-09-22 8:41 ` Philippe Mathieu-Daudé
2020-09-22 9:04 ` Paolo Bonzini
@ 2020-09-22 9:33 ` Fam Zheng
1 sibling, 0 replies; 14+ messages in thread
From: Fam Zheng @ 2020-09-22 9:33 UTC (permalink / raw)
To: Philippe Mathieu-Daudé, Stefan Hajnoczi, qemu-devel
Cc: Kevin Wolf, Paolo Bonzini, qemu-block, Max Reitz
On Tue, 2020-09-22 at 10:41 +0200, Philippe Mathieu-Daudé wrote:
> Hi Fam,
>
> +Paolo?
>
> On 9/22/20 10:18 AM, Fam Zheng wrote:
> > On Mon, 2020-09-21 at 18:29 +0200, Philippe Mathieu-Daudé wrote:
> > > Per the datasheet sections 3.1.13/3.1.14:
> > > "The host should not read the doorbell registers."
> > >
> > > As we don't need read access, map the doorbells with write-only
> > > permission. We keep a reference to this mapped address in the
> > > BDRVNVMeState structure.
> >
> > Besides looking more correct in access mode, is there any side
> > effect
> > of WO mapping?
>
> TBH I don't have enough knowledge to answer this question.
> I tested successfully on X86. I'm writing more tests.
The reason I'm asking is more because x86 pages are either RO or RW. So
I'd like to see if there's a practical reason behind this patch (I have
no idea about the effects on MTRR and/or IOMMU).
Fam
>
> >
> > Fam
> >
> > >
> > > Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> > > ---
> > > block/nvme.c | 29 +++++++++++++++++++----------
> > > 1 file changed, 19 insertions(+), 10 deletions(-)
> > >
> > > diff --git a/block/nvme.c b/block/nvme.c
> > > index 5a4dc6a722a..3c834da8fec 100644
> > > --- a/block/nvme.c
> > > +++ b/block/nvme.c
> > > @@ -31,7 +31,7 @@
> > > #define NVME_SQ_ENTRY_BYTES 64
> > > #define NVME_CQ_ENTRY_BYTES 16
> > > #define NVME_QUEUE_SIZE 128
> > > -#define NVME_BAR_SIZE 8192
> > > +#define NVME_DOORBELL_SIZE 4096
> > >
> > > /*
> > > * We have to leave one slot empty as that is the full queue
> > > case
> > > where
> > > @@ -84,10 +84,6 @@ typedef struct {
> > > /* Memory mapped registers */
> > > typedef volatile struct {
> > > NvmeBar ctrl;
> > > - struct {
> > > - uint32_t sq_tail;
> > > - uint32_t cq_head;
> > > - } doorbells[];
> > > } NVMeRegs;
> > >
> > > #define INDEX_ADMIN 0
> > > @@ -103,6 +99,11 @@ struct BDRVNVMeState {
> > > AioContext *aio_context;
> > > QEMUVFIOState *vfio;
> > > NVMeRegs *regs;
> > > + /* Memory mapped registers */
> > > + volatile struct {
> > > + uint32_t sq_tail;
> > > + uint32_t cq_head;
> > > + } *doorbells;
> > > /* The submission/completion queue pairs.
> > > * [0]: admin queue.
> > > * [1..]: io queues.
> > > @@ -247,14 +248,14 @@ static NVMeQueuePair
> > > *nvme_create_queue_pair(BDRVNVMeState *s,
> > > error_propagate(errp, local_err);
> > > goto fail;
> > > }
> > > - q->sq.doorbell = &s->regs->doorbells[idx * s-
> > > > doorbell_scale].sq_tail;
> > >
> > > + q->sq.doorbell = &s->doorbells[idx * s-
> > > >doorbell_scale].sq_tail;
> > >
> > > nvme_init_queue(s, &q->cq, size, NVME_CQ_ENTRY_BYTES,
> > > &local_err);
> > > if (local_err) {
> > > error_propagate(errp, local_err);
> > > goto fail;
> > > }
> > > - q->cq.doorbell = &s->regs->doorbells[idx * s-
> > > > doorbell_scale].cq_head;
> > >
> > > + q->cq.doorbell = &s->doorbells[idx * s-
> > > >doorbell_scale].cq_head;
> > >
> > > return q;
> > > fail:
> > > @@ -712,13 +713,12 @@ static int nvme_init(BlockDriverState *bs,
> > > const char *device, int namespace,
> > > goto out;
> > > }
> > >
> > > - s->regs = qemu_vfio_pci_map_bar(s->vfio, 0, 0,
> > > NVME_BAR_SIZE,
> > > + s->regs = qemu_vfio_pci_map_bar(s->vfio, 0, 0,
> > > sizeof(NvmeBar),
> > > PROT_READ | PROT_WRITE,
> > > errp);
> > > if (!s->regs) {
> > > ret = -EINVAL;
> > > goto out;
> > > }
> > > -
> > > /* Perform initialize sequence as described in NVMe spec
> > > "7.6.1
> > > * Initialization". */
> > >
> > > @@ -748,6 +748,13 @@ static int nvme_init(BlockDriverState *bs,
> > > const
> > > char *device, int namespace,
> > > }
> > > }
> > >
> > > + s->doorbells = qemu_vfio_pci_map_bar(s->vfio, 0,
> > > sizeof(NvmeBar),
> > > + NVME_DOORBELL_SIZE,
> > > PROT_WRITE, errp);
> > > + if (!s->doorbells) {
> > > + ret = -EINVAL;
> > > + goto out;
> > > + }
> > > +
> > > /* Set up admin queue. */
> > > s->queues = g_new(NVMeQueuePair *, 1);
> > > s->queues[INDEX_ADMIN] = nvme_create_queue_pair(s,
> > > aio_context,
> > > 0,
> > > @@ -873,7 +880,9 @@ static void nvme_close(BlockDriverState *bs)
> > > &s-
> > > >irq_notifier[MSIX_SHARED_IRQ_IDX],
> > > false, NULL, NULL);
> > > event_notifier_cleanup(&s-
> > > >irq_notifier[MSIX_SHARED_IRQ_IDX]);
> > > - qemu_vfio_pci_unmap_bar(s->vfio, 0, (void *)s->regs, 0,
> > > NVME_BAR_SIZE);
> > > + qemu_vfio_pci_unmap_bar(s->vfio, 0, (void *)s->doorbells,
> > > + sizeof(NvmeBar),
> > > NVME_DOORBELL_SIZE);
> > > + qemu_vfio_pci_unmap_bar(s->vfio, 0, (void *)s->regs, 0,
> > > sizeof(NvmeBar));
> > > qemu_vfio_close(s->vfio);
> > >
> > > g_free(s->device);
>
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/6] block/nvme: Map doorbells pages write-only
2020-09-22 9:04 ` Paolo Bonzini
@ 2020-09-22 9:36 ` Philippe Mathieu-Daudé
0 siblings, 0 replies; 14+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-09-22 9:36 UTC (permalink / raw)
To: Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, qemu-devel
Cc: Kevin Wolf, Eric Auger, qemu-block, Max Reitz
On 9/22/20 11:04 AM, Paolo Bonzini wrote:
> On 22/09/20 10:41, Philippe Mathieu-Daudé wrote:
>>> Besides looking more correct in access mode, is there any side effect
>>> of WO mapping?
>> TBH I don't have enough knowledge to answer this question.
>> I tested successfully on X86. I'm writing more tests.
>
> No problem with doing this, but PROT_WRITE does not work at all on x86.
> :) PROT_EXEC works if you have a machine with PKRU, but PROT_WRITE
> silently becomes PROT_READ|PROT_WRITE because the processor does not
> support it.
Ah this is why it works the same way in my testing.
I'll run tests on ARM.
Thanks,
Phil.
>
> Paolo
>
^ permalink raw reply [flat|nested] 14+ messages in thread
Thread overview: 14+ messages
2020-09-21 16:29 [PATCH 0/6] block/nvme: Map doorbells pages write-only, remove magic from nvme_init Philippe Mathieu-Daudé
2020-09-21 16:29 ` [PATCH 1/6] util/vfio-helpers: Pass page protections to qemu_vfio_pci_map_bar() Philippe Mathieu-Daudé
2020-09-21 16:29 ` [PATCH 2/6] block/nvme: Map doorbells pages write-only Philippe Mathieu-Daudé
2020-09-22 8:18 ` Fam Zheng
2020-09-22 8:41 ` Philippe Mathieu-Daudé
2020-09-22 9:04 ` Paolo Bonzini
2020-09-22 9:36 ` Philippe Mathieu-Daudé
2020-09-22 9:33 ` Fam Zheng
2020-09-21 16:29 ` [PATCH 3/6] block/nvme: Reduce I/O registers scope Philippe Mathieu-Daudé
2020-09-21 16:29 ` [PATCH 4/6] block/nvme: Drop NVMeRegs structure, directly use NvmeBar Philippe Mathieu-Daudé
2020-09-21 16:29 ` [PATCH 5/6] block/nvme: Use register definitions from 'block/nvme.h' Philippe Mathieu-Daudé
2020-09-21 16:29 ` [PATCH 6/6] block/nvme: Replace magic value by SCALE_MS definition Philippe Mathieu-Daudé
2020-09-21 21:38 ` [PATCH 0/6] block/nvme: Map doorbells pages write-only, remove magic from nvme_init no-reply
2020-09-21 21:56 ` no-reply