All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] block/nvme: Fix NVMeRegs alignment/packing and use atomic operations
@ 2020-09-16 20:40 Philippe Mathieu-Daudé
  2020-09-16 20:40 ` [PATCH 1/3] block/nvme: Initialize constant values with const_le32() Philippe Mathieu-Daudé
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-09-16 20:40 UTC (permalink / raw)
  To: qemu-devel
  Cc: Fam Zheng, Kevin Wolf, qemu-block, Dr . David Alan Gilbert,
	Max Reitz, Stefan Hajnoczi, Paolo Bonzini,
	Philippe Mathieu-Daudé,
	Richard Henderson

Fix a compiler optimization problem introduced in commit
e5ff22ba9fc ("block/nvme: Pair doorbell registers"),
use atomic operations.

For some not understood yet reason using atomic_and triggers
a NMI on x86 arch. Using the following snippet on top of this
series:

-- >8 --
-    atomic_set(&s->regs->ctrl.cc,
-               cpu_to_le32(atomic_read(&s->regs->ctrl.cc) & const_le32(0xFE)=
));
+    atomic_and(&s->regs->ctrl.cc, const_le32(0xFE));
---

triggers:

 {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source:=
 3
 {1}[Hardware Error]: event severity: fatal
 {1}[Hardware Error]:  Error 0, type: fatal
 {1}[Hardware Error]:   section_type: PCIe error
 {1}[Hardware Error]:   port_type: 0, PCIe end point
 {1}[Hardware Error]:   version: 1.16
 {1}[Hardware Error]:   command: 0x0006, status: 0x0010
 {1}[Hardware Error]:   device_id: 0000:04:00.0
 {1}[Hardware Error]:   slot: 0
 {1}[Hardware Error]:   secondary_bus: 0x00
 {1}[Hardware Error]:   vendor_id: 0x8086, device_id: 0x2701
 {1}[Hardware Error]:   class_code: 010802
 {1}[Hardware Error]:   aer_uncor_status: 0x00100000, aer_uncor_mask: 0x00018=
000
 {1}[Hardware Error]:   aer_uncor_severity: 0x000e7030
 {1}[Hardware Error]:   TLP Header: 33000000 00000000 00000000 00000000
 Kernel panic - not syncing: Fatal hardware error!
 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.6.8-100.fc30.x86_64 #1
 Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 1.2.6 06/08/2015
 Call Trace:
  <NMI>
  dump_stack+0x66/0x90
  panic+0xf1/0x2d3
  __ghes_panic.part.0+0x26/0x26
  ghes_notify_nmi.cold+0x5/0x5
  nmi_handle+0x66/0x120
  default_do_nmi+0x45/0x100
  do_nmi+0x165/0x1d0
  end_repeat_nmi+0x16/0x50
 RIP: 0010:intel_idle+0x82/0x130
 Code: 65 48 8b 04 25 c0 8b 01 00 0f 01 c8 48 8b 00 a8 08 75 17 e9 07 00 00 0=
0 0f 00 2d f5 cd 6d 00 b9 01 00 00 00 48 89 d8 0f 01 c9 <65> 48 8b 04 25 c0 8=
b 01 00 f0 80 60 02 df f0 83 44 24 fc 00 b
 RSP: 0018:ffffffffa8603e30 EFLAGS: 00000046
 RAX: 0000000000000001 RBX: 0000000000000001 RCX: 0000000000000001
 RDX: 0000000000000000 RSI: ffffffffa875edc0 RDI: 0000000000000000
 RBP: ffffffffa875edc0 R08: 00000013ef2b797f R09: 000000000000ba42
 R10: 00000000000993b3 R11: ffffa070afc29ca4 R12: 0000000000000002
 R13: ffffffffa875edc0 R14: 0000000000000002 R15: ffffffffa8614840
  ? intel_idle+0x82/0x130
  ? intel_idle+0x82/0x130
  </NMI>
  cpuidle_enter_state+0x81/0x3e0
  cpuidle_enter+0x29/0x40
  do_idle+0x1c0/0x260
  cpu_startup_entry+0x19/0x20
  start_kernel+0x7ad/0x7ba
  secondary_startup_64+0xb6/0xc0
 Kernel Offset: 0x26000000 from 0xffffffff81000000 (relocation range: 0xfffff=
fff80000000-0xffffffffbfffffff)

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>

Philippe Mathieu-Daud=C3=A9 (3):
  block/nvme: Initialize constant values with const_le32()
  block/nvme: Use atomic operations instead of 'volatile' keyword
  block/nvme: Align NVMeRegs structure to 4KiB and mark it packed

 block/nvme.c | 55 ++++++++++++++++++++++++++++------------------------
 1 file changed, 30 insertions(+), 25 deletions(-)

--=20
2.26.2



^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/3] block/nvme: Initialize constant values with const_le32()
  2020-09-16 20:40 [PATCH 0/3] block/nvme: Fix NVMeRegs alignment/packing and use atomic operations Philippe Mathieu-Daudé
@ 2020-09-16 20:40 ` Philippe Mathieu-Daudé
  2020-09-17  9:55   ` Stefan Hajnoczi
  2020-09-16 20:40 ` [PATCH 2/3] block/nvme: Use atomic operations instead of 'volatile' keyword Philippe Mathieu-Daudé
  2020-09-16 20:40 ` [PATCH 3/3] block/nvme: Align NVMeRegs structure to 4KiB and mark it packed Philippe Mathieu-Daudé
  2 siblings, 1 reply; 8+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-09-16 20:40 UTC (permalink / raw)
  To: qemu-devel
  Cc: Fam Zheng, Kevin Wolf, qemu-block, Max Reitz, Stefan Hajnoczi,
	Philippe Mathieu-Daudé

To avoid multiple endianess conversion, as we know the device
registers are in little-endian, directly use const_le32() with
constant values.

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
---
 block/nvme.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/block/nvme.c b/block/nvme.c
index f4f27b6da7d..b91749713e0 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -511,7 +511,7 @@ static void nvme_identify(BlockDriverState *bs, int namespace, Error **errp)
     uint64_t iova;
     NvmeCmd cmd = {
         .opcode = NVME_ADM_CMD_IDENTIFY,
-        .cdw10 = cpu_to_le32(0x1),
+        .cdw10 = const_le32(0x1),
     };
 
     id = qemu_try_memalign(s->page_size, sizeof(*id));
@@ -649,7 +649,7 @@ static bool nvme_add_io_queue(BlockDriverState *bs, Error **errp)
         .opcode = NVME_ADM_CMD_CREATE_CQ,
         .dptr.prp1 = cpu_to_le64(q->cq.iova),
         .cdw10 = cpu_to_le32(((queue_size - 1) << 16) | (n & 0xFFFF)),
-        .cdw11 = cpu_to_le32(0x3),
+        .cdw11 = const_le32(0x3),
     };
     if (nvme_cmd_sync(bs, s->queues[INDEX_ADMIN], &cmd)) {
         error_setg(errp, "Failed to create CQ io queue [%d]", n);
@@ -734,10 +734,10 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
     timeout_ms = MIN(500 * ((cap >> 24) & 0xFF), 30000);
 
     /* Reset device to get a clean state. */
-    s->regs->ctrl.cc = cpu_to_le32(le32_to_cpu(s->regs->ctrl.cc) & 0xFE);
+    s->regs->ctrl.cc &= const_le32(0xFE);
     /* Wait for CSTS.RDY = 0. */
     deadline = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) + timeout_ms * SCALE_MS;
-    while (le32_to_cpu(s->regs->ctrl.csts) & 0x1) {
+    while (s->regs->ctrl.csts & const_le32(0x1)) {
         if (qemu_clock_get_ns(QEMU_CLOCK_REALTIME) > deadline) {
             error_setg(errp, "Timeout while waiting for device to reset (%"
                              PRId64 " ms)",
@@ -758,18 +758,18 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
     }
     s->nr_queues = 1;
     QEMU_BUILD_BUG_ON(NVME_QUEUE_SIZE & 0xF000);
-    s->regs->ctrl.aqa = cpu_to_le32((NVME_QUEUE_SIZE << 16) | NVME_QUEUE_SIZE);
+    s->regs->ctrl.aqa = const_le32((NVME_QUEUE_SIZE << 16) | NVME_QUEUE_SIZE);
     s->regs->ctrl.asq = cpu_to_le64(s->queues[INDEX_ADMIN]->sq.iova);
     s->regs->ctrl.acq = cpu_to_le64(s->queues[INDEX_ADMIN]->cq.iova);
 
     /* After setting up all control registers we can enable device now. */
-    s->regs->ctrl.cc = cpu_to_le32((ctz32(NVME_CQ_ENTRY_BYTES) << 20) |
+    s->regs->ctrl.cc = const_le32((ctz32(NVME_CQ_ENTRY_BYTES) << 20) |
                               (ctz32(NVME_SQ_ENTRY_BYTES) << 16) |
                               0x1);
     /* Wait for CSTS.RDY = 1. */
     now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
     deadline = now + timeout_ms * 1000000;
-    while (!(le32_to_cpu(s->regs->ctrl.csts) & 0x1)) {
+    while (!(s->regs->ctrl.csts & const_le32(0x1))) {
         if (qemu_clock_get_ns(QEMU_CLOCK_REALTIME) > deadline) {
             error_setg(errp, "Timeout while waiting for device to start (%"
                              PRId64 " ms)",
@@ -848,8 +848,8 @@ static int nvme_enable_disable_write_cache(BlockDriverState *bs, bool enable,
     NvmeCmd cmd = {
         .opcode = NVME_ADM_CMD_SET_FEATURES,
         .nsid = cpu_to_le32(s->nsid),
-        .cdw10 = cpu_to_le32(0x06),
-        .cdw11 = cpu_to_le32(enable ? 0x01 : 0x00),
+        .cdw10 = const_le32(0x06),
+        .cdw11 = enable ? const_le32(0x01) : 0x00,
     };
 
     ret = nvme_cmd_sync(bs, s->queues[INDEX_ADMIN], &cmd);
@@ -1278,8 +1278,8 @@ static int coroutine_fn nvme_co_pdiscard(BlockDriverState *bs,
     NvmeCmd cmd = {
         .opcode = NVME_CMD_DSM,
         .nsid = cpu_to_le32(s->nsid),
-        .cdw10 = cpu_to_le32(0), /*number of ranges - 0 based*/
-        .cdw11 = cpu_to_le32(1 << 2), /*deallocate bit*/
+        .cdw10 = const_le32(0), /*number of ranges - 0 based*/
+        .cdw11 = const_le32(1 << 2), /*deallocate bit*/
     };
 
     NVMeCoData data = {
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/3] block/nvme: Use atomic operations instead of 'volatile' keyword
  2020-09-16 20:40 [PATCH 0/3] block/nvme: Fix NVMeRegs alignment/packing and use atomic operations Philippe Mathieu-Daudé
  2020-09-16 20:40 ` [PATCH 1/3] block/nvme: Initialize constant values with const_le32() Philippe Mathieu-Daudé
@ 2020-09-16 20:40 ` Philippe Mathieu-Daudé
  2020-09-17 10:42   ` Stefan Hajnoczi
  2020-09-16 20:40 ` [PATCH 3/3] block/nvme: Align NVMeRegs structure to 4KiB and mark it packed Philippe Mathieu-Daudé
  2 siblings, 1 reply; 8+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-09-16 20:40 UTC (permalink / raw)
  To: qemu-devel
  Cc: Fam Zheng, Kevin Wolf, qemu-block, Dr . David Alan Gilbert,
	Max Reitz, Stefan Hajnoczi, Paolo Bonzini,
	Philippe Mathieu-Daudé,
	Richard Henderson

Follow docs/devel/atomics.rst guidelines and use atomic operations.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
---
 block/nvme.c | 38 ++++++++++++++++++++------------------
 1 file changed, 20 insertions(+), 18 deletions(-)

diff --git a/block/nvme.c b/block/nvme.c
index b91749713e0..be80ea1f410 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -46,7 +46,7 @@ typedef struct {
     uint8_t  *queue;
     uint64_t iova;
     /* Hardware MMIO register */
-    volatile uint32_t *doorbell;
+    uint32_t *doorbell;
 } NVMeQueue;
 
 typedef struct {
@@ -82,7 +82,7 @@ typedef struct {
 } NVMeQueuePair;
 
 /* Memory mapped registers */
-typedef volatile struct {
+typedef struct {
     NvmeBar ctrl;
     struct {
         uint32_t sq_tail;
@@ -273,8 +273,7 @@ static void nvme_kick(NVMeQueuePair *q)
     trace_nvme_kick(s, q->index);
     assert(!(q->sq.tail & 0xFF00));
     /* Fence the write to submission queue entry before notifying the device. */
-    smp_wmb();
-    *q->sq.doorbell = cpu_to_le32(q->sq.tail);
+    atomic_rcu_set(q->sq.doorbell, cpu_to_le32(q->sq.tail));
     q->inflight += q->need_kick;
     q->need_kick = 0;
 }
@@ -414,8 +413,7 @@ static bool nvme_process_completion(NVMeQueuePair *q)
     }
     if (progress) {
         /* Notify the device so it can post more completions. */
-        smp_mb_release();
-        *q->cq.doorbell = cpu_to_le32(q->cq.head);
+        atomic_store_release(q->cq.doorbell, cpu_to_le32(q->cq.head));
         nvme_wake_free_req_locked(q);
     }
 
@@ -433,8 +431,7 @@ static void nvme_process_completion_bh(void *opaque)
      * called aio_poll(). The callback may be waiting for further completions
      * so notify the device that it has space to fill in more completions now.
      */
-    smp_mb_release();
-    *q->cq.doorbell = cpu_to_le32(q->cq.head);
+    atomic_store_release(q->cq.doorbell, cpu_to_le32(q->cq.head));
     nvme_wake_free_req_locked(q);
 
     nvme_process_completion(q);
@@ -721,7 +718,7 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
     /* Perform initialize sequence as described in NVMe spec "7.6.1
      * Initialization". */
 
-    cap = le64_to_cpu(s->regs->ctrl.cap);
+    cap = le64_to_cpu(atomic_read(&s->regs->ctrl.cap));
     if (!(cap & (1ULL << 37))) {
         error_setg(errp, "Device doesn't support NVMe command set");
         ret = -EINVAL;
@@ -734,10 +731,11 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
     timeout_ms = MIN(500 * ((cap >> 24) & 0xFF), 30000);
 
     /* Reset device to get a clean state. */
-    s->regs->ctrl.cc &= const_le32(0xFE);
+    atomic_set(&s->regs->ctrl.cc,
+               cpu_to_le32(atomic_read(&s->regs->ctrl.cc) & const_le32(0xFE)));
     /* Wait for CSTS.RDY = 0. */
     deadline = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) + timeout_ms * SCALE_MS;
-    while (s->regs->ctrl.csts & const_le32(0x1)) {
+    while (atomic_read(&s->regs->ctrl.csts) & const_le32(0x1)) {
         if (qemu_clock_get_ns(QEMU_CLOCK_REALTIME) > deadline) {
             error_setg(errp, "Timeout while waiting for device to reset (%"
                              PRId64 " ms)",
@@ -758,18 +756,22 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
     }
     s->nr_queues = 1;
     QEMU_BUILD_BUG_ON(NVME_QUEUE_SIZE & 0xF000);
-    s->regs->ctrl.aqa = const_le32((NVME_QUEUE_SIZE << 16) | NVME_QUEUE_SIZE);
-    s->regs->ctrl.asq = cpu_to_le64(s->queues[INDEX_ADMIN]->sq.iova);
-    s->regs->ctrl.acq = cpu_to_le64(s->queues[INDEX_ADMIN]->cq.iova);
+    atomic_set(&s->regs->ctrl.aqa,
+               const_le32((NVME_QUEUE_SIZE << 16) | NVME_QUEUE_SIZE));
+    atomic_set(&s->regs->ctrl.asq,
+               cpu_to_le64(s->queues[INDEX_ADMIN]->sq.iova));
+    atomic_set(&s->regs->ctrl.acq,
+               cpu_to_le64(s->queues[INDEX_ADMIN]->cq.iova));
 
     /* After setting up all control registers we can enable device now. */
-    s->regs->ctrl.cc = const_le32((ctz32(NVME_CQ_ENTRY_BYTES) << 20) |
-                              (ctz32(NVME_SQ_ENTRY_BYTES) << 16) |
-                              0x1);
+    atomic_set(&s->regs->ctrl.cc,
+               const_le32((ctz32(NVME_CQ_ENTRY_BYTES) << 20) |
+                          (ctz32(NVME_SQ_ENTRY_BYTES) << 16) |
+                          0x1));
     /* Wait for CSTS.RDY = 1. */
     now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
     deadline = now + timeout_ms * 1000000;
-    while (!(s->regs->ctrl.csts & const_le32(0x1))) {
+    while (!(atomic_read(&s->regs->ctrl.csts) & const_le32(0x1))) {
         if (qemu_clock_get_ns(QEMU_CLOCK_REALTIME) > deadline) {
             error_setg(errp, "Timeout while waiting for device to start (%"
                              PRId64 " ms)",
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 3/3] block/nvme: Align NVMeRegs structure to 4KiB and mark it packed
  2020-09-16 20:40 [PATCH 0/3] block/nvme: Fix NVMeRegs alignment/packing and use atomic operations Philippe Mathieu-Daudé
  2020-09-16 20:40 ` [PATCH 1/3] block/nvme: Initialize constant values with const_le32() Philippe Mathieu-Daudé
  2020-09-16 20:40 ` [PATCH 2/3] block/nvme: Use atomic operations instead of 'volatile' keyword Philippe Mathieu-Daudé
@ 2020-09-16 20:40 ` Philippe Mathieu-Daudé
  2020-09-17 11:07   ` Stefan Hajnoczi
  2 siblings, 1 reply; 8+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-09-16 20:40 UTC (permalink / raw)
  To: qemu-devel
  Cc: Fam Zheng, Kevin Wolf, qemu-block, Max Reitz, Stefan Hajnoczi,
	Philippe Mathieu-Daudé

In commit e5ff22ba9fc we changed the doorbells register
declaration but forgot to mark the structure packed (as
MMIO registers), allowing the compiler to optimize it.
Fix by marking it packed, and align it to avoid:

  block/nvme.c: In function ‘nvme_create_queue_pair’:
  block/nvme.c:252:22: error: taking address of packed member of ‘struct <anonymous>’ may result in an unaligned pointer value [-Werror=address-of-packed-member]
      252 |     q->sq.doorbell = &s->regs->doorbells[idx * s->doorbell_scale].sq_tail;
          |                      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Fixes: e5ff22ba9fc ("block/nvme: Pair doorbell registers")
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
---
 block/nvme.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/block/nvme.c b/block/nvme.c
index be80ea1f410..2f9f560ccd5 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -12,6 +12,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/units.h"
 #include <linux/vfio.h>
 #include "qapi/error.h"
 #include "qapi/qmp/qdict.h"
@@ -82,13 +83,15 @@ typedef struct {
 } NVMeQueuePair;
 
 /* Memory mapped registers */
-typedef struct {
+typedef struct QEMU_PACKED {
     NvmeBar ctrl;
     struct {
         uint32_t sq_tail;
         uint32_t cq_head;
     } doorbells[];
-} NVMeRegs;
+} QEMU_ALIGNED(4 * KiB) NVMeRegs;
+
+QEMU_BUILD_BUG_ON(offsetof(NVMeRegs, doorbells[1]) != 4096 + 8);
 
 #define INDEX_ADMIN     0
 #define INDEX_IO(n)     (1 + n)
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/3] block/nvme: Initialize constant values with const_le32()
  2020-09-16 20:40 ` [PATCH 1/3] block/nvme: Initialize constant values with const_le32() Philippe Mathieu-Daudé
@ 2020-09-17  9:55   ` Stefan Hajnoczi
  2020-09-17 13:52     ` Philippe Mathieu-Daudé
  0 siblings, 1 reply; 8+ messages in thread
From: Stefan Hajnoczi @ 2020-09-17  9:55 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: Fam Zheng, Kevin Wolf, qemu-devel, qemu-block, Max Reitz

[-- Attachment #1: Type: text/plain, Size: 506 bytes --]

On Wed, Sep 16, 2020 at 10:40:02PM +0200, Philippe Mathieu-Daudé wrote:
> To avoid multiple endianess conversion, as we know the device
> registers are in little-endian, directly use const_le32() with
> constant values.

Can cpu_to_X() be extended to handle constant expressions? That way the
programmer doesn't need to remember the const_X() API exists.

Maybe __builtin_constant_p() can be used:
https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html#index-_005f_005fbuiltin_005fconstant_005fp

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/3] block/nvme: Use atomic operations instead of 'volatile' keyword
  2020-09-16 20:40 ` [PATCH 2/3] block/nvme: Use atomic operations instead of 'volatile' keyword Philippe Mathieu-Daudé
@ 2020-09-17 10:42   ` Stefan Hajnoczi
  0 siblings, 0 replies; 8+ messages in thread
From: Stefan Hajnoczi @ 2020-09-17 10:42 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: Fam Zheng, Kevin Wolf, qemu-block, qemu-devel, Max Reitz,
	Paolo Bonzini, Dr . David Alan Gilbert, Richard Henderson

[-- Attachment #1: Type: text/plain, Size: 3082 bytes --]

On Wed, Sep 16, 2020 at 10:40:03PM +0200, Philippe Mathieu-Daudé wrote:

I think the current use of volatile is fine. It's widely used in device
drivers (see Linux and DPDK) so I'm not sure eliminating it is
worthwhile.

> Follow docs/devel/atomics.rst guidelines and use atomic operations.
> 
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Richard Henderson <rth@twiddle.net>
> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> ---
>  block/nvme.c | 38 ++++++++++++++++++++------------------
>  1 file changed, 20 insertions(+), 18 deletions(-)
> 
> diff --git a/block/nvme.c b/block/nvme.c
> index b91749713e0..be80ea1f410 100644
> --- a/block/nvme.c
> +++ b/block/nvme.c
> @@ -46,7 +46,7 @@ typedef struct {
>      uint8_t  *queue;
>      uint64_t iova;
>      /* Hardware MMIO register */
> -    volatile uint32_t *doorbell;
> +    uint32_t *doorbell;
>  } NVMeQueue;
>  
>  typedef struct {
> @@ -82,7 +82,7 @@ typedef struct {
>  } NVMeQueuePair;
>  
>  /* Memory mapped registers */
> -typedef volatile struct {
> +typedef struct {
>      NvmeBar ctrl;
>      struct {
>          uint32_t sq_tail;
> @@ -273,8 +273,7 @@ static void nvme_kick(NVMeQueuePair *q)
>      trace_nvme_kick(s, q->index);
>      assert(!(q->sq.tail & 0xFF00));
>      /* Fence the write to submission queue entry before notifying the device. */
> -    smp_wmb();
> -    *q->sq.doorbell = cpu_to_le32(q->sq.tail);
> +    atomic_rcu_set(q->sq.doorbell, cpu_to_le32(q->sq.tail));

I suggest using smp_wmb()/smp_rmb() instead of atomic operations since
operation doesn't actually need to be atomic.

This hunk can be dropped from the patch, the existing behavior is
already correct.

> @@ -734,10 +731,11 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
>      timeout_ms = MIN(500 * ((cap >> 24) & 0xFF), 30000);
>  
>      /* Reset device to get a clean state. */
> -    s->regs->ctrl.cc &= const_le32(0xFE);
> +    atomic_set(&s->regs->ctrl.cc,
> +               cpu_to_le32(atomic_read(&s->regs->ctrl.cc) & const_le32(0xFE)));
>      /* Wait for CSTS.RDY = 0. */
>      deadline = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) + timeout_ms * SCALE_MS;
> -    while (s->regs->ctrl.csts & const_le32(0x1)) {
> +    while (atomic_read(&s->regs->ctrl.csts) & const_le32(0x1)) {
>          if (qemu_clock_get_ns(QEMU_CLOCK_REALTIME) > deadline) {
>              error_setg(errp, "Timeout while waiting for device to reset (%"
>                               PRId64 " ms)",

Linux drivers use readl()/writel() to perform memory loads/stores with
appropriate constraints for MMIO accesses (e.g. the instructions cannot
be optimized by the compiler). QEMU lacks an API like this because it
didn't contain userspace drivers before block/nvme.c.

The semantics needed here are that the compiler must perform the memory
access and cannot optimize it.

Please introduce an API for hardware register accesses instead of
(ab)using atomics.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 3/3] block/nvme: Align NVMeRegs structure to 4KiB and mark it packed
  2020-09-16 20:40 ` [PATCH 3/3] block/nvme: Align NVMeRegs structure to 4KiB and mark it packed Philippe Mathieu-Daudé
@ 2020-09-17 11:07   ` Stefan Hajnoczi
  0 siblings, 0 replies; 8+ messages in thread
From: Stefan Hajnoczi @ 2020-09-17 11:07 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: Fam Zheng, Kevin Wolf, qemu-devel, qemu-block, Max Reitz

[-- Attachment #1: Type: text/plain, Size: 1406 bytes --]

On Wed, Sep 16, 2020 at 10:40:04PM +0200, Philippe Mathieu-Daudé wrote:
> In commit e5ff22ba9fc we changed the doorbells register
> declaration but forgot to mark the structure packed (as
> MMIO registers), allowing the compiler to optimize it.

Does this patch actually change anything? NvmeBar is already 4096 bytes
long and struct doorbells has two 32-bit fields, so there is no padding.

I ask because it's not clear whether this patch is a bug fix or a
cleanup.

>  /* Memory mapped registers */
> -typedef struct {
> +typedef struct QEMU_PACKED {
>      NvmeBar ctrl;
>      struct {
>          uint32_t sq_tail;
>          uint32_t cq_head;
>      } doorbells[];
> -} NVMeRegs;
> +} QEMU_ALIGNED(4 * KiB) NVMeRegs;

I can think of two policies for using packed:

1. Packed is used only when needed to avoid padding.

   But this patch uses it even though there is no padding, so it can't
   be this policy.

2. Packed is always used on external binary structs (e.g. file formats,
   network protocols, hardware register layouts, etc).

   But doorbells[] isn't marked packed so it can't be this policy
   either. The documentation says that nested structs need to be marked
   packed too:
   https://gcc.gnu.org/onlinedocs/gcc-10.2.0/gcc/Common-Type-Attributes.html#Common-Type-Attributes

So I'm confused about the purpose of this patch. What does it achieve?

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/3] block/nvme: Initialize constant values with const_le32()
  2020-09-17  9:55   ` Stefan Hajnoczi
@ 2020-09-17 13:52     ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 8+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-09-17 13:52 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Fam Zheng, Kevin Wolf, qemu-devel, qemu-block, Max Reitz

On 9/17/20 11:55 AM, Stefan Hajnoczi wrote:
> On Wed, Sep 16, 2020 at 10:40:02PM +0200, Philippe Mathieu-Daudé wrote:
>> To avoid multiple endianess conversion, as we know the device
>> registers are in little-endian, directly use const_le32() with
>> constant values.
> 
> Can cpu_to_X() be extended to handle constant expressions? That way the
> programmer doesn't need to remember the const_X() API exists.
> 
> Maybe __builtin_constant_p() can be used:
> https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html#index-_005f_005fbuiltin_005fconstant_005fp

Yes it works! Good idea :)



^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2020-09-17 15:21 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-16 20:40 [PATCH 0/3] block/nvme: Fix NVMeRegs alignment/packing and use atomic operations Philippe Mathieu-Daudé
2020-09-16 20:40 ` [PATCH 1/3] block/nvme: Initialize constant values with const_le32() Philippe Mathieu-Daudé
2020-09-17  9:55   ` Stefan Hajnoczi
2020-09-17 13:52     ` Philippe Mathieu-Daudé
2020-09-16 20:40 ` [PATCH 2/3] block/nvme: Use atomic operations instead of 'volatile' keyword Philippe Mathieu-Daudé
2020-09-17 10:42   ` Stefan Hajnoczi
2020-09-16 20:40 ` [PATCH 3/3] block/nvme: Align NVMeRegs structure to 4KiB and mark it packed Philippe Mathieu-Daudé
2020-09-17 11:07   ` Stefan Hajnoczi

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.