All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] hw/block/nvme: handle transient dma errors
@ 2020-06-29 20:20 Klaus Jensen
  2020-06-29 20:20 ` [PATCH 1/2] pci: pass along the return value of dma_memory_rw Klaus Jensen
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Klaus Jensen @ 2020-06-29 20:20 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Klaus Jensen, qemu-devel, Max Reitz, Klaus Jensen,
	Keith Busch, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

QEMU actually respects that Bus Master Enabling for a PCI device gets
flipped, so in order to succesfully pass the block/011 test ("disable
PCI device while doing I/O") the nvme device needs to know if a dma
transfer was successful or not.

Based-on: <20200629195017.1217056-1-its@irrelevant.dk>
("[PATCH 00/17] hw/block/nvme: AIO and address mapping refactoring")

Klaus Jensen (2):
  pci: pass along the return value of dma_memory_rw
  hw/block/nvme: handle dma errors

 hw/block/nvme.c       | 43 ++++++++++++++++++++++++++++++++-----------
 hw/block/trace-events |  2 ++
 include/block/nvme.h  |  2 +-
 include/hw/pci/pci.h  |  3 +--
 4 files changed, 36 insertions(+), 14 deletions(-)

-- 
2.27.0



^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/2] pci: pass along the return value of dma_memory_rw
  2020-06-29 20:20 [PATCH 0/2] hw/block/nvme: handle transient dma errors Klaus Jensen
@ 2020-06-29 20:20 ` Klaus Jensen
  2020-07-01 12:18   ` Michael S. Tsirkin
  2020-07-22 11:48   ` Michael S. Tsirkin
  2020-06-29 20:20 ` [PATCH 2/2] hw/block/nvme: handle dma errors Klaus Jensen
  2020-06-29 21:07 ` [PATCH 0/2] hw/block/nvme: handle transient " no-reply
  2 siblings, 2 replies; 11+ messages in thread
From: Klaus Jensen @ 2020-06-29 20:20 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Michael S . Tsirkin, Klaus Jensen, qemu-devel,
	Max Reitz, Klaus Jensen, Keith Busch, Maxim Levitsky,
	Philippe Mathieu-Daudé

From: Klaus Jensen <k.jensen@samsung.com>

Some devices might want to know the return value of dma_memory_rw, so
pass it along instead of ignoring it.

There are no existing users of the return value, so this patch should be
safe.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Keith Busch <kbusch@kernel.org>
---
 include/hw/pci/pci.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index a4e9c3341615..2347dc36bfb5 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -786,8 +786,7 @@ static inline AddressSpace *pci_get_address_space(PCIDevice *dev)
 static inline int pci_dma_rw(PCIDevice *dev, dma_addr_t addr,
                              void *buf, dma_addr_t len, DMADirection dir)
 {
-    dma_memory_rw(pci_get_address_space(dev), addr, buf, len, dir);
-    return 0;
+    return dma_memory_rw(pci_get_address_space(dev), addr, buf, len, dir);
 }
 
 static inline int pci_dma_read(PCIDevice *dev, dma_addr_t addr,
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/2] hw/block/nvme: handle dma errors
  2020-06-29 20:20 [PATCH 0/2] hw/block/nvme: handle transient dma errors Klaus Jensen
  2020-06-29 20:20 ` [PATCH 1/2] pci: pass along the return value of dma_memory_rw Klaus Jensen
@ 2020-06-29 20:20 ` Klaus Jensen
  2020-06-29 21:07 ` [PATCH 0/2] hw/block/nvme: handle transient " no-reply
  2 siblings, 0 replies; 11+ messages in thread
From: Klaus Jensen @ 2020-06-29 20:20 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Klaus Jensen, qemu-devel, Max Reitz, Klaus Jensen,
	Keith Busch, Maxim Levitsky

From: Klaus Jensen <k.jensen@samsung.com>

Handling DMA errors gracefully is required for the device to pass the
block/011 test ("disable PCI device while doing I/O") in the blktests
suite.

With this patch the device passes the test by retrying "critical"
transfers (posting of completion entries and processing of submission
queue entries).

If DMA errors occur at any other point in the execution of the command
(say, while mapping the PRPs), the command is aborted with a Data
Transfer Error status code.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Acked-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 hw/block/nvme.c       | 43 ++++++++++++++++++++++++++++++++-----------
 hw/block/trace-events |  2 ++
 include/block/nvme.h  |  2 +-
 3 files changed, 35 insertions(+), 12 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index fa0f8e802d9b..94f5bf2a815f 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -89,14 +89,14 @@ static bool nvme_addr_is_cmb(NvmeCtrl *n, hwaddr addr)
     return addr >= low && addr < hi;
 }
 
-static void nvme_addr_read(NvmeCtrl *n, hwaddr addr, void *buf, int size)
+static int nvme_addr_read(NvmeCtrl *n, hwaddr addr, void *buf, int size)
 {
     if (n->bar.cmbsz && nvme_addr_is_cmb(n, addr)) {
         memcpy(buf, nvme_addr_to_cmb(n, addr), size);
-        return;
+        return 0;
     }
 
-    pci_dma_read(&n->parent_obj, addr, buf, size);
+    return pci_dma_read(&n->parent_obj, addr, buf, size);
 }
 
 static int nvme_check_sqid(NvmeCtrl *n, uint16_t sqid)
@@ -202,7 +202,7 @@ static uint16_t nvme_map_addr_cmb(NvmeCtrl *n, QEMUIOVector *iov, hwaddr addr,
     trace_pci_nvme_map_addr_cmb(addr, len);
 
     if (!nvme_addr_is_cmb(n, addr) || !nvme_addr_is_cmb(n, addr + len - 1)) {
-        return NVME_DATA_TRAS_ERROR;
+        return NVME_DATA_TRANSFER_ERROR;
     }
 
     qemu_iovec_add(iov, nvme_addr_to_cmb(n, addr), len);
@@ -257,6 +257,7 @@ static uint16_t nvme_map_prp(NvmeCtrl *n, QEMUSGList *qsg, QEMUIOVector *iov,
     int num_prps = (len >> n->page_bits) + 1;
     uint16_t status;
     bool prp_list_in_cmb = false;
+    int ret;
 
     trace_pci_nvme_map_prp(nvme_cid(req), trans_len, len, prp1, prp2,
                            num_prps);
@@ -295,7 +296,11 @@ static uint16_t nvme_map_prp(NvmeCtrl *n, QEMUSGList *qsg, QEMUIOVector *iov,
 
             nents = (len + n->page_size - 1) >> n->page_bits;
             prp_trans = MIN(n->max_prp_ents, nents) * sizeof(uint64_t);
-            nvme_addr_read(n, prp2, (void *)prp_list, prp_trans);
+            ret = nvme_addr_read(n, prp2, (void *)prp_list, prp_trans);
+            if (ret) {
+                trace_pci_nvme_err_addr_read(prp2);
+                return NVME_DATA_TRANSFER_ERROR;
+            }
             while (len != 0) {
                 uint64_t prp_ent = le64_to_cpu(prp_list[i]);
 
@@ -312,8 +317,12 @@ static uint16_t nvme_map_prp(NvmeCtrl *n, QEMUSGList *qsg, QEMUIOVector *iov,
                     i = 0;
                     nents = (len + n->page_size - 1) >> n->page_bits;
                     prp_trans = MIN(n->max_prp_ents, nents) * sizeof(uint64_t);
-                    nvme_addr_read(n, prp_ent, (void *)prp_list,
-                        prp_trans);
+                    ret = nvme_addr_read(n, prp_ent, (void *)prp_list,
+                                         prp_trans);
+                    if (ret) {
+                        trace_pci_nvme_err_addr_read(prp_ent);
+                        return NVME_DATA_TRANSFER_ERROR;
+                    }
                     prp_ent = le64_to_cpu(prp_list[i]);
                 }
 
@@ -487,6 +496,7 @@ static void nvme_post_cqes(void *opaque)
     NvmeCQueue *cq = opaque;
     NvmeCtrl *n = cq->ctrl;
     NvmeRequest *req, *next;
+    int ret;
 
     QTAILQ_FOREACH_SAFE(req, &cq->req_list, entry, next) {
         NvmeSQueue *sq;
@@ -496,15 +506,21 @@ static void nvme_post_cqes(void *opaque)
             break;
         }
 
-        QTAILQ_REMOVE(&cq->req_list, req, entry);
         sq = req->sq;
         req->cqe.status = cpu_to_le16((req->status << 1) | cq->phase);
         req->cqe.sq_id = cpu_to_le16(sq->sqid);
         req->cqe.sq_head = cpu_to_le16(sq->head);
         addr = cq->dma_addr + cq->tail * n->cqe_size;
+        ret = pci_dma_write(&n->parent_obj, addr, (void *)&req->cqe,
+                            sizeof(req->cqe));
+        if (ret) {
+            trace_pci_nvme_err_addr_write(addr);
+            timer_mod(cq->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) +
+                      500 * SCALE_MS);
+            break;
+        }
+        QTAILQ_REMOVE(&cq->req_list, req, entry);
         nvme_inc_cq_tail(cq);
-        pci_dma_write(&n->parent_obj, addr, (void *)&req->cqe,
-            sizeof(req->cqe));
         nvme_req_clear(req);
         QTAILQ_INSERT_TAIL(&sq->req_list, req, entry);
     }
@@ -1753,7 +1769,12 @@ static void nvme_process_sq(void *opaque)
 
     while (!(nvme_sq_empty(sq) || QTAILQ_EMPTY(&sq->req_list))) {
         addr = sq->dma_addr + sq->head * n->sqe_size;
-        nvme_addr_read(n, addr, (void *)&cmd, sizeof(cmd));
+        if (nvme_addr_read(n, addr, (void *)&cmd, sizeof(cmd))) {
+            trace_pci_nvme_err_addr_read(addr);
+            timer_mod(sq->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) +
+                      500 * SCALE_MS);
+            break;
+        }
         nvme_inc_sq_head(sq);
 
         req = QTAILQ_FIRST(&sq->req_list);
diff --git a/hw/block/trace-events b/hw/block/trace-events
index 1b2c431e569e..e2a181a0915d 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -89,6 +89,8 @@ pci_nvme_mmio_shutdown_cleared(void) "shutdown bit cleared"
 # nvme traces for error conditions
 pci_nvme_err_mdts(uint16_t cid, size_t len) "cid %"PRIu16" len %"PRIu64""
 pci_nvme_err_aio(uint16_t cid, void *aio, const char *blkname, uint64_t offset, const char *opc, void *req, uint16_t status) "cid %"PRIu16" aio %p blk \"%s\" offset %"PRIu64" opc \"%s\" req %p status 0x%"PRIx16""
+pci_nvme_err_addr_read(uint64_t addr) "addr 0x%"PRIx64""
+pci_nvme_err_addr_write(uint64_t addr) "addr 0x%"PRIx64""
 pci_nvme_err_invalid_dma(void) "PRP/SGL is too small for transfer size"
 pci_nvme_err_invalid_prplist_ent(uint64_t prplist) "PRP list entry is null or not page aligned: 0x%"PRIx64""
 pci_nvme_err_invalid_prp2_align(uint64_t prp2) "PRP2 is not page aligned: 0x%"PRIx64""
diff --git a/include/block/nvme.h b/include/block/nvme.h
index 91456255ffa7..146c64e0bac7 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -629,7 +629,7 @@ enum NvmeStatusCodes {
     NVME_INVALID_OPCODE         = 0x0001,
     NVME_INVALID_FIELD          = 0x0002,
     NVME_CID_CONFLICT           = 0x0003,
-    NVME_DATA_TRAS_ERROR        = 0x0004,
+    NVME_DATA_TRANSFER_ERROR    = 0x0004,
     NVME_POWER_LOSS_ABORT       = 0x0005,
     NVME_INTERNAL_DEV_ERROR     = 0x0006,
     NVME_CMD_ABORT_REQ          = 0x0007,
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/2] hw/block/nvme: handle transient dma errors
  2020-06-29 20:20 [PATCH 0/2] hw/block/nvme: handle transient dma errors Klaus Jensen
  2020-06-29 20:20 ` [PATCH 1/2] pci: pass along the return value of dma_memory_rw Klaus Jensen
  2020-06-29 20:20 ` [PATCH 2/2] hw/block/nvme: handle dma errors Klaus Jensen
@ 2020-06-29 21:07 ` no-reply
  2020-06-29 21:34   ` Klaus Jensen
  2 siblings, 1 reply; 11+ messages in thread
From: no-reply @ 2020-06-29 21:07 UTC (permalink / raw)
  To: its
  Cc: kwolf, qemu-block, k.jensen, qemu-devel, mreitz, kbusch, its, mlevitsk

Patchew URL: https://patchew.org/QEMU/20200629202053.1223342-1-its@irrelevant.dk/



Hi,

This series failed the docker-quick@centos7 build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-centos7 V=1 NETWORK=1
time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
=== TEST SCRIPT END ===

--- /tmp/qemu-test/src/tests/qemu-iotests/040.out       2020-06-29 20:12:10.000000000 +0000
+++ /tmp/qemu-test/build/tests/qemu-iotests/040.out.bad 2020-06-29 20:58:48.288790818 +0000
@@ -1,3 +1,5 @@
+WARNING:qemu.machine:qemu received signal 9: /tmp/qemu-test/build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64 -display none -vga none -chardev socket,id=mon,path=/tmp/tmp.Jdol0fPScQ/qemu-21749-monitor.sock -mon chardev=mon,mode=control -qtest unix:path=/tmp/tmp.Jdol0fPScQ/qemu-21749-qtest.sock -accel qtest -nodefaults -display none -accel qtest
+WARNING:qemu.machine:qemu received signal 9: /tmp/qemu-test/build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64 -display none -vga none -chardev socket,id=mon,path=/tmp/tmp.Jdol0fPScQ/qemu-21749-monitor.sock -mon chardev=mon,mode=control -qtest unix:path=/tmp/tmp.Jdol0fPScQ/qemu-21749-qtest.sock -accel qtest -nodefaults -display none -accel qtest
 ...........................................................
 ----------------------------------------------------------------------
 Ran 59 tests
---
Not run: 259
Failures: 040
Failed 1 of 119 iotests
make: *** [check-tests/check-block.sh] Error 1
make: *** Waiting for unfinished jobs....
  TEST    check-qtest-aarch64: tests/qtest/qos-test
Traceback (most recent call last):
---
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=da25eaa8bdd04cb783e2c427c6a5aa94', '-u', '1001', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-98l7koy2/src/docker-src.2020-06-29-16.51.46.20742:/var/tmp/qemu:z,ro', 'qemu:centos7', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit status 2.
filter=--filter=label=com.qemu.instance.uuid=da25eaa8bdd04cb783e2c427c6a5aa94
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-98l7koy2/src'
make: *** [docker-run-test-quick@centos7] Error 2

real    15m57.590s
user    0m9.240s


The full log is available at
http://patchew.org/logs/20200629202053.1223342-1-its@irrelevant.dk/testing.docker-quick@centos7/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/2] hw/block/nvme: handle transient dma errors
  2020-06-29 21:07 ` [PATCH 0/2] hw/block/nvme: handle transient " no-reply
@ 2020-06-29 21:34   ` Klaus Jensen
  2020-07-01 12:58     ` Philippe Mathieu-Daudé
  0 siblings, 1 reply; 11+ messages in thread
From: Klaus Jensen @ 2020-06-29 21:34 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, k.jensen, kbusch, qemu-block, mreitz

On Jun 29 14:07, no-reply@patchew.org wrote:
> Patchew URL: https://patchew.org/QEMU/20200629202053.1223342-1-its@irrelevant.dk/
> 
> 
> 
> Hi,
> 
> This series failed the docker-quick@centos7 build test. Please find the testing commands and
> their output below. If you have Docker installed, you can probably reproduce it
> locally.
> 
> === TEST SCRIPT BEGIN ===
> #!/bin/bash
> make docker-image-centos7 V=1 NETWORK=1
> time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
> === TEST SCRIPT END ===
> 
> --- /tmp/qemu-test/src/tests/qemu-iotests/040.out       2020-06-29 20:12:10.000000000 +0000
> +++ /tmp/qemu-test/build/tests/qemu-iotests/040.out.bad 2020-06-29 20:58:48.288790818 +0000
> @@ -1,3 +1,5 @@
> +WARNING:qemu.machine:qemu received signal 9: /tmp/qemu-test/build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64 -display none -vga none -chardev socket,id=mon,path=/tmp/tmp.Jdol0fPScQ/qemu-21749-monitor.sock -mon chardev=mon,mode=control -qtest unix:path=/tmp/tmp.Jdol0fPScQ/qemu-21749-qtest.sock -accel qtest -nodefaults -display none -accel qtest
> +WARNING:qemu.machine:qemu received signal 9: /tmp/qemu-test/build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64 -display none -vga none -chardev socket,id=mon,path=/tmp/tmp.Jdol0fPScQ/qemu-21749-monitor.sock -mon chardev=mon,mode=control -qtest unix:path=/tmp/tmp.Jdol0fPScQ/qemu-21749-qtest.sock -accel qtest -nodefaults -display none -accel qtest


Hmm, I can't seem to reproduce this locally and the test succeeded on
the next series[1] that is based on this.

Is this a flaky test? Or a bad test runner? I'm of course worried when
a qcow2 test fails and I touch something else than the nvme device ;)


  [1]: https://patchew.org/QEMU/20200629203155.1236860-1-its@irrelevant.dk/


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] pci: pass along the return value of dma_memory_rw
  2020-06-29 20:20 ` [PATCH 1/2] pci: pass along the return value of dma_memory_rw Klaus Jensen
@ 2020-07-01 12:18   ` Michael S. Tsirkin
  2020-07-22 11:48   ` Michael S. Tsirkin
  1 sibling, 0 replies; 11+ messages in thread
From: Michael S. Tsirkin @ 2020-07-01 12:18 UTC (permalink / raw)
  To: Klaus Jensen
  Cc: Kevin Wolf, qemu-block, Klaus Jensen, qemu-devel, Max Reitz,
	Keith Busch, Maxim Levitsky, Philippe Mathieu-Daudé

On Mon, Jun 29, 2020 at 10:20:52PM +0200, Klaus Jensen wrote:
> From: Klaus Jensen <k.jensen@samsung.com>
> 
> Some devices might want to know the return value of dma_memory_rw, so
> pass it along instead of ignoring it.
> 
> There are no existing users of the return value, so this patch should be
> safe.
> 
> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
> Acked-by: Keith Busch <kbusch@kernel.org>

Feel free to merge with patch 2/2.

> ---
>  include/hw/pci/pci.h | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
> index a4e9c3341615..2347dc36bfb5 100644
> --- a/include/hw/pci/pci.h
> +++ b/include/hw/pci/pci.h
> @@ -786,8 +786,7 @@ static inline AddressSpace *pci_get_address_space(PCIDevice *dev)
>  static inline int pci_dma_rw(PCIDevice *dev, dma_addr_t addr,
>                               void *buf, dma_addr_t len, DMADirection dir)
>  {
> -    dma_memory_rw(pci_get_address_space(dev), addr, buf, len, dir);
> -    return 0;
> +    return dma_memory_rw(pci_get_address_space(dev), addr, buf, len, dir);
>  }
>  
>  static inline int pci_dma_read(PCIDevice *dev, dma_addr_t addr,
> -- 
> 2.27.0



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/2] hw/block/nvme: handle transient dma errors
  2020-06-29 21:34   ` Klaus Jensen
@ 2020-07-01 12:58     ` Philippe Mathieu-Daudé
  2020-07-03  7:50       ` Kevin Wolf
  0 siblings, 1 reply; 11+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-07-01 12:58 UTC (permalink / raw)
  To: Klaus Jensen, qemu-devel, kwolf, mreitz; +Cc: kbusch, k.jensen, qemu-block

On 6/29/20 11:34 PM, Klaus Jensen wrote:
> On Jun 29 14:07, no-reply@patchew.org wrote:
>> Patchew URL: https://patchew.org/QEMU/20200629202053.1223342-1-its@irrelevant.dk/

>> --- /tmp/qemu-test/src/tests/qemu-iotests/040.out       2020-06-29 20:12:10.000000000 +0000
>> +++ /tmp/qemu-test/build/tests/qemu-iotests/040.out.bad 2020-06-29 20:58:48.288790818 +0000
>> @@ -1,3 +1,5 @@
>> +WARNING:qemu.machine:qemu received signal 9: /tmp/qemu-test/build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64 -display none -vga none -chardev socket,id=mon,path=/tmp/tmp.Jdol0fPScQ/qemu-21749-monitor.sock -mon chardev=mon,mode=control -qtest unix:path=/tmp/tmp.Jdol0fPScQ/qemu-21749-qtest.sock -accel qtest -nodefaults -display none -accel qtest
>> +WARNING:qemu.machine:qemu received signal 9: /tmp/qemu-test/build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64 -display none -vga none -chardev socket,id=mon,path=/tmp/tmp.Jdol0fPScQ/qemu-21749-monitor.sock -mon chardev=mon,mode=control -qtest unix:path=/tmp/tmp.Jdol0fPScQ/qemu-21749-qtest.sock -accel qtest -nodefaults -display none -accel qtest

Kevin, Max, can iotests/040 be affected by this change?

> 
> 
> Hmm, I can't seem to reproduce this locally and the test succeeded on
> the next series[1] that is based on this.
> 
> Is this a flaky test? Or a bad test runner? I'm of course worried when
> a qcow2 test fails and I touch something else than the nvme device ;)
> 
> 
>   [1]: https://patchew.org/QEMU/20200629203155.1236860-1-its@irrelevant.dk/
> 



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/2] hw/block/nvme: handle transient dma errors
  2020-07-01 12:58     ` Philippe Mathieu-Daudé
@ 2020-07-03  7:50       ` Kevin Wolf
  2020-07-03  7:55         ` Philippe Mathieu-Daudé
  0 siblings, 1 reply; 11+ messages in thread
From: Kevin Wolf @ 2020-07-03  7:50 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: qemu-block, k.jensen, qemu-devel, mreitz, kbusch, Klaus Jensen

Am 01.07.2020 um 14:58 hat Philippe Mathieu-Daudé geschrieben:
> On 6/29/20 11:34 PM, Klaus Jensen wrote:
> > On Jun 29 14:07, no-reply@patchew.org wrote:
> >> Patchew URL: https://patchew.org/QEMU/20200629202053.1223342-1-its@irrelevant.dk/
> 
> >> --- /tmp/qemu-test/src/tests/qemu-iotests/040.out       2020-06-29 20:12:10.000000000 +0000
> >> +++ /tmp/qemu-test/build/tests/qemu-iotests/040.out.bad 2020-06-29 20:58:48.288790818 +0000
> >> @@ -1,3 +1,5 @@
> >> +WARNING:qemu.machine:qemu received signal 9: /tmp/qemu-test/build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64 -display none -vga none -chardev socket,id=mon,path=/tmp/tmp.Jdol0fPScQ/qemu-21749-monitor.sock -mon chardev=mon,mode=control -qtest unix:path=/tmp/tmp.Jdol0fPScQ/qemu-21749-qtest.sock -accel qtest -nodefaults -display none -accel qtest
> >> +WARNING:qemu.machine:qemu received signal 9: /tmp/qemu-test/build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64 -display none -vga none -chardev socket,id=mon,path=/tmp/tmp.Jdol0fPScQ/qemu-21749-monitor.sock -mon chardev=mon,mode=control -qtest unix:path=/tmp/tmp.Jdol0fPScQ/qemu-21749-qtest.sock -accel qtest -nodefaults -display none -accel qtest
> 
> Kevin, Max, can iotests/040 be affected by this change?

The diffstat of this series looks like it doesn't touch anything outside
of the nvme emuation, which isn't used by this test, so at least I'd say
it's not the fault of the patch series.

I think test cases use SIGKILL primarily in timeout handlers, so maybe
the test host was overloaded and didn't shutdown QEMU in time so it was
killed. There is no actually failing test case:

 ...........................................................
 ----------------------------------------------------------------------
 Ran 59 tests

You would have 'F' or 'E' for fail/error instead of '.' otherwise.

Kevin

> > 
> > 
> > Hmm, I can't seem to reproduce this locally and the test succeeded on
> > the next series[1] that is based on this.
> > 
> > Is this a flaky test? Or a bad test runner? I'm of course worried when
> > a qcow2 test fails and I touch something else than the nvme device ;)
> > 
> > 
> >   [1]: https://patchew.org/QEMU/20200629203155.1236860-1-its@irrelevant.dk/
> > 
> 



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/2] hw/block/nvme: handle transient dma errors
  2020-07-03  7:50       ` Kevin Wolf
@ 2020-07-03  7:55         ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 11+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-07-03  7:55 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, k.jensen, qemu-devel, mreitz, kbusch, Klaus Jensen

On 7/3/20 9:50 AM, Kevin Wolf wrote:
> Am 01.07.2020 um 14:58 hat Philippe Mathieu-Daudé geschrieben:
>> On 6/29/20 11:34 PM, Klaus Jensen wrote:
>>> On Jun 29 14:07, no-reply@patchew.org wrote:
>>>> Patchew URL: https://patchew.org/QEMU/20200629202053.1223342-1-its@irrelevant.dk/
>>
>>>> --- /tmp/qemu-test/src/tests/qemu-iotests/040.out       2020-06-29 20:12:10.000000000 +0000
>>>> +++ /tmp/qemu-test/build/tests/qemu-iotests/040.out.bad 2020-06-29 20:58:48.288790818 +0000
>>>> @@ -1,3 +1,5 @@
>>>> +WARNING:qemu.machine:qemu received signal 9: /tmp/qemu-test/build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64 -display none -vga none -chardev socket,id=mon,path=/tmp/tmp.Jdol0fPScQ/qemu-21749-monitor.sock -mon chardev=mon,mode=control -qtest unix:path=/tmp/tmp.Jdol0fPScQ/qemu-21749-qtest.sock -accel qtest -nodefaults -display none -accel qtest
>>>> +WARNING:qemu.machine:qemu received signal 9: /tmp/qemu-test/build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64 -display none -vga none -chardev socket,id=mon,path=/tmp/tmp.Jdol0fPScQ/qemu-21749-monitor.sock -mon chardev=mon,mode=control -qtest unix:path=/tmp/tmp.Jdol0fPScQ/qemu-21749-qtest.sock -accel qtest -nodefaults -display none -accel qtest
>>
>> Kevin, Max, can iotests/040 be affected by this change?
> 
> The diffstat of this series looks like it doesn't touch anything outside
> of the nvme emuation, which isn't used by this test, so at least I'd say
> it's not the fault of the patch series.
> 
> I think test cases use SIGKILL primarily in timeout handlers, so maybe
> the test host was overloaded and didn't shutdown QEMU in time so it was
> killed. There is no actually failing test case:
> 
>  ...........................................................
>  ----------------------------------------------------------------------
>  Ran 59 tests
> 
> You would have 'F' or 'E' for fail/error instead of '.' otherwise.

TIL how to read that line :)

Thanks for your analysis Kevin!

> 
> Kevin
> 
>>>
>>>
>>> Hmm, I can't seem to reproduce this locally and the test succeeded on
>>> the next series[1] that is based on this.
>>>
>>> Is this a flaky test? Or a bad test runner? I'm of course worried when
>>> a qcow2 test fails and I touch something else than the nvme device ;)
>>>
>>>
>>>   [1]: https://patchew.org/QEMU/20200629203155.1236860-1-its@irrelevant.dk/
>>>
>>
> 



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] pci: pass along the return value of dma_memory_rw
  2020-06-29 20:20 ` [PATCH 1/2] pci: pass along the return value of dma_memory_rw Klaus Jensen
  2020-07-01 12:18   ` Michael S. Tsirkin
@ 2020-07-22 11:48   ` Michael S. Tsirkin
  2020-07-27  9:20     ` Klaus Jensen
  1 sibling, 1 reply; 11+ messages in thread
From: Michael S. Tsirkin @ 2020-07-22 11:48 UTC (permalink / raw)
  To: Klaus Jensen
  Cc: Kevin Wolf, qemu-block, Klaus Jensen, qemu-devel, Max Reitz,
	Keith Busch, Maxim Levitsky, Philippe Mathieu-Daudé

On Mon, Jun 29, 2020 at 10:20:52PM +0200, Klaus Jensen wrote:
> From: Klaus Jensen <k.jensen@samsung.com>
> 
> Some devices might want to know the return value of dma_memory_rw, so
> pass it along instead of ignoring it.
> 
> There are no existing users of the return value, so this patch should be
> safe.
> 
> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
> Acked-by: Keith Busch <kbusch@kernel.org>


Please feel free to merge this with the patch that uses the
return value.

> ---
>  include/hw/pci/pci.h | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
> index a4e9c3341615..2347dc36bfb5 100644
> --- a/include/hw/pci/pci.h
> +++ b/include/hw/pci/pci.h
> @@ -786,8 +786,7 @@ static inline AddressSpace *pci_get_address_space(PCIDevice *dev)
>  static inline int pci_dma_rw(PCIDevice *dev, dma_addr_t addr,
>                               void *buf, dma_addr_t len, DMADirection dir)
>  {
> -    dma_memory_rw(pci_get_address_space(dev), addr, buf, len, dir);
> -    return 0;
> +    return dma_memory_rw(pci_get_address_space(dev), addr, buf, len, dir);
>  }
>  
>  static inline int pci_dma_read(PCIDevice *dev, dma_addr_t addr,
> -- 
> 2.27.0
> 
> 



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] pci: pass along the return value of dma_memory_rw
  2020-07-22 11:48   ` Michael S. Tsirkin
@ 2020-07-27  9:20     ` Klaus Jensen
  0 siblings, 0 replies; 11+ messages in thread
From: Klaus Jensen @ 2020-07-27  9:20 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Kevin Wolf, qemu-block, Klaus Jensen, qemu-devel, Max Reitz,
	Keith Busch, Maxim Levitsky, Philippe Mathieu-Daudé

On Jul 22 07:48, Michael S. Tsirkin wrote:
> On Mon, Jun 29, 2020 at 10:20:52PM +0200, Klaus Jensen wrote:
> > From: Klaus Jensen <k.jensen@samsung.com>
> > 
> > Some devices might want to know the return value of dma_memory_rw, so
> > pass it along instead of ignoring it.
> > 
> > There are no existing users of the return value, so this patch should be
> > safe.
> > 
> > Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> > Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> > Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
> > Acked-by: Keith Busch <kbusch@kernel.org>
> 
> 
> Please feel free to merge this with the patch that uses the
> return value.
> 

Hi Michael,

Noted. The patch depends on another series that have not been merged
yet, so this is why it is lying around and waiting to be added.


Thanks,
Klaus


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2020-07-27  9:21 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-29 20:20 [PATCH 0/2] hw/block/nvme: handle transient dma errors Klaus Jensen
2020-06-29 20:20 ` [PATCH 1/2] pci: pass along the return value of dma_memory_rw Klaus Jensen
2020-07-01 12:18   ` Michael S. Tsirkin
2020-07-22 11:48   ` Michael S. Tsirkin
2020-07-27  9:20     ` Klaus Jensen
2020-06-29 20:20 ` [PATCH 2/2] hw/block/nvme: handle dma errors Klaus Jensen
2020-06-29 21:07 ` [PATCH 0/2] hw/block/nvme: handle transient " no-reply
2020-06-29 21:34   ` Klaus Jensen
2020-07-01 12:58     ` Philippe Mathieu-Daudé
2020-07-03  7:50       ` Kevin Wolf
2020-07-03  7:55         ` Philippe Mathieu-Daudé

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.