All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
@ 2015-04-17  7:59 Fam Zheng
  2015-04-17  7:59 ` [Qemu-devel] [PATCH 01/18] virtio: Return error from virtqueue_map_sg Fam Zheng
                   ` (19 more replies)
  0 siblings, 20 replies; 54+ messages in thread
From: Fam Zheng @ 2015-04-17  7:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Michael S. Tsirkin, Aneesh Kumar K.V,
	Stefan Hajnoczi, Amit Shah, Paolo Bonzini

Currently, virtio code chooses to kill QEMU if the guest passes any invalid
data with vring. That has drawbacks such as losing unsaved data (e.g. when
guest user is writing a very long email), or possible denial of service in
a nested vm use case where virtio device is passed through.

virtio-1 has introduced a new status bit "NEEDS RESET" which could be used to
improve this by communicating the error state between virtio devices and
drivers. The device notifies guest upon setting the bit, then the guest driver
should detect this bit and report to userspace, or recover the device by
resetting it.

This series makes necessary changes in virtio core code, based on which
virtio-blk is converted. Other devices now keep the existing behavior by
passing in "error_abort". They will be converted in following series. The Linux
driver part will also be worked on.

One concern with this behavior change is that it's now harder to notice the
actual driver bug that caused the error, as the guest continues to run.  To
address that, we could probably add a new error action option to virtio
devices,  similar to the "read/write werror" in block layer, so the vm could be
paused and the management will get an event in QMP like pvpanic.  This work can
be done on top.



Fam Zheng (18):
  virtio: Return error from virtqueue_map_sg
  virtio: Return error from virtqueue_num_heads
  virtio: Return error from virtqueue_get_head
  virtio: Return error from virtqueue_next_desc
  virtio: Return error from virtqueue_get_avail_bytes
  virtio: Return error from virtqueue_pop
  virtio: Return error from virtqueue_avail_bytes
  virtio: Return error from virtio_add_queue
  virtio: Return error from virtio_del_queue
  virtio: Add macro for VIRTIO_CONFIG_S_NEEDS_RESET
  virtio: Add "needs_reset" flag to virtio device
  virtio: Return -EINVAL if the vdev needs reset in virtqueue_pop
  virtio-blk: Graceful error handling of virtqueue_pop
  qtest: Add "QTEST_FILTER" to filter test cases
  qtest: virtio-blk: Extract "setup" for future reuse
  libqos: Add qvirtio_needs_reset
  qtest: Add test case for "needs reset" of virtio-blk
  qtest: virtio-blk: Suppress virtio error messages in "make check"

 hw/9pfs/virtio-9p-device.c                     |   2 +-
 hw/9pfs/virtio-9p.c                            |   2 +-
 hw/block/dataplane/virtio-blk.c                |   9 +-
 hw/block/virtio-blk.c                          |  62 +++++--
 hw/char/virtio-serial-bus.c                    |  30 ++--
 hw/net/virtio-net.c                            |  36 +++--
 hw/scsi/virtio-scsi.c                          |   8 +-
 hw/virtio/virtio-balloon.c                     |  13 +-
 hw/virtio/virtio-rng.c                         |   6 +-
 hw/virtio/virtio.c                             | 214 ++++++++++++++++++-------
 include/hw/virtio/virtio-blk.h                 |   3 +-
 include/hw/virtio/virtio.h                     |  17 +-
 include/standard-headers/linux/virtio_config.h |   2 +
 tests/Makefile                                 |   6 +-
 tests/libqos/virtio.c                          |   5 +
 tests/libqos/virtio.h                          |   2 +
 tests/virtio-blk-test.c                        | 196 ++++++++++++++++++++--
 17 files changed, 482 insertions(+), 131 deletions(-)

-- 
1.9.3

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [Qemu-devel] [PATCH 01/18] virtio: Return error from virtqueue_map_sg
  2015-04-17  7:59 [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET" Fam Zheng
@ 2015-04-17  7:59 ` Fam Zheng
  2015-04-17  7:59 ` [Qemu-devel] [PATCH 02/18] virtio: Return error from virtqueue_num_heads Fam Zheng
                   ` (18 subsequent siblings)
  19 siblings, 0 replies; 54+ messages in thread
From: Fam Zheng @ 2015-04-17  7:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Michael S. Tsirkin, Aneesh Kumar K.V,
	Stefan Hajnoczi, Amit Shah, Paolo Bonzini

virtqueue_map_sg calls error_report() and exit() when parameter is invalid, or
when mapping failed. Lift the error check to caller, by adding errp. Also, when
one of the mapping failed, previous maps are reverted for cleanness.

All existing callers pass in error_abort for now.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 hw/block/virtio-blk.c       |  4 ++--
 hw/char/virtio-serial-bus.c |  4 ++--
 hw/virtio/virtio.c          | 27 ++++++++++++++++++---------
 include/hw/virtio/virtio.h  |  3 ++-
 4 files changed, 24 insertions(+), 14 deletions(-)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 9546fd2..f7d8528 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -822,9 +822,9 @@ static int virtio_blk_load_device(VirtIODevice *vdev, QEMUFile *f,
         s->rq = req;
 
         virtqueue_map_sg(req->elem.in_sg, req->elem.in_addr,
-            req->elem.in_num, 1);
+            req->elem.in_num, 1, &error_abort);
         virtqueue_map_sg(req->elem.out_sg, req->elem.out_addr,
-            req->elem.out_num, 0);
+            req->elem.out_num, 0, &error_abort);
     }
 
     return 0;
diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
index e336bdb..1e07858 100644
--- a/hw/char/virtio-serial-bus.c
+++ b/hw/char/virtio-serial-bus.c
@@ -703,9 +703,9 @@ static int fetch_active_ports_list(QEMUFile *f, int version_id,
                 qemu_get_buffer(f, (unsigned char *)&port->elem,
                                 sizeof(port->elem));
                 virtqueue_map_sg(port->elem.in_sg, port->elem.in_addr,
-                                 port->elem.in_num, 1);
+                                 port->elem.in_num, 1, &error_abort);
                 virtqueue_map_sg(port->elem.out_sg, port->elem.out_addr,
-                                 port->elem.out_num, 1);
+                                 port->elem.out_num, 1, &error_abort);
 
                 /*
                  *  Port was throttled on source machine.  Let's
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 17c1260..179c412 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -432,25 +432,32 @@ int virtqueue_avail_bytes(VirtQueue *vq, unsigned int in_bytes,
 }
 
 void virtqueue_map_sg(struct iovec *sg, hwaddr *addr,
-    size_t num_sg, int is_write)
+                      size_t num_sg, int is_write,
+                      Error **errp)
 {
-    unsigned int i;
+    int i;
     hwaddr len;
 
     if (num_sg > VIRTQUEUE_MAX_SIZE) {
-        error_report("virtio: map attempt out of bounds: %zd > %d",
-                     num_sg, VIRTQUEUE_MAX_SIZE);
-        exit(1);
+        error_setg(errp, "virtio: map attempt out of bounds: %zd > %d",
+                   num_sg, VIRTQUEUE_MAX_SIZE);
+        return;
     }
 
     for (i = 0; i < num_sg; i++) {
         len = sg[i].iov_len;
         sg[i].iov_base = cpu_physical_memory_map(addr[i], &len, is_write);
         if (sg[i].iov_base == NULL || len != sg[i].iov_len) {
-            error_report("virtio: error trying to map MMIO memory");
-            exit(1);
+            goto fail;
         }
     }
+    return;
+fail:
+    for ( ; i >= 0; i--) {
+        cpu_physical_memory_unmap(sg[i].iov_base, sg[i].iov_len,
+                                  is_write, 0);
+    }
+    error_setg(errp, "virtio: error trying to map MMIO memory");
 }
 
 int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
@@ -514,8 +521,10 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
     } while ((i = virtqueue_next_desc(vdev, desc_pa, i, max)) != max);
 
     /* Now map what we have collected */
-    virtqueue_map_sg(elem->in_sg, elem->in_addr, elem->in_num, 1);
-    virtqueue_map_sg(elem->out_sg, elem->out_addr, elem->out_num, 0);
+    virtqueue_map_sg(elem->in_sg, elem->in_addr, elem->in_num, 1,
+                     &error_abort);
+    virtqueue_map_sg(elem->out_sg, elem->out_addr, elem->out_num, 0,
+                     &error_abort);
 
     elem->index = head;
 
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index d95f8b6..64c10cf 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -140,7 +140,8 @@ void virtqueue_fill(VirtQueue *vq, const VirtQueueElement *elem,
                     unsigned int len, unsigned int idx);
 
 void virtqueue_map_sg(struct iovec *sg, hwaddr *addr,
-    size_t num_sg, int is_write);
+                      size_t num_sg, int is_write,
+                      Error **errp);
 int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem);
 int virtqueue_avail_bytes(VirtQueue *vq, unsigned int in_bytes,
                           unsigned int out_bytes);
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [Qemu-devel] [PATCH 02/18] virtio: Return error from virtqueue_num_heads
  2015-04-17  7:59 [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET" Fam Zheng
  2015-04-17  7:59 ` [Qemu-devel] [PATCH 01/18] virtio: Return error from virtqueue_map_sg Fam Zheng
@ 2015-04-17  7:59 ` Fam Zheng
  2015-04-17  7:59 ` [Qemu-devel] [PATCH 03/18] virtio: Return error from virtqueue_get_head Fam Zheng
                   ` (17 subsequent siblings)
  19 siblings, 0 replies; 54+ messages in thread
From: Fam Zheng @ 2015-04-17  7:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Michael S. Tsirkin, Aneesh Kumar K.V,
	Stefan Hajnoczi, Amit Shah, Paolo Bonzini

If vring has invalid data, virtqueue_num_heads used to print an error
and abort(). Change to returning -EINVAL with error info.

For now, pass in error_abort from two existing callers.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 hw/virtio/virtio.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 179c412..b02c7a1 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -292,15 +292,15 @@ void virtqueue_push(VirtQueue *vq, const VirtQueueElement *elem,
     virtqueue_flush(vq, 1);
 }
 
-static int virtqueue_num_heads(VirtQueue *vq, unsigned int idx)
+static int virtqueue_num_heads(VirtQueue *vq, unsigned int idx, Error **errp)
 {
     uint16_t num_heads = vring_avail_idx(vq) - idx;
 
     /* Check it isn't doing very strange things with descriptor numbers. */
     if (num_heads > vq->vring.num) {
-        error_report("Guest moved used index from %u to %u",
-                     idx, vring_avail_idx(vq));
-        exit(1);
+        error_setg(errp, "Guest moved used index from %u to %u",
+                   idx, vring_avail_idx(vq));
+        return -EINVAL;
     }
     /* On success, callers read a descriptor at vq->last_avail_idx.
      * Make sure descriptor read does not bypass avail index read. */
@@ -361,7 +361,7 @@ void virtqueue_get_avail_bytes(VirtQueue *vq, unsigned int *in_bytes,
     idx = vq->last_avail_idx;
 
     total_bufs = in_total = out_total = 0;
-    while (virtqueue_num_heads(vq, idx)) {
+    while (virtqueue_num_heads(vq, idx, &error_abort)) {
         VirtIODevice *vdev = vq->vdev;
         unsigned int max, num_bufs, indirect = 0;
         hwaddr desc_pa;
@@ -466,7 +466,7 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
     hwaddr desc_pa = vq->vring.desc;
     VirtIODevice *vdev = vq->vdev;
 
-    if (!virtqueue_num_heads(vq, vq->last_avail_idx))
+    if (!virtqueue_num_heads(vq, vq->last_avail_idx, &error_abort))
         return 0;
 
     /* When we start there are none of either input nor output. */
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [Qemu-devel] [PATCH 03/18] virtio: Return error from virtqueue_get_head
  2015-04-17  7:59 [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET" Fam Zheng
  2015-04-17  7:59 ` [Qemu-devel] [PATCH 01/18] virtio: Return error from virtqueue_map_sg Fam Zheng
  2015-04-17  7:59 ` [Qemu-devel] [PATCH 02/18] virtio: Return error from virtqueue_num_heads Fam Zheng
@ 2015-04-17  7:59 ` Fam Zheng
  2015-04-21  6:27   ` Michael S. Tsirkin
  2015-04-17  7:59 ` [Qemu-devel] [PATCH 04/18] virtio: Return error from virtqueue_next_desc Fam Zheng
                   ` (16 subsequent siblings)
  19 siblings, 1 reply; 54+ messages in thread
From: Fam Zheng @ 2015-04-17  7:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Michael S. Tsirkin, Aneesh Kumar K.V,
	Stefan Hajnoczi, Amit Shah, Paolo Bonzini

Return type is changed to int.

When data is invalid, return -EINVAL with an error.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 hw/virtio/virtio.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index b02c7a1..a525f8e 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -311,9 +311,10 @@ static int virtqueue_num_heads(VirtQueue *vq, unsigned int idx, Error **errp)
     return num_heads;
 }
 
-static unsigned int virtqueue_get_head(VirtQueue *vq, unsigned int idx)
+static int virtqueue_get_head(VirtQueue *vq, unsigned int idx,
+                              Error **errp)
 {
-    unsigned int head;
+    int head;
 
     /* Grab the next descriptor number they're advertising, and increment
      * the index we've seen. */
@@ -321,8 +322,8 @@ static unsigned int virtqueue_get_head(VirtQueue *vq, unsigned int idx)
 
     /* If their number is silly, that's a fatal mistake. */
     if (head >= vq->vring.num) {
-        error_report("Guest says index %u is available", head);
-        exit(1);
+        error_setg(errp, "Guest says index %u is available", head);
+        return -EINVAL;
     }
 
     return head;
@@ -369,7 +370,7 @@ void virtqueue_get_avail_bytes(VirtQueue *vq, unsigned int *in_bytes,
 
         max = vq->vring.num;
         num_bufs = total_bufs;
-        i = virtqueue_get_head(vq, idx++);
+        i = virtqueue_get_head(vq, idx++, &error_abort);
         desc_pa = vq->vring.desc;
 
         if (vring_desc_flags(vdev, desc_pa, i) & VRING_DESC_F_INDIRECT) {
@@ -474,7 +475,7 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
 
     max = vq->vring.num;
 
-    i = head = virtqueue_get_head(vq, vq->last_avail_idx++);
+    i = head = virtqueue_get_head(vq, vq->last_avail_idx++, &error_abort);
     if (virtio_has_feature(vdev, VIRTIO_RING_F_EVENT_IDX)) {
         vring_set_avail_event(vq, vq->last_avail_idx);
     }
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [Qemu-devel] [PATCH 04/18] virtio: Return error from virtqueue_next_desc
  2015-04-17  7:59 [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET" Fam Zheng
                   ` (2 preceding siblings ...)
  2015-04-17  7:59 ` [Qemu-devel] [PATCH 03/18] virtio: Return error from virtqueue_get_head Fam Zheng
@ 2015-04-17  7:59 ` Fam Zheng
  2015-04-21  6:37   ` Michael S. Tsirkin
  2015-04-17  7:59 ` [Qemu-devel] [PATCH 05/18] virtio: Return error from virtqueue_get_avail_bytes Fam Zheng
                   ` (15 subsequent siblings)
  19 siblings, 1 reply; 54+ messages in thread
From: Fam Zheng @ 2015-04-17  7:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Michael S. Tsirkin, Aneesh Kumar K.V,
	Stefan Hajnoczi, Amit Shah, Paolo Bonzini

Two callers pass error_abort now, which can be changed to check return value
and pass the error on.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 hw/virtio/virtio.c | 27 ++++++++++++++++++---------
 1 file changed, 18 insertions(+), 9 deletions(-)

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index a525f8e..2a24829 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -329,10 +329,11 @@ static int virtqueue_get_head(VirtQueue *vq, unsigned int idx,
     return head;
 }
 
-static unsigned virtqueue_next_desc(VirtIODevice *vdev, hwaddr desc_pa,
-                                    unsigned int i, unsigned int max)
+static int virtqueue_next_desc(VirtIODevice *vdev, hwaddr desc_pa,
+                               unsigned int i, unsigned int max,
+                               Error **errp)
 {
-    unsigned int next;
+    int next;
 
     /* If this descriptor says it doesn't chain, we're done. */
     if (!(vring_desc_flags(vdev, desc_pa, i) & VRING_DESC_F_NEXT)) {
@@ -345,8 +346,8 @@ static unsigned virtqueue_next_desc(VirtIODevice *vdev, hwaddr desc_pa,
     smp_wmb();
 
     if (next >= max) {
-        error_report("Desc next is %u", next);
-        exit(1);
+        error_setg(errp, "Desc next is %u", next);
+        return -EINVAL;
     }
 
     return next;
@@ -392,7 +393,7 @@ void virtqueue_get_avail_bytes(VirtQueue *vq, unsigned int *in_bytes,
             num_bufs = i = 0;
         }
 
-        do {
+        while (true) {
             /* If we've got too many, that implies a descriptor loop. */
             if (++num_bufs > max) {
                 error_report("Looped descriptor");
@@ -407,7 +408,11 @@ void virtqueue_get_avail_bytes(VirtQueue *vq, unsigned int *in_bytes,
             if (in_total >= max_in_bytes && out_total >= max_out_bytes) {
                 goto done;
             }
-        } while ((i = virtqueue_next_desc(vdev, desc_pa, i, max)) != max);
+            i = virtqueue_next_desc(vdev, desc_pa, i, max, &error_abort);
+            if (i == max) {
+                break;
+            }
+        }
 
         if (!indirect)
             total_bufs = num_bufs;
@@ -493,7 +498,7 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
     }
 
     /* Collect all the descriptors */
-    do {
+    while (true) {
         struct iovec *sg;
 
         if (vring_desc_flags(vdev, desc_pa, i) & VRING_DESC_F_WRITE) {
@@ -519,7 +524,11 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
             error_report("Looped descriptor");
             exit(1);
         }
-    } while ((i = virtqueue_next_desc(vdev, desc_pa, i, max)) != max);
+        i = virtqueue_next_desc(vdev, desc_pa, i, max, &error_abort);
+        if (i == max) {
+            break;
+        }
+    }
 
     /* Now map what we have collected */
     virtqueue_map_sg(elem->in_sg, elem->in_addr, elem->in_num, 1,
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [Qemu-devel] [PATCH 05/18] virtio: Return error from virtqueue_get_avail_bytes
  2015-04-17  7:59 [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET" Fam Zheng
                   ` (3 preceding siblings ...)
  2015-04-17  7:59 ` [Qemu-devel] [PATCH 04/18] virtio: Return error from virtqueue_next_desc Fam Zheng
@ 2015-04-17  7:59 ` Fam Zheng
  2015-04-17  7:59 ` [Qemu-devel] [PATCH 06/18] virtio: Return error from virtqueue_pop Fam Zheng
                   ` (14 subsequent siblings)
  19 siblings, 0 replies; 54+ messages in thread
From: Fam Zheng @ 2015-04-17  7:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Michael S. Tsirkin, Aneesh Kumar K.V,
	Stefan Hajnoczi, Amit Shah, Paolo Bonzini

Add errp and pass in error_abort for now. Later, we can add error handling code
in callers.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 hw/char/virtio-serial-bus.c |  2 +-
 hw/virtio/virtio-rng.c      |  2 +-
 hw/virtio/virtio.c          | 39 +++++++++++++++++++++++++++------------
 include/hw/virtio/virtio.h  |  3 ++-
 4 files changed, 31 insertions(+), 15 deletions(-)

diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
index 1e07858..a56dafc 100644
--- a/hw/char/virtio-serial-bus.c
+++ b/hw/char/virtio-serial-bus.c
@@ -272,7 +272,7 @@ size_t virtio_serial_guest_ready(VirtIOSerialPort *port)
     if (use_multiport(port->vser) && !port->guest_connected) {
         return 0;
     }
-    virtqueue_get_avail_bytes(vq, &bytes, NULL, 4096, 0);
+    virtqueue_get_avail_bytes(vq, &bytes, NULL, 4096, 0, &error_abort);
     return bytes;
 }
 
diff --git a/hw/virtio/virtio-rng.c b/hw/virtio/virtio-rng.c
index 06e7178..9e1bd75 100644
--- a/hw/virtio/virtio-rng.c
+++ b/hw/virtio/virtio-rng.c
@@ -33,7 +33,7 @@ static size_t get_request_size(VirtQueue *vq, unsigned quota)
 {
     unsigned int in, out;
 
-    virtqueue_get_avail_bytes(vq, &in, &out, quota, 0);
+    virtqueue_get_avail_bytes(vq, &in, &out, quota, 0, &error_abort);
     return in;
 }
 
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 2a24829..1cd454b 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -355,7 +355,8 @@ static int virtqueue_next_desc(VirtIODevice *vdev, hwaddr desc_pa,
 
 void virtqueue_get_avail_bytes(VirtQueue *vq, unsigned int *in_bytes,
                                unsigned int *out_bytes,
-                               unsigned max_in_bytes, unsigned max_out_bytes)
+                               unsigned max_in_bytes, unsigned max_out_bytes,
+                               Error **errp)
 {
     unsigned int idx;
     unsigned int total_bufs, in_total, out_total;
@@ -363,27 +364,38 @@ void virtqueue_get_avail_bytes(VirtQueue *vq, unsigned int *in_bytes,
     idx = vq->last_avail_idx;
 
     total_bufs = in_total = out_total = 0;
-    while (virtqueue_num_heads(vq, idx, &error_abort)) {
+    while (true) {
         VirtIODevice *vdev = vq->vdev;
         unsigned int max, num_bufs, indirect = 0;
         hwaddr desc_pa;
         int i;
 
+        i = virtqueue_num_heads(vq, idx, errp);
+        if (i < 0) {
+            return;
+        } else if (i == 0) {
+            break;
+        }
+
         max = vq->vring.num;
         num_bufs = total_bufs;
-        i = virtqueue_get_head(vq, idx++, &error_abort);
+        i = virtqueue_get_head(vq, idx++, errp);
+        if (i < 0) {
+            return;
+        }
+
         desc_pa = vq->vring.desc;
 
         if (vring_desc_flags(vdev, desc_pa, i) & VRING_DESC_F_INDIRECT) {
             if (vring_desc_len(vdev, desc_pa, i) % sizeof(VRingDesc)) {
-                error_report("Invalid size for indirect buffer table");
-                exit(1);
+                error_setg(errp, "Invalid size for indirect buffer table");
+                return;
             }
 
             /* If we've got too many, that implies a descriptor loop. */
             if (num_bufs >= max) {
-                error_report("Looped descriptor");
-                exit(1);
+                error_setg(errp, "Looped descriptor");
+                return;
             }
 
             /* loop over the indirect descriptor table */
@@ -396,8 +408,8 @@ void virtqueue_get_avail_bytes(VirtQueue *vq, unsigned int *in_bytes,
         while (true) {
             /* If we've got too many, that implies a descriptor loop. */
             if (++num_bufs > max) {
-                error_report("Looped descriptor");
-                exit(1);
+                error_setg(errp, "Looped descriptor");
+                return;
             }
 
             if (vring_desc_flags(vdev, desc_pa, i) & VRING_DESC_F_WRITE) {
@@ -408,8 +420,10 @@ void virtqueue_get_avail_bytes(VirtQueue *vq, unsigned int *in_bytes,
             if (in_total >= max_in_bytes && out_total >= max_out_bytes) {
                 goto done;
             }
-            i = virtqueue_next_desc(vdev, desc_pa, i, max, &error_abort);
-            if (i == max) {
+            i = virtqueue_next_desc(vdev, desc_pa, i, max, errp);
+            if (i < 0) {
+                return;
+            } else if (i == max) {
                 break;
             }
         }
@@ -433,7 +447,8 @@ int virtqueue_avail_bytes(VirtQueue *vq, unsigned int in_bytes,
 {
     unsigned int in_total, out_total;
 
-    virtqueue_get_avail_bytes(vq, &in_total, &out_total, in_bytes, out_bytes);
+    virtqueue_get_avail_bytes(vq, &in_total, &out_total, in_bytes, out_bytes,
+                              &error_abort);
     return in_bytes <= in_total && out_bytes <= out_total;
 }
 
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index 64c10cf..a37ee64 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -147,7 +147,8 @@ int virtqueue_avail_bytes(VirtQueue *vq, unsigned int in_bytes,
                           unsigned int out_bytes);
 void virtqueue_get_avail_bytes(VirtQueue *vq, unsigned int *in_bytes,
                                unsigned int *out_bytes,
-                               unsigned max_in_bytes, unsigned max_out_bytes);
+                               unsigned max_in_bytes, unsigned max_out_bytes,
+                               Error **errp);
 
 void virtio_notify(VirtIODevice *vdev, VirtQueue *vq);
 
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [Qemu-devel] [PATCH 06/18] virtio: Return error from virtqueue_pop
  2015-04-17  7:59 [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET" Fam Zheng
                   ` (4 preceding siblings ...)
  2015-04-17  7:59 ` [Qemu-devel] [PATCH 05/18] virtio: Return error from virtqueue_get_avail_bytes Fam Zheng
@ 2015-04-17  7:59 ` Fam Zheng
  2015-04-21  6:49   ` Michael S. Tsirkin
  2015-04-17  7:59 ` [Qemu-devel] [PATCH 07/18] virtio: Return error from virtqueue_avail_bytes Fam Zheng
                   ` (13 subsequent siblings)
  19 siblings, 1 reply; 54+ messages in thread
From: Fam Zheng @ 2015-04-17  7:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Michael S. Tsirkin, Aneesh Kumar K.V,
	Stefan Hajnoczi, Amit Shah, Paolo Bonzini

When getting invalid data from vring, virtqueue_pop used to print an
error and exit.

Add an errp parameter so it can return the error to callers.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 hw/9pfs/virtio-9p.c         |  2 +-
 hw/block/virtio-blk.c       |  2 +-
 hw/char/virtio-serial-bus.c | 10 +++----
 hw/net/virtio-net.c         |  6 ++--
 hw/scsi/virtio-scsi.c       |  2 +-
 hw/virtio/virtio-balloon.c  |  4 +--
 hw/virtio/virtio-rng.c      |  2 +-
 hw/virtio/virtio.c          | 70 ++++++++++++++++++++++++++++++++++-----------
 include/hw/virtio/virtio.h  |  2 +-
 9 files changed, 69 insertions(+), 31 deletions(-)

diff --git a/hw/9pfs/virtio-9p.c b/hw/9pfs/virtio-9p.c
index 4964da0..17d0c4a 100644
--- a/hw/9pfs/virtio-9p.c
+++ b/hw/9pfs/virtio-9p.c
@@ -3259,7 +3259,7 @@ void handle_9p_output(VirtIODevice *vdev, VirtQueue *vq)
     ssize_t len;
 
     while ((pdu = alloc_pdu(s)) &&
-            (len = virtqueue_pop(vq, &pdu->elem)) != 0) {
+            (len = virtqueue_pop(vq, &pdu->elem, &error_abort)) != 0) {
         uint8_t *ptr;
         pdu->s = s;
         BUG_ON(pdu->elem.out_num == 0 || pdu->elem.in_num == 0);
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index f7d8528..0b66ee1 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -191,7 +191,7 @@ static VirtIOBlockReq *virtio_blk_get_request(VirtIOBlock *s)
 {
     VirtIOBlockReq *req = virtio_blk_alloc_request(s);
 
-    if (!virtqueue_pop(s->vq, &req->elem)) {
+    if (!virtqueue_pop(s->vq, &req->elem, &error_abort)) {
         virtio_blk_free_request(req);
         return NULL;
     }
diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
index a56dafc..76a934b 100644
--- a/hw/char/virtio-serial-bus.c
+++ b/hw/char/virtio-serial-bus.c
@@ -94,7 +94,7 @@ static size_t write_to_port(VirtIOSerialPort *port,
     while (offset < size) {
         size_t len;
 
-        if (!virtqueue_pop(vq, &elem)) {
+        if (!virtqueue_pop(vq, &elem, &error_abort)) {
             break;
         }
 
@@ -116,7 +116,7 @@ static void discard_vq_data(VirtQueue *vq, VirtIODevice *vdev)
     if (!virtio_queue_ready(vq)) {
         return;
     }
-    while (virtqueue_pop(vq, &elem)) {
+    while (virtqueue_pop(vq, &elem, &error_abort)) {
         virtqueue_push(vq, &elem, 0);
     }
     virtio_notify(vdev, vq);
@@ -137,7 +137,7 @@ static void do_flush_queued_data(VirtIOSerialPort *port, VirtQueue *vq,
 
         /* Pop an elem only if we haven't left off a previous one mid-way */
         if (!port->elem.out_num) {
-            if (!virtqueue_pop(vq, &port->elem)) {
+            if (!virtqueue_pop(vq, &port->elem, &error_abort)) {
                 break;
             }
             port->iov_idx = 0;
@@ -190,7 +190,7 @@ static size_t send_control_msg(VirtIOSerial *vser, void *buf, size_t len)
     if (!virtio_queue_ready(vq)) {
         return 0;
     }
-    if (!virtqueue_pop(vq, &elem)) {
+    if (!virtqueue_pop(vq, &elem, &error_abort)) {
         return 0;
     }
 
@@ -420,7 +420,7 @@ static void control_out(VirtIODevice *vdev, VirtQueue *vq)
 
     len = 0;
     buf = NULL;
-    while (virtqueue_pop(vq, &elem)) {
+    while (virtqueue_pop(vq, &elem, &error_abort)) {
         size_t cur_len;
 
         cur_len = iov_size(elem.out_sg, elem.out_num);
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 59f76bc..bbcb51f 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -804,7 +804,7 @@ static void virtio_net_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
     struct iovec *iov, *iov2;
     unsigned int iov_cnt;
 
-    while (virtqueue_pop(vq, &elem)) {
+    while (virtqueue_pop(vq, &elem, &error_abort)) {
         if (iov_size(elem.in_sg, elem.in_num) < sizeof(status) ||
             iov_size(elem.out_sg, elem.out_num) < sizeof(ctrl)) {
             error_report("virtio-net ctrl missing headers");
@@ -1031,7 +1031,7 @@ static ssize_t virtio_net_receive(NetClientState *nc, const uint8_t *buf, size_t
 
         total = 0;
 
-        if (virtqueue_pop(q->rx_vq, &elem) == 0) {
+        if (virtqueue_pop(q->rx_vq, &elem, &error_abort) == 0) {
             if (i == 0)
                 return -1;
             error_report("virtio-net unexpected empty queue: "
@@ -1134,7 +1134,7 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
         return num_packets;
     }
 
-    while (virtqueue_pop(q->tx_vq, &elem)) {
+    while (virtqueue_pop(q->tx_vq, &elem, &error_abort)) {
         ssize_t ret, len;
         unsigned int out_num = elem.out_num;
         struct iovec *out_sg = &elem.out_sg[0];
diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index c9bea06..40ba03d 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -177,7 +177,7 @@ static int virtio_scsi_parse_req(VirtIOSCSIReq *req,
 static VirtIOSCSIReq *virtio_scsi_pop_req(VirtIOSCSI *s, VirtQueue *vq)
 {
     VirtIOSCSIReq *req = virtio_scsi_init_req(s, vq);
-    if (!virtqueue_pop(vq, &req->elem)) {
+    if (!virtqueue_pop(vq, &req->elem, &error_abort)) {
         virtio_scsi_free_req(req);
         return NULL;
     }
diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 95b0643..e26c0a7 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -206,7 +206,7 @@ static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
     VirtQueueElement elem;
     MemoryRegionSection section;
 
-    while (virtqueue_pop(vq, &elem)) {
+    while (virtqueue_pop(vq, &elem, &error_abort)) {
         size_t offset = 0;
         uint32_t pfn;
 
@@ -246,7 +246,7 @@ static void virtio_balloon_receive_stats(VirtIODevice *vdev, VirtQueue *vq)
     size_t offset = 0;
     qemu_timeval tv;
 
-    if (!virtqueue_pop(vq, elem)) {
+    if (!virtqueue_pop(vq, elem, &error_abort)) {
         goto out;
     }
 
diff --git a/hw/virtio/virtio-rng.c b/hw/virtio/virtio-rng.c
index 9e1bd75..c3cbdc3 100644
--- a/hw/virtio/virtio-rng.c
+++ b/hw/virtio/virtio-rng.c
@@ -56,7 +56,7 @@ static void chr_read(void *opaque, const void *buf, size_t size)
 
     offset = 0;
     while (offset < size) {
-        if (!virtqueue_pop(vrng->vq, &elem)) {
+        if (!virtqueue_pop(vrng->vq, &elem, &error_abort)) {
             break;
         }
         len = iov_from_buf(elem.in_sg, elem.in_num,
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 1cd454b..e6f9f6b 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -481,29 +481,49 @@ fail:
     error_setg(errp, "virtio: error trying to map MMIO memory");
 }
 
-int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
+static void virtqueue_undo_map_sg(struct iovec *sg, hwaddr *addr,
+                                  size_t num_sg, int is_write)
+{
+    int i;
+
+    for (i = 0; i < num_sg; i++) {
+        cpu_physical_memory_unmap(sg[i].iov_base, sg[i].iov_len,
+                                  is_write, 0);
+    }
+}
+
+int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem, Error **errp)
 {
     unsigned int i, head, max;
+    int ret;
     hwaddr desc_pa = vq->vring.desc;
     VirtIODevice *vdev = vq->vdev;
+    Error *local_err = NULL;
 
-    if (!virtqueue_num_heads(vq, vq->last_avail_idx, &error_abort))
-        return 0;
+    ret = virtqueue_num_heads(vq, vq->last_avail_idx, &local_err);
+    if (ret <= 0) {
+        goto err;
+    }
 
     /* When we start there are none of either input nor output. */
     elem->out_num = elem->in_num = 0;
 
     max = vq->vring.num;
 
-    i = head = virtqueue_get_head(vq, vq->last_avail_idx++, &error_abort);
+    ret = virtqueue_get_head(vq, vq->last_avail_idx++, &local_err);
+    if (ret < 0) {
+        goto err;
+    }
+    head = i = ret;
+
     if (virtio_has_feature(vdev, VIRTIO_RING_F_EVENT_IDX)) {
         vring_set_avail_event(vq, vq->last_avail_idx);
     }
 
     if (vring_desc_flags(vdev, desc_pa, i) & VRING_DESC_F_INDIRECT) {
         if (vring_desc_len(vdev, desc_pa, i) % sizeof(VRingDesc)) {
-            error_report("Invalid size for indirect buffer table");
-            exit(1);
+            error_setg(errp, "Invalid size for indirect buffer table");
+            return -EINVAL;
         }
 
         /* loop over the indirect descriptor table */
@@ -518,15 +538,17 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
 
         if (vring_desc_flags(vdev, desc_pa, i) & VRING_DESC_F_WRITE) {
             if (elem->in_num >= ARRAY_SIZE(elem->in_sg)) {
-                error_report("Too many write descriptors in indirect table");
-                exit(1);
+                error_setg(errp,
+                           "Too many write descriptors in indirect table");
+                return -EINVAL;
             }
             elem->in_addr[elem->in_num] = vring_desc_addr(vdev, desc_pa, i);
             sg = &elem->in_sg[elem->in_num++];
         } else {
             if (elem->out_num >= ARRAY_SIZE(elem->out_sg)) {
-                error_report("Too many read descriptors in indirect table");
-                exit(1);
+                error_setg(errp,
+                           "Too many read descriptors in indirect table");
+                return -EINVAL;
             }
             elem->out_addr[elem->out_num] = vring_desc_addr(vdev, desc_pa, i);
             sg = &elem->out_sg[elem->out_num++];
@@ -536,20 +558,31 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
 
         /* If we've got too many, that implies a descriptor loop. */
         if ((elem->in_num + elem->out_num) > max) {
-            error_report("Looped descriptor");
-            exit(1);
+            error_setg(errp, "Looped descriptor");
+            return -EINVAL;
         }
-        i = virtqueue_next_desc(vdev, desc_pa, i, max, &error_abort);
-        if (i == max) {
+        ret = virtqueue_next_desc(vdev, desc_pa, i, max, &local_err);
+        if (ret < 0) {
+            goto err;
+        } else if (ret == max) {
             break;
         }
+        i = ret;
     }
 
     /* Now map what we have collected */
     virtqueue_map_sg(elem->in_sg, elem->in_addr, elem->in_num, 1,
-                     &error_abort);
+                     &local_err);
+    if (local_err) {
+        ret = -EINVAL;
+        goto err;
+    }
     virtqueue_map_sg(elem->out_sg, elem->out_addr, elem->out_num, 0,
-                     &error_abort);
+                     &local_err);
+    if (local_err) {
+        ret = -EINVAL;
+        goto err_unmap;
+    }
 
     elem->index = head;
 
@@ -557,6 +590,11 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
 
     trace_virtqueue_pop(vq, elem, elem->in_num, elem->out_num);
     return elem->in_num + elem->out_num;
+err_unmap:
+    virtqueue_undo_map_sg(elem->in_sg, elem->in_addr, elem->in_num, 1);
+err:
+    error_propagate(errp, local_err);
+    return ret;
 }
 
 /* virtio device */
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index a37ee64..c478f48 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -142,7 +142,7 @@ void virtqueue_fill(VirtQueue *vq, const VirtQueueElement *elem,
 void virtqueue_map_sg(struct iovec *sg, hwaddr *addr,
                       size_t num_sg, int is_write,
                       Error **errp);
-int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem);
+int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem, Error **errp);
 int virtqueue_avail_bytes(VirtQueue *vq, unsigned int in_bytes,
                           unsigned int out_bytes);
 void virtqueue_get_avail_bytes(VirtQueue *vq, unsigned int *in_bytes,
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [Qemu-devel] [PATCH 07/18] virtio: Return error from virtqueue_avail_bytes
  2015-04-17  7:59 [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET" Fam Zheng
                   ` (5 preceding siblings ...)
  2015-04-17  7:59 ` [Qemu-devel] [PATCH 06/18] virtio: Return error from virtqueue_pop Fam Zheng
@ 2015-04-17  7:59 ` Fam Zheng
  2015-04-17  7:59 ` [Qemu-devel] [PATCH 08/18] virtio: Return error from virtio_add_queue Fam Zheng
                   ` (12 subsequent siblings)
  19 siblings, 0 replies; 54+ messages in thread
From: Fam Zheng @ 2015-04-17  7:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Michael S. Tsirkin, Aneesh Kumar K.V,
	Stefan Hajnoczi, Amit Shah, Paolo Bonzini

The only caller is virtio-net, which now passes in error_abort.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 hw/net/virtio-net.c        | 4 ++--
 hw/virtio/virtio.c         | 9 +++++++--
 include/hw/virtio/virtio.h | 2 +-
 3 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index bbcb51f..5529b6f 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -877,7 +877,7 @@ static int virtio_net_has_buffers(VirtIONetQueue *q, int bufsize)
     VirtIONet *n = q->n;
     if (virtio_queue_empty(q->rx_vq) ||
         (n->mergeable_rx_bufs &&
-         !virtqueue_avail_bytes(q->rx_vq, bufsize, 0))) {
+         !virtqueue_avail_bytes(q->rx_vq, bufsize, 0, &error_abort))) {
         virtio_queue_set_notification(q->rx_vq, 1);
 
         /* To avoid a race condition where the guest has made some buffers
@@ -886,7 +886,7 @@ static int virtio_net_has_buffers(VirtIONetQueue *q, int bufsize)
          */
         if (virtio_queue_empty(q->rx_vq) ||
             (n->mergeable_rx_bufs &&
-             !virtqueue_avail_bytes(q->rx_vq, bufsize, 0))) {
+             !virtqueue_avail_bytes(q->rx_vq, bufsize, 0, &error_abort))) {
             return 0;
         }
     }
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index e6f9f6b..7c5ca07 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -443,12 +443,17 @@ done:
 }
 
 int virtqueue_avail_bytes(VirtQueue *vq, unsigned int in_bytes,
-                          unsigned int out_bytes)
+                          unsigned int out_bytes, Error **errp)
 {
+    Error *local_err = NULL;
     unsigned int in_total, out_total;
 
     virtqueue_get_avail_bytes(vq, &in_total, &out_total, in_bytes, out_bytes,
-                              &error_abort);
+                              &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return -EINVAL;
+    }
     return in_bytes <= in_total && out_bytes <= out_total;
 }
 
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index c478f48..34fb62c 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -144,7 +144,7 @@ void virtqueue_map_sg(struct iovec *sg, hwaddr *addr,
                       Error **errp);
 int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem, Error **errp);
 int virtqueue_avail_bytes(VirtQueue *vq, unsigned int in_bytes,
-                          unsigned int out_bytes);
+                          unsigned int out_bytes, Error **errp);
 void virtqueue_get_avail_bytes(VirtQueue *vq, unsigned int *in_bytes,
                                unsigned int *out_bytes,
                                unsigned max_in_bytes, unsigned max_out_bytes,
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [Qemu-devel] [PATCH 08/18] virtio: Return error from virtio_add_queue
  2015-04-17  7:59 [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET" Fam Zheng
                   ` (6 preceding siblings ...)
  2015-04-17  7:59 ` [Qemu-devel] [PATCH 07/18] virtio: Return error from virtqueue_avail_bytes Fam Zheng
@ 2015-04-17  7:59 ` Fam Zheng
  2015-04-17  7:59 ` [Qemu-devel] [PATCH 09/18] virtio: Return error from virtio_del_queue Fam Zheng
                   ` (11 subsequent siblings)
  19 siblings, 0 replies; 54+ messages in thread
From: Fam Zheng @ 2015-04-17  7:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Michael S. Tsirkin, Aneesh Kumar K.V,
	Stefan Hajnoczi, Amit Shah, Paolo Bonzini

All callers pass in error_abort for now. Error handling will be added in
separate patches later.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 hw/9pfs/virtio-9p-device.c  |  2 +-
 hw/block/virtio-blk.c       |  2 +-
 hw/char/virtio-serial-bus.c | 14 ++++++++------
 hw/net/virtio-net.c         | 24 ++++++++++++++++--------
 hw/scsi/virtio-scsi.c       |  6 +++---
 hw/virtio/virtio-balloon.c  |  9 ++++++---
 hw/virtio/virtio-rng.c      |  2 +-
 hw/virtio/virtio.c          | 13 ++++++++++---
 include/hw/virtio/virtio.h  |  3 ++-
 9 files changed, 48 insertions(+), 27 deletions(-)

diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
index 30492ec..6281b89 100644
--- a/hw/9pfs/virtio-9p-device.c
+++ b/hw/9pfs/virtio-9p-device.c
@@ -61,7 +61,7 @@ static void virtio_9p_device_realize(DeviceState *dev, Error **errp)
         QLIST_INSERT_HEAD(&s->free_list, &s->pdus[i], next);
     }
 
-    s->vq = virtio_add_queue(vdev, MAX_REQ, handle_9p_output);
+    s->vq = virtio_add_queue(vdev, MAX_REQ, handle_9p_output, &error_abort);
 
     v9fs_path_init(&path);
 
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 0b66ee1..c72a1a3 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -904,7 +904,7 @@ static void virtio_blk_device_realize(DeviceState *dev, Error **errp)
     s->rq = NULL;
     s->sector_mask = (s->conf.conf.logical_block_size / BDRV_SECTOR_SIZE) - 1;
 
-    s->vq = virtio_add_queue(vdev, 128, virtio_blk_handle_output);
+    s->vq = virtio_add_queue(vdev, 128, virtio_blk_handle_output, &error_abort);
     s->complete_request = virtio_blk_complete_request;
     virtio_blk_data_plane_create(vdev, conf, &s->dataplane, &err);
     if (err != NULL) {
diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
index 76a934b..1e8062e 100644
--- a/hw/char/virtio-serial-bus.c
+++ b/hw/char/virtio-serial-bus.c
@@ -999,9 +999,9 @@ static void virtio_serial_device_realize(DeviceState *dev, Error **errp)
                           * sizeof(VirtQueue *));
 
     /* Add a queue for host to guest transfers for port 0 (backward compat) */
-    vser->ivqs[0] = virtio_add_queue(vdev, 128, handle_input);
+    vser->ivqs[0] = virtio_add_queue(vdev, 128, handle_input, &error_abort);
     /* Add a queue for guest to host transfers for port 0 (backward compat) */
-    vser->ovqs[0] = virtio_add_queue(vdev, 128, handle_output);
+    vser->ovqs[0] = virtio_add_queue(vdev, 128, handle_output, &error_abort);
 
     /* TODO: host to guest notifications can get dropped
      * if the queue fills up. Implement queueing in host,
@@ -1010,15 +1010,17 @@ static void virtio_serial_device_realize(DeviceState *dev, Error **errp)
      * this will save 4Kbyte of guest memory per entry. */
 
     /* control queue: host to guest */
-    vser->c_ivq = virtio_add_queue(vdev, 32, control_in);
+    vser->c_ivq = virtio_add_queue(vdev, 32, control_in, &error_abort);
     /* control queue: guest to host */
-    vser->c_ovq = virtio_add_queue(vdev, 32, control_out);
+    vser->c_ovq = virtio_add_queue(vdev, 32, control_out, &error_abort);
 
     for (i = 1; i < vser->bus.max_nr_ports; i++) {
         /* Add a per-port queue for host to guest transfers */
-        vser->ivqs[i] = virtio_add_queue(vdev, 128, handle_input);
+        vser->ivqs[i] = virtio_add_queue(vdev, 128, handle_input,
+                                         &error_abort);
         /* Add a per-per queue for guest to host transfers */
-        vser->ovqs[i] = virtio_add_queue(vdev, 128, handle_output);
+        vser->ovqs[i] = virtio_add_queue(vdev, 128, handle_output,
+                                         &error_abort);
     }
 
     vser->ports_map = g_malloc0(((vser->serial.max_virtserial_ports + 31) / 32)
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 5529b6f..74205b4 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -1314,16 +1314,19 @@ static void virtio_net_set_multiqueue(VirtIONet *n, int multiqueue)
     }
 
     for (i = 1; i < max; i++) {
-        n->vqs[i].rx_vq = virtio_add_queue(vdev, 256, virtio_net_handle_rx);
+        n->vqs[i].rx_vq = virtio_add_queue(vdev, 256, virtio_net_handle_rx,
+                                           &error_abort);
         if (n->vqs[i].tx_timer) {
             n->vqs[i].tx_vq =
-                virtio_add_queue(vdev, 256, virtio_net_handle_tx_timer);
+                virtio_add_queue(vdev, 256, virtio_net_handle_tx_timer,
+                                 &error_abort);
             n->vqs[i].tx_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL,
                                                    virtio_net_tx_timer,
                                                    &n->vqs[i]);
         } else {
             n->vqs[i].tx_vq =
-                virtio_add_queue(vdev, 256, virtio_net_handle_tx_bh);
+                virtio_add_queue(vdev, 256, virtio_net_handle_tx_bh,
+                                 &error_abort);
             n->vqs[i].tx_bh = qemu_bh_new(virtio_net_tx_bh, &n->vqs[i]);
         }
 
@@ -1335,7 +1338,8 @@ static void virtio_net_set_multiqueue(VirtIONet *n, int multiqueue)
      * VIRTIO_NET_F_CTRL_VQ. Create ctrl vq unconditionally to avoid
      * breaking them.
      */
-    n->ctrl_vq = virtio_add_queue(vdev, 64, virtio_net_handle_ctrl);
+    n->ctrl_vq = virtio_add_queue(vdev, 64, virtio_net_handle_ctrl,
+                                  &error_abort);
 
     virtio_net_set_queues(n);
 }
@@ -1596,7 +1600,8 @@ static void virtio_net_device_realize(DeviceState *dev, Error **errp)
         return;
     }
     n->vqs = g_malloc0(sizeof(VirtIONetQueue) * n->max_queues);
-    n->vqs[0].rx_vq = virtio_add_queue(vdev, 256, virtio_net_handle_rx);
+    n->vqs[0].rx_vq = virtio_add_queue(vdev, 256, virtio_net_handle_rx,
+                                       &error_abort);
     n->curr_queues = 1;
     n->vqs[0].n = n;
     n->tx_timeout = n->net_conf.txtimer;
@@ -1611,15 +1616,18 @@ static void virtio_net_device_realize(DeviceState *dev, Error **errp)
 
     if (n->net_conf.tx && !strcmp(n->net_conf.tx, "timer")) {
         n->vqs[0].tx_vq = virtio_add_queue(vdev, 256,
-                                           virtio_net_handle_tx_timer);
+                                           virtio_net_handle_tx_timer,
+                                           &error_abort);
         n->vqs[0].tx_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, virtio_net_tx_timer,
                                                &n->vqs[0]);
     } else {
         n->vqs[0].tx_vq = virtio_add_queue(vdev, 256,
-                                           virtio_net_handle_tx_bh);
+                                           virtio_net_handle_tx_bh,
+                                           &error_abort);
         n->vqs[0].tx_bh = qemu_bh_new(virtio_net_tx_bh, &n->vqs[0]);
     }
-    n->ctrl_vq = virtio_add_queue(vdev, 64, virtio_net_handle_ctrl);
+    n->ctrl_vq = virtio_add_queue(vdev, 64, virtio_net_handle_ctrl,
+                                  &error_abort);
     qemu_macaddr_default_if_unset(&n->nic_conf.macaddr);
     memcpy(&n->mac[0], &n->nic_conf.macaddr, sizeof(n->mac));
     n->status = VIRTIO_NET_S_LINK_UP;
diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index 40ba03d..ca7af2c 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -838,12 +838,12 @@ void virtio_scsi_common_realize(DeviceState *dev, Error **errp,
     s->cdb_size = VIRTIO_SCSI_CDB_DEFAULT_SIZE;
 
     s->ctrl_vq = virtio_add_queue(vdev, VIRTIO_SCSI_VQ_SIZE,
-                                  ctrl);
+                                  ctrl, &error_abort);
     s->event_vq = virtio_add_queue(vdev, VIRTIO_SCSI_VQ_SIZE,
-                                   evt);
+                                   evt, &error_abort);
     for (i = 0; i < s->conf.num_queues; i++) {
         s->cmd_vqs[i] = virtio_add_queue(vdev, VIRTIO_SCSI_VQ_SIZE,
-                                         cmd);
+                                         cmd, &error_abort);
     }
 
     if (s->conf.iothread) {
diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index e26c0a7..4479500 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -388,9 +388,12 @@ static void virtio_balloon_device_realize(DeviceState *dev, Error **errp)
         return;
     }
 
-    s->ivq = virtio_add_queue(vdev, 128, virtio_balloon_handle_output);
-    s->dvq = virtio_add_queue(vdev, 128, virtio_balloon_handle_output);
-    s->svq = virtio_add_queue(vdev, 128, virtio_balloon_receive_stats);
+    s->ivq = virtio_add_queue(vdev, 128, virtio_balloon_handle_output,
+                              &error_abort);
+    s->dvq = virtio_add_queue(vdev, 128, virtio_balloon_handle_output,
+                              &error_abort);
+    s->svq = virtio_add_queue(vdev, 128, virtio_balloon_receive_stats,
+                              &error_abort);
 
     reset_stats(s);
 
diff --git a/hw/virtio/virtio-rng.c b/hw/virtio/virtio-rng.c
index c3cbdc3..fc70525 100644
--- a/hw/virtio/virtio-rng.c
+++ b/hw/virtio/virtio-rng.c
@@ -194,7 +194,7 @@ static void virtio_rng_device_realize(DeviceState *dev, Error **errp)
 
     virtio_init(vdev, "virtio-rng", VIRTIO_ID_RNG, 0);
 
-    vrng->vq = virtio_add_queue(vdev, 8, handle_input);
+    vrng->vq = virtio_add_queue(vdev, 8, handle_input, &error_abort);
     vrng->quota_remaining = vrng->conf.max_bytes;
 
     vrng->rate_limit_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL,
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 7c5ca07..b473f9d 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -862,7 +862,8 @@ void virtio_queue_set_vector(VirtIODevice *vdev, int n, uint16_t vector)
 }
 
 VirtQueue *virtio_add_queue(VirtIODevice *vdev, int queue_size,
-                            void (*handle_output)(VirtIODevice *, VirtQueue *))
+                            void (*handle_output)(VirtIODevice *, VirtQueue *),
+                            Error **errp)
 {
     int i;
 
@@ -871,8 +872,14 @@ VirtQueue *virtio_add_queue(VirtIODevice *vdev, int queue_size,
             break;
     }
 
-    if (i == VIRTIO_PCI_QUEUE_MAX || queue_size > VIRTQUEUE_MAX_SIZE)
-        abort();
+    if (i == VIRTIO_PCI_QUEUE_MAX) {
+        error_setg(errp, "Cannot find free vq");
+        return NULL;
+    }
+    if (queue_size > VIRTQUEUE_MAX_SIZE) {
+        error_setg(errp, "Queue size too big: %d", queue_size);
+        return NULL;
+    }
 
     vdev->vq[i].vring.num = queue_size;
     vdev->vq[i].vring.align = VIRTIO_PCI_VRING_ALIGN;
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index 34fb62c..b31465f 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -129,7 +129,8 @@ void virtio_device_set_child_bus_name(VirtIODevice *vdev, char *bus_name);
 
 VirtQueue *virtio_add_queue(VirtIODevice *vdev, int queue_size,
                             void (*handle_output)(VirtIODevice *,
-                                                  VirtQueue *));
+                                                  VirtQueue *),
+                            Error **errp);
 
 void virtio_del_queue(VirtIODevice *vdev, int n);
 
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [Qemu-devel] [PATCH 09/18] virtio: Return error from virtio_del_queue
  2015-04-17  7:59 [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET" Fam Zheng
                   ` (7 preceding siblings ...)
  2015-04-17  7:59 ` [Qemu-devel] [PATCH 08/18] virtio: Return error from virtio_add_queue Fam Zheng
@ 2015-04-17  7:59 ` Fam Zheng
  2015-04-17  7:59 ` [Qemu-devel] [PATCH 10/18] virtio: Add macro for VIRTIO_CONFIG_S_NEEDS_RESET Fam Zheng
                   ` (10 subsequent siblings)
  19 siblings, 0 replies; 54+ messages in thread
From: Fam Zheng @ 2015-04-17  7:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Michael S. Tsirkin, Aneesh Kumar K.V,
	Stefan Hajnoczi, Amit Shah, Paolo Bonzini

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 hw/net/virtio-net.c        | 2 +-
 hw/virtio/virtio.c         | 5 +++--
 include/hw/virtio/virtio.h | 2 +-
 3 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 74205b4..c727277 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -1310,7 +1310,7 @@ static void virtio_net_set_multiqueue(VirtIONet *n, int multiqueue)
     n->multiqueue = multiqueue;
 
     for (i = 2; i <= n->max_queues * 2 + 1; i++) {
-        virtio_del_queue(vdev, i);
+        virtio_del_queue(vdev, i, &error_abort);
     }
 
     for (i = 1; i < max; i++) {
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index b473f9d..6dfa181 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -888,10 +888,11 @@ VirtQueue *virtio_add_queue(VirtIODevice *vdev, int queue_size,
     return &vdev->vq[i];
 }
 
-void virtio_del_queue(VirtIODevice *vdev, int n)
+void virtio_del_queue(VirtIODevice *vdev, int n, Error **errp)
 {
     if (n < 0 || n >= VIRTIO_PCI_QUEUE_MAX) {
-        abort();
+        error_setg(errp, "Invalid queue number: %d", n);
+        return;
     }
 
     vdev->vq[n].vring.num = 0;
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index b31465f..52e2b1c 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -132,7 +132,7 @@ VirtQueue *virtio_add_queue(VirtIODevice *vdev, int queue_size,
                                                   VirtQueue *),
                             Error **errp);
 
-void virtio_del_queue(VirtIODevice *vdev, int n);
+void virtio_del_queue(VirtIODevice *vdev, int n, Error **errp);
 
 void virtqueue_push(VirtQueue *vq, const VirtQueueElement *elem,
                     unsigned int len);
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [Qemu-devel] [PATCH 10/18] virtio: Add macro for VIRTIO_CONFIG_S_NEEDS_RESET
  2015-04-17  7:59 [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET" Fam Zheng
                   ` (8 preceding siblings ...)
  2015-04-17  7:59 ` [Qemu-devel] [PATCH 09/18] virtio: Return error from virtio_del_queue Fam Zheng
@ 2015-04-17  7:59 ` Fam Zheng
  2015-04-17  7:59 ` [Qemu-devel] [PATCH 11/18] virtio: Add "needs_reset" flag to virtio device Fam Zheng
                   ` (9 subsequent siblings)
  19 siblings, 0 replies; 54+ messages in thread
From: Fam Zheng @ 2015-04-17  7:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Michael S. Tsirkin, Aneesh Kumar K.V,
	Stefan Hajnoczi, Amit Shah, Paolo Bonzini

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 include/standard-headers/linux/virtio_config.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/standard-headers/linux/virtio_config.h b/include/standard-headers/linux/virtio_config.h
index bcc445b..34b8155 100644
--- a/include/standard-headers/linux/virtio_config.h
+++ b/include/standard-headers/linux/virtio_config.h
@@ -40,6 +40,8 @@
 #define VIRTIO_CONFIG_S_DRIVER_OK	4
 /* Driver has finished configuring features */
 #define VIRTIO_CONFIG_S_FEATURES_OK	8
+/* Device is in error state and should be reset */
+#define VIRTIO_CONFIG_S_NEEDS_RESET 0x40
 /* We've given up on this device. */
 #define VIRTIO_CONFIG_S_FAILED		0x80
 
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [Qemu-devel] [PATCH 11/18] virtio: Add "needs_reset" flag to virtio device
  2015-04-17  7:59 [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET" Fam Zheng
                   ` (9 preceding siblings ...)
  2015-04-17  7:59 ` [Qemu-devel] [PATCH 10/18] virtio: Add macro for VIRTIO_CONFIG_S_NEEDS_RESET Fam Zheng
@ 2015-04-17  7:59 ` Fam Zheng
  2015-04-17  7:59 ` [Qemu-devel] [PATCH 12/18] virtio: Return -EINVAL if the vdev needs reset in virtqueue_pop Fam Zheng
                   ` (8 subsequent siblings)
  19 siblings, 0 replies; 54+ messages in thread
From: Fam Zheng @ 2015-04-17  7:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Michael S. Tsirkin, Aneesh Kumar K.V,
	Stefan Hajnoczi, Amit Shah, Paolo Bonzini

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 hw/virtio/virtio.c         | 18 ++++++++++++++++++
 include/hw/virtio/virtio.h |  2 ++
 2 files changed, 20 insertions(+)

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 6dfa181..7ff0dc4 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -623,6 +623,10 @@ void virtio_set_status(VirtIODevice *vdev, uint8_t val)
     VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
     trace_virtio_set_status(vdev, val);
 
+    if (val) {
+        /* Unsetting NEED_RESET bit without reset is ignored. */
+        val |= (vdev->status & VIRTIO_CONFIG_S_NEEDS_RESET);
+    }
     if (k->set_status) {
         k->set_status(vdev, val);
     }
@@ -1384,6 +1388,20 @@ static void virtio_device_realize(DeviceState *dev, Error **errp)
     virtio_bus_device_plugged(vdev);
 }
 
+void virtio_device_set_needs_reset(VirtIODevice *vdev)
+{
+    if (vdev->status & VIRTIO_CONFIG_S_NEEDS_RESET) {
+        return;
+    }
+    vdev->status |= VIRTIO_CONFIG_S_NEEDS_RESET;
+    virtio_notify_config(vdev);
+}
+
+bool virtio_device_needs_reset(VirtIODevice *vdev)
+{
+    return vdev->status & VIRTIO_CONFIG_S_NEEDS_RESET;
+}
+
 static void virtio_device_unrealize(DeviceState *dev, Error **errp)
 {
     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index 52e2b1c..71d98a3 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -131,6 +131,8 @@ VirtQueue *virtio_add_queue(VirtIODevice *vdev, int queue_size,
                             void (*handle_output)(VirtIODevice *,
                                                   VirtQueue *),
                             Error **errp);
+void virtio_device_set_needs_reset(VirtIODevice *vdev);
+bool virtio_device_needs_reset(VirtIODevice *vdev);
 
 void virtio_del_queue(VirtIODevice *vdev, int n, Error **errp);
 
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [Qemu-devel] [PATCH 12/18] virtio: Return -EINVAL if the vdev needs reset in virtqueue_pop
  2015-04-17  7:59 [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET" Fam Zheng
                   ` (10 preceding siblings ...)
  2015-04-17  7:59 ` [Qemu-devel] [PATCH 11/18] virtio: Add "needs_reset" flag to virtio device Fam Zheng
@ 2015-04-17  7:59 ` Fam Zheng
  2015-04-17  7:59 ` [Qemu-devel] [PATCH 13/18] virtio-blk: Graceful error handling of virtqueue_pop Fam Zheng
                   ` (7 subsequent siblings)
  19 siblings, 0 replies; 54+ messages in thread
From: Fam Zheng @ 2015-04-17  7:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Michael S. Tsirkin, Aneesh Kumar K.V,
	Stefan Hajnoczi, Amit Shah, Paolo Bonzini

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 hw/virtio/virtio.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 7ff0dc4..87f8c36 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -505,6 +505,9 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem, Error **errp)
     VirtIODevice *vdev = vq->vdev;
     Error *local_err = NULL;
 
+    if (virtio_device_needs_reset(vdev)) {
+        return -EINVAL;
+    }
     ret = virtqueue_num_heads(vq, vq->last_avail_idx, &local_err);
     if (ret <= 0) {
         goto err;
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [Qemu-devel] [PATCH 13/18] virtio-blk: Graceful error handling of virtqueue_pop
  2015-04-17  7:59 [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET" Fam Zheng
                   ` (11 preceding siblings ...)
  2015-04-17  7:59 ` [Qemu-devel] [PATCH 12/18] virtio: Return -EINVAL if the vdev needs reset in virtqueue_pop Fam Zheng
@ 2015-04-17  7:59 ` Fam Zheng
  2015-04-17  7:59 ` [Qemu-devel] [PATCH 14/18] qtest: Add "QTEST_FILTER" to filter test cases Fam Zheng
                   ` (6 subsequent siblings)
  19 siblings, 0 replies; 54+ messages in thread
From: Fam Zheng @ 2015-04-17  7:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Michael S. Tsirkin, Aneesh Kumar K.V,
	Stefan Hajnoczi, Amit Shah, Paolo Bonzini

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 hw/block/dataplane/virtio-blk.c |  9 ++++++-
 hw/block/virtio-blk.c           | 56 ++++++++++++++++++++++++++++++++---------
 include/hw/virtio/virtio-blk.h  |  3 ++-
 3 files changed, 54 insertions(+), 14 deletions(-)

diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 3db139b..a094403 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -93,6 +93,8 @@ static void handle_notify(EventNotifier *e)
     VirtIOBlockDataPlane *s = container_of(e, VirtIOBlockDataPlane,
                                            host_notifier);
     VirtIOBlock *vblk = VIRTIO_BLK(s->vdev);
+    VirtIODevice *vdev = VIRTIO_DEVICE(vblk);
+    Error *local_err = NULL;
 
     event_notifier_test_and_clear(&s->host_notifier);
     blk_io_plug(s->conf->conf.blk);
@@ -116,7 +118,12 @@ static void handle_notify(EventNotifier *e)
                                                         req->elem.in_num,
                                                         req->elem.index);
 
-            virtio_blk_handle_request(req, &mrb);
+            virtio_blk_handle_request(req, &mrb, &local_err);
+            if (local_err) {
+                error_report_err(local_err);
+                virtio_device_set_needs_reset(vdev);
+                break;
+            }
         }
 
         if (mrb.num_reqs) {
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index c72a1a3..82a95a3 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -61,7 +61,12 @@ static void virtio_blk_complete_request(VirtIOBlockReq *req,
 
 static void virtio_blk_req_complete(VirtIOBlockReq *req, unsigned char status)
 {
-    req->dev->complete_request(req, status);
+    VirtIODevice *vdev = VIRTIO_DEVICE(req->dev);
+
+    /* No need to notify completion if we're in error state */
+    if (!virtio_device_needs_reset(vdev)) {
+        req->dev->complete_request(req, status);
+    }
 }
 
 static int virtio_blk_handle_rw_error(VirtIOBlockReq *req, int error,
@@ -189,9 +194,18 @@ out:
 
 static VirtIOBlockReq *virtio_blk_get_request(VirtIOBlock *s)
 {
-    VirtIOBlockReq *req = virtio_blk_alloc_request(s);
+    VirtIOBlockReq *req;
+    Error *local_err = NULL;
 
-    if (!virtqueue_pop(s->vq, &req->elem, &error_abort)) {
+    if (virtio_device_needs_reset(VIRTIO_DEVICE(s))) {
+        return NULL;
+    }
+    req = virtio_blk_alloc_request(s);
+    if (virtqueue_pop(s->vq, &req->elem, &local_err) <= 0) {
+        if (local_err) {
+            error_report("virtio-blk: %s", error_get_pretty(local_err));
+            virtio_device_set_needs_reset(VIRTIO_DEVICE(s));
+        }
         virtio_blk_free_request(req);
         return NULL;
     }
@@ -478,7 +492,10 @@ static bool virtio_blk_sect_range_ok(VirtIOBlock *dev,
     return true;
 }
 
-void virtio_blk_handle_request(VirtIOBlockReq *req, MultiReqBuffer *mrb)
+/* Handle a request.  If error happened, errp will be set and the request will
+ * be freed. */
+void virtio_blk_handle_request(VirtIOBlockReq *req, MultiReqBuffer *mrb,
+                               Error **errp)
 {
     uint32_t type;
     struct iovec *in_iov = req->elem.in_sg;
@@ -487,22 +504,25 @@ void virtio_blk_handle_request(VirtIOBlockReq *req, MultiReqBuffer *mrb)
     unsigned out_num = req->elem.out_num;
 
     if (req->elem.out_num < 1 || req->elem.in_num < 1) {
-        error_report("virtio-blk missing headers");
-        exit(1);
+        error_setg(errp, "virtio-blk missing headers");
+        virtio_blk_free_request(req);
+        return;
     }
 
     if (unlikely(iov_to_buf(iov, out_num, 0, &req->out,
                             sizeof(req->out)) != sizeof(req->out))) {
-        error_report("virtio-blk request outhdr too short");
-        exit(1);
+        error_setg(errp, "virtio-blk request outhdr too short");
+        virtio_blk_free_request(req);
+        return;
     }
 
     iov_discard_front(&iov, &out_num, sizeof(req->out));
 
     if (in_num < 1 ||
         in_iov[in_num - 1].iov_len < sizeof(struct virtio_blk_inhdr)) {
-        error_report("virtio-blk request inhdr too short");
-        exit(1);
+        error_setg(errp, "virtio-blk request inhdr too short");
+        virtio_blk_free_request(req);
+        return;
     }
 
     /* We always touch the last byte, so just see how big in_iov is.  */
@@ -592,6 +612,7 @@ static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
     VirtIOBlock *s = VIRTIO_BLK(vdev);
     VirtIOBlockReq *req;
     MultiReqBuffer mrb = {};
+    Error *local_err = NULL;
 
     /* Some guests kick before setting VIRTIO_CONFIG_S_DRIVER_OK so start
      * dataplane here instead of waiting for .set_status().
@@ -602,7 +623,12 @@ static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
     }
 
     while ((req = virtio_blk_get_request(s))) {
-        virtio_blk_handle_request(req, &mrb);
+        virtio_blk_handle_request(req, &mrb, &local_err);
+        if (local_err) {
+            virtio_device_set_needs_reset(vdev);
+            error_report_err(local_err);
+            break;
+        }
     }
 
     if (mrb.num_reqs) {
@@ -615,6 +641,8 @@ static void virtio_blk_dma_restart_bh(void *opaque)
     VirtIOBlock *s = opaque;
     VirtIOBlockReq *req = s->rq;
     MultiReqBuffer mrb = {};
+    VirtIODevice *vdev = VIRTIO_DEVICE(s);
+    Error *local_err = NULL;
 
     qemu_bh_delete(s->bh);
     s->bh = NULL;
@@ -623,8 +651,12 @@ static void virtio_blk_dma_restart_bh(void *opaque)
 
     while (req) {
         VirtIOBlockReq *next = req->next;
-        virtio_blk_handle_request(req, &mrb);
+        virtio_blk_handle_request(req, &mrb, &local_err);
         req = next;
+        if (local_err) {
+            virtio_device_set_needs_reset(vdev);
+            break;
+        }
     }
 
     if (mrb.num_reqs) {
diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
index 6bf5905..1569269 100644
--- a/include/hw/virtio/virtio-blk.h
+++ b/include/hw/virtio/virtio-blk.h
@@ -85,7 +85,8 @@ VirtIOBlockReq *virtio_blk_alloc_request(VirtIOBlock *s);
 
 void virtio_blk_free_request(VirtIOBlockReq *req);
 
-void virtio_blk_handle_request(VirtIOBlockReq *req, MultiReqBuffer *mrb);
+void virtio_blk_handle_request(VirtIOBlockReq *req, MultiReqBuffer *mrb,
+                               Error **errp);
 
 void virtio_blk_submit_multireq(BlockBackend *blk, MultiReqBuffer *mrb);
 
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [Qemu-devel] [PATCH 14/18] qtest: Add "QTEST_FILTER" to filter test cases
  2015-04-17  7:59 [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET" Fam Zheng
                   ` (12 preceding siblings ...)
  2015-04-17  7:59 ` [Qemu-devel] [PATCH 13/18] virtio-blk: Graceful error handling of virtqueue_pop Fam Zheng
@ 2015-04-17  7:59 ` Fam Zheng
  2015-04-17  7:59 ` [Qemu-devel] [PATCH 15/18] qtest: virtio-blk: Extract "setup" for future reuse Fam Zheng
                   ` (5 subsequent siblings)
  19 siblings, 0 replies; 54+ messages in thread
From: Fam Zheng @ 2015-04-17  7:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Michael S. Tsirkin, Aneesh Kumar K.V,
	Stefan Hajnoczi, Amit Shah, Paolo Bonzini

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 tests/Makefile | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/tests/Makefile b/tests/Makefile
index 55aa745..1c1294f 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -411,12 +411,16 @@ GCOV_OPTIONS = -n $(if $(V),-f,)
 
 # gtester tests, possibly with verbose output
 
+QTEST_FILTER ?= %
+
 .PHONY: $(patsubst %, check-qtest-%, $(QTEST_TARGETS))
 $(patsubst %, check-qtest-%, $(QTEST_TARGETS)): check-qtest-%: $(check-qtest-y)
 	$(if $(CONFIG_GCOV),@rm -f *.gcda */*.gcda */*/*.gcda */*/*/*.gcda,)
 	$(call quiet-command,QTEST_QEMU_BINARY=$*-softmmu/qemu-system-$* \
 		MALLOC_PERTURB_=$${MALLOC_PERTURB_:-$$((RANDOM % 255 + 1))} \
-		gtester $(GTESTER_OPTIONS) -m=$(SPEED) $(check-qtest-$*-y),"GTESTER $@")
+		gtester $(GTESTER_OPTIONS) -m=$(SPEED) \
+        $(filter $(QTEST_FILTER),$(check-qtest-$*-y)),\
+        "GTESTER $@")
 	$(if $(CONFIG_GCOV),@for f in $(gcov-files-$*-y); do \
 	  echo Gcov report for $$f:;\
 	  $(GCOV) $(GCOV_OPTIONS) $$f -o `dirname $$f`; \
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [Qemu-devel] [PATCH 15/18] qtest: virtio-blk: Extract "setup" for future reuse
  2015-04-17  7:59 [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET" Fam Zheng
                   ` (13 preceding siblings ...)
  2015-04-17  7:59 ` [Qemu-devel] [PATCH 14/18] qtest: Add "QTEST_FILTER" to filter test cases Fam Zheng
@ 2015-04-17  7:59 ` Fam Zheng
  2015-04-17  7:59 ` [Qemu-devel] [PATCH 16/18] libqos: Add qvirtio_needs_reset Fam Zheng
                   ` (4 subsequent siblings)
  19 siblings, 0 replies; 54+ messages in thread
From: Fam Zheng @ 2015-04-17  7:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Michael S. Tsirkin, Aneesh Kumar K.V,
	Stefan Hajnoczi, Amit Shah, Paolo Bonzini

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 tests/virtio-blk-test.c | 26 ++++++++++++++++++--------
 1 file changed, 18 insertions(+), 8 deletions(-)

diff --git a/tests/virtio-blk-test.c b/tests/virtio-blk-test.c
index 4078321..866415a 100644
--- a/tests/virtio-blk-test.c
+++ b/tests/virtio-blk-test.c
@@ -168,16 +168,11 @@ static uint64_t virtio_blk_request(QGuestAllocator *alloc, QVirtioBlkReq *req,
     return addr;
 }
 
-static void test_basic(const QVirtioBus *bus, QVirtioDevice *dev,
-            QGuestAllocator *alloc, QVirtQueue *vq, uint64_t device_specific)
+static uint32_t setup(const QVirtioBus *bus, QVirtioDevice *dev,
+                      uint64_t device_specific)
 {
-    QVirtioBlkReq req;
-    uint64_t req_addr;
     uint64_t capacity;
     uint32_t features;
-    uint32_t free_head;
-    uint8_t status;
-    char *data;
 
     capacity = qvirtio_config_readq(bus, dev, device_specific);
 
@@ -187,10 +182,25 @@ static void test_basic(const QVirtioBus *bus, QVirtioDevice *dev,
     features = features & ~(QVIRTIO_F_BAD_FEATURE |
                     QVIRTIO_F_RING_INDIRECT_DESC | QVIRTIO_F_RING_EVENT_IDX |
                             QVIRTIO_BLK_F_SCSI);
+
     qvirtio_set_features(bus, dev, features);
-
     qvirtio_set_driver_ok(bus, dev);
 
+    return features;
+}
+
+static void test_basic(const QVirtioBus *bus, QVirtioDevice *dev,
+            QGuestAllocator *alloc, QVirtQueue *vq, uint64_t device_specific)
+{
+    QVirtioBlkReq req;
+    uint64_t req_addr;
+    uint32_t features;
+    uint32_t free_head;
+    uint8_t status;
+    char *data;
+
+    features = setup(bus, dev, device_specific);
+
     /* Write and read with 3 descriptor layout */
     /* Write request */
     req.type = QVIRTIO_BLK_T_OUT;
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [Qemu-devel] [PATCH 16/18] libqos: Add qvirtio_needs_reset
  2015-04-17  7:59 [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET" Fam Zheng
                   ` (14 preceding siblings ...)
  2015-04-17  7:59 ` [Qemu-devel] [PATCH 15/18] qtest: virtio-blk: Extract "setup" for future reuse Fam Zheng
@ 2015-04-17  7:59 ` Fam Zheng
  2015-04-17  7:59 ` [Qemu-devel] [PATCH 17/18] qtest: Add test case for "needs reset" of virtio-blk Fam Zheng
                   ` (3 subsequent siblings)
  19 siblings, 0 replies; 54+ messages in thread
From: Fam Zheng @ 2015-04-17  7:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Michael S. Tsirkin, Aneesh Kumar K.V,
	Stefan Hajnoczi, Amit Shah, Paolo Bonzini

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 tests/libqos/virtio.c | 5 +++++
 tests/libqos/virtio.h | 2 ++
 2 files changed, 7 insertions(+)

diff --git a/tests/libqos/virtio.c b/tests/libqos/virtio.c
index 3205b88..cd45b5c 100644
--- a/tests/libqos/virtio.c
+++ b/tests/libqos/virtio.c
@@ -46,6 +46,11 @@ void qvirtio_set_features(const QVirtioBus *bus, QVirtioDevice *d,
     bus->set_features(d, features);
 }
 
+bool qvirtio_needs_reset(const QVirtioBus *bus, QVirtioDevice *d)
+{
+    return bus->get_status(d) & QVIRTIO_NEEDS_RESET;
+}
+
 QVirtQueue *qvirtqueue_setup(const QVirtioBus *bus, QVirtioDevice *d,
                                         QGuestAllocator *alloc, uint16_t index)
 {
diff --git a/tests/libqos/virtio.h b/tests/libqos/virtio.h
index 2449fee..82098f3 100644
--- a/tests/libqos/virtio.h
+++ b/tests/libqos/virtio.h
@@ -18,6 +18,7 @@
 #define QVIRTIO_ACKNOWLEDGE     0x1
 #define QVIRTIO_DRIVER          0x2
 #define QVIRTIO_DRIVER_OK       0x4
+#define QVIRTIO_NEEDS_RESET     0x40
 
 #define QVIRTIO_NET_DEVICE_ID   0x1
 #define QVIRTIO_BLK_DEVICE_ID   0x2
@@ -154,6 +155,7 @@ uint64_t qvirtio_config_readq(const QVirtioBus *bus, QVirtioDevice *d,
 uint32_t qvirtio_get_features(const QVirtioBus *bus, QVirtioDevice *d);
 void qvirtio_set_features(const QVirtioBus *bus, QVirtioDevice *d,
                                                             uint32_t features);
+bool qvirtio_needs_reset(const QVirtioBus *bus, QVirtioDevice *d);
 
 void qvirtio_reset(const QVirtioBus *bus, QVirtioDevice *d);
 void qvirtio_set_acknowledge(const QVirtioBus *bus, QVirtioDevice *d);
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [Qemu-devel] [PATCH 17/18] qtest: Add test case for "needs reset" of virtio-blk
  2015-04-17  7:59 [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET" Fam Zheng
                   ` (15 preceding siblings ...)
  2015-04-17  7:59 ` [Qemu-devel] [PATCH 16/18] libqos: Add qvirtio_needs_reset Fam Zheng
@ 2015-04-17  7:59 ` Fam Zheng
  2015-04-17  7:59 ` [Qemu-devel] [PATCH 18/18] qtest: virtio-blk: Suppress virtio error messages in "make check" Fam Zheng
                   ` (2 subsequent siblings)
  19 siblings, 0 replies; 54+ messages in thread
From: Fam Zheng @ 2015-04-17  7:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Michael S. Tsirkin, Aneesh Kumar K.V,
	Stefan Hajnoczi, Amit Shah, Paolo Bonzini

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 tests/virtio-blk-test.c | 154 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 154 insertions(+)

diff --git a/tests/virtio-blk-test.c b/tests/virtio-blk-test.c
index 866415a..c5a599d 100644
--- a/tests/virtio-blk-test.c
+++ b/tests/virtio-blk-test.c
@@ -734,6 +734,159 @@ static void pci_hotplug(void)
     test_end();
 }
 
+typedef void (*err_func)(QGuestAllocator *alloc, const QVirtioBus *bus,
+                         QVirtioDevice *vdev, QVirtQueue *vq);
+
+static void test_pci_needs_reset(err_func ef)
+{
+    QVirtioPCIDevice *dev;
+    QVirtioDevice *vdev;
+    QPCIBus *bus;
+    QVirtQueuePCI *vqpci;
+    QGuestAllocator *alloc;
+    void *addr;
+
+    bus = pci_test_start();
+    dev = virtio_blk_pci_init(bus, PCI_SLOT);
+    vdev = &dev->vdev;
+
+    alloc = pc_alloc_init();
+    vqpci = (QVirtQueuePCI *)qvirtqueue_setup(&qvirtio_pci, vdev, alloc, 0);
+    /* MSI-X is not enabled */
+    addr = dev->addr + QVIRTIO_PCI_DEVICE_SPECIFIC_NO_MSIX;
+
+    setup(&qvirtio_pci, vdev, (uint64_t)(uintptr_t)addr);
+    g_assert(!qvirtio_needs_reset(&qvirtio_pci, vdev));
+
+    ef(alloc, &qvirtio_pci, vdev, &vqpci->vq);
+    g_assert(qvirtio_needs_reset(&qvirtio_pci, vdev));
+
+    /* Kicking when needs_rest should be nop. */
+    qvirtqueue_kick(&qvirtio_pci, vdev, &vqpci->vq, 0);
+
+    qvirtio_reset(&qvirtio_pci, vdev);
+    g_free(vqpci);
+    vqpci = (QVirtQueuePCI *)qvirtqueue_setup(&qvirtio_pci, vdev, alloc, 0);
+    g_assert(!qvirtio_needs_reset(&qvirtio_pci, vdev));
+
+    qvirtio_set_acknowledge(&qvirtio_pci, vdev);
+    qvirtio_set_driver(&qvirtio_pci, vdev);
+
+    /* The device should be back to functional once reset */
+    test_basic(&qvirtio_pci, &dev->vdev, alloc, &vqpci->vq,
+                                                    (uint64_t)(uintptr_t)addr);
+
+    guest_free(alloc, vqpci->vq.desc);
+    pc_alloc_uninit(alloc);
+    qvirtio_pci_device_disable(dev);
+    g_free(dev);
+    qpci_free_pc(bus);
+    test_end();
+}
+
+static void kick_with_no_request(QGuestAllocator *alloc,
+                                 const QVirtioBus *bus,
+                                 QVirtioDevice *vdev,
+                                 QVirtQueue *vq)
+{
+    qvirtqueue_kick(bus, vdev, vq, 0);
+}
+
+static void short_outhdr(QGuestAllocator *alloc,
+                         const QVirtioBus *bus,
+                         QVirtioDevice *vdev,
+                         QVirtQueue *vq)
+{
+    QVirtioBlkReq req;
+    uint64_t req_addr;
+    uint32_t free_head;
+
+    req.type = QVIRTIO_BLK_T_OUT;
+    req.ioprio = 1;
+    req.sector = 0;
+    req.data = g_malloc0(512);
+    strcpy(req.data, "TEST");
+
+    req_addr = virtio_blk_request(alloc, &req, 512);
+
+    g_free(req.data);
+
+    free_head = qvirtqueue_add(vq, req_addr, 1, false, true);
+    qvirtqueue_add(vq, req_addr + 16, 1, false, true);
+    qvirtqueue_add(vq, req_addr + 528, 1, true, false);
+
+    qvirtqueue_kick(bus, vdev, vq, free_head);
+}
+
+static void invalid_avail_index(QGuestAllocator *alloc,
+                                const QVirtioBus *bus,
+                                QVirtioDevice *vdev,
+                                QVirtQueue *vq)
+{
+    uint16_t idx = vq->size + 1;
+
+    writel(vq->avail + 4 + (2 * (idx % vq->size)), 0);
+    writel(vq->avail + 2, idx + 1);
+    bus->virtqueue_kick(vdev, vq);
+}
+
+static uint32_t add_indirect_len(QVirtQueue *vq,
+                                 QVRingIndirectDesc *indirect,
+                                 size_t len)
+{
+    vq->num_free--;
+
+    /* vq->desc[vq->free_head].addr */
+    writeq(vq->desc + (16 * vq->free_head), indirect->desc);
+    /* vq->desc[vq->free_head].len */
+    writel(vq->desc + (16 * vq->free_head) + 8, len);
+    /* vq->desc[vq->free_head].flags */
+    writew(vq->desc + (16 * vq->free_head) + 12, QVRING_DESC_F_INDIRECT);
+
+    return vq->free_head++; /* Return and increase, in this order */
+}
+
+
+static void invalid_indirect(QGuestAllocator *alloc,
+                             const QVirtioBus *bus,
+                             QVirtioDevice *vdev,
+                             QVirtQueue *vq)
+{
+    QVirtioBlkReq req;
+    uint64_t req_addr;
+    uint32_t free_head;
+    uint32_t features;
+    QVRingIndirectDesc *indirect;
+
+    features = qvirtio_get_features(&qvirtio_pci, vdev);
+    g_assert_cmphex(features & QVIRTIO_F_RING_INDIRECT_DESC, !=, 0);
+
+    req.type = QVIRTIO_BLK_T_OUT;
+    req.ioprio = 1;
+    req.sector = 0;
+    req.data = g_malloc0(512);
+    strcpy(req.data, "TEST");
+
+    req_addr = virtio_blk_request(alloc, &req, 512);
+
+    g_free(req.data);
+
+    indirect = qvring_indirect_desc_setup(vdev, alloc, 2);
+    qvring_indirect_desc_add(indirect, req_addr, 528, false);
+    qvring_indirect_desc_add(indirect, req_addr + 528, 1, true);
+    free_head = add_indirect_len(vq, indirect, 511);
+    qvirtqueue_kick(&qvirtio_pci, vdev, vq, free_head);
+    g_free(indirect);
+}
+
+static void pci_needs_reset(void)
+{
+    test_pci_needs_reset(kick_with_no_request);
+    test_pci_needs_reset(short_outhdr);
+    test_pci_needs_reset(invalid_avail_index);
+    test_pci_needs_reset(invalid_indirect);
+}
+
 static void mmio_basic(void)
 {
     QVirtioMMIODevice *dev;
@@ -789,6 +942,7 @@ int main(int argc, char **argv)
         qtest_add_func("/virtio/blk/pci/msix", pci_msix);
         qtest_add_func("/virtio/blk/pci/idx", pci_idx);
         qtest_add_func("/virtio/blk/pci/hotplug", pci_hotplug);
+        qtest_add_func("/virtio/blk/pci/needs_reset", pci_needs_reset);
     } else if (strcmp(arch, "arm") == 0) {
         qtest_add_func("/virtio/blk/mmio/basic", mmio_basic);
     }
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [Qemu-devel] [PATCH 18/18] qtest: virtio-blk: Suppress virtio error messages in "make check"
  2015-04-17  7:59 [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET" Fam Zheng
                   ` (16 preceding siblings ...)
  2015-04-17  7:59 ` [Qemu-devel] [PATCH 17/18] qtest: Add test case for "needs reset" of virtio-blk Fam Zheng
@ 2015-04-17  7:59 ` Fam Zheng
  2015-04-20 15:13 ` [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET" Cornelia Huck
  2015-04-20 17:36   ` Michael S. Tsirkin
  19 siblings, 0 replies; 54+ messages in thread
From: Fam Zheng @ 2015-04-17  7:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Michael S. Tsirkin, Aneesh Kumar K.V,
	Stefan Hajnoczi, Amit Shah, Paolo Bonzini

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 tests/virtio-blk-test.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/tests/virtio-blk-test.c b/tests/virtio-blk-test.c
index c5a599d..17eb33d 100644
--- a/tests/virtio-blk-test.c
+++ b/tests/virtio-blk-test.c
@@ -76,7 +76,7 @@ static char *drive_create(void)
     return tmp_path;
 }
 
-static QPCIBus *pci_test_start(void)
+static QPCIBus *pci_test_start_common(const char *extra)
 {
     char *cmdline;
     char *tmp_path;
@@ -86,8 +86,8 @@ static QPCIBus *pci_test_start(void)
     cmdline = g_strdup_printf("-drive if=none,id=drive0,file=%s,format=raw "
                         "-drive if=none,id=drive1,file=/dev/null,format=raw "
                         "-device virtio-blk-pci,id=drv0,drive=drive0,"
-                        "addr=%x.%x",
-                        tmp_path, PCI_SLOT, PCI_FN);
+                        "addr=%x.%x %s",
+                        tmp_path, PCI_SLOT, PCI_FN, extra);
     qtest_start(cmdline);
     unlink(tmp_path);
     g_free(tmp_path);
@@ -96,6 +96,16 @@ static QPCIBus *pci_test_start(void)
     return qpci_init_pc();
 }
 
+static QPCIBus *pci_test_start(void)
+{
+    return pci_test_start_common("");
+}
+
+static QPCIBus *pci_test_start_silient(void)
+{
+    return pci_test_start_common("&>/dev/null");
+}
+
 static void arm_test_start(void)
 {
     char *cmdline;
@@ -746,7 +756,7 @@ static void test_pci_needs_reset(err_func ef)
     QGuestAllocator *alloc;
     void *addr;
 
-    bus = pci_test_start();
+    bus = pci_test_start_silient();
     dev = virtio_blk_pci_init(bus, PCI_SLOT);
     vdev = &dev->vdev;
 
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
  2015-04-17  7:59 [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET" Fam Zheng
                   ` (17 preceding siblings ...)
  2015-04-17  7:59 ` [Qemu-devel] [PATCH 18/18] qtest: virtio-blk: Suppress virtio error messages in "make check" Fam Zheng
@ 2015-04-20 15:13 ` Cornelia Huck
  2015-04-21  7:44   ` Fam Zheng
  2015-04-20 17:36   ` Michael S. Tsirkin
  19 siblings, 1 reply; 54+ messages in thread
From: Cornelia Huck @ 2015-04-20 15:13 UTC (permalink / raw)
  To: Fam Zheng
  Cc: Kevin Wolf, Michael S. Tsirkin, qemu-devel, Aneesh Kumar K.V,
	Stefan Hajnoczi, Amit Shah, Paolo Bonzini

On Fri, 17 Apr 2015 15:59:15 +0800
Fam Zheng <famz@redhat.com> wrote:

> Currently, virtio code chooses to kill QEMU if the guest passes any invalid
> data with vring. That has drawbacks such as losing unsaved data (e.g. when
> guest user is writing a very long email), or possible denial of service in
> a nested vm use case where virtio device is passed through.
> 
> virtio-1 has introduced a new status bit "NEEDS RESET" which could be used to
> improve this by communicating the error state between virtio devices and
> drivers. The device notifies guest upon setting the bit, then the guest driver
> should detect this bit and report to userspace, or recover the device by
> resetting it.
> 
> This series makes necessary changes in virtio core code, based on which
> virtio-blk is converted. Other devices now keep the existing behavior by
> passing in "error_abort". They will be converted in following series. The Linux
> driver part will also be worked on.
> 
> One concern with this behavior change is that it's now harder to notice the
> actual driver bug that caused the error, as the guest continues to run.  To
> address that, we could probably add a new error action option to virtio
> devices,  similar to the "read/write werror" in block layer, so the vm could be
> paused and the management will get an event in QMP like pvpanic.  This work can
> be done on top.

In principle, this looks nice; I'm not sure however how this affects
non-virtio-1 devices.

If a device is operating in virtio-1 mode, everything is clearly
specified: The guest is notified and if it is aware of the NEEDS_RESET
bit, it can react accordingly.

But what about legacy devices? Even if they are notified, they don't
know to check for NEEDS_RESET - and I'm not sure if the undefined
behaviour after NEEDS_RESET might lead to bigger trouble than killing
off the guest.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
  2015-04-17  7:59 [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET" Fam Zheng
@ 2015-04-20 17:36   ` Michael S. Tsirkin
  2015-04-17  7:59 ` [Qemu-devel] [PATCH 02/18] virtio: Return error from virtqueue_num_heads Fam Zheng
                     ` (18 subsequent siblings)
  19 siblings, 0 replies; 54+ messages in thread
From: Michael S. Tsirkin @ 2015-04-20 17:36 UTC (permalink / raw)
  To: Fam Zheng
  Cc: Kevin Wolf, Rusty Russell, qemu-devel, virtualization,
	Aneesh Kumar K.V, Stefan Hajnoczi, Amit Shah, Paolo Bonzini

On Fri, Apr 17, 2015 at 03:59:15PM +0800, Fam Zheng wrote:
> Currently, virtio code chooses to kill QEMU if the guest passes any invalid
> data with vring.
> That has drawbacks such as losing unsaved data (e.g. when
> guest user is writing a very long email), or possible denial of service in
> a nested vm use case where virtio device is passed through.
> 
> virtio-1 has introduced a new status bit "NEEDS RESET" which could be used to
> improve this by communicating the error state between virtio devices and
> drivers. The device notifies guest upon setting the bit, then the guest driver
> should detect this bit and report to userspace, or recover the device by
> resetting it.

Unfortunately, virtio 1 spec does not have a conformance statement
that requires driver to recover. We merely have a non-normative looking
text:
	Note: For example, the driver can’t assume requests in flight
	will be completed if DEVICE_NEEDS_RESET is set, nor can it assume that
	they have not been completed. A good implementation will try to recover
	by issuing a reset.

Implementing this reset for all devices in a race-free manner might also
be far from trivial.  I think we'd need a feature bit for this.
OTOH as long as we make this a new feature, would an ability to
reset a single VQ be a better match for what you are trying to
achieve?

> This series makes necessary changes in virtio core code, based on which
> virtio-blk is converted. Other devices now keep the existing behavior by
> passing in "error_abort". They will be converted in following series. The Linux
> driver part will also be worked on.
> 
> One concern with this behavior change is that it's now harder to notice the
> actual driver bug that caused the error, as the guest continues to run.  To
> address that, we could probably add a new error action option to virtio
> devices,  similar to the "read/write werror" in block layer, so the vm could be
> paused and the management will get an event in QMP like pvpanic.  This work can
> be done on top.

At the architectural level, that's only one concern. Others would be
- workloads such as openstack handle guest crash better than
  a guest that's e.g. slow because of a memory leak
- it's easier for guests to probe host for security issues
  if guest isn't killed
- guest can flood host log with guest-triggered errors


At the implementation level, there's one big issue you seem to have
missed: DMA to invalid memory addresses causes a crash in memory core.
I'm not sure whether it makes sense to recover from virtio core bugs
when we can't recover from device bugs.


> 
> 
> Fam Zheng (18):
>   virtio: Return error from virtqueue_map_sg
>   virtio: Return error from virtqueue_num_heads
>   virtio: Return error from virtqueue_get_head
>   virtio: Return error from virtqueue_next_desc
>   virtio: Return error from virtqueue_get_avail_bytes
>   virtio: Return error from virtqueue_pop
>   virtio: Return error from virtqueue_avail_bytes
>   virtio: Return error from virtio_add_queue
>   virtio: Return error from virtio_del_queue
>   virtio: Add macro for VIRTIO_CONFIG_S_NEEDS_RESET
>   virtio: Add "needs_reset" flag to virtio device
>   virtio: Return -EINVAL if the vdev needs reset in virtqueue_pop
>   virtio-blk: Graceful error handling of virtqueue_pop
>   qtest: Add "QTEST_FILTER" to filter test cases
>   qtest: virtio-blk: Extract "setup" for future reuse
>   libqos: Add qvirtio_needs_reset
>   qtest: Add test case for "needs reset" of virtio-blk
>   qtest: virtio-blk: Suppress virtio error messages in "make check"
> 
>  hw/9pfs/virtio-9p-device.c                     |   2 +-
>  hw/9pfs/virtio-9p.c                            |   2 +-
>  hw/block/dataplane/virtio-blk.c                |   9 +-
>  hw/block/virtio-blk.c                          |  62 +++++--
>  hw/char/virtio-serial-bus.c                    |  30 ++--
>  hw/net/virtio-net.c                            |  36 +++--
>  hw/scsi/virtio-scsi.c                          |   8 +-
>  hw/virtio/virtio-balloon.c                     |  13 +-
>  hw/virtio/virtio-rng.c                         |   6 +-
>  hw/virtio/virtio.c                             | 214 ++++++++++++++++++-------
>  include/hw/virtio/virtio-blk.h                 |   3 +-
>  include/hw/virtio/virtio.h                     |  17 +-
>  include/standard-headers/linux/virtio_config.h |   2 +
>  tests/Makefile                                 |   6 +-
>  tests/libqos/virtio.c                          |   5 +
>  tests/libqos/virtio.h                          |   2 +
>  tests/virtio-blk-test.c                        | 196 ++++++++++++++++++++--
>  17 files changed, 482 insertions(+), 131 deletions(-)
> 
> -- 
> 1.9.3
> 

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
@ 2015-04-20 17:36   ` Michael S. Tsirkin
  0 siblings, 0 replies; 54+ messages in thread
From: Michael S. Tsirkin @ 2015-04-20 17:36 UTC (permalink / raw)
  To: Fam Zheng
  Cc: qemu-devel, virtualization, Aneesh Kumar K.V, Stefan Hajnoczi,
	Amit Shah, Paolo Bonzini

On Fri, Apr 17, 2015 at 03:59:15PM +0800, Fam Zheng wrote:
> Currently, virtio code chooses to kill QEMU if the guest passes any invalid
> data with vring.
> That has drawbacks such as losing unsaved data (e.g. when
> guest user is writing a very long email), or possible denial of service in
> a nested vm use case where virtio device is passed through.
> 
> virtio-1 has introduced a new status bit "NEEDS RESET" which could be used to
> improve this by communicating the error state between virtio devices and
> drivers. The device notifies guest upon setting the bit, then the guest driver
> should detect this bit and report to userspace, or recover the device by
> resetting it.

Unfortunately, virtio 1 spec does not have a conformance statement
that requires driver to recover. We merely have a non-normative looking
text:
	Note: For example, the driver can’t assume requests in flight
	will be completed if DEVICE_NEEDS_RESET is set, nor can it assume that
	they have not been completed. A good implementation will try to recover
	by issuing a reset.

Implementing this reset for all devices in a race-free manner might also
be far from trivial.  I think we'd need a feature bit for this.
OTOH as long as we make this a new feature, would an ability to
reset a single VQ be a better match for what you are trying to
achieve?

> This series makes necessary changes in virtio core code, based on which
> virtio-blk is converted. Other devices now keep the existing behavior by
> passing in "error_abort". They will be converted in following series. The Linux
> driver part will also be worked on.
> 
> One concern with this behavior change is that it's now harder to notice the
> actual driver bug that caused the error, as the guest continues to run.  To
> address that, we could probably add a new error action option to virtio
> devices,  similar to the "read/write werror" in block layer, so the vm could be
> paused and the management will get an event in QMP like pvpanic.  This work can
> be done on top.

At the architectural level, that's only one concern. Others would be
- workloads such as openstack handle guest crash better than
  a guest that's e.g. slow because of a memory leak
- it's easier for guests to probe host for security issues
  if guest isn't killed
- guest can flood host log with guest-triggered errors


At the implementation level, there's one big issue you seem to have
missed: DMA to invalid memory addresses causes a crash in memory core.
I'm not sure whether it makes sense to recover from virtio core bugs
when we can't recover from device bugs.


> 
> 
> Fam Zheng (18):
>   virtio: Return error from virtqueue_map_sg
>   virtio: Return error from virtqueue_num_heads
>   virtio: Return error from virtqueue_get_head
>   virtio: Return error from virtqueue_next_desc
>   virtio: Return error from virtqueue_get_avail_bytes
>   virtio: Return error from virtqueue_pop
>   virtio: Return error from virtqueue_avail_bytes
>   virtio: Return error from virtio_add_queue
>   virtio: Return error from virtio_del_queue
>   virtio: Add macro for VIRTIO_CONFIG_S_NEEDS_RESET
>   virtio: Add "needs_reset" flag to virtio device
>   virtio: Return -EINVAL if the vdev needs reset in virtqueue_pop
>   virtio-blk: Graceful error handling of virtqueue_pop
>   qtest: Add "QTEST_FILTER" to filter test cases
>   qtest: virtio-blk: Extract "setup" for future reuse
>   libqos: Add qvirtio_needs_reset
>   qtest: Add test case for "needs reset" of virtio-blk
>   qtest: virtio-blk: Suppress virtio error messages in "make check"
> 
>  hw/9pfs/virtio-9p-device.c                     |   2 +-
>  hw/9pfs/virtio-9p.c                            |   2 +-
>  hw/block/dataplane/virtio-blk.c                |   9 +-
>  hw/block/virtio-blk.c                          |  62 +++++--
>  hw/char/virtio-serial-bus.c                    |  30 ++--
>  hw/net/virtio-net.c                            |  36 +++--
>  hw/scsi/virtio-scsi.c                          |   8 +-
>  hw/virtio/virtio-balloon.c                     |  13 +-
>  hw/virtio/virtio-rng.c                         |   6 +-
>  hw/virtio/virtio.c                             | 214 ++++++++++++++++++-------
>  include/hw/virtio/virtio-blk.h                 |   3 +-
>  include/hw/virtio/virtio.h                     |  17 +-
>  include/standard-headers/linux/virtio_config.h |   2 +
>  tests/Makefile                                 |   6 +-
>  tests/libqos/virtio.c                          |   5 +
>  tests/libqos/virtio.h                          |   2 +
>  tests/virtio-blk-test.c                        | 196 ++++++++++++++++++++--
>  17 files changed, 482 insertions(+), 131 deletions(-)
> 
> -- 
> 1.9.3
> 
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
  2015-04-20 17:36   ` Michael S. Tsirkin
@ 2015-04-20 19:10     ` Paolo Bonzini
  -1 siblings, 0 replies; 54+ messages in thread
From: Paolo Bonzini @ 2015-04-20 19:10 UTC (permalink / raw)
  To: Michael S. Tsirkin, Fam Zheng
  Cc: Aneesh Kumar K.V, Amit Shah, qemu-devel, Stefan Hajnoczi, virtualization



On 20/04/2015 19:36, Michael S. Tsirkin wrote:
> At the implementation level, there's one big issue you seem to have
> missed: DMA to invalid memory addresses causes a crash in memory core.
> I'm not sure whether it makes sense to recover from virtio core bugs
> when we can't recover from device bugs.

What do you mean exactly?  DMA to invalid memory addresses causes
address_space_map to return a "short read".

Paolo

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
@ 2015-04-20 19:10     ` Paolo Bonzini
  0 siblings, 0 replies; 54+ messages in thread
From: Paolo Bonzini @ 2015-04-20 19:10 UTC (permalink / raw)
  To: Michael S. Tsirkin, Fam Zheng
  Cc: Aneesh Kumar K.V, Amit Shah, qemu-devel, Stefan Hajnoczi, virtualization



On 20/04/2015 19:36, Michael S. Tsirkin wrote:
> At the implementation level, there's one big issue you seem to have
> missed: DMA to invalid memory addresses causes a crash in memory core.
> I'm not sure whether it makes sense to recover from virtio core bugs
> when we can't recover from device bugs.

What do you mean exactly?  DMA to invalid memory addresses causes
address_space_map to return a "short read".

Paolo

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
  2015-04-20 19:10     ` Paolo Bonzini
@ 2015-04-20 20:34       ` Michael S. Tsirkin
  -1 siblings, 0 replies; 54+ messages in thread
From: Michael S. Tsirkin @ 2015-04-20 20:34 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Fam Zheng, qemu-devel, virtualization, Aneesh Kumar K.V,
	Stefan Hajnoczi, Amit Shah

On Mon, Apr 20, 2015 at 09:10:02PM +0200, Paolo Bonzini wrote:
> 
> 
> On 20/04/2015 19:36, Michael S. Tsirkin wrote:
> > At the implementation level, there's one big issue you seem to have
> > missed: DMA to invalid memory addresses causes a crash in memory core.
> > I'm not sure whether it makes sense to recover from virtio core bugs
> > when we can't recover from device bugs.
> 
> What do you mean exactly?  DMA to invalid memory addresses causes
> address_space_map to return a "short read".
> 
> Paolo

I mean, first of all, a bunch of virtio_XXX_phys calls.
These eventually call qemu_get_ram_ptr, which internally calls
qemu_get_ram_block and ramblock_ptr.
Both abort on errors.

-- 
MST

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
@ 2015-04-20 20:34       ` Michael S. Tsirkin
  0 siblings, 0 replies; 54+ messages in thread
From: Michael S. Tsirkin @ 2015-04-20 20:34 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Fam Zheng, qemu-devel, virtualization, Aneesh Kumar K.V,
	Stefan Hajnoczi, Amit Shah

On Mon, Apr 20, 2015 at 09:10:02PM +0200, Paolo Bonzini wrote:
> 
> 
> On 20/04/2015 19:36, Michael S. Tsirkin wrote:
> > At the implementation level, there's one big issue you seem to have
> > missed: DMA to invalid memory addresses causes a crash in memory core.
> > I'm not sure whether it makes sense to recover from virtio core bugs
> > when we can't recover from device bugs.
> 
> What do you mean exactly?  DMA to invalid memory addresses causes
> address_space_map to return a "short read".
> 
> Paolo

I mean, first of all, a bunch of virtio_XXX_phys calls.
These eventually call qemu_get_ram_ptr, which internally calls
qemu_get_ram_block and ramblock_ptr.
Both abort on errors.

-- 
MST

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
  2015-04-20 17:36   ` Michael S. Tsirkin
@ 2015-04-21  2:37     ` Fam Zheng
  -1 siblings, 0 replies; 54+ messages in thread
From: Fam Zheng @ 2015-04-21  2:37 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Kevin Wolf, Rusty Russell, qemu-devel, virtualization,
	Aneesh Kumar K.V, Stefan Hajnoczi, Amit Shah, Paolo Bonzini

On Mon, 04/20 19:36, Michael S. Tsirkin wrote:
> On Fri, Apr 17, 2015 at 03:59:15PM +0800, Fam Zheng wrote:
> > Currently, virtio code chooses to kill QEMU if the guest passes any invalid
> > data with vring.
> > That has drawbacks such as losing unsaved data (e.g. when
> > guest user is writing a very long email), or possible denial of service in
> > a nested vm use case where virtio device is passed through.
> > 
> > virtio-1 has introduced a new status bit "NEEDS RESET" which could be used to
> > improve this by communicating the error state between virtio devices and
> > drivers. The device notifies guest upon setting the bit, then the guest driver
> > should detect this bit and report to userspace, or recover the device by
> > resetting it.
> 
> Unfortunately, virtio 1 spec does not have a conformance statement
> that requires driver to recover. We merely have a non-normative looking
> text:
> 	Note: For example, the driver can’t assume requests in flight
> 	will be completed if DEVICE_NEEDS_RESET is set, nor can it assume that
> 	they have not been completed. A good implementation will try to recover
> 	by issuing a reset.
> 
> Implementing this reset for all devices in a race-free manner might also
> be far from trivial.  I think we'd need a feature bit for this.
> OTOH as long as we make this a new feature, would an ability to
> reset a single VQ be a better match for what you are trying to
> achieve?

I think that is too complicated as a recovery measure, a device level resetting
will be better to get to a deterministic state, at least.

> 
> > This series makes necessary changes in virtio core code, based on which
> > virtio-blk is converted. Other devices now keep the existing behavior by
> > passing in "error_abort". They will be converted in following series. The Linux
> > driver part will also be worked on.
> > 
> > One concern with this behavior change is that it's now harder to notice the
> > actual driver bug that caused the error, as the guest continues to run.  To
> > address that, we could probably add a new error action option to virtio
> > devices,  similar to the "read/write werror" in block layer, so the vm could be
> > paused and the management will get an event in QMP like pvpanic.  This work can
> > be done on top.
> 
> At the architectural level, that's only one concern. Others would be
> - workloads such as openstack handle guest crash better than
>   a guest that's e.g. slow because of a memory leak

What memory leak are you referring to?

> - it's easier for guests to probe host for security issues
>   if guest isn't killed
> - guest can flood host log with guest-triggered errors

We can still abort() if guest is triggering error too quickly.

Fam

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
@ 2015-04-21  2:37     ` Fam Zheng
  0 siblings, 0 replies; 54+ messages in thread
From: Fam Zheng @ 2015-04-21  2:37 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: qemu-devel, virtualization, Aneesh Kumar K.V, Stefan Hajnoczi,
	Amit Shah, Paolo Bonzini

On Mon, 04/20 19:36, Michael S. Tsirkin wrote:
> On Fri, Apr 17, 2015 at 03:59:15PM +0800, Fam Zheng wrote:
> > Currently, virtio code chooses to kill QEMU if the guest passes any invalid
> > data with vring.
> > That has drawbacks such as losing unsaved data (e.g. when
> > guest user is writing a very long email), or possible denial of service in
> > a nested vm use case where virtio device is passed through.
> > 
> > virtio-1 has introduced a new status bit "NEEDS RESET" which could be used to
> > improve this by communicating the error state between virtio devices and
> > drivers. The device notifies guest upon setting the bit, then the guest driver
> > should detect this bit and report to userspace, or recover the device by
> > resetting it.
> 
> Unfortunately, virtio 1 spec does not have a conformance statement
> that requires driver to recover. We merely have a non-normative looking
> text:
> 	Note: For example, the driver can’t assume requests in flight
> 	will be completed if DEVICE_NEEDS_RESET is set, nor can it assume that
> 	they have not been completed. A good implementation will try to recover
> 	by issuing a reset.
> 
> Implementing this reset for all devices in a race-free manner might also
> be far from trivial.  I think we'd need a feature bit for this.
> OTOH as long as we make this a new feature, would an ability to
> reset a single VQ be a better match for what you are trying to
> achieve?

I think that is too complicated as a recovery measure, a device level resetting
will be better to get to a deterministic state, at least.

> 
> > This series makes necessary changes in virtio core code, based on which
> > virtio-blk is converted. Other devices now keep the existing behavior by
> > passing in "error_abort". They will be converted in following series. The Linux
> > driver part will also be worked on.
> > 
> > One concern with this behavior change is that it's now harder to notice the
> > actual driver bug that caused the error, as the guest continues to run.  To
> > address that, we could probably add a new error action option to virtio
> > devices,  similar to the "read/write werror" in block layer, so the vm could be
> > paused and the management will get an event in QMP like pvpanic.  This work can
> > be done on top.
> 
> At the architectural level, that's only one concern. Others would be
> - workloads such as openstack handle guest crash better than
>   a guest that's e.g. slow because of a memory leak

What memory leak are you referring to?

> - it's easier for guests to probe host for security issues
>   if guest isn't killed
> - guest can flood host log with guest-triggered errors

We can still abort() if guest is triggering error too quickly.

Fam
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
  2015-04-20 20:34       ` Michael S. Tsirkin
@ 2015-04-21  2:39         ` Fam Zheng
  -1 siblings, 0 replies; 54+ messages in thread
From: Fam Zheng @ 2015-04-21  2:39 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: qemu-devel, virtualization, Aneesh Kumar K.V, Stefan Hajnoczi,
	Amit Shah, Paolo Bonzini

On Mon, 04/20 22:34, Michael S. Tsirkin wrote:
> On Mon, Apr 20, 2015 at 09:10:02PM +0200, Paolo Bonzini wrote:
> > 
> > 
> > On 20/04/2015 19:36, Michael S. Tsirkin wrote:
> > > At the implementation level, there's one big issue you seem to have
> > > missed: DMA to invalid memory addresses causes a crash in memory core.
> > > I'm not sure whether it makes sense to recover from virtio core bugs
> > > when we can't recover from device bugs.
> > 
> > What do you mean exactly?  DMA to invalid memory addresses causes
> > address_space_map to return a "short read".
> > 
> > Paolo
> 
> I mean, first of all, a bunch of virtio_XXX_phys calls.
> These eventually call qemu_get_ram_ptr, which internally calls
> qemu_get_ram_block and ramblock_ptr.
> Both abort on errors.
> 

They are VQ manipulating operations, not DMA. Anyway, can we return errors from
memory core?

Fam

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
@ 2015-04-21  2:39         ` Fam Zheng
  0 siblings, 0 replies; 54+ messages in thread
From: Fam Zheng @ 2015-04-21  2:39 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: qemu-devel, virtualization, Aneesh Kumar K.V, Stefan Hajnoczi,
	Amit Shah, Paolo Bonzini

On Mon, 04/20 22:34, Michael S. Tsirkin wrote:
> On Mon, Apr 20, 2015 at 09:10:02PM +0200, Paolo Bonzini wrote:
> > 
> > 
> > On 20/04/2015 19:36, Michael S. Tsirkin wrote:
> > > At the implementation level, there's one big issue you seem to have
> > > missed: DMA to invalid memory addresses causes a crash in memory core.
> > > I'm not sure whether it makes sense to recover from virtio core bugs
> > > when we can't recover from device bugs.
> > 
> > What do you mean exactly?  DMA to invalid memory addresses causes
> > address_space_map to return a "short read".
> > 
> > Paolo
> 
> I mean, first of all, a bunch of virtio_XXX_phys calls.
> These eventually call qemu_get_ram_ptr, which internally calls
> qemu_get_ram_block and ramblock_ptr.
> Both abort on errors.
> 

They are VQ manipulating operations, not DMA. Anyway, can we return errors from
memory core?

Fam

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
  2015-04-21  2:37     ` Fam Zheng
@ 2015-04-21  5:22       ` Michael S. Tsirkin
  -1 siblings, 0 replies; 54+ messages in thread
From: Michael S. Tsirkin @ 2015-04-21  5:22 UTC (permalink / raw)
  To: Fam Zheng
  Cc: Kevin Wolf, Rusty Russell, qemu-devel, virtualization,
	Aneesh Kumar K.V, Stefan Hajnoczi, Amit Shah, Paolo Bonzini

On Tue, Apr 21, 2015 at 10:37:00AM +0800, Fam Zheng wrote:
> On Mon, 04/20 19:36, Michael S. Tsirkin wrote:
> > On Fri, Apr 17, 2015 at 03:59:15PM +0800, Fam Zheng wrote:
> > > Currently, virtio code chooses to kill QEMU if the guest passes any invalid
> > > data with vring.
> > > That has drawbacks such as losing unsaved data (e.g. when
> > > guest user is writing a very long email), or possible denial of service in
> > > a nested vm use case where virtio device is passed through.
> > > 
> > > virtio-1 has introduced a new status bit "NEEDS RESET" which could be used to
> > > improve this by communicating the error state between virtio devices and
> > > drivers. The device notifies guest upon setting the bit, then the guest driver
> > > should detect this bit and report to userspace, or recover the device by
> > > resetting it.
> > 
> > Unfortunately, virtio 1 spec does not have a conformance statement
> > that requires driver to recover. We merely have a non-normative looking
> > text:
> > 	Note: For example, the driver can’t assume requests in flight
> > 	will be completed if DEVICE_NEEDS_RESET is set, nor can it assume that
> > 	they have not been completed. A good implementation will try to recover
> > 	by issuing a reset.
> > 
> > Implementing this reset for all devices in a race-free manner might also
> > be far from trivial.  I think we'd need a feature bit for this.
> > OTOH as long as we make this a new feature, would an ability to
> > reset a single VQ be a better match for what you are trying to
> > achieve?
> 
> I think that is too complicated as a recovery measure, a device level resetting
> will be better to get to a deterministic state, at least.

Question would be, how hard is it to stop host from using all queues,
retrieve all host OS state and re-program it into the device.
If we need to shadow all OS state within the driver, then that's a lot
of not well tested code with a possibility of introducing more bugs.

> > 
> > > This series makes necessary changes in virtio core code, based on which
> > > virtio-blk is converted. Other devices now keep the existing behavior by
> > > passing in "error_abort". They will be converted in following series. The Linux
> > > driver part will also be worked on.
> > > 
> > > One concern with this behavior change is that it's now harder to notice the
> > > actual driver bug that caused the error, as the guest continues to run.  To
> > > address that, we could probably add a new error action option to virtio
> > > devices,  similar to the "read/write werror" in block layer, so the vm could be
> > > paused and the management will get an event in QMP like pvpanic.  This work can
> > > be done on top.
> > 
> > At the architectural level, that's only one concern. Others would be
> > - workloads such as openstack handle guest crash better than
> >   a guest that's e.g. slow because of a memory leak
> 
> What memory leak are you referring to?

That was just an example.  If host detects a malformed ring, it will
crash.  But often it doesn't, result is buffers not being used, so guest
can't free them up.

> > - it's easier for guests to probe host for security issues
> >   if guest isn't killed
> > - guest can flood host log with guest-triggered errors
> 
> We can still abort() if guest is triggering error too quickly.
> 
> Fam


Absolutely, and if it looked like I'm against error detection and
recovery, this was not my intent.

I am merely saying we can't apply this patchset as is, deferring
addressing the issues to patches on top.

But I have an idea: refactor the code to use error_abort. This way we
can apply the patchset without making functional changes, and you can
make progress to complete this, on top.



-- 
MST

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
@ 2015-04-21  5:22       ` Michael S. Tsirkin
  0 siblings, 0 replies; 54+ messages in thread
From: Michael S. Tsirkin @ 2015-04-21  5:22 UTC (permalink / raw)
  To: Fam Zheng
  Cc: qemu-devel, virtualization, Aneesh Kumar K.V, Stefan Hajnoczi,
	Amit Shah, Paolo Bonzini

On Tue, Apr 21, 2015 at 10:37:00AM +0800, Fam Zheng wrote:
> On Mon, 04/20 19:36, Michael S. Tsirkin wrote:
> > On Fri, Apr 17, 2015 at 03:59:15PM +0800, Fam Zheng wrote:
> > > Currently, virtio code chooses to kill QEMU if the guest passes any invalid
> > > data with vring.
> > > That has drawbacks such as losing unsaved data (e.g. when
> > > guest user is writing a very long email), or possible denial of service in
> > > a nested vm use case where virtio device is passed through.
> > > 
> > > virtio-1 has introduced a new status bit "NEEDS RESET" which could be used to
> > > improve this by communicating the error state between virtio devices and
> > > drivers. The device notifies guest upon setting the bit, then the guest driver
> > > should detect this bit and report to userspace, or recover the device by
> > > resetting it.
> > 
> > Unfortunately, virtio 1 spec does not have a conformance statement
> > that requires driver to recover. We merely have a non-normative looking
> > text:
> > 	Note: For example, the driver can’t assume requests in flight
> > 	will be completed if DEVICE_NEEDS_RESET is set, nor can it assume that
> > 	they have not been completed. A good implementation will try to recover
> > 	by issuing a reset.
> > 
> > Implementing this reset for all devices in a race-free manner might also
> > be far from trivial.  I think we'd need a feature bit for this.
> > OTOH as long as we make this a new feature, would an ability to
> > reset a single VQ be a better match for what you are trying to
> > achieve?
> 
> I think that is too complicated as a recovery measure, a device level resetting
> will be better to get to a deterministic state, at least.

Question would be, how hard is it to stop host from using all queues,
retrieve all host OS state and re-program it into the device.
If we need to shadow all OS state within the driver, then that's a lot
of not well tested code with a possibility of introducing more bugs.

> > 
> > > This series makes necessary changes in virtio core code, based on which
> > > virtio-blk is converted. Other devices now keep the existing behavior by
> > > passing in "error_abort". They will be converted in following series. The Linux
> > > driver part will also be worked on.
> > > 
> > > One concern with this behavior change is that it's now harder to notice the
> > > actual driver bug that caused the error, as the guest continues to run.  To
> > > address that, we could probably add a new error action option to virtio
> > > devices,  similar to the "read/write werror" in block layer, so the vm could be
> > > paused and the management will get an event in QMP like pvpanic.  This work can
> > > be done on top.
> > 
> > At the architectural level, that's only one concern. Others would be
> > - workloads such as openstack handle guest crash better than
> >   a guest that's e.g. slow because of a memory leak
> 
> What memory leak are you referring to?

That was just an example.  If host detects a malformed ring, it will
crash.  But often it doesn't, result is buffers not being used, so guest
can't free them up.

> > - it's easier for guests to probe host for security issues
> >   if guest isn't killed
> > - guest can flood host log with guest-triggered errors
> 
> We can still abort() if guest is triggering error too quickly.
> 
> Fam


Absolutely, and if it looked like I'm against error detection and
recovery, this was not my intent.

I am merely saying we can't apply this patchset as is, deferring
addressing the issues to patches on top.

But I have an idea: refactor the code to use error_abort. This way we
can apply the patchset without making functional changes, and you can
make progress to complete this, on top.



-- 
MST
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
  2015-04-21  5:22       ` Michael S. Tsirkin
@ 2015-04-21  5:50         ` Fam Zheng
  -1 siblings, 0 replies; 54+ messages in thread
From: Fam Zheng @ 2015-04-21  5:50 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Kevin Wolf, Rusty Russell, qemu-devel, virtualization,
	Aneesh Kumar K.V, Stefan Hajnoczi, Amit Shah, Paolo Bonzini

On Tue, 04/21 07:22, Michael S. Tsirkin wrote:
> On Tue, Apr 21, 2015 at 10:37:00AM +0800, Fam Zheng wrote:
> > On Mon, 04/20 19:36, Michael S. Tsirkin wrote:
> > > On Fri, Apr 17, 2015 at 03:59:15PM +0800, Fam Zheng wrote:
> > > > Currently, virtio code chooses to kill QEMU if the guest passes any invalid
> > > > data with vring.
> > > > That has drawbacks such as losing unsaved data (e.g. when
> > > > guest user is writing a very long email), or possible denial of service in
> > > > a nested vm use case where virtio device is passed through.
> > > > 
> > > > virtio-1 has introduced a new status bit "NEEDS RESET" which could be used to
> > > > improve this by communicating the error state between virtio devices and
> > > > drivers. The device notifies guest upon setting the bit, then the guest driver
> > > > should detect this bit and report to userspace, or recover the device by
> > > > resetting it.
> > > 
> > > Unfortunately, virtio 1 spec does not have a conformance statement
> > > that requires driver to recover. We merely have a non-normative looking
> > > text:
> > > 	Note: For example, the driver can’t assume requests in flight
> > > 	will be completed if DEVICE_NEEDS_RESET is set, nor can it assume that
> > > 	they have not been completed. A good implementation will try to recover
> > > 	by issuing a reset.
> > > 
> > > Implementing this reset for all devices in a race-free manner might also
> > > be far from trivial.  I think we'd need a feature bit for this.
> > > OTOH as long as we make this a new feature, would an ability to
> > > reset a single VQ be a better match for what you are trying to
> > > achieve?
> > 
> > I think that is too complicated as a recovery measure, a device level resetting
> > will be better to get to a deterministic state, at least.
> 
> Question would be, how hard is it to stop host from using all queues,
> retrieve all host OS state and re-program it into the device.
> If we need to shadow all OS state within the driver, then that's a lot
> of not well tested code with a possibility of introducing more bugs.

I don't understand the question. In this series the virtio-blk device will not
pop any more requests, and as long as the reset is properly handled, both guest
and host should go back to a good state.
> 
> > > 
> > > > This series makes necessary changes in virtio core code, based on which
> > > > virtio-blk is converted. Other devices now keep the existing behavior by
> > > > passing in "error_abort". They will be converted in following series. The Linux
> > > > driver part will also be worked on.
> > > > 
> > > > One concern with this behavior change is that it's now harder to notice the
> > > > actual driver bug that caused the error, as the guest continues to run.  To
> > > > address that, we could probably add a new error action option to virtio
> > > > devices,  similar to the "read/write werror" in block layer, so the vm could be
> > > > paused and the management will get an event in QMP like pvpanic.  This work can
> > > > be done on top.
> > > 
> > > At the architectural level, that's only one concern. Others would be
> > > - workloads such as openstack handle guest crash better than
> > >   a guest that's e.g. slow because of a memory leak
> > 
> > What memory leak are you referring to?
> 
> That was just an example.  If host detects a malformed ring, it will
> crash.  But often it doesn't, result is buffers not being used, so guest
> can't free them up.
> 
> > > - it's easier for guests to probe host for security issues
> > >   if guest isn't killed
> > > - guest can flood host log with guest-triggered errors
> > 
> > We can still abort() if guest is triggering error too quickly.
> 
> 
> Absolutely, and if it looked like I'm against error detection and
> recovery, this was not my intent.
> 
> I am merely saying we can't apply this patchset as is, deferring
> addressing the issues to patches on top.
> 
> But I have an idea: refactor the code to use error_abort. 

That is patch 1-9 of this series. Or do you mean also refactor and pass
error_abort to the memory core?

Fam

>This way we
> can apply the patchset without making functional changes, and you can
> make progress to complete this, on top.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
@ 2015-04-21  5:50         ` Fam Zheng
  0 siblings, 0 replies; 54+ messages in thread
From: Fam Zheng @ 2015-04-21  5:50 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: qemu-devel, virtualization, Aneesh Kumar K.V, Stefan Hajnoczi,
	Amit Shah, Paolo Bonzini

On Tue, 04/21 07:22, Michael S. Tsirkin wrote:
> On Tue, Apr 21, 2015 at 10:37:00AM +0800, Fam Zheng wrote:
> > On Mon, 04/20 19:36, Michael S. Tsirkin wrote:
> > > On Fri, Apr 17, 2015 at 03:59:15PM +0800, Fam Zheng wrote:
> > > > Currently, virtio code chooses to kill QEMU if the guest passes any invalid
> > > > data with vring.
> > > > That has drawbacks such as losing unsaved data (e.g. when
> > > > guest user is writing a very long email), or possible denial of service in
> > > > a nested vm use case where virtio device is passed through.
> > > > 
> > > > virtio-1 has introduced a new status bit "NEEDS RESET" which could be used to
> > > > improve this by communicating the error state between virtio devices and
> > > > drivers. The device notifies guest upon setting the bit, then the guest driver
> > > > should detect this bit and report to userspace, or recover the device by
> > > > resetting it.
> > > 
> > > Unfortunately, virtio 1 spec does not have a conformance statement
> > > that requires driver to recover. We merely have a non-normative looking
> > > text:
> > > 	Note: For example, the driver can’t assume requests in flight
> > > 	will be completed if DEVICE_NEEDS_RESET is set, nor can it assume that
> > > 	they have not been completed. A good implementation will try to recover
> > > 	by issuing a reset.
> > > 
> > > Implementing this reset for all devices in a race-free manner might also
> > > be far from trivial.  I think we'd need a feature bit for this.
> > > OTOH as long as we make this a new feature, would an ability to
> > > reset a single VQ be a better match for what you are trying to
> > > achieve?
> > 
> > I think that is too complicated as a recovery measure, a device level resetting
> > will be better to get to a deterministic state, at least.
> 
> Question would be, how hard is it to stop host from using all queues,
> retrieve all host OS state and re-program it into the device.
> If we need to shadow all OS state within the driver, then that's a lot
> of not well tested code with a possibility of introducing more bugs.

I don't understand the question. In this series the virtio-blk device will not
pop any more requests, and as long as the reset is properly handled, both guest
and host should go back to a good state.
> 
> > > 
> > > > This series makes necessary changes in virtio core code, based on which
> > > > virtio-blk is converted. Other devices now keep the existing behavior by
> > > > passing in "error_abort". They will be converted in following series. The Linux
> > > > driver part will also be worked on.
> > > > 
> > > > One concern with this behavior change is that it's now harder to notice the
> > > > actual driver bug that caused the error, as the guest continues to run.  To
> > > > address that, we could probably add a new error action option to virtio
> > > > devices,  similar to the "read/write werror" in block layer, so the vm could be
> > > > paused and the management will get an event in QMP like pvpanic.  This work can
> > > > be done on top.
> > > 
> > > At the architectural level, that's only one concern. Others would be
> > > - workloads such as openstack handle guest crash better than
> > >   a guest that's e.g. slow because of a memory leak
> > 
> > What memory leak are you referring to?
> 
> That was just an example.  If host detects a malformed ring, it will
> crash.  But often it doesn't, result is buffers not being used, so guest
> can't free them up.
> 
> > > - it's easier for guests to probe host for security issues
> > >   if guest isn't killed
> > > - guest can flood host log with guest-triggered errors
> > 
> > We can still abort() if guest is triggering error too quickly.
> 
> 
> Absolutely, and if it looked like I'm against error detection and
> recovery, this was not my intent.
> 
> I am merely saying we can't apply this patchset as is, deferring
> addressing the issues to patches on top.
> 
> But I have an idea: refactor the code to use error_abort. 

That is patch 1-9 of this series. Or do you mean also refactor and pass
error_abort to the memory core?

Fam

>This way we
> can apply the patchset without making functional changes, and you can
> make progress to complete this, on top.
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
  2015-04-21  5:50         ` Fam Zheng
@ 2015-04-21  6:09           ` Michael S. Tsirkin
  -1 siblings, 0 replies; 54+ messages in thread
From: Michael S. Tsirkin @ 2015-04-21  6:09 UTC (permalink / raw)
  To: Fam Zheng
  Cc: Kevin Wolf, Rusty Russell, qemu-devel, virtualization,
	Aneesh Kumar K.V, Stefan Hajnoczi, Amit Shah, Paolo Bonzini

On Tue, Apr 21, 2015 at 01:50:33PM +0800, Fam Zheng wrote:
> On Tue, 04/21 07:22, Michael S. Tsirkin wrote:
> > On Tue, Apr 21, 2015 at 10:37:00AM +0800, Fam Zheng wrote:
> > > On Mon, 04/20 19:36, Michael S. Tsirkin wrote:
> > > > On Fri, Apr 17, 2015 at 03:59:15PM +0800, Fam Zheng wrote:
> > > > > Currently, virtio code chooses to kill QEMU if the guest passes any invalid
> > > > > data with vring.
> > > > > That has drawbacks such as losing unsaved data (e.g. when
> > > > > guest user is writing a very long email), or possible denial of service in
> > > > > a nested vm use case where virtio device is passed through.
> > > > > 
> > > > > virtio-1 has introduced a new status bit "NEEDS RESET" which could be used to
> > > > > improve this by communicating the error state between virtio devices and
> > > > > drivers. The device notifies guest upon setting the bit, then the guest driver
> > > > > should detect this bit and report to userspace, or recover the device by
> > > > > resetting it.
> > > > 
> > > > Unfortunately, virtio 1 spec does not have a conformance statement
> > > > that requires driver to recover. We merely have a non-normative looking
> > > > text:
> > > > 	Note: For example, the driver can’t assume requests in flight
> > > > 	will be completed if DEVICE_NEEDS_RESET is set, nor can it assume that
> > > > 	they have not been completed. A good implementation will try to recover
> > > > 	by issuing a reset.
> > > > 
> > > > Implementing this reset for all devices in a race-free manner might also
> > > > be far from trivial.  I think we'd need a feature bit for this.
> > > > OTOH as long as we make this a new feature, would an ability to
> > > > reset a single VQ be a better match for what you are trying to
> > > > achieve?
> > > 
> > > I think that is too complicated as a recovery measure, a device level resetting
> > > will be better to get to a deterministic state, at least.
> > 
> > Question would be, how hard is it to stop host from using all queues,
> > retrieve all host OS state and re-program it into the device.
> > If we need to shadow all OS state within the driver, then that's a lot
> > of not well tested code with a possibility of introducing more bugs.
> 
> I don't understand the question. In this series the virtio-blk device will not
> pop any more requests, and as long as the reset is properly handled, both guest
> and host should go back to a good state.
> > 
> > > > 
> > > > > This series makes necessary changes in virtio core code, based on which
> > > > > virtio-blk is converted. Other devices now keep the existing behavior by
> > > > > passing in "error_abort". They will be converted in following series. The Linux
> > > > > driver part will also be worked on.
> > > > > 
> > > > > One concern with this behavior change is that it's now harder to notice the
> > > > > actual driver bug that caused the error, as the guest continues to run.  To
> > > > > address that, we could probably add a new error action option to virtio
> > > > > devices,  similar to the "read/write werror" in block layer, so the vm could be
> > > > > paused and the management will get an event in QMP like pvpanic.  This work can
> > > > > be done on top.
> > > > 
> > > > At the architectural level, that's only one concern. Others would be
> > > > - workloads such as openstack handle guest crash better than
> > > >   a guest that's e.g. slow because of a memory leak
> > > 
> > > What memory leak are you referring to?
> > 
> > That was just an example.  If host detects a malformed ring, it will
> > crash.  But often it doesn't, result is buffers not being used, so guest
> > can't free them up.
> > 
> > > > - it's easier for guests to probe host for security issues
> > > >   if guest isn't killed
> > > > - guest can flood host log with guest-triggered errors
> > > 
> > > We can still abort() if guest is triggering error too quickly.
> > 
> > 
> > Absolutely, and if it looked like I'm against error detection and
> > recovery, this was not my intent.
> > 
> > I am merely saying we can't apply this patchset as is, deferring
> > addressing the issues to patches on top.
> > 
> > But I have an idea: refactor the code to use error_abort. 
> 
> That is patch 1-9 of this series. Or do you mean also refactor and pass
> error_abort to the memory core?
> 
> Fam

So if you like just patches 1-9 applied, this sounds
reasonable. I'll provide review comments on the individual patches.



> >This way we
> > can apply the patchset without making functional changes, and you can
> > make progress to complete this, on top.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
@ 2015-04-21  6:09           ` Michael S. Tsirkin
  0 siblings, 0 replies; 54+ messages in thread
From: Michael S. Tsirkin @ 2015-04-21  6:09 UTC (permalink / raw)
  To: Fam Zheng
  Cc: qemu-devel, virtualization, Aneesh Kumar K.V, Stefan Hajnoczi,
	Amit Shah, Paolo Bonzini

On Tue, Apr 21, 2015 at 01:50:33PM +0800, Fam Zheng wrote:
> On Tue, 04/21 07:22, Michael S. Tsirkin wrote:
> > On Tue, Apr 21, 2015 at 10:37:00AM +0800, Fam Zheng wrote:
> > > On Mon, 04/20 19:36, Michael S. Tsirkin wrote:
> > > > On Fri, Apr 17, 2015 at 03:59:15PM +0800, Fam Zheng wrote:
> > > > > Currently, virtio code chooses to kill QEMU if the guest passes any invalid
> > > > > data with vring.
> > > > > That has drawbacks such as losing unsaved data (e.g. when
> > > > > guest user is writing a very long email), or possible denial of service in
> > > > > a nested vm use case where virtio device is passed through.
> > > > > 
> > > > > virtio-1 has introduced a new status bit "NEEDS RESET" which could be used to
> > > > > improve this by communicating the error state between virtio devices and
> > > > > drivers. The device notifies guest upon setting the bit, then the guest driver
> > > > > should detect this bit and report to userspace, or recover the device by
> > > > > resetting it.
> > > > 
> > > > Unfortunately, virtio 1 spec does not have a conformance statement
> > > > that requires driver to recover. We merely have a non-normative looking
> > > > text:
> > > > 	Note: For example, the driver can’t assume requests in flight
> > > > 	will be completed if DEVICE_NEEDS_RESET is set, nor can it assume that
> > > > 	they have not been completed. A good implementation will try to recover
> > > > 	by issuing a reset.
> > > > 
> > > > Implementing this reset for all devices in a race-free manner might also
> > > > be far from trivial.  I think we'd need a feature bit for this.
> > > > OTOH as long as we make this a new feature, would an ability to
> > > > reset a single VQ be a better match for what you are trying to
> > > > achieve?
> > > 
> > > I think that is too complicated as a recovery measure, a device level resetting
> > > will be better to get to a deterministic state, at least.
> > 
> > Question would be, how hard is it to stop host from using all queues,
> > retrieve all host OS state and re-program it into the device.
> > If we need to shadow all OS state within the driver, then that's a lot
> > of not well tested code with a possibility of introducing more bugs.
> 
> I don't understand the question. In this series the virtio-blk device will not
> pop any more requests, and as long as the reset is properly handled, both guest
> and host should go back to a good state.
> > 
> > > > 
> > > > > This series makes necessary changes in virtio core code, based on which
> > > > > virtio-blk is converted. Other devices now keep the existing behavior by
> > > > > passing in "error_abort". They will be converted in following series. The Linux
> > > > > driver part will also be worked on.
> > > > > 
> > > > > One concern with this behavior change is that it's now harder to notice the
> > > > > actual driver bug that caused the error, as the guest continues to run.  To
> > > > > address that, we could probably add a new error action option to virtio
> > > > > devices,  similar to the "read/write werror" in block layer, so the vm could be
> > > > > paused and the management will get an event in QMP like pvpanic.  This work can
> > > > > be done on top.
> > > > 
> > > > At the architectural level, that's only one concern. Others would be
> > > > - workloads such as openstack handle guest crash better than
> > > >   a guest that's e.g. slow because of a memory leak
> > > 
> > > What memory leak are you referring to?
> > 
> > That was just an example.  If host detects a malformed ring, it will
> > crash.  But often it doesn't, result is buffers not being used, so guest
> > can't free them up.
> > 
> > > > - it's easier for guests to probe host for security issues
> > > >   if guest isn't killed
> > > > - guest can flood host log with guest-triggered errors
> > > 
> > > We can still abort() if guest is triggering error too quickly.
> > 
> > 
> > Absolutely, and if it looked like I'm against error detection and
> > recovery, this was not my intent.
> > 
> > I am merely saying we can't apply this patchset as is, deferring
> > addressing the issues to patches on top.
> > 
> > But I have an idea: refactor the code to use error_abort. 
> 
> That is patch 1-9 of this series. Or do you mean also refactor and pass
> error_abort to the memory core?
> 
> Fam

So if you like just patches 1-9 applied, this sounds
reasonable. I'll provide review comments on the individual patches.



> >This way we
> > can apply the patchset without making functional changes, and you can
> > make progress to complete this, on top.
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] [PATCH 03/18] virtio: Return error from virtqueue_get_head
  2015-04-17  7:59 ` [Qemu-devel] [PATCH 03/18] virtio: Return error from virtqueue_get_head Fam Zheng
@ 2015-04-21  6:27   ` Michael S. Tsirkin
  0 siblings, 0 replies; 54+ messages in thread
From: Michael S. Tsirkin @ 2015-04-21  6:27 UTC (permalink / raw)
  To: Fam Zheng
  Cc: Kevin Wolf, qemu-devel, Aneesh Kumar K.V, Stefan Hajnoczi,
	Amit Shah, Paolo Bonzini

On Fri, Apr 17, 2015 at 03:59:18PM +0800, Fam Zheng wrote:
> Return type is changed to int.
> 
> When data is invalid, return -EINVAL with an error.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>

Problem is, callers can't really handle -EINVAL here, so it's adding
dead code if we just apply patches 1-9.  How about the following
refactoring?  Your patch can go on top, it doesn't add dead code then.
If you like this, feel free to include it.

--->
virtio: get rid of virtqueue_num_heads

We don't really care about # of heads, squash
virtqueue_num_heads and virtqueue_get_head together,
to get a simpler internal API.
Memory barrier logic is also simplified a bit.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

--

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 60d10cd..580effe 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -295,8 +295,9 @@ void virtqueue_push(VirtQueue *vq, const VirtQueueElement *elem,
     virtqueue_flush(vq, 1);
 }
 
-static int virtqueue_num_heads(VirtQueue *vq, unsigned int idx)
+static int virtqueue_get_head(VirtQueue *vq, unsigned int idx)
 {
+    unsigned int head;
     uint16_t num_heads = vring_avail_idx(vq) - idx;
 
     /* Check it isn't doing very strange things with descriptor numbers. */
@@ -305,18 +306,13 @@ static int virtqueue_num_heads(VirtQueue *vq, unsigned int idx)
                      idx, vring_avail_idx(vq));
         exit(1);
     }
-    /* On success, callers read a descriptor at vq->last_avail_idx.
-     * Make sure descriptor read does not bypass avail index read. */
-    if (num_heads) {
-        smp_rmb();
-    }
 
-    return num_heads;
-}
+    if (!num_heads) {
+        return -EAGAIN;
+    }
 
-static unsigned int virtqueue_get_head(VirtQueue *vq, unsigned int idx)
-{
-    unsigned int head;
+    /* Make sure head read below does not bypass avail index read. */
+    smp_rmb();
 
     /* Grab the next descriptor number they're advertising, and increment
      * the index we've seen. */
@@ -360,19 +356,19 @@ void virtqueue_get_avail_bytes(VirtQueue *vq, unsigned int *in_bytes,
 {
     unsigned int idx;
     unsigned int total_bufs, in_total, out_total;
+    int i;
 
     idx = vq->last_avail_idx;
 
     total_bufs = in_total = out_total = 0;
-    while (virtqueue_num_heads(vq, idx)) {
+    while ((i = virtqueue_get_head(vq, idx)) >= 0) {
         VirtIODevice *vdev = vq->vdev;
         unsigned int max, num_bufs, indirect = 0;
         hwaddr desc_pa;
-        int i;
 
         max = vq->vring.num;
         num_bufs = total_bufs;
-        i = virtqueue_get_head(vq, idx++);
+        idx++;
         desc_pa = vq->vring.desc;
 
         if (vring_desc_flags(vdev, desc_pa, i) & VRING_DESC_F_INDIRECT) {
@@ -462,15 +458,17 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
     hwaddr desc_pa = vq->vring.desc;
     VirtIODevice *vdev = vq->vdev;
 
-    if (!virtqueue_num_heads(vq, vq->last_avail_idx))
+    if ((i = virtqueue_get_head(vq, vq->last_avail_idx)) < 0) {
         return 0;
+    }
 
     /* When we start there are none of either input nor output. */
     elem->out_num = elem->in_num = 0;
 
     max = vq->vring.num;
 
-    i = head = virtqueue_get_head(vq, vq->last_avail_idx++);
+    head = i;
+    vq->last_avail_idx++;
     if (virtio_has_feature(vdev, VIRTIO_RING_F_EVENT_IDX)) {
         vring_set_avail_event(vq, vq->last_avail_idx);
     }

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] [PATCH 04/18] virtio: Return error from virtqueue_next_desc
  2015-04-17  7:59 ` [Qemu-devel] [PATCH 04/18] virtio: Return error from virtqueue_next_desc Fam Zheng
@ 2015-04-21  6:37   ` Michael S. Tsirkin
  2015-04-21  7:30     ` Fam Zheng
  0 siblings, 1 reply; 54+ messages in thread
From: Michael S. Tsirkin @ 2015-04-21  6:37 UTC (permalink / raw)
  To: Fam Zheng
  Cc: Kevin Wolf, qemu-devel, Aneesh Kumar K.V, Stefan Hajnoczi,
	Amit Shah, Paolo Bonzini

On Fri, Apr 17, 2015 at 03:59:19PM +0800, Fam Zheng wrote:
> Two callers pass error_abort now, which can be changed to check return value
> and pass the error on.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  hw/virtio/virtio.c | 27 ++++++++++++++++++---------
>  1 file changed, 18 insertions(+), 9 deletions(-)
> 
> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> index a525f8e..2a24829 100644
> --- a/hw/virtio/virtio.c
> +++ b/hw/virtio/virtio.c
> @@ -329,10 +329,11 @@ static int virtqueue_get_head(VirtQueue *vq, unsigned int idx,
>      return head;
>  }
>  
> -static unsigned virtqueue_next_desc(VirtIODevice *vdev, hwaddr desc_pa,
> -                                    unsigned int i, unsigned int max)
> +static int virtqueue_next_desc(VirtIODevice *vdev, hwaddr desc_pa,
> +                               unsigned int i, unsigned int max,
> +                               Error **errp)
>  {
> -    unsigned int next;
> +    int next;
>  
>      /* If this descriptor says it doesn't chain, we're done. */
>      if (!(vring_desc_flags(vdev, desc_pa, i) & VRING_DESC_F_NEXT)) {
> @@ -345,8 +346,8 @@ static unsigned virtqueue_next_desc(VirtIODevice *vdev, hwaddr desc_pa,
>      smp_wmb();
>  
>      if (next >= max) {
> -        error_report("Desc next is %u", next);
> -        exit(1);
> +        error_setg(errp, "Desc next is %u", next);
> +        return -EINVAL;

I think it's best to return max here. No need to change return type
then.

>      }
>  
>      return next;
> @@ -392,7 +393,7 @@ void virtqueue_get_avail_bytes(VirtQueue *vq, unsigned int *in_bytes,
>              num_bufs = i = 0;
>          }
>  
> -        do {
> +        while (true) {
>              /* If we've got too many, that implies a descriptor loop. */
>              if (++num_bufs > max) {
>                  error_report("Looped descriptor");
> @@ -407,7 +408,11 @@ void virtqueue_get_avail_bytes(VirtQueue *vq, unsigned int *in_bytes,
>              if (in_total >= max_in_bytes && out_total >= max_out_bytes) {
>                  goto done;
>              }
> -        } while ((i = virtqueue_next_desc(vdev, desc_pa, i, max)) != max);
> +            i = virtqueue_next_desc(vdev, desc_pa, i, max, &error_abort);
> +            if (i == max) {
> +                break;
> +            }
> +        }
>  
>          if (!indirect)
>              total_bufs = num_bufs;
> @@ -493,7 +498,7 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
>      }
>  
>      /* Collect all the descriptors */
> -    do {
> +    while (true) {
>          struct iovec *sg;
>  
>          if (vring_desc_flags(vdev, desc_pa, i) & VRING_DESC_F_WRITE) {
> @@ -519,7 +524,11 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
>              error_report("Looped descriptor");
>              exit(1);
>          }
> -    } while ((i = virtqueue_next_desc(vdev, desc_pa, i, max)) != max);
> +        i = virtqueue_next_desc(vdev, desc_pa, i, max, &error_abort);
> +        if (i == max) {
> +            break;
> +        }
> +    }
>  

Why refactor this as part of this patch?

>      /* Now map what we have collected */
>      virtqueue_map_sg(elem->in_sg, elem->in_addr, elem->in_num, 1,
> -- 
> 1.9.3

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] [PATCH 06/18] virtio: Return error from virtqueue_pop
  2015-04-17  7:59 ` [Qemu-devel] [PATCH 06/18] virtio: Return error from virtqueue_pop Fam Zheng
@ 2015-04-21  6:49   ` Michael S. Tsirkin
  2015-04-21  7:24     ` Fam Zheng
  0 siblings, 1 reply; 54+ messages in thread
From: Michael S. Tsirkin @ 2015-04-21  6:49 UTC (permalink / raw)
  To: Fam Zheng
  Cc: Kevin Wolf, qemu-devel, Aneesh Kumar K.V, Stefan Hajnoczi,
	Amit Shah, Paolo Bonzini

On Fri, Apr 17, 2015 at 03:59:21PM +0800, Fam Zheng wrote:
> When getting invalid data from vring, virtqueue_pop used to print an
> error and exit.
> 
> Add an errp parameter so it can return the error to callers.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  hw/9pfs/virtio-9p.c         |  2 +-
>  hw/block/virtio-blk.c       |  2 +-
>  hw/char/virtio-serial-bus.c | 10 +++----
>  hw/net/virtio-net.c         |  6 ++--
>  hw/scsi/virtio-scsi.c       |  2 +-
>  hw/virtio/virtio-balloon.c  |  4 +--
>  hw/virtio/virtio-rng.c      |  2 +-
>  hw/virtio/virtio.c          | 70 ++++++++++++++++++++++++++++++++++-----------
>  include/hw/virtio/virtio.h  |  2 +-
>  9 files changed, 69 insertions(+), 31 deletions(-)
> 
> diff --git a/hw/9pfs/virtio-9p.c b/hw/9pfs/virtio-9p.c
> index 4964da0..17d0c4a 100644
> --- a/hw/9pfs/virtio-9p.c
> +++ b/hw/9pfs/virtio-9p.c
> @@ -3259,7 +3259,7 @@ void handle_9p_output(VirtIODevice *vdev, VirtQueue *vq)
>      ssize_t len;
>  
>      while ((pdu = alloc_pdu(s)) &&
> -            (len = virtqueue_pop(vq, &pdu->elem)) != 0) {
> +            (len = virtqueue_pop(vq, &pdu->elem, &error_abort)) != 0) {
>          uint8_t *ptr;
>          pdu->s = s;
>          BUG_ON(pdu->elem.out_num == 0 || pdu->elem.in_num == 0);
> diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> index f7d8528..0b66ee1 100644
> --- a/hw/block/virtio-blk.c
> +++ b/hw/block/virtio-blk.c
> @@ -191,7 +191,7 @@ static VirtIOBlockReq *virtio_blk_get_request(VirtIOBlock *s)
>  {
>      VirtIOBlockReq *req = virtio_blk_alloc_request(s);
>  
> -    if (!virtqueue_pop(s->vq, &req->elem)) {
> +    if (!virtqueue_pop(s->vq, &req->elem, &error_abort)) {
>          virtio_blk_free_request(req);
>          return NULL;
>      }
> diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
> index a56dafc..76a934b 100644
> --- a/hw/char/virtio-serial-bus.c
> +++ b/hw/char/virtio-serial-bus.c
> @@ -94,7 +94,7 @@ static size_t write_to_port(VirtIOSerialPort *port,
>      while (offset < size) {
>          size_t len;
>  
> -        if (!virtqueue_pop(vq, &elem)) {
> +        if (!virtqueue_pop(vq, &elem, &error_abort)) {
>              break;
>          }
>  
> @@ -116,7 +116,7 @@ static void discard_vq_data(VirtQueue *vq, VirtIODevice *vdev)
>      if (!virtio_queue_ready(vq)) {
>          return;
>      }
> -    while (virtqueue_pop(vq, &elem)) {
> +    while (virtqueue_pop(vq, &elem, &error_abort)) {
>          virtqueue_push(vq, &elem, 0);
>      }
>      virtio_notify(vdev, vq);
> @@ -137,7 +137,7 @@ static void do_flush_queued_data(VirtIOSerialPort *port, VirtQueue *vq,
>  
>          /* Pop an elem only if we haven't left off a previous one mid-way */
>          if (!port->elem.out_num) {
> -            if (!virtqueue_pop(vq, &port->elem)) {
> +            if (!virtqueue_pop(vq, &port->elem, &error_abort)) {
>                  break;
>              }
>              port->iov_idx = 0;
> @@ -190,7 +190,7 @@ static size_t send_control_msg(VirtIOSerial *vser, void *buf, size_t len)
>      if (!virtio_queue_ready(vq)) {
>          return 0;
>      }
> -    if (!virtqueue_pop(vq, &elem)) {
> +    if (!virtqueue_pop(vq, &elem, &error_abort)) {
>          return 0;
>      }
>  
> @@ -420,7 +420,7 @@ static void control_out(VirtIODevice *vdev, VirtQueue *vq)
>  
>      len = 0;
>      buf = NULL;
> -    while (virtqueue_pop(vq, &elem)) {
> +    while (virtqueue_pop(vq, &elem, &error_abort)) {
>          size_t cur_len;
>  
>          cur_len = iov_size(elem.out_sg, elem.out_num);
> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> index 59f76bc..bbcb51f 100644
> --- a/hw/net/virtio-net.c
> +++ b/hw/net/virtio-net.c
> @@ -804,7 +804,7 @@ static void virtio_net_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
>      struct iovec *iov, *iov2;
>      unsigned int iov_cnt;
>  
> -    while (virtqueue_pop(vq, &elem)) {
> +    while (virtqueue_pop(vq, &elem, &error_abort)) {
>          if (iov_size(elem.in_sg, elem.in_num) < sizeof(status) ||
>              iov_size(elem.out_sg, elem.out_num) < sizeof(ctrl)) {
>              error_report("virtio-net ctrl missing headers");
> @@ -1031,7 +1031,7 @@ static ssize_t virtio_net_receive(NetClientState *nc, const uint8_t *buf, size_t
>  
>          total = 0;
>  
> -        if (virtqueue_pop(q->rx_vq, &elem) == 0) {
> +        if (virtqueue_pop(q->rx_vq, &elem, &error_abort) == 0) {
>              if (i == 0)
>                  return -1;
>              error_report("virtio-net unexpected empty queue: "
> @@ -1134,7 +1134,7 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
>          return num_packets;
>      }
>  
> -    while (virtqueue_pop(q->tx_vq, &elem)) {
> +    while (virtqueue_pop(q->tx_vq, &elem, &error_abort)) {
>          ssize_t ret, len;
>          unsigned int out_num = elem.out_num;
>          struct iovec *out_sg = &elem.out_sg[0];
> diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
> index c9bea06..40ba03d 100644
> --- a/hw/scsi/virtio-scsi.c
> +++ b/hw/scsi/virtio-scsi.c
> @@ -177,7 +177,7 @@ static int virtio_scsi_parse_req(VirtIOSCSIReq *req,
>  static VirtIOSCSIReq *virtio_scsi_pop_req(VirtIOSCSI *s, VirtQueue *vq)
>  {
>      VirtIOSCSIReq *req = virtio_scsi_init_req(s, vq);
> -    if (!virtqueue_pop(vq, &req->elem)) {
> +    if (!virtqueue_pop(vq, &req->elem, &error_abort)) {
>          virtio_scsi_free_req(req);
>          return NULL;
>      }
> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> index 95b0643..e26c0a7 100644
> --- a/hw/virtio/virtio-balloon.c
> +++ b/hw/virtio/virtio-balloon.c
> @@ -206,7 +206,7 @@ static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
>      VirtQueueElement elem;
>      MemoryRegionSection section;
>  
> -    while (virtqueue_pop(vq, &elem)) {
> +    while (virtqueue_pop(vq, &elem, &error_abort)) {
>          size_t offset = 0;
>          uint32_t pfn;
>  
> @@ -246,7 +246,7 @@ static void virtio_balloon_receive_stats(VirtIODevice *vdev, VirtQueue *vq)
>      size_t offset = 0;
>      qemu_timeval tv;
>  
> -    if (!virtqueue_pop(vq, elem)) {
> +    if (!virtqueue_pop(vq, elem, &error_abort)) {
>          goto out;
>      }
>  
> diff --git a/hw/virtio/virtio-rng.c b/hw/virtio/virtio-rng.c
> index 9e1bd75..c3cbdc3 100644
> --- a/hw/virtio/virtio-rng.c
> +++ b/hw/virtio/virtio-rng.c
> @@ -56,7 +56,7 @@ static void chr_read(void *opaque, const void *buf, size_t size)
>  
>      offset = 0;
>      while (offset < size) {
> -        if (!virtqueue_pop(vrng->vq, &elem)) {
> +        if (!virtqueue_pop(vrng->vq, &elem, &error_abort)) {
>              break;
>          }
>          len = iov_from_buf(elem.in_sg, elem.in_num,
> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> index 1cd454b..e6f9f6b 100644
> --- a/hw/virtio/virtio.c
> +++ b/hw/virtio/virtio.c
> @@ -481,29 +481,49 @@ fail:
>      error_setg(errp, "virtio: error trying to map MMIO memory");
>  }
>  
> -int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
> +static void virtqueue_undo_map_sg(struct iovec *sg, hwaddr *addr,
> +                                  size_t num_sg, int is_write)
> +{
> +    int i;
> +
> +    for (i = 0; i < num_sg; i++) {
> +        cpu_physical_memory_unmap(sg[i].iov_base, sg[i].iov_len,
> +                                  is_write, 0);
> +    }
> +}
> +
> +int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem, Error **errp)
>  {
>      unsigned int i, head, max;
> +    int ret;
>      hwaddr desc_pa = vq->vring.desc;
>      VirtIODevice *vdev = vq->vdev;
> +    Error *local_err = NULL;
>  
> -    if (!virtqueue_num_heads(vq, vq->last_avail_idx, &error_abort))
> -        return 0;
> +    ret = virtqueue_num_heads(vq, vq->last_avail_idx, &local_err);
> +    if (ret <= 0) {
> +        goto err;
> +    }
>  

Strange.
Doesn't this make pop print an error if ring is empty?



>      /* When we start there are none of either input nor output. */
>      elem->out_num = elem->in_num = 0;
>  
>      max = vq->vring.num;
>  
> -    i = head = virtqueue_get_head(vq, vq->last_avail_idx++, &error_abort);
> +    ret = virtqueue_get_head(vq, vq->last_avail_idx++, &local_err);
> +    if (ret < 0) {
> +        goto err;
> +    }
> +    head = i = ret;
> +
>      if (virtio_has_feature(vdev, VIRTIO_RING_F_EVENT_IDX)) {
>          vring_set_avail_event(vq, vq->last_avail_idx);
>      }
>  
>      if (vring_desc_flags(vdev, desc_pa, i) & VRING_DESC_F_INDIRECT) {
>          if (vring_desc_len(vdev, desc_pa, i) % sizeof(VRingDesc)) {
> -            error_report("Invalid size for indirect buffer table");
> -            exit(1);
> +            error_setg(errp, "Invalid size for indirect buffer table");
> +            return -EINVAL;

Again, you have both error_setg and a return code.
There's no need for this.
Just return ring empty on error.


>          }
>  
>          /* loop over the indirect descriptor table */
> @@ -518,15 +538,17 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
>  
>          if (vring_desc_flags(vdev, desc_pa, i) & VRING_DESC_F_WRITE) {
>              if (elem->in_num >= ARRAY_SIZE(elem->in_sg)) {
> -                error_report("Too many write descriptors in indirect table");
> -                exit(1);
> +                error_setg(errp,
> +                           "Too many write descriptors in indirect table");
> +                return -EINVAL;
>              }
>              elem->in_addr[elem->in_num] = vring_desc_addr(vdev, desc_pa, i);
>              sg = &elem->in_sg[elem->in_num++];
>          } else {
>              if (elem->out_num >= ARRAY_SIZE(elem->out_sg)) {
> -                error_report("Too many read descriptors in indirect table");
> -                exit(1);
> +                error_setg(errp,
> +                           "Too many read descriptors in indirect table");
> +                return -EINVAL;
>              }
>              elem->out_addr[elem->out_num] = vring_desc_addr(vdev, desc_pa, i);
>              sg = &elem->out_sg[elem->out_num++];
> @@ -536,20 +558,31 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
>  
>          /* If we've got too many, that implies a descriptor loop. */
>          if ((elem->in_num + elem->out_num) > max) {
> -            error_report("Looped descriptor");
> -            exit(1);
> +            error_setg(errp, "Looped descriptor");
> +            return -EINVAL;
>          }
> -        i = virtqueue_next_desc(vdev, desc_pa, i, max, &error_abort);
> -        if (i == max) {
> +        ret = virtqueue_next_desc(vdev, desc_pa, i, max, &local_err);
> +        if (ret < 0) {
> +            goto err;
> +        } else if (ret == max) {
>              break;
>          }
> +        i = ret;
>      }
>  
>      /* Now map what we have collected */
>      virtqueue_map_sg(elem->in_sg, elem->in_addr, elem->in_num, 1,
> -                     &error_abort);
> +                     &local_err);
> +    if (local_err) {
> +        ret = -EINVAL;
> +        goto err;
> +    }
>      virtqueue_map_sg(elem->out_sg, elem->out_addr, elem->out_num, 0,
> -                     &error_abort);
> +                     &local_err);
> +    if (local_err) {
> +        ret = -EINVAL;
> +        goto err_unmap;
> +    }
>  
>      elem->index = head;
>  
> @@ -557,6 +590,11 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
>  
>      trace_virtqueue_pop(vq, elem, elem->in_num, elem->out_num);
>      return elem->in_num + elem->out_num;
> +err_unmap:
> +    virtqueue_undo_map_sg(elem->in_sg, elem->in_addr, elem->in_num, 1);
> +err:
> +    error_propagate(errp, local_err);
> +    return ret;
>  }
>  
>  /* virtio device */
> diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> index a37ee64..c478f48 100644
> --- a/include/hw/virtio/virtio.h
> +++ b/include/hw/virtio/virtio.h
> @@ -142,7 +142,7 @@ void virtqueue_fill(VirtQueue *vq, const VirtQueueElement *elem,
>  void virtqueue_map_sg(struct iovec *sg, hwaddr *addr,
>                        size_t num_sg, int is_write,
>                        Error **errp);
> -int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem);
> +int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem, Error **errp);
>  int virtqueue_avail_bytes(VirtQueue *vq, unsigned int in_bytes,
>                            unsigned int out_bytes);
>  void virtqueue_get_avail_bytes(VirtQueue *vq, unsigned int *in_bytes,
> -- 
> 1.9.3

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
  2015-04-20 20:34       ` Michael S. Tsirkin
@ 2015-04-21  6:52         ` Paolo Bonzini
  -1 siblings, 0 replies; 54+ messages in thread
From: Paolo Bonzini @ 2015-04-21  6:52 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Fam Zheng, qemu-devel, virtualization, Aneesh Kumar K.V,
	Stefan Hajnoczi, Amit Shah



On 20/04/2015 22:34, Michael S. Tsirkin wrote:
> On Mon, Apr 20, 2015 at 09:10:02PM +0200, Paolo Bonzini wrote:
>>
>>
>> On 20/04/2015 19:36, Michael S. Tsirkin wrote:
>>> At the implementation level, there's one big issue you seem to have
>>> missed: DMA to invalid memory addresses causes a crash in memory core.
>>> I'm not sure whether it makes sense to recover from virtio core bugs
>>> when we can't recover from device bugs.
>>
>> What do you mean exactly?  DMA to invalid memory addresses causes
>> address_space_map to return a "short read".
>>
>> Paolo
> 
> I mean, first of all, a bunch of virtio_XXX_phys calls.
> These eventually call qemu_get_ram_ptr, which internally calls
> qemu_get_ram_block and ramblock_ptr.
> Both abort on errors.

address_space_translate and memory_access_size should ensure they don't.

Paolo

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
@ 2015-04-21  6:52         ` Paolo Bonzini
  0 siblings, 0 replies; 54+ messages in thread
From: Paolo Bonzini @ 2015-04-21  6:52 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Fam Zheng, qemu-devel, virtualization, Aneesh Kumar K.V,
	Stefan Hajnoczi, Amit Shah



On 20/04/2015 22:34, Michael S. Tsirkin wrote:
> On Mon, Apr 20, 2015 at 09:10:02PM +0200, Paolo Bonzini wrote:
>>
>>
>> On 20/04/2015 19:36, Michael S. Tsirkin wrote:
>>> At the implementation level, there's one big issue you seem to have
>>> missed: DMA to invalid memory addresses causes a crash in memory core.
>>> I'm not sure whether it makes sense to recover from virtio core bugs
>>> when we can't recover from device bugs.
>>
>> What do you mean exactly?  DMA to invalid memory addresses causes
>> address_space_map to return a "short read".
>>
>> Paolo
> 
> I mean, first of all, a bunch of virtio_XXX_phys calls.
> These eventually call qemu_get_ram_ptr, which internally calls
> qemu_get_ram_block and ramblock_ptr.
> Both abort on errors.

address_space_translate and memory_access_size should ensure they don't.

Paolo

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
  2015-04-21  6:52         ` Paolo Bonzini
@ 2015-04-21  6:58           ` Michael S. Tsirkin
  -1 siblings, 0 replies; 54+ messages in thread
From: Michael S. Tsirkin @ 2015-04-21  6:58 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Fam Zheng, qemu-devel, virtualization, Aneesh Kumar K.V,
	Stefan Hajnoczi, Amit Shah

On Tue, Apr 21, 2015 at 08:52:36AM +0200, Paolo Bonzini wrote:
> 
> 
> On 20/04/2015 22:34, Michael S. Tsirkin wrote:
> > On Mon, Apr 20, 2015 at 09:10:02PM +0200, Paolo Bonzini wrote:
> >>
> >>
> >> On 20/04/2015 19:36, Michael S. Tsirkin wrote:
> >>> At the implementation level, there's one big issue you seem to have
> >>> missed: DMA to invalid memory addresses causes a crash in memory core.
> >>> I'm not sure whether it makes sense to recover from virtio core bugs
> >>> when we can't recover from device bugs.
> >>
> >> What do you mean exactly?  DMA to invalid memory addresses causes
> >> address_space_map to return a "short read".
> >>
> >> Paolo
> > 
> > I mean, first of all, a bunch of virtio_XXX_phys calls.
> > These eventually call qemu_get_ram_ptr, which internally calls
> > qemu_get_ram_block and ramblock_ptr.
> > Both abort on errors.
> 
> address_space_translate and memory_access_size should ensure they don't.
> 
> Paolo

More comments in this code won't hurt.
It *looks* as if we assume we get a valid mr, and try to
access it.
In any case, no error is reported.

-- 
MST

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
@ 2015-04-21  6:58           ` Michael S. Tsirkin
  0 siblings, 0 replies; 54+ messages in thread
From: Michael S. Tsirkin @ 2015-04-21  6:58 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Fam Zheng, qemu-devel, virtualization, Aneesh Kumar K.V,
	Stefan Hajnoczi, Amit Shah

On Tue, Apr 21, 2015 at 08:52:36AM +0200, Paolo Bonzini wrote:
> 
> 
> On 20/04/2015 22:34, Michael S. Tsirkin wrote:
> > On Mon, Apr 20, 2015 at 09:10:02PM +0200, Paolo Bonzini wrote:
> >>
> >>
> >> On 20/04/2015 19:36, Michael S. Tsirkin wrote:
> >>> At the implementation level, there's one big issue you seem to have
> >>> missed: DMA to invalid memory addresses causes a crash in memory core.
> >>> I'm not sure whether it makes sense to recover from virtio core bugs
> >>> when we can't recover from device bugs.
> >>
> >> What do you mean exactly?  DMA to invalid memory addresses causes
> >> address_space_map to return a "short read".
> >>
> >> Paolo
> > 
> > I mean, first of all, a bunch of virtio_XXX_phys calls.
> > These eventually call qemu_get_ram_ptr, which internally calls
> > qemu_get_ram_block and ramblock_ptr.
> > Both abort on errors.
> 
> address_space_translate and memory_access_size should ensure they don't.
> 
> Paolo

More comments in this code won't hurt.
It *looks* as if we assume we get a valid mr, and try to
access it.
In any case, no error is reported.

-- 
MST

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] [PATCH 06/18] virtio: Return error from virtqueue_pop
  2015-04-21  6:49   ` Michael S. Tsirkin
@ 2015-04-21  7:24     ` Fam Zheng
  2015-04-21  9:51       ` Michael S. Tsirkin
  0 siblings, 1 reply; 54+ messages in thread
From: Fam Zheng @ 2015-04-21  7:24 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Kevin Wolf, qemu-devel, Aneesh Kumar K.V, Stefan Hajnoczi,
	Amit Shah, Paolo Bonzini

On Tue, 04/21 08:49, Michael S. Tsirkin wrote:
> On Fri, Apr 17, 2015 at 03:59:21PM +0800, Fam Zheng wrote:
> > When getting invalid data from vring, virtqueue_pop used to print an
> > error and exit.
> > 
> > Add an errp parameter so it can return the error to callers.
> > 
> > Signed-off-by: Fam Zheng <famz@redhat.com>
> > ---
> >  hw/9pfs/virtio-9p.c         |  2 +-
> >  hw/block/virtio-blk.c       |  2 +-
> >  hw/char/virtio-serial-bus.c | 10 +++----
> >  hw/net/virtio-net.c         |  6 ++--
> >  hw/scsi/virtio-scsi.c       |  2 +-
> >  hw/virtio/virtio-balloon.c  |  4 +--
> >  hw/virtio/virtio-rng.c      |  2 +-
> >  hw/virtio/virtio.c          | 70 ++++++++++++++++++++++++++++++++++-----------
> >  include/hw/virtio/virtio.h  |  2 +-
> >  9 files changed, 69 insertions(+), 31 deletions(-)
> > 
> > diff --git a/hw/9pfs/virtio-9p.c b/hw/9pfs/virtio-9p.c
> > index 4964da0..17d0c4a 100644
> > --- a/hw/9pfs/virtio-9p.c
> > +++ b/hw/9pfs/virtio-9p.c
> > @@ -3259,7 +3259,7 @@ void handle_9p_output(VirtIODevice *vdev, VirtQueue *vq)
> >      ssize_t len;
> >  
> >      while ((pdu = alloc_pdu(s)) &&
> > -            (len = virtqueue_pop(vq, &pdu->elem)) != 0) {
> > +            (len = virtqueue_pop(vq, &pdu->elem, &error_abort)) != 0) {
> >          uint8_t *ptr;
> >          pdu->s = s;
> >          BUG_ON(pdu->elem.out_num == 0 || pdu->elem.in_num == 0);
> > diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> > index f7d8528..0b66ee1 100644
> > --- a/hw/block/virtio-blk.c
> > +++ b/hw/block/virtio-blk.c
> > @@ -191,7 +191,7 @@ static VirtIOBlockReq *virtio_blk_get_request(VirtIOBlock *s)
> >  {
> >      VirtIOBlockReq *req = virtio_blk_alloc_request(s);
> >  
> > -    if (!virtqueue_pop(s->vq, &req->elem)) {
> > +    if (!virtqueue_pop(s->vq, &req->elem, &error_abort)) {
> >          virtio_blk_free_request(req);
> >          return NULL;
> >      }
> > diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
> > index a56dafc..76a934b 100644
> > --- a/hw/char/virtio-serial-bus.c
> > +++ b/hw/char/virtio-serial-bus.c
> > @@ -94,7 +94,7 @@ static size_t write_to_port(VirtIOSerialPort *port,
> >      while (offset < size) {
> >          size_t len;
> >  
> > -        if (!virtqueue_pop(vq, &elem)) {
> > +        if (!virtqueue_pop(vq, &elem, &error_abort)) {
> >              break;
> >          }
> >  
> > @@ -116,7 +116,7 @@ static void discard_vq_data(VirtQueue *vq, VirtIODevice *vdev)
> >      if (!virtio_queue_ready(vq)) {
> >          return;
> >      }
> > -    while (virtqueue_pop(vq, &elem)) {
> > +    while (virtqueue_pop(vq, &elem, &error_abort)) {
> >          virtqueue_push(vq, &elem, 0);
> >      }
> >      virtio_notify(vdev, vq);
> > @@ -137,7 +137,7 @@ static void do_flush_queued_data(VirtIOSerialPort *port, VirtQueue *vq,
> >  
> >          /* Pop an elem only if we haven't left off a previous one mid-way */
> >          if (!port->elem.out_num) {
> > -            if (!virtqueue_pop(vq, &port->elem)) {
> > +            if (!virtqueue_pop(vq, &port->elem, &error_abort)) {
> >                  break;
> >              }
> >              port->iov_idx = 0;
> > @@ -190,7 +190,7 @@ static size_t send_control_msg(VirtIOSerial *vser, void *buf, size_t len)
> >      if (!virtio_queue_ready(vq)) {
> >          return 0;
> >      }
> > -    if (!virtqueue_pop(vq, &elem)) {
> > +    if (!virtqueue_pop(vq, &elem, &error_abort)) {
> >          return 0;
> >      }
> >  
> > @@ -420,7 +420,7 @@ static void control_out(VirtIODevice *vdev, VirtQueue *vq)
> >  
> >      len = 0;
> >      buf = NULL;
> > -    while (virtqueue_pop(vq, &elem)) {
> > +    while (virtqueue_pop(vq, &elem, &error_abort)) {
> >          size_t cur_len;
> >  
> >          cur_len = iov_size(elem.out_sg, elem.out_num);
> > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > index 59f76bc..bbcb51f 100644
> > --- a/hw/net/virtio-net.c
> > +++ b/hw/net/virtio-net.c
> > @@ -804,7 +804,7 @@ static void virtio_net_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
> >      struct iovec *iov, *iov2;
> >      unsigned int iov_cnt;
> >  
> > -    while (virtqueue_pop(vq, &elem)) {
> > +    while (virtqueue_pop(vq, &elem, &error_abort)) {
> >          if (iov_size(elem.in_sg, elem.in_num) < sizeof(status) ||
> >              iov_size(elem.out_sg, elem.out_num) < sizeof(ctrl)) {
> >              error_report("virtio-net ctrl missing headers");
> > @@ -1031,7 +1031,7 @@ static ssize_t virtio_net_receive(NetClientState *nc, const uint8_t *buf, size_t
> >  
> >          total = 0;
> >  
> > -        if (virtqueue_pop(q->rx_vq, &elem) == 0) {
> > +        if (virtqueue_pop(q->rx_vq, &elem, &error_abort) == 0) {
> >              if (i == 0)
> >                  return -1;
> >              error_report("virtio-net unexpected empty queue: "
> > @@ -1134,7 +1134,7 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> >          return num_packets;
> >      }
> >  
> > -    while (virtqueue_pop(q->tx_vq, &elem)) {
> > +    while (virtqueue_pop(q->tx_vq, &elem, &error_abort)) {
> >          ssize_t ret, len;
> >          unsigned int out_num = elem.out_num;
> >          struct iovec *out_sg = &elem.out_sg[0];
> > diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
> > index c9bea06..40ba03d 100644
> > --- a/hw/scsi/virtio-scsi.c
> > +++ b/hw/scsi/virtio-scsi.c
> > @@ -177,7 +177,7 @@ static int virtio_scsi_parse_req(VirtIOSCSIReq *req,
> >  static VirtIOSCSIReq *virtio_scsi_pop_req(VirtIOSCSI *s, VirtQueue *vq)
> >  {
> >      VirtIOSCSIReq *req = virtio_scsi_init_req(s, vq);
> > -    if (!virtqueue_pop(vq, &req->elem)) {
> > +    if (!virtqueue_pop(vq, &req->elem, &error_abort)) {
> >          virtio_scsi_free_req(req);
> >          return NULL;
> >      }
> > diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> > index 95b0643..e26c0a7 100644
> > --- a/hw/virtio/virtio-balloon.c
> > +++ b/hw/virtio/virtio-balloon.c
> > @@ -206,7 +206,7 @@ static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
> >      VirtQueueElement elem;
> >      MemoryRegionSection section;
> >  
> > -    while (virtqueue_pop(vq, &elem)) {
> > +    while (virtqueue_pop(vq, &elem, &error_abort)) {
> >          size_t offset = 0;
> >          uint32_t pfn;
> >  
> > @@ -246,7 +246,7 @@ static void virtio_balloon_receive_stats(VirtIODevice *vdev, VirtQueue *vq)
> >      size_t offset = 0;
> >      qemu_timeval tv;
> >  
> > -    if (!virtqueue_pop(vq, elem)) {
> > +    if (!virtqueue_pop(vq, elem, &error_abort)) {
> >          goto out;
> >      }
> >  
> > diff --git a/hw/virtio/virtio-rng.c b/hw/virtio/virtio-rng.c
> > index 9e1bd75..c3cbdc3 100644
> > --- a/hw/virtio/virtio-rng.c
> > +++ b/hw/virtio/virtio-rng.c
> > @@ -56,7 +56,7 @@ static void chr_read(void *opaque, const void *buf, size_t size)
> >  
> >      offset = 0;
> >      while (offset < size) {
> > -        if (!virtqueue_pop(vrng->vq, &elem)) {
> > +        if (!virtqueue_pop(vrng->vq, &elem, &error_abort)) {
> >              break;
> >          }
> >          len = iov_from_buf(elem.in_sg, elem.in_num,
> > diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> > index 1cd454b..e6f9f6b 100644
> > --- a/hw/virtio/virtio.c
> > +++ b/hw/virtio/virtio.c
> > @@ -481,29 +481,49 @@ fail:
> >      error_setg(errp, "virtio: error trying to map MMIO memory");
> >  }
> >  
> > -int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
> > +static void virtqueue_undo_map_sg(struct iovec *sg, hwaddr *addr,
> > +                                  size_t num_sg, int is_write)
> > +{
> > +    int i;
> > +
> > +    for (i = 0; i < num_sg; i++) {
> > +        cpu_physical_memory_unmap(sg[i].iov_base, sg[i].iov_len,
> > +                                  is_write, 0);
> > +    }
> > +}
> > +
> > +int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem, Error **errp)
> >  {
> >      unsigned int i, head, max;
> > +    int ret;
> >      hwaddr desc_pa = vq->vring.desc;
> >      VirtIODevice *vdev = vq->vdev;
> > +    Error *local_err = NULL;
> >  
> > -    if (!virtqueue_num_heads(vq, vq->last_avail_idx, &error_abort))
> > -        return 0;
> > +    ret = virtqueue_num_heads(vq, vq->last_avail_idx, &local_err);
> > +    if (ret <= 0) {
> > +        goto err;
> > +    }
> >  
> 
> Strange.
> Doesn't this make pop print an error if ring is empty?

Code at err label propagates local_err (NULL if empty, hence nop) and returns
ret (0 if empty). Maybe rename it to "out".

> 
> 
> 
> >      /* When we start there are none of either input nor output. */
> >      elem->out_num = elem->in_num = 0;
> >  
> >      max = vq->vring.num;
> >  
> > -    i = head = virtqueue_get_head(vq, vq->last_avail_idx++, &error_abort);
> > +    ret = virtqueue_get_head(vq, vq->last_avail_idx++, &local_err);
> > +    if (ret < 0) {
> > +        goto err;
> > +    }
> > +    head = i = ret;
> > +
> >      if (virtio_has_feature(vdev, VIRTIO_RING_F_EVENT_IDX)) {
> >          vring_set_avail_event(vq, vq->last_avail_idx);
> >      }
> >  
> >      if (vring_desc_flags(vdev, desc_pa, i) & VRING_DESC_F_INDIRECT) {
> >          if (vring_desc_len(vdev, desc_pa, i) % sizeof(VRingDesc)) {
> > -            error_report("Invalid size for indirect buffer table");
> > -            exit(1);
> > +            error_setg(errp, "Invalid size for indirect buffer table");
> > +            return -EINVAL;
> 
> Again, you have both error_setg and a return code.
> There's no need for this.
> Just return ring empty on error.

How would caller distinguish real ring empty from error then?

Fam

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] [PATCH 04/18] virtio: Return error from virtqueue_next_desc
  2015-04-21  6:37   ` Michael S. Tsirkin
@ 2015-04-21  7:30     ` Fam Zheng
  2015-04-21  9:56       ` Michael S. Tsirkin
  0 siblings, 1 reply; 54+ messages in thread
From: Fam Zheng @ 2015-04-21  7:30 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Kevin Wolf, qemu-devel, Aneesh Kumar K.V, Stefan Hajnoczi,
	Amit Shah, Paolo Bonzini

On Tue, 04/21 08:37, Michael S. Tsirkin wrote:
> On Fri, Apr 17, 2015 at 03:59:19PM +0800, Fam Zheng wrote:
> > Two callers pass error_abort now, which can be changed to check return value
> > and pass the error on.
> > 
> > Signed-off-by: Fam Zheng <famz@redhat.com>
> > ---
> >  hw/virtio/virtio.c | 27 ++++++++++++++++++---------
> >  1 file changed, 18 insertions(+), 9 deletions(-)
> > 
> > diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> > index a525f8e..2a24829 100644
> > --- a/hw/virtio/virtio.c
> > +++ b/hw/virtio/virtio.c
> > @@ -329,10 +329,11 @@ static int virtqueue_get_head(VirtQueue *vq, unsigned int idx,
> >      return head;
> >  }
> >  
> > -static unsigned virtqueue_next_desc(VirtIODevice *vdev, hwaddr desc_pa,
> > -                                    unsigned int i, unsigned int max)
> > +static int virtqueue_next_desc(VirtIODevice *vdev, hwaddr desc_pa,
> > +                               unsigned int i, unsigned int max,
> > +                               Error **errp)
> >  {
> > -    unsigned int next;
> > +    int next;
> >  
> >      /* If this descriptor says it doesn't chain, we're done. */
> >      if (!(vring_desc_flags(vdev, desc_pa, i) & VRING_DESC_F_NEXT)) {
> > @@ -345,8 +346,8 @@ static unsigned virtqueue_next_desc(VirtIODevice *vdev, hwaddr desc_pa,
> >      smp_wmb();
> >  
> >      if (next >= max) {
> > -        error_report("Desc next is %u", next);
> > -        exit(1);
> > +        error_setg(errp, "Desc next is %u", next);
> > +        return -EINVAL;
> 
> I think it's best to return max here. No need to change return type
> then.

We use negative return code elsewherer for reporting errors, I personally
prefer -EINVAL.  Are you concerned about overflow?

> 
> >      }
> >  
> >      return next;
> > @@ -392,7 +393,7 @@ void virtqueue_get_avail_bytes(VirtQueue *vq, unsigned int *in_bytes,
> >              num_bufs = i = 0;
> >          }
> >  
> > -        do {
> > +        while (true) {
> >              /* If we've got too many, that implies a descriptor loop. */
> >              if (++num_bufs > max) {
> >                  error_report("Looped descriptor");
> > @@ -407,7 +408,11 @@ void virtqueue_get_avail_bytes(VirtQueue *vq, unsigned int *in_bytes,
> >              if (in_total >= max_in_bytes && out_total >= max_out_bytes) {
> >                  goto done;
> >              }
> > -        } while ((i = virtqueue_next_desc(vdev, desc_pa, i, max)) != max);
> > +            i = virtqueue_next_desc(vdev, desc_pa, i, max, &error_abort);
> > +            if (i == max) {
> > +                break;
> > +            }
> > +        }
> >  
> >          if (!indirect)
> >              total_bufs = num_bufs;
> > @@ -493,7 +498,7 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
> >      }
> >  
> >      /* Collect all the descriptors */
> > -    do {
> > +    while (true) {
> >          struct iovec *sg;
> >  
> >          if (vring_desc_flags(vdev, desc_pa, i) & VRING_DESC_F_WRITE) {
> > @@ -519,7 +524,11 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
> >              error_report("Looped descriptor");
> >              exit(1);
> >          }
> > -    } while ((i = virtqueue_next_desc(vdev, desc_pa, i, max)) != max);
> > +        i = virtqueue_next_desc(vdev, desc_pa, i, max, &error_abort);
> > +        if (i == max) {
> > +            break;
> > +        }
> > +    }
> >  
> 
> Why refactor this as part of this patch?

Graceful error handling will need to un-inline the loop condition, so refactor
it as we're touching the line.

Fam

> 
> >      /* Now map what we have collected */
> >      virtqueue_map_sg(elem->in_sg, elem->in_addr, elem->in_num, 1,
> > -- 
> > 1.9.3

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
  2015-04-20 15:13 ` [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET" Cornelia Huck
@ 2015-04-21  7:44   ` Fam Zheng
  2015-04-21  8:04     ` Cornelia Huck
  0 siblings, 1 reply; 54+ messages in thread
From: Fam Zheng @ 2015-04-21  7:44 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Kevin Wolf, Michael S. Tsirkin, qemu-devel, Aneesh Kumar K.V,
	Stefan Hajnoczi, Amit Shah, Paolo Bonzini

On Mon, 04/20 17:13, Cornelia Huck wrote:
> On Fri, 17 Apr 2015 15:59:15 +0800
> Fam Zheng <famz@redhat.com> wrote:
> 
> > Currently, virtio code chooses to kill QEMU if the guest passes any invalid
> > data with vring. That has drawbacks such as losing unsaved data (e.g. when
> > guest user is writing a very long email), or possible denial of service in
> > a nested vm use case where virtio device is passed through.
> > 
> > virtio-1 has introduced a new status bit "NEEDS RESET" which could be used to
> > improve this by communicating the error state between virtio devices and
> > drivers. The device notifies guest upon setting the bit, then the guest driver
> > should detect this bit and report to userspace, or recover the device by
> > resetting it.
> > 
> > This series makes necessary changes in virtio core code, based on which
> > virtio-blk is converted. Other devices now keep the existing behavior by
> > passing in "error_abort". They will be converted in following series. The Linux
> > driver part will also be worked on.
> > 
> > One concern with this behavior change is that it's now harder to notice the
> > actual driver bug that caused the error, as the guest continues to run.  To
> > address that, we could probably add a new error action option to virtio
> > devices,  similar to the "read/write werror" in block layer, so the vm could be
> > paused and the management will get an event in QMP like pvpanic.  This work can
> > be done on top.
> 
> In principle, this looks nice; I'm not sure however how this affects
> non-virtio-1 devices.
> 
> If a device is operating in virtio-1 mode, everything is clearly
> specified: The guest is notified and if it is aware of the NEEDS_RESET
> bit, it can react accordingly.
> 
> But what about legacy devices? Even if they are notified, they don't
> know to check for NEEDS_RESET - and I'm not sure if the undefined
> behaviour after NEEDS_RESET might lead to bigger trouble than killing
> off the guest.
> 

The device should become unresponsive to VQ output until guest issues a reset
through bus commands.  Do you have an example of "big trouble" in mind?

Fam

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
  2015-04-21  7:44   ` Fam Zheng
@ 2015-04-21  8:04     ` Cornelia Huck
  2015-04-21  8:38       ` Fam Zheng
  0 siblings, 1 reply; 54+ messages in thread
From: Cornelia Huck @ 2015-04-21  8:04 UTC (permalink / raw)
  To: Fam Zheng
  Cc: Kevin Wolf, Michael S. Tsirkin, qemu-devel, Aneesh Kumar K.V,
	Stefan Hajnoczi, Amit Shah, Paolo Bonzini

On Tue, 21 Apr 2015 15:44:02 +0800
Fam Zheng <famz@redhat.com> wrote:

> On Mon, 04/20 17:13, Cornelia Huck wrote:
> > On Fri, 17 Apr 2015 15:59:15 +0800
> > Fam Zheng <famz@redhat.com> wrote:
> > 
> > > Currently, virtio code chooses to kill QEMU if the guest passes any invalid
> > > data with vring. That has drawbacks such as losing unsaved data (e.g. when
> > > guest user is writing a very long email), or possible denial of service in
> > > a nested vm use case where virtio device is passed through.
> > > 
> > > virtio-1 has introduced a new status bit "NEEDS RESET" which could be used to
> > > improve this by communicating the error state between virtio devices and
> > > drivers. The device notifies guest upon setting the bit, then the guest driver
> > > should detect this bit and report to userspace, or recover the device by
> > > resetting it.
> > > 
> > > This series makes necessary changes in virtio core code, based on which
> > > virtio-blk is converted. Other devices now keep the existing behavior by
> > > passing in "error_abort". They will be converted in following series. The Linux
> > > driver part will also be worked on.
> > > 
> > > One concern with this behavior change is that it's now harder to notice the
> > > actual driver bug that caused the error, as the guest continues to run.  To
> > > address that, we could probably add a new error action option to virtio
> > > devices,  similar to the "read/write werror" in block layer, so the vm could be
> > > paused and the management will get an event in QMP like pvpanic.  This work can
> > > be done on top.
> > 
> > In principle, this looks nice; I'm not sure however how this affects
> > non-virtio-1 devices.
> > 
> > If a device is operating in virtio-1 mode, everything is clearly
> > specified: The guest is notified and if it is aware of the NEEDS_RESET
> > bit, it can react accordingly.
> > 
> > But what about legacy devices? Even if they are notified, they don't
> > know to check for NEEDS_RESET - and I'm not sure if the undefined
> > behaviour after NEEDS_RESET might lead to bigger trouble than killing
> > off the guest.
> > 
> 
> The device should become unresponsive to VQ output until guest issues a reset
> through bus commands.  Do you have an example of "big trouble" in mind?

I'm not sure what's supposed to happen if NEEDS_RESET is set but not
everything is fenced off. The guest may see that queues have become
unresponsive, but if we don't stop ioeventfds and fence off
notifications, it may easily get into an undefined state internally.
And if it is connected to other guests via networking, having it limp
on may be worse than just killing it off. (Which parts of the data have
been cleanly written to disk and which haven't? How is it going to get
out of that pickle if it has no good idea of what is wrong?)

If I have to debug a non-working guest, I prefer a crashed one with a
clean state over one that has continued running after the error
occurred. For legacy, I vote for killing the guest off as before; a
virtio-1 aware guest can choose to either recover the device via reset,
if possible, or die of its own accord if the problem is non-recoverable.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
  2015-04-21  8:04     ` Cornelia Huck
@ 2015-04-21  8:38       ` Fam Zheng
  2015-04-21  9:08         ` Cornelia Huck
  0 siblings, 1 reply; 54+ messages in thread
From: Fam Zheng @ 2015-04-21  8:38 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Kevin Wolf, Michael S. Tsirkin, qemu-devel, Aneesh Kumar K.V,
	Stefan Hajnoczi, Amit Shah, Paolo Bonzini

On Tue, 04/21 10:04, Cornelia Huck wrote:
> On Tue, 21 Apr 2015 15:44:02 +0800
> Fam Zheng <famz@redhat.com> wrote:
> 
> > On Mon, 04/20 17:13, Cornelia Huck wrote:
> > > On Fri, 17 Apr 2015 15:59:15 +0800
> > > Fam Zheng <famz@redhat.com> wrote:
> > > 
> > > > Currently, virtio code chooses to kill QEMU if the guest passes any invalid
> > > > data with vring. That has drawbacks such as losing unsaved data (e.g. when
> > > > guest user is writing a very long email), or possible denial of service in
> > > > a nested vm use case where virtio device is passed through.
> > > > 
> > > > virtio-1 has introduced a new status bit "NEEDS RESET" which could be used to
> > > > improve this by communicating the error state between virtio devices and
> > > > drivers. The device notifies guest upon setting the bit, then the guest driver
> > > > should detect this bit and report to userspace, or recover the device by
> > > > resetting it.
> > > > 
> > > > This series makes necessary changes in virtio core code, based on which
> > > > virtio-blk is converted. Other devices now keep the existing behavior by
> > > > passing in "error_abort". They will be converted in following series. The Linux
> > > > driver part will also be worked on.
> > > > 
> > > > One concern with this behavior change is that it's now harder to notice the
> > > > actual driver bug that caused the error, as the guest continues to run.  To
> > > > address that, we could probably add a new error action option to virtio
> > > > devices,  similar to the "read/write werror" in block layer, so the vm could be
> > > > paused and the management will get an event in QMP like pvpanic.  This work can
> > > > be done on top.
> > > 
> > > In principle, this looks nice; I'm not sure however how this affects
> > > non-virtio-1 devices.
> > > 
> > > If a device is operating in virtio-1 mode, everything is clearly
> > > specified: The guest is notified and if it is aware of the NEEDS_RESET
> > > bit, it can react accordingly.
> > > 
> > > But what about legacy devices? Even if they are notified, they don't
> > > know to check for NEEDS_RESET - and I'm not sure if the undefined
> > > behaviour after NEEDS_RESET might lead to bigger trouble than killing
> > > off the guest.
> > > 
> > 
> > The device should become unresponsive to VQ output until guest issues a reset
> > through bus commands.  Do you have an example of "big trouble" in mind?
> 
> I'm not sure what's supposed to happen if NEEDS_RESET is set but not
> everything is fenced off. The guest may see that queues have become
> unresponsive, but if we don't stop ioeventfds and fence off
> notifications, it may easily get into an undefined state internally.

Yeah, disabling ioeventfds and notifications is a good idea.

> And if it is connected to other guests via networking, having it limp
> on may be worse than just killing it off. (Which parts of the data have
> been cleanly written to disk and which haven't?

Well, we don't know that even without this series, do we?

> How is it going to get
> out of that pickle if it has no good idea of what is wrong?

If it's virtio-1 compatible, it can reset the device or mark the device
ususable, either way guest gets a chance to save the work.

If it's not, it's merely an unresponsive device, and guest user can
reboot/shutdown.

> 
> If I have to debug a non-working guest, I prefer a crashed one with a
> clean state over one that has continued running after the error
> occurred.

For debugging purpose, crashing is definitely fine (even better :), but only
because we won't have critical applications in guest. It makes sense to user to
avoid the overkiller "exit(1)"'s in QEMU. They don't even generate a core file.
And even if they do, it would be much more painful to recover an unsaved
libreoffice document from a memory core.

Fam

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
  2015-04-21  8:38       ` Fam Zheng
@ 2015-04-21  9:08         ` Cornelia Huck
  2015-04-21  9:16           ` Fam Zheng
  0 siblings, 1 reply; 54+ messages in thread
From: Cornelia Huck @ 2015-04-21  9:08 UTC (permalink / raw)
  To: Fam Zheng
  Cc: Kevin Wolf, Michael S. Tsirkin, qemu-devel, Aneesh Kumar K.V,
	Stefan Hajnoczi, Amit Shah, Paolo Bonzini

On Tue, 21 Apr 2015 16:38:31 +0800
Fam Zheng <famz@redhat.com> wrote:

> On Tue, 04/21 10:04, Cornelia Huck wrote:
> > On Tue, 21 Apr 2015 15:44:02 +0800
> > Fam Zheng <famz@redhat.com> wrote:
> > 
> > > On Mon, 04/20 17:13, Cornelia Huck wrote:
> > > > On Fri, 17 Apr 2015 15:59:15 +0800
> > > > Fam Zheng <famz@redhat.com> wrote:
> > > > 
> > > > > Currently, virtio code chooses to kill QEMU if the guest passes any invalid
> > > > > data with vring. That has drawbacks such as losing unsaved data (e.g. when
> > > > > guest user is writing a very long email), or possible denial of service in
> > > > > a nested vm use case where virtio device is passed through.
> > > > > 
> > > > > virtio-1 has introduced a new status bit "NEEDS RESET" which could be used to
> > > > > improve this by communicating the error state between virtio devices and
> > > > > drivers. The device notifies guest upon setting the bit, then the guest driver
> > > > > should detect this bit and report to userspace, or recover the device by
> > > > > resetting it.
> > > > > 
> > > > > This series makes necessary changes in virtio core code, based on which
> > > > > virtio-blk is converted. Other devices now keep the existing behavior by
> > > > > passing in "error_abort". They will be converted in following series. The Linux
> > > > > driver part will also be worked on.
> > > > > 
> > > > > One concern with this behavior change is that it's now harder to notice the
> > > > > actual driver bug that caused the error, as the guest continues to run.  To
> > > > > address that, we could probably add a new error action option to virtio
> > > > > devices,  similar to the "read/write werror" in block layer, so the vm could be
> > > > > paused and the management will get an event in QMP like pvpanic.  This work can
> > > > > be done on top.
> > > > 
> > > > In principle, this looks nice; I'm not sure however how this affects
> > > > non-virtio-1 devices.
> > > > 
> > > > If a device is operating in virtio-1 mode, everything is clearly
> > > > specified: The guest is notified and if it is aware of the NEEDS_RESET
> > > > bit, it can react accordingly.
> > > > 
> > > > But what about legacy devices? Even if they are notified, they don't
> > > > know to check for NEEDS_RESET - and I'm not sure if the undefined
> > > > behaviour after NEEDS_RESET might lead to bigger trouble than killing
> > > > off the guest.
> > > > 
> > > 
> > > The device should become unresponsive to VQ output until guest issues a reset
> > > through bus commands.  Do you have an example of "big trouble" in mind?
> > 
> > I'm not sure what's supposed to happen if NEEDS_RESET is set but not
> > everything is fenced off. The guest may see that queues have become
> > unresponsive, but if we don't stop ioeventfds and fence off
> > notifications, it may easily get into an undefined state internally.
> 
> Yeah, disabling ioeventfds and notifications is a good idea.
> 
> > And if it is connected to other guests via networking, having it limp
> > on may be worse than just killing it off. (Which parts of the data have
> > been cleanly written to disk and which haven't?
> 
> Well, we don't know that even without this series, do we?

We know it hasn't, as the guest is dead :)

> 
> > How is it going to get
> > out of that pickle if it has no good idea of what is wrong?
> 
> If it's virtio-1 compatible, it can reset the device or mark the device
> ususable, either way guest gets a chance to save the work.

My problem is not with virtio-1 devices; although data certainly can't
be written if the device has become unusable.

> 
> If it's not, it's merely an unresponsive device, and guest user can
> reboot/shutdown.

But how does any management software know? If I'm logged into a system
and I notice that saving my data doesn't complete, I can trigger an
action (although reboot/shutdown may not work anymore if too many
threads are waiting on writeback), but how can an automation system
know? It is probably more useful for those setups to have a hard stop
if recovery is not possible - and for legacy systems, that means
killing the guest afaics.

> 
> > 
> > If I have to debug a non-working guest, I prefer a crashed one with a
> > clean state over one that has continued running after the error
> > occurred.
> 
> For debugging purpose, crashing is definitely fine (even better :), but only
> because we won't have critical applications in guest. 

I would argue even for critical applications. They should have a second
guest as backup :)

> It makes sense to user to
> avoid the overkiller "exit(1)"'s in QEMU. They don't even generate a core file.

Let's keep dying, but use abort? Would that help?

> And even if they do, it would be much more painful to recover an unsaved
> libreoffice document from a memory core.

See my reply above.

My concern is mainly about legacy setups that aren't used interactively.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
  2015-04-21  9:08         ` Cornelia Huck
@ 2015-04-21  9:16           ` Fam Zheng
  2015-04-21  9:55             ` Cornelia Huck
  2015-04-21  9:59             ` Michael S. Tsirkin
  0 siblings, 2 replies; 54+ messages in thread
From: Fam Zheng @ 2015-04-21  9:16 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Kevin Wolf, Michael S. Tsirkin, qemu-devel, Aneesh Kumar K.V,
	Stefan Hajnoczi, Amit Shah, Paolo Bonzini

On Tue, 04/21 11:08, Cornelia Huck wrote:
> On Tue, 21 Apr 2015 16:38:31 +0800
> Fam Zheng <famz@redhat.com> wrote:
> 
> > On Tue, 04/21 10:04, Cornelia Huck wrote:
> > > On Tue, 21 Apr 2015 15:44:02 +0800
> > > Fam Zheng <famz@redhat.com> wrote:
> > > 
> > > > On Mon, 04/20 17:13, Cornelia Huck wrote:
> > > > > On Fri, 17 Apr 2015 15:59:15 +0800
> > > > > Fam Zheng <famz@redhat.com> wrote:
> > > > > 
> > > > > > Currently, virtio code chooses to kill QEMU if the guest passes any invalid
> > > > > > data with vring. That has drawbacks such as losing unsaved data (e.g. when
> > > > > > guest user is writing a very long email), or possible denial of service in
> > > > > > a nested vm use case where virtio device is passed through.
> > > > > > 
> > > > > > virtio-1 has introduced a new status bit "NEEDS RESET" which could be used to
> > > > > > improve this by communicating the error state between virtio devices and
> > > > > > drivers. The device notifies guest upon setting the bit, then the guest driver
> > > > > > should detect this bit and report to userspace, or recover the device by
> > > > > > resetting it.
> > > > > > 
> > > > > > This series makes necessary changes in virtio core code, based on which
> > > > > > virtio-blk is converted. Other devices now keep the existing behavior by
> > > > > > passing in "error_abort". They will be converted in following series. The Linux
> > > > > > driver part will also be worked on.
> > > > > > 
> > > > > > One concern with this behavior change is that it's now harder to notice the
> > > > > > actual driver bug that caused the error, as the guest continues to run.  To
> > > > > > address that, we could probably add a new error action option to virtio
> > > > > > devices,  similar to the "read/write werror" in block layer, so the vm could be
> > > > > > paused and the management will get an event in QMP like pvpanic.  This work can
> > > > > > be done on top.
> > > > > 
> > > > > In principle, this looks nice; I'm not sure however how this affects
> > > > > non-virtio-1 devices.
> > > > > 
> > > > > If a device is operating in virtio-1 mode, everything is clearly
> > > > > specified: The guest is notified and if it is aware of the NEEDS_RESET
> > > > > bit, it can react accordingly.
> > > > > 
> > > > > But what about legacy devices? Even if they are notified, they don't
> > > > > know to check for NEEDS_RESET - and I'm not sure if the undefined
> > > > > behaviour after NEEDS_RESET might lead to bigger trouble than killing
> > > > > off the guest.
> > > > > 
> > > > 
> > > > The device should become unresponsive to VQ output until guest issues a reset
> > > > through bus commands.  Do you have an example of "big trouble" in mind?
> > > 
> > > I'm not sure what's supposed to happen if NEEDS_RESET is set but not
> > > everything is fenced off. The guest may see that queues have become
> > > unresponsive, but if we don't stop ioeventfds and fence off
> > > notifications, it may easily get into an undefined state internally.
> > 
> > Yeah, disabling ioeventfds and notifications is a good idea.
> > 
> > > And if it is connected to other guests via networking, having it limp
> > > on may be worse than just killing it off. (Which parts of the data have
> > > been cleanly written to disk and which haven't?
> > 
> > Well, we don't know that even without this series, do we?
> 
> We know it hasn't, as the guest is dead :)
> 
> > 
> > > How is it going to get
> > > out of that pickle if it has no good idea of what is wrong?
> > 
> > If it's virtio-1 compatible, it can reset the device or mark the device
> > ususable, either way guest gets a chance to save the work.
> 
> My problem is not with virtio-1 devices; although data certainly can't
> be written if the device has become unusable.
> 
> > 
> > If it's not, it's merely an unresponsive device, and guest user can
> > reboot/shutdown.
> 
> But how does any management software know? If I'm logged into a system
> and I notice that saving my data doesn't complete, I can trigger an
> action (although reboot/shutdown may not work anymore if too many
> threads are waiting on writeback), but how can an automation system
> know? It is probably more useful for those setups to have a hard stop
> if recovery is not possible - and for legacy systems, that means
> killing the guest afaics.
> 
> > 
> > > 
> > > If I have to debug a non-working guest, I prefer a crashed one with a
> > > clean state over one that has continued running after the error
> > > occurred.
> > 
> > For debugging purpose, crashing is definitely fine (even better :), but only
> > because we won't have critical applications in guest. 
> 
> I would argue even for critical applications. They should have a second
> guest as backup :)
> 
> > It makes sense to user to
> > avoid the overkiller "exit(1)"'s in QEMU. They don't even generate a core file.
> 
> Let's keep dying, but use abort? Would that help?
> 
> > And even if they do, it would be much more painful to recover an unsaved
> > libreoffice document from a memory core.
> 
> See my reply above.
> 
> My concern is mainly about legacy setups that aren't used interactively.
> 

How about pausing guest and generating an QMP event?

Fam

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] [PATCH 06/18] virtio: Return error from virtqueue_pop
  2015-04-21  7:24     ` Fam Zheng
@ 2015-04-21  9:51       ` Michael S. Tsirkin
  0 siblings, 0 replies; 54+ messages in thread
From: Michael S. Tsirkin @ 2015-04-21  9:51 UTC (permalink / raw)
  To: Fam Zheng
  Cc: Kevin Wolf, qemu-devel, Aneesh Kumar K.V, Stefan Hajnoczi,
	Amit Shah, Paolo Bonzini

On Tue, Apr 21, 2015 at 03:24:25PM +0800, Fam Zheng wrote:
> On Tue, 04/21 08:49, Michael S. Tsirkin wrote:
> > On Fri, Apr 17, 2015 at 03:59:21PM +0800, Fam Zheng wrote:
> > > When getting invalid data from vring, virtqueue_pop used to print an
> > > error and exit.
> > > 
> > > Add an errp parameter so it can return the error to callers.
> > > 
> > > Signed-off-by: Fam Zheng <famz@redhat.com>
> > > ---
> > >  hw/9pfs/virtio-9p.c         |  2 +-
> > >  hw/block/virtio-blk.c       |  2 +-
> > >  hw/char/virtio-serial-bus.c | 10 +++----
> > >  hw/net/virtio-net.c         |  6 ++--
> > >  hw/scsi/virtio-scsi.c       |  2 +-
> > >  hw/virtio/virtio-balloon.c  |  4 +--
> > >  hw/virtio/virtio-rng.c      |  2 +-
> > >  hw/virtio/virtio.c          | 70 ++++++++++++++++++++++++++++++++++-----------
> > >  include/hw/virtio/virtio.h  |  2 +-
> > >  9 files changed, 69 insertions(+), 31 deletions(-)
> > > 
> > > diff --git a/hw/9pfs/virtio-9p.c b/hw/9pfs/virtio-9p.c
> > > index 4964da0..17d0c4a 100644
> > > --- a/hw/9pfs/virtio-9p.c
> > > +++ b/hw/9pfs/virtio-9p.c
> > > @@ -3259,7 +3259,7 @@ void handle_9p_output(VirtIODevice *vdev, VirtQueue *vq)
> > >      ssize_t len;
> > >  
> > >      while ((pdu = alloc_pdu(s)) &&
> > > -            (len = virtqueue_pop(vq, &pdu->elem)) != 0) {
> > > +            (len = virtqueue_pop(vq, &pdu->elem, &error_abort)) != 0) {
> > >          uint8_t *ptr;
> > >          pdu->s = s;
> > >          BUG_ON(pdu->elem.out_num == 0 || pdu->elem.in_num == 0);
> > > diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> > > index f7d8528..0b66ee1 100644
> > > --- a/hw/block/virtio-blk.c
> > > +++ b/hw/block/virtio-blk.c
> > > @@ -191,7 +191,7 @@ static VirtIOBlockReq *virtio_blk_get_request(VirtIOBlock *s)
> > >  {
> > >      VirtIOBlockReq *req = virtio_blk_alloc_request(s);
> > >  
> > > -    if (!virtqueue_pop(s->vq, &req->elem)) {
> > > +    if (!virtqueue_pop(s->vq, &req->elem, &error_abort)) {
> > >          virtio_blk_free_request(req);
> > >          return NULL;
> > >      }
> > > diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
> > > index a56dafc..76a934b 100644
> > > --- a/hw/char/virtio-serial-bus.c
> > > +++ b/hw/char/virtio-serial-bus.c
> > > @@ -94,7 +94,7 @@ static size_t write_to_port(VirtIOSerialPort *port,
> > >      while (offset < size) {
> > >          size_t len;
> > >  
> > > -        if (!virtqueue_pop(vq, &elem)) {
> > > +        if (!virtqueue_pop(vq, &elem, &error_abort)) {
> > >              break;
> > >          }
> > >  
> > > @@ -116,7 +116,7 @@ static void discard_vq_data(VirtQueue *vq, VirtIODevice *vdev)
> > >      if (!virtio_queue_ready(vq)) {
> > >          return;
> > >      }
> > > -    while (virtqueue_pop(vq, &elem)) {
> > > +    while (virtqueue_pop(vq, &elem, &error_abort)) {
> > >          virtqueue_push(vq, &elem, 0);
> > >      }
> > >      virtio_notify(vdev, vq);
> > > @@ -137,7 +137,7 @@ static void do_flush_queued_data(VirtIOSerialPort *port, VirtQueue *vq,
> > >  
> > >          /* Pop an elem only if we haven't left off a previous one mid-way */
> > >          if (!port->elem.out_num) {
> > > -            if (!virtqueue_pop(vq, &port->elem)) {
> > > +            if (!virtqueue_pop(vq, &port->elem, &error_abort)) {
> > >                  break;
> > >              }
> > >              port->iov_idx = 0;
> > > @@ -190,7 +190,7 @@ static size_t send_control_msg(VirtIOSerial *vser, void *buf, size_t len)
> > >      if (!virtio_queue_ready(vq)) {
> > >          return 0;
> > >      }
> > > -    if (!virtqueue_pop(vq, &elem)) {
> > > +    if (!virtqueue_pop(vq, &elem, &error_abort)) {
> > >          return 0;
> > >      }
> > >  
> > > @@ -420,7 +420,7 @@ static void control_out(VirtIODevice *vdev, VirtQueue *vq)
> > >  
> > >      len = 0;
> > >      buf = NULL;
> > > -    while (virtqueue_pop(vq, &elem)) {
> > > +    while (virtqueue_pop(vq, &elem, &error_abort)) {
> > >          size_t cur_len;
> > >  
> > >          cur_len = iov_size(elem.out_sg, elem.out_num);
> > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > index 59f76bc..bbcb51f 100644
> > > --- a/hw/net/virtio-net.c
> > > +++ b/hw/net/virtio-net.c
> > > @@ -804,7 +804,7 @@ static void virtio_net_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
> > >      struct iovec *iov, *iov2;
> > >      unsigned int iov_cnt;
> > >  
> > > -    while (virtqueue_pop(vq, &elem)) {
> > > +    while (virtqueue_pop(vq, &elem, &error_abort)) {
> > >          if (iov_size(elem.in_sg, elem.in_num) < sizeof(status) ||
> > >              iov_size(elem.out_sg, elem.out_num) < sizeof(ctrl)) {
> > >              error_report("virtio-net ctrl missing headers");
> > > @@ -1031,7 +1031,7 @@ static ssize_t virtio_net_receive(NetClientState *nc, const uint8_t *buf, size_t
> > >  
> > >          total = 0;
> > >  
> > > -        if (virtqueue_pop(q->rx_vq, &elem) == 0) {
> > > +        if (virtqueue_pop(q->rx_vq, &elem, &error_abort) == 0) {
> > >              if (i == 0)
> > >                  return -1;
> > >              error_report("virtio-net unexpected empty queue: "
> > > @@ -1134,7 +1134,7 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> > >          return num_packets;
> > >      }
> > >  
> > > -    while (virtqueue_pop(q->tx_vq, &elem)) {
> > > +    while (virtqueue_pop(q->tx_vq, &elem, &error_abort)) {
> > >          ssize_t ret, len;
> > >          unsigned int out_num = elem.out_num;
> > >          struct iovec *out_sg = &elem.out_sg[0];
> > > diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
> > > index c9bea06..40ba03d 100644
> > > --- a/hw/scsi/virtio-scsi.c
> > > +++ b/hw/scsi/virtio-scsi.c
> > > @@ -177,7 +177,7 @@ static int virtio_scsi_parse_req(VirtIOSCSIReq *req,
> > >  static VirtIOSCSIReq *virtio_scsi_pop_req(VirtIOSCSI *s, VirtQueue *vq)
> > >  {
> > >      VirtIOSCSIReq *req = virtio_scsi_init_req(s, vq);
> > > -    if (!virtqueue_pop(vq, &req->elem)) {
> > > +    if (!virtqueue_pop(vq, &req->elem, &error_abort)) {
> > >          virtio_scsi_free_req(req);
> > >          return NULL;
> > >      }
> > > diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> > > index 95b0643..e26c0a7 100644
> > > --- a/hw/virtio/virtio-balloon.c
> > > +++ b/hw/virtio/virtio-balloon.c
> > > @@ -206,7 +206,7 @@ static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
> > >      VirtQueueElement elem;
> > >      MemoryRegionSection section;
> > >  
> > > -    while (virtqueue_pop(vq, &elem)) {
> > > +    while (virtqueue_pop(vq, &elem, &error_abort)) {
> > >          size_t offset = 0;
> > >          uint32_t pfn;
> > >  
> > > @@ -246,7 +246,7 @@ static void virtio_balloon_receive_stats(VirtIODevice *vdev, VirtQueue *vq)
> > >      size_t offset = 0;
> > >      qemu_timeval tv;
> > >  
> > > -    if (!virtqueue_pop(vq, elem)) {
> > > +    if (!virtqueue_pop(vq, elem, &error_abort)) {
> > >          goto out;
> > >      }
> > >  
> > > diff --git a/hw/virtio/virtio-rng.c b/hw/virtio/virtio-rng.c
> > > index 9e1bd75..c3cbdc3 100644
> > > --- a/hw/virtio/virtio-rng.c
> > > +++ b/hw/virtio/virtio-rng.c
> > > @@ -56,7 +56,7 @@ static void chr_read(void *opaque, const void *buf, size_t size)
> > >  
> > >      offset = 0;
> > >      while (offset < size) {
> > > -        if (!virtqueue_pop(vrng->vq, &elem)) {
> > > +        if (!virtqueue_pop(vrng->vq, &elem, &error_abort)) {
> > >              break;
> > >          }
> > >          len = iov_from_buf(elem.in_sg, elem.in_num,
> > > diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> > > index 1cd454b..e6f9f6b 100644
> > > --- a/hw/virtio/virtio.c
> > > +++ b/hw/virtio/virtio.c
> > > @@ -481,29 +481,49 @@ fail:
> > >      error_setg(errp, "virtio: error trying to map MMIO memory");
> > >  }
> > >  
> > > -int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
> > > +static void virtqueue_undo_map_sg(struct iovec *sg, hwaddr *addr,
> > > +                                  size_t num_sg, int is_write)
> > > +{
> > > +    int i;
> > > +
> > > +    for (i = 0; i < num_sg; i++) {
> > > +        cpu_physical_memory_unmap(sg[i].iov_base, sg[i].iov_len,
> > > +                                  is_write, 0);
> > > +    }
> > > +}
> > > +
> > > +int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem, Error **errp)
> > >  {
> > >      unsigned int i, head, max;
> > > +    int ret;
> > >      hwaddr desc_pa = vq->vring.desc;
> > >      VirtIODevice *vdev = vq->vdev;
> > > +    Error *local_err = NULL;
> > >  
> > > -    if (!virtqueue_num_heads(vq, vq->last_avail_idx, &error_abort))
> > > -        return 0;
> > > +    ret = virtqueue_num_heads(vq, vq->last_avail_idx, &local_err);
> > > +    if (ret <= 0) {
> > > +        goto err;
> > > +    }
> > >  
> > 
> > Strange.
> > Doesn't this make pop print an error if ring is empty?
> 
> Code at err label propagates local_err (NULL if empty, hence nop) and returns
> ret (0 if empty). Maybe rename it to "out".

But there's no need to print two errors. virtqueue_num_heads already
prints an error, so just exit here.

> > 
> > 
> > 
> > >      /* When we start there are none of either input nor output. */
> > >      elem->out_num = elem->in_num = 0;
> > >  
> > >      max = vq->vring.num;
> > >  
> > > -    i = head = virtqueue_get_head(vq, vq->last_avail_idx++, &error_abort);
> > > +    ret = virtqueue_get_head(vq, vq->last_avail_idx++, &local_err);
> > > +    if (ret < 0) {
> > > +        goto err;
> > > +    }
> > > +    head = i = ret;
> > > +
> > >      if (virtio_has_feature(vdev, VIRTIO_RING_F_EVENT_IDX)) {
> > >          vring_set_avail_event(vq, vq->last_avail_idx);
> > >      }
> > >  
> > >      if (vring_desc_flags(vdev, desc_pa, i) & VRING_DESC_F_INDIRECT) {
> > >          if (vring_desc_len(vdev, desc_pa, i) % sizeof(VRingDesc)) {
> > > -            error_report("Invalid size for indirect buffer table");
> > > -            exit(1);
> > > +            error_setg(errp, "Invalid size for indirect buffer table");
> > > +            return -EINVAL;
> > 
> > Again, you have both error_setg and a return code.
> > There's no need for this.
> > Just return ring empty on error.
> 
> How would caller distinguish real ring empty from error then?
> 
> Fam

There's no need for it to do this.  When some function detects an error,
it should set the NEEDS_RESET status, and be done with it.  No need to
propagate the error condition back and forth, it would be a bunch of
poorly-tested code.

-- 
MST

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
  2015-04-21  9:16           ` Fam Zheng
@ 2015-04-21  9:55             ` Cornelia Huck
  2015-04-21  9:59             ` Michael S. Tsirkin
  1 sibling, 0 replies; 54+ messages in thread
From: Cornelia Huck @ 2015-04-21  9:55 UTC (permalink / raw)
  To: Fam Zheng
  Cc: Kevin Wolf, Michael S. Tsirkin, qemu-devel, Aneesh Kumar K.V,
	Stefan Hajnoczi, Amit Shah, Paolo Bonzini

On Tue, 21 Apr 2015 17:16:53 +0800
Fam Zheng <famz@redhat.com> wrote:

> On Tue, 04/21 11:08, Cornelia Huck wrote:

> > My concern is mainly about legacy setups that aren't used interactively.
> > 
> 
> How about pausing guest and generating an QMP event?

That sounds good as well.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] [PATCH 04/18] virtio: Return error from virtqueue_next_desc
  2015-04-21  7:30     ` Fam Zheng
@ 2015-04-21  9:56       ` Michael S. Tsirkin
  0 siblings, 0 replies; 54+ messages in thread
From: Michael S. Tsirkin @ 2015-04-21  9:56 UTC (permalink / raw)
  To: Fam Zheng
  Cc: Kevin Wolf, qemu-devel, Aneesh Kumar K.V, Stefan Hajnoczi,
	Amit Shah, Paolo Bonzini

On Tue, Apr 21, 2015 at 03:30:23PM +0800, Fam Zheng wrote:
> On Tue, 04/21 08:37, Michael S. Tsirkin wrote:
> > On Fri, Apr 17, 2015 at 03:59:19PM +0800, Fam Zheng wrote:
> > > Two callers pass error_abort now, which can be changed to check return value
> > > and pass the error on.
> > > 
> > > Signed-off-by: Fam Zheng <famz@redhat.com>
> > > ---
> > >  hw/virtio/virtio.c | 27 ++++++++++++++++++---------
> > >  1 file changed, 18 insertions(+), 9 deletions(-)
> > > 
> > > diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> > > index a525f8e..2a24829 100644
> > > --- a/hw/virtio/virtio.c
> > > +++ b/hw/virtio/virtio.c
> > > @@ -329,10 +329,11 @@ static int virtqueue_get_head(VirtQueue *vq, unsigned int idx,
> > >      return head;
> > >  }
> > >  
> > > -static unsigned virtqueue_next_desc(VirtIODevice *vdev, hwaddr desc_pa,
> > > -                                    unsigned int i, unsigned int max)
> > > +static int virtqueue_next_desc(VirtIODevice *vdev, hwaddr desc_pa,
> > > +                               unsigned int i, unsigned int max,
> > > +                               Error **errp)
> > >  {
> > > -    unsigned int next;
> > > +    int next;
> > >  
> > >      /* If this descriptor says it doesn't chain, we're done. */
> > >      if (!(vring_desc_flags(vdev, desc_pa, i) & VRING_DESC_F_NEXT)) {
> > > @@ -345,8 +346,8 @@ static unsigned virtqueue_next_desc(VirtIODevice *vdev, hwaddr desc_pa,
> > >      smp_wmb();
> > >  
> > >      if (next >= max) {
> > > -        error_report("Desc next is %u", next);
> > > -        exit(1);
> > > +        error_setg(errp, "Desc next is %u", next);
> > > +        return -EINVAL;
> > 
> > I think it's best to return max here. No need to change return type
> > then.
> 
> We use negative return code elsewherer for reporting errors, I personally
> prefer -EINVAL.  Are you concerned about overflow?

Yes.

> > 
> > >      }
> > >  
> > >      return next;
> > > @@ -392,7 +393,7 @@ void virtqueue_get_avail_bytes(VirtQueue *vq, unsigned int *in_bytes,
> > >              num_bufs = i = 0;
> > >          }
> > >  
> > > -        do {
> > > +        while (true) {
> > >              /* If we've got too many, that implies a descriptor loop. */
> > >              if (++num_bufs > max) {
> > >                  error_report("Looped descriptor");
> > > @@ -407,7 +408,11 @@ void virtqueue_get_avail_bytes(VirtQueue *vq, unsigned int *in_bytes,
> > >              if (in_total >= max_in_bytes && out_total >= max_out_bytes) {
> > >                  goto done;
> > >              }
> > > -        } while ((i = virtqueue_next_desc(vdev, desc_pa, i, max)) != max);
> > > +            i = virtqueue_next_desc(vdev, desc_pa, i, max, &error_abort);
> > > +            if (i == max) {
> > > +                break;
> > > +            }
> > > +        }
> > >  
> > >          if (!indirect)
> > >              total_bufs = num_bufs;
> > > @@ -493,7 +498,7 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
> > >      }
> > >  
> > >      /* Collect all the descriptors */
> > > -    do {
> > > +    while (true) {
> > >          struct iovec *sg;
> > >  
> > >          if (vring_desc_flags(vdev, desc_pa, i) & VRING_DESC_F_WRITE) {
> > > @@ -519,7 +524,11 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
> > >              error_report("Looped descriptor");
> > >              exit(1);
> > >          }
> > > -    } while ((i = virtqueue_next_desc(vdev, desc_pa, i, max)) != max);
> > > +        i = virtqueue_next_desc(vdev, desc_pa, i, max, &error_abort);
> > > +        if (i == max) {
> > > +            break;
> > > +        }
> > > +    }
> > >  
> > 
> > Why refactor this as part of this patch?
> 
> Graceful error handling will need to un-inline the loop condition, so refactor
> it as we're touching the line.
> 
> Fam

I don't think adding a ton of untested paths is a good strategy for
error-handling. When you detect an error, report it then go back to
normal path as quickly as possible. In this case, reporting
ring empty to caller will make caller stop which is
exactly what we want.

> > 
> > >      /* Now map what we have collected */
> > >      virtqueue_map_sg(elem->in_sg, elem->in_addr, elem->in_num, 1,
> > > -- 
> > > 1.9.3

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
  2015-04-21  9:16           ` Fam Zheng
  2015-04-21  9:55             ` Cornelia Huck
@ 2015-04-21  9:59             ` Michael S. Tsirkin
  1 sibling, 0 replies; 54+ messages in thread
From: Michael S. Tsirkin @ 2015-04-21  9:59 UTC (permalink / raw)
  To: Fam Zheng
  Cc: Kevin Wolf, qemu-devel, Amit Shah, Aneesh Kumar K.V,
	Stefan Hajnoczi, Cornelia Huck, Paolo Bonzini

On Tue, Apr 21, 2015 at 05:16:53PM +0800, Fam Zheng wrote:
> On Tue, 04/21 11:08, Cornelia Huck wrote:
> > On Tue, 21 Apr 2015 16:38:31 +0800
> > Fam Zheng <famz@redhat.com> wrote:
> > 
> > > On Tue, 04/21 10:04, Cornelia Huck wrote:
> > > > On Tue, 21 Apr 2015 15:44:02 +0800
> > > > Fam Zheng <famz@redhat.com> wrote:
> > > > 
> > > > > On Mon, 04/20 17:13, Cornelia Huck wrote:
> > > > > > On Fri, 17 Apr 2015 15:59:15 +0800
> > > > > > Fam Zheng <famz@redhat.com> wrote:
> > > > > > 
> > > > > > > Currently, virtio code chooses to kill QEMU if the guest passes any invalid
> > > > > > > data with vring. That has drawbacks such as losing unsaved data (e.g. when
> > > > > > > guest user is writing a very long email), or possible denial of service in
> > > > > > > a nested vm use case where virtio device is passed through.
> > > > > > > 
> > > > > > > virtio-1 has introduced a new status bit "NEEDS RESET" which could be used to
> > > > > > > improve this by communicating the error state between virtio devices and
> > > > > > > drivers. The device notifies guest upon setting the bit, then the guest driver
> > > > > > > should detect this bit and report to userspace, or recover the device by
> > > > > > > resetting it.
> > > > > > > 
> > > > > > > This series makes necessary changes in virtio core code, based on which
> > > > > > > virtio-blk is converted. Other devices now keep the existing behavior by
> > > > > > > passing in "error_abort". They will be converted in following series. The Linux
> > > > > > > driver part will also be worked on.
> > > > > > > 
> > > > > > > One concern with this behavior change is that it's now harder to notice the
> > > > > > > actual driver bug that caused the error, as the guest continues to run.  To
> > > > > > > address that, we could probably add a new error action option to virtio
> > > > > > > devices,  similar to the "read/write werror" in block layer, so the vm could be
> > > > > > > paused and the management will get an event in QMP like pvpanic.  This work can
> > > > > > > be done on top.
> > > > > > 
> > > > > > In principle, this looks nice; I'm not sure however how this affects
> > > > > > non-virtio-1 devices.
> > > > > > 
> > > > > > If a device is operating in virtio-1 mode, everything is clearly
> > > > > > specified: The guest is notified and if it is aware of the NEEDS_RESET
> > > > > > bit, it can react accordingly.
> > > > > > 
> > > > > > But what about legacy devices? Even if they are notified, they don't
> > > > > > know to check for NEEDS_RESET - and I'm not sure if the undefined
> > > > > > behaviour after NEEDS_RESET might lead to bigger trouble than killing
> > > > > > off the guest.
> > > > > > 
> > > > > 
> > > > > The device should become unresponsive to VQ output until guest issues a reset
> > > > > through bus commands.  Do you have an example of "big trouble" in mind?
> > > > 
> > > > I'm not sure what's supposed to happen if NEEDS_RESET is set but not
> > > > everything is fenced off. The guest may see that queues have become
> > > > unresponsive, but if we don't stop ioeventfds and fence off
> > > > notifications, it may easily get into an undefined state internally.
> > > 
> > > Yeah, disabling ioeventfds and notifications is a good idea.
> > > 
> > > > And if it is connected to other guests via networking, having it limp
> > > > on may be worse than just killing it off. (Which parts of the data have
> > > > been cleanly written to disk and which haven't?
> > > 
> > > Well, we don't know that even without this series, do we?
> > 
> > We know it hasn't, as the guest is dead :)
> > 
> > > 
> > > > How is it going to get
> > > > out of that pickle if it has no good idea of what is wrong?
> > > 
> > > If it's virtio-1 compatible, it can reset the device or mark the device
> > > ususable, either way guest gets a chance to save the work.
> > 
> > My problem is not with virtio-1 devices; although data certainly can't
> > be written if the device has become unusable.
> > 
> > > 
> > > If it's not, it's merely an unresponsive device, and guest user can
> > > reboot/shutdown.
> > 
> > But how does any management software know? If I'm logged into a system
> > and I notice that saving my data doesn't complete, I can trigger an
> > action (although reboot/shutdown may not work anymore if too many
> > threads are waiting on writeback), but how can an automation system
> > know? It is probably more useful for those setups to have a hard stop
> > if recovery is not possible - and for legacy systems, that means
> > killing the guest afaics.
> > 
> > > 
> > > > 
> > > > If I have to debug a non-working guest, I prefer a crashed one with a
> > > > clean state over one that has continued running after the error
> > > > occurred.
> > > 
> > > For debugging purpose, crashing is definitely fine (even better :), but only
> > > because we won't have critical applications in guest. 
> > 
> > I would argue even for critical applications. They should have a second
> > guest as backup :)
> > 
> > > It makes sense to user to
> > > avoid the overkiller "exit(1)"'s in QEMU. They don't even generate a core file.
> > 
> > Let's keep dying, but use abort? Would that help?
> > 
> > > And even if they do, it would be much more painful to recover an unsaved
> > > libreoffice document from a memory core.
> > 
> > See my reply above.
> > 
> > My concern is mainly about legacy setups that aren't used interactively.
> > 
> 
> How about pausing guest and generating an QMP event?
> 
> Fam

This can be an option, default should be backeards-compatible though.

-- 
MST

^ permalink raw reply	[flat|nested] 54+ messages in thread

end of thread, other threads:[~2015-04-21  9:59 UTC | newest]

Thread overview: 54+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-04-17  7:59 [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET" Fam Zheng
2015-04-17  7:59 ` [Qemu-devel] [PATCH 01/18] virtio: Return error from virtqueue_map_sg Fam Zheng
2015-04-17  7:59 ` [Qemu-devel] [PATCH 02/18] virtio: Return error from virtqueue_num_heads Fam Zheng
2015-04-17  7:59 ` [Qemu-devel] [PATCH 03/18] virtio: Return error from virtqueue_get_head Fam Zheng
2015-04-21  6:27   ` Michael S. Tsirkin
2015-04-17  7:59 ` [Qemu-devel] [PATCH 04/18] virtio: Return error from virtqueue_next_desc Fam Zheng
2015-04-21  6:37   ` Michael S. Tsirkin
2015-04-21  7:30     ` Fam Zheng
2015-04-21  9:56       ` Michael S. Tsirkin
2015-04-17  7:59 ` [Qemu-devel] [PATCH 05/18] virtio: Return error from virtqueue_get_avail_bytes Fam Zheng
2015-04-17  7:59 ` [Qemu-devel] [PATCH 06/18] virtio: Return error from virtqueue_pop Fam Zheng
2015-04-21  6:49   ` Michael S. Tsirkin
2015-04-21  7:24     ` Fam Zheng
2015-04-21  9:51       ` Michael S. Tsirkin
2015-04-17  7:59 ` [Qemu-devel] [PATCH 07/18] virtio: Return error from virtqueue_avail_bytes Fam Zheng
2015-04-17  7:59 ` [Qemu-devel] [PATCH 08/18] virtio: Return error from virtio_add_queue Fam Zheng
2015-04-17  7:59 ` [Qemu-devel] [PATCH 09/18] virtio: Return error from virtio_del_queue Fam Zheng
2015-04-17  7:59 ` [Qemu-devel] [PATCH 10/18] virtio: Add macro for VIRTIO_CONFIG_S_NEEDS_RESET Fam Zheng
2015-04-17  7:59 ` [Qemu-devel] [PATCH 11/18] virtio: Add "needs_reset" flag to virtio device Fam Zheng
2015-04-17  7:59 ` [Qemu-devel] [PATCH 12/18] virtio: Return -EINVAL if the vdev needs reset in virtqueue_pop Fam Zheng
2015-04-17  7:59 ` [Qemu-devel] [PATCH 13/18] virtio-blk: Graceful error handling of virtqueue_pop Fam Zheng
2015-04-17  7:59 ` [Qemu-devel] [PATCH 14/18] qtest: Add "QTEST_FILTER" to filter test cases Fam Zheng
2015-04-17  7:59 ` [Qemu-devel] [PATCH 15/18] qtest: virtio-blk: Extract "setup" for future reuse Fam Zheng
2015-04-17  7:59 ` [Qemu-devel] [PATCH 16/18] libqos: Add qvirtio_needs_reset Fam Zheng
2015-04-17  7:59 ` [Qemu-devel] [PATCH 17/18] qtest: Add test case for "needs reset" of virtio-blk Fam Zheng
2015-04-17  7:59 ` [Qemu-devel] [PATCH 18/18] qtest: virtio-blk: Suppress virtio error messages in "make check" Fam Zheng
2015-04-20 15:13 ` [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET" Cornelia Huck
2015-04-21  7:44   ` Fam Zheng
2015-04-21  8:04     ` Cornelia Huck
2015-04-21  8:38       ` Fam Zheng
2015-04-21  9:08         ` Cornelia Huck
2015-04-21  9:16           ` Fam Zheng
2015-04-21  9:55             ` Cornelia Huck
2015-04-21  9:59             ` Michael S. Tsirkin
2015-04-20 17:36 ` Michael S. Tsirkin
2015-04-20 17:36   ` Michael S. Tsirkin
2015-04-20 19:10   ` [Qemu-devel] " Paolo Bonzini
2015-04-20 19:10     ` Paolo Bonzini
2015-04-20 20:34     ` [Qemu-devel] " Michael S. Tsirkin
2015-04-20 20:34       ` Michael S. Tsirkin
2015-04-21  2:39       ` [Qemu-devel] " Fam Zheng
2015-04-21  2:39         ` Fam Zheng
2015-04-21  6:52       ` [Qemu-devel] " Paolo Bonzini
2015-04-21  6:52         ` Paolo Bonzini
2015-04-21  6:58         ` [Qemu-devel] " Michael S. Tsirkin
2015-04-21  6:58           ` Michael S. Tsirkin
2015-04-21  2:37   ` [Qemu-devel] " Fam Zheng
2015-04-21  2:37     ` Fam Zheng
2015-04-21  5:22     ` Michael S. Tsirkin
2015-04-21  5:22       ` Michael S. Tsirkin
2015-04-21  5:50       ` Fam Zheng
2015-04-21  5:50         ` Fam Zheng
2015-04-21  6:09         ` Michael S. Tsirkin
2015-04-21  6:09           ` Michael S. Tsirkin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.