* [Qemu-devel] [PATCH 0/3] Performance improvements for xen_disk v2
@ 2018-11-02 10:00 Tim Smith
  2018-11-02 10:00 ` [PATCH 1/3] Improve xen_disk batching behaviour Tim Smith
                   ` (7 more replies)
  0 siblings, 8 replies; 49+ messages in thread
From: Tim Smith @ 2018-11-02 10:00 UTC (permalink / raw)
  To: xen-devel, qemu-devel, qemu-block
  Cc: Anthony Perard, Kevin Wolf, Paul Durrant, Stefano Stabellini, Max Reitz

A series of performance improvements for disks using the Xen PV ring.

These have had fairly extensive testing.

The batching and latency improvements together boost the throughput
of small reads and writes by two to six percent (measured using fio
in the guest).

Avoiding repeated calls to posix_memalign() reduced the dirty heap
from 25MB to 5MB in the case of a single datapath process while also
improving performance.

v2 removes some checkpatch complaints and fixes the CCs

---

Tim Smith (3):
      Improve xen_disk batching behaviour
      Improve xen_disk response latency
      Avoid repeated memory allocation in xen_disk


 hw/block/xen_disk.c |   82 +++++++++++++++++++++++++++++----------------------
 1 file changed, 46 insertions(+), 36 deletions(-)

--
Tim Smith <tim.smith@citrix.com>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [Qemu-devel] [PATCH 1/3] Improve xen_disk batching behaviour
  2018-11-02 10:00 [Qemu-devel] [PATCH 0/3] Performance improvements for xen_disk v2 Tim Smith
  2018-11-02 10:00 ` [PATCH 1/3] Improve xen_disk batching behaviour Tim Smith
@ 2018-11-02 10:00 ` Tim Smith
  2018-11-02 11:14   ` Paul Durrant
                     ` (3 more replies)
  2018-11-02 10:01 ` [Qemu-devel] [PATCH 2/3] Improve xen_disk response latency Tim Smith
                   ` (5 subsequent siblings)
  7 siblings, 4 replies; 49+ messages in thread
From: Tim Smith @ 2018-11-02 10:00 UTC (permalink / raw)
  To: xen-devel, qemu-devel, qemu-block
  Cc: Anthony Perard, Kevin Wolf, Paul Durrant, Stefano Stabellini, Max Reitz

When I/O consists of many small requests, performance is improved by
batching them together in a single io_submit() call. When there are
relatively few requests, the extra overhead is not worth it. This
introduces a check to start batching I/O requests via blk_io_plug()/
blk_io_unplug() in an amount proportional to the number which were
already in flight at the time we started reading the ring.

Signed-off-by: Tim Smith <tim.smith@citrix.com>
---
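Below is a minimal, self-contained sketch of the batching pattern the
description above refers to: a condensed, compilable mirror of the logic
added to blk_handle_requests(), with io_plug()/io_unplug()/submit()
stubs standing in for QEMU's blk_io_plug()/blk_io_unplug() and for real
request submission, and with made-up request counts in main().

#include <stdio.h>

#define IO_PLUG_THRESHOLD 1

/* Stubs standing in for blk_io_plug()/blk_io_unplug() and for
 * submitting one request; they only print so the batching is visible. */
static void io_plug(void)   { printf("plug\n"); }
static void io_unplug(void) { printf("unplug -> submit batch\n"); }
static void submit(int req) { printf("  request %d\n", req); }

/* Batch only when more than IO_PLUG_THRESHOLD requests were already in
 * flight at entry, and cap each batch at that in-flight count. */
static void handle_requests(int inflight_atstart, int nreqs)
{
    int batched = 0;
    int req;

    if (inflight_atstart > IO_PLUG_THRESHOLD) {
        io_plug();
    }
    for (req = 0; req < nreqs; req++) {
        if (inflight_atstart > IO_PLUG_THRESHOLD &&
            batched >= inflight_atstart) {
            io_unplug();              /* flush a full batch */
        }
        submit(req);
        if (inflight_atstart > IO_PLUG_THRESHOLD) {
            if (batched >= inflight_atstart) {
                io_plug();            /* start the next batch */
                batched = 0;
            } else {
                batched++;
            }
        }
    }
    if (inflight_atstart > IO_PLUG_THRESHOLD) {
        io_unplug();                  /* flush whatever remains */
    }
}

int main(void)
{
    handle_requests(0, 3);   /* idle ring: submit each request at once */
    handle_requests(4, 10);  /* busy ring: submit in batches of ~4 */
    return 0;
}

In the real device the batch is flushed by blk_io_unplug(), which is
what lets the block layer issue the accumulated requests in a single
io_submit() call.
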
 hw/block/xen_disk.c |   30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
index 36eff94f84..cb2881b7e6 100644
--- a/hw/block/xen_disk.c
+++ b/hw/block/xen_disk.c
@@ -101,6 +101,9 @@ struct XenBlkDev {
     AioContext          *ctx;
 };
 
+/* Threshold of in-flight requests above which we will start using
+ * blk_io_plug()/blk_io_unplug() to batch requests */
+#define IO_PLUG_THRESHOLD 1
 /* ------------------------------------------------------------- */
 
 static void ioreq_reset(struct ioreq *ioreq)
@@ -542,6 +545,8 @@ static void blk_handle_requests(struct XenBlkDev *blkdev)
 {
     RING_IDX rc, rp;
     struct ioreq *ioreq;
+    int inflight_atstart = blkdev->requests_inflight;
+    int batched = 0;
 
     blkdev->more_work = 0;
 
@@ -550,6 +555,16 @@ static void blk_handle_requests(struct XenBlkDev *blkdev)
     xen_rmb(); /* Ensure we see queued requests up to 'rp'. */
 
     blk_send_response_all(blkdev);
+    /* If there were more than IO_PLUG_THRESHOLD ioreqs in flight
+     * when we got here, this is an indication that the bottleneck
+     * is below us, so it's worth beginning to batch up I/O requests
+     * rather than submitting them immediately. The maximum number
+     * of requests we're willing to batch is the number already in
+     * flight, so it can grow up to max_requests when the bottleneck
+     * is below us */
+    if (inflight_atstart > IO_PLUG_THRESHOLD) {
+        blk_io_plug(blkdev->blk);
+    }
     while (rc != rp) {
         /* pull request from ring */
         if (RING_REQUEST_CONS_OVERFLOW(&blkdev->rings.common, rc)) {
@@ -589,7 +604,22 @@ static void blk_handle_requests(struct XenBlkDev *blkdev)
             continue;
         }
 
+        if (inflight_atstart > IO_PLUG_THRESHOLD &&
+            batched >= inflight_atstart) {
+            blk_io_unplug(blkdev->blk);
+        }
         ioreq_runio_qemu_aio(ioreq);
+        if (inflight_atstart > IO_PLUG_THRESHOLD) {
+            if (batched >= inflight_atstart) {
+                blk_io_plug(blkdev->blk);
+                batched = 0;
+            } else {
+                batched++;
+            }
+        }
+    }
+    if (inflight_atstart > IO_PLUG_THRESHOLD) {
+        blk_io_unplug(blkdev->blk);
     }
 
     if (blkdev->more_work && blkdev->requests_inflight < blkdev->max_requests) {

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [Qemu-devel] [PATCH 2/3] Improve xen_disk response latency
  2018-11-02 10:00 [Qemu-devel] [PATCH 0/3] Performance improvements for xen_disk v2 Tim Smith
  2018-11-02 10:00 ` [PATCH 1/3] Improve xen_disk batching behaviour Tim Smith
  2018-11-02 10:00 ` [Qemu-devel] " Tim Smith
@ 2018-11-02 10:01 ` Tim Smith
  2018-11-02 11:14     ` Paul Durrant
                     ` (2 more replies)
  2018-11-02 10:01 ` Tim Smith
                   ` (4 subsequent siblings)
  7 siblings, 3 replies; 49+ messages in thread
From: Tim Smith @ 2018-11-02 10:01 UTC (permalink / raw)
  To: xen-devel, qemu-devel, qemu-block
  Cc: Anthony Perard, Kevin Wolf, Paul Durrant, Stefano Stabellini, Max Reitz

If the I/O ring is full, the guest cannot send any more requests
until some responses are sent. Only sending all available responses
just before checking for new work does not leave much time for the
guest to supply new work, so this will cause stalls if the ring gets
full. Also, not completing reads as soon as possible adds latency
to the guest.

To alleviate that, complete I/O requests as soon as they come back.
blk_send_response() already returns a value indicating whether
a notify should be sent, which is all the batching we need.

Signed-off-by: Tim Smith <tim.smith@citrix.com>
---
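Below is a minimal, self-contained sketch of the completion path this
patch moves to; send_response(), send_notify() and release() are
stand-ins for blk_send_response(), xen_pv_send_notify() and
ioreq_release(), and the notify decision here is hard-coded rather than
derived from the ring state.

#include <stdbool.h>
#include <stdio.h>

/* Stand-ins for blk_send_response(), xen_pv_send_notify() and
 * ioreq_release(); blk_send_response() reports whether the front end
 * needs to be notified, which here is simply hard-coded. */
static bool send_response(int req)
{
    printf("response for request %d\n", req);
    return true;                  /* pretend a notify is always wanted */
}

static void send_notify(void) { printf("  notify guest\n"); }
static void release(int req)  { printf("  recycle ioreq %d\n", req); }

/* Shape of the completion callback after the patch: respond, notify if
 * required and recycle the ioreq immediately, instead of parking it on
 * a 'finished' list to be flushed later by blk_send_response_all(). */
static void aio_complete(int req)
{
    if (send_response(req)) {
        send_notify();
    }
    release(req);
}

int main(void)
{
    int req;

    for (req = 0; req < 3; req++) {
        aio_complete(req);        /* each completion handled at once */
    }
    return 0;
}

Keeping the notify decision inside blk_send_response() is what makes
any further batching on the response path unnecessary.
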
 hw/block/xen_disk.c |   43 ++++++++++++-------------------------------
 1 file changed, 12 insertions(+), 31 deletions(-)

diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
index cb2881b7e6..b506e23868 100644
--- a/hw/block/xen_disk.c
+++ b/hw/block/xen_disk.c
@@ -83,11 +83,9 @@ struct XenBlkDev {
 
     /* request lists */
     QLIST_HEAD(inflight_head, ioreq) inflight;
-    QLIST_HEAD(finished_head, ioreq) finished;
     QLIST_HEAD(freelist_head, ioreq) freelist;
     int                 requests_total;
     int                 requests_inflight;
-    int                 requests_finished;
     unsigned int        max_requests;
 
     gboolean            feature_discard;
@@ -104,6 +102,9 @@ struct XenBlkDev {
 /* Threshold of in-flight requests above which we will start using
  * blk_io_plug()/blk_io_unplug() to batch requests */
 #define IO_PLUG_THRESHOLD 1
+
+static int blk_send_response(struct ioreq *ioreq);
+
 /* ------------------------------------------------------------- */
 
 static void ioreq_reset(struct ioreq *ioreq)
@@ -155,12 +156,10 @@ static void ioreq_finish(struct ioreq *ioreq)
     struct XenBlkDev *blkdev = ioreq->blkdev;
 
     QLIST_REMOVE(ioreq, list);
-    QLIST_INSERT_HEAD(&blkdev->finished, ioreq, list);
     blkdev->requests_inflight--;
-    blkdev->requests_finished++;
 }
 
-static void ioreq_release(struct ioreq *ioreq, bool finish)
+static void ioreq_release(struct ioreq *ioreq)
 {
     struct XenBlkDev *blkdev = ioreq->blkdev;
 
@@ -168,11 +167,7 @@ static void ioreq_release(struct ioreq *ioreq, bool finish)
     ioreq_reset(ioreq);
     ioreq->blkdev = blkdev;
     QLIST_INSERT_HEAD(&blkdev->freelist, ioreq, list);
-    if (finish) {
-        blkdev->requests_finished--;
-    } else {
-        blkdev->requests_inflight--;
-    }
+    blkdev->requests_inflight--;
 }
 
 /*
@@ -351,6 +346,10 @@ static void qemu_aio_complete(void *opaque, int ret)
     default:
         break;
     }
+    if (blk_send_response(ioreq)) {
+        xen_pv_send_notify(&blkdev->xendev);
+    }
+    ioreq_release(ioreq);
     qemu_bh_schedule(blkdev->bh);
 
 done:
@@ -455,7 +454,7 @@ err:
     return -1;
 }
 
-static int blk_send_response_one(struct ioreq *ioreq)
+static int blk_send_response(struct ioreq *ioreq)
 {
     struct XenBlkDev  *blkdev = ioreq->blkdev;
     int               send_notify   = 0;
@@ -504,22 +503,6 @@ static int blk_send_response_one(struct ioreq *ioreq)
     return send_notify;
 }
 
-/* walk finished list, send outstanding responses, free requests */
-static void blk_send_response_all(struct XenBlkDev *blkdev)
-{
-    struct ioreq *ioreq;
-    int send_notify = 0;
-
-    while (!QLIST_EMPTY(&blkdev->finished)) {
-        ioreq = QLIST_FIRST(&blkdev->finished);
-        send_notify += blk_send_response_one(ioreq);
-        ioreq_release(ioreq, true);
-    }
-    if (send_notify) {
-        xen_pv_send_notify(&blkdev->xendev);
-    }
-}
-
 static int blk_get_request(struct XenBlkDev *blkdev, struct ioreq *ioreq, RING_IDX rc)
 {
     switch (blkdev->protocol) {
@@ -554,7 +537,6 @@ static void blk_handle_requests(struct XenBlkDev *blkdev)
     rp = blkdev->rings.common.sring->req_prod;
     xen_rmb(); /* Ensure we see queued requests up to 'rp'. */
 
-    blk_send_response_all(blkdev);
     /* If there were more than IO_PLUG_THRESHOLD ioreqs in flight
      * when we got here, this is an indication that the bottleneck
      * is below us, so it's worth beginning to batch up I/O requests
@@ -597,10 +579,10 @@ static void blk_handle_requests(struct XenBlkDev *blkdev)
                 break;
             };
 
-            if (blk_send_response_one(ioreq)) {
+            if (blk_send_response(ioreq)) {
                 xen_pv_send_notify(&blkdev->xendev);
             }
-            ioreq_release(ioreq, false);
+            ioreq_release(ioreq);
             continue;
         }
 
@@ -646,7 +628,6 @@ static void blk_alloc(struct XenDevice *xendev)
     trace_xen_disk_alloc(xendev->name);
 
     QLIST_INIT(&blkdev->inflight);
-    QLIST_INIT(&blkdev->finished);
     QLIST_INIT(&blkdev->freelist);
 
     blkdev->iothread = iothread_create(xendev->name, &err);

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [Qemu-devel] [PATCH 3/3] Avoid repeated memory allocation in xen_disk
  2018-11-02 10:00 [Qemu-devel] [PATCH 0/3] Performance improvements for xen_disk v2 Tim Smith
                   ` (4 preceding siblings ...)
  2018-11-02 10:01 ` [PATCH 3/3] Avoid repeated memory allocation in xen_disk Tim Smith
@ 2018-11-02 10:01 ` Tim Smith
  2018-11-02 11:15     ` Paul Durrant
  2018-11-02 13:53     ` Anthony PERARD
  2018-11-02 11:04 ` xen_disk qdevification (was: [PATCH 0/3] Performance improvements for xen_disk v2) Kevin Wolf
  2018-11-02 11:04 ` [Qemu-devel] " Kevin Wolf
  7 siblings, 2 replies; 49+ messages in thread
From: Tim Smith @ 2018-11-02 10:01 UTC (permalink / raw)
  To: xen-devel, qemu-devel, qemu-block
  Cc: Anthony Perard, Kevin Wolf, Paul Durrant, Stefano Stabellini, Max Reitz

xen_disk currently allocates memory to hold the data for each ioreq
as that ioreq is used, and frees it afterwards. Because it requires
page-aligned blocks, this interacts poorly with non-page-aligned
allocations and balloons the heap.

Instead, allocate the maximum possible requirement, which is
BLKIF_MAX_SEGMENTS_PER_REQUEST pages (currently 11 pages) when
the ioreq is created, and keep that allocation until it is destroyed.
Since the ioreqs themselves are re-used via a free list, this
should actually improve memory usage.

Signed-off-by: Tim Smith <tim.smith@citrix.com>
---
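Below is a minimal, self-contained sketch of the allocation lifetime
this patch introduces; posix_memalign()/free() stand in for
qemu_memalign()/qemu_vfree(), and the page size and segment count are
illustrative constants in place of XC_PAGE_SIZE and
BLKIF_MAX_SEGMENTS_PER_REQUEST.

#define _POSIX_C_SOURCE 200112L
#include <stdio.h>
#include <stdlib.h>

/* Illustrative values in place of XC_PAGE_SIZE and
 * BLKIF_MAX_SEGMENTS_PER_REQUEST (11 pages, as noted above). */
#define PAGE_SIZE        4096
#define MAX_SEGS_PER_REQ 11

struct ioreq_buf {
    void *buf;               /* page-aligned, sized for the worst case */
};

/* Allocate the worst-case buffer once, when the ioreq is created
 * (posix_memalign() standing in for qemu_memalign())... */
static int ioreq_buf_init(struct ioreq_buf *io)
{
    return posix_memalign(&io->buf, PAGE_SIZE,
                          (size_t)MAX_SEGS_PER_REQ * PAGE_SIZE);
}

/* ...and free it only when the ioreq itself is destroyed, so every
 * request the ioreq ever carries re-uses the same allocation. */
static void ioreq_buf_destroy(struct ioreq_buf *io)
{
    free(io->buf);
    io->buf = NULL;
}

int main(void)
{
    struct ioreq_buf io;

    if (ioreq_buf_init(&io) != 0) {
        return 1;
    }
    printf("allocated %d bytes once, re-used for every request\n",
           MAX_SEGS_PER_REQ * PAGE_SIZE);
    ioreq_buf_destroy(&io);
    return 0;
}

This is the allocation pattern the cover letter credits with shrinking
the dirty heap from 25MB to 5MB for a single datapath process.
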
 hw/block/xen_disk.c |   11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
index b506e23868..faaeefba29 100644
--- a/hw/block/xen_disk.c
+++ b/hw/block/xen_disk.c
@@ -112,7 +112,6 @@ static void ioreq_reset(struct ioreq *ioreq)
     memset(&ioreq->req, 0, sizeof(ioreq->req));
     ioreq->status = 0;
     ioreq->start = 0;
-    ioreq->buf = NULL;
     ioreq->size = 0;
     ioreq->presync = 0;
 
@@ -137,6 +136,11 @@ static struct ioreq *ioreq_start(struct XenBlkDev *blkdev)
         /* allocate new struct */
         ioreq = g_malloc0(sizeof(*ioreq));
         ioreq->blkdev = blkdev;
+        /* We cannot need more pages per ioreq than this, and we do re-use
+         * ioreqs, so allocate the memory once here, to be freed in
+         * blk_free() when the ioreq is freed. */
+        ioreq->buf = qemu_memalign(XC_PAGE_SIZE, BLKIF_MAX_SEGMENTS_PER_REQUEST
+                                   * XC_PAGE_SIZE);
         blkdev->requests_total++;
         qemu_iovec_init(&ioreq->v, 1);
     } else {
@@ -313,14 +317,12 @@ static void qemu_aio_complete(void *opaque, int ret)
         if (ret == 0) {
             ioreq_grant_copy(ioreq);
         }
-        qemu_vfree(ioreq->buf);
         break;
     case BLKIF_OP_WRITE:
     case BLKIF_OP_FLUSH_DISKCACHE:
         if (!ioreq->req.nr_segments) {
             break;
         }
-        qemu_vfree(ioreq->buf);
         break;
     default:
         break;
@@ -392,12 +394,10 @@ static int ioreq_runio_qemu_aio(struct ioreq *ioreq)
 {
     struct XenBlkDev *blkdev = ioreq->blkdev;
 
-    ioreq->buf = qemu_memalign(XC_PAGE_SIZE, ioreq->size);
     if (ioreq->req.nr_segments &&
         (ioreq->req.operation == BLKIF_OP_WRITE ||
          ioreq->req.operation == BLKIF_OP_FLUSH_DISKCACHE) &&
         ioreq_grant_copy(ioreq)) {
-        qemu_vfree(ioreq->buf);
         goto err;
     }
 
@@ -990,6 +990,7 @@ static int blk_free(struct XenDevice *xendev)
         ioreq = QLIST_FIRST(&blkdev->freelist);
         QLIST_REMOVE(ioreq, list);
         qemu_iovec_destroy(&ioreq->v);
+        qemu_vfree(ioreq->buf);
         g_free(ioreq);
     }
 

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [Qemu-devel] xen_disk qdevification (was: [PATCH 0/3] Performance improvements for xen_disk v2)
  2018-11-02 10:00 [Qemu-devel] [PATCH 0/3] Performance improvements for xen_disk v2 Tim Smith
                   ` (6 preceding siblings ...)
  2018-11-02 11:04 ` xen_disk qdevification (was: [PATCH 0/3] Performance improvements for xen_disk v2) Kevin Wolf
@ 2018-11-02 11:04 ` Kevin Wolf
  2018-11-02 11:13   ` Paul Durrant
                     ` (3 more replies)
  7 siblings, 4 replies; 49+ messages in thread
From: Kevin Wolf @ 2018-11-02 11:04 UTC (permalink / raw)
  To: Tim Smith
  Cc: xen-devel, qemu-devel, qemu-block, Anthony Perard, Paul Durrant,
	Stefano Stabellini, Max Reitz, armbru

On 02.11.2018 at 11:00, Tim Smith wrote:
> A series of performance improvements for disks using the Xen PV ring.
> 
> These have had fairly extensive testing.
> 
> The batching and latency improvements together boost the throughput
> of small reads and writes by two to six percent (measured using fio
> in the guest)
> 
> Avoiding repeated calls to posix_memalign() reduced the dirty heap
> from 25MB to 5MB in the case of a single datapath process while also
> improving performance.
> 
> v2 removes some checkpatch complaints and fixes the CCs

Completely unrelated, but since you're the first person touching
xen_disk in a while, you're my victim:

At KVM Forum we discussed sending a patch to deprecate xen_disk because
after all those years, it still hasn't been converted to qdev. Markus is
currently fixing some other not yet qdevified block device, but after
that xen_disk will be the only one left.

A while ago, a downstream patch review found that some QMP commands
would immediately crash if a xen_disk device were present, because of
the missing qdevification. This is not the code quality standard I
envision for QEMU. It's time for non-qdev devices to go.

So if you guys are still interested in the device, could someone please
finally look into converting it?

Kevin

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Qemu-devel] xen_disk qdevification (was: [PATCH 0/3] Performance improvements for xen_disk v2)
  2018-11-02 11:04 ` [Qemu-devel] " Kevin Wolf
  2018-11-02 11:13   ` Paul Durrant
@ 2018-11-02 11:13   ` Paul Durrant
  2018-11-02 12:14     ` Kevin Wolf
                       ` (2 more replies)
  2018-12-12  8:59   ` [Qemu-devel] [Xen-devel] xen_disk qdevification (was: [PATCH 0/3] Performance improvements for xen_disk v2) Olaf Hering
  2018-12-12  8:59   ` xen_disk qdevification (was: [PATCH 0/3] Performance improvements for xen_disk v2) Olaf Hering
  3 siblings, 3 replies; 49+ messages in thread
From: Paul Durrant @ 2018-11-02 11:13 UTC (permalink / raw)
  To: 'Kevin Wolf', Tim Smith
  Cc: xen-devel, qemu-devel, qemu-block, Anthony Perard,
	Stefano Stabellini, Max Reitz, armbru

> -----Original Message-----
> From: Kevin Wolf [mailto:kwolf@redhat.com]
> Sent: 02 November 2018 11:04
> To: Tim Smith <tim.smith@citrix.com>
> Cc: xen-devel@lists.xenproject.org; qemu-devel@nongnu.org; qemu-
> block@nongnu.org; Anthony Perard <anthony.perard@citrix.com>; Paul Durrant
> <Paul.Durrant@citrix.com>; Stefano Stabellini <sstabellini@kernel.org>;
> Max Reitz <mreitz@redhat.com>; armbru@redhat.com
> Subject: xen_disk qdevification (was: [PATCH 0/3] Performance improvements
> for xen_disk v2)
> 
> Am 02.11.2018 um 11:00 hat Tim Smith geschrieben:
> > A series of performance improvements for disks using the Xen PV ring.
> >
> > These have had fairly extensive testing.
> >
> > The batching and latency improvements together boost the throughput
> > of small reads and writes by two to six percent (measured using fio
> > in the guest)
> >
> > Avoiding repeated calls to posix_memalign() reduced the dirty heap
> > from 25MB to 5MB in the case of a single datapath process while also
> > improving performance.
> >
> > v2 removes some checkpatch complaints and fixes the CCs
> 
> Completely unrelated, but since you're the first person touching
> xen_disk in a while, you're my victim:
> 
> At KVM Forum we discussed sending a patch to deprecate xen_disk because
> after all those years, it still hasn't been converted to qdev. Markus is
> currently fixing some other not yet qdevified block device, but after
> that xen_disk will be the only one left.
> 
> A while ago, a downstream patch review found out that there are some QMP
> commands that would immediately crash if a xen_disk device were present
> because of the lacking qdevification. This is not the code quality
> standard I envision for QEMU. It's time for non-qdev devices to go.
> 
> So if you guys are still interested in the device, could someone please
> finally look into converting it?
> 

I have a patch series to do exactly this. It's somewhat involved as I need to convert the whole PV backend infrastructure. I will try to rebase and clean up my series a.s.a.p.

  Paul

> Kevin

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] Improve xen_disk batching behaviour
  2018-11-02 10:00 ` [Qemu-devel] " Tim Smith
@ 2018-11-02 11:14   ` Paul Durrant
  2018-11-02 11:14   ` Paul Durrant
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 49+ messages in thread
From: Paul Durrant @ 2018-11-02 11:14 UTC (permalink / raw)
  To: Tim Smith, xen-devel, qemu-devel, qemu-block
  Cc: Anthony Perard, Kevin Wolf, Stefano Stabellini, Max Reitz

> -----Original Message-----
> From: Tim Smith [mailto:tim.smith@citrix.com]
> Sent: 02 November 2018 10:01
> To: xen-devel@lists.xenproject.org; qemu-devel@nongnu.org; qemu-
> block@nongnu.org
> Cc: Anthony Perard <anthony.perard@citrix.com>; Kevin Wolf
> <kwolf@redhat.com>; Paul Durrant <Paul.Durrant@citrix.com>; Stefano
> Stabellini <sstabellini@kernel.org>; Max Reitz <mreitz@redhat.com>
> Subject: [PATCH 1/3] Improve xen_disk batching behaviour
> 
> When I/O consists of many small requests, performance is improved by
> batching them together in a single io_submit() call. When there are
> relatively few requests, the extra overhead is not worth it. This
> introduces a check to start batching I/O requests via blk_io_plug()/
> blk_io_unplug() in an amount proportional to the number which were
> already in flight at the time we started reading the ring.
> 
> Signed-off-by: Tim Smith <tim.smith@citrix.com>

Reviewed-by: Paul Durrant <paul.durrant@citrix.com>

> ---
>  hw/block/xen_disk.c |   30 ++++++++++++++++++++++++++++++
>  1 file changed, 30 insertions(+)
> 
> diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
> index 36eff94f84..cb2881b7e6 100644
> --- a/hw/block/xen_disk.c
> +++ b/hw/block/xen_disk.c
> @@ -101,6 +101,9 @@ struct XenBlkDev {
>      AioContext          *ctx;
>  };
> 
> +/* Threshold of in-flight requests above which we will start using
> + * blk_io_plug()/blk_io_unplug() to batch requests */
> +#define IO_PLUG_THRESHOLD 1
>  /* ------------------------------------------------------------- */
> 
>  static void ioreq_reset(struct ioreq *ioreq)
> @@ -542,6 +545,8 @@ static void blk_handle_requests(struct XenBlkDev
> *blkdev)
>  {
>      RING_IDX rc, rp;
>      struct ioreq *ioreq;
> +    int inflight_atstart = blkdev->requests_inflight;
> +    int batched = 0;
> 
>      blkdev->more_work = 0;
> 
> @@ -550,6 +555,16 @@ static void blk_handle_requests(struct XenBlkDev
> *blkdev)
>      xen_rmb(); /* Ensure we see queued requests up to 'rp'. */
> 
>      blk_send_response_all(blkdev);
> +    /* If there was more than IO_PLUG_THRESHOLD ioreqs in flight
> +     * when we got here, this is an indication that there the bottleneck
> +     * is below us, so it's worth beginning to batch up I/O requests
> +     * rather than submitting them immediately. The maximum number
> +     * of requests we're willing to batch is the number already in
> +     * flight, so it can grow up to max_requests when the bottleneck
> +     * is below us */
> +    if (inflight_atstart > IO_PLUG_THRESHOLD) {
> +        blk_io_plug(blkdev->blk);
> +    }
>      while (rc != rp) {
>          /* pull request from ring */
>          if (RING_REQUEST_CONS_OVERFLOW(&blkdev->rings.common, rc)) {
> @@ -589,7 +604,22 @@ static void blk_handle_requests(struct XenBlkDev
> *blkdev)
>              continue;
>          }
> 
> +        if (inflight_atstart > IO_PLUG_THRESHOLD &&
> +            batched >= inflight_atstart) {
> +            blk_io_unplug(blkdev->blk);
> +        }
>          ioreq_runio_qemu_aio(ioreq);
> +        if (inflight_atstart > IO_PLUG_THRESHOLD) {
> +            if (batched >= inflight_atstart) {
> +                blk_io_plug(blkdev->blk);
> +                batched = 0;
> +            } else {
> +                batched++;
> +            }
> +        }
> +    }
> +    if (inflight_atstart > IO_PLUG_THRESHOLD) {
> +        blk_io_unplug(blkdev->blk);
>      }
> 
>      if (blkdev->more_work && blkdev->requests_inflight < blkdev-
> >max_requests) {


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Qemu-devel] [PATCH 2/3] Improve xen_disk response latency
  2018-11-02 10:01 ` [Qemu-devel] [PATCH 2/3] Improve xen_disk response latency Tim Smith
@ 2018-11-02 11:14     ` Paul Durrant
  2018-11-02 13:53   ` Anthony PERARD
  2018-11-02 13:53   ` [Qemu-devel] " Anthony PERARD
  2 siblings, 0 replies; 49+ messages in thread
From: Paul Durrant @ 2018-11-02 11:14 UTC (permalink / raw)
  To: Tim Smith, xen-devel, qemu-devel, qemu-block
  Cc: Anthony Perard, Kevin Wolf, Stefano Stabellini, Max Reitz



> -----Original Message-----
> From: Tim Smith [mailto:tim.smith@citrix.com]
> Sent: 02 November 2018 10:01
> To: xen-devel@lists.xenproject.org; qemu-devel@nongnu.org; qemu-
> block@nongnu.org
> Cc: Anthony Perard <anthony.perard@citrix.com>; Kevin Wolf
> <kwolf@redhat.com>; Paul Durrant <Paul.Durrant@citrix.com>; Stefano
> Stabellini <sstabellini@kernel.org>; Max Reitz <mreitz@redhat.com>
> Subject: [PATCH 2/3] Improve xen_disk response latency
> 
> If the I/O ring is full, the guest cannot send any more requests
> until some responses are sent. Only sending all available responses
> just before checking for new work does not leave much time for the
> guest to supply new work, so this will cause stalls if the ring gets
> full. Also, not completing reads as soon as possible adds latency
> to the guest.
> 
> To alleviate that, complete IO requests as soon as they come back.
> blk_send_response() already returns a value indicating whether
> a notify should be sent, which is all the batching we need.
> 
> Signed-off-by: Tim Smith <tim.smith@citrix.com>

Reviewed-by: Paul Durrant <paul.durrant@citrix.com>

> ---
>  hw/block/xen_disk.c |   43 ++++++++++++-------------------------------
>  1 file changed, 12 insertions(+), 31 deletions(-)
> 
> diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
> index cb2881b7e6..b506e23868 100644
> --- a/hw/block/xen_disk.c
> +++ b/hw/block/xen_disk.c
> @@ -83,11 +83,9 @@ struct XenBlkDev {
> 
>      /* request lists */
>      QLIST_HEAD(inflight_head, ioreq) inflight;
> -    QLIST_HEAD(finished_head, ioreq) finished;
>      QLIST_HEAD(freelist_head, ioreq) freelist;
>      int                 requests_total;
>      int                 requests_inflight;
> -    int                 requests_finished;
>      unsigned int        max_requests;
> 
>      gboolean            feature_discard;
> @@ -104,6 +102,9 @@ struct XenBlkDev {
>  /* Threshold of in-flight requests above which we will start using
>   * blk_io_plug()/blk_io_unplug() to batch requests */
>  #define IO_PLUG_THRESHOLD 1
> +
> +static int blk_send_response(struct ioreq *ioreq);
> +
>  /* ------------------------------------------------------------- */
> 
>  static void ioreq_reset(struct ioreq *ioreq)
> @@ -155,12 +156,10 @@ static void ioreq_finish(struct ioreq *ioreq)
>      struct XenBlkDev *blkdev = ioreq->blkdev;
> 
>      QLIST_REMOVE(ioreq, list);
> -    QLIST_INSERT_HEAD(&blkdev->finished, ioreq, list);
>      blkdev->requests_inflight--;
> -    blkdev->requests_finished++;
>  }
> 
> -static void ioreq_release(struct ioreq *ioreq, bool finish)
> +static void ioreq_release(struct ioreq *ioreq)
>  {
>      struct XenBlkDev *blkdev = ioreq->blkdev;
> 
> @@ -168,11 +167,7 @@ static void ioreq_release(struct ioreq *ioreq, bool
> finish)
>      ioreq_reset(ioreq);
>      ioreq->blkdev = blkdev;
>      QLIST_INSERT_HEAD(&blkdev->freelist, ioreq, list);
> -    if (finish) {
> -        blkdev->requests_finished--;
> -    } else {
> -        blkdev->requests_inflight--;
> -    }
> +    blkdev->requests_inflight--;
>  }
> 
>  /*
> @@ -351,6 +346,10 @@ static void qemu_aio_complete(void *opaque, int ret)
>      default:
>          break;
>      }
> +    if (blk_send_response(ioreq)) {
> +        xen_pv_send_notify(&blkdev->xendev);
> +    }
> +    ioreq_release(ioreq);
>      qemu_bh_schedule(blkdev->bh);
> 
>  done:
> @@ -455,7 +454,7 @@ err:
>      return -1;
>  }
> 
> -static int blk_send_response_one(struct ioreq *ioreq)
> +static int blk_send_response(struct ioreq *ioreq)
>  {
>      struct XenBlkDev  *blkdev = ioreq->blkdev;
>      int               send_notify   = 0;
> @@ -504,22 +503,6 @@ static int blk_send_response_one(struct ioreq *ioreq)
>      return send_notify;
>  }
> 
> -/* walk finished list, send outstanding responses, free requests */
> -static void blk_send_response_all(struct XenBlkDev *blkdev)
> -{
> -    struct ioreq *ioreq;
> -    int send_notify = 0;
> -
> -    while (!QLIST_EMPTY(&blkdev->finished)) {
> -        ioreq = QLIST_FIRST(&blkdev->finished);
> -        send_notify += blk_send_response_one(ioreq);
> -        ioreq_release(ioreq, true);
> -    }
> -    if (send_notify) {
> -        xen_pv_send_notify(&blkdev->xendev);
> -    }
> -}
> -
>  static int blk_get_request(struct XenBlkDev *blkdev, struct ioreq *ioreq,
> RING_IDX rc)
>  {
>      switch (blkdev->protocol) {
> @@ -554,7 +537,6 @@ static void blk_handle_requests(struct XenBlkDev
> *blkdev)
>      rp = blkdev->rings.common.sring->req_prod;
>      xen_rmb(); /* Ensure we see queued requests up to 'rp'. */
> 
> -    blk_send_response_all(blkdev);
>      /* If there was more than IO_PLUG_THRESHOLD ioreqs in flight
>       * when we got here, this is an indication that there the bottleneck
>       * is below us, so it's worth beginning to batch up I/O requests
> @@ -597,10 +579,10 @@ static void blk_handle_requests(struct XenBlkDev
> *blkdev)
>                  break;
>              };
> 
> -            if (blk_send_response_one(ioreq)) {
> +            if (blk_send_response(ioreq)) {
>                  xen_pv_send_notify(&blkdev->xendev);
>              }
> -            ioreq_release(ioreq, false);
> +            ioreq_release(ioreq);
>              continue;
>          }
> 
> @@ -646,7 +628,6 @@ static void blk_alloc(struct XenDevice *xendev)
>      trace_xen_disk_alloc(xendev->name);
> 
>      QLIST_INIT(&blkdev->inflight);
> -    QLIST_INIT(&blkdev->finished);
>      QLIST_INIT(&blkdev->freelist);
> 
>      blkdev->iothread = iothread_create(xendev->name, &err);


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 2/3] Improve xen_disk response latency
@ 2018-11-02 11:14     ` Paul Durrant
  0 siblings, 0 replies; 49+ messages in thread
From: Paul Durrant @ 2018-11-02 11:14 UTC (permalink / raw)
  To: Tim Smith, xen-devel, qemu-devel, qemu-block
  Cc: Anthony Perard, Kevin Wolf, Stefano Stabellini, Max Reitz



> -----Original Message-----
> From: Tim Smith [mailto:tim.smith@citrix.com]
> Sent: 02 November 2018 10:01
> To: xen-devel@lists.xenproject.org; qemu-devel@nongnu.org; qemu-
> block@nongnu.org
> Cc: Anthony Perard <anthony.perard@citrix.com>; Kevin Wolf
> <kwolf@redhat.com>; Paul Durrant <Paul.Durrant@citrix.com>; Stefano
> Stabellini <sstabellini@kernel.org>; Max Reitz <mreitz@redhat.com>
> Subject: [PATCH 2/3] Improve xen_disk response latency
> 
> If the I/O ring is full, the guest cannot send any more requests
> until some responses are sent. Only sending all available responses
> just before checking for new work does not leave much time for the
> guest to supply new work, so this will cause stalls if the ring gets
> full. Also, not completing reads as soon as possible adds latency
> to the guest.
> 
> To alleviate that, complete IO requests as soon as they come back.
> blk_send_response() already returns a value indicating whether
> a notify should be sent, which is all the batching we need.
> 
> Signed-off-by: Tim Smith <tim.smith@citrix.com>

Reviewed-by: Paul Durrant <paul.durrant@citrix.com>

> ---
>  hw/block/xen_disk.c |   43 ++++++++++++-------------------------------
>  1 file changed, 12 insertions(+), 31 deletions(-)
> 
> diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
> index cb2881b7e6..b506e23868 100644
> --- a/hw/block/xen_disk.c
> +++ b/hw/block/xen_disk.c
> @@ -83,11 +83,9 @@ struct XenBlkDev {
> 
>      /* request lists */
>      QLIST_HEAD(inflight_head, ioreq) inflight;
> -    QLIST_HEAD(finished_head, ioreq) finished;
>      QLIST_HEAD(freelist_head, ioreq) freelist;
>      int                 requests_total;
>      int                 requests_inflight;
> -    int                 requests_finished;
>      unsigned int        max_requests;
> 
>      gboolean            feature_discard;
> @@ -104,6 +102,9 @@ struct XenBlkDev {
>  /* Threshold of in-flight requests above which we will start using
>   * blk_io_plug()/blk_io_unplug() to batch requests */
>  #define IO_PLUG_THRESHOLD 1
> +
> +static int blk_send_response(struct ioreq *ioreq);
> +
>  /* ------------------------------------------------------------- */
> 
>  static void ioreq_reset(struct ioreq *ioreq)
> @@ -155,12 +156,10 @@ static void ioreq_finish(struct ioreq *ioreq)
>      struct XenBlkDev *blkdev = ioreq->blkdev;
> 
>      QLIST_REMOVE(ioreq, list);
> -    QLIST_INSERT_HEAD(&blkdev->finished, ioreq, list);
>      blkdev->requests_inflight--;
> -    blkdev->requests_finished++;
>  }
> 
> -static void ioreq_release(struct ioreq *ioreq, bool finish)
> +static void ioreq_release(struct ioreq *ioreq)
>  {
>      struct XenBlkDev *blkdev = ioreq->blkdev;
> 
> @@ -168,11 +167,7 @@ static void ioreq_release(struct ioreq *ioreq, bool
> finish)
>      ioreq_reset(ioreq);
>      ioreq->blkdev = blkdev;
>      QLIST_INSERT_HEAD(&blkdev->freelist, ioreq, list);
> -    if (finish) {
> -        blkdev->requests_finished--;
> -    } else {
> -        blkdev->requests_inflight--;
> -    }
> +    blkdev->requests_inflight--;
>  }
> 
>  /*
> @@ -351,6 +346,10 @@ static void qemu_aio_complete(void *opaque, int ret)
>      default:
>          break;
>      }
> +    if (blk_send_response(ioreq)) {
> +        xen_pv_send_notify(&blkdev->xendev);
> +    }
> +    ioreq_release(ioreq);
>      qemu_bh_schedule(blkdev->bh);
> 
>  done:
> @@ -455,7 +454,7 @@ err:
>      return -1;
>  }
> 
> -static int blk_send_response_one(struct ioreq *ioreq)
> +static int blk_send_response(struct ioreq *ioreq)
>  {
>      struct XenBlkDev  *blkdev = ioreq->blkdev;
>      int               send_notify   = 0;
> @@ -504,22 +503,6 @@ static int blk_send_response_one(struct ioreq *ioreq)
>      return send_notify;
>  }
> 
> -/* walk finished list, send outstanding responses, free requests */
> -static void blk_send_response_all(struct XenBlkDev *blkdev)
> -{
> -    struct ioreq *ioreq;
> -    int send_notify = 0;
> -
> -    while (!QLIST_EMPTY(&blkdev->finished)) {
> -        ioreq = QLIST_FIRST(&blkdev->finished);
> -        send_notify += blk_send_response_one(ioreq);
> -        ioreq_release(ioreq, true);
> -    }
> -    if (send_notify) {
> -        xen_pv_send_notify(&blkdev->xendev);
> -    }
> -}
> -
>  static int blk_get_request(struct XenBlkDev *blkdev, struct ioreq *ioreq,
> RING_IDX rc)
>  {
>      switch (blkdev->protocol) {
> @@ -554,7 +537,6 @@ static void blk_handle_requests(struct XenBlkDev
> *blkdev)
>      rp = blkdev->rings.common.sring->req_prod;
>      xen_rmb(); /* Ensure we see queued requests up to 'rp'. */
> 
> -    blk_send_response_all(blkdev);
>      /* If there was more than IO_PLUG_THRESHOLD ioreqs in flight
>       * when we got here, this is an indication that there the bottleneck
>       * is below us, so it's worth beginning to batch up I/O requests
> @@ -597,10 +579,10 @@ static void blk_handle_requests(struct XenBlkDev
> *blkdev)
>                  break;
>              };
> 
> -            if (blk_send_response_one(ioreq)) {
> +            if (blk_send_response(ioreq)) {
>                  xen_pv_send_notify(&blkdev->xendev);
>              }
> -            ioreq_release(ioreq, false);
> +            ioreq_release(ioreq);
>              continue;
>          }
> 
> @@ -646,7 +628,6 @@ static void blk_alloc(struct XenDevice *xendev)
>      trace_xen_disk_alloc(xendev->name);
> 
>      QLIST_INIT(&blkdev->inflight);
> -    QLIST_INIT(&blkdev->finished);
>      QLIST_INIT(&blkdev->freelist);
> 
>      blkdev->iothread = iothread_create(xendev->name, &err);

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Qemu-devel] [PATCH 3/3] Avoid repeated memory allocation in xen_disk
  2018-11-02 10:01 ` [Qemu-devel] " Tim Smith
@ 2018-11-02 11:15     ` Paul Durrant
  2018-11-02 13:53     ` Anthony PERARD
  1 sibling, 0 replies; 49+ messages in thread
From: Paul Durrant @ 2018-11-02 11:15 UTC (permalink / raw)
  To: Tim Smith, xen-devel, qemu-devel, qemu-block
  Cc: Anthony Perard, Kevin Wolf, Stefano Stabellini, Max Reitz

> -----Original Message-----
> From: Tim Smith [mailto:tim.smith@citrix.com]
> Sent: 02 November 2018 10:01
> To: xen-devel@lists.xenproject.org; qemu-devel@nongnu.org; qemu-
> block@nongnu.org
> Cc: Anthony Perard <anthony.perard@citrix.com>; Kevin Wolf
> <kwolf@redhat.com>; Paul Durrant <Paul.Durrant@citrix.com>; Stefano
> Stabellini <sstabellini@kernel.org>; Max Reitz <mreitz@redhat.com>
> Subject: [PATCH 3/3] Avoid repeated memory allocation in xen_disk
> 
> xen_disk currently allocates memory to hold the data for each ioreq
> as that ioreq is used, and frees it afterwards. Because it requires
> page-aligned blocks, this interacts poorly with non-page-aligned
> allocations and balloons the heap.
> 
> Instead, allocate the maximum possible requirement, which is
> BLKIF_MAX_SEGMENTS_PER_REQUEST pages (currently 11 pages) when
> the ioreq is created, and keep that allocation until it is destroyed.
> Since the ioreqs themselves are re-used via a free list, this
> should actually improve memory usage.
> 
> Signed-off-by: Tim Smith <tim.smith@citrix.com>

Reviewed-by: Paul Durrant <paul.durrant@citrix.com>

> ---
>  hw/block/xen_disk.c |   11 ++++++-----
>  1 file changed, 6 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
> index b506e23868..faaeefba29 100644
> --- a/hw/block/xen_disk.c
> +++ b/hw/block/xen_disk.c
> @@ -112,7 +112,6 @@ static void ioreq_reset(struct ioreq *ioreq)
>      memset(&ioreq->req, 0, sizeof(ioreq->req));
>      ioreq->status = 0;
>      ioreq->start = 0;
> -    ioreq->buf = NULL;
>      ioreq->size = 0;
>      ioreq->presync = 0;
> 
> @@ -137,6 +136,11 @@ static struct ioreq *ioreq_start(struct XenBlkDev
> *blkdev)
>          /* allocate new struct */
>          ioreq = g_malloc0(sizeof(*ioreq));
>          ioreq->blkdev = blkdev;
> +        /* We cannot need more pages per ioreq than this, and we do re-
> use
> +         * ioreqs, so allocate the memory once here, to be freed in
> +         * blk_free() when the ioreq is freed. */
> +        ioreq->buf = qemu_memalign(XC_PAGE_SIZE,
> BLKIF_MAX_SEGMENTS_PER_REQUEST
> +                                   * XC_PAGE_SIZE);
>          blkdev->requests_total++;
>          qemu_iovec_init(&ioreq->v, 1);
>      } else {
> @@ -313,14 +317,12 @@ static void qemu_aio_complete(void *opaque, int ret)
>          if (ret == 0) {
>              ioreq_grant_copy(ioreq);
>          }
> -        qemu_vfree(ioreq->buf);
>          break;
>      case BLKIF_OP_WRITE:
>      case BLKIF_OP_FLUSH_DISKCACHE:
>          if (!ioreq->req.nr_segments) {
>              break;
>          }
> -        qemu_vfree(ioreq->buf);
>          break;
>      default:
>          break;
> @@ -392,12 +394,10 @@ static int ioreq_runio_qemu_aio(struct ioreq *ioreq)
>  {
>      struct XenBlkDev *blkdev = ioreq->blkdev;
> 
> -    ioreq->buf = qemu_memalign(XC_PAGE_SIZE, ioreq->size);
>      if (ioreq->req.nr_segments &&
>          (ioreq->req.operation == BLKIF_OP_WRITE ||
>           ioreq->req.operation == BLKIF_OP_FLUSH_DISKCACHE) &&
>          ioreq_grant_copy(ioreq)) {
> -        qemu_vfree(ioreq->buf);
>          goto err;
>      }
> 
> @@ -990,6 +990,7 @@ static int blk_free(struct XenDevice *xendev)
>          ioreq = QLIST_FIRST(&blkdev->freelist);
>          QLIST_REMOVE(ioreq, list);
>          qemu_iovec_destroy(&ioreq->v);
> +        qemu_vfree(ioreq->buf);
>          g_free(ioreq);
>      }
> 
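
A quick back-of-the-envelope on the fixed allocation above, assuming the
usual 4 KiB Xen page size (the 11-segment limit is the one quoted in the
commit message); the enum below is purely illustrative:

    enum {
        PAGE_SZ       = 4096,                 /* XC_PAGE_SIZE on x86 Xen */
        MAX_SEGS      = 11,                   /* BLKIF_MAX_SEGMENTS_PER_REQUEST */
        BUF_PER_IOREQ = MAX_SEGS * PAGE_SZ,   /* 45056 bytes = 44 KiB per ioreq */
    };

So each pre-allocated ioreq pins roughly 44 KiB, and the saving reported in
the cover letter comes from no longer churning those page-aligned blocks
through repeated allocation and freeing on every request.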


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Qemu-devel] xen_disk qdevification (was: [PATCH 0/3] Performance improvements for xen_disk v2)
  2018-11-02 11:13   ` [Qemu-devel] " Paul Durrant
  2018-11-02 12:14     ` Kevin Wolf
@ 2018-11-02 12:14     ` Kevin Wolf
  2018-11-05 15:57       ` Markus Armbruster
  2 siblings, 0 replies; 49+ messages in thread
From: Kevin Wolf @ 2018-11-02 12:14 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Tim Smith, xen-devel, qemu-devel, qemu-block, Anthony Perard,
	Stefano Stabellini, Max Reitz, armbru

Am 02.11.2018 um 12:13 hat Paul Durrant geschrieben:
> > -----Original Message-----
> > From: Kevin Wolf [mailto:kwolf@redhat.com]
> > Sent: 02 November 2018 11:04
> > To: Tim Smith <tim.smith@citrix.com>
> > Cc: xen-devel@lists.xenproject.org; qemu-devel@nongnu.org; qemu-
> > block@nongnu.org; Anthony Perard <anthony.perard@citrix.com>; Paul Durrant
> > <Paul.Durrant@citrix.com>; Stefano Stabellini <sstabellini@kernel.org>;
> > Max Reitz <mreitz@redhat.com>; armbru@redhat.com
> > Subject: xen_disk qdevification (was: [PATCH 0/3] Performance improvements
> > for xen_disk v2)
> > 
> > Am 02.11.2018 um 11:00 hat Tim Smith geschrieben:
> > > A series of performance improvements for disks using the Xen PV ring.
> > >
> > > These have had fairly extensive testing.
> > >
> > > The batching and latency improvements together boost the throughput
> > > of small reads and writes by two to six percent (measured using fio
> > > in the guest)
> > >
> > > Avoiding repeated calls to posix_memalign() reduced the dirty heap
> > > from 25MB to 5MB in the case of a single datapath process while also
> > > improving performance.
> > >
> > > v2 removes some checkpatch complaints and fixes the CCs
> > 
> > Completely unrelated, but since you're the first person touching
> > xen_disk in a while, you're my victim:
> > 
> > At KVM Forum we discussed sending a patch to deprecate xen_disk because
> > after all those years, it still hasn't been converted to qdev. Markus is
> > currently fixing some other not yet qdevified block device, but after
> > that xen_disk will be the only one left.
> > 
> > A while ago, a downstream patch review found out that there are some QMP
> > commands that would immediately crash if a xen_disk device were present
> > because of the lacking qdevification. This is not the code quality
> > standard I envision for QEMU. It's time for non-qdev devices to go.
> > 
> > So if you guys are still interested in the device, could someone please
> > finally look into converting it?
> 
> I have a patch series to do exactly this. It's somewhat involved as I
> need to convert the whole PV backend infrastructure. I will try to
> rebase and clean up my series a.s.a.p.

Thanks a lot, Paul! This is good news.

Kevin

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] Improve xen_disk batching behaviour
  2018-11-02 10:00 ` [Qemu-devel] " Tim Smith
                     ` (2 preceding siblings ...)
  2018-11-02 13:53   ` Anthony PERARD
@ 2018-11-02 13:53   ` Anthony PERARD
  3 siblings, 0 replies; 49+ messages in thread
From: Anthony PERARD @ 2018-11-02 13:53 UTC (permalink / raw)
  To: Tim Smith
  Cc: xen-devel, qemu-devel, qemu-block, Kevin Wolf, Paul Durrant,
	Stefano Stabellini, Max Reitz

On Fri, Nov 02, 2018 at 10:00:59AM +0000, Tim Smith wrote:
> When I/O consists of many small requests, performance is improved by
> batching them together in a single io_submit() call. When there are
> relatively few requests, the extra overhead is not worth it. This
> introduces a check to start batching I/O requests via blk_io_plug()/
> blk_io_unplug() in an amount proportional to the number which were
> already in flight at the time we started reading the ring.
> 
> Signed-off-by: Tim Smith <tim.smith@citrix.com>
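
A minimal sketch of the heuristic the commit message describes, for readers
skimming the archive; this paraphrases the description rather than the exact
hunk, blkdev->blk stands for the device's BlockBackend, and
ring_has_requests()/submit_one_request() are placeholders for the real
ring-walking and submission code:

    static void submit_with_batching(struct XenBlkDev *blkdev)
    {
        int inflight_atstart = blkdev->requests_inflight;
        int batched = 0;
        bool plug = inflight_atstart > IO_PLUG_THRESHOLD;

        if (plug) {
            blk_io_plug(blkdev->blk);           /* start collecting submissions */
        }
        while (ring_has_requests(blkdev)) {
            submit_one_request(blkdev);
            /* Keep the batch size proportional to the load we saw when
             * we started reading the ring. */
            if (plug && ++batched >= inflight_atstart) {
                blk_io_unplug(blkdev->blk);     /* flush one batch... */
                blk_io_plug(blkdev->blk);       /* ...and start the next */
                batched = 0;
            }
        }
        if (plug) {
            blk_io_unplug(blkdev->blk);         /* flush whatever remains */
        }
    }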

Acked-by: Anthony PERARD <anthony.perard@citrix.com>

-- 
Anthony PERARD

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Qemu-devel] [PATCH 2/3] Improve xen_disk response latency
  2018-11-02 10:01 ` [Qemu-devel] [PATCH 2/3] Improve xen_disk response latency Tim Smith
  2018-11-02 11:14     ` Paul Durrant
  2018-11-02 13:53   ` Anthony PERARD
@ 2018-11-02 13:53   ` Anthony PERARD
  2 siblings, 0 replies; 49+ messages in thread
From: Anthony PERARD @ 2018-11-02 13:53 UTC (permalink / raw)
  To: Tim Smith
  Cc: xen-devel, qemu-devel, qemu-block, Kevin Wolf, Paul Durrant,
	Stefano Stabellini, Max Reitz

On Fri, Nov 02, 2018 at 10:01:04AM +0000, Tim Smith wrote:
> If the I/O ring is full, the guest cannot send any more requests
> until some responses are sent. Only sending all available responses
> just before checking for new work does not leave much time for the
> guest to supply new work, so this will cause stalls if the ring gets
> full. Also, not completing reads as soon as possible adds latency
> to the guest.
> 
> To alleviate that, complete IO requests as soon as they come back.
> blk_send_response() already returns a value indicating whether
> a notify should be sent, which is all the batching we need.
> 
> Signed-off-by: Tim Smith <tim.smith@citrix.com>
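
In other words, the completion callback now finishes each request on the
spot. A trimmed sketch of the tail of qemu_aio_complete() after this patch
(taken from the hunk quoted earlier in the thread, not a new implementation):

    if (blk_send_response(ioreq)) {           /* response goes on the ring now */
        xen_pv_send_notify(&blkdev->xendev);  /* notify only when one is needed */
    }
    ioreq_release(ioreq);                     /* straight back to the free list */
    qemu_bh_schedule(blkdev->bh);             /* then look for more ring work */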

Acked-by: Anthony PERARD <anthony.perard@citrix.com>

-- 
Anthony PERARD

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Qemu-devel] [PATCH 3/3] Avoid repeated memory allocation in xen_disk
  2018-11-02 10:01 ` [Qemu-devel] " Tim Smith
@ 2018-11-02 13:53     ` Anthony PERARD
  2018-11-02 13:53     ` Anthony PERARD
  1 sibling, 0 replies; 49+ messages in thread
From: Anthony PERARD @ 2018-11-02 13:53 UTC (permalink / raw)
  To: Tim Smith
  Cc: xen-devel, qemu-devel, qemu-block, Kevin Wolf, Paul Durrant,
	Stefano Stabellini, Max Reitz

On Fri, Nov 02, 2018 at 10:01:09AM +0000, Tim Smith wrote:
> xen_disk currently allocates memory to hold the data for each ioreq
> as that ioreq is used, and frees it afterwards. Because it requires
> page-aligned blocks, this interacts poorly with non-page-aligned
> allocations and balloons the heap.
> 
> Instead, allocate the maximum possible requirement, which is
> BLKIF_MAX_SEGMENTS_PER_REQUEST pages (currently 11 pages) when
> the ioreq is created, and keep that allocation until it is destroyed.
> Since the ioreqs themselves are re-used via a free list, this
> should actually improve memory usage.
> 
> Signed-off-by: Tim Smith <tim.smith@citrix.com>

Acked-by: Anthony PERARD <anthony.perard@citrix.com>

-- 
Anthony PERARD

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Qemu-devel] xen_disk qdevification
  2018-11-02 11:13   ` [Qemu-devel] " Paul Durrant
@ 2018-11-05 15:57       ` Markus Armbruster
  2018-11-02 12:14     ` [Qemu-devel] " Kevin Wolf
  2018-11-05 15:57       ` Markus Armbruster
  2 siblings, 0 replies; 49+ messages in thread
From: Markus Armbruster @ 2018-11-05 15:57 UTC (permalink / raw)
  To: Paul Durrant
  Cc: 'Kevin Wolf',
	Tim Smith, Stefano Stabellini, qemu-block, qemu-devel, Max Reitz,
	Anthony Perard, xen-devel

Paul Durrant <Paul.Durrant@citrix.com> writes:

>> -----Original Message-----
>> From: Kevin Wolf [mailto:kwolf@redhat.com]
>> Sent: 02 November 2018 11:04
>> To: Tim Smith <tim.smith@citrix.com>
>> Cc: xen-devel@lists.xenproject.org; qemu-devel@nongnu.org; qemu-
>> block@nongnu.org; Anthony Perard <anthony.perard@citrix.com>; Paul Durrant
>> <Paul.Durrant@citrix.com>; Stefano Stabellini <sstabellini@kernel.org>;
>> Max Reitz <mreitz@redhat.com>; armbru@redhat.com
>> Subject: xen_disk qdevification (was: [PATCH 0/3] Performance improvements
>> for xen_disk v2)
>> 
>> Am 02.11.2018 um 11:00 hat Tim Smith geschrieben:
>> > A series of performance improvements for disks using the Xen PV ring.
>> >
>> > These have had fairly extensive testing.
>> >
>> > The batching and latency improvements together boost the throughput
>> > of small reads and writes by two to six percent (measured using fio
>> > in the guest)
>> >
>> > Avoiding repeated calls to posix_memalign() reduced the dirty heap
>> > from 25MB to 5MB in the case of a single datapath process while also
>> > improving performance.
>> >
>> > v2 removes some checkpatch complaints and fixes the CCs
>> 
>> Completely unrelated, but since you're the first person touching
>> xen_disk in a while, you're my victim:
>> 
>> At KVM Forum we discussed sending a patch to deprecate xen_disk because
>> after all those years, it still hasn't been converted to qdev. Markus is
>> currently fixing some other not yet qdevified block device, but after
>> that xen_disk will be the only one left.
>> 
>> A while ago, a downstream patch review found out that there are some QMP
>> commands that would immediately crash if a xen_disk device were present
>> because of the lacking qdevification. This is not the code quality
>> standard I envision for QEMU. It's time for non-qdev devices to go.
>> 
>> So if you guys are still interested in the device, could someone please
>> finally look into converting it?
>> 
>
> I have a patch series to do exactly this. It's somewhat involved as I
> need to convert the whole PV backend infrastructure. I will try to
> rebase and clean up my series a.s.a.p.

Awesome!  Please coordinate with Anthony Perard to avoid duplicating
work if you haven't done so already.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Qemu-devel] xen_disk qdevification
  2018-11-05 15:57       ` Markus Armbruster
@ 2018-11-05 16:15         ` Paul Durrant
  -1 siblings, 0 replies; 49+ messages in thread
From: Paul Durrant @ 2018-11-05 16:15 UTC (permalink / raw)
  To: 'Markus Armbruster'
  Cc: 'Kevin Wolf',
	Tim Smith, Stefano Stabellini, qemu-block, qemu-devel, Max Reitz,
	Anthony Perard, xen-devel

> -----Original Message-----
> From: Markus Armbruster [mailto:armbru@redhat.com]
> Sent: 05 November 2018 15:58
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: 'Kevin Wolf' <kwolf@redhat.com>; Tim Smith <tim.smith@citrix.com>;
> Stefano Stabellini <sstabellini@kernel.org>; qemu-block@nongnu.org; qemu-
> devel@nongnu.org; Max Reitz <mreitz@redhat.com>; Anthony Perard
> <anthony.perard@citrix.com>; xen-devel@lists.xenproject.org
> Subject: Re: [Qemu-devel] xen_disk qdevification
> 
> Paul Durrant <Paul.Durrant@citrix.com> writes:
> 
> >> -----Original Message-----
> >> From: Kevin Wolf [mailto:kwolf@redhat.com]
> >> Sent: 02 November 2018 11:04
> >> To: Tim Smith <tim.smith@citrix.com>
> >> Cc: xen-devel@lists.xenproject.org; qemu-devel@nongnu.org; qemu-
> >> block@nongnu.org; Anthony Perard <anthony.perard@citrix.com>; Paul
> Durrant
> >> <Paul.Durrant@citrix.com>; Stefano Stabellini <sstabellini@kernel.org>;
> >> Max Reitz <mreitz@redhat.com>; armbru@redhat.com
> >> Subject: xen_disk qdevification (was: [PATCH 0/3] Performance
> improvements
> >> for xen_disk v2)
> >>
> >> Am 02.11.2018 um 11:00 hat Tim Smith geschrieben:
> >> > A series of performance improvements for disks using the Xen PV ring.
> >> >
> >> > These have had fairly extensive testing.
> >> >
> >> > The batching and latency improvements together boost the throughput
> >> > of small reads and writes by two to six percent (measured using fio
> >> > in the guest)
> >> >
> >> > Avoiding repeated calls to posix_memalign() reduced the dirty heap
> >> > from 25MB to 5MB in the case of a single datapath process while also
> >> > improving performance.
> >> >
> >> > v2 removes some checkpatch complaints and fixes the CCs
> >>
> >> Completely unrelated, but since you're the first person touching
> >> xen_disk in a while, you're my victim:
> >>
> >> At KVM Forum we discussed sending a patch to deprecate xen_disk because
> >> after all those years, it still hasn't been converted to qdev. Markus
> is
> >> currently fixing some other not yet qdevified block device, but after
> >> that xen_disk will be the only one left.
> >>
> >> A while ago, a downstream patch review found out that there are some
> QMP
> >> commands that would immediately crash if a xen_disk device were present
> >> because of the lacking qdevification. This is not the code quality
> >> standard I envision for QEMU. It's time for non-qdev devices to go.
> >>
> >> So if you guys are still interested in the device, could someone please
> >> finally look into converting it?
> >>
> >
> > I have a patch series to do exactly this. It's somewhat involved as I
> > need to convert the whole PV backend infrastructure. I will try to
> > rebase and clean up my series a.s.a.p.
> 
> Awesome!  Please coordinate with Anthony Prerard to avoid duplicating
> work if you haven't done so already.

Sure. I have already spoken to Anthony.

  Thanks,

   Paul

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Qemu-devel] xen_disk qdevification
  2018-11-05 15:57       ` Markus Armbruster
                         ` (2 preceding siblings ...)
  (?)
@ 2018-11-08 14:00       ` Paul Durrant
  2018-11-08 15:21         ` Kevin Wolf
  2018-11-08 15:21         ` Kevin Wolf
  -1 siblings, 2 replies; 49+ messages in thread
From: Paul Durrant @ 2018-11-08 14:00 UTC (permalink / raw)
  To: 'Markus Armbruster', 'Kevin Wolf', Anthony Perard
  Cc: Tim Smith, Stefano Stabellini, qemu-block, qemu-devel, Max Reitz,
	xen-devel

> -----Original Message-----
> From: Markus Armbruster [mailto:armbru@redhat.com]
> Sent: 05 November 2018 15:58
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: 'Kevin Wolf' <kwolf@redhat.com>; Tim Smith <tim.smith@citrix.com>;
> Stefano Stabellini <sstabellini@kernel.org>; qemu-block@nongnu.org; qemu-
> devel@nongnu.org; Max Reitz <mreitz@redhat.com>; Anthony Perard
> <anthony.perard@citrix.com>; xen-devel@lists.xenproject.org
> Subject: Re: [Qemu-devel] xen_disk qdevification
> 
> Paul Durrant <Paul.Durrant@citrix.com> writes:
> 
> >> -----Original Message-----
> >> From: Kevin Wolf [mailto:kwolf@redhat.com]
> >> Sent: 02 November 2018 11:04
> >> To: Tim Smith <tim.smith@citrix.com>
> >> Cc: xen-devel@lists.xenproject.org; qemu-devel@nongnu.org; qemu-
> >> block@nongnu.org; Anthony Perard <anthony.perard@citrix.com>; Paul
> Durrant
> >> <Paul.Durrant@citrix.com>; Stefano Stabellini <sstabellini@kernel.org>;
> >> Max Reitz <mreitz@redhat.com>; armbru@redhat.com
> >> Subject: xen_disk qdevification (was: [PATCH 0/3] Performance
> improvements
> >> for xen_disk v2)
> >>
> >> Am 02.11.2018 um 11:00 hat Tim Smith geschrieben:
> >> > A series of performance improvements for disks using the Xen PV ring.
> >> >
> >> > These have had fairly extensive testing.
> >> >
> >> > The batching and latency improvements together boost the throughput
> >> > of small reads and writes by two to six percent (measured using fio
> >> > in the guest)
> >> >
> >> > Avoiding repeated calls to posix_memalign() reduced the dirty heap
> >> > from 25MB to 5MB in the case of a single datapath process while also
> >> > improving performance.
> >> >
> >> > v2 removes some checkpatch complaints and fixes the CCs
> >>
> >> Completely unrelated, but since you're the first person touching
> >> xen_disk in a while, you're my victim:
> >>
> >> At KVM Forum we discussed sending a patch to deprecate xen_disk because
> >> after all those years, it still hasn't been converted to qdev. Markus
> is
> >> currently fixing some other not yet qdevified block device, but after
> >> that xen_disk will be the only one left.
> >>
> >> A while ago, a downstream patch review found out that there are some
> QMP
> >> commands that would immediately crash if a xen_disk device were present
> >> because of the lacking qdevification. This is not the code quality
> >> standard I envision for QEMU. It's time for non-qdev devices to go.
> >>
> >> So if you guys are still interested in the device, could someone please
> >> finally look into converting it?
> >>
> >
> > I have a patch series to do exactly this. It's somewhat involved as I
> > need to convert the whole PV backend infrastructure. I will try to
> > rebase and clean up my series a.s.a.p.
> 
> Awesome!  Please coordinate with Anthony Prerard to avoid duplicating
> work if you haven't done so already.

I've come across a bit of a problem that I'm not sure how best to deal with and so am looking for some advice.

I now have a qdevified PV disk backend but I can't bring it up because it fails to acquire a write lock on the qcow2 it is pointing at. This is because there is also an emulated IDE drive using the same qcow2. This does not appear to be a problem for the non-qdev xen-disk, presumably because it does not open the qcow2 until the emulated device is unplugged. I don't really want to introduce similar hackery in my new backend (i.e. I want it to attach to its drive, and hence open the qcow2, during realize).

So, I'm not sure what to do... It is not a problem that both a PV backend and an emulated device are using the same qcow2, because they will never actually operate simultaneously, so is there any way I can bypass the qcow2 lock check when I create the drive for my PV backend? (BTW I tried re-using the drive created for the emulated device, but that doesn't work because there is a check for whether a drive is already attached to something.)

Any ideas?

  Paul

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Qemu-devel] xen_disk qdevification
  2018-11-08 14:00       ` Paul Durrant
@ 2018-11-08 15:21         ` Kevin Wolf
  2018-11-08 15:43           ` Paul Durrant
  2018-11-08 15:43           ` Paul Durrant
  2018-11-08 15:21         ` Kevin Wolf
  1 sibling, 2 replies; 49+ messages in thread
From: Kevin Wolf @ 2018-11-08 15:21 UTC (permalink / raw)
  To: Paul Durrant
  Cc: 'Markus Armbruster',
	Anthony Perard, Tim Smith, Stefano Stabellini, qemu-block,
	qemu-devel, Max Reitz, xen-devel

Am 08.11.2018 um 15:00 hat Paul Durrant geschrieben:
> > -----Original Message-----
> > From: Markus Armbruster [mailto:armbru@redhat.com]
> > Sent: 05 November 2018 15:58
> > To: Paul Durrant <Paul.Durrant@citrix.com>
> > Cc: 'Kevin Wolf' <kwolf@redhat.com>; Tim Smith <tim.smith@citrix.com>;
> > Stefano Stabellini <sstabellini@kernel.org>; qemu-block@nongnu.org; qemu-
> > devel@nongnu.org; Max Reitz <mreitz@redhat.com>; Anthony Perard
> > <anthony.perard@citrix.com>; xen-devel@lists.xenproject.org
> > Subject: Re: [Qemu-devel] xen_disk qdevification
> > 
> > Paul Durrant <Paul.Durrant@citrix.com> writes:
> > 
> > >> -----Original Message-----
> > >> From: Kevin Wolf [mailto:kwolf@redhat.com]
> > >> Sent: 02 November 2018 11:04
> > >> To: Tim Smith <tim.smith@citrix.com>
> > >> Cc: xen-devel@lists.xenproject.org; qemu-devel@nongnu.org; qemu-
> > >> block@nongnu.org; Anthony Perard <anthony.perard@citrix.com>; Paul
> > Durrant
> > >> <Paul.Durrant@citrix.com>; Stefano Stabellini <sstabellini@kernel.org>;
> > >> Max Reitz <mreitz@redhat.com>; armbru@redhat.com
> > >> Subject: xen_disk qdevification (was: [PATCH 0/3] Performance
> > improvements
> > >> for xen_disk v2)
> > >>
> > >> Am 02.11.2018 um 11:00 hat Tim Smith geschrieben:
> > >> > A series of performance improvements for disks using the Xen PV ring.
> > >> >
> > >> > These have had fairly extensive testing.
> > >> >
> > >> > The batching and latency improvements together boost the throughput
> > >> > of small reads and writes by two to six percent (measured using fio
> > >> > in the guest)
> > >> >
> > >> > Avoiding repeated calls to posix_memalign() reduced the dirty heap
> > >> > from 25MB to 5MB in the case of a single datapath process while also
> > >> > improving performance.
> > >> >
> > >> > v2 removes some checkpatch complaints and fixes the CCs
> > >>
> > >> Completely unrelated, but since you're the first person touching
> > >> xen_disk in a while, you're my victim:
> > >>
> > >> At KVM Forum we discussed sending a patch to deprecate xen_disk because
> > >> after all those years, it still hasn't been converted to qdev. Markus
> > is
> > >> currently fixing some other not yet qdevified block device, but after
> > >> that xen_disk will be the only one left.
> > >>
> > >> A while ago, a downstream patch review found out that there are some
> > QMP
> > >> commands that would immediately crash if a xen_disk device were present
> > >> because of the lacking qdevification. This is not the code quality
> > >> standard I envision for QEMU. It's time for non-qdev devices to go.
> > >>
> > >> So if you guys are still interested in the device, could someone please
> > >> finally look into converting it?
> > >>
> > >
> > > I have a patch series to do exactly this. It's somewhat involved as I
> > > need to convert the whole PV backend infrastructure. I will try to
> > > rebase and clean up my series a.s.a.p.
> > 
> > Awesome!  Please coordinate with Anthony Prerard to avoid duplicating
> > work if you haven't done so already.
> 
> I've come across a bit of a problem that I'm not sure how best to deal
> with and so am looking for some advice.
> 
> I now have a qdevified PV disk backend but I can't bring it up because
> it fails to acquire a write lock on the qcow2 it is pointing at. This
> is because there is also an emulated IDE drive using the same qcow2.
> This does not appear to be a problem for the non-qdev xen-disk,
> presumably because it is not opening the qcow2 until the emulated
> device is unplugged and I don't really want to introduce similar
> hackery in my new backend (i.e. I want it to attach to its drive, and
> hence open the qcow2, during realize).
> 
> So, I'm not sure what to do... It is not a problem that both a PV
> backend and an emulated device are using the same qcow2 because they
> will never actually operate simultaneously so is there any way I can
> bypass the qcow2 lock check when I create the drive for my PV backend?
> (BTW I tried re-using the drive created for the emulated device, but
> that doesn't work because there is a check if a drive is already
> attached to something).
> 
> Any ideas?

I think the clean solution is to keep the BlockBackend open in xen-disk
from the beginning, but not requesting write permissions yet.

The BlockBackend is created in parse_drive(), when qdev parses the
-device drive=... option. At this point, no permissions are requested
yet. That is done in blkconf_apply_backend_options(), which is manually
called from the devices; specifically from ide_dev_initfn() in IDE, and
I assume you call the function from xen-disk as well.

xen-disk should then call this function with readonly=true, and at the
point of the handover (when the IDE device is already gone) it can call
blk_set_perm() to request BLK_PERM_WRITE in addition to the permissions
it already holds.
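
As a hedged sketch of that flow (XenDiskDevice, xen_disk_realize_sketch()
and xen_disk_handover_sketch() are made-up names for illustration; the
block-layer calls are the ones named above, and error handling is kept
minimal):

    #include "qemu/osdep.h"
    #include "qapi/error.h"
    #include "block/block.h"             /* BLK_PERM_WRITE */
    #include "hw/block/block.h"          /* BlockConf, blkconf_apply_backend_options() */
    #include "sysemu/block-backend.h"    /* blk_get_perm(), blk_set_perm() */

    /* Hypothetical device state; only the BlockConf member matters here. */
    typedef struct XenDiskDevice {
        BlockConf conf;
    } XenDiskDevice;

    static void xen_disk_realize_sketch(XenDiskDevice *d, Error **errp)
    {
        Error *local_err = NULL;

        /* Open the backend read-only at realize time, so the emulated IDE
         * device can keep the write lock on the shared qcow2. */
        blkconf_apply_backend_options(&d->conf, true /* readonly */,
                                      false /* resizable */, &local_err);
        if (local_err) {
            error_propagate(errp, local_err);
        }
    }

    static void xen_disk_handover_sketch(XenDiskDevice *d, Error **errp)
    {
        uint64_t perm, shared;

        /* The emulated device is gone: ask for BLK_PERM_WRITE on top of
         * whatever permissions we already hold. */
        blk_get_perm(d->conf.blk, &perm, &shared);
        blk_set_perm(d->conf.blk, perm | BLK_PERM_WRITE, shared, errp);
    }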


The other option I see would be that you simply create both devices with
share-rw=on (which results in conf->share_rw == true and therefore
shared BLK_PERM_WRITE in blkconf_apply_backend_options()), but that
feels like a hack because you don't actually want to have two writers at
the same time.

Kevin

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Qemu-devel] xen_disk qdevification
  2018-11-08 15:21         ` Kevin Wolf
  2018-11-08 15:43           ` Paul Durrant
@ 2018-11-08 15:43           ` Paul Durrant
  2018-11-08 16:44             ` Paul Durrant
  2018-11-08 16:44             ` Paul Durrant
  1 sibling, 2 replies; 49+ messages in thread
From: Paul Durrant @ 2018-11-08 15:43 UTC (permalink / raw)
  To: 'Kevin Wolf'
  Cc: 'Markus Armbruster',
	Anthony Perard, Tim Smith, Stefano Stabellini, qemu-block,
	qemu-devel, Max Reitz, xen-devel

> -----Original Message-----
> From: Kevin Wolf [mailto:kwolf@redhat.com]
> Sent: 08 November 2018 15:21
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: 'Markus Armbruster' <armbru@redhat.com>; Anthony Perard
> <anthony.perard@citrix.com>; Tim Smith <tim.smith@citrix.com>; Stefano
> Stabellini <sstabellini@kernel.org>; qemu-block@nongnu.org; qemu-
> devel@nongnu.org; Max Reitz <mreitz@redhat.com>; xen-
> devel@lists.xenproject.org
> Subject: Re: [Qemu-devel] xen_disk qdevification
> 
> Am 08.11.2018 um 15:00 hat Paul Durrant geschrieben:
> > > -----Original Message-----
> > > From: Markus Armbruster [mailto:armbru@redhat.com]
> > > Sent: 05 November 2018 15:58
> > > To: Paul Durrant <Paul.Durrant@citrix.com>
> > > Cc: 'Kevin Wolf' <kwolf@redhat.com>; Tim Smith <tim.smith@citrix.com>;
> > > Stefano Stabellini <sstabellini@kernel.org>; qemu-block@nongnu.org;
> qemu-
> > > devel@nongnu.org; Max Reitz <mreitz@redhat.com>; Anthony Perard
> > > <anthony.perard@citrix.com>; xen-devel@lists.xenproject.org
> > > Subject: Re: [Qemu-devel] xen_disk qdevification
> > >
> > > Paul Durrant <Paul.Durrant@citrix.com> writes:
> > >
> > > >> -----Original Message-----
> > > >> From: Kevin Wolf [mailto:kwolf@redhat.com]
> > > >> Sent: 02 November 2018 11:04
> > > >> To: Tim Smith <tim.smith@citrix.com>
> > > >> Cc: xen-devel@lists.xenproject.org; qemu-devel@nongnu.org; qemu-
> > > >> block@nongnu.org; Anthony Perard <anthony.perard@citrix.com>; Paul
> > > Durrant
> > > >> <Paul.Durrant@citrix.com>; Stefano Stabellini
> <sstabellini@kernel.org>;
> > > >> Max Reitz <mreitz@redhat.com>; armbru@redhat.com
> > > >> Subject: xen_disk qdevification (was: [PATCH 0/3] Performance
> > > improvements
> > > >> for xen_disk v2)
> > > >>
> > > >> Am 02.11.2018 um 11:00 hat Tim Smith geschrieben:
> > > >> > A series of performance improvements for disks using the Xen PV
> ring.
> > > >> >
> > > >> > These have had fairly extensive testing.
> > > >> >
> > > >> > The batching and latency improvements together boost the
> throughput
> > > >> > of small reads and writes by two to six percent (measured using
> fio
> > > >> > in the guest)
> > > >> >
> > > >> > Avoiding repeated calls to posix_memalign() reduced the dirty
> heap
> > > >> > from 25MB to 5MB in the case of a single datapath process while
> also
> > > >> > improving performance.
> > > >> >
> > > >> > v2 removes some checkpatch complaints and fixes the CCs
> > > >>
> > > >> Completely unrelated, but since you're the first person touching
> > > >> xen_disk in a while, you're my victim:
> > > >>
> > > >> At KVM Forum we discussed sending a patch to deprecate xen_disk
> because
> > > >> after all those years, it still hasn't been converted to qdev.
> Markus
> > > is
> > > >> currently fixing some other not yet qdevified block device, but
> after
> > > >> that xen_disk will be the only one left.
> > > >>
> > > >> A while ago, a downstream patch review found out that there are
> some
> > > QMP
> > > >> commands that would immediately crash if a xen_disk device were
> present
> > > >> because of the lacking qdevification. This is not the code quality
> > > >> standard I envision for QEMU. It's time for non-qdev devices to go.
> > > >>
> > > >> So if you guys are still interested in the device, could someone
> please
> > > >> finally look into converting it?
> > > >>
> > > >
> > > > I have a patch series to do exactly this. It's somewhat involved as
> I
> > > > need to convert the whole PV backend infrastructure. I will try to
> > > > rebase and clean up my series a.s.a.p.
> > >
> > > Awesome!  Please coordinate with Anthony Prerard to avoid duplicating
> > > work if you haven't done so already.
> >
> > I've come across a bit of a problem that I'm not sure how best to deal
> > with and so am looking for some advice.
> >
> > I now have a qdevified PV disk backend but I can't bring it up because
> > it fails to acquire a write lock on the qcow2 it is pointing at. This
> > is because there is also an emulated IDE drive using the same qcow2.
> > This does not appear to be a problem for the non-qdev xen-disk,
> > presumably because it is not opening the qcow2 until the emulated
> > device is unplugged and I don't really want to introduce similar
> > hackery in my new backend (i.e. I want it to attach to its drive, and
> > hence open the qcow2, during realize).
> >
> > So, I'm not sure what to do... It is not a problem that both a PV
> > backend and an emulated device are using the same qcow2 because they
> > will never actually operate simultaneously so is there any way I can
> > bypass the qcow2 lock check when I create the drive for my PV backend?
> > (BTW I tried re-using the drive created for the emulated device, but
> > that doesn't work because there is a check if a drive is already
> > attached to something).
> >
> > Any ideas?
> 
> I think the clean solution is to keep the BlockBackend open in xen-disk
> from the beginning, but not requesting write permissions yet.
> 
> The BlockBackend is created in parse_drive(), when qdev parses the
> -device drive=... option. At this point, no permissions are requested
> yet. That is done in blkconf_apply_backend_options(), which is manually
> called from the devices; specifically from ide_dev_initfn() in IDE, and
> I assume you call the function from xen-disk as well.

Yes, I call it during realize.

> 
> xen-disk should then call this function with readonly=true, and at the
> point of the handover (when the IDE device is already gone) it can call
> blk_set_perm() to request BLK_PERM_WRITE in addition to the permissions
> it already holds.
> 

I tried that and it works fine :-)

> 
> The other option I see would be that you simply create both devices with
> share-rw=on (which results in conf->share_rw == true and therefore
> shared BLK_PERM_WRITE in blkconf_apply_backend_options()), but that
> feels like a hack because you don't actually want to have two writers at
> the same time.
> 

Yes, that does indeed seem like more of a hack. The first option works so I'll go with that.

Thanks for your help.

Cheers,

  Paul




> Kevin

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Qemu-devel] xen_disk qdevification
  2018-11-08 15:43           ` Paul Durrant
@ 2018-11-08 16:44             ` Paul Durrant
  2018-11-09 10:27               ` Paul Durrant
  2018-11-09 10:27               ` Paul Durrant
  2018-11-08 16:44             ` Paul Durrant
  1 sibling, 2 replies; 49+ messages in thread
From: Paul Durrant @ 2018-11-08 16:44 UTC (permalink / raw)
  To: Paul Durrant, 'Kevin Wolf'
  Cc: Stefano Stabellini, qemu-block, Tim Smith, qemu-devel,
	'Markus Armbruster',
	Anthony Perard, xen-devel, Max Reitz

> -----Original Message-----
> From: Xen-devel [mailto:xen-devel-bounces@lists.xenproject.org] On Behalf
> Of Paul Durrant
> Sent: 08 November 2018 15:44
> To: 'Kevin Wolf' <kwolf@redhat.com>
> Cc: Stefano Stabellini <sstabellini@kernel.org>; qemu-block@nongnu.org;
> Tim Smith <tim.smith@citrix.com>; qemu-devel@nongnu.org; 'Markus
> Armbruster' <armbru@redhat.com>; Anthony Perard
> <anthony.perard@citrix.com>; xen-devel@lists.xenproject.org; Max Reitz
> <mreitz@redhat.com>
> Subject: Re: [Xen-devel] [Qemu-devel] xen_disk qdevification
> 
> > -----Original Message-----
> > From: Kevin Wolf [mailto:kwolf@redhat.com]
> > Sent: 08 November 2018 15:21
> > To: Paul Durrant <Paul.Durrant@citrix.com>
> > Cc: 'Markus Armbruster' <armbru@redhat.com>; Anthony Perard
> > <anthony.perard@citrix.com>; Tim Smith <tim.smith@citrix.com>; Stefano
> > Stabellini <sstabellini@kernel.org>; qemu-block@nongnu.org; qemu-
> > devel@nongnu.org; Max Reitz <mreitz@redhat.com>; xen-
> > devel@lists.xenproject.org
> > Subject: Re: [Qemu-devel] xen_disk qdevification
> >
> > Am 08.11.2018 um 15:00 hat Paul Durrant geschrieben:
> > > > -----Original Message-----
> > > > From: Markus Armbruster [mailto:armbru@redhat.com]
> > > > Sent: 05 November 2018 15:58
> > > > To: Paul Durrant <Paul.Durrant@citrix.com>
> > > > Cc: 'Kevin Wolf' <kwolf@redhat.com>; Tim Smith
> <tim.smith@citrix.com>;
> > > > Stefano Stabellini <sstabellini@kernel.org>; qemu-block@nongnu.org;
> > qemu-
> > > > devel@nongnu.org; Max Reitz <mreitz@redhat.com>; Anthony Perard
> > > > <anthony.perard@citrix.com>; xen-devel@lists.xenproject.org
> > > > Subject: Re: [Qemu-devel] xen_disk qdevification
> > > >
> > > > Paul Durrant <Paul.Durrant@citrix.com> writes:
> > > >
> > > > >> -----Original Message-----
> > > > >> From: Kevin Wolf [mailto:kwolf@redhat.com]
> > > > >> Sent: 02 November 2018 11:04
> > > > >> To: Tim Smith <tim.smith@citrix.com>
> > > > >> Cc: xen-devel@lists.xenproject.org; qemu-devel@nongnu.org; qemu-
> > > > >> block@nongnu.org; Anthony Perard <anthony.perard@citrix.com>;
> Paul
> > > > Durrant
> > > > >> <Paul.Durrant@citrix.com>; Stefano Stabellini
> > <sstabellini@kernel.org>;
> > > > >> Max Reitz <mreitz@redhat.com>; armbru@redhat.com
> > > > >> Subject: xen_disk qdevification (was: [PATCH 0/3] Performance
> > > > improvements
> > > > >> for xen_disk v2)
> > > > >>
> > > > >> Am 02.11.2018 um 11:00 hat Tim Smith geschrieben:
> > > > >> > A series of performance improvements for disks using the Xen PV
> > ring.
> > > > >> >
> > > > >> > These have had fairly extensive testing.
> > > > >> >
> > > > >> > The batching and latency improvements together boost the
> > throughput
> > > > >> > of small reads and writes by two to six percent (measured using
> > fio
> > > > >> > in the guest)
> > > > >> >
> > > > >> > Avoiding repeated calls to posix_memalign() reduced the dirty
> > heap
> > > > >> > from 25MB to 5MB in the case of a single datapath process while
> > also
> > > > >> > improving performance.
> > > > >> >
> > > > >> > v2 removes some checkpatch complaints and fixes the CCs
> > > > >>
> > > > >> Completely unrelated, but since you're the first person touching
> > > > >> xen_disk in a while, you're my victim:
> > > > >>
> > > > >> At KVM Forum we discussed sending a patch to deprecate xen_disk
> > because
> > > > >> after all those years, it still hasn't been converted to qdev.
> > Markus
> > > > is
> > > > >> currently fixing some other not yet qdevified block device, but
> > after
> > > > >> that xen_disk will be the only one left.
> > > > >>
> > > > >> A while ago, a downstream patch review found out that there are
> > some
> > > > QMP
> > > > >> commands that would immediately crash if a xen_disk device were
> > present
> > > > >> because of the lacking qdevification. This is not the code
> quality
> > > > >> standard I envision for QEMU. It's time for non-qdev devices to
> go.
> > > > >>
> > > > >> So if you guys are still interested in the device, could someone
> > please
> > > > >> finally look into converting it?
> > > > >>
> > > > >
> > > > > I have a patch series to do exactly this. It's somewhat involved
> as
> > I
> > > > > need to convert the whole PV backend infrastructure. I will try to
> > > > > rebase and clean up my series a.s.a.p.
> > > >
> > > > Awesome!  Please coordinate with Anthony Prerard to avoid
> duplicating
> > > > work if you haven't done so already.
> > >
> > > I've come across a bit of a problem that I'm not sure how best to deal
> > > with and so am looking for some advice.
> > >
> > > I now have a qdevified PV disk backend but I can't bring it up because
> > > it fails to acquire a write lock on the qcow2 it is pointing at. This
> > > is because there is also an emulated IDE drive using the same qcow2.
> > > This does not appear to be a problem for the non-qdev xen-disk,
> > > presumably because it is not opening the qcow2 until the emulated
> > > device is unplugged and I don't really want to introduce similar
> > > hackery in my new backend (i.e. I want it to attach to its drive, and
> > > hence open the qcow2, during realize).
> > >
> > > So, I'm not sure what to do... It is not a problem that both a PV
> > > backend and an emulated device are using the same qcow2 because they
> > > will never actually operate simultaneously so is there any way I can
> > > bypass the qcow2 lock check when I create the drive for my PV backend?
> > > (BTW I tried re-using the drive created for the emulated device, but
> > > that doesn't work because there is a check if a drive is already
> > > attached to something).
> > >
> > > Any ideas?
> >
> > I think the clean solution is to keep the BlockBackend open in xen-disk
> > from the beginning, but not requesting write permissions yet.
> >
> > The BlockBackend is created in parse_drive(), when qdev parses the
> > -device drive=... option. At this point, no permissions are requested
> > yet. That is done in blkconf_apply_backend_options(), which is manually
> > called from the devices; specifically from ide_dev_initfn() in IDE, and
> > I assume you call the function from xen-disk as well.
> 
> Yes, I call it during realize.
> 
> >
> > xen-disk should then call this function with readonly=true, and at the
> > point of the handover (when the IDE device is already gone) it can call
> > blk_set_perm() to request BLK_PERM_WRITE in addition to the permissions
> > it already holds.
> >
> 
> I tried that and it works fine :-)

Unfortunately I spoke too soon... I still had a patch in place to disable locking checks :-(

To maintain compatibility with the existing Xen toolstack (which I think is the only feasible way to make the change without running into chicken-and-egg problems), I'm using a 'compat' function that creates a drive from the information the Xen toolstack writes into xenstore. That function uses drive_new(), and it is this call that fails.

So, I have tried setting BDRV_OPT_READ_ONLY and BDRV_OPT_FORCE_SHARE. This allows me to get through drive_new() but later I fail to set the write permission with error "Block node is read-only".

> 
> >
> > The other option I see would be that you simply create both devices with
> > share-rw=on (which results in conf->share_rw == true and therefore
> > shared BLK_PERM_WRITE in blkconf_apply_backend_options()), but that
> > feels like a hack because you don't actually want to have two writers at
> > the same time.
> >
> 
> Yes, that does indeed seem like more of a hack. The first option works so
> I'll go with that.
> 

I'll now see what I can do with this idea.

 Paul


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Qemu-devel] xen_disk qdevification
  2018-11-08 16:44             ` Paul Durrant
@ 2018-11-09 10:27               ` Paul Durrant
  2018-11-09 10:40                 ` Kevin Wolf
  2018-11-09 10:40                 ` Kevin Wolf
  2018-11-09 10:27               ` Paul Durrant
  1 sibling, 2 replies; 49+ messages in thread
From: Paul Durrant @ 2018-11-09 10:27 UTC (permalink / raw)
  To: 'Kevin Wolf'
  Cc: Stefano Stabellini, qemu-block, Tim Smith, qemu-devel,
	'Markus Armbruster',
	Anthony Perard, xen-devel, Max Reitz

> -----Original Message-----
> From: Paul Durrant
> Sent: 08 November 2018 16:44
> To: Paul Durrant <Paul.Durrant@citrix.com>; 'Kevin Wolf'
> <kwolf@redhat.com>
> Cc: Stefano Stabellini <sstabellini@kernel.org>; qemu-block@nongnu.org;
> Tim Smith <tim.smith@citrix.com>; qemu-devel@nongnu.org; 'Markus
> Armbruster' <armbru@redhat.com>; Anthony Perard
> <anthony.perard@citrix.com>; xen-devel@lists.xenproject.org; Max Reitz
> <mreitz@redhat.com>
> Subject: RE: [Qemu-devel] xen_disk qdevification
> 
> > -----Original Message-----
> > From: Xen-devel [mailto:xen-devel-bounces@lists.xenproject.org] On
> Behalf
> > Of Paul Durrant
> > Sent: 08 November 2018 15:44
> > To: 'Kevin Wolf' <kwolf@redhat.com>
> > Cc: Stefano Stabellini <sstabellini@kernel.org>; qemu-block@nongnu.org;
> > Tim Smith <tim.smith@citrix.com>; qemu-devel@nongnu.org; 'Markus
> > Armbruster' <armbru@redhat.com>; Anthony Perard
> > <anthony.perard@citrix.com>; xen-devel@lists.xenproject.org; Max Reitz
> > <mreitz@redhat.com>
> > Subject: Re: [Xen-devel] [Qemu-devel] xen_disk qdevification
> >
> > > -----Original Message-----
> > > From: Kevin Wolf [mailto:kwolf@redhat.com]
> > > Sent: 08 November 2018 15:21
> > > To: Paul Durrant <Paul.Durrant@citrix.com>
> > > Cc: 'Markus Armbruster' <armbru@redhat.com>; Anthony Perard
> > > <anthony.perard@citrix.com>; Tim Smith <tim.smith@citrix.com>; Stefano
> > > Stabellini <sstabellini@kernel.org>; qemu-block@nongnu.org; qemu-
> > > devel@nongnu.org; Max Reitz <mreitz@redhat.com>; xen-
> > > devel@lists.xenproject.org
> > > Subject: Re: [Qemu-devel] xen_disk qdevification
> > >
> > > Am 08.11.2018 um 15:00 hat Paul Durrant geschrieben:
> > > > > -----Original Message-----
> > > > > From: Markus Armbruster [mailto:armbru@redhat.com]
> > > > > Sent: 05 November 2018 15:58
> > > > > To: Paul Durrant <Paul.Durrant@citrix.com>
> > > > > Cc: 'Kevin Wolf' <kwolf@redhat.com>; Tim Smith
> > <tim.smith@citrix.com>;
> > > > > Stefano Stabellini <sstabellini@kernel.org>; qemu-
> block@nongnu.org;
> > > qemu-
> > > > > devel@nongnu.org; Max Reitz <mreitz@redhat.com>; Anthony Perard
> > > > > <anthony.perard@citrix.com>; xen-devel@lists.xenproject.org
> > > > > Subject: Re: [Qemu-devel] xen_disk qdevification
> > > > >
> > > > > Paul Durrant <Paul.Durrant@citrix.com> writes:
> > > > >
> > > > > >> -----Original Message-----
> > > > > >> From: Kevin Wolf [mailto:kwolf@redhat.com]
> > > > > >> Sent: 02 November 2018 11:04
> > > > > >> To: Tim Smith <tim.smith@citrix.com>
> > > > > >> Cc: xen-devel@lists.xenproject.org; qemu-devel@nongnu.org;
> qemu-
> > > > > >> block@nongnu.org; Anthony Perard <anthony.perard@citrix.com>;
> > Paul
> > > > > Durrant
> > > > > >> <Paul.Durrant@citrix.com>; Stefano Stabellini
> > > <sstabellini@kernel.org>;
> > > > > >> Max Reitz <mreitz@redhat.com>; armbru@redhat.com
> > > > > >> Subject: xen_disk qdevification (was: [PATCH 0/3] Performance
> > > > > improvements
> > > > > >> for xen_disk v2)
> > > > > >>
> > > > > >> Am 02.11.2018 um 11:00 hat Tim Smith geschrieben:
> > > > > >> > A series of performance improvements for disks using the Xen
> PV
> > > ring.
> > > > > >> >
> > > > > >> > These have had fairly extensive testing.
> > > > > >> >
> > > > > >> > The batching and latency improvements together boost the
> > > throughput
> > > > > >> > of small reads and writes by two to six percent (measured
> using
> > > fio
> > > > > >> > in the guest)
> > > > > >> >
> > > > > >> > Avoiding repeated calls to posix_memalign() reduced the dirty
> > > heap
> > > > > >> > from 25MB to 5MB in the case of a single datapath process
> while
> > > also
> > > > > >> > improving performance.
> > > > > >> >
> > > > > >> > v2 removes some checkpatch complaints and fixes the CCs
> > > > > >>
> > > > > >> Completely unrelated, but since you're the first person
> touching
> > > > > >> xen_disk in a while, you're my victim:
> > > > > >>
> > > > > >> At KVM Forum we discussed sending a patch to deprecate xen_disk
> > > because
> > > > > >> after all those years, it still hasn't been converted to qdev.
> > > Markus
> > > > > is
> > > > > >> currently fixing some other not yet qdevified block device, but
> > > after
> > > > > >> that xen_disk will be the only one left.
> > > > > >>
> > > > > >> A while ago, a downstream patch review found out that there are
> > > some
> > > > > QMP
> > > > > >> commands that would immediately crash if a xen_disk device were
> > > present
> > > > > >> because of the lacking qdevification. This is not the code
> > quality
> > > > > >> standard I envision for QEMU. It's time for non-qdev devices to
> > go.
> > > > > >>
> > > > > >> So if you guys are still interested in the device, could
> someone
> > > please
> > > > > >> finally look into converting it?
> > > > > >>
> > > > > >
> > > > > > I have a patch series to do exactly this. It's somewhat involved
> > as
> > > I
> > > > > > need to convert the whole PV backend infrastructure. I will try
> to
> > > > > > rebase and clean up my series a.s.a.p.
> > > > >
> > > > > Awesome!  Please coordinate with Anthony Prerard to avoid
> > duplicating
> > > > > work if you haven't done so already.
> > > >
> > > > I've come across a bit of a problem that I'm not sure how best to
> deal
> > > > with and so am looking for some advice.
> > > >
> > > > I now have a qdevified PV disk backend but I can't bring it up
> because
> > > > it fails to acquire a write lock on the qcow2 it is pointing at.
> This
> > > > is because there is also an emulated IDE drive using the same qcow2.
> > > > This does not appear to be a problem for the non-qdev xen-disk,
> > > > presumably because it is not opening the qcow2 until the emulated
> > > > device is unplugged and I don't really want to introduce similar
> > > > hackery in my new backend (i.e. I want it to attach to its drive,
> and
> > > > hence open the qcow2, during realize).
> > > >
> > > > So, I'm not sure what to do... It is not a problem that both a PV
> > > > backend and an emulated device are using the same qcow2 because they
> > > > will never actually operate simultaneously so is there any way I can
> > > > bypass the qcow2 lock check when I create the drive for my PV
> backend?
> > > > (BTW I tried re-using the drive created for the emulated device, but
> > > > that doesn't work because there is a check if a drive is already
> > > > attached to something).
> > > >
> > > > Any ideas?
> > >
> > > I think the clean solution is to keep the BlockBackend open in xen-
> disk
> > > from the beginning, but not requesting write permissions yet.
> > >
> > > The BlockBackend is created in parse_drive(), when qdev parses the
> > > -device drive=... option. At this point, no permissions are requested
> > > yet. That is done in blkconf_apply_backend_options(), which is
> manually
> > > called from the devices; specifically from ide_dev_initfn() in IDE,
> and
> > > I assume you call the function from xen-disk as well.
> >
> > Yes, I call it during realize.
> >
> > >
> > > xen-disk should then call this function with readonly=true, and at the
> > > point of the handover (when the IDE device is already gone) it can
> call
> > > blk_set_perm() to request BLK_PERM_WRITE in addition to the
> permissions
> > > it already holds.
> > >
> >
> > I tried that and it works fine :-)
> 
> Unfortunately I spoke too soon... I still had a patch in place to disable
> locking checks :-(
> 
> What I'm trying to do to maintain compatibility with the existing Xen
> toolstack (which I think is the only feasible way to make the change
> avoiding chicken and egg problems) is to use a 'compat' function that
> creates a drive based on the information that the Xen toolstack writes
> into xenstore. I'm using drive_new() to do this and it is this that fails.
> 
> So, I have tried setting BDRV_OPT_READ_ONLY and BDRV_OPT_FORCE_SHARE. This
> allows me to get through drive_new() but later I fail to set the write
> permission with error "Block node is read-only".
> 
> >
> > >
> > > The other option I see would be that you simply create both devices
> with
> > > share-rw=on (which results in conf->share_rw == true and therefore
> > > shared BLK_PERM_WRITE in blkconf_apply_backend_options()), but that
> > > feels like a hack because you don't actually want to have two writers
> at
> > > the same time.
> > >
> >
> > Yes, that does indeed seem like more of a hack. The first option works
> so
> > I'll go with that.
> >
> 
> I'll now see what I can do with this idea.

I think I have a reasonably neat solution, as it is restricted to my compat code and can thus go away when the Xen toolstack is re-educated to use QMP to instantiate PV backends (once they are all qdevified). I simply add "file.locking=off" to the options I pass to drive_new().
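
Roughly, the compat function builds something like this (sketch only;
the helper name and option set here are illustrative, the relevant bit
is just the extra "file.locking" key):

#include "qemu/osdep.h"
#include "qapi/error.h"
#include "qemu/option.h"       /* qemu_opts_create(), qemu_opt_set() */
#include "qemu/config-file.h"  /* qemu_find_opts() */

/* Build legacy -drive options from the image path found in xenstore,
 * turning image locking off so the emulated IDE device can have the
 * same qcow2 open. The result is then handed to drive_new(). */
static QemuOpts *xen_compat_drive_opts_sketch(const char *filename)
{
    QemuOpts *opts;

    opts = qemu_opts_create(qemu_find_opts("drive"), NULL, 0, &error_abort);
    qemu_opt_set(opts, "file", filename, &error_abort);
    qemu_opt_set(opts, "format", "qcow2", &error_abort);
    qemu_opt_set(opts, "file.locking", "off", &error_abort);

    return opts;
}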

  Paul

> 
>  Paul


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Qemu-devel] xen_disk qdevification
  2018-11-09 10:27               ` Paul Durrant
@ 2018-11-09 10:40                 ` Kevin Wolf
  2018-11-09 10:40                 ` Kevin Wolf
  1 sibling, 0 replies; 49+ messages in thread
From: Kevin Wolf @ 2018-11-09 10:40 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Stefano Stabellini, qemu-block, Tim Smith, qemu-devel,
	'Markus Armbruster',
	Anthony Perard, xen-devel, Max Reitz

Am 09.11.2018 um 11:27 hat Paul Durrant geschrieben:
> > -----Original Message-----
> > From: Paul Durrant
> > Sent: 08 November 2018 16:44
> > To: Paul Durrant <Paul.Durrant@citrix.com>; 'Kevin Wolf'
> > <kwolf@redhat.com>
> > Cc: Stefano Stabellini <sstabellini@kernel.org>; qemu-block@nongnu.org;
> > Tim Smith <tim.smith@citrix.com>; qemu-devel@nongnu.org; 'Markus
> > Armbruster' <armbru@redhat.com>; Anthony Perard
> > <anthony.perard@citrix.com>; xen-devel@lists.xenproject.org; Max Reitz
> > <mreitz@redhat.com>
> > Subject: RE: [Qemu-devel] xen_disk qdevification
> > 
> > > -----Original Message-----
> > > From: Xen-devel [mailto:xen-devel-bounces@lists.xenproject.org] On
> > Behalf
> > > Of Paul Durrant
> > > Sent: 08 November 2018 15:44
> > > To: 'Kevin Wolf' <kwolf@redhat.com>
> > > Cc: Stefano Stabellini <sstabellini@kernel.org>; qemu-block@nongnu.org;
> > > Tim Smith <tim.smith@citrix.com>; qemu-devel@nongnu.org; 'Markus
> > > Armbruster' <armbru@redhat.com>; Anthony Perard
> > > <anthony.perard@citrix.com>; xen-devel@lists.xenproject.org; Max Reitz
> > > <mreitz@redhat.com>
> > > Subject: Re: [Xen-devel] [Qemu-devel] xen_disk qdevification
> > >
> > > > -----Original Message-----
> > > > From: Kevin Wolf [mailto:kwolf@redhat.com]
> > > > Sent: 08 November 2018 15:21
> > > > To: Paul Durrant <Paul.Durrant@citrix.com>
> > > > Cc: 'Markus Armbruster' <armbru@redhat.com>; Anthony Perard
> > > > <anthony.perard@citrix.com>; Tim Smith <tim.smith@citrix.com>; Stefano
> > > > Stabellini <sstabellini@kernel.org>; qemu-block@nongnu.org; qemu-
> > > > devel@nongnu.org; Max Reitz <mreitz@redhat.com>; xen-
> > > > devel@lists.xenproject.org
> > > > Subject: Re: [Qemu-devel] xen_disk qdevification
> > > >
> > > > Am 08.11.2018 um 15:00 hat Paul Durrant geschrieben:
> > > > > > -----Original Message-----
> > > > > > From: Markus Armbruster [mailto:armbru@redhat.com]
> > > > > > Sent: 05 November 2018 15:58
> > > > > > To: Paul Durrant <Paul.Durrant@citrix.com>
> > > > > > Cc: 'Kevin Wolf' <kwolf@redhat.com>; Tim Smith
> > > <tim.smith@citrix.com>;
> > > > > > Stefano Stabellini <sstabellini@kernel.org>; qemu-
> > block@nongnu.org;
> > > > qemu-
> > > > > > devel@nongnu.org; Max Reitz <mreitz@redhat.com>; Anthony Perard
> > > > > > <anthony.perard@citrix.com>; xen-devel@lists.xenproject.org
> > > > > > Subject: Re: [Qemu-devel] xen_disk qdevification
> > > > > >
> > > > > > Paul Durrant <Paul.Durrant@citrix.com> writes:
> > > > > >
> > > > > > >> -----Original Message-----
> > > > > > >> From: Kevin Wolf [mailto:kwolf@redhat.com]
> > > > > > >> Sent: 02 November 2018 11:04
> > > > > > >> To: Tim Smith <tim.smith@citrix.com>
> > > > > > >> Cc: xen-devel@lists.xenproject.org; qemu-devel@nongnu.org;
> > qemu-
> > > > > > >> block@nongnu.org; Anthony Perard <anthony.perard@citrix.com>;
> > > Paul
> > > > > > Durrant
> > > > > > >> <Paul.Durrant@citrix.com>; Stefano Stabellini
> > > > <sstabellini@kernel.org>;
> > > > > > >> Max Reitz <mreitz@redhat.com>; armbru@redhat.com
> > > > > > >> Subject: xen_disk qdevification (was: [PATCH 0/3] Performance
> > > > > > improvements
> > > > > > >> for xen_disk v2)
> > > > > > >>
> > > > > > >> Am 02.11.2018 um 11:00 hat Tim Smith geschrieben:
> > > > > > >> > A series of performance improvements for disks using the Xen
> > PV
> > > > ring.
> > > > > > >> >
> > > > > > >> > These have had fairly extensive testing.
> > > > > > >> >
> > > > > > >> > The batching and latency improvements together boost the
> > > > throughput
> > > > > > >> > of small reads and writes by two to six percent (measured
> > using
> > > > fio
> > > > > > >> > in the guest)
> > > > > > >> >
> > > > > > >> > Avoiding repeated calls to posix_memalign() reduced the dirty
> > > > heap
> > > > > > >> > from 25MB to 5MB in the case of a single datapath process
> > while
> > > > also
> > > > > > >> > improving performance.
> > > > > > >> >
> > > > > > >> > v2 removes some checkpatch complaints and fixes the CCs
> > > > > > >>
> > > > > > >> Completely unrelated, but since you're the first person
> > touching
> > > > > > >> xen_disk in a while, you're my victim:
> > > > > > >>
> > > > > > >> At KVM Forum we discussed sending a patch to deprecate xen_disk
> > > > because
> > > > > > >> after all those years, it still hasn't been converted to qdev.
> > > > Markus
> > > > > > is
> > > > > > >> currently fixing some other not yet qdevified block device, but
> > > > after
> > > > > > >> that xen_disk will be the only one left.
> > > > > > >>
> > > > > > >> A while ago, a downstream patch review found out that there are
> > > > some
> > > > > > QMP
> > > > > > >> commands that would immediately crash if a xen_disk device were
> > > > present
> > > > > > >> because of the lacking qdevification. This is not the code
> > > quality
> > > > > > >> standard I envision for QEMU. It's time for non-qdev devices to
> > > go.
> > > > > > >>
> > > > > > >> So if you guys are still interested in the device, could
> > someone
> > > > please
> > > > > > >> finally look into converting it?
> > > > > > >>
> > > > > > >
> > > > > > > I have a patch series to do exactly this. It's somewhat involved
> > > as
> > > > I
> > > > > > > need to convert the whole PV backend infrastructure. I will try
> > to
> > > > > > > rebase and clean up my series a.s.a.p.
> > > > > >
> > > > > > Awesome!  Please coordinate with Anthony Prerard to avoid
> > > duplicating
> > > > > > work if you haven't done so already.
> > > > >
> > > > > I've come across a bit of a problem that I'm not sure how best to
> > deal
> > > > > with and so am looking for some advice.
> > > > >
> > > > > I now have a qdevified PV disk backend but I can't bring it up
> > because
> > > > > it fails to acquire a write lock on the qcow2 it is pointing at.
> > This
> > > > > is because there is also an emulated IDE drive using the same qcow2.
> > > > > This does not appear to be a problem for the non-qdev xen-disk,
> > > > > presumably because it is not opening the qcow2 until the emulated
> > > > > device is unplugged and I don't really want to introduce similar
> > > > > hackery in my new backend (i.e. I want it to attach to its drive,
> > and
> > > > > hence open the qcow2, during realize).
> > > > >
> > > > > So, I'm not sure what to do... It is not a problem that both a PV
> > > > > backend and an emulated device are using the same qcow2 because they
> > > > > will never actually operate simultaneously so is there any way I can
> > > > > bypass the qcow2 lock check when I create the drive for my PV
> > backend?
> > > > > (BTW I tried re-using the drive created for the emulated device, but
> > > > > that doesn't work because there is a check if a drive is already
> > > > > attached to something).
> > > > >
> > > > > Any ideas?
> > > >
> > > > I think the clean solution is to keep the BlockBackend open in xen-
> > disk
> > > > from the beginning, but not requesting write permissions yet.
> > > >
> > > > The BlockBackend is created in parse_drive(), when qdev parses the
> > > > -device drive=... option. At this point, no permissions are requested
> > > > yet. That is done in blkconf_apply_backend_options(), which is
> > manually
> > > > called from the devices; specifically from ide_dev_initfn() in IDE,
> > and
> > > > I assume you call the function from xen-disk as well.
> > >
> > > Yes, I call it during realize.
> > >
> > > >
> > > > xen-disk should then call this function with readonly=true, and at the
> > > > point of the handover (when the IDE device is already gone) it can
> > call
> > > > blk_set_perm() to request BLK_PERM_WRITE in addition to the
> > permissions
> > > > it already holds.
> > > >
> > >
> > > I tried that and it works fine :-)
> > 
> > Unfortunately I spoke too soon... I still had a patch in place to disable
> > locking checks :-(
> > 
> > What I'm trying to do to maintain compatibility with the existing Xen
> > toolstack (which I think is the only feasible way to make the change
> > avoiding chicken and egg problems) is to use a 'compat' function that
> > creates a drive based on the information that the Xen toolstack writes
> > into xenstore. I'm using drive_new() to do this and it is this that fails.
> > 
> > So, I have tried setting BDRV_OPT_READ_ONLY and BDRV_OPT_FORCE_SHARE. This
> > allows me to get through drive_new() but later I fail to set the write
> > permission with error "Block node is read-only".

drive_new() is really a legacy interface. It immediately creates a
BlockBackend and takes permissions for it. You don't want that here.
(And I'd like it to go away in a few releases, so it's better not to
let new code rely on it.)

If you can, it would be better to just call qmp_blockdev_add(). This
creates only a node (BlockDriverState) without a BlockBackend. You'll
get your BlockBackend from the qdev drive property.
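
Something along these lines (sketch only; the node name, filename and
helper function are made up for illustration):

#include "qemu/osdep.h"
#include "qapi/error.h"
#include "qapi/visitor.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qobject-input-visitor.h"
#include "qapi/qapi-visit-block-core.h"     /* visit_type_BlockdevOptions() */
#include "qapi/qapi-commands-block-core.h"  /* qmp_blockdev_add() */

/* Add a qcow2 node only; no BlockBackend is created and no permissions
 * are taken. The qdev 'drive' property later attaches the PV device to
 * the node and creates the BlockBackend. */
static void xen_compat_blockdev_add_sketch(Error **errp)
{
    QDict *file = qdict_new();
    QDict *node = qdict_new();
    BlockdevOptions *options = NULL;
    Visitor *v;

    qdict_put_str(file, "driver", "file");
    qdict_put_str(file, "filename", "/var/lib/xen/images/disk0.qcow2");
    qdict_put_str(file, "locking", "off"); /* emulated device has it open too */

    qdict_put_str(node, "driver", "qcow2");
    qdict_put_str(node, "node-name", "xvda-qcow2");
    qdict_put(node, "file", file);

    v = qobject_input_visitor_new(QOBJECT(node));
    visit_type_BlockdevOptions(v, NULL, &options, errp);
    visit_free(v);
    qobject_unref(node);

    if (options) {
        qmp_blockdev_add(options, errp);
        qapi_free_BlockdevOptions(options);
    }
}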

> > > > The other option I see would be that you simply create both devices
> > with
> > > > share-rw=on (which results in conf->share_rw == true and therefore
> > > > shared BLK_PERM_WRITE in blkconf_apply_backend_options()), but that
> > > > feels like a hack because you don't actually want to have two writers
> > at
> > > > the same time.
> > > >
> > >
> > > Yes, that does indeed seem like more of a hack. The first option works
> > so
> > > I'll go with that.
> > >
> > 
> > I'll now see what I can do with this idea.
> 
> I think I have a reasonably neat solution, as it is restricted to my
> compat code and can thus go away when the Xen toolstack is re-educated
> to use QMP to instantiate PV backends (once they are all qdevified). I
> simply add "file.locking=off" to the options I pass to drive_new().

I wouldn't agree on "neat", but if you think so...

Kevin

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Qemu-devel] [Xen-devel] xen_disk qdevification (was: [PATCH 0/3] Performance improvements for xen_disk v2)
  2018-11-02 11:04 ` [Qemu-devel] " Kevin Wolf
  2018-11-02 11:13   ` Paul Durrant
  2018-11-02 11:13   ` [Qemu-devel] " Paul Durrant
@ 2018-12-12  8:59   ` Olaf Hering
  2018-12-12  9:22       ` Paul Durrant
                       ` (4 more replies)
  2018-12-12  8:59   ` xen_disk qdevification (was: [PATCH 0/3] Performance improvements for xen_disk v2) Olaf Hering
  3 siblings, 5 replies; 49+ messages in thread
From: Olaf Hering @ 2018-12-12  8:59 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Tim Smith, Stefano Stabellini, qemu-block, armbru, qemu-devel,
	Max Reitz, Paul Durrant, Anthony Perard, xen-devel

On Fri, Nov 02, Kevin Wolf wrote:

> A while ago, a downstream patch review found out that there are some QMP
> commands that would immediately crash if a xen_disk device were present
> because of the lacking qdevification. This is not the code quality
> standard I envision for QEMU. It's time for non-qdev devices to go.

Do you have that backwards by any chance? IMO the presence of assert()
contributes to bad code quality, not the drivers that trigger those
asserts. It is bad enough that two QEMU releases went out while being in
bad shape.

Anyway, hopefully Paul or whoever will find the time and energy to
convert the code at some point.

Olaf

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Qemu-devel] [Xen-devel] xen_disk qdevification (was: [PATCH 0/3] Performance improvements for xen_disk v2)
  2018-12-12  8:59   ` [Qemu-devel] [Xen-devel] xen_disk qdevification (was: [PATCH 0/3] Performance improvements for xen_disk v2) Olaf Hering
@ 2018-12-12  9:22       ` Paul Durrant
  2018-12-12 12:03     ` [Qemu-devel] [Xen-devel] " Kevin Wolf
                         ` (3 subsequent siblings)
  4 siblings, 0 replies; 49+ messages in thread
From: Paul Durrant @ 2018-12-12  9:22 UTC (permalink / raw)
  To: 'Olaf Hering', Kevin Wolf
  Cc: Tim Smith, Stefano Stabellini, qemu-block, armbru, qemu-devel,
	Max Reitz, Anthony Perard, xen-devel

> -----Original Message-----
> From: Olaf Hering [mailto:olaf@aepfle.de]
> Sent: 12 December 2018 09:00
> To: Kevin Wolf <kwolf@redhat.com>
> Cc: Tim Smith <tim.smith@citrix.com>; Stefano Stabellini
> <sstabellini@kernel.org>; qemu-block@nongnu.org; armbru@redhat.com; qemu-
> devel@nongnu.org; Max Reitz <mreitz@redhat.com>; Paul Durrant
> <Paul.Durrant@citrix.com>; Anthony Perard <anthony.perard@citrix.com>;
> xen-devel@lists.xenproject.org
> Subject: Re: [Xen-devel] xen_disk qdevification (was: [PATCH 0/3]
> Performance improvements for xen_disk v2)
> 
> On Fri, Nov 02, Kevin Wolf wrote:
> 
> > A while ago, a downstream patch review found out that there are some QMP
> > commands that would immediately crash if a xen_disk device were present
> > because of the lacking qdevification. This is not the code quality
> > standard I envision for QEMU. It's time for non-qdev devices to go.
> 
> Do you have that backwards by any chance? IMO the presence of assert()
> contributes to bad code quality, not the drivers that trigger those
> asserts. It is bad enough that two QEMU releases went out while being in
> bad shape.
> 
> Anyway, hopefully Paul or whoever will find the time and energy to
> convert the code at some point.

It's done. V4 of my series has acks from the Xen maintainers. I think it
needs some other acks from block maintainers but it's basically ready to
go in (and I've verified that no assert is tripped by xentop at least).
Also I hope to post the re-based patches from Tim (one of which fixes the
memory issues) later today.

  Paul

> 
> Olaf

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Qemu-devel] [Xen-devel] xen_disk qdevification (was: [PATCH 0/3] Performance improvements for xen_disk v2)
  2018-12-12  8:59   ` [Qemu-devel] [Xen-devel] xen_disk qdevification (was: [PATCH 0/3] Performance improvements for xen_disk v2) Olaf Hering
  2018-12-12  9:22       ` Paul Durrant
@ 2018-12-12 12:03     ` Kevin Wolf
  2018-12-12 12:03     ` Kevin Wolf
                       ` (2 subsequent siblings)
  4 siblings, 0 replies; 49+ messages in thread
From: Kevin Wolf @ 2018-12-12 12:03 UTC (permalink / raw)
  To: Olaf Hering
  Cc: Tim Smith, Stefano Stabellini, qemu-block, armbru, qemu-devel,
	Max Reitz, Paul Durrant, Anthony Perard, xen-devel

Am 12.12.2018 um 09:59 hat Olaf Hering geschrieben:
> On Fri, Nov 02, Kevin Wolf wrote:
> 
> > A while ago, a downstream patch review found out that there are some QMP
> > commands that would immediately crash if a xen_disk device were present
> > because of the lacking qdevification. This is not the code quality
> > standard I envision for QEMU. It's time for non-qdev devices to go.
> 
> Do you have that backwards by any chance? IMO the presence of assert()
> contributes to bad code quality, not the drivers that trigger those
> asserts.

You like shooting the messenger, it seems? Bugs aren't bad, only
catching them is?

But anyway, in this case, I seem to remember it was a plain old
segfault, not a failed assertion.

Kevin

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Qemu-devel] [Xen-devel] xen_disk qdevification
  2018-12-12  8:59   ` [Qemu-devel] [Xen-devel] xen_disk qdevification (was: [PATCH 0/3] Performance improvements for xen_disk v2) Olaf Hering
                       ` (3 preceding siblings ...)
  2018-12-12 12:04     ` [Qemu-devel] xen_disk qdevification Markus Armbruster
@ 2018-12-12 12:04     ` Markus Armbruster
  4 siblings, 0 replies; 49+ messages in thread
From: Markus Armbruster @ 2018-12-12 12:04 UTC (permalink / raw)
  To: Olaf Hering
  Cc: Kevin Wolf, Stefano Stabellini, qemu-block, Tim Smith,
	qemu-devel, Max Reitz, Paul Durrant, Anthony Perard, xen-devel

Olaf Hering <olaf@aepfle.de> writes:

> On Fri, Nov 02, Kevin Wolf wrote:
>
>> A while ago, a downstream patch review found out that there are some QMP
>> commands that would immediately crash if a xen_disk device were present
>> because of the lacking qdevification. This is not the code quality
>> standard I envision for QEMU. It's time for non-qdev devices to go.
>
> Do you have that backwards by any chance? IMO the presence of assert()
> contributes to bad code quality, not the drivers that trigger those
> asserts. It is bad enough that two QEMU releases went out while being in
> bad shape.

Converting block devices to the qdev infrastructure (introduced in 2009)
has been a long-winded affair.  We've repeatedly reminded people in
charge of the stragglers to update their code.

Neglecting to update code to current infrastructure creates a burden and
a risk.  The burden is on the maintainers of the infrastructure: they
get to drag along outmoded infrastructure.  The risk is on the users of
the code: it becomes a special case, and eventually acquires its special
bugs.

Putting the blame for them entirely on the maintainers of the
infrastructure is not fair.

> Anyway, hopefully Paul or whoever will find the time and energy to
> convert the code at some point.

Yes, Paul's taking care of it.  Much appreciated.

^ permalink raw reply	[flat|nested] 49+ messages in thread

end of thread, other threads:[~2018-12-12 12:04 UTC | newest]

Thread overview: 49+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-02 10:00 [Qemu-devel] [PATCH 0/3] Performance improvements for xen_disk v2 Tim Smith
2018-11-02 10:00 ` [PATCH 1/3] Improve xen_disk batching behaviour Tim Smith
2018-11-02 10:00 ` [Qemu-devel] " Tim Smith
2018-11-02 11:14   ` Paul Durrant
2018-11-02 11:14   ` Paul Durrant
2018-11-02 13:53   ` Anthony PERARD
2018-11-02 13:53   ` [Qemu-devel] " Anthony PERARD
2018-11-02 10:01 ` [Qemu-devel] [PATCH 2/3] Improve xen_disk response latency Tim Smith
2018-11-02 11:14   ` Paul Durrant
2018-11-02 11:14     ` Paul Durrant
2018-11-02 13:53   ` Anthony PERARD
2018-11-02 13:53   ` [Qemu-devel] " Anthony PERARD
2018-11-02 10:01 ` Tim Smith
2018-11-02 10:01 ` [PATCH 3/3] Avoid repeated memory allocation in xen_disk Tim Smith
2018-11-02 10:01 ` [Qemu-devel] " Tim Smith
2018-11-02 11:15   ` Paul Durrant
2018-11-02 11:15     ` Paul Durrant
2018-11-02 13:53   ` [Qemu-devel] " Anthony PERARD
2018-11-02 13:53     ` Anthony PERARD
2018-11-02 11:04 ` xen_disk qdevification (was: [PATCH 0/3] Performance improvements for xen_disk v2) Kevin Wolf
2018-11-02 11:04 ` [Qemu-devel] " Kevin Wolf
2018-11-02 11:13   ` Paul Durrant
2018-11-02 11:13   ` [Qemu-devel] " Paul Durrant
2018-11-02 12:14     ` Kevin Wolf
2018-11-02 12:14     ` [Qemu-devel] " Kevin Wolf
2018-11-05 15:57     ` [Qemu-devel] xen_disk qdevification Markus Armbruster
2018-11-05 15:57       ` Markus Armbruster
2018-11-05 16:15       ` Paul Durrant
2018-11-05 16:15         ` Paul Durrant
2018-11-08 14:00       ` Paul Durrant
2018-11-08 14:00       ` Paul Durrant
2018-11-08 15:21         ` Kevin Wolf
2018-11-08 15:43           ` Paul Durrant
2018-11-08 15:43           ` Paul Durrant
2018-11-08 16:44             ` Paul Durrant
2018-11-09 10:27               ` Paul Durrant
2018-11-09 10:40                 ` Kevin Wolf
2018-11-09 10:40                 ` Kevin Wolf
2018-11-09 10:27               ` Paul Durrant
2018-11-08 16:44             ` Paul Durrant
2018-11-08 15:21         ` Kevin Wolf
2018-12-12  8:59   ` [Qemu-devel] [Xen-devel] xen_disk qdevification (was: [PATCH 0/3] Performance improvements for xen_disk v2) Olaf Hering
2018-12-12  9:22     ` Paul Durrant
2018-12-12  9:22       ` Paul Durrant
2018-12-12 12:03     ` [Qemu-devel] [Xen-devel] " Kevin Wolf
2018-12-12 12:03     ` Kevin Wolf
2018-12-12 12:04     ` [Qemu-devel] xen_disk qdevification Markus Armbruster
2018-12-12 12:04     ` [Qemu-devel] [Xen-devel] " Markus Armbruster
2018-12-12  8:59   ` xen_disk qdevification (was: [PATCH 0/3] Performance improvements for xen_disk v2) Olaf Hering
