qemu-devel.nongnu.org archive mirror
* [PATCH v2 for-5.0 0/7] block-copy improvements: part I
@ 2019-11-27 18:08 Vladimir Sementsov-Ogievskiy
  2019-11-27 18:08 ` [PATCH v2 1/7] block/block-copy: specialcase first copy_range request Vladimir Sementsov-Ogievskiy
                   ` (7 more replies)
  0 siblings, 8 replies; 40+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-11-27 18:08 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, vsementsov, qemu-devel, mreitz, den, jsnow

Hi all!

This is a first part of my
  [RFC 00/24] backup performance: block_status + async

The patches are mostly independent in their ideas, but sending them all
separately would be inefficient.

Vladimir Sementsov-Ogievskiy (7):
  block/block-copy: specialcase first copy_range request
  block/block-copy: use block_status
  block/block-copy: factor out block_copy_find_inflight_req
  block/block-copy: refactor interfaces to use bytes instead of end
  block/block-copy: rename start to offset in interfaces
  block/block-copy: reduce intersecting request lock
  block/block-copy: hide structure definitions

 include/block/block-copy.h |  57 +-----
 block/backup-top.c         |   6 +-
 block/backup.c             |  27 +--
 block/block-copy.c         | 379 ++++++++++++++++++++++++++++---------
 block/trace-events         |   1 +
 5 files changed, 316 insertions(+), 154 deletions(-)

-- 
2.21.0



^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH v2 1/7] block/block-copy: specialcase first copy_range request
  2019-11-27 18:08 [PATCH v2 for-5.0 0/7] block-copy improvements: part I Vladimir Sementsov-Ogievskiy
@ 2019-11-27 18:08 ` Vladimir Sementsov-Ogievskiy
  2020-01-29  7:38   ` Andrey Shinkevich
  2020-02-07 17:28   ` Max Reitz
  2019-11-27 18:08 ` [PATCH v2 2/7] block/block-copy: use block_status Vladimir Sementsov-Ogievskiy
                   ` (6 subsequent siblings)
  7 siblings, 2 replies; 40+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-11-27 18:08 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, vsementsov, qemu-devel, mreitz, den, jsnow

In block_copy_do_copy we fall back to read+write if copy_range fails.
In this case copy_size is larger than the limit defined for buffered
I/O, and there is a corresponding comment about it. Still, backup
copies data cluster by cluster, and most requests are limited to one
cluster anyway, so the only source of such a badly limited request is
the copy-before-write operation.

A further patch will move backup to use block_copy directly; then, for
cases where copy_range is not supported, the first request of each
backup would be oversized. That is not good, so let's change it now.

The fix is simple: just limit the first copy_range request like a
buffer-based request. If it succeeds, set the larger copy_range limit.
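
Not part of the patch, just to illustrate the intended behaviour: a
minimal, self-contained C sketch of the sizing logic. All names and
constants here (CopyState, try_copy_range, MAX_BUFFER, MAX_COPY_RANGE)
are made-up stand-ins, not the QEMU API; the real logic lives in
block_copy_state_new() and block_copy_do_copy().

    #include <inttypes.h>
    #include <stdbool.h>
    #include <stdio.h>

    #define MAX_BUFFER     (1 * 1024 * 1024)   /* buffered-I/O chunk size */
    #define MAX_COPY_RANGE (16 * 1024 * 1024)  /* enlarged offloaded chunk */

    typedef struct CopyState {
        bool use_copy_range;
        int64_t copy_size;
    } CopyState;

    /* stand-in for the offloaded copy; pretend it always succeeds here */
    static bool try_copy_range(int64_t offset, int64_t bytes)
    {
        (void)offset;
        (void)bytes;
        return true;
    }

    static void copy_chunk(CopyState *s, int64_t offset)
    {
        if (s->use_copy_range && !try_copy_range(offset, s->copy_size)) {
            /* offload failed: fall back to buffered chunks from now on */
            s->use_copy_range = false;
            s->copy_size = MAX_BUFFER;
        } else if (s->use_copy_range) {
            /* first offloaded chunk worked: only now enlarge the chunks */
            s->copy_size = MAX_COPY_RANGE;
        }
        /* ... perform the actual copy of this chunk ... */
    }

    int main(void)
    {
        /* start conservatively: offload enabled, but buffer-sized chunks */
        CopyState s = { .use_copy_range = true, .copy_size = MAX_BUFFER };

        copy_chunk(&s, 0);
        printf("copy_size after first chunk: %" PRId64 "\n", s.copy_size);
        return 0;
    }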

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/block-copy.c | 41 ++++++++++++++++++++++++++++++-----------
 1 file changed, 30 insertions(+), 11 deletions(-)

diff --git a/block/block-copy.c b/block/block-copy.c
index 79798a1567..8602e2cae7 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -70,16 +70,19 @@ void block_copy_state_free(BlockCopyState *s)
     g_free(s);
 }
 
+static uint32_t block_copy_max_transfer(BdrvChild *source, BdrvChild *target)
+{
+    return MIN_NON_ZERO(INT_MAX,
+                        MIN_NON_ZERO(source->bs->bl.max_transfer,
+                                     target->bs->bl.max_transfer));
+}
+
 BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
                                      int64_t cluster_size,
                                      BdrvRequestFlags write_flags, Error **errp)
 {
     BlockCopyState *s;
     BdrvDirtyBitmap *copy_bitmap;
-    uint32_t max_transfer =
-            MIN_NON_ZERO(INT_MAX,
-                         MIN_NON_ZERO(source->bs->bl.max_transfer,
-                                      target->bs->bl.max_transfer));
 
     copy_bitmap = bdrv_create_dirty_bitmap(source->bs, cluster_size, NULL,
                                            errp);
@@ -99,7 +102,7 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
         .mem = shres_create(BLOCK_COPY_MAX_MEM),
     };
 
-    if (max_transfer < cluster_size) {
+    if (block_copy_max_transfer(source, target) < cluster_size) {
         /*
          * copy_range does not respect max_transfer. We don't want to bother
          * with requests smaller than block-copy cluster size, so fallback to
@@ -114,12 +117,11 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
         s->copy_size = cluster_size;
     } else {
         /*
-         * copy_range does not respect max_transfer (it's a TODO), so we factor
-         * that in here.
+         * We enable copy-range, but keep small copy_size, until first
+         * successful copy_range (look at block_copy_do_copy).
          */
         s->use_copy_range = true;
-        s->copy_size = MIN(MAX(cluster_size, BLOCK_COPY_MAX_COPY_RANGE),
-                           QEMU_ALIGN_DOWN(max_transfer, cluster_size));
+        s->copy_size = MAX(s->cluster_size, BLOCK_COPY_MAX_BUFFER);
     }
 
     QLIST_INIT(&s->inflight_reqs);
@@ -168,7 +170,21 @@ static int coroutine_fn block_copy_do_copy(BlockCopyState *s,
             s->use_copy_range = false;
             s->copy_size = MAX(s->cluster_size, BLOCK_COPY_MAX_BUFFER);
             /* Fallback to read+write with allocated buffer */
-        } else {
+        } else if (s->use_copy_range) {
+            /*
+             * Successful copy-range. Now increase copy_size.
+             * copy_range does not respect max_transfer (it's a TODO), so we
+             * factor that in here.
+             *
+             * Note: we double-check s->use_copy_range for the case when
+             * parallel block-copy request unset it during previous
+             * bdrv_co_copy_range call.
+             */
+            s->copy_size =
+                    MIN(MAX(s->cluster_size, BLOCK_COPY_MAX_COPY_RANGE),
+                        QEMU_ALIGN_DOWN(block_copy_max_transfer(s->source,
+                                                                s->target),
+                                        s->cluster_size));
             goto out;
         }
     }
@@ -176,7 +192,10 @@ static int coroutine_fn block_copy_do_copy(BlockCopyState *s,
     /*
      * In case of failed copy_range request above, we may proceed with buffered
      * request larger than BLOCK_COPY_MAX_BUFFER. Still, further requests will
-     * be properly limited, so don't care too much.
+     * be properly limited, so don't care too much. Moreover, the most likely
+     * case (copy_range is unsupported for the configuration, so the very
+     * first copy_range request fails) is handled by setting the large
+     * copy_size only after the first successful copy_range.
      */
 
     bounce_buffer = qemu_blockalign(s->source->bs, nbytes);
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 2/7] block/block-copy: use block_status
  2019-11-27 18:08 [PATCH v2 for-5.0 0/7] block-copy improvements: part I Vladimir Sementsov-Ogievskiy
  2019-11-27 18:08 ` [PATCH v2 1/7] block/block-copy: specialcase first copy_range request Vladimir Sementsov-Ogievskiy
@ 2019-11-27 18:08 ` Vladimir Sementsov-Ogievskiy
  2020-01-29  9:12   ` Andrey Shinkevich
  2020-02-07 17:46   ` Max Reitz
  2019-11-27 18:08 ` [PATCH v2 3/7] block/block-copy: factor out block_copy_find_inflight_req Vladimir Sementsov-Ogievskiy
                   ` (5 subsequent siblings)
  7 siblings, 2 replies; 40+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-11-27 18:08 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, vsementsov, qemu-devel, mreitz, den, jsnow

Use bdrv_block_status_above to choose an effective chunk size and to
handle zeroes efficiently.

This replaces the check for merely being allocated or not, and drops
the old code path for it. Assistance from the backup job is dropped
too, as caching block-status information is harder than just caching
is-allocated information in our dirty bitmap, and the backup job is
not a good place for this caching anyway.
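
As an illustration only, here is a tiny, self-contained C sketch of the
clamping rule that the new block_copy_block_status() applies to the
block-status answer (CLUSTER and clamp_status() are made-up names, not
the QEMU API): on error or an answer shorter than one cluster, assume a
single cluster of data (the real function also forces the status to
allocated data in that case); at end of device, round up so the short
tail cluster is covered; otherwise round down to cluster granularity.

    #include <inttypes.h>
    #include <stdio.h>

    #define CLUSTER (64 * 1024)

    static int64_t align_down(int64_t x) { return x - x % CLUSTER; }
    static int64_t align_up(int64_t x)   { return align_down(x + CLUSTER - 1); }

    /* @ret/@num: what block status reported at @offset; @len: device length */
    static int64_t clamp_status(int ret, int64_t num, int64_t offset, int64_t len)
    {
        if (ret < 0 || num < CLUSTER) {
            return CLUSTER;            /* treat as one cluster of data */
        }
        if (offset + num == len) {
            return align_up(num);      /* may cover a short tail cluster */
        }
        return align_down(num);
    }

    int main(void)
    {
        /* 100000 bytes reported in the middle of the image: rounded down */
        printf("%" PRId64 "\n", clamp_status(0, 100000, 0, 10 * CLUSTER));
        return 0;
    }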

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/block-copy.c | 67 +++++++++++++++++++++++++++++++++++++---------
 block/trace-events |  1 +
 2 files changed, 55 insertions(+), 13 deletions(-)

diff --git a/block/block-copy.c b/block/block-copy.c
index 8602e2cae7..74295d93d5 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -152,7 +152,7 @@ void block_copy_set_callbacks(
  */
 static int coroutine_fn block_copy_do_copy(BlockCopyState *s,
                                            int64_t start, int64_t end,
-                                           bool *error_is_read)
+                                           bool zeroes, bool *error_is_read)
 {
     int ret;
     int nbytes = MIN(end, s->len) - start;
@@ -162,6 +162,18 @@ static int coroutine_fn block_copy_do_copy(BlockCopyState *s,
     assert(QEMU_IS_ALIGNED(end, s->cluster_size));
     assert(end < s->len || end == QEMU_ALIGN_UP(s->len, s->cluster_size));
 
+    if (zeroes) {
+        ret = bdrv_co_pwrite_zeroes(s->target, start, nbytes, s->write_flags &
+                                    ~BDRV_REQ_WRITE_COMPRESSED);
+        if (ret < 0) {
+            trace_block_copy_write_zeroes_fail(s, start, ret);
+            if (error_is_read) {
+                *error_is_read = false;
+            }
+        }
+        return ret;
+    }
+
     if (s->use_copy_range) {
         ret = bdrv_co_copy_range(s->source, start, s->target, start, nbytes,
                                  0, s->write_flags);
@@ -225,6 +237,34 @@ out:
     return ret;
 }
 
+static int block_copy_block_status(BlockCopyState *s, int64_t offset,
+                                   int64_t bytes, int64_t *pnum)
+{
+    int64_t num;
+    BlockDriverState *base;
+    int ret;
+
+    if (s->skip_unallocated && s->source->bs->backing) {
+        base = s->source->bs->backing->bs;
+    } else {
+        base = NULL;
+    }
+
+    ret = bdrv_block_status_above(s->source->bs, base, offset, bytes, &num,
+                                  NULL, NULL);
+    if (ret < 0 || num < s->cluster_size) {
+        num = s->cluster_size;
+        ret = BDRV_BLOCK_ALLOCATED | BDRV_BLOCK_DATA;
+    } else if (offset + num == s->len) {
+        num = QEMU_ALIGN_UP(num, s->cluster_size);
+    } else {
+        num = QEMU_ALIGN_DOWN(num, s->cluster_size);
+    }
+
+    *pnum = num;
+    return ret;
+}
+
 /*
  * Check if the cluster starting at offset is allocated or not.
  * return via pnum the number of contiguous clusters sharing this allocation.
@@ -301,7 +341,6 @@ int coroutine_fn block_copy(BlockCopyState *s,
 {
     int ret = 0;
     int64_t end = bytes + start; /* bytes */
-    int64_t status_bytes;
     BlockCopyInFlightReq req;
 
     /*
@@ -318,7 +357,7 @@ int coroutine_fn block_copy(BlockCopyState *s,
     block_copy_inflight_req_begin(s, &req, start, end);
 
     while (start < end) {
-        int64_t next_zero, chunk_end;
+        int64_t next_zero, chunk_end, status_bytes;
 
         if (!bdrv_dirty_bitmap_get(s->copy_bitmap, start)) {
             trace_block_copy_skip(s, start);
@@ -336,23 +375,25 @@ int coroutine_fn block_copy(BlockCopyState *s,
             chunk_end = next_zero;
         }
 
-        if (s->skip_unallocated) {
-            ret = block_copy_reset_unallocated(s, start, &status_bytes);
-            if (ret == 0) {
-                trace_block_copy_skip_range(s, start, status_bytes);
-                start += status_bytes;
-                continue;
-            }
-            /* Clamp to known allocated region */
-            chunk_end = MIN(chunk_end, start + status_bytes);
+        ret = block_copy_block_status(s, start, chunk_end - start,
+                                      &status_bytes);
+        if (s->skip_unallocated && !(ret & BDRV_BLOCK_ALLOCATED)) {
+            bdrv_reset_dirty_bitmap(s->copy_bitmap, start, status_bytes);
+            s->progress_reset_callback(s->progress_opaque);
+            trace_block_copy_skip_range(s, start, status_bytes);
+            start += status_bytes;
+            continue;
         }
 
+        chunk_end = MIN(chunk_end, start + status_bytes);
+
         trace_block_copy_process(s, start);
 
         bdrv_reset_dirty_bitmap(s->copy_bitmap, start, chunk_end - start);
 
         co_get_from_shres(s->mem, chunk_end - start);
-        ret = block_copy_do_copy(s, start, chunk_end, error_is_read);
+        ret = block_copy_do_copy(s, start, chunk_end, ret & BDRV_BLOCK_ZERO,
+                                 error_is_read);
         co_put_to_shres(s->mem, chunk_end - start);
         if (ret < 0) {
             bdrv_set_dirty_bitmap(s->copy_bitmap, start, chunk_end - start);
diff --git a/block/trace-events b/block/trace-events
index 6ba86decca..346537a1d2 100644
--- a/block/trace-events
+++ b/block/trace-events
@@ -48,6 +48,7 @@ block_copy_process(void *bcs, int64_t start) "bcs %p start %"PRId64
 block_copy_copy_range_fail(void *bcs, int64_t start, int ret) "bcs %p start %"PRId64" ret %d"
 block_copy_read_fail(void *bcs, int64_t start, int ret) "bcs %p start %"PRId64" ret %d"
 block_copy_write_fail(void *bcs, int64_t start, int ret) "bcs %p start %"PRId64" ret %d"
+block_copy_write_zeroes_fail(void *bcs, int64_t start, int ret) "bcs %p start %"PRId64" ret %d"
 
 # ../blockdev.c
 qmp_block_job_cancel(void *job) "job %p"
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 3/7] block/block-copy: factor out block_copy_find_inflight_req
  2019-11-27 18:08 [PATCH v2 for-5.0 0/7] block-copy improvements: part I Vladimir Sementsov-Ogievskiy
  2019-11-27 18:08 ` [PATCH v2 1/7] block/block-copy: specialcase first copy_range request Vladimir Sementsov-Ogievskiy
  2019-11-27 18:08 ` [PATCH v2 2/7] block/block-copy: use block_status Vladimir Sementsov-Ogievskiy
@ 2019-11-27 18:08 ` Vladimir Sementsov-Ogievskiy
  2020-01-29  9:25   ` Andrey Shinkevich
  2020-02-07 17:50   ` Max Reitz
  2019-11-27 18:08 ` [PATCH v2 4/7] block/block-copy: refactor interfaces to use bytes instead of end Vladimir Sementsov-Ogievskiy
                   ` (4 subsequent siblings)
  7 siblings, 2 replies; 40+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-11-27 18:08 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, vsementsov, qemu-devel, mreitz, den, jsnow

Split out block_copy_find_inflight_req so that it can be used separately.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/block-copy.c | 31 +++++++++++++++++++------------
 1 file changed, 19 insertions(+), 12 deletions(-)

diff --git a/block/block-copy.c b/block/block-copy.c
index 74295d93d5..94e7e855ef 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -24,23 +24,30 @@
 #define BLOCK_COPY_MAX_BUFFER (1 * MiB)
 #define BLOCK_COPY_MAX_MEM (128 * MiB)
 
+static BlockCopyInFlightReq *block_copy_find_inflight_req(BlockCopyState *s,
+                                                          int64_t start,
+                                                          int64_t end)
+{
+    BlockCopyInFlightReq *req;
+
+    QLIST_FOREACH(req, &s->inflight_reqs, list) {
+        if (end > req->start_byte && start < req->end_byte) {
+            return req;
+        }
+    }
+
+    return NULL;
+}
+
 static void coroutine_fn block_copy_wait_inflight_reqs(BlockCopyState *s,
                                                        int64_t start,
                                                        int64_t end)
 {
     BlockCopyInFlightReq *req;
-    bool waited;
-
-    do {
-        waited = false;
-        QLIST_FOREACH(req, &s->inflight_reqs, list) {
-            if (end > req->start_byte && start < req->end_byte) {
-                qemu_co_queue_wait(&req->wait_queue, NULL);
-                waited = true;
-                break;
-            }
-        }
-    } while (waited);
+
+    while ((req = block_copy_find_inflight_req(s, start, end))) {
+        qemu_co_queue_wait(&req->wait_queue, NULL);
+    }
 }
 
 static void block_copy_inflight_req_begin(BlockCopyState *s,
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 4/7] block/block-copy: refactor interfaces to use bytes instead of end
  2019-11-27 18:08 [PATCH v2 for-5.0 0/7] block-copy improvements: part I Vladimir Sementsov-Ogievskiy
                   ` (2 preceding siblings ...)
  2019-11-27 18:08 ` [PATCH v2 3/7] block/block-copy: factor out block_copy_find_inflight_req Vladimir Sementsov-Ogievskiy
@ 2019-11-27 18:08 ` Vladimir Sementsov-Ogievskiy
  2020-01-29 17:12   ` Andrey Shinkevich
  2020-02-07 18:01   ` Max Reitz
  2019-11-27 18:08 ` [PATCH v2 5/7] block/block-copy: rename start to offset in interfaces Vladimir Sementsov-Ogievskiy
                   ` (3 subsequent siblings)
  7 siblings, 2 replies; 40+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-11-27 18:08 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, vsementsov, qemu-devel, mreitz, den, jsnow

We have a lot of "chunk_end - start" calculations; let's switch to a
bytes/cur_bytes scheme instead.
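
Purely to illustrate the two loop shapes (nothing QEMU-specific, just a
toy example): the old style carries an absolute end and recomputes
"end - start" everywhere, the new style carries the remaining byte
count directly.

    #include <inttypes.h>
    #include <stdio.h>

    /* old shape: absolute end */
    static int64_t walk_end(int64_t start, int64_t end)
    {
        int64_t done = 0;
        while (start < end) {
            int64_t chunk = (end - start < 10) ? end - start : 10;
            done += chunk;
            start += chunk;
        }
        return done;
    }

    /* new shape: remaining bytes */
    static int64_t walk_bytes(int64_t offset, int64_t bytes)
    {
        int64_t done = 0;
        while (bytes) {
            int64_t chunk = bytes < 10 ? bytes : 10;
            done += chunk;
            offset += chunk;
            bytes -= chunk;
        }
        return done;
    }

    int main(void)
    {
        /* both process 37 bytes starting at offset 5 */
        printf("%" PRId64 " %" PRId64 "\n", walk_end(5, 42), walk_bytes(5, 37));
        return 0;
    }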

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 include/block/block-copy.h |  4 +--
 block/block-copy.c         | 68 ++++++++++++++++++++------------------
 2 files changed, 37 insertions(+), 35 deletions(-)

diff --git a/include/block/block-copy.h b/include/block/block-copy.h
index 0a161724d7..7321b3d305 100644
--- a/include/block/block-copy.h
+++ b/include/block/block-copy.h
@@ -19,8 +19,8 @@
 #include "qemu/co-shared-resource.h"
 
 typedef struct BlockCopyInFlightReq {
-    int64_t start_byte;
-    int64_t end_byte;
+    int64_t start;
+    int64_t bytes;
     QLIST_ENTRY(BlockCopyInFlightReq) list;
     CoQueue wait_queue; /* coroutines blocked on this request */
 } BlockCopyInFlightReq;
diff --git a/block/block-copy.c b/block/block-copy.c
index 94e7e855ef..cc273b6cb8 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -26,12 +26,12 @@
 
 static BlockCopyInFlightReq *block_copy_find_inflight_req(BlockCopyState *s,
                                                           int64_t start,
-                                                          int64_t end)
+                                                          int64_t bytes)
 {
     BlockCopyInFlightReq *req;
 
     QLIST_FOREACH(req, &s->inflight_reqs, list) {
-        if (end > req->start_byte && start < req->end_byte) {
+        if (start + bytes > req->start && start < req->start + req->bytes) {
             return req;
         }
     }
@@ -41,21 +41,21 @@ static BlockCopyInFlightReq *block_copy_find_inflight_req(BlockCopyState *s,
 
 static void coroutine_fn block_copy_wait_inflight_reqs(BlockCopyState *s,
                                                        int64_t start,
-                                                       int64_t end)
+                                                       int64_t bytes)
 {
     BlockCopyInFlightReq *req;
 
-    while ((req = block_copy_find_inflight_req(s, start, end))) {
+    while ((req = block_copy_find_inflight_req(s, start, bytes))) {
         qemu_co_queue_wait(&req->wait_queue, NULL);
     }
 }
 
 static void block_copy_inflight_req_begin(BlockCopyState *s,
                                           BlockCopyInFlightReq *req,
-                                          int64_t start, int64_t end)
+                                          int64_t start, int64_t bytes)
 {
-    req->start_byte = start;
-    req->end_byte = end;
+    req->start = start;
+    req->bytes = bytes;
     qemu_co_queue_init(&req->wait_queue);
     QLIST_INSERT_HEAD(&s->inflight_reqs, req, list);
 }
@@ -150,24 +150,26 @@ void block_copy_set_callbacks(
 /*
  * block_copy_do_copy
  *
- * Do copy of cluser-aligned chunk. @end is allowed to exceed s->len only to
- * cover last cluster when s->len is not aligned to clusters.
+ * Do copy of cluster-aligned chunk. Requested region is allowed to exceed s->len
+ * only to cover last cluster when s->len is not aligned to clusters.
  *
  * No sync here: nor bitmap neighter intersecting requests handling, only copy.
  *
  * Returns 0 on success.
  */
 static int coroutine_fn block_copy_do_copy(BlockCopyState *s,
-                                           int64_t start, int64_t end,
+                                           int64_t start, int64_t bytes,
                                            bool zeroes, bool *error_is_read)
 {
     int ret;
-    int nbytes = MIN(end, s->len) - start;
+    int nbytes = MIN(start + bytes, s->len) - start;
     void *bounce_buffer = NULL;
 
+    assert(start >= 0 && bytes > 0 && INT64_MAX - start >= bytes);
     assert(QEMU_IS_ALIGNED(start, s->cluster_size));
-    assert(QEMU_IS_ALIGNED(end, s->cluster_size));
-    assert(end < s->len || end == QEMU_ALIGN_UP(s->len, s->cluster_size));
+    assert(QEMU_IS_ALIGNED(bytes, s->cluster_size));
+    assert(start + bytes <= s->len ||
+           start + bytes == QEMU_ALIGN_UP(s->len, s->cluster_size));
 
     if (zeroes) {
         ret = bdrv_co_pwrite_zeroes(s->target, start, nbytes, s->write_flags &
@@ -347,7 +349,6 @@ int coroutine_fn block_copy(BlockCopyState *s,
                             bool *error_is_read)
 {
     int ret = 0;
-    int64_t end = bytes + start; /* bytes */
     BlockCopyInFlightReq req;
 
     /*
@@ -358,58 +359,59 @@ int coroutine_fn block_copy(BlockCopyState *s,
            bdrv_get_aio_context(s->target->bs));
 
     assert(QEMU_IS_ALIGNED(start, s->cluster_size));
-    assert(QEMU_IS_ALIGNED(end, s->cluster_size));
+    assert(QEMU_IS_ALIGNED(bytes, s->cluster_size));
 
     block_copy_wait_inflight_reqs(s, start, bytes);
-    block_copy_inflight_req_begin(s, &req, start, end);
+    block_copy_inflight_req_begin(s, &req, start, bytes);
 
-    while (start < end) {
-        int64_t next_zero, chunk_end, status_bytes;
+    while (bytes) {
+        int64_t next_zero, cur_bytes, status_bytes;
 
         if (!bdrv_dirty_bitmap_get(s->copy_bitmap, start)) {
             trace_block_copy_skip(s, start);
             start += s->cluster_size;
+            bytes -= s->cluster_size;
             continue; /* already copied */
         }
 
-        chunk_end = MIN(end, start + s->copy_size);
+        cur_bytes = MIN(bytes, s->copy_size);
 
         next_zero = bdrv_dirty_bitmap_next_zero(s->copy_bitmap, start,
-                                                chunk_end - start);
+                                                cur_bytes);
         if (next_zero >= 0) {
             assert(next_zero > start); /* start is dirty */
-            assert(next_zero < chunk_end); /* no need to do MIN() */
-            chunk_end = next_zero;
+            assert(next_zero < start + cur_bytes); /* no need to do MIN() */
+            cur_bytes = next_zero - start;
         }
 
-        ret = block_copy_block_status(s, start, chunk_end - start,
-                                      &status_bytes);
+        ret = block_copy_block_status(s, start, cur_bytes, &status_bytes);
         if (s->skip_unallocated && !(ret & BDRV_BLOCK_ALLOCATED)) {
             bdrv_reset_dirty_bitmap(s->copy_bitmap, start, status_bytes);
             s->progress_reset_callback(s->progress_opaque);
             trace_block_copy_skip_range(s, start, status_bytes);
             start += status_bytes;
+            bytes -= status_bytes;
             continue;
         }
 
-        chunk_end = MIN(chunk_end, start + status_bytes);
+        cur_bytes = MIN(cur_bytes, status_bytes);
 
         trace_block_copy_process(s, start);
 
-        bdrv_reset_dirty_bitmap(s->copy_bitmap, start, chunk_end - start);
+        bdrv_reset_dirty_bitmap(s->copy_bitmap, start, cur_bytes);
 
-        co_get_from_shres(s->mem, chunk_end - start);
-        ret = block_copy_do_copy(s, start, chunk_end, ret & BDRV_BLOCK_ZERO,
+        co_get_from_shres(s->mem, cur_bytes);
+        ret = block_copy_do_copy(s, start, cur_bytes, ret & BDRV_BLOCK_ZERO,
                                  error_is_read);
-        co_put_to_shres(s->mem, chunk_end - start);
+        co_put_to_shres(s->mem, cur_bytes);
         if (ret < 0) {
-            bdrv_set_dirty_bitmap(s->copy_bitmap, start, chunk_end - start);
+            bdrv_set_dirty_bitmap(s->copy_bitmap, start, cur_bytes);
             break;
         }
 
-        s->progress_bytes_callback(chunk_end - start, s->progress_opaque);
-        start = chunk_end;
-        ret = 0;
+        s->progress_bytes_callback(cur_bytes, s->progress_opaque);
+        start += cur_bytes;
+        bytes -= cur_bytes;
     }
 
     block_copy_inflight_req_end(&req);
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 5/7] block/block-copy: rename start to offset in interfaces
  2019-11-27 18:08 [PATCH v2 for-5.0 0/7] block-copy improvements: part I Vladimir Sementsov-Ogievskiy
                   ` (3 preceding siblings ...)
  2019-11-27 18:08 ` [PATCH v2 4/7] block/block-copy: refactor interfaces to use bytes instead of end Vladimir Sementsov-Ogievskiy
@ 2019-11-27 18:08 ` Vladimir Sementsov-Ogievskiy
  2020-01-29 17:37   ` Andrey Shinkevich
  2020-02-07 18:04   ` Max Reitz
  2019-11-27 18:08 ` [PATCH v2 6/7] block/block-copy: reduce intersecting request lock Vladimir Sementsov-Ogievskiy
                   ` (2 subsequent siblings)
  7 siblings, 2 replies; 40+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-11-27 18:08 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, vsementsov, qemu-devel, mreitz, den, jsnow

The offset/bytes pair is the more usual naming in the block layer; let's use it.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 include/block/block-copy.h |  2 +-
 block/block-copy.c         | 80 +++++++++++++++++++-------------------
 2 files changed, 41 insertions(+), 41 deletions(-)

diff --git a/include/block/block-copy.h b/include/block/block-copy.h
index 7321b3d305..d96b097267 100644
--- a/include/block/block-copy.h
+++ b/include/block/block-copy.h
@@ -19,7 +19,7 @@
 #include "qemu/co-shared-resource.h"
 
 typedef struct BlockCopyInFlightReq {
-    int64_t start;
+    int64_t offset;
     int64_t bytes;
     QLIST_ENTRY(BlockCopyInFlightReq) list;
     CoQueue wait_queue; /* coroutines blocked on this request */
diff --git a/block/block-copy.c b/block/block-copy.c
index cc273b6cb8..20068cd699 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -25,13 +25,13 @@
 #define BLOCK_COPY_MAX_MEM (128 * MiB)
 
 static BlockCopyInFlightReq *block_copy_find_inflight_req(BlockCopyState *s,
-                                                          int64_t start,
+                                                          int64_t offset,
                                                           int64_t bytes)
 {
     BlockCopyInFlightReq *req;
 
     QLIST_FOREACH(req, &s->inflight_reqs, list) {
-        if (start + bytes > req->start && start < req->start + req->bytes) {
+        if (offset + bytes > req->offset && offset < req->offset + req->bytes) {
             return req;
         }
     }
@@ -40,21 +40,21 @@ static BlockCopyInFlightReq *block_copy_find_inflight_req(BlockCopyState *s,
 }
 
 static void coroutine_fn block_copy_wait_inflight_reqs(BlockCopyState *s,
-                                                       int64_t start,
+                                                       int64_t offset,
                                                        int64_t bytes)
 {
     BlockCopyInFlightReq *req;
 
-    while ((req = block_copy_find_inflight_req(s, start, bytes))) {
+    while ((req = block_copy_find_inflight_req(s, offset, bytes))) {
         qemu_co_queue_wait(&req->wait_queue, NULL);
     }
 }
 
 static void block_copy_inflight_req_begin(BlockCopyState *s,
                                           BlockCopyInFlightReq *req,
-                                          int64_t start, int64_t bytes)
+                                          int64_t offset, int64_t bytes)
 {
-    req->start = start;
+    req->offset = offset;
     req->bytes = bytes;
     qemu_co_queue_init(&req->wait_queue);
     QLIST_INSERT_HEAD(&s->inflight_reqs, req, list);
@@ -158,24 +158,24 @@ void block_copy_set_callbacks(
  * Returns 0 on success.
  */
 static int coroutine_fn block_copy_do_copy(BlockCopyState *s,
-                                           int64_t start, int64_t bytes,
+                                           int64_t offset, int64_t bytes,
                                            bool zeroes, bool *error_is_read)
 {
     int ret;
-    int nbytes = MIN(start + bytes, s->len) - start;
+    int nbytes = MIN(offset + bytes, s->len) - offset;
     void *bounce_buffer = NULL;
 
-    assert(start >= 0 && bytes > 0 && INT64_MAX - start >= bytes);
-    assert(QEMU_IS_ALIGNED(start, s->cluster_size));
+    assert(offset >= 0 && bytes > 0 && INT64_MAX - offset >= bytes);
+    assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
     assert(QEMU_IS_ALIGNED(bytes, s->cluster_size));
-    assert(start + bytes <= s->len ||
-           start + bytes == QEMU_ALIGN_UP(s->len, s->cluster_size));
+    assert(offset + bytes <= s->len ||
+           offset + bytes == QEMU_ALIGN_UP(s->len, s->cluster_size));
 
     if (zeroes) {
-        ret = bdrv_co_pwrite_zeroes(s->target, start, nbytes, s->write_flags &
+        ret = bdrv_co_pwrite_zeroes(s->target, offset, nbytes, s->write_flags &
                                     ~BDRV_REQ_WRITE_COMPRESSED);
         if (ret < 0) {
-            trace_block_copy_write_zeroes_fail(s, start, ret);
+            trace_block_copy_write_zeroes_fail(s, offset, ret);
             if (error_is_read) {
                 *error_is_read = false;
             }
@@ -184,10 +184,10 @@ static int coroutine_fn block_copy_do_copy(BlockCopyState *s,
     }
 
     if (s->use_copy_range) {
-        ret = bdrv_co_copy_range(s->source, start, s->target, start, nbytes,
+        ret = bdrv_co_copy_range(s->source, offset, s->target, offset, nbytes,
                                  0, s->write_flags);
         if (ret < 0) {
-            trace_block_copy_copy_range_fail(s, start, ret);
+            trace_block_copy_copy_range_fail(s, offset, ret);
             s->use_copy_range = false;
             s->copy_size = MAX(s->cluster_size, BLOCK_COPY_MAX_BUFFER);
             /* Fallback to read+write with allocated buffer */
@@ -221,19 +221,19 @@ static int coroutine_fn block_copy_do_copy(BlockCopyState *s,
 
     bounce_buffer = qemu_blockalign(s->source->bs, nbytes);
 
-    ret = bdrv_co_pread(s->source, start, nbytes, bounce_buffer, 0);
+    ret = bdrv_co_pread(s->source, offset, nbytes, bounce_buffer, 0);
     if (ret < 0) {
-        trace_block_copy_read_fail(s, start, ret);
+        trace_block_copy_read_fail(s, offset, ret);
         if (error_is_read) {
             *error_is_read = true;
         }
         goto out;
     }
 
-    ret = bdrv_co_pwrite(s->target, start, nbytes, bounce_buffer,
+    ret = bdrv_co_pwrite(s->target, offset, nbytes, bounce_buffer,
                          s->write_flags);
     if (ret < 0) {
-        trace_block_copy_write_fail(s, start, ret);
+        trace_block_copy_write_fail(s, offset, ret);
         if (error_is_read) {
             *error_is_read = false;
         }
@@ -345,7 +345,7 @@ int64_t block_copy_reset_unallocated(BlockCopyState *s,
 }
 
 int coroutine_fn block_copy(BlockCopyState *s,
-                            int64_t start, uint64_t bytes,
+                            int64_t offset, uint64_t bytes,
                             bool *error_is_read)
 {
     int ret = 0;
@@ -358,59 +358,59 @@ int coroutine_fn block_copy(BlockCopyState *s,
     assert(bdrv_get_aio_context(s->source->bs) ==
            bdrv_get_aio_context(s->target->bs));
 
-    assert(QEMU_IS_ALIGNED(start, s->cluster_size));
+    assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
     assert(QEMU_IS_ALIGNED(bytes, s->cluster_size));
 
-    block_copy_wait_inflight_reqs(s, start, bytes);
-    block_copy_inflight_req_begin(s, &req, start, bytes);
+    block_copy_wait_inflight_reqs(s, offset, bytes);
+    block_copy_inflight_req_begin(s, &req, offset, bytes);
 
     while (bytes) {
         int64_t next_zero, cur_bytes, status_bytes;
 
-        if (!bdrv_dirty_bitmap_get(s->copy_bitmap, start)) {
-            trace_block_copy_skip(s, start);
-            start += s->cluster_size;
+        if (!bdrv_dirty_bitmap_get(s->copy_bitmap, offset)) {
+            trace_block_copy_skip(s, offset);
+            offset += s->cluster_size;
             bytes -= s->cluster_size;
             continue; /* already copied */
         }
 
         cur_bytes = MIN(bytes, s->copy_size);
 
-        next_zero = bdrv_dirty_bitmap_next_zero(s->copy_bitmap, start,
+        next_zero = bdrv_dirty_bitmap_next_zero(s->copy_bitmap, offset,
                                                 cur_bytes);
         if (next_zero >= 0) {
-            assert(next_zero > start); /* start is dirty */
-            assert(next_zero < start + cur_bytes); /* no need to do MIN() */
-            cur_bytes = next_zero - start;
+            assert(next_zero > offset); /* offset is dirty */
+            assert(next_zero < offset + cur_bytes); /* no need to do MIN() */
+            cur_bytes = next_zero - offset;
         }
 
-        ret = block_copy_block_status(s, start, cur_bytes, &status_bytes);
+        ret = block_copy_block_status(s, offset, cur_bytes, &status_bytes);
         if (s->skip_unallocated && !(ret & BDRV_BLOCK_ALLOCATED)) {
-            bdrv_reset_dirty_bitmap(s->copy_bitmap, start, status_bytes);
+            bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, status_bytes);
             s->progress_reset_callback(s->progress_opaque);
-            trace_block_copy_skip_range(s, start, status_bytes);
-            start += status_bytes;
+            trace_block_copy_skip_range(s, offset, status_bytes);
+            offset += status_bytes;
             bytes -= status_bytes;
             continue;
         }
 
         cur_bytes = MIN(cur_bytes, status_bytes);
 
-        trace_block_copy_process(s, start);
+        trace_block_copy_process(s, offset);
 
-        bdrv_reset_dirty_bitmap(s->copy_bitmap, start, cur_bytes);
+        bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, cur_bytes);
 
         co_get_from_shres(s->mem, cur_bytes);
-        ret = block_copy_do_copy(s, start, cur_bytes, ret & BDRV_BLOCK_ZERO,
+        ret = block_copy_do_copy(s, offset, cur_bytes, ret & BDRV_BLOCK_ZERO,
                                  error_is_read);
         co_put_to_shres(s->mem, cur_bytes);
         if (ret < 0) {
-            bdrv_set_dirty_bitmap(s->copy_bitmap, start, cur_bytes);
+            bdrv_set_dirty_bitmap(s->copy_bitmap, offset, cur_bytes);
             break;
         }
 
         s->progress_bytes_callback(cur_bytes, s->progress_opaque);
-        start += cur_bytes;
+        offset += cur_bytes;
         bytes -= cur_bytes;
     }
 
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 6/7] block/block-copy: reduce intersecting request lock
  2019-11-27 18:08 [PATCH v2 for-5.0 0/7] block-copy improvements: part I Vladimir Sementsov-Ogievskiy
                   ` (4 preceding siblings ...)
  2019-11-27 18:08 ` [PATCH v2 5/7] block/block-copy: rename start to offset in interfaces Vladimir Sementsov-Ogievskiy
@ 2019-11-27 18:08 ` Vladimir Sementsov-Ogievskiy
  2020-01-29 20:05   ` Andrey Shinkevich
                     ` (2 more replies)
  2019-11-27 18:08 ` [PATCH v2 7/7] block/block-copy: hide structure definitions Vladimir Sementsov-Ogievskiy
  2019-12-19  9:01 ` [PATCH v2 for-5.0 0/7] block-copy improvements: part I Vladimir Sementsov-Ogievskiy
  7 siblings, 3 replies; 40+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-11-27 18:08 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, vsementsov, qemu-devel, mreitz, den, jsnow

Currently, the block_copy operation locks the whole requested region.
But there is no reason to keep clusters that are already copied locked;
it only disturbs other parallel block_copy requests for no reason.

Let's instead do the following:

Lock only the sub-region we are going to operate on. Then, after
copying all dirty sub-regions, wait for intersecting block-copy
requests; if they failed, retry these newly dirtied clusters.
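
For illustration, a minimal, self-contained C sketch of the resulting
top-level retry loop. The helpers here are trivial stand-ins (they just
terminate the loop), not the real block_copy_dirty_clusters() and
block_copy_wait_one() from this patch.

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    /* stand-in: <0 error, 0 nothing dirty, >0 copied something (and yielded) */
    static int copy_dirty_clusters(int64_t offset, int64_t bytes)
    {
        (void)offset;
        (void)bytes;
        return 0;
    }

    /* stand-in: true if we had to wait for an intersecting in-flight request */
    static bool wait_one_intersecting_req(int64_t offset, int64_t bytes)
    {
        (void)offset;
        (void)bytes;
        return false;
    }

    static int copy_region(int64_t offset, int64_t bytes)
    {
        while (true) {
            int ret = copy_dirty_clusters(offset, bytes);
            if (ret < 0) {
                return ret;   /* a real I/O failure fails the whole request */
            }
            if (ret > 0) {
                continue;     /* we yielded; new dirty bits may have appeared */
            }
            /* nothing dirty left; done unless someone still copies our range */
            if (!wait_one_intersecting_req(offset, bytes)) {
                return 0;
            }
            /* the waited request may have failed and re-dirtied bits: retry */
        }
    }

    int main(void)
    {
        printf("%d\n", copy_region(0, 1 << 20));
        return 0;
    }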

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/block-copy.c | 116 +++++++++++++++++++++++++++++++++++++--------
 1 file changed, 95 insertions(+), 21 deletions(-)

diff --git a/block/block-copy.c b/block/block-copy.c
index 20068cd699..aca44b13fb 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -39,29 +39,62 @@ static BlockCopyInFlightReq *block_copy_find_inflight_req(BlockCopyState *s,
     return NULL;
 }
 
-static void coroutine_fn block_copy_wait_inflight_reqs(BlockCopyState *s,
-                                                       int64_t offset,
-                                                       int64_t bytes)
+/*
+ * If there are no intersecting requests return false. Otherwise, wait for the
+ * first found intersecting request to finish and return true.
+ */
+static bool coroutine_fn block_copy_wait_one(BlockCopyState *s, int64_t start,
+                                             int64_t end)
 {
-    BlockCopyInFlightReq *req;
+    BlockCopyInFlightReq *req = block_copy_find_inflight_req(s, start, end);
 
-    while ((req = block_copy_find_inflight_req(s, offset, bytes))) {
-        qemu_co_queue_wait(&req->wait_queue, NULL);
+    if (!req) {
+        return false;
     }
+
+    qemu_co_queue_wait(&req->wait_queue, NULL);
+
+    return true;
 }
 
+/* Called only on full-dirty region */
 static void block_copy_inflight_req_begin(BlockCopyState *s,
                                           BlockCopyInFlightReq *req,
                                           int64_t offset, int64_t bytes)
 {
+    assert(!block_copy_find_inflight_req(s, offset, bytes));
+
+    bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, bytes);
+
     req->offset = offset;
     req->bytes = bytes;
     qemu_co_queue_init(&req->wait_queue);
     QLIST_INSERT_HEAD(&s->inflight_reqs, req, list);
 }
 
-static void coroutine_fn block_copy_inflight_req_end(BlockCopyInFlightReq *req)
+static void coroutine_fn block_copy_inflight_req_shrink(BlockCopyState *s,
+        BlockCopyInFlightReq *req, int64_t new_bytes)
 {
+    if (new_bytes == req->bytes) {
+        return;
+    }
+
+    assert(new_bytes > 0 && new_bytes < req->bytes);
+
+    bdrv_set_dirty_bitmap(s->copy_bitmap,
+                          req->offset + new_bytes, req->bytes - new_bytes);
+
+    req->bytes = new_bytes;
+    qemu_co_queue_restart_all(&req->wait_queue);
+}
+
+static void coroutine_fn block_copy_inflight_req_end(BlockCopyState *s,
+                                                     BlockCopyInFlightReq *req,
+                                                     int ret)
+{
+    if (ret < 0) {
+        bdrv_set_dirty_bitmap(s->copy_bitmap, req->offset, req->bytes);
+    }
     QLIST_REMOVE(req, list);
     qemu_co_queue_restart_all(&req->wait_queue);
 }
@@ -344,12 +377,19 @@ int64_t block_copy_reset_unallocated(BlockCopyState *s,
     return ret;
 }
 
-int coroutine_fn block_copy(BlockCopyState *s,
-                            int64_t offset, uint64_t bytes,
-                            bool *error_is_read)
+/*
+ * block_copy_dirty_clusters
+ *
+ * Copy dirty clusters in @offset/@bytes range.
+ * Returns 1 if dirty clusters found and successfully copied, 0 if no dirty
+ * clusters found and -errno on failure.
+ */
+static int coroutine_fn block_copy_dirty_clusters(BlockCopyState *s,
+                                                  int64_t offset, int64_t bytes,
+                                                  bool *error_is_read)
 {
     int ret = 0;
-    BlockCopyInFlightReq req;
+    bool found_dirty = false;
 
     /*
      * block_copy() user is responsible for keeping source and target in same
@@ -361,10 +401,8 @@ int coroutine_fn block_copy(BlockCopyState *s,
     assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
     assert(QEMU_IS_ALIGNED(bytes, s->cluster_size));
 
-    block_copy_wait_inflight_reqs(s, offset, bytes);
-    block_copy_inflight_req_begin(s, &req, offset, bytes);
-
     while (bytes) {
+        BlockCopyInFlightReq req;
         int64_t next_zero, cur_bytes, status_bytes;
 
         if (!bdrv_dirty_bitmap_get(s->copy_bitmap, offset)) {
@@ -374,6 +412,8 @@ int coroutine_fn block_copy(BlockCopyState *s,
             continue; /* already copied */
         }
 
+        found_dirty = true;
+
         cur_bytes = MIN(bytes, s->copy_size);
 
         next_zero = bdrv_dirty_bitmap_next_zero(s->copy_bitmap, offset,
@@ -383,10 +423,12 @@ int coroutine_fn block_copy(BlockCopyState *s,
             assert(next_zero < offset + cur_bytes); /* no need to do MIN() */
             cur_bytes = next_zero - offset;
         }
+        block_copy_inflight_req_begin(s, &req, offset, cur_bytes);
 
         ret = block_copy_block_status(s, offset, cur_bytes, &status_bytes);
+        block_copy_inflight_req_shrink(s, &req, status_bytes);
         if (s->skip_unallocated && !(ret & BDRV_BLOCK_ALLOCATED)) {
-            bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, status_bytes);
+            block_copy_inflight_req_end(s, &req, 0);
             s->progress_reset_callback(s->progress_opaque);
             trace_block_copy_skip_range(s, offset, status_bytes);
             offset += status_bytes;
@@ -398,15 +440,13 @@ int coroutine_fn block_copy(BlockCopyState *s,
 
         trace_block_copy_process(s, offset);
 
-        bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, cur_bytes);
-
         co_get_from_shres(s->mem, cur_bytes);
         ret = block_copy_do_copy(s, offset, cur_bytes, ret & BDRV_BLOCK_ZERO,
                                  error_is_read);
         co_put_to_shres(s->mem, cur_bytes);
+        block_copy_inflight_req_end(s, &req, ret);
         if (ret < 0) {
-            bdrv_set_dirty_bitmap(s->copy_bitmap, offset, cur_bytes);
-            break;
+            return ret;
         }
 
         s->progress_bytes_callback(cur_bytes, s->progress_opaque);
@@ -414,7 +454,41 @@ int coroutine_fn block_copy(BlockCopyState *s,
         bytes -= cur_bytes;
     }
 
-    block_copy_inflight_req_end(&req);
+    return found_dirty;
+}
 
-    return ret;
+int coroutine_fn block_copy(BlockCopyState *s, int64_t start, uint64_t bytes,
+                            bool *error_is_read)
+{
+    while (true) {
+        int ret = block_copy_dirty_clusters(s, start, bytes, error_is_read);
+
+        if (ret < 0) {
+            /*
+             * IO operation failed, which means the whole block_copy request
+             * failed.
+             */
+            return ret;
+        }
+        if (ret) {
+            /*
+             * Something was copied, which means that there were yield points
+             * and some new dirty bits may have appeared (due to failed parallel
+             * block-copy requests).
+             */
+            continue;
+        }
+
+        /*
+         * Here ret == 0, which means that there are no dirty clusters in
+         * the requested region.
+         */
+
+        if (!block_copy_wait_one(s, start, bytes)) {
+            /* No dirty bits and nothing to wait: the whole request is done */
+            break;
+        }
+    }
+
+    return 0;
 }
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 7/7] block/block-copy: hide structure definitions
  2019-11-27 18:08 [PATCH v2 for-5.0 0/7] block-copy improvements: part I Vladimir Sementsov-Ogievskiy
                   ` (5 preceding siblings ...)
  2019-11-27 18:08 ` [PATCH v2 6/7] block/block-copy: reduce intersecting request lock Vladimir Sementsov-Ogievskiy
@ 2019-11-27 18:08 ` Vladimir Sementsov-Ogievskiy
  2020-01-30 18:58   ` Andrey Shinkevich
  2020-02-17 14:04   ` Max Reitz
  2019-12-19  9:01 ` [PATCH v2 for-5.0 0/7] block-copy improvements: part I Vladimir Sementsov-Ogievskiy
  7 siblings, 2 replies; 40+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-11-27 18:08 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, vsementsov, qemu-devel, mreitz, den, jsnow

Hide the structure definitions and add an explicit API instead, to keep
an eye on the scope of the shared fields.
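
The pattern itself, boiled down to a few lines of illustrative C (foo.h
/ foo.c and the names are made up, not the block-copy API): the header
exposes only an incomplete type plus accessor prototypes, and only the
.c file knows the field layout.

    /* foo.h -- public interface: the struct stays incomplete here */
    typedef struct Foo Foo;
    int foo_get_value(Foo *f);
    void foo_set_skip(Foo *f, int skip);

    /* foo.c -- the only translation unit that sees the fields */
    struct Foo {
        int value;
        int skip;
    };

    int foo_get_value(Foo *f)
    {
        return f->value;
    }

    void foo_set_skip(Foo *f, int skip)
    {
        f->skip = skip;
    }

    int main(void)
    {
        Foo f = { .value = 7 };   /* possible here; callers only get Foo * */
        foo_set_skip(&f, 1);
        return foo_get_value(&f) == 7 ? 0 : 1;
    }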

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 include/block/block-copy.h | 57 +++------------------------------
 block/backup-top.c         |  6 ++--
 block/backup.c             | 27 ++++++++--------
 block/block-copy.c         | 64 ++++++++++++++++++++++++++++++++++++++
 4 files changed, 86 insertions(+), 68 deletions(-)

diff --git a/include/block/block-copy.h b/include/block/block-copy.h
index d96b097267..753fa663ac 100644
--- a/include/block/block-copy.h
+++ b/include/block/block-copy.h
@@ -18,61 +18,9 @@
 #include "block/block.h"
 #include "qemu/co-shared-resource.h"
 
-typedef struct BlockCopyInFlightReq {
-    int64_t offset;
-    int64_t bytes;
-    QLIST_ENTRY(BlockCopyInFlightReq) list;
-    CoQueue wait_queue; /* coroutines blocked on this request */
-} BlockCopyInFlightReq;
-
 typedef void (*ProgressBytesCallbackFunc)(int64_t bytes, void *opaque);
 typedef void (*ProgressResetCallbackFunc)(void *opaque);
-typedef struct BlockCopyState {
-    /*
-     * BdrvChild objects are not owned or managed by block-copy. They are
-     * provided by block-copy user and user is responsible for appropriate
-     * permissions on these children.
-     */
-    BdrvChild *source;
-    BdrvChild *target;
-    BdrvDirtyBitmap *copy_bitmap;
-    int64_t cluster_size;
-    bool use_copy_range;
-    int64_t copy_size;
-    uint64_t len;
-    QLIST_HEAD(, BlockCopyInFlightReq) inflight_reqs;
-
-    BdrvRequestFlags write_flags;
-
-    /*
-     * skip_unallocated:
-     *
-     * Used by sync=top jobs, which first scan the source node for unallocated
-     * areas and clear them in the copy_bitmap.  During this process, the bitmap
-     * is thus not fully initialized: It may still have bits set for areas that
-     * are unallocated and should actually not be copied.
-     *
-     * This is indicated by skip_unallocated.
-     *
-     * In this case, block_copy() will query the source’s allocation status,
-     * skip unallocated regions, clear them in the copy_bitmap, and invoke
-     * block_copy_reset_unallocated() every time it does.
-     */
-    bool skip_unallocated;
-
-    /* progress_bytes_callback: called when some copying progress is done. */
-    ProgressBytesCallbackFunc progress_bytes_callback;
-
-    /*
-     * progress_reset_callback: called when some bytes reset from copy_bitmap
-     * (see @skip_unallocated above). The callee is assumed to recalculate how
-     * many bytes remain based on the dirty bit count of copy_bitmap.
-     */
-    ProgressResetCallbackFunc progress_reset_callback;
-    void *progress_opaque;
-
-    SharedResource *mem;
-} BlockCopyState;
+typedef struct BlockCopyState BlockCopyState;
 
 BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
                                      int64_t cluster_size,
@@ -93,4 +41,7 @@ int64_t block_copy_reset_unallocated(BlockCopyState *s,
 int coroutine_fn block_copy(BlockCopyState *s, int64_t start, uint64_t bytes,
                             bool *error_is_read);
 
+BdrvDirtyBitmap *block_copy_dirty_bitmap(BlockCopyState *s);
+void block_copy_set_skip_unallocated(BlockCopyState *s, bool skip);
+
 #endif /* BLOCK_COPY_H */
diff --git a/block/backup-top.c b/block/backup-top.c
index 7cdb1f8eba..1026628b57 100644
--- a/block/backup-top.c
+++ b/block/backup-top.c
@@ -38,6 +38,7 @@ typedef struct BDRVBackupTopState {
     BlockCopyState *bcs;
     BdrvChild *target;
     bool active;
+    int64_t cluster_size;
 } BDRVBackupTopState;
 
 static coroutine_fn int backup_top_co_preadv(
@@ -51,8 +52,8 @@ static coroutine_fn int backup_top_cbw(BlockDriverState *bs, uint64_t offset,
                                        uint64_t bytes)
 {
     BDRVBackupTopState *s = bs->opaque;
-    uint64_t end = QEMU_ALIGN_UP(offset + bytes, s->bcs->cluster_size);
-    uint64_t off = QEMU_ALIGN_DOWN(offset, s->bcs->cluster_size);
+    uint64_t end = QEMU_ALIGN_UP(offset + bytes, s->cluster_size);
+    uint64_t off = QEMU_ALIGN_DOWN(offset, s->cluster_size);
 
     return block_copy(s->bcs, off, end - off, NULL);
 }
@@ -227,6 +228,7 @@ BlockDriverState *bdrv_backup_top_append(BlockDriverState *source,
         goto failed_after_append;
     }
 
+    state->cluster_size = cluster_size;
     state->bcs = block_copy_state_new(top->backing, state->target,
                                       cluster_size, write_flags, &local_err);
     if (local_err) {
diff --git a/block/backup.c b/block/backup.c
index cf62b1a38c..acab0d08da 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -48,6 +48,7 @@ typedef struct BackupBlockJob {
     int64_t cluster_size;
 
     BlockCopyState *bcs;
+    BdrvDirtyBitmap *bcs_bitmap;
 } BackupBlockJob;
 
 static const BlockJobDriver backup_job_driver;
@@ -63,7 +64,7 @@ static void backup_progress_bytes_callback(int64_t bytes, void *opaque)
 static void backup_progress_reset_callback(void *opaque)
 {
     BackupBlockJob *s = opaque;
-    uint64_t estimate = bdrv_get_dirty_count(s->bcs->copy_bitmap);
+    uint64_t estimate = bdrv_get_dirty_count(s->bcs_bitmap);
 
     job_progress_set_remaining(&s->common.job, estimate);
 }
@@ -111,8 +112,7 @@ static void backup_cleanup_sync_bitmap(BackupBlockJob *job, int ret)
 
     if (ret < 0 && job->bitmap_mode == BITMAP_SYNC_MODE_ALWAYS) {
         /* If we failed and synced, merge in the bits we didn't copy: */
-        bdrv_dirty_bitmap_merge_internal(bm, job->bcs->copy_bitmap,
-                                         NULL, true);
+        bdrv_dirty_bitmap_merge_internal(bm, job->bcs_bitmap, NULL, true);
     }
 }
 
@@ -151,7 +151,7 @@ void backup_do_checkpoint(BlockJob *job, Error **errp)
         return;
     }
 
-    bdrv_set_dirty_bitmap(backup_job->bcs->copy_bitmap, 0, backup_job->len);
+    bdrv_set_dirty_bitmap(backup_job->bcs_bitmap, 0, backup_job->len);
 }
 
 static BlockErrorAction backup_error_action(BackupBlockJob *job,
@@ -196,7 +196,7 @@ static int coroutine_fn backup_loop(BackupBlockJob *job)
     BdrvDirtyBitmapIter *bdbi;
     int ret = 0;
 
-    bdbi = bdrv_dirty_iter_new(job->bcs->copy_bitmap);
+    bdbi = bdrv_dirty_iter_new(job->bcs_bitmap);
     while ((offset = bdrv_dirty_iter_next(bdbi)) != -1) {
         do {
             if (yield_and_check(job)) {
@@ -216,13 +216,13 @@ static int coroutine_fn backup_loop(BackupBlockJob *job)
     return ret;
 }
 
-static void backup_init_copy_bitmap(BackupBlockJob *job)
+static void backup_init_bcs_bitmap(BackupBlockJob *job)
 {
     bool ret;
     uint64_t estimate;
 
     if (job->sync_mode == MIRROR_SYNC_MODE_BITMAP) {
-        ret = bdrv_dirty_bitmap_merge_internal(job->bcs->copy_bitmap,
+        ret = bdrv_dirty_bitmap_merge_internal(job->bcs_bitmap,
                                                job->sync_bitmap,
                                                NULL, true);
         assert(ret);
@@ -232,12 +232,12 @@ static void backup_init_copy_bitmap(BackupBlockJob *job)
              * We can't hog the coroutine to initialize this thoroughly.
              * Set a flag and resume work when we are able to yield safely.
              */
-            job->bcs->skip_unallocated = true;
+            block_copy_set_skip_unallocated(job->bcs, true);
         }
-        bdrv_set_dirty_bitmap(job->bcs->copy_bitmap, 0, job->len);
+        bdrv_set_dirty_bitmap(job->bcs_bitmap, 0, job->len);
     }
 
-    estimate = bdrv_get_dirty_count(job->bcs->copy_bitmap);
+    estimate = bdrv_get_dirty_count(job->bcs_bitmap);
     job_progress_set_remaining(&job->common.job, estimate);
 }
 
@@ -246,7 +246,7 @@ static int coroutine_fn backup_run(Job *job, Error **errp)
     BackupBlockJob *s = container_of(job, BackupBlockJob, common.job);
     int ret = 0;
 
-    backup_init_copy_bitmap(s);
+    backup_init_bcs_bitmap(s);
 
     if (s->sync_mode == MIRROR_SYNC_MODE_TOP) {
         int64_t offset = 0;
@@ -265,12 +265,12 @@ static int coroutine_fn backup_run(Job *job, Error **errp)
 
             offset += count;
         }
-        s->bcs->skip_unallocated = false;
+        block_copy_set_skip_unallocated(s->bcs, false);
     }
 
     if (s->sync_mode == MIRROR_SYNC_MODE_NONE) {
         /*
-         * All bits are set in copy_bitmap to allow any cluster to be copied.
+         * All bits are set in bcs_bitmap to allow any cluster to be copied.
          * This does not actually require them to be copied.
          */
         while (!job_is_cancelled(job)) {
@@ -458,6 +458,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
     job->sync_bitmap = sync_bitmap;
     job->bitmap_mode = bitmap_mode;
     job->bcs = bcs;
+    job->bcs_bitmap = block_copy_dirty_bitmap(bcs);
     job->cluster_size = cluster_size;
     job->len = len;
 
diff --git a/block/block-copy.c b/block/block-copy.c
index aca44b13fb..7e14e86a2d 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -24,6 +24,60 @@
 #define BLOCK_COPY_MAX_BUFFER (1 * MiB)
 #define BLOCK_COPY_MAX_MEM (128 * MiB)
 
+typedef struct BlockCopyInFlightReq {
+    int64_t offset;
+    int64_t bytes;
+    QLIST_ENTRY(BlockCopyInFlightReq) list;
+    CoQueue wait_queue; /* coroutines blocked on this request */
+} BlockCopyInFlightReq;
+
+typedef struct BlockCopyState {
+    /*
+     * BdrvChild objects are not owned or managed by block-copy. They are
+     * provided by block-copy user and user is responsible for appropriate
+     * permissions on these children.
+     */
+    BdrvChild *source;
+    BdrvChild *target;
+    BdrvDirtyBitmap *copy_bitmap;
+    int64_t cluster_size;
+    bool use_copy_range;
+    int64_t copy_size;
+    uint64_t len;
+    QLIST_HEAD(, BlockCopyInFlightReq) inflight_reqs;
+
+    BdrvRequestFlags write_flags;
+
+    /*
+     * skip_unallocated:
+     *
+     * Used by sync=top jobs, which first scan the source node for unallocated
+     * areas and clear them in the copy_bitmap.  During this process, the bitmap
+     * is thus not fully initialized: It may still have bits set for areas that
+     * are unallocated and should actually not be copied.
+     *
+     * This is indicated by skip_unallocated.
+     *
+     * In this case, block_copy() will query the source’s allocation status,
+     * skip unallocated regions, clear them in the copy_bitmap, and invoke
+     * block_copy_reset_unallocated() every time it does.
+     */
+    bool skip_unallocated;
+
+    /* progress_bytes_callback: called when some copying progress is done. */
+    ProgressBytesCallbackFunc progress_bytes_callback;
+
+    /*
+     * progress_reset_callback: called when some bytes reset from copy_bitmap
+     * (see @skip_unallocated above). The callee is assumed to recalculate how
+     * many bytes remain based on the dirty bit count of copy_bitmap.
+     */
+    ProgressResetCallbackFunc progress_reset_callback;
+    void *progress_opaque;
+
+    SharedResource *mem;
+} BlockCopyState;
+
 static BlockCopyInFlightReq *block_copy_find_inflight_req(BlockCopyState *s,
                                                           int64_t offset,
                                                           int64_t bytes)
@@ -492,3 +546,13 @@ int coroutine_fn block_copy(BlockCopyState *s, int64_t start, uint64_t bytes,
 
     return 0;
 }
+
+BdrvDirtyBitmap *block_copy_dirty_bitmap(BlockCopyState *s)
+{
+    return s->copy_bitmap;
+}
+
+void block_copy_set_skip_unallocated(BlockCopyState *s, bool skip)
+{
+    s->skip_unallocated = skip;
+}
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 for-5.0 0/7] block-copy improvements: part I
  2019-11-27 18:08 [PATCH v2 for-5.0 0/7] block-copy improvements: part I Vladimir Sementsov-Ogievskiy
                   ` (6 preceding siblings ...)
  2019-11-27 18:08 ` [PATCH v2 7/7] block/block-copy: hide structure definitions Vladimir Sementsov-Ogievskiy
@ 2019-12-19  9:01 ` Vladimir Sementsov-Ogievskiy
  2020-01-20  9:09   ` Vladimir Sementsov-Ogievskiy
  7 siblings, 1 reply; 40+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-12-19  9:01 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, Denis Lunev, jsnow, qemu-devel, mreitz

ping. Series applies on current master

27.11.2019 21:08, Vladimir Sementsov-Ogievskiy wrote:
> Hi all!
> 
> This is a first part of my
>    [RFC 00/24] backup performance: block_status + async
> 
> Patches are mostly separate by their idea, but sending them all in
> separate is inefficient.
> 
> Vladimir Sementsov-Ogievskiy (7):
>    block/block-copy: specialcase first copy_range request
>    block/block-copy: use block_status
>    block/block-copy: factor out block_copy_find_inflight_req
>    block/block-copy: refactor interfaces to use bytes instead of end
>    block/block-copy: rename start to offset in interfaces
>    block/block-copy: reduce intersecting request lock
>    block/block-copy: hide structure definitions
> 
>   include/block/block-copy.h |  57 +-----
>   block/backup-top.c         |   6 +-
>   block/backup.c             |  27 +--
>   block/block-copy.c         | 379 ++++++++++++++++++++++++++++---------
>   block/trace-events         |   1 +
>   5 files changed, 316 insertions(+), 154 deletions(-)
> 


-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 for-5.0 0/7] block-copy improvements: part I
  2019-12-19  9:01 ` [PATCH v2 for-5.0 0/7] block-copy improvements: part I Vladimir Sementsov-Ogievskiy
@ 2020-01-20  9:09   ` Vladimir Sementsov-Ogievskiy
  2020-02-07 18:05     ` Max Reitz
  0 siblings, 1 reply; 40+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-01-20  9:09 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, Denis Lunev, jsnow, qemu-devel, mreitz

ping

19.12.2019 12:01, Vladimir Sementsov-Ogievskiy wrote:
> ping. Series applies to current master
> 
> 27.11.2019 21:08, Vladimir Sementsov-Ogievskiy wrote:
>> Hi all!
>>
>> This is a first part of my
>>    [RFC 00/24] backup performance: block_status + async
>>
>> Patches are mostly separate by their idea, but sending them all in
>> separate is inefficient.
>>
>> Vladimir Sementsov-Ogievskiy (7):
>>    block/block-copy: specialcase first copy_range request
>>    block/block-copy: use block_status
>>    block/block-copy: factor out block_copy_find_inflight_req
>>    block/block-copy: refactor interfaces to use bytes instead of end
>>    block/block-copy: rename start to offset in interfaces
>>    block/block-copy: reduce intersecting request lock
>>    block/block-copy: hide structure definitions
>>
>>   include/block/block-copy.h |  57 +-----
>>   block/backup-top.c         |   6 +-
>>   block/backup.c             |  27 +--
>>   block/block-copy.c         | 379 ++++++++++++++++++++++++++++---------
>>   block/trace-events         |   1 +
>>   5 files changed, 316 insertions(+), 154 deletions(-)
>>
> 
> 


-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 1/7] block/block-copy: specialcase first copy_range request
  2019-11-27 18:08 ` [PATCH v2 1/7] block/block-copy: specialcase first copy_range request Vladimir Sementsov-Ogievskiy
@ 2020-01-29  7:38   ` Andrey Shinkevich
  2020-02-07 17:28   ` Max Reitz
  1 sibling, 0 replies; 40+ messages in thread
From: Andrey Shinkevich @ 2020-01-29  7:38 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: kwolf, Denis Lunev, jsnow, qemu-devel, mreitz



On 27/11/2019 21:08, Vladimir Sementsov-Ogievskiy wrote:
> In block_copy_do_copy we fallback to read+write if copy_range failed.
> In this case copy_size is larger than defined for buffered IO, and
> there is corresponding commit. Still, backup copies data cluster by
> cluster, and most of requests are limited to one cluster anyway, so the
> only source of this one bad-limited request is copy-before-write
> operation.
> 
> Further patch will move backup to use block_copy directly, than for
> cases where copy_range is not supported, first request will be
> oversized in each backup. It's not good, let's change it now.
> 
> Fix is simple: just limit first copy_range request like buffer-based
> request. If it succeed, set larger copy_range limit.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>   block/block-copy.c | 41 ++++++++++++++++++++++++++++++-----------
>   1 file changed, 30 insertions(+), 11 deletions(-)
> 
> diff --git a/block/block-copy.c b/block/block-copy.c
> index 79798a1567..8602e2cae7 100644
> --- a/block/block-copy.c
> +++ b/block/block-copy.c
> @@ -70,16 +70,19 @@ void block_copy_state_free(BlockCopyState *s)
>       g_free(s);
>   }
>   
> +static uint32_t block_copy_max_transfer(BdrvChild *source, BdrvChild *target)
> +{
> +    return MIN_NON_ZERO(INT_MAX,
> +                        MIN_NON_ZERO(source->bs->bl.max_transfer,
> +                                     target->bs->bl.max_transfer));
> +}
> +
>   BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
>                                        int64_t cluster_size,
>                                        BdrvRequestFlags write_flags, Error **errp)
>   {
>       BlockCopyState *s;
>       BdrvDirtyBitmap *copy_bitmap;
> -    uint32_t max_transfer =
> -            MIN_NON_ZERO(INT_MAX,
> -                         MIN_NON_ZERO(source->bs->bl.max_transfer,
> -                                      target->bs->bl.max_transfer));
>   
>       copy_bitmap = bdrv_create_dirty_bitmap(source->bs, cluster_size, NULL,
>                                              errp);
> @@ -99,7 +102,7 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
>           .mem = shres_create(BLOCK_COPY_MAX_MEM),
>       };
>   
> -    if (max_transfer < cluster_size) {
> +    if (block_copy_max_transfer(source, target) < cluster_size) {
>           /*
>            * copy_range does not respect max_transfer. We don't want to bother
>            * with requests smaller than block-copy cluster size, so fallback to
> @@ -114,12 +117,11 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
>           s->copy_size = cluster_size;
>       } else {
>           /*
> -         * copy_range does not respect max_transfer (it's a TODO), so we factor
> -         * that in here.
> +         * We enable copy-range, but keep small copy_size, until first
> +         * successful copy_range (look at block_copy_do_copy).
>            */
>           s->use_copy_range = true;
> -        s->copy_size = MIN(MAX(cluster_size, BLOCK_COPY_MAX_COPY_RANGE),
> -                           QEMU_ALIGN_DOWN(max_transfer, cluster_size));
> +        s->copy_size = MAX(s->cluster_size, BLOCK_COPY_MAX_BUFFER);
>       }
>   
>       QLIST_INIT(&s->inflight_reqs);
> @@ -168,7 +170,21 @@ static int coroutine_fn block_copy_do_copy(BlockCopyState *s,
>               s->use_copy_range = false;
>               s->copy_size = MAX(s->cluster_size, BLOCK_COPY_MAX_BUFFER);
>               /* Fallback to read+write with allocated buffer */
> -        } else {
> +        } else if (s->use_copy_range) {
> +            /*
> +             * Successful copy-range. Now increase copy_size.
> +             * copy_range does not respect max_transfer (it's a TODO), so we
> +             * factor that in here.
> +             *
> +             * Note: we double-check s->use_copy_range for the case when
> +             * parallel block-copy request unset it during previous
unsets

> +             * bdrv_co_copy_range call.
> +             */
> +            s->copy_size =
> +                    MIN(MAX(s->cluster_size, BLOCK_COPY_MAX_COPY_RANGE),
> +                        QEMU_ALIGN_DOWN(block_copy_max_transfer(s->source,
> +                                                                s->target),
> +                                        s->cluster_size));
>               goto out;
>           }
>       }
> @@ -176,7 +192,10 @@ static int coroutine_fn block_copy_do_copy(BlockCopyState *s,
>       /*
>        * In case of failed copy_range request above, we may proceed with buffered
>        * request larger than BLOCK_COPY_MAX_BUFFER. Still, further requests will
> -     * be properly limited, so don't care too much.
> +     * be properly limited, so don't care too much. Moreover the most possible
> +     * case (copy_range is unsupported for the configuration, so the very first
> +     * copy_range request fails) is handled by setting large copy_size only
> +     * after first successful copy_range.
>        */
>   
>       bounce_buffer = qemu_blockalign(s->source->bs, nbytes);
> 

It would be good to make it clear where the large size of the first request
can come from and where its failure case is handled.
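For example, something along these lines (just a sketch of a possible comment,
wording to taste):

    /*
     * Note: copy_size grows above BLOCK_COPY_MAX_BUFFER only after the first
     * successful copy_range (block_copy_state_new() starts with
     * MAX(cluster_size, BLOCK_COPY_MAX_BUFFER)), so an oversized buffered
     * fallback can only happen after copy_range has already succeeded once.
     */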

Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
-- 
With the best regards,
Andrey Shinkevich

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 2/7] block/block-copy: use block_status
  2019-11-27 18:08 ` [PATCH v2 2/7] block/block-copy: use block_status Vladimir Sementsov-Ogievskiy
@ 2020-01-29  9:12   ` Andrey Shinkevich
  2020-02-07 17:46   ` Max Reitz
  1 sibling, 0 replies; 40+ messages in thread
From: Andrey Shinkevich @ 2020-01-29  9:12 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: kwolf, Denis Lunev, jsnow, qemu-devel, mreitz



On 27/11/2019 21:08, Vladimir Sementsov-Ogievskiy wrote:
> Use bdrv_block_status_above to chose effective chunk size and to handle
> zeroes effectively.
> 
> This substitutes checking for just being allocated or not, and drops
> old code path for it. Assistance by backup job is dropped too, as
> caching block-status information is more difficult than just caching
> is-allocated information in our dirty bitmap, and backup job is not
> good place for this caching anyway.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>   block/block-copy.c | 67 +++++++++++++++++++++++++++++++++++++---------
>   block/trace-events |  1 +
>   2 files changed, 55 insertions(+), 13 deletions(-)
> 
> diff --git a/block/block-copy.c b/block/block-copy.c
> index 8602e2cae7..74295d93d5 100644
> --- a/block/block-copy.c
> +++ b/block/block-copy.c
> @@ -152,7 +152,7 @@ void block_copy_set_callbacks(
>    */
>   static int coroutine_fn block_copy_do_copy(BlockCopyState *s,
>                                              int64_t start, int64_t end,
> -                                           bool *error_is_read)
> +                                           bool zeroes, bool *error_is_read)
>   {
>       int ret;
>       int nbytes = MIN(end, s->len) - start;
> @@ -162,6 +162,18 @@ static int coroutine_fn block_copy_do_copy(BlockCopyState *s,
>       assert(QEMU_IS_ALIGNED(end, s->cluster_size));
>       assert(end < s->len || end == QEMU_ALIGN_UP(s->len, s->cluster_size));
>   
> +    if (zeroes) {
> +        ret = bdrv_co_pwrite_zeroes(s->target, start, nbytes, s->write_flags &
> +                                    ~BDRV_REQ_WRITE_COMPRESSED);
> +        if (ret < 0) {
> +            trace_block_copy_write_zeroes_fail(s, start, ret);
> +            if (error_is_read) {
> +                *error_is_read = false;
> +            }
> +        }
> +        return ret;
> +    }
> +
>       if (s->use_copy_range) {
>           ret = bdrv_co_copy_range(s->source, start, s->target, start, nbytes,
>                                    0, s->write_flags);
> @@ -225,6 +237,34 @@ out:
>       return ret;
>   }
>   
> +static int block_copy_block_status(BlockCopyState *s, int64_t offset,
> +                                   int64_t bytes, int64_t *pnum)
> +{
> +    int64_t num;
> +    BlockDriverState *base;
> +    int ret;
> +
> +    if (s->skip_unallocated && s->source->bs->backing) {
> +        base = s->source->bs->backing->bs;
> +    } else {
> +        base = NULL;
> +    }
> +
> +    ret = bdrv_block_status_above(s->source->bs, base, offset, bytes, &num,
> +                                  NULL, NULL);
> +    if (ret < 0 || num < s->cluster_size) {
> +        num = s->cluster_size;

/* Let the cluster be copied in case of error too */
> +        ret = BDRV_BLOCK_ALLOCATED | BDRV_BLOCK_DATA;
> +    } else if (offset + num == s->len) {
> +        num = QEMU_ALIGN_UP(num, s->cluster_size);
> +    } else {
> +        num = QEMU_ALIGN_DOWN(num, s->cluster_size);
> +    }
> +
> +    *pnum = num;
> +    return ret;
> +}
> +
>   /*
>    * Check if the cluster starting at offset is allocated or not.
>    * return via pnum the number of contiguous clusters sharing this allocation.
> @@ -301,7 +341,6 @@ int coroutine_fn block_copy(BlockCopyState *s,
>   {
>       int ret = 0;
>       int64_t end = bytes + start; /* bytes */
> -    int64_t status_bytes;
>       BlockCopyInFlightReq req;
>   
>       /*
> @@ -318,7 +357,7 @@ int coroutine_fn block_copy(BlockCopyState *s,
>       block_copy_inflight_req_begin(s, &req, start, end);
>   
>       while (start < end) {
> -        int64_t next_zero, chunk_end;
> +        int64_t next_zero, chunk_end, status_bytes;
>   
>           if (!bdrv_dirty_bitmap_get(s->copy_bitmap, start)) {
>               trace_block_copy_skip(s, start);
> @@ -336,23 +375,25 @@ int coroutine_fn block_copy(BlockCopyState *s,
>               chunk_end = next_zero;
>           }
>   
> -        if (s->skip_unallocated) {
> -            ret = block_copy_reset_unallocated(s, start, &status_bytes);
> -            if (ret == 0) {
> -                trace_block_copy_skip_range(s, start, status_bytes);
> -                start += status_bytes;
> -                continue;
> -            }
> -            /* Clamp to known allocated region */
> -            chunk_end = MIN(chunk_end, start + status_bytes);
> +        ret = block_copy_block_status(s, start, chunk_end - start,
> +                                      &status_bytes);
> +        if (s->skip_unallocated && !(ret & BDRV_BLOCK_ALLOCATED)) {
> +            bdrv_reset_dirty_bitmap(s->copy_bitmap, start, status_bytes);
> +            s->progress_reset_callback(s->progress_opaque);
> +            trace_block_copy_skip_range(s, start, status_bytes);
> +            start += status_bytes;
> +            continue;
>           }
>   
> +        chunk_end = MIN(chunk_end, start + status_bytes);
> +
>           trace_block_copy_process(s, start);
>   
>           bdrv_reset_dirty_bitmap(s->copy_bitmap, start, chunk_end - start);
>   
>           co_get_from_shres(s->mem, chunk_end - start);
> -        ret = block_copy_do_copy(s, start, chunk_end, error_is_read);
> +        ret = block_copy_do_copy(s, start, chunk_end, ret & BDRV_BLOCK_ZERO,
> +                                 error_is_read);
>           co_put_to_shres(s->mem, chunk_end - start);
>           if (ret < 0) {
>               bdrv_set_dirty_bitmap(s->copy_bitmap, start, chunk_end - start);
> diff --git a/block/trace-events b/block/trace-events
> index 6ba86decca..346537a1d2 100644
> --- a/block/trace-events
> +++ b/block/trace-events
> @@ -48,6 +48,7 @@ block_copy_process(void *bcs, int64_t start) "bcs %p start %"PRId64
>   block_copy_copy_range_fail(void *bcs, int64_t start, int ret) "bcs %p start %"PRId64" ret %d"
>   block_copy_read_fail(void *bcs, int64_t start, int ret) "bcs %p start %"PRId64" ret %d"
>   block_copy_write_fail(void *bcs, int64_t start, int ret) "bcs %p start %"PRId64" ret %d"
> +block_copy_write_zeroes_fail(void *bcs, int64_t start, int ret) "bcs %p start %"PRId64" ret %d"
>   
>   # ../blockdev.c
>   qmp_block_job_cancel(void *job) "job %p"
> 

Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
-- 
With the best regards,
Andrey Shinkevich

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 3/7] block/block-copy: factor out block_copy_find_inflight_req
  2019-11-27 18:08 ` [PATCH v2 3/7] block/block-copy: factor out block_copy_find_inflight_req Vladimir Sementsov-Ogievskiy
@ 2020-01-29  9:25   ` Andrey Shinkevich
  2020-02-07 17:50   ` Max Reitz
  1 sibling, 0 replies; 40+ messages in thread
From: Andrey Shinkevich @ 2020-01-29  9:25 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: kwolf, Denis Lunev, jsnow, qemu-devel, mreitz



On 27/11/2019 21:08, Vladimir Sementsov-Ogievskiy wrote:
> Split block_copy_find_inflight_req to be used in seprate.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>   block/block-copy.c | 31 +++++++++++++++++++------------
>   1 file changed, 19 insertions(+), 12 deletions(-)
> 
> diff --git a/block/block-copy.c b/block/block-copy.c
> index 74295d93d5..94e7e855ef 100644
> --- a/block/block-copy.c
> +++ b/block/block-copy.c
> @@ -24,23 +24,30 @@
>   #define BLOCK_COPY_MAX_BUFFER (1 * MiB)
>   #define BLOCK_COPY_MAX_MEM (128 * MiB)
>   
> +static BlockCopyInFlightReq *block_copy_find_inflight_req(BlockCopyState *s,
> +                                                          int64_t start,
> +                                                          int64_t end)
> +{
> +    BlockCopyInFlightReq *req;
> +
> +    QLIST_FOREACH(req, &s->inflight_reqs, list) {
> +        if (end > req->start_byte && start < req->end_byte) {
> +            return req;
> +        }
> +    }
> +
> +    return NULL;
> +}
> +
>   static void coroutine_fn block_copy_wait_inflight_reqs(BlockCopyState *s,
>                                                          int64_t start,
>                                                          int64_t end)
>   {
>       BlockCopyInFlightReq *req;
> -    bool waited;
> -
> -    do {
> -        waited = false;
> -        QLIST_FOREACH(req, &s->inflight_reqs, list) {
> -            if (end > req->start_byte && start < req->end_byte) {
> -                qemu_co_queue_wait(&req->wait_queue, NULL);
> -                waited = true;
> -                break;
> -            }
> -        }
> -    } while (waited);
> +
> +    while ((req = block_copy_find_inflight_req(s, start, end))) {
> +        qemu_co_queue_wait(&req->wait_queue, NULL);
> +    }
>   }
>   
>   static void block_copy_inflight_req_begin(BlockCopyState *s,
> 

Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
-- 
With the best regards,
Andrey Shinkevich

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 4/7] block/block-copy: refactor interfaces to use bytes instead of end
  2019-11-27 18:08 ` [PATCH v2 4/7] block/block-copy: refactor interfaces to use bytes instead of end Vladimir Sementsov-Ogievskiy
@ 2020-01-29 17:12   ` Andrey Shinkevich
  2020-02-05 11:36     ` Vladimir Sementsov-Ogievskiy
  2020-02-07 18:01   ` Max Reitz
  1 sibling, 1 reply; 40+ messages in thread
From: Andrey Shinkevich @ 2020-01-29 17:12 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: kwolf, Denis Lunev, jsnow, qemu-devel, mreitz



On 27/11/2019 21:08, Vladimir Sementsov-Ogievskiy wrote:
> We have a lot of "chunk_end - start" invocations, let's switch to
> bytes/cur_bytes scheme instead.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>   include/block/block-copy.h |  4 +--
>   block/block-copy.c         | 68 ++++++++++++++++++++------------------
>   2 files changed, 37 insertions(+), 35 deletions(-)
> 
> diff --git a/include/block/block-copy.h b/include/block/block-copy.h
> index 0a161724d7..7321b3d305 100644
> --- a/include/block/block-copy.h
> +++ b/include/block/block-copy.h
> @@ -19,8 +19,8 @@
>   #include "qemu/co-shared-resource.h"
>   
>   typedef struct BlockCopyInFlightReq {
> -    int64_t start_byte;
> -    int64_t end_byte;
> +    int64_t start;
> +    int64_t bytes;
>       QLIST_ENTRY(BlockCopyInFlightReq) list;
>       CoQueue wait_queue; /* coroutines blocked on this request */
>   } BlockCopyInFlightReq;
> diff --git a/block/block-copy.c b/block/block-copy.c
> index 94e7e855ef..cc273b6cb8 100644
> --- a/block/block-copy.c
> +++ b/block/block-copy.c
> @@ -26,12 +26,12 @@
>   
>   static BlockCopyInFlightReq *block_copy_find_inflight_req(BlockCopyState *s,
>                                                             int64_t start,
> -                                                          int64_t end)
> +                                                          int64_t bytes)
>   {
>       BlockCopyInFlightReq *req;
>   
>       QLIST_FOREACH(req, &s->inflight_reqs, list) {
> -        if (end > req->start_byte && start < req->end_byte) {
> +        if (start + bytes > req->start && start < req->start + req->bytes) {
>               return req;
>           }
>       }
> @@ -41,21 +41,21 @@ static BlockCopyInFlightReq *block_copy_find_inflight_req(BlockCopyState *s,
>   
>   static void coroutine_fn block_copy_wait_inflight_reqs(BlockCopyState *s,
>                                                          int64_t start,
> -                                                       int64_t end)
> +                                                       int64_t bytes)
>   {
>       BlockCopyInFlightReq *req;
>   
> -    while ((req = block_copy_find_inflight_req(s, start, end))) {
> +    while ((req = block_copy_find_inflight_req(s, start, bytes))) {
>           qemu_co_queue_wait(&req->wait_queue, NULL);
>       }
>   }
>   
>   static void block_copy_inflight_req_begin(BlockCopyState *s,
>                                             BlockCopyInFlightReq *req,
> -                                          int64_t start, int64_t end)
> +                                          int64_t start, int64_t bytes)
>   {
> -    req->start_byte = start;
> -    req->end_byte = end;
> +    req->start = start;
> +    req->bytes = bytes;
>       qemu_co_queue_init(&req->wait_queue);
>       QLIST_INSERT_HEAD(&s->inflight_reqs, req, list);
>   }
> @@ -150,24 +150,26 @@ void block_copy_set_callbacks(
>   /*
>    * block_copy_do_copy
>    *
> - * Do copy of cluser-aligned chunk. @end is allowed to exceed s->len only to
> - * cover last cluster when s->len is not aligned to clusters.
> + * Do copy of cluser-aligned chunk. Requested region is allowed to exceed s->len

cluster-...

> + * only to cover last cluster when s->len is not aligned to clusters.
>    *
>    * No sync here: nor bitmap neighter intersecting requests handling, only copy.
>    *
>    * Returns 0 on success.
>    */
>   static int coroutine_fn block_copy_do_copy(BlockCopyState *s,
> -                                           int64_t start, int64_t end,
> +                                           int64_t start, int64_t bytes,
>                                              bool zeroes, bool *error_is_read)
>   {
>       int ret;
> -    int nbytes = MIN(end, s->len) - start;
> +    int nbytes = MIN(start + bytes, s->len) - start;
>       void *bounce_buffer = NULL;
>   
> +    assert(start >= 0 && bytes > 0 && INT64_MAX - start >= bytes);
>       assert(QEMU_IS_ALIGNED(start, s->cluster_size));
> -    assert(QEMU_IS_ALIGNED(end, s->cluster_size));
> -    assert(end < s->len || end == QEMU_ALIGN_UP(s->len, s->cluster_size));
> +    assert(QEMU_IS_ALIGNED(bytes, s->cluster_size));
> +    assert(start + bytes <= s->len ||

The '<=' looks correct, but I wonder how the plain '<' worked before without
triggering the assertion. Was s->len never aligned to the cluster size?

> +           start + bytes == QEMU_ALIGN_UP(s->len, s->cluster_size));
>   
>       if (zeroes) {
>           ret = bdrv_co_pwrite_zeroes(s->target, start, nbytes, s->write_flags &
> @@ -347,7 +349,6 @@ int coroutine_fn block_copy(BlockCopyState *s,
>                               bool *error_is_read)
>   {
>       int ret = 0;
> -    int64_t end = bytes + start; /* bytes */
>       BlockCopyInFlightReq req;
>   
>       /*
> @@ -358,58 +359,59 @@ int coroutine_fn block_copy(BlockCopyState *s,
>              bdrv_get_aio_context(s->target->bs));
>   
>       assert(QEMU_IS_ALIGNED(start, s->cluster_size));
> -    assert(QEMU_IS_ALIGNED(end, s->cluster_size));
> +    assert(QEMU_IS_ALIGNED(bytes, s->cluster_size));
>   
>       block_copy_wait_inflight_reqs(s, start, bytes);
> -    block_copy_inflight_req_begin(s, &req, start, end);
> +    block_copy_inflight_req_begin(s, &req, start, bytes);
>   
> -    while (start < end) {
> -        int64_t next_zero, chunk_end, status_bytes;
> +    while (bytes) {
> +        int64_t next_zero, cur_bytes, status_bytes;
>   
>           if (!bdrv_dirty_bitmap_get(s->copy_bitmap, start)) {
>               trace_block_copy_skip(s, start);
>               start += s->cluster_size;
> +            bytes -= s->cluster_size;
>               continue; /* already copied */
>           }
>   
> -        chunk_end = MIN(end, start + s->copy_size);
> +        cur_bytes = MIN(bytes, s->copy_size);
>   
>           next_zero = bdrv_dirty_bitmap_next_zero(s->copy_bitmap, start,
> -                                                chunk_end - start);
> +                                                cur_bytes);
>           if (next_zero >= 0) {
>               assert(next_zero > start); /* start is dirty */
> -            assert(next_zero < chunk_end); /* no need to do MIN() */
> -            chunk_end = next_zero;
> +            assert(next_zero < start + cur_bytes); /* no need to do MIN() */
> +            cur_bytes = next_zero - start;
>           }
>   
> -        ret = block_copy_block_status(s, start, chunk_end - start,
> -                                      &status_bytes);
> +        ret = block_copy_block_status(s, start, cur_bytes, &status_bytes);
>           if (s->skip_unallocated && !(ret & BDRV_BLOCK_ALLOCATED)) {
>               bdrv_reset_dirty_bitmap(s->copy_bitmap, start, status_bytes);
>               s->progress_reset_callback(s->progress_opaque);
>               trace_block_copy_skip_range(s, start, status_bytes);
>               start += status_bytes;
> +            bytes -= status_bytes;
>               continue;
>           }
>   
> -        chunk_end = MIN(chunk_end, start + status_bytes);
> +        cur_bytes = MIN(cur_bytes, status_bytes);
>   
>           trace_block_copy_process(s, start);
>   
> -        bdrv_reset_dirty_bitmap(s->copy_bitmap, start, chunk_end - start);
> +        bdrv_reset_dirty_bitmap(s->copy_bitmap, start, cur_bytes);
>   
> -        co_get_from_shres(s->mem, chunk_end - start);
> -        ret = block_copy_do_copy(s, start, chunk_end, ret & BDRV_BLOCK_ZERO,
> +        co_get_from_shres(s->mem, cur_bytes);
> +        ret = block_copy_do_copy(s, start, cur_bytes, ret & BDRV_BLOCK_ZERO,
>                                    error_is_read);
> -        co_put_to_shres(s->mem, chunk_end - start);
> +        co_put_to_shres(s->mem, cur_bytes);
>           if (ret < 0) {
> -            bdrv_set_dirty_bitmap(s->copy_bitmap, start, chunk_end - start);
> +            bdrv_set_dirty_bitmap(s->copy_bitmap, start, cur_bytes);
>               break;
>           }
>   
> -        s->progress_bytes_callback(chunk_end - start, s->progress_opaque);
> -        start = chunk_end;
> -        ret = 0;
> +        s->progress_bytes_callback(cur_bytes, s->progress_opaque);
> +        start += cur_bytes;
> +        bytes -= cur_bytes;
>       }
>   
>       block_copy_inflight_req_end(&req);
> 

At first glance, the only benefit of that substitution in the 'while' loop is
that it reduces the number of addition operations.

Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
-- 
With the best regards,
Andrey Shinkevich

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 5/7] block/block-copy: rename start to offset in interfaces
  2019-11-27 18:08 ` [PATCH v2 5/7] block/block-copy: rename start to offset in interfaces Vladimir Sementsov-Ogievskiy
@ 2020-01-29 17:37   ` Andrey Shinkevich
  2020-02-07 18:04   ` Max Reitz
  1 sibling, 0 replies; 40+ messages in thread
From: Andrey Shinkevich @ 2020-01-29 17:37 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: kwolf, Denis Lunev, jsnow, qemu-devel, mreitz



On 27/11/2019 21:08, Vladimir Sementsov-Ogievskiy wrote:
> offset/bytes pair is more usual naming in block layer, let's use it.

more common ... than start/end (or start/bytes) ...

> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>   include/block/block-copy.h |  2 +-
>   block/block-copy.c         | 80 +++++++++++++++++++-------------------
>   2 files changed, 41 insertions(+), 41 deletions(-)
> 
> diff --git a/include/block/block-copy.h b/include/block/block-copy.h
> index 7321b3d305..d96b097267 100644
> --- a/include/block/block-copy.h
> +++ b/include/block/block-copy.h
> @@ -19,7 +19,7 @@
>   #include "qemu/co-shared-resource.h"
>   
>   typedef struct BlockCopyInFlightReq {
> -    int64_t start;
> +    int64_t offset;
>       int64_t bytes;
>       QLIST_ENTRY(BlockCopyInFlightReq) list;
>       CoQueue wait_queue; /* coroutines blocked on this request */
> diff --git a/block/block-copy.c b/block/block-copy.c
> index cc273b6cb8..20068cd699 100644
> --- a/block/block-copy.c
> +++ b/block/block-copy.c
> @@ -25,13 +25,13 @@
>   #define BLOCK_COPY_MAX_MEM (128 * MiB)
>   
>   static BlockCopyInFlightReq *block_copy_find_inflight_req(BlockCopyState *s,
> -                                                          int64_t start,
> +                                                          int64_t offset,
>                                                             int64_t bytes)
>   {
>       BlockCopyInFlightReq *req;
>   
>       QLIST_FOREACH(req, &s->inflight_reqs, list) {
> -        if (start + bytes > req->start && start < req->start + req->bytes) {
> +        if (offset + bytes > req->offset && offset < req->offset + req->bytes) {
>               return req;
>           }
>       }
> @@ -40,21 +40,21 @@ static BlockCopyInFlightReq *block_copy_find_inflight_req(BlockCopyState *s,
>   }
>   
>   static void coroutine_fn block_copy_wait_inflight_reqs(BlockCopyState *s,
> -                                                       int64_t start,
> +                                                       int64_t offset,
>                                                          int64_t bytes)
>   {
>       BlockCopyInFlightReq *req;
>   
> -    while ((req = block_copy_find_inflight_req(s, start, bytes))) {
> +    while ((req = block_copy_find_inflight_req(s, offset, bytes))) {
>           qemu_co_queue_wait(&req->wait_queue, NULL);
>       }
>   }
>   
>   static void block_copy_inflight_req_begin(BlockCopyState *s,
>                                             BlockCopyInFlightReq *req,
> -                                          int64_t start, int64_t bytes)
> +                                          int64_t offset, int64_t bytes)
>   {
> -    req->start = start;
> +    req->offset = offset;
>       req->bytes = bytes;
>       qemu_co_queue_init(&req->wait_queue);
>       QLIST_INSERT_HEAD(&s->inflight_reqs, req, list);
> @@ -158,24 +158,24 @@ void block_copy_set_callbacks(
>    * Returns 0 on success.
>    */
>   static int coroutine_fn block_copy_do_copy(BlockCopyState *s,
> -                                           int64_t start, int64_t bytes,
> +                                           int64_t offset, int64_t bytes,
>                                              bool zeroes, bool *error_is_read)
>   {
>       int ret;
> -    int nbytes = MIN(start + bytes, s->len) - start;
> +    int nbytes = MIN(offset + bytes, s->len) - offset;
>       void *bounce_buffer = NULL;
>   
> -    assert(start >= 0 && bytes > 0 && INT64_MAX - start >= bytes);
> -    assert(QEMU_IS_ALIGNED(start, s->cluster_size));
> +    assert(offset >= 0 && bytes > 0 && INT64_MAX - offset >= bytes);
> +    assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
>       assert(QEMU_IS_ALIGNED(bytes, s->cluster_size));
> -    assert(start + bytes <= s->len ||
> -           start + bytes == QEMU_ALIGN_UP(s->len, s->cluster_size));
> +    assert(offset + bytes <= s->len ||
> +           offset + bytes == QEMU_ALIGN_UP(s->len, s->cluster_size));
>   
>       if (zeroes) {
> -        ret = bdrv_co_pwrite_zeroes(s->target, start, nbytes, s->write_flags &
> +        ret = bdrv_co_pwrite_zeroes(s->target, offset, nbytes, s->write_flags &
>                                       ~BDRV_REQ_WRITE_COMPRESSED);
>           if (ret < 0) {
> -            trace_block_copy_write_zeroes_fail(s, start, ret);
> +            trace_block_copy_write_zeroes_fail(s, offset, ret);
>               if (error_is_read) {
>                   *error_is_read = false;
>               }
> @@ -184,10 +184,10 @@ static int coroutine_fn block_copy_do_copy(BlockCopyState *s,
>       }
>   
>       if (s->use_copy_range) {
> -        ret = bdrv_co_copy_range(s->source, start, s->target, start, nbytes,
> +        ret = bdrv_co_copy_range(s->source, offset, s->target, offset, nbytes,
>                                    0, s->write_flags);
>           if (ret < 0) {
> -            trace_block_copy_copy_range_fail(s, start, ret);
> +            trace_block_copy_copy_range_fail(s, offset, ret);
>               s->use_copy_range = false;
>               s->copy_size = MAX(s->cluster_size, BLOCK_COPY_MAX_BUFFER);
>               /* Fallback to read+write with allocated buffer */
> @@ -221,19 +221,19 @@ static int coroutine_fn block_copy_do_copy(BlockCopyState *s,
>   
>       bounce_buffer = qemu_blockalign(s->source->bs, nbytes);
>   
> -    ret = bdrv_co_pread(s->source, start, nbytes, bounce_buffer, 0);
> +    ret = bdrv_co_pread(s->source, offset, nbytes, bounce_buffer, 0);
>       if (ret < 0) {
> -        trace_block_copy_read_fail(s, start, ret);
> +        trace_block_copy_read_fail(s, offset, ret);
>           if (error_is_read) {
>               *error_is_read = true;
>           }
>           goto out;
>       }
>   
> -    ret = bdrv_co_pwrite(s->target, start, nbytes, bounce_buffer,
> +    ret = bdrv_co_pwrite(s->target, offset, nbytes, bounce_buffer,
>                            s->write_flags);
>       if (ret < 0) {
> -        trace_block_copy_write_fail(s, start, ret);
> +        trace_block_copy_write_fail(s, offset, ret);
>           if (error_is_read) {
>               *error_is_read = false;
>           }
> @@ -345,7 +345,7 @@ int64_t block_copy_reset_unallocated(BlockCopyState *s,
>   }
>   
>   int coroutine_fn block_copy(BlockCopyState *s,
> -                            int64_t start, uint64_t bytes,
> +                            int64_t offset, uint64_t bytes,
>                               bool *error_is_read)
>   {
>       int ret = 0;
> @@ -358,59 +358,59 @@ int coroutine_fn block_copy(BlockCopyState *s,
>       assert(bdrv_get_aio_context(s->source->bs) ==
>              bdrv_get_aio_context(s->target->bs));
>   
> -    assert(QEMU_IS_ALIGNED(start, s->cluster_size));
> +    assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
>       assert(QEMU_IS_ALIGNED(bytes, s->cluster_size));
>   
> -    block_copy_wait_inflight_reqs(s, start, bytes);
> -    block_copy_inflight_req_begin(s, &req, start, bytes);
> +    block_copy_wait_inflight_reqs(s, offset, bytes);
> +    block_copy_inflight_req_begin(s, &req, offset, bytes);
>   
>       while (bytes) {
>           int64_t next_zero, cur_bytes, status_bytes;
>   
> -        if (!bdrv_dirty_bitmap_get(s->copy_bitmap, start)) {
> -            trace_block_copy_skip(s, start);
> -            start += s->cluster_size;
> +        if (!bdrv_dirty_bitmap_get(s->copy_bitmap, offset)) {
> +            trace_block_copy_skip(s, offset);
> +            offset += s->cluster_size;
>               bytes -= s->cluster_size;
>               continue; /* already copied */
>           }
>   
>           cur_bytes = MIN(bytes, s->copy_size);
>   
> -        next_zero = bdrv_dirty_bitmap_next_zero(s->copy_bitmap, start,
> +        next_zero = bdrv_dirty_bitmap_next_zero(s->copy_bitmap, offset,
>                                                   cur_bytes);
>           if (next_zero >= 0) {
> -            assert(next_zero > start); /* start is dirty */
> -            assert(next_zero < start + cur_bytes); /* no need to do MIN() */
> -            cur_bytes = next_zero - start;
> +            assert(next_zero > offset); /* offset is dirty */
> +            assert(next_zero < offset + cur_bytes); /* no need to do MIN() */
> +            cur_bytes = next_zero - offset;
>           }
>   
> -        ret = block_copy_block_status(s, start, cur_bytes, &status_bytes);
> +        ret = block_copy_block_status(s, offset, cur_bytes, &status_bytes);
>           if (s->skip_unallocated && !(ret & BDRV_BLOCK_ALLOCATED)) {
> -            bdrv_reset_dirty_bitmap(s->copy_bitmap, start, status_bytes);
> +            bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, status_bytes);
>               s->progress_reset_callback(s->progress_opaque);
> -            trace_block_copy_skip_range(s, start, status_bytes);
> -            start += status_bytes;
> +            trace_block_copy_skip_range(s, offset, status_bytes);
> +            offset += status_bytes;
>               bytes -= status_bytes;
>               continue;
>           }
>   
>           cur_bytes = MIN(cur_bytes, status_bytes);
>   
> -        trace_block_copy_process(s, start);
> +        trace_block_copy_process(s, offset);
>   
> -        bdrv_reset_dirty_bitmap(s->copy_bitmap, start, cur_bytes);
> +        bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, cur_bytes);
>   
>           co_get_from_shres(s->mem, cur_bytes);
> -        ret = block_copy_do_copy(s, start, cur_bytes, ret & BDRV_BLOCK_ZERO,
> +        ret = block_copy_do_copy(s, offset, cur_bytes, ret & BDRV_BLOCK_ZERO,
>                                    error_is_read);
>           co_put_to_shres(s->mem, cur_bytes);
>           if (ret < 0) {
> -            bdrv_set_dirty_bitmap(s->copy_bitmap, start, cur_bytes);
> +            bdrv_set_dirty_bitmap(s->copy_bitmap, offset, cur_bytes);
>               break;
>           }
>   
>           s->progress_bytes_callback(cur_bytes, s->progress_opaque);
> -        start += cur_bytes;
> +        offset += cur_bytes;
>           bytes -= cur_bytes;
>       }
>   
> 

Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
-- 
With the best regards,
Andrey Shinkevich

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 6/7] block/block-copy: reduce intersecting request lock
  2019-11-27 18:08 ` [PATCH v2 6/7] block/block-copy: reduce intersecting request lock Vladimir Sementsov-Ogievskiy
@ 2020-01-29 20:05   ` Andrey Shinkevich
  2020-01-30 13:45     ` Vladimir Sementsov-Ogievskiy
  2020-01-30 15:53   ` Andrey Shinkevich
  2020-02-17 13:38   ` Max Reitz
  2 siblings, 1 reply; 40+ messages in thread
From: Andrey Shinkevich @ 2020-01-29 20:05 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: kwolf, Denis Lunev, jsnow, qemu-devel, mreitz



On 27/11/2019 21:08, Vladimir Sementsov-Ogievskiy wrote:
> Currently, block_copy operation lock the whole requested region. But
> there is no reason to lock clusters, which are already copied, it will
> disturb other parallel block_copy requests for no reason.
> 
> Let's instead do the following:
> 
> Lock only sub-region, which we are going to operate on. Then, after
> copying all dirty sub-regions, we should wait for intersecting
> requests block-copy, if they failed, we should retry these new dirty
> clusters.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>   block/block-copy.c | 116 +++++++++++++++++++++++++++++++++++++--------
>   1 file changed, 95 insertions(+), 21 deletions(-)
> 
> diff --git a/block/block-copy.c b/block/block-copy.c
> index 20068cd699..aca44b13fb 100644
> --- a/block/block-copy.c
> +++ b/block/block-copy.c
> @@ -39,29 +39,62 @@ static BlockCopyInFlightReq *block_copy_find_inflight_req(BlockCopyState *s,
>       return NULL;
>   }
>   
> -static void coroutine_fn block_copy_wait_inflight_reqs(BlockCopyState *s,
> -                                                       int64_t offset,
> -                                                       int64_t bytes)
> +/*
> + * If there are no intersecting requests return false. Otherwise, wait for the
> + * first found intersecting request to finish and return true.
> + */
> +static bool coroutine_fn block_copy_wait_one(BlockCopyState *s, int64_t start,
> +                                             int64_t end)
>   {
> -    BlockCopyInFlightReq *req;
> +    BlockCopyInFlightReq *req = block_copy_find_inflight_req(s, start, end);
>   
> -    while ((req = block_copy_find_inflight_req(s, offset, bytes))) {
> -        qemu_co_queue_wait(&req->wait_queue, NULL);
> +    if (!req) {
> +        return false;
>       }
> +
> +    qemu_co_queue_wait(&req->wait_queue, NULL);
> +
> +    return true;
>   }
>   
> +/* Called only on full-dirty region */
>   static void block_copy_inflight_req_begin(BlockCopyState *s,
>                                             BlockCopyInFlightReq *req,
>                                             int64_t offset, int64_t bytes)
>   {
> +    assert(!block_copy_find_inflight_req(s, offset, bytes));
> +
> +    bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, bytes);
> +
>       req->offset = offset;
>       req->bytes = bytes;
>       qemu_co_queue_init(&req->wait_queue);
>       QLIST_INSERT_HEAD(&s->inflight_reqs, req, list);
>   }
>   
> -static void coroutine_fn block_copy_inflight_req_end(BlockCopyInFlightReq *req)
> +static void coroutine_fn block_copy_inflight_req_shrink(BlockCopyState *s,
> +        BlockCopyInFlightReq *req, int64_t new_bytes)
>   {
> +    if (new_bytes == req->bytes) {
> +        return;
> +    }
> +
> +    assert(new_bytes > 0 && new_bytes < req->bytes);
> +
> +    bdrv_set_dirty_bitmap(s->copy_bitmap,
> +                          req->offset + new_bytes, req->bytes - new_bytes);
> +
> +    req->bytes = new_bytes;
> +    qemu_co_queue_restart_all(&req->wait_queue);

Won't we get performance degradation from calling that function so frequently?

> +}
> +
> +static void coroutine_fn block_copy_inflight_req_end(BlockCopyState *s,
> +                                                     BlockCopyInFlightReq *req,
> +                                                     int ret)
> +{
> +    if (ret < 0) {
> +        bdrv_set_dirty_bitmap(s->copy_bitmap, req->offset, req->bytes);
> +    }
>       QLIST_REMOVE(req, list);
>       qemu_co_queue_restart_all(&req->wait_queue);
>   }
> @@ -344,12 +377,19 @@ int64_t block_copy_reset_unallocated(BlockCopyState *s,
>       return ret;
>   }
>   
> -int coroutine_fn block_copy(BlockCopyState *s,
> -                            int64_t offset, uint64_t bytes,
> -                            bool *error_is_read)
> +/*
> + * block_copy_dirty_clusters
> + *
> + * Copy dirty clusters in @start/@bytes range.

%s/start/offset/ ?

> + * Returns 1 if dirty clusters found and successfully copied, 0 if no dirty
> + * clusters found and -errno on failure.
> + */
> +static int coroutine_fn block_copy_dirty_clusters(BlockCopyState *s,
> +                                                  int64_t offset, int64_t bytes,
> +                                                  bool *error_is_read)
>   {
>       int ret = 0;
> -    BlockCopyInFlightReq req;
> +    bool found_dirty = false;
>   
>       /*
>        * block_copy() user is responsible for keeping source and target in same
> @@ -361,10 +401,8 @@ int coroutine_fn block_copy(BlockCopyState *s,
>       assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
>       assert(QEMU_IS_ALIGNED(bytes, s->cluster_size));
>   
> -    block_copy_wait_inflight_reqs(s, offset, bytes);
> -    block_copy_inflight_req_begin(s, &req, offset, bytes);
> -
>       while (bytes) {
> +        BlockCopyInFlightReq req;
>           int64_t next_zero, cur_bytes, status_bytes;
>   
>           if (!bdrv_dirty_bitmap_get(s->copy_bitmap, offset)) {
> @@ -374,6 +412,8 @@ int coroutine_fn block_copy(BlockCopyState *s,
>               continue; /* already copied */
>           }
>   
> +        found_dirty = true;
> +
>           cur_bytes = MIN(bytes, s->copy_size);
>   
>           next_zero = bdrv_dirty_bitmap_next_zero(s->copy_bitmap, offset,
> @@ -383,10 +423,12 @@ int coroutine_fn block_copy(BlockCopyState *s,
>               assert(next_zero < offset + cur_bytes); /* no need to do MIN() */
>               cur_bytes = next_zero - offset;
>           }
> +        block_copy_inflight_req_begin(s, &req, offset, cur_bytes);
>   
>           ret = block_copy_block_status(s, offset, cur_bytes, &status_bytes);
> +        block_copy_inflight_req_shrink(s, &req, status_bytes);
>           if (s->skip_unallocated && !(ret & BDRV_BLOCK_ALLOCATED)) {
> -            bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, status_bytes);
> +            block_copy_inflight_req_end(s, &req, 0);
>               s->progress_reset_callback(s->progress_opaque);
>               trace_block_copy_skip_range(s, offset, status_bytes);
>               offset += status_bytes;
> @@ -398,15 +440,13 @@ int coroutine_fn block_copy(BlockCopyState *s,
>   
>           trace_block_copy_process(s, offset);
>   
> -        bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, cur_bytes);
> -
>           co_get_from_shres(s->mem, cur_bytes);
>           ret = block_copy_do_copy(s, offset, cur_bytes, ret & BDRV_BLOCK_ZERO,
>                                    error_is_read);
>           co_put_to_shres(s->mem, cur_bytes);
> +        block_copy_inflight_req_end(s, &req, ret);
>           if (ret < 0) {
> -            bdrv_set_dirty_bitmap(s->copy_bitmap, offset, cur_bytes);
> -            break;
> +            return ret;
>           }
>   
>           s->progress_bytes_callback(cur_bytes, s->progress_opaque);
> @@ -414,7 +454,41 @@ int coroutine_fn block_copy(BlockCopyState *s,
>           bytes -= cur_bytes;
>       }
>   
> -    block_copy_inflight_req_end(&req);
> +    return found_dirty;
> +}
>   
> -    return ret;
> +int coroutine_fn block_copy(BlockCopyState *s, int64_t start, uint64_t bytes,
> +                            bool *error_is_read)
> +{
> +    while (true) {
> +        int ret = block_copy_dirty_clusters(s, start, bytes, error_is_read);
> +
> +        if (ret < 0) {
> +            /*
> +             * IO operation failed, which means the whole block_copy request
> +             * failed.
> +             */
> +            return ret;
> +        }
> +        if (ret) {
> +            /*
> +             * Something was copied, which means that there were yield points
> +             * and some new dirty bits may appered (due to failed parallel
> +             * block-copy requests).
> +             */
> +            continue;
> +        }
> +
> +        /*
> +         * Here ret == 0, which means that there is no dirty clusters in

there is no dirty cluster in

> +         * requested region.
> +         */
> +
> +        if (!block_copy_wait_one(s, start, bytes)) {
> +            /* No dirty bits and nothing to wait: the whole request is done */
> +            break;
> +        }
> +    }
> +
> +    return 0;
>   }
> 

-- 
With the best regards,
Andrey Shinkevich

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 6/7] block/block-copy: reduce intersecting request lock
  2020-01-29 20:05   ` Andrey Shinkevich
@ 2020-01-30 13:45     ` Vladimir Sementsov-Ogievskiy
  2020-01-30 16:24       ` Andrey Shinkevich
  0 siblings, 1 reply; 40+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-01-30 13:45 UTC (permalink / raw)
  To: Andrey Shinkevich, qemu-block
  Cc: kwolf, Denis Lunev, jsnow, qemu-devel, mreitz

29.01.2020 23:05, Andrey Shinkevich wrote:
> 
> 
> On 27/11/2019 21:08, Vladimir Sementsov-Ogievskiy wrote:
>> Currently, block_copy operation lock the whole requested region. But
>> there is no reason to lock clusters, which are already copied, it will
>> disturb other parallel block_copy requests for no reason.
>>
>> Let's instead do the following:
>>
>> Lock only sub-region, which we are going to operate on. Then, after
>> copying all dirty sub-regions, we should wait for intersecting
>> requests block-copy, if they failed, we should retry these new dirty
>> clusters.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>    block/block-copy.c | 116 +++++++++++++++++++++++++++++++++++++--------
>>    1 file changed, 95 insertions(+), 21 deletions(-)
>>
>> diff --git a/block/block-copy.c b/block/block-copy.c
>> index 20068cd699..aca44b13fb 100644
>> --- a/block/block-copy.c
>> +++ b/block/block-copy.c
>> @@ -39,29 +39,62 @@ static BlockCopyInFlightReq *block_copy_find_inflight_req(BlockCopyState *s,
>>        return NULL;
>>    }
>>    
>> -static void coroutine_fn block_copy_wait_inflight_reqs(BlockCopyState *s,
>> -                                                       int64_t offset,
>> -                                                       int64_t bytes)
>> +/*
>> + * If there are no intersecting requests return false. Otherwise, wait for the
>> + * first found intersecting request to finish and return true.
>> + */
>> +static bool coroutine_fn block_copy_wait_one(BlockCopyState *s, int64_t start,
>> +                                             int64_t end)
>>    {
>> -    BlockCopyInFlightReq *req;
>> +    BlockCopyInFlightReq *req = block_copy_find_inflight_req(s, start, end);
>>    
>> -    while ((req = block_copy_find_inflight_req(s, offset, bytes))) {
>> -        qemu_co_queue_wait(&req->wait_queue, NULL);
>> +    if (!req) {
>> +        return false;
>>        }
>> +
>> +    qemu_co_queue_wait(&req->wait_queue, NULL);
>> +
>> +    return true;
>>    }
>>    
>> +/* Called only on full-dirty region */
>>    static void block_copy_inflight_req_begin(BlockCopyState *s,
>>                                              BlockCopyInFlightReq *req,
>>                                              int64_t offset, int64_t bytes)
>>    {
>> +    assert(!block_copy_find_inflight_req(s, offset, bytes));
>> +
>> +    bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, bytes);
>> +
>>        req->offset = offset;
>>        req->bytes = bytes;
>>        qemu_co_queue_init(&req->wait_queue);
>>        QLIST_INSERT_HEAD(&s->inflight_reqs, req, list);
>>    }
>>    
>> -static void coroutine_fn block_copy_inflight_req_end(BlockCopyInFlightReq *req)
>> +static void coroutine_fn block_copy_inflight_req_shrink(BlockCopyState *s,
>> +        BlockCopyInFlightReq *req, int64_t new_bytes)
>>    {
>> +    if (new_bytes == req->bytes) {
>> +        return;
>> +    }
>> +
>> +    assert(new_bytes > 0 && new_bytes < req->bytes);
>> +
>> +    bdrv_set_dirty_bitmap(s->copy_bitmap,
>> +                          req->offset + new_bytes, req->bytes - new_bytes);
>> +
>> +    req->bytes = new_bytes;
>> +    qemu_co_queue_restart_all(&req->wait_queue);
> 
> Won't we get performance degradation from calling that function so frequently?

Why do you think so? In IO, performance mostly depends on disk speed and on how
we organize the request sequence. The whole original series shows a performance
improvement.

This patch reduces the lock around a request, locking only the part we are
working on now, which gives better interactivity. After calling block-status,
the request is shrunk so that other requests waiting on the tail of our request
can be unlocked. Do you have a better suggestion for this synchronization?
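In other words, the flow in block_copy_dirty_clusters() becomes roughly
(simplified sketch, not the exact code):

    block_copy_inflight_req_begin(s, &req, offset, cur_bytes); /* lock only this sub-region */
    ret = block_copy_block_status(s, offset, cur_bytes, &status_bytes);
    block_copy_inflight_req_shrink(s, &req, status_bytes);     /* wake waiters on the tail */
    cur_bytes = MIN(cur_bytes, status_bytes);
    ret = block_copy_do_copy(s, offset, cur_bytes, ret & BDRV_BLOCK_ZERO, error_is_read);
    block_copy_inflight_req_end(s, &req, ret);                  /* unlock; re-dirty on error */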

> 
>> +}
>> +
>> +static void coroutine_fn block_copy_inflight_req_end(BlockCopyState *s,
>> +                                                     BlockCopyInFlightReq *req,
>> +                                                     int ret)
>> +{
>> +    if (ret < 0) {
>> +        bdrv_set_dirty_bitmap(s->copy_bitmap, req->offset, req->bytes);
>> +    }
>>        QLIST_REMOVE(req, list);
>>        qemu_co_queue_restart_all(&req->wait_queue);
>>    }
>> @@ -344,12 +377,19 @@ int64_t block_copy_reset_unallocated(BlockCopyState *s,
>>        return ret;
>>    }
>>    
>> -int coroutine_fn block_copy(BlockCopyState *s,
>> -                            int64_t offset, uint64_t bytes,
>> -                            bool *error_is_read)
>> +/*
>> + * block_copy_dirty_clusters
>> + *
>> + * Copy dirty clusters in @start/@bytes range.
> 
> %s/start/offset/ ?
> 
>> + * Returns 1 if dirty clusters found and successfully copied, 0 if no dirty
>> + * clusters found and -errno on failure.
>> + */
>> +static int coroutine_fn block_copy_dirty_clusters(BlockCopyState *s,
>> +                                                  int64_t offset, int64_t bytes,
>> +                                                  bool *error_is_read)
>>    {
>>        int ret = 0;
>> -    BlockCopyInFlightReq req;
>> +    bool found_dirty = false;
>>    
>>        /*
>>         * block_copy() user is responsible for keeping source and target in same
>> @@ -361,10 +401,8 @@ int coroutine_fn block_copy(BlockCopyState *s,
>>        assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
>>        assert(QEMU_IS_ALIGNED(bytes, s->cluster_size));
>>    
>> -    block_copy_wait_inflight_reqs(s, offset, bytes);
>> -    block_copy_inflight_req_begin(s, &req, offset, bytes);
>> -
>>        while (bytes) {
>> +        BlockCopyInFlightReq req;
>>            int64_t next_zero, cur_bytes, status_bytes;
>>    
>>            if (!bdrv_dirty_bitmap_get(s->copy_bitmap, offset)) {
>> @@ -374,6 +412,8 @@ int coroutine_fn block_copy(BlockCopyState *s,
>>                continue; /* already copied */
>>            }
>>    
>> +        found_dirty = true;
>> +
>>            cur_bytes = MIN(bytes, s->copy_size);
>>    
>>            next_zero = bdrv_dirty_bitmap_next_zero(s->copy_bitmap, offset,
>> @@ -383,10 +423,12 @@ int coroutine_fn block_copy(BlockCopyState *s,
>>                assert(next_zero < offset + cur_bytes); /* no need to do MIN() */
>>                cur_bytes = next_zero - offset;
>>            }
>> +        block_copy_inflight_req_begin(s, &req, offset, cur_bytes);
>>    
>>            ret = block_copy_block_status(s, offset, cur_bytes, &status_bytes);
>> +        block_copy_inflight_req_shrink(s, &req, status_bytes);
>>            if (s->skip_unallocated && !(ret & BDRV_BLOCK_ALLOCATED)) {
>> -            bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, status_bytes);
>> +            block_copy_inflight_req_end(s, &req, 0);
>>                s->progress_reset_callback(s->progress_opaque);
>>                trace_block_copy_skip_range(s, offset, status_bytes);
>>                offset += status_bytes;
>> @@ -398,15 +440,13 @@ int coroutine_fn block_copy(BlockCopyState *s,
>>    
>>            trace_block_copy_process(s, offset);
>>    
>> -        bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, cur_bytes);
>> -
>>            co_get_from_shres(s->mem, cur_bytes);
>>            ret = block_copy_do_copy(s, offset, cur_bytes, ret & BDRV_BLOCK_ZERO,
>>                                     error_is_read);
>>            co_put_to_shres(s->mem, cur_bytes);
>> +        block_copy_inflight_req_end(s, &req, ret);
>>            if (ret < 0) {
>> -            bdrv_set_dirty_bitmap(s->copy_bitmap, offset, cur_bytes);
>> -            break;
>> +            return ret;
>>            }
>>    
>>            s->progress_bytes_callback(cur_bytes, s->progress_opaque);
>> @@ -414,7 +454,41 @@ int coroutine_fn block_copy(BlockCopyState *s,
>>            bytes -= cur_bytes;
>>        }
>>    
>> -    block_copy_inflight_req_end(&req);
>> +    return found_dirty;
>> +}
>>    
>> -    return ret;
>> +int coroutine_fn block_copy(BlockCopyState *s, int64_t start, uint64_t bytes,
>> +                            bool *error_is_read)
>> +{
>> +    while (true) {
>> +        int ret = block_copy_dirty_clusters(s, start, bytes, error_is_read);
>> +
>> +        if (ret < 0) {
>> +            /*
>> +             * IO operation failed, which means the whole block_copy request
>> +             * failed.
>> +             */
>> +            return ret;
>> +        }
>> +        if (ret) {
>> +            /*
>> +             * Something was copied, which means that there were yield points
>> +             * and some new dirty bits may appered (due to failed parallel
>> +             * block-copy requests).
>> +             */
>> +            continue;
>> +        }
>> +
>> +        /*
>> +         * Here ret == 0, which means that there is no dirty clusters in
> 
> there is no dirty cluster in
> 
>> +         * requested region.
>> +         */
>> +
>> +        if (!block_copy_wait_one(s, start, bytes)) {
>> +            /* No dirty bits and nothing to wait: the whole request is done */
>> +            break;
>> +        }
>> +    }
>> +
>> +    return 0;
>>    }
>>
> 


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 6/7] block/block-copy: reduce intersecting request lock
  2019-11-27 18:08 ` [PATCH v2 6/7] block/block-copy: reduce intersecting request lock Vladimir Sementsov-Ogievskiy
  2020-01-29 20:05   ` Andrey Shinkevich
@ 2020-01-30 15:53   ` Andrey Shinkevich
  2020-01-30 16:05     ` Vladimir Sementsov-Ogievskiy
  2020-02-17 13:38   ` Max Reitz
  2 siblings, 1 reply; 40+ messages in thread
From: Andrey Shinkevich @ 2020-01-30 15:53 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: kwolf, den, jsnow, qemu-devel, mreitz



On 27/11/2019 21:08, Vladimir Sementsov-Ogievskiy wrote:
> Currently, block_copy operation lock the whole requested region. But
> there is no reason to lock clusters, which are already copied, it will
> disturb other parallel block_copy requests for no reason.
> 
> Let's instead do the following:
> 
> Lock only sub-region, which we are going to operate on. Then, after
> copying all dirty sub-regions, we should wait for intersecting
> requests block-copy, if they failed, we should retry these new dirty
> clusters.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>   block/block-copy.c | 116 +++++++++++++++++++++++++++++++++++++--------
>   1 file changed, 95 insertions(+), 21 deletions(-)
> 
> diff --git a/block/block-copy.c b/block/block-copy.c
> index 20068cd699..aca44b13fb 100644
> --- a/block/block-copy.c
> +++ b/block/block-copy.c
> @@ -39,29 +39,62 @@ static BlockCopyInFlightReq *block_copy_find_inflight_req(BlockCopyState *s,
>       return NULL;
>   }
>   
> -static void coroutine_fn block_copy_wait_inflight_reqs(BlockCopyState *s,
> -                                                       int64_t offset,
> -                                                       int64_t bytes)
> +/*
> + * If there are no intersecting requests return false. Otherwise, wait for the
> + * first found intersecting request to finish and return true.
> + */
> +static bool coroutine_fn block_copy_wait_one(BlockCopyState *s, int64_t start,
> +                                             int64_t end)
>   {
> -    BlockCopyInFlightReq *req;
> +    BlockCopyInFlightReq *req = block_copy_find_inflight_req(s, start, end);
>   
> -    while ((req = block_copy_find_inflight_req(s, offset, bytes))) {
> -        qemu_co_queue_wait(&req->wait_queue, NULL);
> +    if (!req) {
> +        return false;
>       }
> +
> +    qemu_co_queue_wait(&req->wait_queue, NULL);
> +
> +    return true;
>   }
>   
> +/* Called only on full-dirty region */
>   static void block_copy_inflight_req_begin(BlockCopyState *s,
>                                             BlockCopyInFlightReq *req,
>                                             int64_t offset, int64_t bytes)
>   {
> +    assert(!block_copy_find_inflight_req(s, offset, bytes));
> +
> +    bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, bytes);
> +
>       req->offset = offset;
>       req->bytes = bytes;
>       qemu_co_queue_init(&req->wait_queue);
>       QLIST_INSERT_HEAD(&s->inflight_reqs, req, list);
>   }
>   
> -static void coroutine_fn block_copy_inflight_req_end(BlockCopyInFlightReq *req)
> +static void coroutine_fn block_copy_inflight_req_shrink(BlockCopyState *s,
> +        BlockCopyInFlightReq *req, int64_t new_bytes)
>   {
> +    if (new_bytes == req->bytes) {
> +        return;
> +    }
> +
> +    assert(new_bytes > 0 && new_bytes < req->bytes);
> +
> +    bdrv_set_dirty_bitmap(s->copy_bitmap,
> +                          req->offset + new_bytes, req->bytes - new_bytes);
> +
> +    req->bytes = new_bytes;
> +    qemu_co_queue_restart_all(&req->wait_queue);
> +}
> +
> +static void coroutine_fn block_copy_inflight_req_end(BlockCopyState *s,
> +                                                     BlockCopyInFlightReq *req,
> +                                                     int ret)

I suggest substituting the 'int ret' parameter with something like 'bool 
set_dirty' and passing the expression 'ret < 0' as the argument in the 
calling function. 'int ret' is traditionally used in QEMU as a local 
return variable.
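
Just to make that concrete, a minimal sketch of what I have in mind
(untested, only to illustrate the suggested signature):

static void coroutine_fn block_copy_inflight_req_end(BlockCopyState *s,
                                                     BlockCopyInFlightReq *req,
                                                     bool set_dirty)
{
    if (set_dirty) {
        /* re-publish the clusters we failed to copy */
        bdrv_set_dirty_bitmap(s->copy_bitmap, req->offset, req->bytes);
    }
    QLIST_REMOVE(req, list);
    qemu_co_queue_restart_all(&req->wait_queue);
}

and in the callers, e.g.:

    block_copy_inflight_req_end(s, &req, ret < 0);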

> +{
> +    if (ret < 0) {
> +        bdrv_set_dirty_bitmap(s->copy_bitmap, req->offset, req->bytes);
> +    }
>       QLIST_REMOVE(req, list);
>       qemu_co_queue_restart_all(&req->wait_queue);
>   }
> @@ -344,12 +377,19 @@ int64_t block_copy_reset_unallocated(BlockCopyState *s,
>       return ret;
>   }
>   
> -int coroutine_fn block_copy(BlockCopyState *s,
> -                            int64_t offset, uint64_t bytes,
> -                            bool *error_is_read)
> +/*
> + * block_copy_dirty_clusters
> + *
> + * Copy dirty clusters in @start/@bytes range.
> + * Returns 1 if dirty clusters found and successfully copied, 0 if no dirty
> + * clusters found and -errno on failure.
> + */
> +static int coroutine_fn block_copy_dirty_clusters(BlockCopyState *s,
> +                                                  int64_t offset, int64_t bytes,
> +                                                  bool *error_is_read)
>   {
>       int ret = 0;
> -    BlockCopyInFlightReq req;
> +    bool found_dirty = false;
>   
>       /*
>        * block_copy() user is responsible for keeping source and target in same
> @@ -361,10 +401,8 @@ int coroutine_fn block_copy(BlockCopyState *s,
>       assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
>       assert(QEMU_IS_ALIGNED(bytes, s->cluster_size));
>   
> -    block_copy_wait_inflight_reqs(s, offset, bytes);
> -    block_copy_inflight_req_begin(s, &req, offset, bytes);
> -
>       while (bytes) {
> +        BlockCopyInFlightReq req;
>           int64_t next_zero, cur_bytes, status_bytes;
>   
>           if (!bdrv_dirty_bitmap_get(s->copy_bitmap, offset)) {
> @@ -374,6 +412,8 @@ int coroutine_fn block_copy(BlockCopyState *s,
>               continue; /* already copied */
>           }
>   
> +        found_dirty = true;
> +
>           cur_bytes = MIN(bytes, s->copy_size);
>   
>           next_zero = bdrv_dirty_bitmap_next_zero(s->copy_bitmap, offset,
> @@ -383,10 +423,12 @@ int coroutine_fn block_copy(BlockCopyState *s,
>               assert(next_zero < offset + cur_bytes); /* no need to do MIN() */
>               cur_bytes = next_zero - offset;
>           }
> +        block_copy_inflight_req_begin(s, &req, offset, cur_bytes);
>   
>           ret = block_copy_block_status(s, offset, cur_bytes, &status_bytes);
> +        block_copy_inflight_req_shrink(s, &req, status_bytes);
>           if (s->skip_unallocated && !(ret & BDRV_BLOCK_ALLOCATED)) {
> -            bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, status_bytes);
> +            block_copy_inflight_req_end(s, &req, 0);
>               s->progress_reset_callback(s->progress_opaque);
>               trace_block_copy_skip_range(s, offset, status_bytes);
>               offset += status_bytes;
> @@ -398,15 +440,13 @@ int coroutine_fn block_copy(BlockCopyState *s,
>   
>           trace_block_copy_process(s, offset);
>   
> -        bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, cur_bytes);
> -
>           co_get_from_shres(s->mem, cur_bytes);
>           ret = block_copy_do_copy(s, offset, cur_bytes, ret & BDRV_BLOCK_ZERO,
>                                    error_is_read);
>           co_put_to_shres(s->mem, cur_bytes);
> +        block_copy_inflight_req_end(s, &req, ret);
>           if (ret < 0) {
> -            bdrv_set_dirty_bitmap(s->copy_bitmap, offset, cur_bytes);
> -            break;
> +            return ret;
>           }
>   
>           s->progress_bytes_callback(cur_bytes, s->progress_opaque);
> @@ -414,7 +454,41 @@ int coroutine_fn block_copy(BlockCopyState *s,
>           bytes -= cur_bytes;
>       }
>   
> -    block_copy_inflight_req_end(&req);
> +    return found_dirty;
> +}
>   
> -    return ret;
> +int coroutine_fn block_copy(BlockCopyState *s, int64_t start, uint64_t bytes,
> +                            bool *error_is_read)
> +{
> +    while (true) {
> +        int ret = block_copy_dirty_clusters(s, start, bytes, error_is_read);
> +
> +        if (ret < 0) {
> +            /*
> +             * IO operation failed, which means the whole block_copy request
> +             * failed.
> +             */
> +            return ret;
> +        }
> +        if (ret) {
> +            /*
> +             * Something was copied, which means that there were yield points
> +             * and some new dirty bits may appered (due to failed parallel
> +             * block-copy requests).
> +             */
> +            continue;
> +        }
> +
> +        /*
> +         * Here ret == 0, which means that there is no dirty clusters in
> +         * requested region.
> +         */
> +
> +        if (!block_copy_wait_one(s, start, bytes)) {
> +            /* No dirty bits and nothing to wait: the whole request is done */
> +            break;
> +        }
> +    }
> +
> +    return 0;
>   }
> 

-- 
With the best regards,
Andrey Shinkevich


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 6/7] block/block-copy: reduce intersecting request lock
  2020-01-30 15:53   ` Andrey Shinkevich
@ 2020-01-30 16:05     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 40+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-01-30 16:05 UTC (permalink / raw)
  To: Andrey Shinkevich, qemu-block; +Cc: kwolf, den, jsnow, qemu-devel, mreitz

30.01.2020 18:53, Andrey Shinkevich wrote:
> 
> 
> On 27/11/2019 21:08, Vladimir Sementsov-Ogievskiy wrote:
>> Currently, block_copy operation lock the whole requested region. But
>> there is no reason to lock clusters, which are already copied, it will
>> disturb other parallel block_copy requests for no reason.
>>
>> Let's instead do the following:
>>
>> Lock only sub-region, which we are going to operate on. Then, after
>> copying all dirty sub-regions, we should wait for intersecting
>> requests block-copy, if they failed, we should retry these new dirty
>> clusters.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   block/block-copy.c | 116 +++++++++++++++++++++++++++++++++++++--------
>>   1 file changed, 95 insertions(+), 21 deletions(-)
>>
>> diff --git a/block/block-copy.c b/block/block-copy.c
>> index 20068cd699..aca44b13fb 100644
>> --- a/block/block-copy.c
>> +++ b/block/block-copy.c
>> @@ -39,29 +39,62 @@ static BlockCopyInFlightReq *block_copy_find_inflight_req(BlockCopyState *s,
>>       return NULL;
>>   }
>> -static void coroutine_fn block_copy_wait_inflight_reqs(BlockCopyState *s,
>> -                                                       int64_t offset,
>> -                                                       int64_t bytes)
>> +/*
>> + * If there are no intersecting requests return false. Otherwise, wait for the
>> + * first found intersecting request to finish and return true.
>> + */
>> +static bool coroutine_fn block_copy_wait_one(BlockCopyState *s, int64_t start,
>> +                                             int64_t end)
>>   {
>> -    BlockCopyInFlightReq *req;
>> +    BlockCopyInFlightReq *req = block_copy_find_inflight_req(s, start, end);
>> -    while ((req = block_copy_find_inflight_req(s, offset, bytes))) {
>> -        qemu_co_queue_wait(&req->wait_queue, NULL);
>> +    if (!req) {
>> +        return false;
>>       }
>> +
>> +    qemu_co_queue_wait(&req->wait_queue, NULL);
>> +
>> +    return true;
>>   }
>> +/* Called only on full-dirty region */
>>   static void block_copy_inflight_req_begin(BlockCopyState *s,
>>                                             BlockCopyInFlightReq *req,
>>                                             int64_t offset, int64_t bytes)
>>   {
>> +    assert(!block_copy_find_inflight_req(s, offset, bytes));
>> +
>> +    bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, bytes);
>> +
>>       req->offset = offset;
>>       req->bytes = bytes;
>>       qemu_co_queue_init(&req->wait_queue);
>>       QLIST_INSERT_HEAD(&s->inflight_reqs, req, list);
>>   }
>> -static void coroutine_fn block_copy_inflight_req_end(BlockCopyInFlightReq *req)
>> +static void coroutine_fn block_copy_inflight_req_shrink(BlockCopyState *s,
>> +        BlockCopyInFlightReq *req, int64_t new_bytes)
>>   {
>> +    if (new_bytes == req->bytes) {
>> +        return;
>> +    }
>> +
>> +    assert(new_bytes > 0 && new_bytes < req->bytes);
>> +
>> +    bdrv_set_dirty_bitmap(s->copy_bitmap,
>> +                          req->offset + new_bytes, req->bytes - new_bytes);
>> +
>> +    req->bytes = new_bytes;
>> +    qemu_co_queue_restart_all(&req->wait_queue);
>> +}
>> +
>> +static void coroutine_fn block_copy_inflight_req_end(BlockCopyState *s,
>> +                                                     BlockCopyInFlightReq *req,
>> +                                                     int ret)
> 
> I suggest substituting the 'int ret' parameter with someone like 'bool set_dirty' and pass the expression 'ret < 0' as the input parameter in calling function. The 'int ret' is used in QEMU as a local return variable traditionally.


"int ret" is normal practice also for completion functions, look at bdrv_co_io_em_complete, bdrv_co_write_req_finish, do git grep 'int ret)'

> 
>> +{
>> +    if (ret < 0) {
>> +        bdrv_set_dirty_bitmap(s->copy_bitmap, req->offset, req->bytes);
>> +    }
>>       QLIST_REMOVE(req, list);
>>       qemu_co_queue_restart_all(&req->wait_queue);
>>   }
>> @@ -344,12 +377,19 @@ int64_t block_copy_reset_unallocated(BlockCopyState *s,
>>       return ret;
>>   }
>> -int coroutine_fn block_copy(BlockCopyState *s,
>> -                            int64_t offset, uint64_t bytes,
>> -                            bool *error_is_read)
>> +/*
>> + * block_copy_dirty_clusters
>> + *
>> + * Copy dirty clusters in @start/@bytes range.
>> + * Returns 1 if dirty clusters found and successfully copied, 0 if no dirty
>> + * clusters found and -errno on failure.
>> + */
>> +static int coroutine_fn block_copy_dirty_clusters(BlockCopyState *s,
>> +                                                  int64_t offset, int64_t bytes,
>> +                                                  bool *error_is_read)
>>   {
>>       int ret = 0;
>> -    BlockCopyInFlightReq req;
>> +    bool found_dirty = false;
>>       /*
>>        * block_copy() user is responsible for keeping source and target in same
>> @@ -361,10 +401,8 @@ int coroutine_fn block_copy(BlockCopyState *s,
>>       assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
>>       assert(QEMU_IS_ALIGNED(bytes, s->cluster_size));
>> -    block_copy_wait_inflight_reqs(s, offset, bytes);
>> -    block_copy_inflight_req_begin(s, &req, offset, bytes);
>> -
>>       while (bytes) {
>> +        BlockCopyInFlightReq req;
>>           int64_t next_zero, cur_bytes, status_bytes;
>>           if (!bdrv_dirty_bitmap_get(s->copy_bitmap, offset)) {
>> @@ -374,6 +412,8 @@ int coroutine_fn block_copy(BlockCopyState *s,
>>               continue; /* already copied */
>>           }
>> +        found_dirty = true;
>> +
>>           cur_bytes = MIN(bytes, s->copy_size);
>>           next_zero = bdrv_dirty_bitmap_next_zero(s->copy_bitmap, offset,
>> @@ -383,10 +423,12 @@ int coroutine_fn block_copy(BlockCopyState *s,
>>               assert(next_zero < offset + cur_bytes); /* no need to do MIN() */
>>               cur_bytes = next_zero - offset;
>>           }
>> +        block_copy_inflight_req_begin(s, &req, offset, cur_bytes);
>>           ret = block_copy_block_status(s, offset, cur_bytes, &status_bytes);
>> +        block_copy_inflight_req_shrink(s, &req, status_bytes);
>>           if (s->skip_unallocated && !(ret & BDRV_BLOCK_ALLOCATED)) {
>> -            bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, status_bytes);
>> +            block_copy_inflight_req_end(s, &req, 0);
>>               s->progress_reset_callback(s->progress_opaque);
>>               trace_block_copy_skip_range(s, offset, status_bytes);
>>               offset += status_bytes;
>> @@ -398,15 +440,13 @@ int coroutine_fn block_copy(BlockCopyState *s,
>>           trace_block_copy_process(s, offset);
>> -        bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, cur_bytes);
>> -
>>           co_get_from_shres(s->mem, cur_bytes);
>>           ret = block_copy_do_copy(s, offset, cur_bytes, ret & BDRV_BLOCK_ZERO,
>>                                    error_is_read);
>>           co_put_to_shres(s->mem, cur_bytes);
>> +        block_copy_inflight_req_end(s, &req, ret);
>>           if (ret < 0) {
>> -            bdrv_set_dirty_bitmap(s->copy_bitmap, offset, cur_bytes);
>> -            break;
>> +            return ret;
>>           }
>>           s->progress_bytes_callback(cur_bytes, s->progress_opaque);
>> @@ -414,7 +454,41 @@ int coroutine_fn block_copy(BlockCopyState *s,
>>           bytes -= cur_bytes;
>>       }
>> -    block_copy_inflight_req_end(&req);
>> +    return found_dirty;
>> +}
>> -    return ret;
>> +int coroutine_fn block_copy(BlockCopyState *s, int64_t start, uint64_t bytes,
>> +                            bool *error_is_read)
>> +{
>> +    while (true) {
>> +        int ret = block_copy_dirty_clusters(s, start, bytes, error_is_read);
>> +
>> +        if (ret < 0) {
>> +            /*
>> +             * IO operation failed, which means the whole block_copy request
>> +             * failed.
>> +             */
>> +            return ret;
>> +        }
>> +        if (ret) {
>> +            /*
>> +             * Something was copied, which means that there were yield points
>> +             * and some new dirty bits may appered (due to failed parallel
>> +             * block-copy requests).
>> +             */
>> +            continue;
>> +        }
>> +
>> +        /*
>> +         * Here ret == 0, which means that there is no dirty clusters in
>> +         * requested region.
>> +         */
>> +
>> +        if (!block_copy_wait_one(s, start, bytes)) {
>> +            /* No dirty bits and nothing to wait: the whole request is done */
>> +            break;
>> +        }
>> +    }
>> +
>> +    return 0;
>>   }
>>
> 


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 6/7] block/block-copy: reduce intersecting request lock
  2020-01-30 13:45     ` Vladimir Sementsov-Ogievskiy
@ 2020-01-30 16:24       ` Andrey Shinkevich
  2020-01-30 17:09         ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 40+ messages in thread
From: Andrey Shinkevich @ 2020-01-30 16:24 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: kwolf, Denis Lunev, jsnow, qemu-devel, mreitz



On 30/01/2020 16:45, Vladimir Sementsov-Ogievskiy wrote:
> 29.01.2020 23:05, Andrey Shinkevich wrote:
>>
>>
>> On 27/11/2019 21:08, Vladimir Sementsov-Ogievskiy wrote:
>>> Currently, block_copy operation lock the whole requested region. But
>>> there is no reason to lock clusters, which are already copied, it will
>>> disturb other parallel block_copy requests for no reason.
>>>
>>> Let's instead do the following:
>>>
>>> Lock only sub-region, which we are going to operate on. Then, after
>>> copying all dirty sub-regions, we should wait for intersecting
>>> requests block-copy, if they failed, we should retry these new dirty
>>> clusters.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>> ---
>>>    block/block-copy.c | 116 
>>> +++++++++++++++++++++++++++++++++++++--------
>>>    1 file changed, 95 insertions(+), 21 deletions(-)
>>>
>>> diff --git a/block/block-copy.c b/block/block-copy.c
>>> index 20068cd699..aca44b13fb 100644
>>> --- a/block/block-copy.c
>>> +++ b/block/block-copy.c
>>> @@ -39,29 +39,62 @@ static BlockCopyInFlightReq 
>>> *block_copy_find_inflight_req(BlockCopyState *s,
>>>        return NULL;
>>>    }
>>> -static void coroutine_fn 
>>> block_copy_wait_inflight_reqs(BlockCopyState *s,
>>> -                                                       int64_t offset,
>>> -                                                       int64_t bytes)
>>> +/*
>>> + * If there are no intersecting requests return false. Otherwise, 
>>> wait for the
>>> + * first found intersecting request to finish and return true.
>>> + */
>>> +static bool coroutine_fn block_copy_wait_one(BlockCopyState *s, 
>>> int64_t start,
>>> +                                             int64_t end)
>>>    {
>>> -    BlockCopyInFlightReq *req;
>>> +    BlockCopyInFlightReq *req = block_copy_find_inflight_req(s, 
>>> start, end);
>>> -    while ((req = block_copy_find_inflight_req(s, offset, bytes))) {
>>> -        qemu_co_queue_wait(&req->wait_queue, NULL);
>>> +    if (!req) {
>>> +        return false;
>>>        }
>>> +
>>> +    qemu_co_queue_wait(&req->wait_queue, NULL);
>>> +
>>> +    return true;
>>>    }
>>> +/* Called only on full-dirty region */
>>>    static void block_copy_inflight_req_begin(BlockCopyState *s,
>>>                                              BlockCopyInFlightReq *req,
>>>                                              int64_t offset, int64_t 
>>> bytes)
>>>    {
>>> +    assert(!block_copy_find_inflight_req(s, offset, bytes));
>>> +
>>> +    bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, bytes);
>>> +
>>>        req->offset = offset;
>>>        req->bytes = bytes;
>>>        qemu_co_queue_init(&req->wait_queue);
>>>        QLIST_INSERT_HEAD(&s->inflight_reqs, req, list);
>>>    }
>>> -static void coroutine_fn 
>>> block_copy_inflight_req_end(BlockCopyInFlightReq *req)
>>> +static void coroutine_fn 
>>> block_copy_inflight_req_shrink(BlockCopyState *s,
>>> +        BlockCopyInFlightReq *req, int64_t new_bytes)
>>>    {
>>> +    if (new_bytes == req->bytes) {
>>> +        return;
>>> +    }
>>> +
>>> +    assert(new_bytes > 0 && new_bytes < req->bytes);
>>> +
>>> +    bdrv_set_dirty_bitmap(s->copy_bitmap,
>>> +                          req->offset + new_bytes, req->bytes - 
>>> new_bytes);
>>> +
>>> +    req->bytes = new_bytes;
>>> +    qemu_co_queue_restart_all(&req->wait_queue);
>>
>> Won't we get the performance degradation with that function frequent 
>> call?
> 
> Why do you think so? In IO most of performance depends on disk speed and 
> how
> we organize requests sequence. The whole original series shows 
> performance improvement.
> 
> This patch reduces lock around request, locking only the part we are 
> working on now,
> this is for better interactivity. After calling block-status, the 
> request is shrinked
> to possibly unlock some other requests, waiting on the tail of our 
> request.. Do you
> have a better suggestion on this synchronization?
> 

I cannot answer right away. One needs to measure the performance in each 
case.

Andrey
>>
>>> +}
>>> +
>>> +static void coroutine_fn block_copy_inflight_req_end(BlockCopyState *s,
>>> +                                                     
>>> BlockCopyInFlightReq *req,
>>> +                                                     int ret)
>>> +{
>>> +    if (ret < 0) {
>>> +        bdrv_set_dirty_bitmap(s->copy_bitmap, req->offset, req->bytes);
>>> +    }
>>>        QLIST_REMOVE(req, list);
>>>        qemu_co_queue_restart_all(&req->wait_queue);
>>>    }
>>> @@ -344,12 +377,19 @@ int64_t 
>>> block_copy_reset_unallocated(BlockCopyState *s,
>>>        return ret;
>>>    }
>>> -int coroutine_fn block_copy(BlockCopyState *s,
>>> -                            int64_t offset, uint64_t bytes,
>>> -                            bool *error_is_read)
>>> +/*
>>> + * block_copy_dirty_clusters
>>> + *
>>> + * Copy dirty clusters in @start/@bytes range.
>>
>> %s/start/offset/ ?
>>
>>> + * Returns 1 if dirty clusters found and successfully copied, 0 if 
>>> no dirty
>>> + * clusters found and -errno on failure.
>>> + */
>>> +static int coroutine_fn block_copy_dirty_clusters(BlockCopyState *s,
>>> +                                                  int64_t offset, 
>>> int64_t bytes,
>>> +                                                  bool *error_is_read)
>>>    {
>>>        int ret = 0;
>>> -    BlockCopyInFlightReq req;
>>> +    bool found_dirty = false;
>>>        /*
>>>         * block_copy() user is responsible for keeping source and 
>>> target in same
>>> @@ -361,10 +401,8 @@ int coroutine_fn block_copy(BlockCopyState *s,
>>>        assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
>>>        assert(QEMU_IS_ALIGNED(bytes, s->cluster_size));
>>> -    block_copy_wait_inflight_reqs(s, offset, bytes);
>>> -    block_copy_inflight_req_begin(s, &req, offset, bytes);
>>> -
>>>        while (bytes) {
>>> +        BlockCopyInFlightReq req;
>>>            int64_t next_zero, cur_bytes, status_bytes;
>>>            if (!bdrv_dirty_bitmap_get(s->copy_bitmap, offset)) {
>>> @@ -374,6 +412,8 @@ int coroutine_fn block_copy(BlockCopyState *s,
>>>                continue; /* already copied */
>>>            }
>>> +        found_dirty = true;
>>> +
>>>            cur_bytes = MIN(bytes, s->copy_size);
>>>            next_zero = bdrv_dirty_bitmap_next_zero(s->copy_bitmap, 
>>> offset,
>>> @@ -383,10 +423,12 @@ int coroutine_fn block_copy(BlockCopyState *s,
>>>                assert(next_zero < offset + cur_bytes); /* no need to 
>>> do MIN() */
>>>                cur_bytes = next_zero - offset;
>>>            }
>>> +        block_copy_inflight_req_begin(s, &req, offset, cur_bytes);
>>>            ret = block_copy_block_status(s, offset, cur_bytes, 
>>> &status_bytes);
>>> +        block_copy_inflight_req_shrink(s, &req, status_bytes);
>>>            if (s->skip_unallocated && !(ret & BDRV_BLOCK_ALLOCATED)) {
>>> -            bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, 
>>> status_bytes);
>>> +            block_copy_inflight_req_end(s, &req, 0);
>>>                s->progress_reset_callback(s->progress_opaque);
>>>                trace_block_copy_skip_range(s, offset, status_bytes);
>>>                offset += status_bytes;
>>> @@ -398,15 +440,13 @@ int coroutine_fn block_copy(BlockCopyState *s,
>>>            trace_block_copy_process(s, offset);
>>> -        bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, cur_bytes);
>>> -
>>>            co_get_from_shres(s->mem, cur_bytes);
>>>            ret = block_copy_do_copy(s, offset, cur_bytes, ret & 
>>> BDRV_BLOCK_ZERO,
>>>                                     error_is_read);
>>>            co_put_to_shres(s->mem, cur_bytes);
>>> +        block_copy_inflight_req_end(s, &req, ret);
>>>            if (ret < 0) {
>>> -            bdrv_set_dirty_bitmap(s->copy_bitmap, offset, cur_bytes);
>>> -            break;
>>> +            return ret;
>>>            }
>>>            s->progress_bytes_callback(cur_bytes, s->progress_opaque);
>>> @@ -414,7 +454,41 @@ int coroutine_fn block_copy(BlockCopyState *s,
>>>            bytes -= cur_bytes;
>>>        }
>>> -    block_copy_inflight_req_end(&req);
>>> +    return found_dirty;
>>> +}
>>> -    return ret;
>>> +int coroutine_fn block_copy(BlockCopyState *s, int64_t start, 
>>> uint64_t bytes,
>>> +                            bool *error_is_read)
>>> +{
>>> +    while (true) {
>>> +        int ret = block_copy_dirty_clusters(s, start, bytes, 
>>> error_is_read);
>>> +
>>> +        if (ret < 0) {
>>> +            /*
>>> +             * IO operation failed, which means the whole block_copy 
>>> request
>>> +             * failed.
>>> +             */
>>> +            return ret;
>>> +        }
>>> +        if (ret) {
>>> +            /*
>>> +             * Something was copied, which means that there were 
>>> yield points
>>> +             * and some new dirty bits may appered (due to failed 
>>> parallel
>>> +             * block-copy requests).
>>> +             */
>>> +            continue;
>>> +        }
>>> +
>>> +        /*
>>> +         * Here ret == 0, which means that there is no dirty 
>>> clusters in
>>
>> there is no dirty cluster in
>>
>>> +         * requested region.
>>> +         */
>>> +
>>> +        if (!block_copy_wait_one(s, start, bytes)) {
>>> +            /* No dirty bits and nothing to wait: the whole request 
>>> is done */
>>> +            break;
>>> +        }
>>> +    }
>>> +
>>> +    return 0;
>>>    }
>>>
>>
> 
> 

-- 
With the best regards,
Andrey Shinkevich



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 6/7] block/block-copy: reduce intersecting request lock
  2020-01-30 16:24       ` Andrey Shinkevich
@ 2020-01-30 17:09         ` Vladimir Sementsov-Ogievskiy
  2020-01-30 18:00           ` Andrey Shinkevich
  0 siblings, 1 reply; 40+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-01-30 17:09 UTC (permalink / raw)
  To: Andrey Shinkevich, qemu-block
  Cc: kwolf, Denis Lunev, jsnow, qemu-devel, mreitz

30.01.2020 19:24, Andrey Shinkevich wrote:
> 
> 
> On 30/01/2020 16:45, Vladimir Sementsov-Ogievskiy wrote:
>> 29.01.2020 23:05, Andrey Shinkevich wrote:
>>>
>>>
>>> On 27/11/2019 21:08, Vladimir Sementsov-Ogievskiy wrote:
>>>> Currently, block_copy operation lock the whole requested region. But
>>>> there is no reason to lock clusters, which are already copied, it will
>>>> disturb other parallel block_copy requests for no reason.
>>>>
>>>> Let's instead do the following:
>>>>
>>>> Lock only sub-region, which we are going to operate on. Then, after
>>>> copying all dirty sub-regions, we should wait for intersecting
>>>> requests block-copy, if they failed, we should retry these new dirty
>>>> clusters.
>>>>
>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>>> ---
>>>>    block/block-copy.c | 116 +++++++++++++++++++++++++++++++++++++--------
>>>>    1 file changed, 95 insertions(+), 21 deletions(-)
>>>>
>>>> diff --git a/block/block-copy.c b/block/block-copy.c
>>>> index 20068cd699..aca44b13fb 100644
>>>> --- a/block/block-copy.c
>>>> +++ b/block/block-copy.c
>>>> @@ -39,29 +39,62 @@ static BlockCopyInFlightReq *block_copy_find_inflight_req(BlockCopyState *s,
>>>>        return NULL;
>>>>    }
>>>> -static void coroutine_fn block_copy_wait_inflight_reqs(BlockCopyState *s,
>>>> -                                                       int64_t offset,
>>>> -                                                       int64_t bytes)
>>>> +/*
>>>> + * If there are no intersecting requests return false. Otherwise, wait for the
>>>> + * first found intersecting request to finish and return true.
>>>> + */
>>>> +static bool coroutine_fn block_copy_wait_one(BlockCopyState *s, int64_t start,
>>>> +                                             int64_t end)
>>>>    {
>>>> -    BlockCopyInFlightReq *req;
>>>> +    BlockCopyInFlightReq *req = block_copy_find_inflight_req(s, start, end);
>>>> -    while ((req = block_copy_find_inflight_req(s, offset, bytes))) {
>>>> -        qemu_co_queue_wait(&req->wait_queue, NULL);
>>>> +    if (!req) {
>>>> +        return false;
>>>>        }
>>>> +
>>>> +    qemu_co_queue_wait(&req->wait_queue, NULL);
>>>> +
>>>> +    return true;
>>>>    }
>>>> +/* Called only on full-dirty region */
>>>>    static void block_copy_inflight_req_begin(BlockCopyState *s,
>>>>                                              BlockCopyInFlightReq *req,
>>>>                                              int64_t offset, int64_t bytes)
>>>>    {
>>>> +    assert(!block_copy_find_inflight_req(s, offset, bytes));
>>>> +
>>>> +    bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, bytes);
>>>> +
>>>>        req->offset = offset;
>>>>        req->bytes = bytes;
>>>>        qemu_co_queue_init(&req->wait_queue);
>>>>        QLIST_INSERT_HEAD(&s->inflight_reqs, req, list);
>>>>    }
>>>> -static void coroutine_fn block_copy_inflight_req_end(BlockCopyInFlightReq *req)
>>>> +static void coroutine_fn block_copy_inflight_req_shrink(BlockCopyState *s,
>>>> +        BlockCopyInFlightReq *req, int64_t new_bytes)
>>>>    {
>>>> +    if (new_bytes == req->bytes) {
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    assert(new_bytes > 0 && new_bytes < req->bytes);
>>>> +
>>>> +    bdrv_set_dirty_bitmap(s->copy_bitmap,
>>>> +                          req->offset + new_bytes, req->bytes - new_bytes);
>>>> +
>>>> +    req->bytes = new_bytes;
>>>> +    qemu_co_queue_restart_all(&req->wait_queue);
>>>
>>> Won't we get the performance degradation with that function frequent call?
>>
>> Why do you think so? In IO most of performance depends on disk speed and how
>> we organize requests sequence. The whole original series shows performance improvement.
>>
>> This patch reduces lock around request, locking only the part we are working on now,
>> this is for better interactivity. After calling block-status, the request is shrinked
>> to possibly unlock some other requests, waiting on the tail of our request.. Do you
>> have a better suggestion on this synchronization?
>>
> 
> I cannot answer right away. One need to measure the performance in each case.

Measurements are done in the original (big) series
  "[RFC 00/24] backup performance: block_status + async"
  <20191115141444.24155-1-vsementsov@virtuozzo.com>
  https://lists.gnu.org/archive/html/qemu-devel/2019-11/msg02335.html

and this series (part I) is a first step towards it.

Currently, improving the interactivity of parallel block-copy requests is
not very meaningful, as all requests from backup are one-cluster-sized, and
improving the performance of parallel intersecting guest writes is a
strange thing to optimize for.

But in the end, backup is refactored to be a single call of block_copy for
the whole disk, so this new locking of sub-requests becomes critical.

The only degradation I see is that with this qemu_co_queue_restart_all we
may restart requests which still intersect with the shrunk request: they
wake up and go to block_copy_wait_one again for nothing. This may be
improved by dropping the coroutine queue and using our own list of waiting
requests, waking up only the requests which do not intersect with the
shrunk request. But I think that is a premature optimization which
complicates the code, so it may be done later if needed. I don't think it
is worth it.
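
Just to show the idea (a rough sketch only; the s->waiters list and the
helper below are hypothetical, I'm not proposing this for the series):

typedef struct BlockCopyWaiter {
    int64_t offset, bytes;            /* range the waiter is interested in */
    Coroutine *co;
    QLIST_ENTRY(BlockCopyWaiter) list;
} BlockCopyWaiter;

/* On shrink, wake only the waiters that no longer intersect the request. */
static void block_copy_wake_non_intersecting(BlockCopyState *s,
                                             BlockCopyInFlightReq *req)
{
    BlockCopyWaiter *w, *next;

    QLIST_FOREACH_SAFE(w, &s->waiters, list, next) {
        if (w->offset >= req->offset + req->bytes ||
            w->offset + w->bytes <= req->offset)
        {
            QLIST_REMOVE(w, list);
            aio_co_wake(w->co);
        }
    }
}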

> 
> Andrey
>>>
>>>> +}
>>>> +
>>>> +static void coroutine_fn block_copy_inflight_req_end(BlockCopyState *s,
>>>> + BlockCopyInFlightReq *req,
>>>> +                                                     int ret)
>>>> +{
>>>> +    if (ret < 0) {
>>>> +        bdrv_set_dirty_bitmap(s->copy_bitmap, req->offset, req->bytes);
>>>> +    }
>>>>        QLIST_REMOVE(req, list);
>>>>        qemu_co_queue_restart_all(&req->wait_queue);
>>>>    }
>>>> @@ -344,12 +377,19 @@ int64_t block_copy_reset_unallocated(BlockCopyState *s,
>>>>        return ret;
>>>>    }
>>>> -int coroutine_fn block_copy(BlockCopyState *s,
>>>> -                            int64_t offset, uint64_t bytes,
>>>> -                            bool *error_is_read)
>>>> +/*
>>>> + * block_copy_dirty_clusters
>>>> + *
>>>> + * Copy dirty clusters in @start/@bytes range.
>>>
>>> %s/start/offset/ ?
>>>
>>>> + * Returns 1 if dirty clusters found and successfully copied, 0 if no dirty
>>>> + * clusters found and -errno on failure.
>>>> + */
>>>> +static int coroutine_fn block_copy_dirty_clusters(BlockCopyState *s,
>>>> +                                                  int64_t offset, int64_t bytes,
>>>> +                                                  bool *error_is_read)
>>>>    {
>>>>        int ret = 0;
>>>> -    BlockCopyInFlightReq req;
>>>> +    bool found_dirty = false;
>>>>        /*
>>>>         * block_copy() user is responsible for keeping source and target in same
>>>> @@ -361,10 +401,8 @@ int coroutine_fn block_copy(BlockCopyState *s,
>>>>        assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
>>>>        assert(QEMU_IS_ALIGNED(bytes, s->cluster_size));
>>>> -    block_copy_wait_inflight_reqs(s, offset, bytes);
>>>> -    block_copy_inflight_req_begin(s, &req, offset, bytes);
>>>> -
>>>>        while (bytes) {
>>>> +        BlockCopyInFlightReq req;
>>>>            int64_t next_zero, cur_bytes, status_bytes;
>>>>            if (!bdrv_dirty_bitmap_get(s->copy_bitmap, offset)) {
>>>> @@ -374,6 +412,8 @@ int coroutine_fn block_copy(BlockCopyState *s,
>>>>                continue; /* already copied */
>>>>            }
>>>> +        found_dirty = true;
>>>> +
>>>>            cur_bytes = MIN(bytes, s->copy_size);
>>>>            next_zero = bdrv_dirty_bitmap_next_zero(s->copy_bitmap, offset,
>>>> @@ -383,10 +423,12 @@ int coroutine_fn block_copy(BlockCopyState *s,
>>>>                assert(next_zero < offset + cur_bytes); /* no need to do MIN() */
>>>>                cur_bytes = next_zero - offset;
>>>>            }
>>>> +        block_copy_inflight_req_begin(s, &req, offset, cur_bytes);
>>>>            ret = block_copy_block_status(s, offset, cur_bytes, &status_bytes);
>>>> +        block_copy_inflight_req_shrink(s, &req, status_bytes);
>>>>            if (s->skip_unallocated && !(ret & BDRV_BLOCK_ALLOCATED)) {
>>>> -            bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, status_bytes);
>>>> +            block_copy_inflight_req_end(s, &req, 0);
>>>>                s->progress_reset_callback(s->progress_opaque);
>>>>                trace_block_copy_skip_range(s, offset, status_bytes);
>>>>                offset += status_bytes;
>>>> @@ -398,15 +440,13 @@ int coroutine_fn block_copy(BlockCopyState *s,
>>>>            trace_block_copy_process(s, offset);
>>>> -        bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, cur_bytes);
>>>> -
>>>>            co_get_from_shres(s->mem, cur_bytes);
>>>>            ret = block_copy_do_copy(s, offset, cur_bytes, ret & BDRV_BLOCK_ZERO,
>>>>                                     error_is_read);
>>>>            co_put_to_shres(s->mem, cur_bytes);
>>>> +        block_copy_inflight_req_end(s, &req, ret);
>>>>            if (ret < 0) {
>>>> -            bdrv_set_dirty_bitmap(s->copy_bitmap, offset, cur_bytes);
>>>> -            break;
>>>> +            return ret;
>>>>            }
>>>>            s->progress_bytes_callback(cur_bytes, s->progress_opaque);
>>>> @@ -414,7 +454,41 @@ int coroutine_fn block_copy(BlockCopyState *s,
>>>>            bytes -= cur_bytes;
>>>>        }
>>>> -    block_copy_inflight_req_end(&req);
>>>> +    return found_dirty;
>>>> +}
>>>> -    return ret;
>>>> +int coroutine_fn block_copy(BlockCopyState *s, int64_t start, uint64_t bytes,
>>>> +                            bool *error_is_read)
>>>> +{
>>>> +    while (true) {
>>>> +        int ret = block_copy_dirty_clusters(s, start, bytes, error_is_read);
>>>> +
>>>> +        if (ret < 0) {
>>>> +            /*
>>>> +             * IO operation failed, which means the whole block_copy request
>>>> +             * failed.
>>>> +             */
>>>> +            return ret;
>>>> +        }
>>>> +        if (ret) {
>>>> +            /*
>>>> +             * Something was copied, which means that there were yield points
>>>> +             * and some new dirty bits may appered (due to failed parallel
>>>> +             * block-copy requests).
>>>> +             */
>>>> +            continue;
>>>> +        }
>>>> +
>>>> +        /*
>>>> +         * Here ret == 0, which means that there is no dirty clusters in
>>>
>>> there is no dirty cluster in
>>>
>>>> +         * requested region.
>>>> +         */
>>>> +
>>>> +        if (!block_copy_wait_one(s, start, bytes)) {
>>>> +            /* No dirty bits and nothing to wait: the whole request is done */
>>>> +            break;
>>>> +        }
>>>> +    }
>>>> +
>>>> +    return 0;
>>>>    }
>>>>
>>>
>>
>>
> 


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 6/7] block/block-copy: reduce intersecting request lock
  2020-01-30 17:09         ` Vladimir Sementsov-Ogievskiy
@ 2020-01-30 18:00           ` Andrey Shinkevich
  0 siblings, 0 replies; 40+ messages in thread
From: Andrey Shinkevich @ 2020-01-30 18:00 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: kwolf, Denis Lunev, jsnow, qemu-devel, mreitz



On 30/01/2020 20:09, Vladimir Sementsov-Ogievskiy wrote:
> 30.01.2020 19:24, Andrey Shinkevich wrote:
>>
>>
>> On 30/01/2020 16:45, Vladimir Sementsov-Ogievskiy wrote:
>>> 29.01.2020 23:05, Andrey Shinkevich wrote:
>>>>
>>>>
>>>> On 27/11/2019 21:08, Vladimir Sementsov-Ogievskiy wrote:
>>>>> Currently, block_copy operation lock the whole requested region. But
>>>>> there is no reason to lock clusters, which are already copied, it will
>>>>> disturb other parallel block_copy requests for no reason.
>>>>>
>>>>> Let's instead do the following:
>>>>>
>>>>> Lock only sub-region, which we are going to operate on. Then, after
>>>>> copying all dirty sub-regions, we should wait for intersecting
>>>>> requests block-copy, if they failed, we should retry these new dirty
>>>>> clusters.
>>>>>
>>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>>>> ---
>>>>>    block/block-copy.c | 116 
>>>>> +++++++++++++++++++++++++++++++++++++--------
>>>>>    1 file changed, 95 insertions(+), 21 deletions(-)
>>>>>
>>>>> diff --git a/block/block-copy.c b/block/block-copy.c
>>>>> index 20068cd699..aca44b13fb 100644
>>>>> --- a/block/block-copy.c
>>>>> +++ b/block/block-copy.c
>>>>> @@ -39,29 +39,62 @@ static BlockCopyInFlightReq 
>>>>> *block_copy_find_inflight_req(BlockCopyState *s,
>>>>>        return NULL;
>>>>>    }
>>>>> -static void coroutine_fn 
>>>>> block_copy_wait_inflight_reqs(BlockCopyState *s,
>>>>> -                                                       int64_t 
>>>>> offset,
>>>>> -                                                       int64_t bytes)
>>>>> +/*
>>>>> + * If there are no intersecting requests return false. Otherwise, 
>>>>> wait for the
>>>>> + * first found intersecting request to finish and return true.
>>>>> + */
>>>>> +static bool coroutine_fn block_copy_wait_one(BlockCopyState *s, 
>>>>> int64_t start,
>>>>> +                                             int64_t end)
>>>>>    {
>>>>> -    BlockCopyInFlightReq *req;
>>>>> +    BlockCopyInFlightReq *req = block_copy_find_inflight_req(s, 
>>>>> start, end);
>>>>> -    while ((req = block_copy_find_inflight_req(s, offset, bytes))) {
>>>>> -        qemu_co_queue_wait(&req->wait_queue, NULL);
>>>>> +    if (!req) {
>>>>> +        return false;
>>>>>        }
>>>>> +
>>>>> +    qemu_co_queue_wait(&req->wait_queue, NULL);
>>>>> +
>>>>> +    return true;
>>>>>    }
>>>>> +/* Called only on full-dirty region */
>>>>>    static void block_copy_inflight_req_begin(BlockCopyState *s,
>>>>>                                              BlockCopyInFlightReq 
>>>>> *req,
>>>>>                                              int64_t offset, 
>>>>> int64_t bytes)
>>>>>    {
>>>>> +    assert(!block_copy_find_inflight_req(s, offset, bytes));
>>>>> +
>>>>> +    bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, bytes);
>>>>> +
>>>>>        req->offset = offset;
>>>>>        req->bytes = bytes;
>>>>>        qemu_co_queue_init(&req->wait_queue);
>>>>>        QLIST_INSERT_HEAD(&s->inflight_reqs, req, list);
>>>>>    }
>>>>> -static void coroutine_fn 
>>>>> block_copy_inflight_req_end(BlockCopyInFlightReq *req)
>>>>> +static void coroutine_fn 
>>>>> block_copy_inflight_req_shrink(BlockCopyState *s,
>>>>> +        BlockCopyInFlightReq *req, int64_t new_bytes)
>>>>>    {
>>>>> +    if (new_bytes == req->bytes) {
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>> +    assert(new_bytes > 0 && new_bytes < req->bytes);
>>>>> +
>>>>> +    bdrv_set_dirty_bitmap(s->copy_bitmap,
>>>>> +                          req->offset + new_bytes, req->bytes - 
>>>>> new_bytes);
>>>>> +
>>>>> +    req->bytes = new_bytes;
>>>>> +    qemu_co_queue_restart_all(&req->wait_queue);
>>>>
>>>> Won't we get the performance degradation with that function frequent 
>>>> call?
>>>
>>> Why do you think so? In IO most of performance depends on disk speed 
>>> and how
>>> we organize requests sequence. The whole original series shows 
>>> performance improvement.
>>>
>>> This patch reduces lock around request, locking only the part we are 
>>> working on now,
>>> this is for better interactivity. After calling block-status, the 
>>> request is shrinked
>>> to possibly unlock some other requests, waiting on the tail of our 
>>> request.. Do you
>>> have a better suggestion on this synchronization?
>>>
>>
>> I cannot answer right away. One need to measure the performance in 
>> each case.
> 
> Measurements are done in the original (big) series
>   "[RFC 00/24] backup performance: block_status + async"
>   <20191115141444.24155-1-vsementsov@virtuozzo.com>
>   https://lists.gnu.org/archive/html/qemu-devel/2019-11/msg02335.html
> 
> and this series (part I) is a first step to it.
> 
> Currently, improving interactivity of parallel block-copy requests is 
> not very
> meaningful, as all requests from backup are one-cluster-sized, and 
> improving
> performance of parallel intersecting guest writes is a strange thing.
> 
> But finally, backup is refactored to be the only one call of block_copy 
> for the
> whole disk, so this new locking of sub-requests becomes critical.
> 
> The only degradation I see, is that with this qemu_co_queue_restart_all, 
> we may
> restart requests, which still intersects with shrinked request, they 
> wake up,
> and go to the block_copy_wait_one again for nothing. This may be 
> improved by
> dropping coroutine-queue, and use own list of waiting requests, and wake-up
> only requests which do not intersect with shrinked request. But I think, 
> that it
> is premature optimization which complicates the code. So, it may be done 
> later
> if needed. But I think that it doesn't worth it.
> 
>>
>> Andrey
>>>>
>>>>> +}
>>>>> +
>>>>> +static void coroutine_fn 
>>>>> block_copy_inflight_req_end(BlockCopyState *s,
>>>>> + BlockCopyInFlightReq *req,
>>>>> +                                                     int ret)
>>>>> +{
>>>>> +    if (ret < 0) {
>>>>> +        bdrv_set_dirty_bitmap(s->copy_bitmap, req->offset, 
>>>>> req->bytes);
>>>>> +    }
>>>>>        QLIST_REMOVE(req, list);
>>>>>        qemu_co_queue_restart_all(&req->wait_queue);
>>>>>    }
>>>>> @@ -344,12 +377,19 @@ int64_t 
>>>>> block_copy_reset_unallocated(BlockCopyState *s,
>>>>>        return ret;
>>>>>    }
>>>>> -int coroutine_fn block_copy(BlockCopyState *s,
>>>>> -                            int64_t offset, uint64_t bytes,
>>>>> -                            bool *error_is_read)
>>>>> +/*
>>>>> + * block_copy_dirty_clusters
>>>>> + *
>>>>> + * Copy dirty clusters in @start/@bytes range.
>>>>
>>>> %s/start/offset/ ?
>>>>
>>>>> + * Returns 1 if dirty clusters found and successfully copied, 0 if 
>>>>> no dirty
>>>>> + * clusters found and -errno on failure.
>>>>> + */
>>>>> +static int coroutine_fn block_copy_dirty_clusters(BlockCopyState *s,
>>>>> +                                                  int64_t offset, 
>>>>> int64_t bytes,
>>>>> +                                                  bool 
>>>>> *error_is_read)
>>>>>    {
>>>>>        int ret = 0;
>>>>> -    BlockCopyInFlightReq req;
>>>>> +    bool found_dirty = false;
>>>>>        /*
>>>>>         * block_copy() user is responsible for keeping source and 
>>>>> target in same
>>>>> @@ -361,10 +401,8 @@ int coroutine_fn block_copy(BlockCopyState *s,
>>>>>        assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
>>>>>        assert(QEMU_IS_ALIGNED(bytes, s->cluster_size));
>>>>> -    block_copy_wait_inflight_reqs(s, offset, bytes);
>>>>> -    block_copy_inflight_req_begin(s, &req, offset, bytes);
>>>>> -
>>>>>        while (bytes) {
>>>>> +        BlockCopyInFlightReq req;
>>>>>            int64_t next_zero, cur_bytes, status_bytes;
>>>>>            if (!bdrv_dirty_bitmap_get(s->copy_bitmap, offset)) {
>>>>> @@ -374,6 +412,8 @@ int coroutine_fn block_copy(BlockCopyState *s,
>>>>>                continue; /* already copied */
>>>>>            }
>>>>> +        found_dirty = true;
>>>>> +
>>>>>            cur_bytes = MIN(bytes, s->copy_size);
>>>>>            next_zero = bdrv_dirty_bitmap_next_zero(s->copy_bitmap, 
>>>>> offset,
>>>>> @@ -383,10 +423,12 @@ int coroutine_fn block_copy(BlockCopyState *s,
>>>>>                assert(next_zero < offset + cur_bytes); /* no need 
>>>>> to do MIN() */
>>>>>                cur_bytes = next_zero - offset;
>>>>>            }
>>>>> +        block_copy_inflight_req_begin(s, &req, offset, cur_bytes);
>>>>>            ret = block_copy_block_status(s, offset, cur_bytes, 
>>>>> &status_bytes);
>>>>> +        block_copy_inflight_req_shrink(s, &req, status_bytes);
>>>>>            if (s->skip_unallocated && !(ret & BDRV_BLOCK_ALLOCATED)) {
>>>>> -            bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, 
>>>>> status_bytes);
>>>>> +            block_copy_inflight_req_end(s, &req, 0);
>>>>>                s->progress_reset_callback(s->progress_opaque);
>>>>>                trace_block_copy_skip_range(s, offset, status_bytes);
>>>>>                offset += status_bytes;
>>>>> @@ -398,15 +440,13 @@ int coroutine_fn block_copy(BlockCopyState *s,
>>>>>            trace_block_copy_process(s, offset);
>>>>> -        bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, cur_bytes);
>>>>> -
>>>>>            co_get_from_shres(s->mem, cur_bytes);
>>>>>            ret = block_copy_do_copy(s, offset, cur_bytes, ret & 
>>>>> BDRV_BLOCK_ZERO,
>>>>>                                     error_is_read);
>>>>>            co_put_to_shres(s->mem, cur_bytes);
>>>>> +        block_copy_inflight_req_end(s, &req, ret);
>>>>>            if (ret < 0) {
>>>>> -            bdrv_set_dirty_bitmap(s->copy_bitmap, offset, cur_bytes);
>>>>> -            break;
>>>>> +            return ret;
>>>>>            }
>>>>>            s->progress_bytes_callback(cur_bytes, s->progress_opaque);
>>>>> @@ -414,7 +454,41 @@ int coroutine_fn block_copy(BlockCopyState *s,
>>>>>            bytes -= cur_bytes;
>>>>>        }
>>>>> -    block_copy_inflight_req_end(&req);
>>>>> +    return found_dirty;
>>>>> +}
>>>>> -    return ret;
>>>>> +int coroutine_fn block_copy(BlockCopyState *s, int64_t start, 
>>>>> uint64_t bytes,
>>>>> +                            bool *error_is_read)
>>>>> +{
>>>>> +    while (true) {
>>>>> +        int ret = block_copy_dirty_clusters(s, start, bytes, 
>>>>> error_is_read);
>>>>> +
>>>>> +        if (ret < 0) {
>>>>> +            /*
>>>>> +             * IO operation failed, which means the whole 
>>>>> block_copy request
>>>>> +             * failed.
>>>>> +             */
>>>>> +            return ret;
>>>>> +        }
>>>>> +        if (ret) {
>>>>> +            /*
>>>>> +             * Something was copied, which means that there were 
>>>>> yield points
>>>>> +             * and some new dirty bits may appered (due to failed 
>>>>> parallel
>>>>> +             * block-copy requests).
>>>>> +             */
>>>>> +            continue;
>>>>> +        }
>>>>> +
>>>>> +        /*
>>>>> +         * Here ret == 0, which means that there is no dirty 
>>>>> clusters in
>>>>
>>>> there is no dirty cluster in
>>>>
>>>>> +         * requested region.
>>>>> +         */
>>>>> +
>>>>> +        if (!block_copy_wait_one(s, start, bytes)) {
>>>>> +            /* No dirty bits and nothing to wait: the whole 
>>>>> request is done */
>>>>> +            break;
>>>>> +        }
>>>>> +    }
>>>>> +
>>>>> +    return 0;
>>>>>    }
>>>>>
>>>>
>>>
>>>
>>
> 
> 

Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
-- 
With the best regards,
Andrey Shinkevich



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 7/7] block/block-copy: hide structure definitions
  2019-11-27 18:08 ` [PATCH v2 7/7] block/block-copy: hide structure definitions Vladimir Sementsov-Ogievskiy
@ 2020-01-30 18:58   ` Andrey Shinkevich
  2020-02-17 14:04   ` Max Reitz
  1 sibling, 0 replies; 40+ messages in thread
From: Andrey Shinkevich @ 2020-01-30 18:58 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: kwolf, den, jsnow, qemu-devel, mreitz



On 27/11/2019 21:08, Vladimir Sementsov-Ogievskiy wrote:
> Hide structure definitions and add explicit API instead, to keep an
> eye on the scope of the shared fields.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>   include/block/block-copy.h | 57 +++------------------------------
>   block/backup-top.c         |  6 ++--
>   block/backup.c             | 27 ++++++++--------
>   block/block-copy.c         | 64 ++++++++++++++++++++++++++++++++++++++
>   4 files changed, 86 insertions(+), 68 deletions(-)
> 
> diff --git a/include/block/block-copy.h b/include/block/block-copy.h
> index d96b097267..753fa663ac 100644
> --- a/include/block/block-copy.h
> +++ b/include/block/block-copy.h
> @@ -18,61 +18,9 @@
>   #include "block/block.h"
>   #include "qemu/co-shared-resource.h"
>   
> -typedef struct BlockCopyInFlightReq {
> -    int64_t offset;
> -    int64_t bytes;
> -    QLIST_ENTRY(BlockCopyInFlightReq) list;
> -    CoQueue wait_queue; /* coroutines blocked on this request */
> -} BlockCopyInFlightReq;
> -
>   typedef void (*ProgressBytesCallbackFunc)(int64_t bytes, void *opaque);
>   typedef void (*ProgressResetCallbackFunc)(void *opaque);
> -typedef struct BlockCopyState {
> -    /*
> -     * BdrvChild objects are not owned or managed by block-copy. They are
> -     * provided by block-copy user and user is responsible for appropriate
> -     * permissions on these children.
> -     */
> -    BdrvChild *source;
> -    BdrvChild *target;
> -    BdrvDirtyBitmap *copy_bitmap;
> -    int64_t cluster_size;
> -    bool use_copy_range;
> -    int64_t copy_size;
> -    uint64_t len;
> -    QLIST_HEAD(, BlockCopyInFlightReq) inflight_reqs;
> -
> -    BdrvRequestFlags write_flags;
> -
> -    /*
> -     * skip_unallocated:
> -     *
> -     * Used by sync=top jobs, which first scan the source node for unallocated
> -     * areas and clear them in the copy_bitmap.  During this process, the bitmap
> -     * is thus not fully initialized: It may still have bits set for areas that
> -     * are unallocated and should actually not be copied.
> -     *
> -     * This is indicated by skip_unallocated.
> -     *
> -     * In this case, block_copy() will query the source’s allocation status,
> -     * skip unallocated regions, clear them in the copy_bitmap, and invoke
> -     * block_copy_reset_unallocated() every time it does.
> -     */
> -    bool skip_unallocated;
> -
> -    /* progress_bytes_callback: called when some copying progress is done. */
> -    ProgressBytesCallbackFunc progress_bytes_callback;
> -
> -    /*
> -     * progress_reset_callback: called when some bytes reset from copy_bitmap
> -     * (see @skip_unallocated above). The callee is assumed to recalculate how
> -     * many bytes remain based on the dirty bit count of copy_bitmap.
> -     */
> -    ProgressResetCallbackFunc progress_reset_callback;
> -    void *progress_opaque;
> -
> -    SharedResource *mem;
> -} BlockCopyState;
> +typedef struct BlockCopyState BlockCopyState;
>   
>   BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
>                                        int64_t cluster_size,
> @@ -93,4 +41,7 @@ int64_t block_copy_reset_unallocated(BlockCopyState *s,
>   int coroutine_fn block_copy(BlockCopyState *s, int64_t start, uint64_t bytes,
>                               bool *error_is_read);
>   
> +BdrvDirtyBitmap *block_copy_dirty_bitmap(BlockCopyState *s);
> +void block_copy_set_skip_unallocated(BlockCopyState *s, bool skip);
> +
>   #endif /* BLOCK_COPY_H */
> diff --git a/block/backup-top.c b/block/backup-top.c
> index 7cdb1f8eba..1026628b57 100644
> --- a/block/backup-top.c
> +++ b/block/backup-top.c
> @@ -38,6 +38,7 @@ typedef struct BDRVBackupTopState {
>       BlockCopyState *bcs;
>       BdrvChild *target;
>       bool active;
> +    int64_t cluster_size;
>   } BDRVBackupTopState;
>   
>   static coroutine_fn int backup_top_co_preadv(
> @@ -51,8 +52,8 @@ static coroutine_fn int backup_top_cbw(BlockDriverState *bs, uint64_t offset,
>                                          uint64_t bytes)
>   {
>       BDRVBackupTopState *s = bs->opaque;
> -    uint64_t end = QEMU_ALIGN_UP(offset + bytes, s->bcs->cluster_size);
> -    uint64_t off = QEMU_ALIGN_DOWN(offset, s->bcs->cluster_size);
> +    uint64_t end = QEMU_ALIGN_UP(offset + bytes, s->cluster_size);
> +    uint64_t off = QEMU_ALIGN_DOWN(offset, s->cluster_size);
>   
>       return block_copy(s->bcs, off, end - off, NULL);
>   }
> @@ -227,6 +228,7 @@ BlockDriverState *bdrv_backup_top_append(BlockDriverState *source,
>           goto failed_after_append;
>       }
>   
> +    state->cluster_size = cluster_size;
>       state->bcs = block_copy_state_new(top->backing, state->target,
>                                         cluster_size, write_flags, &local_err);
>       if (local_err) {
> diff --git a/block/backup.c b/block/backup.c
> index cf62b1a38c..acab0d08da 100644
> --- a/block/backup.c
> +++ b/block/backup.c
> @@ -48,6 +48,7 @@ typedef struct BackupBlockJob {
>       int64_t cluster_size;
>   
>       BlockCopyState *bcs;
> +    BdrvDirtyBitmap *bcs_bitmap;
>   } BackupBlockJob;
>   
>   static const BlockJobDriver backup_job_driver;
> @@ -63,7 +64,7 @@ static void backup_progress_bytes_callback(int64_t bytes, void *opaque)
>   static void backup_progress_reset_callback(void *opaque)
>   {
>       BackupBlockJob *s = opaque;
> -    uint64_t estimate = bdrv_get_dirty_count(s->bcs->copy_bitmap);
> +    uint64_t estimate = bdrv_get_dirty_count(s->bcs_bitmap);
>   
>       job_progress_set_remaining(&s->common.job, estimate);
>   }
> @@ -111,8 +112,7 @@ static void backup_cleanup_sync_bitmap(BackupBlockJob *job, int ret)
>   
>       if (ret < 0 && job->bitmap_mode == BITMAP_SYNC_MODE_ALWAYS) {
>           /* If we failed and synced, merge in the bits we didn't copy: */
> -        bdrv_dirty_bitmap_merge_internal(bm, job->bcs->copy_bitmap,
> -                                         NULL, true);
> +        bdrv_dirty_bitmap_merge_internal(bm, job->bcs_bitmap, NULL, true);
>       }
>   }
>   
> @@ -151,7 +151,7 @@ void backup_do_checkpoint(BlockJob *job, Error **errp)
>           return;
>       }
>   
> -    bdrv_set_dirty_bitmap(backup_job->bcs->copy_bitmap, 0, backup_job->len);
> +    bdrv_set_dirty_bitmap(backup_job->bcs_bitmap, 0, backup_job->len);
>   }
>   
>   static BlockErrorAction backup_error_action(BackupBlockJob *job,
> @@ -196,7 +196,7 @@ static int coroutine_fn backup_loop(BackupBlockJob *job)
>       BdrvDirtyBitmapIter *bdbi;
>       int ret = 0;
>   
> -    bdbi = bdrv_dirty_iter_new(job->bcs->copy_bitmap);
> +    bdbi = bdrv_dirty_iter_new(job->bcs_bitmap);
>       while ((offset = bdrv_dirty_iter_next(bdbi)) != -1) {
>           do {
>               if (yield_and_check(job)) {
> @@ -216,13 +216,13 @@ static int coroutine_fn backup_loop(BackupBlockJob *job)
>       return ret;
>   }
>   
> -static void backup_init_copy_bitmap(BackupBlockJob *job)
> +static void backup_init_bcs_bitmap(BackupBlockJob *job)
>   {
>       bool ret;
>       uint64_t estimate;
>   
>       if (job->sync_mode == MIRROR_SYNC_MODE_BITMAP) {
> -        ret = bdrv_dirty_bitmap_merge_internal(job->bcs->copy_bitmap,
> +        ret = bdrv_dirty_bitmap_merge_internal(job->bcs_bitmap,
>                                                  job->sync_bitmap,
>                                                  NULL, true);
>           assert(ret);
> @@ -232,12 +232,12 @@ static void backup_init_copy_bitmap(BackupBlockJob *job)
>                * We can't hog the coroutine to initialize this thoroughly.
>                * Set a flag and resume work when we are able to yield safely.
>                */
> -            job->bcs->skip_unallocated = true;
> +            block_copy_set_skip_unallocated(job->bcs, true);
>           }
> -        bdrv_set_dirty_bitmap(job->bcs->copy_bitmap, 0, job->len);
> +        bdrv_set_dirty_bitmap(job->bcs_bitmap, 0, job->len);
>       }
>   
> -    estimate = bdrv_get_dirty_count(job->bcs->copy_bitmap);
> +    estimate = bdrv_get_dirty_count(job->bcs_bitmap);
>       job_progress_set_remaining(&job->common.job, estimate);
>   }
>   
> @@ -246,7 +246,7 @@ static int coroutine_fn backup_run(Job *job, Error **errp)
>       BackupBlockJob *s = container_of(job, BackupBlockJob, common.job);
>       int ret = 0;
>   
> -    backup_init_copy_bitmap(s);
> +    backup_init_bcs_bitmap(s);
>   
>       if (s->sync_mode == MIRROR_SYNC_MODE_TOP) {
>           int64_t offset = 0;
> @@ -265,12 +265,12 @@ static int coroutine_fn backup_run(Job *job, Error **errp)
>   
>               offset += count;
>           }
> -        s->bcs->skip_unallocated = false;
> +        block_copy_set_skip_unallocated(s->bcs, false);
>       }
>   
>       if (s->sync_mode == MIRROR_SYNC_MODE_NONE) {
>           /*
> -         * All bits are set in copy_bitmap to allow any cluster to be copied.
> +         * All bits are set in bcs_bitmap to allow any cluster to be copied.
>            * This does not actually require them to be copied.
>            */
>           while (!job_is_cancelled(job)) {
> @@ -458,6 +458,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
>       job->sync_bitmap = sync_bitmap;
>       job->bitmap_mode = bitmap_mode;
>       job->bcs = bcs;
> +    job->bcs_bitmap = block_copy_dirty_bitmap(bcs);
>       job->cluster_size = cluster_size;
>       job->len = len;
>   
> diff --git a/block/block-copy.c b/block/block-copy.c
> index aca44b13fb..7e14e86a2d 100644
> --- a/block/block-copy.c
> +++ b/block/block-copy.c
> @@ -24,6 +24,60 @@
>   #define BLOCK_COPY_MAX_BUFFER (1 * MiB)
>   #define BLOCK_COPY_MAX_MEM (128 * MiB)
>   
> +typedef struct BlockCopyInFlightReq {
> +    int64_t offset;
> +    int64_t bytes;
> +    QLIST_ENTRY(BlockCopyInFlightReq) list;
> +    CoQueue wait_queue; /* coroutines blocked on this request */
> +} BlockCopyInFlightReq;
> +
> +typedef struct BlockCopyState {
> +    /*
> +     * BdrvChild objects are not owned or managed by block-copy. They are
> +     * provided by block-copy user and user is responsible for appropriate
> +     * permissions on these children.
> +     */
> +    BdrvChild *source;
> +    BdrvChild *target;
> +    BdrvDirtyBitmap *copy_bitmap;
> +    int64_t cluster_size;
> +    bool use_copy_range;
> +    int64_t copy_size;
> +    uint64_t len;
> +    QLIST_HEAD(, BlockCopyInFlightReq) inflight_reqs;
> +
> +    BdrvRequestFlags write_flags;
> +
> +    /*
> +     * skip_unallocated:
> +     *
> +     * Used by sync=top jobs, which first scan the source node for unallocated
> +     * areas and clear them in the copy_bitmap.  During this process, the bitmap
> +     * is thus not fully initialized: It may still have bits set for areas that
> +     * are unallocated and should actually not be copied.
> +     *
> +     * This is indicated by skip_unallocated.
> +     *
> +     * In this case, block_copy() will query the source’s allocation status,
> +     * skip unallocated regions, clear them in the copy_bitmap, and invoke
> +     * block_copy_reset_unallocated() every time it does.
> +     */
> +    bool skip_unallocated;
> +
> +    /* progress_bytes_callback: called when some copying progress is done. */
> +    ProgressBytesCallbackFunc progress_bytes_callback;
> +
> +    /*
> +     * progress_reset_callback: called when some bytes reset from copy_bitmap
> +     * (see @skip_unallocated above). The callee is assumed to recalculate how
> +     * many bytes remain based on the dirty bit count of copy_bitmap.
> +     */
> +    ProgressResetCallbackFunc progress_reset_callback;
> +    void *progress_opaque;
> +
> +    SharedResource *mem;
> +} BlockCopyState;
> +
>   static BlockCopyInFlightReq *block_copy_find_inflight_req(BlockCopyState *s,
>                                                             int64_t offset,
>                                                             int64_t bytes)
> @@ -492,3 +546,13 @@ int coroutine_fn block_copy(BlockCopyState *s, int64_t start, uint64_t bytes,
>   
>       return 0;
>   }
> +
> +BdrvDirtyBitmap *block_copy_dirty_bitmap(BlockCopyState *s)
> +{
> +    return s->copy_bitmap;
> +}
> +
> +void block_copy_set_skip_unallocated(BlockCopyState *s, bool skip)
> +{
> +    s->skip_unallocated = skip;
> +}
> 

The idea of the API is OK, but we end up with duplicated members in the 
nested structures.

Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
-- 
With the best regards,
Andrey Shinkevich



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 4/7] block/block-copy: refactor interfaces to use bytes instead of end
  2020-01-29 17:12   ` Andrey Shinkevich
@ 2020-02-05 11:36     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 40+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-02-05 11:36 UTC (permalink / raw)
  To: Andrey Shinkevich, qemu-block
  Cc: kwolf, Denis Lunev, jsnow, qemu-devel, mreitz

29.01.2020 20:12, Andrey Shinkevich wrote:
> 
> 
> On 27/11/2019 21:08, Vladimir Sementsov-Ogievskiy wrote:
>> We have a lot of "chunk_end - start" invocations, let's switch to
>> bytes/cur_bytes scheme instead.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>    include/block/block-copy.h |  4 +--
>>    block/block-copy.c         | 68 ++++++++++++++++++++------------------
>>    2 files changed, 37 insertions(+), 35 deletions(-)
>>
>> diff --git a/include/block/block-copy.h b/include/block/block-copy.h
>> index 0a161724d7..7321b3d305 100644
>> --- a/include/block/block-copy.h
>> +++ b/include/block/block-copy.h
>> @@ -19,8 +19,8 @@
>>    #include "qemu/co-shared-resource.h"
>>    
>>    typedef struct BlockCopyInFlightReq {
>> -    int64_t start_byte;
>> -    int64_t end_byte;
>> +    int64_t start;
>> +    int64_t bytes;
>>        QLIST_ENTRY(BlockCopyInFlightReq) list;
>>        CoQueue wait_queue; /* coroutines blocked on this request */
>>    } BlockCopyInFlightReq;
>> diff --git a/block/block-copy.c b/block/block-copy.c
>> index 94e7e855ef..cc273b6cb8 100644
>> --- a/block/block-copy.c
>> +++ b/block/block-copy.c
>> @@ -26,12 +26,12 @@
>>    
>>    static BlockCopyInFlightReq *block_copy_find_inflight_req(BlockCopyState *s,
>>                                                              int64_t start,
>> -                                                          int64_t end)
>> +                                                          int64_t bytes)
>>    {
>>        BlockCopyInFlightReq *req;
>>    
>>        QLIST_FOREACH(req, &s->inflight_reqs, list) {
>> -        if (end > req->start_byte && start < req->end_byte) {
>> +        if (start + bytes > req->start && start < req->start + req->bytes) {
>>                return req;
>>            }
>>        }
>> @@ -41,21 +41,21 @@ static BlockCopyInFlightReq *block_copy_find_inflight_req(BlockCopyState *s,
>>    
>>    static void coroutine_fn block_copy_wait_inflight_reqs(BlockCopyState *s,
>>                                                           int64_t start,
>> -                                                       int64_t end)
>> +                                                       int64_t bytes)
>>    {
>>        BlockCopyInFlightReq *req;
>>    
>> -    while ((req = block_copy_find_inflight_req(s, start, end))) {
>> +    while ((req = block_copy_find_inflight_req(s, start, bytes))) {
>>            qemu_co_queue_wait(&req->wait_queue, NULL);
>>        }
>>    }
>>    
>>    static void block_copy_inflight_req_begin(BlockCopyState *s,
>>                                              BlockCopyInFlightReq *req,
>> -                                          int64_t start, int64_t end)
>> +                                          int64_t start, int64_t bytes)
>>    {
>> -    req->start_byte = start;
>> -    req->end_byte = end;
>> +    req->start = start;
>> +    req->bytes = bytes;
>>        qemu_co_queue_init(&req->wait_queue);
>>        QLIST_INSERT_HEAD(&s->inflight_reqs, req, list);
>>    }
>> @@ -150,24 +150,26 @@ void block_copy_set_callbacks(
>>    /*
>>     * block_copy_do_copy
>>     *
>> - * Do copy of cluser-aligned chunk. @end is allowed to exceed s->len only to
>> - * cover last cluster when s->len is not aligned to clusters.
>> + * Do copy of cluser-aligned chunk. Requested region is allowed to exceed s->len
> 
> cluster-...
> 
>> + * only to cover last cluster when s->len is not aligned to clusters.
>>     *
>>     * No sync here: nor bitmap neighter intersecting requests handling, only copy.
>>     *
>>     * Returns 0 on success.
>>     */
>>    static int coroutine_fn block_copy_do_copy(BlockCopyState *s,
>> -                                           int64_t start, int64_t end,
>> +                                           int64_t start, int64_t bytes,
>>                                               bool zeroes, bool *error_is_read)
>>    {
>>        int ret;
>> -    int nbytes = MIN(end, s->len) - start;
>> +    int nbytes = MIN(start + bytes, s->len) - start;
>>        void *bounce_buffer = NULL;
>>    
>> +    assert(start >= 0 && bytes > 0 && INT64_MAX - start >= bytes);
>>        assert(QEMU_IS_ALIGNED(start, s->cluster_size));
>> -    assert(QEMU_IS_ALIGNED(end, s->cluster_size));
>> -    assert(end < s->len || end == QEMU_ALIGN_UP(s->len, s->cluster_size));
>> +    assert(QEMU_IS_ALIGNED(bytes, s->cluster_size));
>> +    assert(start + bytes <= s->len ||
> 
> The '<=' looks correct but I wonder how only '<' worked without
> assertion failure before.

No difference: if end == s->len and end is aligned, then s->len is aligned too, so "end == QEMU_ALIGN_UP(s->len, s->cluster_size)" is true. But "<=" looks more natural for end and file len.
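(E.g. with a made-up cluster_size of 64K and s->len of 192K: end == 192K is aligned, QEMU_ALIGN_UP(192K, 64K) == 192K, so the old assertion held through its "==" half rather than through "<".)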

>Was the s->len never aligned to the cluster size?
> 
>> +           start + bytes == QEMU_ALIGN_UP(s->len, s->cluster_size));
>>    
>>        if (zeroes) {
>>            ret = bdrv_co_pwrite_zeroes(s->target, start, nbytes, s->write_flags &
>> @@ -347,7 +349,6 @@ int coroutine_fn block_copy(BlockCopyState *s,
>>                                bool *error_is_read)
>>    {
>>        int ret = 0;
>> -    int64_t end = bytes + start; /* bytes */
>>        BlockCopyInFlightReq req;
>>    
>>        /*
>> @@ -358,58 +359,59 @@ int coroutine_fn block_copy(BlockCopyState *s,
>>               bdrv_get_aio_context(s->target->bs));
>>    
>>        assert(QEMU_IS_ALIGNED(start, s->cluster_size));
>> -    assert(QEMU_IS_ALIGNED(end, s->cluster_size));
>> +    assert(QEMU_IS_ALIGNED(bytes, s->cluster_size));
>>    
>>        block_copy_wait_inflight_reqs(s, start, bytes);
>> -    block_copy_inflight_req_begin(s, &req, start, end);
>> +    block_copy_inflight_req_begin(s, &req, start, bytes);
>>    
>> -    while (start < end) {
>> -        int64_t next_zero, chunk_end, status_bytes;
>> +    while (bytes) {
>> +        int64_t next_zero, cur_bytes, status_bytes;
>>    
>>            if (!bdrv_dirty_bitmap_get(s->copy_bitmap, start)) {
>>                trace_block_copy_skip(s, start);
>>                start += s->cluster_size;
>> +            bytes -= s->cluster_size;
>>                continue; /* already copied */
>>            }
>>    
>> -        chunk_end = MIN(end, start + s->copy_size);
>> +        cur_bytes = MIN(bytes, s->copy_size);
>>    
>>            next_zero = bdrv_dirty_bitmap_next_zero(s->copy_bitmap, start,
>> -                                                chunk_end - start);
>> +                                                cur_bytes);
>>            if (next_zero >= 0) {
>>                assert(next_zero > start); /* start is dirty */
>> -            assert(next_zero < chunk_end); /* no need to do MIN() */
>> -            chunk_end = next_zero;
>> +            assert(next_zero < start + cur_bytes); /* no need to do MIN() */
>> +            cur_bytes = next_zero - start;
>>            }
>>    
>> -        ret = block_copy_block_status(s, start, chunk_end - start,
>> -                                      &status_bytes);
>> +        ret = block_copy_block_status(s, start, cur_bytes, &status_bytes);
>>            if (s->skip_unallocated && !(ret & BDRV_BLOCK_ALLOCATED)) {
>>                bdrv_reset_dirty_bitmap(s->copy_bitmap, start, status_bytes);
>>                s->progress_reset_callback(s->progress_opaque);
>>                trace_block_copy_skip_range(s, start, status_bytes);
>>                start += status_bytes;
>> +            bytes -= status_bytes;
>>                continue;
>>            }
>>    
>> -        chunk_end = MIN(chunk_end, start + status_bytes);
>> +        cur_bytes = MIN(cur_bytes, status_bytes);
>>    
>>            trace_block_copy_process(s, start);
>>    
>> -        bdrv_reset_dirty_bitmap(s->copy_bitmap, start, chunk_end - start);
>> +        bdrv_reset_dirty_bitmap(s->copy_bitmap, start, cur_bytes);
>>    
>> -        co_get_from_shres(s->mem, chunk_end - start);
>> -        ret = block_copy_do_copy(s, start, chunk_end, ret & BDRV_BLOCK_ZERO,
>> +        co_get_from_shres(s->mem, cur_bytes);
>> +        ret = block_copy_do_copy(s, start, cur_bytes, ret & BDRV_BLOCK_ZERO,
>>                                     error_is_read);
>> -        co_put_to_shres(s->mem, chunk_end - start);
>> +        co_put_to_shres(s->mem, cur_bytes);
>>            if (ret < 0) {
>> -            bdrv_set_dirty_bitmap(s->copy_bitmap, start, chunk_end - start);
>> +            bdrv_set_dirty_bitmap(s->copy_bitmap, start, cur_bytes);
>>                break;
>>            }
>>    
>> -        s->progress_bytes_callback(chunk_end - start, s->progress_opaque);
>> -        start = chunk_end;
>> -        ret = 0;
>> +        s->progress_bytes_callback(cur_bytes, s->progress_opaque);
>> +        start += cur_bytes;
>> +        bytes -= cur_bytes;
>>        }
>>    
>>        block_copy_inflight_req_end(&req);
>>
> 
> At first glance, the only benefit of that substitution in the 'while' loop
> is reducing the number of addition operations.
> 
> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
> 


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 1/7] block/block-copy: specialcase first copy_range request
  2019-11-27 18:08 ` [PATCH v2 1/7] block/block-copy: specialcase first copy_range request Vladimir Sementsov-Ogievskiy
  2020-01-29  7:38   ` Andrey Shinkevich
@ 2020-02-07 17:28   ` Max Reitz
  2020-02-08 12:32     ` Vladimir Sementsov-Ogievskiy
  1 sibling, 1 reply; 40+ messages in thread
From: Max Reitz @ 2020-02-07 17:28 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block; +Cc: kwolf, den, jsnow, qemu-devel


[-- Attachment #1.1: Type: text/plain, Size: 3084 bytes --]

On 27.11.19 19:08, Vladimir Sementsov-Ogievskiy wrote:
> In block_copy_do_copy we fallback to read+write if copy_range failed.
> In this case copy_size is larger than defined for buffered IO, and
> there is corresponding commit. Still, backup copies data cluster by
> cluster, and most of requests are limited to one cluster anyway, so the
> only source of this one bad-limited request is copy-before-write
> operation.
> 
> Further patch will move backup to use block_copy directly, than for
> cases where copy_range is not supported, first request will be
> oversized in each backup. It's not good, let's change it now.
> 
> Fix is simple: just limit first copy_range request like buffer-based
> request. If it succeed, set larger copy_range limit.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  block/block-copy.c | 41 ++++++++++++++++++++++++++++++-----------
>  1 file changed, 30 insertions(+), 11 deletions(-)
> 
> diff --git a/block/block-copy.c b/block/block-copy.c
> index 79798a1567..8602e2cae7 100644
> --- a/block/block-copy.c
> +++ b/block/block-copy.c

[...]

> @@ -168,7 +170,21 @@ static int coroutine_fn block_copy_do_copy(BlockCopyState *s,
>              s->use_copy_range = false;
>              s->copy_size = MAX(s->cluster_size, BLOCK_COPY_MAX_BUFFER);
>              /* Fallback to read+write with allocated buffer */
> -        } else {
> +        } else if (s->use_copy_range) {
> +            /*
> +             * Successful copy-range. Now increase copy_size.
> +             * copy_range does not respect max_transfer (it's a TODO), so we
> +             * factor that in here.
> +             *
> +             * Note: we double-check s->use_copy_range for the case when
> +             * parallel block-copy request unset it during previous
> +             * bdrv_co_copy_range call.

But we should still goto out anyway, shouldn’t we?

> +             */
> +            s->copy_size =
> +                    MIN(MAX(s->cluster_size, BLOCK_COPY_MAX_COPY_RANGE),
> +                        QEMU_ALIGN_DOWN(block_copy_max_transfer(s->source,
> +                                                                s->target),
> +                                        s->cluster_size));
>              goto out;
>          }
>      }
> @@ -176,7 +192,10 @@ static int coroutine_fn block_copy_do_copy(BlockCopyState *s,
>      /*
>       * In case of failed copy_range request above, we may proceed with buffered
>       * request larger than BLOCK_COPY_MAX_BUFFER. Still, further requests will
> -     * be properly limited, so don't care too much.
> +     * be properly limited, so don't care too much. Moreover the most possible

s/possible/likely/

Max

> +     * case (copy_range is unsupported for the configuration, so the very first
> +     * copy_range request fails) is handled by setting large copy_size only
> +     * after first successful copy_range.
>       */
>  
>      bounce_buffer = qemu_blockalign(s->source->bs, nbytes);
> 



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 2/7] block/block-copy: use block_status
  2019-11-27 18:08 ` [PATCH v2 2/7] block/block-copy: use block_status Vladimir Sementsov-Ogievskiy
  2020-01-29  9:12   ` Andrey Shinkevich
@ 2020-02-07 17:46   ` Max Reitz
  2020-02-08 12:25     ` Vladimir Sementsov-Ogievskiy
  1 sibling, 1 reply; 40+ messages in thread
From: Max Reitz @ 2020-02-07 17:46 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block; +Cc: kwolf, den, jsnow, qemu-devel


[-- Attachment #1.1: Type: text/plain, Size: 2188 bytes --]

On 27.11.19 19:08, Vladimir Sementsov-Ogievskiy wrote:
> Use bdrv_block_status_above to chose effective chunk size and to handle
> zeroes effectively.
> 
> This substitutes checking for just being allocated or not, and drops
> old code path for it. Assistance by backup job is dropped too, as
> caching block-status information is more difficult than just caching
> is-allocated information in our dirty bitmap, and backup job is not
> good place for this caching anyway.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  block/block-copy.c | 67 +++++++++++++++++++++++++++++++++++++---------
>  block/trace-events |  1 +
>  2 files changed, 55 insertions(+), 13 deletions(-)
> 
> diff --git a/block/block-copy.c b/block/block-copy.c
> index 8602e2cae7..74295d93d5 100644
> --- a/block/block-copy.c
> +++ b/block/block-copy.c

[...]

> @@ -336,23 +375,25 @@ int coroutine_fn block_copy(BlockCopyState *s,
>              chunk_end = next_zero;
>          }
>  
> -        if (s->skip_unallocated) {
> -            ret = block_copy_reset_unallocated(s, start, &status_bytes);
> -            if (ret == 0) {
> -                trace_block_copy_skip_range(s, start, status_bytes);
> -                start += status_bytes;
> -                continue;
> -            }
> -            /* Clamp to known allocated region */
> -            chunk_end = MIN(chunk_end, start + status_bytes);
> +        ret = block_copy_block_status(s, start, chunk_end - start,
> +                                      &status_bytes);
> +        if (s->skip_unallocated && !(ret & BDRV_BLOCK_ALLOCATED)) {
> +            bdrv_reset_dirty_bitmap(s->copy_bitmap, start, status_bytes);
> +            s->progress_reset_callback(s->progress_opaque);
> +            trace_block_copy_skip_range(s, start, status_bytes);
> +            start += status_bytes;
> +            continue;
>          }
>  
> +        chunk_end = MIN(chunk_end, start + status_bytes);

I’m not sure how much the old “Clamp to known allocated region” actually
helps, but I wouldn’t drop it anyway.

Apart from this nit, the patch looks good to me.

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 3/7] block/block-copy: factor out block_copy_find_inflight_req
  2019-11-27 18:08 ` [PATCH v2 3/7] block/block-copy: factor out block_copy_find_inflight_req Vladimir Sementsov-Ogievskiy
  2020-01-29  9:25   ` Andrey Shinkevich
@ 2020-02-07 17:50   ` Max Reitz
  1 sibling, 0 replies; 40+ messages in thread
From: Max Reitz @ 2020-02-07 17:50 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block; +Cc: kwolf, den, jsnow, qemu-devel


[-- Attachment #1.1: Type: text/plain, Size: 2328 bytes --]

On 27.11.19 19:08, Vladimir Sementsov-Ogievskiy wrote:
> Split block_copy_find_inflight_req to be used in seprate.

*separate, but separate what?

> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  block/block-copy.c | 31 +++++++++++++++++++------------
>  1 file changed, 19 insertions(+), 12 deletions(-)
> 
> diff --git a/block/block-copy.c b/block/block-copy.c
> index 74295d93d5..94e7e855ef 100644
> --- a/block/block-copy.c
> +++ b/block/block-copy.c
> @@ -24,23 +24,30 @@
>  #define BLOCK_COPY_MAX_BUFFER (1 * MiB)
>  #define BLOCK_COPY_MAX_MEM (128 * MiB)
>  
> +static BlockCopyInFlightReq *block_copy_find_inflight_req(BlockCopyState *s,

Hm.  Long already, but the name begs the question what in-flight
requests this is about.  Maybe drop the block_copy prefix (unless you
plan to make this function global) and call it “find_inflight_conflict”?
 (Or “find_conflicting_inflight_req” to be more verbose)

> +                                                          int64_t start,
> +                                                          int64_t end)
> +{
> +    BlockCopyInFlightReq *req;
> +
> +    QLIST_FOREACH(req, &s->inflight_reqs, list) {
> +        if (end > req->start_byte && start < req->end_byte) {
> +            return req;
> +        }
> +    }
> +
> +    return NULL;
> +}
> +
>  static void coroutine_fn block_copy_wait_inflight_reqs(BlockCopyState *s,
>                                                         int64_t start,
>                                                         int64_t end)
>  {
>      BlockCopyInFlightReq *req;
> -    bool waited;
> -
> -    do {
> -        waited = false;
> -        QLIST_FOREACH(req, &s->inflight_reqs, list) {
> -            if (end > req->start_byte && start < req->end_byte) {
> -                qemu_co_queue_wait(&req->wait_queue, NULL);
> -                waited = true;
> -                break;
> -            }
> -        }
> -    } while (waited);
> +
> +    while ((req = block_copy_find_inflight_req(s, start, end))) {
> +        qemu_co_queue_wait(&req->wait_queue, NULL);
> +    }
>  }

The change itself looks rather nice to me.  Even without other users of
this new function, it turns what’s happening into a more obvious pattern.

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 4/7] block/block-copy: refactor interfaces to use bytes instead of end
  2019-11-27 18:08 ` [PATCH v2 4/7] block/block-copy: refactor interfaces to use bytes instead of end Vladimir Sementsov-Ogievskiy
  2020-01-29 17:12   ` Andrey Shinkevich
@ 2020-02-07 18:01   ` Max Reitz
  2020-02-08 12:55     ` Vladimir Sementsov-Ogievskiy
  1 sibling, 1 reply; 40+ messages in thread
From: Max Reitz @ 2020-02-07 18:01 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block; +Cc: kwolf, den, jsnow, qemu-devel


[-- Attachment #1.1: Type: text/plain, Size: 1450 bytes --]

On 27.11.19 19:08, Vladimir Sementsov-Ogievskiy wrote:
> We have a lot of "chunk_end - start" invocations, let's switch to
> bytes/cur_bytes scheme instead.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  include/block/block-copy.h |  4 +--
>  block/block-copy.c         | 68 ++++++++++++++++++++------------------
>  2 files changed, 37 insertions(+), 35 deletions(-)

[...]

> diff --git a/block/block-copy.c b/block/block-copy.c
> index 94e7e855ef..cc273b6cb8 100644
> --- a/block/block-copy.c
> +++ b/block/block-copy.c

[...]

> @@ -150,24 +150,26 @@ void block_copy_set_callbacks(

[...]

>  static int coroutine_fn block_copy_do_copy(BlockCopyState *s,
> -                                           int64_t start, int64_t end,
> +                                           int64_t start, int64_t bytes,

I wonder whether it would make more sense to make some of these @bytes
parameters plain ints, because...

>                                             bool zeroes, bool *error_is_read)
>  {
>      int ret;
> -    int nbytes = MIN(end, s->len) - start;
> +    int nbytes = MIN(start + bytes, s->len) - start;

...things like this look a bit dangerous now.  So if the interface
already clearly shows that we’re always expecting something less than
INT_MAX, it might all be a bit clearer.

I’ll leave it up to you:

Reviewed-by: Max Reitz <mreitz@redhat.com>


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 5/7] block/block-copy: rename start to offset in interfaces
  2019-11-27 18:08 ` [PATCH v2 5/7] block/block-copy: rename start to offset in interfaces Vladimir Sementsov-Ogievskiy
  2020-01-29 17:37   ` Andrey Shinkevich
@ 2020-02-07 18:04   ` Max Reitz
  1 sibling, 0 replies; 40+ messages in thread
From: Max Reitz @ 2020-02-07 18:04 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block; +Cc: kwolf, den, jsnow, qemu-devel


[-- Attachment #1.1: Type: text/plain, Size: 430 bytes --]

On 27.11.19 19:08, Vladimir Sementsov-Ogievskiy wrote:
> offset/bytes pair is more usual naming in block layer, let's use it.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  include/block/block-copy.h |  2 +-
>  block/block-copy.c         | 80 +++++++++++++++++++-------------------
>  2 files changed, 41 insertions(+), 41 deletions(-)

Reviewed-by: Max Reitz <mreitz@redhat.com>


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 for-5.0 0/7] block-copy improvements: part I
  2020-01-20  9:09   ` Vladimir Sementsov-Ogievskiy
@ 2020-02-07 18:05     ` Max Reitz
  2020-02-08 10:28       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 40+ messages in thread
From: Max Reitz @ 2020-02-07 18:05 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: kwolf, jsnow, qemu-devel, Denis Lunev


[-- Attachment #1.1: Type: text/plain, Size: 304 bytes --]

On 20.01.20 10:09, Vladimir Sementsov-Ogievskiy wrote:
> ping

Sorry, I only got to patch 5 so far (without major complaints). I’ll
have to pack things up for the weekend now and I’ll be on PTO next week,
so I won’t get to review the final two patches before Feb 17, I’m afraid...

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 for-5.0 0/7] block-copy improvements: part I
  2020-02-07 18:05     ` Max Reitz
@ 2020-02-08 10:28       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 40+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-02-08 10:28 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: kwolf, jsnow, qemu-devel, Denis Lunev

07.02.2020 21:05, Max Reitz wrote:
> On 20.01.20 10:09, Vladimir Sementsov-Ogievskiy wrote:
>> ping
> 
> Sorry, I only got to patch 5 so far (without major complaints). I’ll
> have to pack things up for the weekend now and I’ll be on PTO next week,
> so I won’t get to review the final two patches before Feb 17, I’m afraid...
> 

Thanks for the start and have a nice rest!


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 2/7] block/block-copy: use block_status
  2020-02-07 17:46   ` Max Reitz
@ 2020-02-08 12:25     ` Vladimir Sementsov-Ogievskiy
  2020-02-17 11:48       ` Max Reitz
  0 siblings, 1 reply; 40+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-02-08 12:25 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: kwolf, den, jsnow, qemu-devel

07.02.2020 20:46, Max Reitz wrote:
> On 27.11.19 19:08, Vladimir Sementsov-Ogievskiy wrote:
>> Use bdrv_block_status_above to chose effective chunk size and to handle
>> zeroes effectively.
>>
>> This substitutes checking for just being allocated or not, and drops
>> old code path for it. Assistance by backup job is dropped too, as
>> caching block-status information is more difficult than just caching
>> is-allocated information in our dirty bitmap, and backup job is not
>> good place for this caching anyway.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   block/block-copy.c | 67 +++++++++++++++++++++++++++++++++++++---------
>>   block/trace-events |  1 +
>>   2 files changed, 55 insertions(+), 13 deletions(-)
>>
>> diff --git a/block/block-copy.c b/block/block-copy.c
>> index 8602e2cae7..74295d93d5 100644
>> --- a/block/block-copy.c
>> +++ b/block/block-copy.c
> 
> [...]
> 
>> @@ -336,23 +375,25 @@ int coroutine_fn block_copy(BlockCopyState *s,
>>               chunk_end = next_zero;
>>           }
>>   
>> -        if (s->skip_unallocated) {
>> -            ret = block_copy_reset_unallocated(s, start, &status_bytes);
>> -            if (ret == 0) {
>> -                trace_block_copy_skip_range(s, start, status_bytes);
>> -                start += status_bytes;
>> -                continue;
>> -            }
>> -            /* Clamp to known allocated region */
>> -            chunk_end = MIN(chunk_end, start + status_bytes);
>> +        ret = block_copy_block_status(s, start, chunk_end - start,
>> +                                      &status_bytes);
>> +        if (s->skip_unallocated && !(ret & BDRV_BLOCK_ALLOCATED)) {
>> +            bdrv_reset_dirty_bitmap(s->copy_bitmap, start, status_bytes);
>> +            s->progress_reset_callback(s->progress_opaque);
>> +            trace_block_copy_skip_range(s, start, status_bytes);
>> +            start += status_bytes;
>> +            continue;
>>           }
>>   
>> +        chunk_end = MIN(chunk_end, start + status_bytes);
> 
> I’m not sure how much the old “Clamp to known allocated region” actually
> helps, but I wouldn’t drop it anyway.

But it may not be allocated now. We just clamp to status_bytes.
It's "known allocated" only if s->skip_unallocated is true.

> 
> Apart from this nit, the patch looks good to me.
> 
> Max
> 


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 1/7] block/block-copy: specialcase first copy_range request
  2020-02-07 17:28   ` Max Reitz
@ 2020-02-08 12:32     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 40+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-02-08 12:32 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: kwolf, den, jsnow, qemu-devel

07.02.2020 20:28, Max Reitz wrote:
> On 27.11.19 19:08, Vladimir Sementsov-Ogievskiy wrote:
>> In block_copy_do_copy we fallback to read+write if copy_range failed.
>> In this case copy_size is larger than defined for buffered IO, and
>> there is corresponding commit. Still, backup copies data cluster by
>> cluster, and most of requests are limited to one cluster anyway, so the
>> only source of this one bad-limited request is copy-before-write
>> operation.
>>
>> Further patch will move backup to use block_copy directly, than for
>> cases where copy_range is not supported, first request will be
>> oversized in each backup. It's not good, let's change it now.
>>
>> Fix is simple: just limit first copy_range request like buffer-based
>> request. If it succeed, set larger copy_range limit.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   block/block-copy.c | 41 ++++++++++++++++++++++++++++++-----------
>>   1 file changed, 30 insertions(+), 11 deletions(-)
>>
>> diff --git a/block/block-copy.c b/block/block-copy.c
>> index 79798a1567..8602e2cae7 100644
>> --- a/block/block-copy.c
>> +++ b/block/block-copy.c
> 
> [...]
> 
>> @@ -168,7 +170,21 @@ static int coroutine_fn block_copy_do_copy(BlockCopyState *s,
>>               s->use_copy_range = false;
>>               s->copy_size = MAX(s->cluster_size, BLOCK_COPY_MAX_BUFFER);
>>               /* Fallback to read+write with allocated buffer */
>> -        } else {
>> +        } else if (s->use_copy_range) {
>> +            /*
>> +             * Successful copy-range. Now increase copy_size.
>> +             * copy_range does not respect max_transfer (it's a TODO), so we
>> +             * factor that in here.
>> +             *
>> +             * Note: we double-check s->use_copy_range for the case when
>> +             * parallel block-copy request unset it during previous
>> +             * bdrv_co_copy_range call.
> 
> But we should still goto out anyway, shouldn’t we?

Oops yes, will fix.
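Something like this, I think (just a sketch; the copy_size expression as in
the patch above):

        } else {
            if (s->use_copy_range) {
                /*
                 * Successful copy-range: increase copy_size, factoring in
                 * max_transfer, since copy_range does not respect it.
                 */
                s->copy_size =
                        MIN(MAX(s->cluster_size, BLOCK_COPY_MAX_COPY_RANGE),
                            QEMU_ALIGN_DOWN(block_copy_max_transfer(s->source,
                                                                    s->target),
                                            s->cluster_size));
            }
            /* Our copy_range succeeded, no need to fall back to read+write */
            goto out;
        }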

> 
>> +             */
>> +            s->copy_size =
>> +                    MIN(MAX(s->cluster_size, BLOCK_COPY_MAX_COPY_RANGE),
>> +                        QEMU_ALIGN_DOWN(block_copy_max_transfer(s->source,
>> +                                                                s->target),
>> +                                        s->cluster_size));
>>               goto out;
>>           }
>>       }
>> @@ -176,7 +192,10 @@ static int coroutine_fn block_copy_do_copy(BlockCopyState *s,
>>       /*
>>        * In case of failed copy_range request above, we may proceed with buffered
>>        * request larger than BLOCK_COPY_MAX_BUFFER. Still, further requests will
>> -     * be properly limited, so don't care too much.
>> +     * be properly limited, so don't care too much. Moreover the most possible
> 
> s/possible/likely/
> 
> Max
> 
>> +     * case (copy_range is unsupported for the configuration, so the very first
>> +     * copy_range request fails) is handled by setting large copy_size only
>> +     * after first successful copy_range.
>>        */
>>   
>>       bounce_buffer = qemu_blockalign(s->source->bs, nbytes);
>>
> 
> 


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 4/7] block/block-copy: refactor interfaces to use bytes instead of end
  2020-02-07 18:01   ` Max Reitz
@ 2020-02-08 12:55     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 40+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-02-08 12:55 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: kwolf, den, jsnow, qemu-devel

07.02.2020 21:01, Max Reitz wrote:
> On 27.11.19 19:08, Vladimir Sementsov-Ogievskiy wrote:
>> We have a lot of "chunk_end - start" invocations, let's switch to
>> bytes/cur_bytes scheme instead.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   include/block/block-copy.h |  4 +--
>>   block/block-copy.c         | 68 ++++++++++++++++++++------------------
>>   2 files changed, 37 insertions(+), 35 deletions(-)
> 
> [...]
> 
>> diff --git a/block/block-copy.c b/block/block-copy.c
>> index 94e7e855ef..cc273b6cb8 100644
>> --- a/block/block-copy.c
>> +++ b/block/block-copy.c
> 
> [...]
> 
>> @@ -150,24 +150,26 @@ void block_copy_set_callbacks(
> 
> [...]
> 
>>   static int coroutine_fn block_copy_do_copy(BlockCopyState *s,
>> -                                           int64_t start, int64_t end,
>> +                                           int64_t start, int64_t bytes,
> 
> I wonder whether it would make more sense to make some of these @bytes
> parameters plain ints, because...
> 
>>                                              bool zeroes, bool *error_is_read)
>>   {
>>       int ret;
>> -    int nbytes = MIN(end, s->len) - start;
>> +    int nbytes = MIN(start + bytes, s->len) - start;
> 
> ...things like this look a bit dangerous now.  So if the interface
> already clearly shows that we’re always expecting something less than
> INT_MAX, it might all be a bit clearer.

Hmm, yes. And it's pre-existing; it just becomes more obvious with the new semantics.
All block-copy tasks are limited to s->copy_size, so it's actually safe.
I'd better add an assertion, as I believe that 64-bit write_zeroes will appear in the not-so-distant future.
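Something along these lines (just a sketch; keeping nbytes wide and asserting
that it fits):

    int64_t nbytes = MIN(start + bytes, s->len) - start;

    /* block-copy tasks never exceed s->copy_size, so nbytes must fit in int */
    assert(nbytes < INT_MAX);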

> 
> I’ll leave it up to you:
> 
> Reviewed-by: Max Reitz <mreitz@redhat.com>
> 


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 2/7] block/block-copy: use block_status
  2020-02-08 12:25     ` Vladimir Sementsov-Ogievskiy
@ 2020-02-17 11:48       ` Max Reitz
  0 siblings, 0 replies; 40+ messages in thread
From: Max Reitz @ 2020-02-17 11:48 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block; +Cc: kwolf, den, jsnow, qemu-devel


[-- Attachment #1.1: Type: text/plain, Size: 2629 bytes --]

On 08.02.20 13:25, Vladimir Sementsov-Ogievskiy wrote:
> 07.02.2020 20:46, Max Reitz wrote:
>> On 27.11.19 19:08, Vladimir Sementsov-Ogievskiy wrote:
>>> Use bdrv_block_status_above to chose effective chunk size and to handle
>>> zeroes effectively.
>>>
>>> This substitutes checking for just being allocated or not, and drops
>>> old code path for it. Assistance by backup job is dropped too, as
>>> caching block-status information is more difficult than just caching
>>> is-allocated information in our dirty bitmap, and backup job is not
>>> good place for this caching anyway.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>> ---
>>>   block/block-copy.c | 67 +++++++++++++++++++++++++++++++++++++---------
>>>   block/trace-events |  1 +
>>>   2 files changed, 55 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/block/block-copy.c b/block/block-copy.c
>>> index 8602e2cae7..74295d93d5 100644
>>> --- a/block/block-copy.c
>>> +++ b/block/block-copy.c
>>
>> [...]
>>
>>> @@ -336,23 +375,25 @@ int coroutine_fn block_copy(BlockCopyState *s,
>>>               chunk_end = next_zero;
>>>           }
>>>   -        if (s->skip_unallocated) {
>>> -            ret = block_copy_reset_unallocated(s, start,
>>> &status_bytes);
>>> -            if (ret == 0) {
>>> -                trace_block_copy_skip_range(s, start, status_bytes);
>>> -                start += status_bytes;
>>> -                continue;
>>> -            }
>>> -            /* Clamp to known allocated region */
>>> -            chunk_end = MIN(chunk_end, start + status_bytes);
>>> +        ret = block_copy_block_status(s, start, chunk_end - start,
>>> +                                      &status_bytes);
>>> +        if (s->skip_unallocated && !(ret & BDRV_BLOCK_ALLOCATED)) {
>>> +            bdrv_reset_dirty_bitmap(s->copy_bitmap, start,
>>> status_bytes);
>>> +            s->progress_reset_callback(s->progress_opaque);
>>> +            trace_block_copy_skip_range(s, start, status_bytes);
>>> +            start += status_bytes;
>>> +            continue;
>>>           }
>>>   +        chunk_end = MIN(chunk_end, start + status_bytes);
>>
>> I’m not sure how much the old “Clamp to known allocated region” actually
>> helps, but I wouldn’t drop it anyway.
> 
> But it may not be allocated now. We just clamp to status_bytes.
> It's "known allocated" only if s->skip_unallocated is true.

Ah, yes, I suppose I was just thinking about that case when trying to
understand how the code changes.  So:

Reviewed-by: Max Reitz <mreitz@redhat.com>

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 6/7] block/block-copy: reduce intersecting request lock
  2019-11-27 18:08 ` [PATCH v2 6/7] block/block-copy: reduce intersecting request lock Vladimir Sementsov-Ogievskiy
  2020-01-29 20:05   ` Andrey Shinkevich
  2020-01-30 15:53   ` Andrey Shinkevich
@ 2020-02-17 13:38   ` Max Reitz
  2020-02-20  7:21     ` Vladimir Sementsov-Ogievskiy
  2 siblings, 1 reply; 40+ messages in thread
From: Max Reitz @ 2020-02-17 13:38 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block; +Cc: kwolf, den, jsnow, qemu-devel


[-- Attachment #1.1: Type: text/plain, Size: 10310 bytes --]

On 27.11.19 19:08, Vladimir Sementsov-Ogievskiy wrote:
> Currently, block_copy operation lock the whole requested region. But
> there is no reason to lock clusters, which are already copied, it will
> disturb other parallel block_copy requests for no reason.
> 
> Let's instead do the following:
> 
> Lock only sub-region, which we are going to operate on. Then, after
> copying all dirty sub-regions, we should wait for intersecting
> requests block-copy, if they failed, we should retry these new dirty
> clusters.

Just a thought spoken aloud:

I would expect the number of intersecting CBW requests to be low in
general, so I don’t know how useful this change is in practice.  OTOH,
it makes block_copy call the existing implementation in a loop, which
seems just worse.

But then again, in the common case, block_copy_dirty_clusters() won’t
copy anything because it’s all been copied already, so there is no
change; and even if something is copied, the second call will just
re-check the dirty bitmap to see that the area’s clean (which will be
quick compared to the copy operation).  So there’s probably nothing to
worry about.

> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  block/block-copy.c | 116 +++++++++++++++++++++++++++++++++++++--------
>  1 file changed, 95 insertions(+), 21 deletions(-)
> 
> diff --git a/block/block-copy.c b/block/block-copy.c
> index 20068cd699..aca44b13fb 100644
> --- a/block/block-copy.c
> +++ b/block/block-copy.c
> @@ -39,29 +39,62 @@ static BlockCopyInFlightReq *block_copy_find_inflight_req(BlockCopyState *s,
>      return NULL;
>  }
>  
> -static void coroutine_fn block_copy_wait_inflight_reqs(BlockCopyState *s,
> -                                                       int64_t offset,
> -                                                       int64_t bytes)
> +/*
> + * If there are no intersecting requests return false. Otherwise, wait for the
> + * first found intersecting request to finish and return true.
> + */
> +static bool coroutine_fn block_copy_wait_one(BlockCopyState *s, int64_t start,
> +                                             int64_t end)

s/end/bytes/?

(And maybe s/start/offset/, too)

>  {
> -    BlockCopyInFlightReq *req;
> +    BlockCopyInFlightReq *req = block_copy_find_inflight_req(s, start, end);
>  
> -    while ((req = block_copy_find_inflight_req(s, offset, bytes))) {
> -        qemu_co_queue_wait(&req->wait_queue, NULL);
> +    if (!req) {
> +        return false;
>      }
> +
> +    qemu_co_queue_wait(&req->wait_queue, NULL);
> +
> +    return true;
>  }
>  
> +/* Called only on full-dirty region */
>  static void block_copy_inflight_req_begin(BlockCopyState *s,
>                                            BlockCopyInFlightReq *req,
>                                            int64_t offset, int64_t bytes)
>  {
> +    assert(!block_copy_find_inflight_req(s, offset, bytes));
> +
> +    bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, bytes);
> +
>      req->offset = offset;
>      req->bytes = bytes;
>      qemu_co_queue_init(&req->wait_queue);
>      QLIST_INSERT_HEAD(&s->inflight_reqs, req, list);
>  }
>  
> -static void coroutine_fn block_copy_inflight_req_end(BlockCopyInFlightReq *req)
> +static void coroutine_fn block_copy_inflight_req_shrink(BlockCopyState *s,
> +        BlockCopyInFlightReq *req, int64_t new_bytes)

It took me a while to understand that this operation drops the tail
of the request.  I think there should be a comment on this.

(I thought it would successively drop the head after each copy, and so I
was wondering why the code didn’t match that.)

>  {
> +    if (new_bytes == req->bytes) {
> +        return;
> +    }
> +
> +    assert(new_bytes > 0 && new_bytes < req->bytes);
> +
> +    bdrv_set_dirty_bitmap(s->copy_bitmap,
> +                          req->offset + new_bytes, req->bytes - new_bytes);> +
> +    req->bytes = new_bytes;
> +    qemu_co_queue_restart_all(&req->wait_queue);
> +}
> +
> +static void coroutine_fn block_copy_inflight_req_end(BlockCopyState *s,
> +                                                     BlockCopyInFlightReq *req,
> +                                                     int ret)
> +{
> +    if (ret < 0) {
> +        bdrv_set_dirty_bitmap(s->copy_bitmap, req->offset, req->bytes);
> +    }
>      QLIST_REMOVE(req, list);
>      qemu_co_queue_restart_all(&req->wait_queue);
>  }
> @@ -344,12 +377,19 @@ int64_t block_copy_reset_unallocated(BlockCopyState *s,
>      return ret;
>  }
>  
> -int coroutine_fn block_copy(BlockCopyState *s,
> -                            int64_t offset, uint64_t bytes,
> -                            bool *error_is_read)
> +/*
> + * block_copy_dirty_clusters
> + *
> + * Copy dirty clusters in @start/@bytes range.
> + * Returns 1 if dirty clusters found and successfully copied, 0 if no dirty
> + * clusters found and -errno on failure.
> + */
> +static int coroutine_fn block_copy_dirty_clusters(BlockCopyState *s,
> +                                                  int64_t offset, int64_t bytes,
> +                                                  bool *error_is_read)
>  {
>      int ret = 0;
> -    BlockCopyInFlightReq req;
> +    bool found_dirty = false;
>  
>      /*
>       * block_copy() user is responsible for keeping source and target in same
> @@ -361,10 +401,8 @@ int coroutine_fn block_copy(BlockCopyState *s,
>      assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
>      assert(QEMU_IS_ALIGNED(bytes, s->cluster_size));
>  
> -    block_copy_wait_inflight_reqs(s, offset, bytes);
> -    block_copy_inflight_req_begin(s, &req, offset, bytes);
> -
>      while (bytes) {
> +        BlockCopyInFlightReq req;
>          int64_t next_zero, cur_bytes, status_bytes;
>  
>          if (!bdrv_dirty_bitmap_get(s->copy_bitmap, offset)) {
> @@ -374,6 +412,8 @@ int coroutine_fn block_copy(BlockCopyState *s,
>              continue; /* already copied */
>          }
>  
> +        found_dirty = true;
> +
>          cur_bytes = MIN(bytes, s->copy_size);
>  
>          next_zero = bdrv_dirty_bitmap_next_zero(s->copy_bitmap, offset,
> @@ -383,10 +423,12 @@ int coroutine_fn block_copy(BlockCopyState *s,
>              assert(next_zero < offset + cur_bytes); /* no need to do MIN() */
>              cur_bytes = next_zero - offset;
>          }
> +        block_copy_inflight_req_begin(s, &req, offset, cur_bytes);
>  
>          ret = block_copy_block_status(s, offset, cur_bytes, &status_bytes);
> +        block_copy_inflight_req_shrink(s, &req, status_bytes);

block_copy_inflight_req_shrink() asserts that status_bytes <= cur_bytes.
 That isn’t necessarily correct, as block_copy_block_status() rounds up
on the last cluster.  So this should use the same MIN() as for the
cur_bytes update after the next block.

Would it make sense to move the block_copy_inflight_req_shrink() there
and pass the updated cur_bytes to it?
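
I.e. something like this, maybe (just a sketch using the names from this
patch):

        ret = block_copy_block_status(s, offset, cur_bytes, &status_bytes);

        cur_bytes = MIN(cur_bytes, status_bytes);
        /* shrink() re-dirties the dropped tail, so keep it before the skip branch */
        block_copy_inflight_req_shrink(s, &req, cur_bytes);

        /* ...skip_unallocated branch as below, unchanged... */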

>          if (s->skip_unallocated && !(ret & BDRV_BLOCK_ALLOCATED)) {
> -            bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, status_bytes);
> +            block_copy_inflight_req_end(s, &req, 0);
>              s->progress_reset_callback(s->progress_opaque);
>              trace_block_copy_skip_range(s, offset, status_bytes);
>              offset += status_bytes;
> @@ -398,15 +440,13 @@ int coroutine_fn block_copy(BlockCopyState *s,
>  
>          trace_block_copy_process(s, offset);
>  
> -        bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, cur_bytes);
> -
>          co_get_from_shres(s->mem, cur_bytes);
>          ret = block_copy_do_copy(s, offset, cur_bytes, ret & BDRV_BLOCK_ZERO,
>                                   error_is_read);
>          co_put_to_shres(s->mem, cur_bytes);
> +        block_copy_inflight_req_end(s, &req, ret);
>          if (ret < 0) {
> -            bdrv_set_dirty_bitmap(s->copy_bitmap, offset, cur_bytes);
> -            break;
> +            return ret;
>          }
>  
>          s->progress_bytes_callback(cur_bytes, s->progress_opaque);
> @@ -414,7 +454,41 @@ int coroutine_fn block_copy(BlockCopyState *s,
>          bytes -= cur_bytes;
>      }
>  
> -    block_copy_inflight_req_end(&req);
> +    return found_dirty;
> +}
>  
> -    return ret;
> +int coroutine_fn block_copy(BlockCopyState *s, int64_t start, uint64_t bytes,
> +                            bool *error_is_read)
> +{
> +    while (true) {
> +        int ret = block_copy_dirty_clusters(s, start, bytes, error_is_read);
> +
> +        if (ret < 0) {
> +            /*
> +             * IO operation failed, which means the whole block_copy request
> +             * failed.
> +             */
> +            return ret;
> +        }
> +        if (ret) {
> +            /*
> +             * Something was copied, which means that there were yield points
> +             * and some new dirty bits may appered (due to failed parallel

s/appered/have appeared/

> +             * block-copy requests).
> +             */
> +            continue;
> +        }
> +
> +        /*
> +         * Here ret == 0, which means that there is no dirty clusters in
> +         * requested region.
> +         */
> +
> +        if (!block_copy_wait_one(s, start, bytes)) {
> +            /* No dirty bits and nothing to wait: the whole request is done */

Wouldn’t it make more sense to keep block_copy_wait_one() a loop (i.e.,
keep it as block_copy_wait_inflight_reqs()) that returns whether it
waited or not?  Because I suppose if we had to wait for anything, we
might as well wait for everything in the range.
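
I.e. roughly (sketch):

static bool coroutine_fn block_copy_wait_inflight_reqs(BlockCopyState *s,
                                                       int64_t offset,
                                                       int64_t bytes)
{
    BlockCopyInFlightReq *req;
    bool waited = false;

    while ((req = block_copy_find_inflight_req(s, offset, bytes))) {
        qemu_co_queue_wait(&req->wait_queue, NULL);
        waited = true;
    }

    return waited;
}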

> +            break;
> +        }
> +    }

Continuing my loud thought from the beginning, I would have written this
as a tail-recursive function to stress that this isn’t really a
(potentially expensive) loop but more of a re-check to be sure.

(i.e.

int ret = block_copy_dirty...();
if (ret < 0) {
    return ret;
}

if (ret || block_copy_wait_one()) {
    /* Something might have changed, re-check */
    return block_copy();
}

/* Done */
return 0;
)

But who cares.

Max

> +
> +    return 0;
>  }
> 



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 7/7] block/block-copy: hide structure definitions
  2019-11-27 18:08 ` [PATCH v2 7/7] block/block-copy: hide structure definitions Vladimir Sementsov-Ogievskiy
  2020-01-30 18:58   ` Andrey Shinkevich
@ 2020-02-17 14:04   ` Max Reitz
  2020-02-20  7:26     ` Vladimir Sementsov-Ogievskiy
  1 sibling, 1 reply; 40+ messages in thread
From: Max Reitz @ 2020-02-17 14:04 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block; +Cc: kwolf, den, jsnow, qemu-devel


[-- Attachment #1.1: Type: text/plain, Size: 1415 bytes --]

On 27.11.19 19:08, Vladimir Sementsov-Ogievskiy wrote:
> Hide structure definitions and add explicit API instead, to keep an
> eye on the scope of the shared fields.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  include/block/block-copy.h | 57 +++------------------------------
>  block/backup-top.c         |  6 ++--
>  block/backup.c             | 27 ++++++++--------
>  block/block-copy.c         | 64 ++++++++++++++++++++++++++++++++++++++
>  4 files changed, 86 insertions(+), 68 deletions(-)

[...]

> diff --git a/block/backup.c b/block/backup.c
> index cf62b1a38c..acab0d08da 100644
> --- a/block/backup.c
> +++ b/block/backup.c

[...]

> @@ -458,6 +458,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
>      job->sync_bitmap = sync_bitmap;
>      job->bitmap_mode = bitmap_mode;
>      job->bcs = bcs;
> +    job->bcs_bitmap = block_copy_dirty_bitmap(bcs);

It seems a bit weird to me to store a pointer to the BCS-owned bitmap
here, because, well, it’s a BCS-owned object, and just calling
block_copy_dirty_bitmap() every time wouldn’t be prohibitively expensive.
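
(E.g. backup_progress_reset_callback() could just do something like
"uint64_t estimate = bdrv_get_dirty_count(block_copy_dirty_bitmap(s->bcs));"
instead of caching the pointer in the job.)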

I feel sufficiently bad about this to warrant not giving an R-b, but I
know I shouldn’t withhold an R-b over this, so:

Reviewed-by: Max Reitz <mreitz@redhat.com>

>      job->cluster_size = cluster_size;
>      job->len = len;


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 6/7] block/block-copy: reduce intersecting request lock
  2020-02-17 13:38   ` Max Reitz
@ 2020-02-20  7:21     ` Vladimir Sementsov-Ogievskiy
  2020-02-20  9:10       ` Max Reitz
  0 siblings, 1 reply; 40+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-02-20  7:21 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: kwolf, den, jsnow, qemu-devel

17.02.2020 16:38, Max Reitz wrote:
> On 27.11.19 19:08, Vladimir Sementsov-Ogievskiy wrote:
>> Currently, block_copy operation lock the whole requested region. But
>> there is no reason to lock clusters, which are already copied, it will
>> disturb other parallel block_copy requests for no reason.
>>
>> Let's instead do the following:
>>
>> Lock only sub-region, which we are going to operate on. Then, after
>> copying all dirty sub-regions, we should wait for intersecting
>> requests block-copy, if they failed, we should retry these new dirty
>> clusters.
> 
> Just a thought spoken aloud:
> 
> I would expect the number of intersecting CBW requests to be low in
> general, so I don’t know how useful this change is in practice.  OTOH,
> it makes block_copy call the existing implementation in a loop, which
> seems just worse.
> 
> But then again, in the common case, block_copy_dirty_clusters() won’t
> copy anything because it’s all been copied already, so there is no
> change; and even if something is copied, the second call will just
> re-check the dirty bitmap to see that the area’s clean (which will be
> quick compared to the copy operation).  So there’s probably nothing to
> worry about.
> 
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   block/block-copy.c | 116 +++++++++++++++++++++++++++++++++++++--------
>>   1 file changed, 95 insertions(+), 21 deletions(-)
>>
>> diff --git a/block/block-copy.c b/block/block-copy.c
>> index 20068cd699..aca44b13fb 100644
>> --- a/block/block-copy.c
>> +++ b/block/block-copy.c
>> @@ -39,29 +39,62 @@ static BlockCopyInFlightReq *block_copy_find_inflight_req(BlockCopyState *s,
>>       return NULL;
>>   }
>>   
>> -static void coroutine_fn block_copy_wait_inflight_reqs(BlockCopyState *s,
>> -                                                       int64_t offset,
>> -                                                       int64_t bytes)
>> +/*
>> + * If there are no intersecting requests return false. Otherwise, wait for the
>> + * first found intersecting request to finish and return true.
>> + */
>> +static bool coroutine_fn block_copy_wait_one(BlockCopyState *s, int64_t start,
>> +                                             int64_t end)
> 
> s/end/bytes/?
> 
> (And maybe s/start/offset/, too)
> 
>>   {
>> -    BlockCopyInFlightReq *req;
>> +    BlockCopyInFlightReq *req = block_copy_find_inflight_req(s, start, end);
>>   
>> -    while ((req = block_copy_find_inflight_req(s, offset, bytes))) {
>> -        qemu_co_queue_wait(&req->wait_queue, NULL);
>> +    if (!req) {
>> +        return false;
>>       }
>> +
>> +    qemu_co_queue_wait(&req->wait_queue, NULL);
>> +
>> +    return true;
>>   }
>>   
>> +/* Called only on full-dirty region */
>>   static void block_copy_inflight_req_begin(BlockCopyState *s,
>>                                             BlockCopyInFlightReq *req,
>>                                             int64_t offset, int64_t bytes)
>>   {
>> +    assert(!block_copy_find_inflight_req(s, offset, bytes));
>> +
>> +    bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, bytes);
>> +
>>       req->offset = offset;
>>       req->bytes = bytes;
>>       qemu_co_queue_init(&req->wait_queue);
>>       QLIST_INSERT_HEAD(&s->inflight_reqs, req, list);
>>   }
>>   
>> -static void coroutine_fn block_copy_inflight_req_end(BlockCopyInFlightReq *req)
>> +static void coroutine_fn block_copy_inflight_req_shrink(BlockCopyState *s,
>> +        BlockCopyInFlightReq *req, int64_t new_bytes)
> 
> It took me a while to understand that this operation drops the tail
> of the request.  I think there should be a comment on this.
> 
> (I thought it would successively drop the head after each copy, and so I
> was wondering why the code didn’t match that.)
> 
>>   {
>> +    if (new_bytes == req->bytes) {
>> +        return;
>> +    }
>> +
>> +    assert(new_bytes > 0 && new_bytes < req->bytes);
>> +
>> +    bdrv_set_dirty_bitmap(s->copy_bitmap,
>> +                          req->offset + new_bytes, req->bytes - new_bytes);
>> +
>> +    req->bytes = new_bytes;
>> +    qemu_co_queue_restart_all(&req->wait_queue);
>> +}
>> +
>> +static void coroutine_fn block_copy_inflight_req_end(BlockCopyState *s,
>> +                                                     BlockCopyInFlightReq *req,
>> +                                                     int ret)
>> +{
>> +    if (ret < 0) {
>> +        bdrv_set_dirty_bitmap(s->copy_bitmap, req->offset, req->bytes);
>> +    }
>>       QLIST_REMOVE(req, list);
>>       qemu_co_queue_restart_all(&req->wait_queue);
>>   }
>> @@ -344,12 +377,19 @@ int64_t block_copy_reset_unallocated(BlockCopyState *s,
>>       return ret;
>>   }
>>   
>> -int coroutine_fn block_copy(BlockCopyState *s,
>> -                            int64_t offset, uint64_t bytes,
>> -                            bool *error_is_read)
>> +/*
>> + * block_copy_dirty_clusters
>> + *
>> + * Copy dirty clusters in @start/@bytes range.
>> + * Returns 1 if dirty clusters found and successfully copied, 0 if no dirty
>> + * clusters found and -errno on failure.
>> + */
>> +static int coroutine_fn block_copy_dirty_clusters(BlockCopyState *s,
>> +                                                  int64_t offset, int64_t bytes,
>> +                                                  bool *error_is_read)
>>   {
>>       int ret = 0;
>> -    BlockCopyInFlightReq req;
>> +    bool found_dirty = false;
>>   
>>       /*
>>        * block_copy() user is responsible for keeping source and target in same
>> @@ -361,10 +401,8 @@ int coroutine_fn block_copy(BlockCopyState *s,
>>       assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
>>       assert(QEMU_IS_ALIGNED(bytes, s->cluster_size));
>>   
>> -    block_copy_wait_inflight_reqs(s, offset, bytes);
>> -    block_copy_inflight_req_begin(s, &req, offset, bytes);
>> -
>>       while (bytes) {
>> +        BlockCopyInFlightReq req;
>>           int64_t next_zero, cur_bytes, status_bytes;
>>   
>>           if (!bdrv_dirty_bitmap_get(s->copy_bitmap, offset)) {
>> @@ -374,6 +412,8 @@ int coroutine_fn block_copy(BlockCopyState *s,
>>               continue; /* already copied */
>>           }
>>   
>> +        found_dirty = true;
>> +
>>           cur_bytes = MIN(bytes, s->copy_size);
>>   
>>           next_zero = bdrv_dirty_bitmap_next_zero(s->copy_bitmap, offset,
>> @@ -383,10 +423,12 @@ int coroutine_fn block_copy(BlockCopyState *s,
>>               assert(next_zero < offset + cur_bytes); /* no need to do MIN() */
>>               cur_bytes = next_zero - offset;
>>           }
>> +        block_copy_inflight_req_begin(s, &req, offset, cur_bytes);
>>   
>>           ret = block_copy_block_status(s, offset, cur_bytes, &status_bytes);
>> +        block_copy_inflight_req_shrink(s, &req, status_bytes);
> 
> block_copy_inflight_req_shrink() asserts that status_bytes <= cur_bytes.
>   That isn’t necessarily correct, as block_copy_block_status() rounds up
> on the last cluster.  So this should use the same MIN() as for the
> cur_bytes update after the next block.
> 
> Would it make sense to move the block_copy_inflight_req_shrink() there
> and pass the updated cur_bytes to it?
> 
>>           if (s->skip_unallocated && !(ret & BDRV_BLOCK_ALLOCATED)) {
>> -            bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, status_bytes);
>> +            block_copy_inflight_req_end(s, &req, 0);
>>               s->progress_reset_callback(s->progress_opaque);
>>               trace_block_copy_skip_range(s, offset, status_bytes);
>>               offset += status_bytes;
>> @@ -398,15 +440,13 @@ int coroutine_fn block_copy(BlockCopyState *s,
>>   
>>           trace_block_copy_process(s, offset);
>>   
>> -        bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, cur_bytes);
>> -
>>           co_get_from_shres(s->mem, cur_bytes);
>>           ret = block_copy_do_copy(s, offset, cur_bytes, ret & BDRV_BLOCK_ZERO,
>>                                    error_is_read);
>>           co_put_to_shres(s->mem, cur_bytes);
>> +        block_copy_inflight_req_end(s, &req, ret);
>>           if (ret < 0) {
>> -            bdrv_set_dirty_bitmap(s->copy_bitmap, offset, cur_bytes);
>> -            break;
>> +            return ret;
>>           }
>>   
>>           s->progress_bytes_callback(cur_bytes, s->progress_opaque);
>> @@ -414,7 +454,41 @@ int coroutine_fn block_copy(BlockCopyState *s,
>>           bytes -= cur_bytes;
>>       }
>>   
>> -    block_copy_inflight_req_end(&req);
>> +    return found_dirty;
>> +}
>>   
>> -    return ret;
>> +int coroutine_fn block_copy(BlockCopyState *s, int64_t start, uint64_t bytes,
>> +                            bool *error_is_read)
>> +{
>> +    while (true) {
>> +        int ret = block_copy_dirty_clusters(s, start, bytes, error_is_read);
>> +
>> +        if (ret < 0) {
>> +            /*
>> +             * IO operation failed, which means the whole block_copy request
>> +             * failed.
>> +             */
>> +            return ret;
>> +        }
>> +        if (ret) {
>> +            /*
>> +             * Something was copied, which means that there were yield points
>> +             * and some new dirty bits may appered (due to failed parallel
> 
> s/appered/have appeared/
> 
>> +             * block-copy requests).
>> +             */
>> +            continue;
>> +        }
>> +
>> +        /*
>> +         * Here ret == 0, which means that there is no dirty clusters in
>> +         * requested region.
>> +         */
>> +
>> +        if (!block_copy_wait_one(s, start, bytes)) {
>> +            /* No dirty bits and nothing to wait: the whole request is done */
> 
> Wouldn’t it make more sense to keep block_copy_wait_one() a loop (i.e.,
> keep it as block_copy_wait_inflight_reqs()) that returns whether it
> waited or not?  Because I suppose if we had to wait for anything, we
> might as well wait for everything in the range.

No, we don't need to wait for all of them. If some dirty bits have appeared, we may start
copying them immediately, without waiting for the other requests.
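To illustrate, both snippets paraphrased (not verbatim) from the diff above:

    /* old: sleep until no intersecting request is left at all */
    while ((req = block_copy_find_inflight_req(s, offset, bytes))) {
        qemu_co_queue_wait(&req->wait_queue, NULL);
    }

    /* new: wait for at most one request, then return to the caller, which
     * re-scans the dirty bitmap and immediately copies anything that was
     * re-dirtied by a failed request */
    if ((req = block_copy_find_inflight_req(s, offset, bytes))) {
        qemu_co_queue_wait(&req->wait_queue, NULL);
        return true;   /* we waited, caller re-checks the bitmap */
    }
    return false;      /* nothing to wait for, the request is done */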

> 
>> +            break;
>> +        }
>> +    }
> 
> Continuing my loud thought from the beginning, I would have written this
> as a tail-recursive function to stress that this isn’t really a
> (potentially expensive) loop but more of a re-check to be sure.
> 
> (i.e.
> 
> int ret = block_copy_dirty...();
> if (ret < 0) {
>      return ret;
> }
> 
> if (ret || block_copy_wait_one()) {
>      /* Something might have changed, re-check */
>      return block_copy();
> }
> 
> /* Done */
> return 0;
> )
> 
> But who cares.
> 

Hm, I'd prefer to avoid recursion. But combining the two if statements seems good anyway.
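Roughly, something like this (just a sketch, not tested):

    int coroutine_fn block_copy(BlockCopyState *s, int64_t start, uint64_t bytes,
                                bool *error_is_read)
    {
        while (true) {
            int ret = block_copy_dirty_clusters(s, start, bytes, error_is_read);

            if (ret < 0) {
                return ret;
            }

            /*
             * ret > 0: we copied something and may have yielded, so new dirty
             * bits may have appeared; ret == 0 with a successful wait: an
             * intersecting request finished and may have re-dirtied bits.
             * In both cases re-check, otherwise the request is done.
             */
            if (!ret && !block_copy_wait_one(s, start, bytes)) {
                return 0;
            }
        }
    }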


With this patch I have two aims:

1. Maybe you are right and we don't have too many intersecting requests. Actually, the only
case is the intersection of a guest write with a block-job block_copy call, as two job
requests cannot intersect and intersecting guest writes are not a normal thing. But anyway,
it should work better for intersecting requests, a bit more interactive.

2. As block-status handling and aio-tasking should be done here in block-copy, there is no
reason to keep a loop in backup.c. So at the end of the original series only one
block_copy(0, disk_size) call is left in backup.c (roughly as sketched below), and that
definitely needs this patch. So it's an architectural patch, which makes block_copy a more
universal thing.
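Very roughly, and with the details still to be worked out in that series, the backup loop
would collapse to something like (names approximate, just to show the shape):

    static int coroutine_fn backup_loop(BackupBlockJob *job)
    {
        bool error_is_read;

        /*
         * block_copy() now handles block-status querying, intersecting
         * requests and retries internally, so a single call over the
         * whole device is enough.
         */
        return block_copy(job->bcs, 0, job->len, &error_is_read);
    }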

-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 7/7] block/block-copy: hide structure definitions
  2020-02-17 14:04   ` Max Reitz
@ 2020-02-20  7:26     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 40+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-02-20  7:26 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: kwolf, den, jsnow, qemu-devel

17.02.2020 17:04, Max Reitz wrote:
> On 27.11.19 19:08, Vladimir Sementsov-Ogievskiy wrote:
>> Hide structure definitions and add explicit API instead, to keep an
>> eye on the scope of the shared fields.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   include/block/block-copy.h | 57 +++------------------------------
>>   block/backup-top.c         |  6 ++--
>>   block/backup.c             | 27 ++++++++--------
>>   block/block-copy.c         | 64 ++++++++++++++++++++++++++++++++++++++
>>   4 files changed, 86 insertions(+), 68 deletions(-)
> 
> [...]
> 
>> diff --git a/block/backup.c b/block/backup.c
>> index cf62b1a38c..acab0d08da 100644
>> --- a/block/backup.c
>> +++ b/block/backup.c
> 
> [...]
> 
>> @@ -458,6 +458,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
>>       job->sync_bitmap = sync_bitmap;
>>       job->bitmap_mode = bitmap_mode;
>>       job->bcs = bcs;
>> +    job->bcs_bitmap = block_copy_dirty_bitmap(bcs);
> 
> It seems a bit weird to me to store a pointer to the BCS-owned bitmap
> here, because, well, it’s a BCS-owned object, and just calling
> block_copy_dirty_bitmap() every time wouldn’t be prohibitively expensive.
> 
> I feel sufficiently bad about this to warrant not giving an R-b, but I
> know I shouldn’t withhold an R-b over this, so:
> 
> Reviewed-by: Max Reitz <mreitz@redhat.com>

Hmm, actually, I tend to agree with you. Why did I write it that way? I'll take a look and
maybe change it to calling block_copy_dirty_bitmap() every time.

Thanks for reviewing!

> 
>>       job->cluster_size = cluster_size;
>>       job->len = len;
> 


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 6/7] block/block-copy: reduce intersecting request lock
  2020-02-20  7:21     ` Vladimir Sementsov-Ogievskiy
@ 2020-02-20  9:10       ` Max Reitz
  0 siblings, 0 replies; 40+ messages in thread
From: Max Reitz @ 2020-02-20  9:10 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block; +Cc: kwolf, den, jsnow, qemu-devel


On 20.02.20 08:21, Vladimir Sementsov-Ogievskiy wrote:
> 17.02.2020 16:38, Max Reitz wrote:
>> On 27.11.19 19:08, Vladimir Sementsov-Ogievskiy wrote:

[...]

>>> +        if (!block_copy_wait_one(s, start, bytes)) {
>>> +            /* No dirty bits and nothing to wait: the whole request
>>> is done */
>>
>> Wouldn’t it make more sense to keep block_copy_wait_one() a loop (i.e.,
>> keep it as block_copy_wait_inflight_reqs()) that returns whether it
>> waited or not?  Because I suppose if we had to wait for anything, we
>> might as well wait for everything in the range.
> 
> No, we don't need to wait all. If some dirty bits appeared we may start
> copy them
> immediately, not waiting for others.

OK, especially considering you’re aiming for block_copy(0, disk_size)
eventually.

Max



^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2020-02-20  9:11 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-27 18:08 [PATCH v2 for-5.0 0/7] block-copy improvements: part I Vladimir Sementsov-Ogievskiy
2019-11-27 18:08 ` [PATCH v2 1/7] block/block-copy: specialcase first copy_range request Vladimir Sementsov-Ogievskiy
2020-01-29  7:38   ` Andrey Shinkevich
2020-02-07 17:28   ` Max Reitz
2020-02-08 12:32     ` Vladimir Sementsov-Ogievskiy
2019-11-27 18:08 ` [PATCH v2 2/7] block/block-copy: use block_status Vladimir Sementsov-Ogievskiy
2020-01-29  9:12   ` Andrey Shinkevich
2020-02-07 17:46   ` Max Reitz
2020-02-08 12:25     ` Vladimir Sementsov-Ogievskiy
2020-02-17 11:48       ` Max Reitz
2019-11-27 18:08 ` [PATCH v2 3/7] block/block-copy: factor out block_copy_find_inflight_req Vladimir Sementsov-Ogievskiy
2020-01-29  9:25   ` Andrey Shinkevich
2020-02-07 17:50   ` Max Reitz
2019-11-27 18:08 ` [PATCH v2 4/7] block/block-copy: refactor interfaces to use bytes instead of end Vladimir Sementsov-Ogievskiy
2020-01-29 17:12   ` Andrey Shinkevich
2020-02-05 11:36     ` Vladimir Sementsov-Ogievskiy
2020-02-07 18:01   ` Max Reitz
2020-02-08 12:55     ` Vladimir Sementsov-Ogievskiy
2019-11-27 18:08 ` [PATCH v2 5/7] block/block-copy: rename start to offset in interfaces Vladimir Sementsov-Ogievskiy
2020-01-29 17:37   ` Andrey Shinkevich
2020-02-07 18:04   ` Max Reitz
2019-11-27 18:08 ` [PATCH v2 6/7] block/block-copy: reduce intersecting request lock Vladimir Sementsov-Ogievskiy
2020-01-29 20:05   ` Andrey Shinkevich
2020-01-30 13:45     ` Vladimir Sementsov-Ogievskiy
2020-01-30 16:24       ` Andrey Shinkevich
2020-01-30 17:09         ` Vladimir Sementsov-Ogievskiy
2020-01-30 18:00           ` Andrey Shinkevich
2020-01-30 15:53   ` Andrey Shinkevich
2020-01-30 16:05     ` Vladimir Sementsov-Ogievskiy
2020-02-17 13:38   ` Max Reitz
2020-02-20  7:21     ` Vladimir Sementsov-Ogievskiy
2020-02-20  9:10       ` Max Reitz
2019-11-27 18:08 ` [PATCH v2 7/7] block/block-copy: hide structure definitions Vladimir Sementsov-Ogievskiy
2020-01-30 18:58   ` Andrey Shinkevich
2020-02-17 14:04   ` Max Reitz
2020-02-20  7:26     ` Vladimir Sementsov-Ogievskiy
2019-12-19  9:01 ` [PATCH v2 for-5.0 0/7] block-copy improvements: part I Vladimir Sementsov-Ogievskiy
2020-01-20  9:09   ` Vladimir Sementsov-Ogievskiy
2020-02-07 18:05     ` Max Reitz
2020-02-08 10:28       ` Vladimir Sementsov-Ogievskiy
