* [PATCH v3 0/9] block-copy improvements: part I
@ 2020-03-06  7:38 Vladimir Sementsov-Ogievskiy
  2020-03-06  7:38 ` [PATCH v3 1/9] job: refactor progress to separate object Vladimir Sementsov-Ogievskiy
                   ` (8 more replies)
  0 siblings, 9 replies; 25+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-03-06  7:38 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, vsementsov, qemu-devel, mreitz, andrey.shinkevich, jsnow

v3:
01: new
03: fix block_copy_do_copy
04: add comment, rebase on 01
05: s/block_copy_find_inflight_req/find_conflicting_inflight_req/
06: add overflow check
    use int64_t for block_copy bytes parameter
    fix preexisting typo in modified comment
07: update forgotten block_copy prototype
    keep Max's r-b
08: changed a lot
    - check MIN after block_copy_block_status
    - refactor loop in block_copy()
09: drop BackupBlockJob.bcs_bitmap

Vladimir Sementsov-Ogievskiy (9):
  job: refactor progress to separate object
  block/block-copy: fix progress calculation
  block/block-copy: specialcase first copy_range request
  block/block-copy: use block_status
  block/block-copy: factor out find_conflicting_inflight_req
  block/block-copy: refactor interfaces to use bytes instead of end
  block/block-copy: rename start to offset in interfaces
  block/block-copy: reduce intersecting request lock
  block/block-copy: hide structure definitions

 include/block/block-copy.h    |  64 +-----
 include/qemu/job.h            |  11 +-
 include/qemu/progress_meter.h |  58 +++++
 block/backup-top.c            |   6 +-
 block/backup.c                |  38 ++--
 block/block-copy.c            | 404 ++++++++++++++++++++++++++--------
 blockjob.c                    |  16 +-
 job-qmp.c                     |   4 +-
 job.c                         |   6 +-
 qemu-img.c                    |   6 +-
 block/trace-events            |   1 +
 11 files changed, 419 insertions(+), 195 deletions(-)
 create mode 100644 include/qemu/progress_meter.h

-- 
2.21.0




* [PATCH v3 1/9] job: refactor progress to separate object
  2020-03-06  7:38 [PATCH v3 0/9] block-copy improvements: part I Vladimir Sementsov-Ogievskiy
@ 2020-03-06  7:38 ` Vladimir Sementsov-Ogievskiy
  2020-03-10 12:22   ` Andrey Shinkevich
  2020-03-10 12:40   ` Max Reitz
  2020-03-06  7:38 ` [PATCH v3 2/9] block/block-copy: fix progress calculation Vladimir Sementsov-Ogievskiy
                   ` (7 subsequent siblings)
  8 siblings, 2 replies; 25+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-03-06  7:38 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, vsementsov, qemu-stable, qemu-devel, mreitz,
	andrey.shinkevich, jsnow

We need the progress state as a separate object so that we can pass it
to the block-copy object in the next commit.

Cc: qemu-stable@nongnu.org
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
---
 include/qemu/job.h            | 11 ++-----
 include/qemu/progress_meter.h | 58 +++++++++++++++++++++++++++++++++++
 blockjob.c                    | 16 +++++-----
 job-qmp.c                     |  4 +--
 job.c                         |  6 ++--
 qemu-img.c                    |  6 ++--
 6 files changed, 76 insertions(+), 25 deletions(-)
 create mode 100644 include/qemu/progress_meter.h

diff --git a/include/qemu/job.h b/include/qemu/job.h
index bd59cd8944..32aabb1c60 100644
--- a/include/qemu/job.h
+++ b/include/qemu/job.h
@@ -28,6 +28,7 @@
 
 #include "qapi/qapi-types-job.h"
 #include "qemu/queue.h"
+#include "qemu/progress_meter.h"
 #include "qemu/coroutine.h"
 #include "block/aio.h"
 
@@ -117,15 +118,7 @@ typedef struct Job {
     /** True if this job should automatically dismiss itself */
     bool auto_dismiss;
 
-    /**
-     * Current progress. The unit is arbitrary as long as the ratio between
-     * progress_current and progress_total represents the estimated percentage
-     * of work already done.
-     */
-    int64_t progress_current;
-
-    /** Estimated progress_current value at the completion of the job */
-    int64_t progress_total;
+    ProgressMeter progress;
 
     /**
      * Return code from @run and/or @prepare callback(s).
diff --git a/include/qemu/progress_meter.h b/include/qemu/progress_meter.h
new file mode 100644
index 0000000000..9a23ff071c
--- /dev/null
+++ b/include/qemu/progress_meter.h
@@ -0,0 +1,58 @@
+/*
+ * Helper functionality for some process progress tracking.
+ *
+ * Copyright (c) 2011 IBM Corp.
+ * Copyright (c) 2012, 2018 Red Hat, Inc.
+ * Copyright (c) 2020 Virtuozzo International GmbH
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#ifndef QEMU_PROGRESS_METER_H
+#define QEMU_PROGRESS_METER_H
+
+typedef struct ProgressMeter {
+    /**
+     * Current progress. The unit is arbitrary as long as the ratio between
+     * current and total represents the estimated percentage
+     * of work already done.
+     */
+    uint64_t current;
+
+    /** Estimated current value at the completion of the process */
+    uint64_t total;
+} ProgressMeter;
+
+static inline void progress_work_done(ProgressMeter *pm, uint64_t done)
+{
+    pm->current += done;
+}
+
+static inline void progress_set_remaining(ProgressMeter *pm, uint64_t remaining)
+{
+    pm->total = pm->current + remaining;
+}
+
+static inline void progress_increase_remaining(ProgressMeter *pm,
+                                               uint64_t delta)
+{
+    pm->total += delta;
+}
+
+#endif /* QEMU_PROGRESS_METER_H */
diff --git a/blockjob.c b/blockjob.c
index 5d63b1e89d..fc850312c1 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -299,8 +299,8 @@ BlockJobInfo *block_job_query(BlockJob *job, Error **errp)
     info->device    = g_strdup(job->job.id);
     info->busy      = atomic_read(&job->job.busy);
     info->paused    = job->job.pause_count > 0;
-    info->offset    = job->job.progress_current;
-    info->len       = job->job.progress_total;
+    info->offset    = job->job.progress.current;
+    info->len       = job->job.progress.total;
     info->speed     = job->speed;
     info->io_status = job->iostatus;
     info->ready     = job_is_ready(&job->job),
@@ -330,8 +330,8 @@ static void block_job_event_cancelled(Notifier *n, void *opaque)
 
     qapi_event_send_block_job_cancelled(job_type(&job->job),
                                         job->job.id,
-                                        job->job.progress_total,
-                                        job->job.progress_current,
+                                        job->job.progress.total,
+                                        job->job.progress.current,
                                         job->speed);
 }
 
@@ -350,8 +350,8 @@ static void block_job_event_completed(Notifier *n, void *opaque)
 
     qapi_event_send_block_job_completed(job_type(&job->job),
                                         job->job.id,
-                                        job->job.progress_total,
-                                        job->job.progress_current,
+                                        job->job.progress.total,
+                                        job->job.progress.current,
                                         job->speed,
                                         !!msg,
                                         msg);
@@ -379,8 +379,8 @@ static void block_job_event_ready(Notifier *n, void *opaque)
 
     qapi_event_send_block_job_ready(job_type(&job->job),
                                     job->job.id,
-                                    job->job.progress_total,
-                                    job->job.progress_current,
+                                    job->job.progress.total,
+                                    job->job.progress.current,
                                     job->speed);
 }
 
diff --git a/job-qmp.c b/job-qmp.c
index fbfed25a00..fecc939ebd 100644
--- a/job-qmp.c
+++ b/job-qmp.c
@@ -143,8 +143,8 @@ static JobInfo *job_query_single(Job *job, Error **errp)
         .id                 = g_strdup(job->id),
         .type               = job_type(job),
         .status             = job->status,
-        .current_progress   = job->progress_current,
-        .total_progress     = job->progress_total,
+        .current_progress   = job->progress.current,
+        .total_progress     = job->progress.total,
         .has_error          = !!job->err,
         .error              = job->err ? \
                               g_strdup(error_get_pretty(job->err)) : NULL,
diff --git a/job.c b/job.c
index 04409b40aa..134a07b92e 100644
--- a/job.c
+++ b/job.c
@@ -369,17 +369,17 @@ void job_unref(Job *job)
 
 void job_progress_update(Job *job, uint64_t done)
 {
-    job->progress_current += done;
+    progress_work_done(&job->progress, done);
 }
 
 void job_progress_set_remaining(Job *job, uint64_t remaining)
 {
-    job->progress_total = job->progress_current + remaining;
+    progress_set_remaining(&job->progress, remaining);
 }
 
 void job_progress_increase_remaining(Job *job, uint64_t delta)
 {
-    job->progress_total += delta;
+    progress_increase_remaining(&job->progress, delta);
 }
 
 void job_event_cancelled(Job *job)
diff --git a/qemu-img.c b/qemu-img.c
index 804630a368..59a720abf6 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -882,9 +882,9 @@ static void run_block_job(BlockJob *job, Error **errp)
     do {
         float progress = 0.0f;
         aio_poll(aio_context, true);
-        if (job->job.progress_total) {
-            progress = (float)job->job.progress_current /
-                       job->job.progress_total * 100.f;
+        if (job->job.progress.total) {
+            progress = (float)job->job.progress.current /
+                       job->job.progress.total * 100.f;
         }
         qemu_progress_print(progress, 0);
     } while (!job_is_ready(&job->job) && !job_is_completed(&job->job));
-- 
2.21.0




* [PATCH v3 2/9] block/block-copy: fix progress calculation
  2020-03-06  7:38 [PATCH v3 0/9] block-copy improvements: part I Vladimir Sementsov-Ogievskiy
  2020-03-06  7:38 ` [PATCH v3 1/9] job: refactor progress to separate object Vladimir Sementsov-Ogievskiy
@ 2020-03-06  7:38 ` Vladimir Sementsov-Ogievskiy
  2020-03-10 13:32   ` Max Reitz
  2020-03-06  7:38 ` [PATCH v3 3/9] block/block-copy: specialcase first copy_range request Vladimir Sementsov-Ogievskiy
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 25+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-03-06  7:38 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, vsementsov, qemu-stable, qemu-devel, mreitz,
	andrey.shinkevich, jsnow

Assume we have two regions, A and B. Region B is in flight, and region A
has not been touched yet, but it is unallocated and should be skipped.

Correspondingly, as progress we have

  total = A + B
  current = 0

If we reset the unallocated region A and call progress_reset_callback,
it will find 0 dirty bytes in the bitmap and call
job_progress_set_remaining, which will set

   total = current + 0 = 0 + 0 = 0

So B bytes are effectively removed from the total accounting. When the
job finishes we'll have

   total = 0
   current = B

which doesn't sound good.

This happens because we didn't account for in-flight bytes: when
calculating the remaining work, we should have set
(in_flight + dirty_bytes) as remaining, not only dirty_bytes.

To fix it, refactor the progress calculation, moving it into block-copy
itself instead of patching the callback. And, of course, track the
in-flight byte count.

We still have to keep one callback, to maintain the backup job's
bytes_read calculation, but it will go away soon, when we turn the whole
backup process into one block_copy call.

Cc: qemu-stable@nongnu.org
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
---
 include/block/block-copy.h | 14 +++++---------
 block/backup.c             | 13 ++-----------
 block/block-copy.c         | 16 ++++++++++++----
 3 files changed, 19 insertions(+), 24 deletions(-)

diff --git a/include/block/block-copy.h b/include/block/block-copy.h
index 0a161724d7..dc68c8c54e 100644
--- a/include/block/block-copy.h
+++ b/include/block/block-copy.h
@@ -36,6 +36,7 @@ typedef struct BlockCopyState {
     BdrvChild *source;
     BdrvChild *target;
     BdrvDirtyBitmap *copy_bitmap;
+    int64_t in_flight_bytes;
     int64_t cluster_size;
     bool use_copy_range;
     int64_t copy_size;
@@ -60,15 +61,9 @@ typedef struct BlockCopyState {
      */
     bool skip_unallocated;
 
+    ProgressMeter *progress;
     /* progress_bytes_callback: called when some copying progress is done. */
     ProgressBytesCallbackFunc progress_bytes_callback;
-
-    /*
-     * progress_reset_callback: called when some bytes reset from copy_bitmap
-     * (see @skip_unallocated above). The callee is assumed to recalculate how
-     * many bytes remain based on the dirty bit count of copy_bitmap.
-     */
-    ProgressResetCallbackFunc progress_reset_callback;
     void *progress_opaque;
 
     SharedResource *mem;
@@ -79,12 +74,13 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
                                      BdrvRequestFlags write_flags,
                                      Error **errp);
 
-void block_copy_set_callbacks(
+void block_copy_set_progress_callback(
         BlockCopyState *s,
         ProgressBytesCallbackFunc progress_bytes_callback,
-        ProgressResetCallbackFunc progress_reset_callback,
         void *progress_opaque);
 
+void block_copy_set_progress_meter(BlockCopyState *s, ProgressMeter *pm);
+
 void block_copy_state_free(BlockCopyState *s);
 
 int64_t block_copy_reset_unallocated(BlockCopyState *s,
diff --git a/block/backup.c b/block/backup.c
index 1383e219f5..8694e0394b 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -57,15 +57,6 @@ static void backup_progress_bytes_callback(int64_t bytes, void *opaque)
     BackupBlockJob *s = opaque;
 
     s->bytes_read += bytes;
-    job_progress_update(&s->common.job, bytes);
-}
-
-static void backup_progress_reset_callback(void *opaque)
-{
-    BackupBlockJob *s = opaque;
-    uint64_t estimate = bdrv_get_dirty_count(s->bcs->copy_bitmap);
-
-    job_progress_set_remaining(&s->common.job, estimate);
 }
 
 static int coroutine_fn backup_do_cow(BackupBlockJob *job,
@@ -464,8 +455,8 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
     job->cluster_size = cluster_size;
     job->len = len;
 
-    block_copy_set_callbacks(bcs, backup_progress_bytes_callback,
-                             backup_progress_reset_callback, job);
+    block_copy_set_progress_callback(bcs, backup_progress_bytes_callback, job);
+    block_copy_set_progress_meter(bcs, &job->common.job.progress);
 
     /* Required permissions are already taken by backup-top target */
     block_job_add_bdrv(&job->common, "target", target, 0, BLK_PERM_ALL,
diff --git a/block/block-copy.c b/block/block-copy.c
index 79798a1567..e2d7b3b887 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -127,17 +127,20 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
     return s;
 }
 
-void block_copy_set_callbacks(
+void block_copy_set_progress_callback(
         BlockCopyState *s,
         ProgressBytesCallbackFunc progress_bytes_callback,
-        ProgressResetCallbackFunc progress_reset_callback,
         void *progress_opaque)
 {
     s->progress_bytes_callback = progress_bytes_callback;
-    s->progress_reset_callback = progress_reset_callback;
     s->progress_opaque = progress_opaque;
 }
 
+void block_copy_set_progress_meter(BlockCopyState *s, ProgressMeter *pm)
+{
+    s->progress = pm;
+}
+
 /*
  * block_copy_do_copy
  *
@@ -269,7 +272,9 @@ int64_t block_copy_reset_unallocated(BlockCopyState *s,
 
     if (!ret) {
         bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, bytes);
-        s->progress_reset_callback(s->progress_opaque);
+        progress_set_remaining(s->progress,
+                               bdrv_get_dirty_count(s->copy_bitmap) +
+                               s->in_flight_bytes);
     }
 
     *count = bytes;
@@ -331,15 +336,18 @@ int coroutine_fn block_copy(BlockCopyState *s,
         trace_block_copy_process(s, start);
 
         bdrv_reset_dirty_bitmap(s->copy_bitmap, start, chunk_end - start);
+        s->in_flight_bytes += chunk_end - start;
 
         co_get_from_shres(s->mem, chunk_end - start);
         ret = block_copy_do_copy(s, start, chunk_end, error_is_read);
         co_put_to_shres(s->mem, chunk_end - start);
+        s->in_flight_bytes -= chunk_end - start;
         if (ret < 0) {
             bdrv_set_dirty_bitmap(s->copy_bitmap, start, chunk_end - start);
             break;
         }
 
+        progress_work_done(s->progress, chunk_end - start);
         s->progress_bytes_callback(chunk_end - start, s->progress_opaque);
         start = chunk_end;
         ret = 0;
-- 
2.21.0




* [PATCH v3 3/9] block/block-copy: specialcase first copy_range request
  2020-03-06  7:38 [PATCH v3 0/9] block-copy improvements: part I Vladimir Sementsov-Ogievskiy
  2020-03-06  7:38 ` [PATCH v3 1/9] job: refactor progress to separate object Vladimir Sementsov-Ogievskiy
  2020-03-06  7:38 ` [PATCH v3 2/9] block/block-copy: fix progress calculation Vladimir Sementsov-Ogievskiy
@ 2020-03-06  7:38 ` Vladimir Sementsov-Ogievskiy
  2020-03-10 13:42   ` Max Reitz
  2020-03-06  7:38 ` [PATCH v3 4/9] block/block-copy: use block_status Vladimir Sementsov-Ogievskiy
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 25+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-03-06  7:38 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, vsementsov, qemu-devel, mreitz, andrey.shinkevich, jsnow

In block_copy_do_copy we fall back to read+write if copy_range fails.
In this case copy_size is larger than what is defined for buffered I/O,
and there is a corresponding comment. Still, backup copies data cluster
by cluster, and most requests are limited to one cluster anyway, so the
only source of such a badly-limited request is the copy-before-write
operation.

A further patch will move backup to use block_copy directly; then, in
cases where copy_range is not supported, the first request of every
backup would be oversized. That's not good, so let's change it now.

The fix is simple: limit the first copy_range request like a
buffer-based request. If it succeeds, set the larger copy_range limit.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
---
 block/block-copy.c | 41 +++++++++++++++++++++++++++++++----------
 1 file changed, 31 insertions(+), 10 deletions(-)

diff --git a/block/block-copy.c b/block/block-copy.c
index e2d7b3b887..ddd61c1652 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -70,16 +70,19 @@ void block_copy_state_free(BlockCopyState *s)
     g_free(s);
 }
 
+static uint32_t block_copy_max_transfer(BdrvChild *source, BdrvChild *target)
+{
+    return MIN_NON_ZERO(INT_MAX,
+                        MIN_NON_ZERO(source->bs->bl.max_transfer,
+                                     target->bs->bl.max_transfer));
+}
+
 BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
                                      int64_t cluster_size,
                                      BdrvRequestFlags write_flags, Error **errp)
 {
     BlockCopyState *s;
     BdrvDirtyBitmap *copy_bitmap;
-    uint32_t max_transfer =
-            MIN_NON_ZERO(INT_MAX,
-                         MIN_NON_ZERO(source->bs->bl.max_transfer,
-                                      target->bs->bl.max_transfer));
 
     copy_bitmap = bdrv_create_dirty_bitmap(source->bs, cluster_size, NULL,
                                            errp);
@@ -99,7 +102,7 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
         .mem = shres_create(BLOCK_COPY_MAX_MEM),
     };
 
-    if (max_transfer < cluster_size) {
+    if (block_copy_max_transfer(source, target) < cluster_size) {
         /*
          * copy_range does not respect max_transfer. We don't want to bother
          * with requests smaller than block-copy cluster size, so fallback to
@@ -114,12 +117,11 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
         s->copy_size = cluster_size;
     } else {
         /*
-         * copy_range does not respect max_transfer (it's a TODO), so we factor
-         * that in here.
+         * We enable copy-range, but keep small copy_size, until first
+         * successful copy_range (look at block_copy_do_copy).
          */
         s->use_copy_range = true;
-        s->copy_size = MIN(MAX(cluster_size, BLOCK_COPY_MAX_COPY_RANGE),
-                           QEMU_ALIGN_DOWN(max_transfer, cluster_size));
+        s->copy_size = MAX(s->cluster_size, BLOCK_COPY_MAX_BUFFER);
     }
 
     QLIST_INIT(&s->inflight_reqs);
@@ -172,6 +174,22 @@ static int coroutine_fn block_copy_do_copy(BlockCopyState *s,
             s->copy_size = MAX(s->cluster_size, BLOCK_COPY_MAX_BUFFER);
             /* Fallback to read+write with allocated buffer */
         } else {
+            if (s->use_copy_range) {
+                /*
+                 * Successful copy-range. Now increase copy_size.  copy_range
+                 * does not respect max_transfer (it's a TODO), so we factor
+                 * that in here.
+                 *
+                 * Note: we double-check s->use_copy_range for the case when
+                 * parallel block-copy request unsets it during previous
+                 * bdrv_co_copy_range call.
+                 */
+                s->copy_size =
+                        MIN(MAX(s->cluster_size, BLOCK_COPY_MAX_COPY_RANGE),
+                            QEMU_ALIGN_DOWN(block_copy_max_transfer(s->source,
+                                                                    s->target),
+                                            s->cluster_size));
+            }
             goto out;
         }
     }
@@ -179,7 +197,10 @@ static int coroutine_fn block_copy_do_copy(BlockCopyState *s,
     /*
      * In case of failed copy_range request above, we may proceed with buffered
      * request larger than BLOCK_COPY_MAX_BUFFER. Still, further requests will
-     * be properly limited, so don't care too much.
+     * be properly limited, so don't care too much. Moreover the most likely
+     * case (copy_range is unsupported for the configuration, so the very first
+     * copy_range request fails) is handled by setting large copy_size only
+     * after first successful copy_range.
      */
 
     bounce_buffer = qemu_blockalign(s->source->bs, nbytes);
-- 
2.21.0




* [PATCH v3 4/9] block/block-copy: use block_status
  2020-03-06  7:38 [PATCH v3 0/9] block-copy improvements: part I Vladimir Sementsov-Ogievskiy
                   ` (2 preceding siblings ...)
  2020-03-06  7:38 ` [PATCH v3 3/9] block/block-copy: specialcase first copy_range request Vladimir Sementsov-Ogievskiy
@ 2020-03-06  7:38 ` Vladimir Sementsov-Ogievskiy
  2020-03-10 14:21   ` Max Reitz
  2020-03-06  7:38 ` [PATCH v3 5/9] block/block-copy: factor out find_conflicting_inflight_req Vladimir Sementsov-Ogievskiy
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 25+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-03-06  7:38 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, vsementsov, qemu-devel, mreitz, andrey.shinkevich, jsnow

Use bdrv_block_status_above to choose an effective chunk size and to
handle zeroes efficiently.

This replaces the check for merely being allocated or not, and drops the
old code path for it. Assistance from the backup job is dropped too, as
caching block-status information is more difficult than just caching
is-allocated information in our dirty bitmap, and the backup job is not
a good place for this caching anyway.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
---
 block/block-copy.c | 73 +++++++++++++++++++++++++++++++++++++---------
 block/trace-events |  1 +
 2 files changed, 61 insertions(+), 13 deletions(-)

diff --git a/block/block-copy.c b/block/block-copy.c
index ddd61c1652..b075dba206 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -155,7 +155,7 @@ void block_copy_set_progress_meter(BlockCopyState *s, ProgressMeter *pm)
  */
 static int coroutine_fn block_copy_do_copy(BlockCopyState *s,
                                            int64_t start, int64_t end,
-                                           bool *error_is_read)
+                                           bool zeroes, bool *error_is_read)
 {
     int ret;
     int nbytes = MIN(end, s->len) - start;
@@ -165,6 +165,18 @@ static int coroutine_fn block_copy_do_copy(BlockCopyState *s,
     assert(QEMU_IS_ALIGNED(end, s->cluster_size));
     assert(end < s->len || end == QEMU_ALIGN_UP(s->len, s->cluster_size));
 
+    if (zeroes) {
+        ret = bdrv_co_pwrite_zeroes(s->target, start, nbytes, s->write_flags &
+                                    ~BDRV_REQ_WRITE_COMPRESSED);
+        if (ret < 0) {
+            trace_block_copy_write_zeroes_fail(s, start, ret);
+            if (error_is_read) {
+                *error_is_read = false;
+            }
+        }
+        return ret;
+    }
+
     if (s->use_copy_range) {
         ret = bdrv_co_copy_range(s->source, start, s->target, start, nbytes,
                                  0, s->write_flags);
@@ -230,6 +242,38 @@ out:
     return ret;
 }
 
+static int block_copy_block_status(BlockCopyState *s, int64_t offset,
+                                   int64_t bytes, int64_t *pnum)
+{
+    int64_t num;
+    BlockDriverState *base;
+    int ret;
+
+    if (s->skip_unallocated && s->source->bs->backing) {
+        base = s->source->bs->backing->bs;
+    } else {
+        base = NULL;
+    }
+
+    ret = bdrv_block_status_above(s->source->bs, base, offset, bytes, &num,
+                                  NULL, NULL);
+    if (ret < 0 || num < s->cluster_size) {
+        /*
+         * On error or if failed to obtain large enough chunk just fallback to
+         * copy one cluster.
+         */
+        num = s->cluster_size;
+        ret = BDRV_BLOCK_ALLOCATED | BDRV_BLOCK_DATA;
+    } else if (offset + num == s->len) {
+        num = QEMU_ALIGN_UP(num, s->cluster_size);
+    } else {
+        num = QEMU_ALIGN_DOWN(num, s->cluster_size);
+    }
+
+    *pnum = num;
+    return ret;
+}
+
 /*
  * Check if the cluster starting at offset is allocated or not.
  * return via pnum the number of contiguous clusters sharing this allocation.
@@ -308,7 +352,6 @@ int coroutine_fn block_copy(BlockCopyState *s,
 {
     int ret = 0;
     int64_t end = bytes + start; /* bytes */
-    int64_t status_bytes;
     BlockCopyInFlightReq req;
 
     /*
@@ -325,7 +368,7 @@ int coroutine_fn block_copy(BlockCopyState *s,
     block_copy_inflight_req_begin(s, &req, start, end);
 
     while (start < end) {
-        int64_t next_zero, chunk_end;
+        int64_t next_zero, chunk_end, status_bytes;
 
         if (!bdrv_dirty_bitmap_get(s->copy_bitmap, start)) {
             trace_block_copy_skip(s, start);
@@ -343,24 +386,28 @@ int coroutine_fn block_copy(BlockCopyState *s,
             chunk_end = next_zero;
         }
 
-        if (s->skip_unallocated) {
-            ret = block_copy_reset_unallocated(s, start, &status_bytes);
-            if (ret == 0) {
-                trace_block_copy_skip_range(s, start, status_bytes);
-                start += status_bytes;
-                continue;
-            }
-            /* Clamp to known allocated region */
-            chunk_end = MIN(chunk_end, start + status_bytes);
+        ret = block_copy_block_status(s, start, chunk_end - start,
+                                      &status_bytes);
+        if (s->skip_unallocated && !(ret & BDRV_BLOCK_ALLOCATED)) {
+            bdrv_reset_dirty_bitmap(s->copy_bitmap, start, status_bytes);
+            progress_set_remaining(s->progress,
+                                   bdrv_get_dirty_count(s->copy_bitmap) +
+                                   s->in_flight_bytes);
+            trace_block_copy_skip_range(s, start, status_bytes);
+            start += status_bytes;
+            continue;
         }
 
+        chunk_end = MIN(chunk_end, start + status_bytes);
+
         trace_block_copy_process(s, start);
 
         bdrv_reset_dirty_bitmap(s->copy_bitmap, start, chunk_end - start);
         s->in_flight_bytes += chunk_end - start;
 
         co_get_from_shres(s->mem, chunk_end - start);
-        ret = block_copy_do_copy(s, start, chunk_end, error_is_read);
+        ret = block_copy_do_copy(s, start, chunk_end, ret & BDRV_BLOCK_ZERO,
+                                 error_is_read);
         co_put_to_shres(s->mem, chunk_end - start);
         s->in_flight_bytes -= chunk_end - start;
         if (ret < 0) {
diff --git a/block/trace-events b/block/trace-events
index 1a7329b736..29dff8881c 100644
--- a/block/trace-events
+++ b/block/trace-events
@@ -48,6 +48,7 @@ block_copy_process(void *bcs, int64_t start) "bcs %p start %"PRId64
 block_copy_copy_range_fail(void *bcs, int64_t start, int ret) "bcs %p start %"PRId64" ret %d"
 block_copy_read_fail(void *bcs, int64_t start, int ret) "bcs %p start %"PRId64" ret %d"
 block_copy_write_fail(void *bcs, int64_t start, int ret) "bcs %p start %"PRId64" ret %d"
+block_copy_write_zeroes_fail(void *bcs, int64_t start, int ret) "bcs %p start %"PRId64" ret %d"
 
 # ../blockdev.c
 qmp_block_job_cancel(void *job) "job %p"
-- 
2.21.0




* [PATCH v3 5/9] block/block-copy: factor out find_conflicting_inflight_req
  2020-03-06  7:38 [PATCH v3 0/9] block-copy improvements: part I Vladimir Sementsov-Ogievskiy
                   ` (3 preceding siblings ...)
  2020-03-06  7:38 ` [PATCH v3 4/9] block/block-copy: use block_status Vladimir Sementsov-Ogievskiy
@ 2020-03-06  7:38 ` Vladimir Sementsov-Ogievskiy
  2020-03-10 14:27   ` Max Reitz
  2020-03-06  7:38 ` [PATCH v3 6/9] block/block-copy: refactor interfaces to use bytes instead of end Vladimir Sementsov-Ogievskiy
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 25+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-03-06  7:38 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, vsementsov, qemu-devel, mreitz, andrey.shinkevich, jsnow

Split out find_conflicting_inflight_req so that it can be used on its
own.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
---
 block/block-copy.c | 31 +++++++++++++++++++------------
 1 file changed, 19 insertions(+), 12 deletions(-)

diff --git a/block/block-copy.c b/block/block-copy.c
index b075dba206..251d415a2c 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -24,23 +24,30 @@
 #define BLOCK_COPY_MAX_BUFFER (1 * MiB)
 #define BLOCK_COPY_MAX_MEM (128 * MiB)
 
+static BlockCopyInFlightReq *find_conflicting_inflight_req(BlockCopyState *s,
+                                                           int64_t start,
+                                                           int64_t end)
+{
+    BlockCopyInFlightReq *req;
+
+    QLIST_FOREACH(req, &s->inflight_reqs, list) {
+        if (end > req->start_byte && start < req->end_byte) {
+            return req;
+        }
+    }
+
+    return NULL;
+}
+
 static void coroutine_fn block_copy_wait_inflight_reqs(BlockCopyState *s,
                                                        int64_t start,
                                                        int64_t end)
 {
     BlockCopyInFlightReq *req;
-    bool waited;
-
-    do {
-        waited = false;
-        QLIST_FOREACH(req, &s->inflight_reqs, list) {
-            if (end > req->start_byte && start < req->end_byte) {
-                qemu_co_queue_wait(&req->wait_queue, NULL);
-                waited = true;
-                break;
-            }
-        }
-    } while (waited);
+
+    while ((req = find_conflicting_inflight_req(s, start, end))) {
+        qemu_co_queue_wait(&req->wait_queue, NULL);
+    }
 }
 
 static void block_copy_inflight_req_begin(BlockCopyState *s,
-- 
2.21.0




* [PATCH v3 6/9] block/block-copy: refactor interfaces to use bytes instead of end
  2020-03-06  7:38 [PATCH v3 0/9] block-copy improvements: part I Vladimir Sementsov-Ogievskiy
                   ` (4 preceding siblings ...)
  2020-03-06  7:38 ` [PATCH v3 5/9] block/block-copy: factor out find_conflicting_inflight_req Vladimir Sementsov-Ogievskiy
@ 2020-03-06  7:38 ` Vladimir Sementsov-Ogievskiy
  2020-03-10 14:44   ` Max Reitz
  2020-03-06  7:38 ` [PATCH v3 7/9] block/block-copy: rename start to offset in interfaces Vladimir Sementsov-Ogievskiy
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 25+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-03-06  7:38 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, vsementsov, qemu-devel, mreitz, andrey.shinkevich, jsnow

We have a lot of "chunk_end - start" calculations; let's switch to a
bytes/cur_bytes scheme instead.

While at it, improve the check on block_copy_do_copy parameters so that
calculating nbytes cannot overflow, and use int64_t for the bytes
parameter of block_copy for consistency.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
---
 include/block/block-copy.h |  6 +--
 block/block-copy.c         | 78 ++++++++++++++++++++------------------
 2 files changed, 44 insertions(+), 40 deletions(-)

diff --git a/include/block/block-copy.h b/include/block/block-copy.h
index dc68c8c54e..039cf7c908 100644
--- a/include/block/block-copy.h
+++ b/include/block/block-copy.h
@@ -19,8 +19,8 @@
 #include "qemu/co-shared-resource.h"
 
 typedef struct BlockCopyInFlightReq {
-    int64_t start_byte;
-    int64_t end_byte;
+    int64_t start;
+    int64_t bytes;
     QLIST_ENTRY(BlockCopyInFlightReq) list;
     CoQueue wait_queue; /* coroutines blocked on this request */
 } BlockCopyInFlightReq;
@@ -86,7 +86,7 @@ void block_copy_state_free(BlockCopyState *s);
 int64_t block_copy_reset_unallocated(BlockCopyState *s,
                                      int64_t offset, int64_t *count);
 
-int coroutine_fn block_copy(BlockCopyState *s, int64_t start, uint64_t bytes,
+int coroutine_fn block_copy(BlockCopyState *s, int64_t start, int64_t bytes,
                             bool *error_is_read);
 
 #endif /* BLOCK_COPY_H */
diff --git a/block/block-copy.c b/block/block-copy.c
index 251d415a2c..4c947e548b 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -26,12 +26,12 @@
 
 static BlockCopyInFlightReq *find_conflicting_inflight_req(BlockCopyState *s,
                                                            int64_t start,
-                                                           int64_t end)
+                                                           int64_t bytes)
 {
     BlockCopyInFlightReq *req;
 
     QLIST_FOREACH(req, &s->inflight_reqs, list) {
-        if (end > req->start_byte && start < req->end_byte) {
+        if (start + bytes > req->start && start < req->start + req->bytes) {
             return req;
         }
     }
@@ -41,21 +41,21 @@ static BlockCopyInFlightReq *find_conflicting_inflight_req(BlockCopyState *s,
 
 static void coroutine_fn block_copy_wait_inflight_reqs(BlockCopyState *s,
                                                        int64_t start,
-                                                       int64_t end)
+                                                       int64_t bytes)
 {
     BlockCopyInFlightReq *req;
 
-    while ((req = find_conflicting_inflight_req(s, start, end))) {
+    while ((req = find_conflicting_inflight_req(s, start, bytes))) {
         qemu_co_queue_wait(&req->wait_queue, NULL);
     }
 }
 
 static void block_copy_inflight_req_begin(BlockCopyState *s,
                                           BlockCopyInFlightReq *req,
-                                          int64_t start, int64_t end)
+                                          int64_t start, int64_t bytes)
 {
-    req->start_byte = start;
-    req->end_byte = end;
+    req->start = start;
+    req->bytes = bytes;
     qemu_co_queue_init(&req->wait_queue);
     QLIST_INSERT_HEAD(&s->inflight_reqs, req, list);
 }
@@ -153,24 +153,28 @@ void block_copy_set_progress_meter(BlockCopyState *s, ProgressMeter *pm)
 /*
  * block_copy_do_copy
  *
- * Do copy of cluser-aligned chunk. @end is allowed to exceed s->len only to
- * cover last cluster when s->len is not aligned to clusters.
+ * Do copy of cluster-aligned chunk. Requested region is allowed to exceed
+ * s->len only to cover last cluster when s->len is not aligned to clusters.
  *
  * No sync here: nor bitmap neighter intersecting requests handling, only copy.
  *
  * Returns 0 on success.
  */
 static int coroutine_fn block_copy_do_copy(BlockCopyState *s,
-                                           int64_t start, int64_t end,
+                                           int64_t start, int64_t bytes,
                                            bool zeroes, bool *error_is_read)
 {
     int ret;
-    int nbytes = MIN(end, s->len) - start;
+    int64_t nbytes = MIN(start + bytes, s->len) - start;
     void *bounce_buffer = NULL;
 
+    assert(start >= 0 && bytes > 0 && INT64_MAX - start >= bytes);
     assert(QEMU_IS_ALIGNED(start, s->cluster_size));
-    assert(QEMU_IS_ALIGNED(end, s->cluster_size));
-    assert(end < s->len || end == QEMU_ALIGN_UP(s->len, s->cluster_size));
+    assert(QEMU_IS_ALIGNED(bytes, s->cluster_size));
+    assert(start < s->len);
+    assert(start + bytes <= s->len ||
+           start + bytes == QEMU_ALIGN_UP(s->len, s->cluster_size));
+    assert(nbytes < INT_MAX);
 
     if (zeroes) {
         ret = bdrv_co_pwrite_zeroes(s->target, start, nbytes, s->write_flags &
@@ -354,11 +358,10 @@ int64_t block_copy_reset_unallocated(BlockCopyState *s,
 }
 
 int coroutine_fn block_copy(BlockCopyState *s,
-                            int64_t start, uint64_t bytes,
+                            int64_t start, int64_t bytes,
                             bool *error_is_read)
 {
     int ret = 0;
-    int64_t end = bytes + start; /* bytes */
     BlockCopyInFlightReq req;
 
     /*
@@ -369,32 +372,32 @@ int coroutine_fn block_copy(BlockCopyState *s,
            bdrv_get_aio_context(s->target->bs));
 
     assert(QEMU_IS_ALIGNED(start, s->cluster_size));
-    assert(QEMU_IS_ALIGNED(end, s->cluster_size));
+    assert(QEMU_IS_ALIGNED(bytes, s->cluster_size));
 
     block_copy_wait_inflight_reqs(s, start, bytes);
-    block_copy_inflight_req_begin(s, &req, start, end);
+    block_copy_inflight_req_begin(s, &req, start, bytes);
 
-    while (start < end) {
-        int64_t next_zero, chunk_end, status_bytes;
+    while (bytes) {
+        int64_t next_zero, cur_bytes, status_bytes;
 
         if (!bdrv_dirty_bitmap_get(s->copy_bitmap, start)) {
             trace_block_copy_skip(s, start);
             start += s->cluster_size;
+            bytes -= s->cluster_size;
             continue; /* already copied */
         }
 
-        chunk_end = MIN(end, start + s->copy_size);
+        cur_bytes = MIN(bytes, s->copy_size);
 
         next_zero = bdrv_dirty_bitmap_next_zero(s->copy_bitmap, start,
-                                                chunk_end - start);
+                                                cur_bytes);
         if (next_zero >= 0) {
             assert(next_zero > start); /* start is dirty */
-            assert(next_zero < chunk_end); /* no need to do MIN() */
-            chunk_end = next_zero;
+            assert(next_zero < start + cur_bytes); /* no need to do MIN() */
+            cur_bytes = next_zero - start;
         }
 
-        ret = block_copy_block_status(s, start, chunk_end - start,
-                                      &status_bytes);
+        ret = block_copy_block_status(s, start, cur_bytes, &status_bytes);
         if (s->skip_unallocated && !(ret & BDRV_BLOCK_ALLOCATED)) {
             bdrv_reset_dirty_bitmap(s->copy_bitmap, start, status_bytes);
             progress_set_remaining(s->progress,
@@ -402,30 +405,31 @@ int coroutine_fn block_copy(BlockCopyState *s,
                                    s->in_flight_bytes);
             trace_block_copy_skip_range(s, start, status_bytes);
             start += status_bytes;
+            bytes -= status_bytes;
             continue;
         }
 
-        chunk_end = MIN(chunk_end, start + status_bytes);
+        cur_bytes = MIN(cur_bytes, status_bytes);
 
         trace_block_copy_process(s, start);
 
-        bdrv_reset_dirty_bitmap(s->copy_bitmap, start, chunk_end - start);
-        s->in_flight_bytes += chunk_end - start;
+        bdrv_reset_dirty_bitmap(s->copy_bitmap, start, cur_bytes);
+        s->in_flight_bytes += cur_bytes;
 
-        co_get_from_shres(s->mem, chunk_end - start);
-        ret = block_copy_do_copy(s, start, chunk_end, ret & BDRV_BLOCK_ZERO,
+        co_get_from_shres(s->mem, cur_bytes);
+        ret = block_copy_do_copy(s, start, cur_bytes, ret & BDRV_BLOCK_ZERO,
                                  error_is_read);
-        co_put_to_shres(s->mem, chunk_end - start);
-        s->in_flight_bytes -= chunk_end - start;
+        co_put_to_shres(s->mem, cur_bytes);
+        s->in_flight_bytes -= cur_bytes;
         if (ret < 0) {
-            bdrv_set_dirty_bitmap(s->copy_bitmap, start, chunk_end - start);
+            bdrv_set_dirty_bitmap(s->copy_bitmap, start, cur_bytes);
             break;
         }
 
-        progress_work_done(s->progress, chunk_end - start);
-        s->progress_bytes_callback(chunk_end - start, s->progress_opaque);
-        start = chunk_end;
-        ret = 0;
+        progress_work_done(s->progress, cur_bytes);
+        s->progress_bytes_callback(cur_bytes, s->progress_opaque);
+        start += cur_bytes;
+        bytes -= cur_bytes;
     }
 
     block_copy_inflight_req_end(&req);
-- 
2.21.0




* [PATCH v3 7/9] block/block-copy: rename start to offset in interfaces
  2020-03-06  7:38 [PATCH v3 0/9] block-copy improvements: part I Vladimir Sementsov-Ogievskiy
                   ` (5 preceding siblings ...)
  2020-03-06  7:38 ` [PATCH v3 6/9] block/block-copy: refactor interfaces to use bytes instead of end Vladimir Sementsov-Ogievskiy
@ 2020-03-06  7:38 ` Vladimir Sementsov-Ogievskiy
  2020-03-10 14:50   ` Max Reitz
  2020-03-06  7:38 ` [PATCH v3 8/9] block/block-copy: reduce intersecting request lock Vladimir Sementsov-Ogievskiy
  2020-03-06  7:38 ` [PATCH v3 9/9] block/block-copy: hide structure definitions Vladimir Sementsov-Ogievskiy
  8 siblings, 1 reply; 25+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-03-06  7:38 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, vsementsov, qemu-devel, mreitz, andrey.shinkevich, jsnow

An offset/bytes pair is the more usual naming in the block layer; let's
use it.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
---
 include/block/block-copy.h |  4 +-
 block/block-copy.c         | 84 +++++++++++++++++++-------------------
 2 files changed, 44 insertions(+), 44 deletions(-)

diff --git a/include/block/block-copy.h b/include/block/block-copy.h
index 039cf7c908..c46b382cc6 100644
--- a/include/block/block-copy.h
+++ b/include/block/block-copy.h
@@ -19,7 +19,7 @@
 #include "qemu/co-shared-resource.h"
 
 typedef struct BlockCopyInFlightReq {
-    int64_t start;
+    int64_t offset;
     int64_t bytes;
     QLIST_ENTRY(BlockCopyInFlightReq) list;
     CoQueue wait_queue; /* coroutines blocked on this request */
@@ -86,7 +86,7 @@ void block_copy_state_free(BlockCopyState *s);
 int64_t block_copy_reset_unallocated(BlockCopyState *s,
                                      int64_t offset, int64_t *count);
 
-int coroutine_fn block_copy(BlockCopyState *s, int64_t start, int64_t bytes,
+int coroutine_fn block_copy(BlockCopyState *s, int64_t offset, int64_t bytes,
                             bool *error_is_read);
 
 #endif /* BLOCK_COPY_H */
diff --git a/block/block-copy.c b/block/block-copy.c
index 4c947e548b..2b29131653 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -25,13 +25,13 @@
 #define BLOCK_COPY_MAX_MEM (128 * MiB)
 
 static BlockCopyInFlightReq *find_conflicting_inflight_req(BlockCopyState *s,
-                                                           int64_t start,
-                                                           int64_t bytes)
+                                                          int64_t offset,
+                                                          int64_t bytes)
 {
     BlockCopyInFlightReq *req;
 
     QLIST_FOREACH(req, &s->inflight_reqs, list) {
-        if (start + bytes > req->start && start < req->start + req->bytes) {
+        if (offset + bytes > req->offset && offset < req->offset + req->bytes) {
             return req;
         }
     }
@@ -40,21 +40,21 @@ static BlockCopyInFlightReq *find_conflicting_inflight_req(BlockCopyState *s,
 }
 
 static void coroutine_fn block_copy_wait_inflight_reqs(BlockCopyState *s,
-                                                       int64_t start,
+                                                       int64_t offset,
                                                        int64_t bytes)
 {
     BlockCopyInFlightReq *req;
 
-    while ((req = find_conflicting_inflight_req(s, start, bytes))) {
+    while ((req = find_conflicting_inflight_req(s, offset, bytes))) {
         qemu_co_queue_wait(&req->wait_queue, NULL);
     }
 }
 
 static void block_copy_inflight_req_begin(BlockCopyState *s,
                                           BlockCopyInFlightReq *req,
-                                          int64_t start, int64_t bytes)
+                                          int64_t offset, int64_t bytes)
 {
-    req->start = start;
+    req->offset = offset;
     req->bytes = bytes;
     qemu_co_queue_init(&req->wait_queue);
     QLIST_INSERT_HEAD(&s->inflight_reqs, req, list);
@@ -161,26 +161,26 @@ void block_copy_set_progress_meter(BlockCopyState *s, ProgressMeter *pm)
  * Returns 0 on success.
  */
 static int coroutine_fn block_copy_do_copy(BlockCopyState *s,
-                                           int64_t start, int64_t bytes,
+                                           int64_t offset, int64_t bytes,
                                            bool zeroes, bool *error_is_read)
 {
     int ret;
-    int64_t nbytes = MIN(start + bytes, s->len) - start;
+    int64_t nbytes = MIN(offset + bytes, s->len) - offset;
     void *bounce_buffer = NULL;
 
-    assert(start >= 0 && bytes > 0 && INT64_MAX - start >= bytes);
-    assert(QEMU_IS_ALIGNED(start, s->cluster_size));
+    assert(offset >= 0 && bytes > 0 && INT64_MAX - offset >= bytes);
+    assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
     assert(QEMU_IS_ALIGNED(bytes, s->cluster_size));
-    assert(start < s->len);
-    assert(start + bytes <= s->len ||
-           start + bytes == QEMU_ALIGN_UP(s->len, s->cluster_size));
+    assert(offset < s->len);
+    assert(offset + bytes <= s->len ||
+           offset + bytes == QEMU_ALIGN_UP(s->len, s->cluster_size));
     assert(nbytes < INT_MAX);
 
     if (zeroes) {
-        ret = bdrv_co_pwrite_zeroes(s->target, start, nbytes, s->write_flags &
+        ret = bdrv_co_pwrite_zeroes(s->target, offset, nbytes, s->write_flags &
                                     ~BDRV_REQ_WRITE_COMPRESSED);
         if (ret < 0) {
-            trace_block_copy_write_zeroes_fail(s, start, ret);
+            trace_block_copy_write_zeroes_fail(s, offset, ret);
             if (error_is_read) {
                 *error_is_read = false;
             }
@@ -189,10 +189,10 @@ static int coroutine_fn block_copy_do_copy(BlockCopyState *s,
     }
 
     if (s->use_copy_range) {
-        ret = bdrv_co_copy_range(s->source, start, s->target, start, nbytes,
+        ret = bdrv_co_copy_range(s->source, offset, s->target, offset, nbytes,
                                  0, s->write_flags);
         if (ret < 0) {
-            trace_block_copy_copy_range_fail(s, start, ret);
+            trace_block_copy_copy_range_fail(s, offset, ret);
             s->use_copy_range = false;
             s->copy_size = MAX(s->cluster_size, BLOCK_COPY_MAX_BUFFER);
             /* Fallback to read+write with allocated buffer */
@@ -228,19 +228,19 @@ static int coroutine_fn block_copy_do_copy(BlockCopyState *s,
 
     bounce_buffer = qemu_blockalign(s->source->bs, nbytes);
 
-    ret = bdrv_co_pread(s->source, start, nbytes, bounce_buffer, 0);
+    ret = bdrv_co_pread(s->source, offset, nbytes, bounce_buffer, 0);
     if (ret < 0) {
-        trace_block_copy_read_fail(s, start, ret);
+        trace_block_copy_read_fail(s, offset, ret);
         if (error_is_read) {
             *error_is_read = true;
         }
         goto out;
     }
 
-    ret = bdrv_co_pwrite(s->target, start, nbytes, bounce_buffer,
+    ret = bdrv_co_pwrite(s->target, offset, nbytes, bounce_buffer,
                          s->write_flags);
     if (ret < 0) {
-        trace_block_copy_write_fail(s, start, ret);
+        trace_block_copy_write_fail(s, offset, ret);
         if (error_is_read) {
             *error_is_read = false;
         }
@@ -358,7 +358,7 @@ int64_t block_copy_reset_unallocated(BlockCopyState *s,
 }
 
 int coroutine_fn block_copy(BlockCopyState *s,
-                            int64_t start, int64_t bytes,
+                            int64_t offset, int64_t bytes,
                             bool *error_is_read)
 {
     int ret = 0;
@@ -371,64 +371,64 @@ int coroutine_fn block_copy(BlockCopyState *s,
     assert(bdrv_get_aio_context(s->source->bs) ==
            bdrv_get_aio_context(s->target->bs));
 
-    assert(QEMU_IS_ALIGNED(start, s->cluster_size));
+    assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
     assert(QEMU_IS_ALIGNED(bytes, s->cluster_size));
 
-    block_copy_wait_inflight_reqs(s, start, bytes);
-    block_copy_inflight_req_begin(s, &req, start, bytes);
+    block_copy_wait_inflight_reqs(s, offset, bytes);
+    block_copy_inflight_req_begin(s, &req, offset, bytes);
 
     while (bytes) {
         int64_t next_zero, cur_bytes, status_bytes;
 
-        if (!bdrv_dirty_bitmap_get(s->copy_bitmap, start)) {
-            trace_block_copy_skip(s, start);
-            start += s->cluster_size;
+        if (!bdrv_dirty_bitmap_get(s->copy_bitmap, offset)) {
+            trace_block_copy_skip(s, offset);
+            offset += s->cluster_size;
             bytes -= s->cluster_size;
             continue; /* already copied */
         }
 
         cur_bytes = MIN(bytes, s->copy_size);
 
-        next_zero = bdrv_dirty_bitmap_next_zero(s->copy_bitmap, start,
+        next_zero = bdrv_dirty_bitmap_next_zero(s->copy_bitmap, offset,
                                                 cur_bytes);
         if (next_zero >= 0) {
-            assert(next_zero > start); /* start is dirty */
-            assert(next_zero < start + cur_bytes); /* no need to do MIN() */
-            cur_bytes = next_zero - start;
+            assert(next_zero > offset); /* offset is dirty */
+            assert(next_zero < offset + cur_bytes); /* no need to do MIN() */
+            cur_bytes = next_zero - offset;
         }
 
-        ret = block_copy_block_status(s, start, cur_bytes, &status_bytes);
+        ret = block_copy_block_status(s, offset, cur_bytes, &status_bytes);
         if (s->skip_unallocated && !(ret & BDRV_BLOCK_ALLOCATED)) {
-            bdrv_reset_dirty_bitmap(s->copy_bitmap, start, status_bytes);
+            bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, status_bytes);
             progress_set_remaining(s->progress,
                                    bdrv_get_dirty_count(s->copy_bitmap) +
                                    s->in_flight_bytes);
-            trace_block_copy_skip_range(s, start, status_bytes);
-            start += status_bytes;
+            trace_block_copy_skip_range(s, offset, status_bytes);
+            offset += status_bytes;
             bytes -= status_bytes;
             continue;
         }
 
         cur_bytes = MIN(cur_bytes, status_bytes);
 
-        trace_block_copy_process(s, start);
+        trace_block_copy_process(s, offset);
 
-        bdrv_reset_dirty_bitmap(s->copy_bitmap, start, cur_bytes);
+        bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, cur_bytes);
         s->in_flight_bytes += cur_bytes;
 
         co_get_from_shres(s->mem, cur_bytes);
-        ret = block_copy_do_copy(s, start, cur_bytes, ret & BDRV_BLOCK_ZERO,
+        ret = block_copy_do_copy(s, offset, cur_bytes, ret & BDRV_BLOCK_ZERO,
                                  error_is_read);
         co_put_to_shres(s->mem, cur_bytes);
         s->in_flight_bytes -= cur_bytes;
         if (ret < 0) {
-            bdrv_set_dirty_bitmap(s->copy_bitmap, start, cur_bytes);
+            bdrv_set_dirty_bitmap(s->copy_bitmap, offset, cur_bytes);
             break;
         }
 
         progress_work_done(s->progress, cur_bytes);
         s->progress_bytes_callback(cur_bytes, s->progress_opaque);
-        start += cur_bytes;
+        offset += cur_bytes;
         bytes -= cur_bytes;
     }
 
-- 
2.21.0




* [PATCH v3 8/9] block/block-copy: reduce intersecting request lock
  2020-03-06  7:38 [PATCH v3 0/9] block-copy improvements: part I Vladimir Sementsov-Ogievskiy
                   ` (6 preceding siblings ...)
  2020-03-06  7:38 ` [PATCH v3 7/9] block/block-copy: rename start to offset in interfaces Vladimir Sementsov-Ogievskiy
@ 2020-03-06  7:38 ` Vladimir Sementsov-Ogievskiy
  2020-03-10 15:32   ` Max Reitz
  2020-03-06  7:38 ` [PATCH v3 9/9] block/block-copy: hide structure definitions Vladimir Sementsov-Ogievskiy
  8 siblings, 1 reply; 25+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-03-06  7:38 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, vsementsov, qemu-devel, mreitz, andrey.shinkevich, jsnow

Currently, the block_copy operation locks the whole requested region.
But there is no reason to lock clusters that are already copied: it only
disturbs other parallel block_copy requests for no benefit.

Let's instead do the following:

Lock only the sub-region that we are going to operate on. Then, after
copying all dirty sub-regions, wait for intersecting block-copy
requests; if they failed, retry these newly dirtied clusters.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
---
 block/block-copy.c | 128 ++++++++++++++++++++++++++++++++++++---------
 1 file changed, 104 insertions(+), 24 deletions(-)

diff --git a/block/block-copy.c b/block/block-copy.c
index 2b29131653..d66b8eb691 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -39,29 +39,71 @@ static BlockCopyInFlightReq *find_conflicting_inflight_req(BlockCopyState *s,
     return NULL;
 }
 
-static void coroutine_fn block_copy_wait_inflight_reqs(BlockCopyState *s,
-                                                       int64_t offset,
-                                                       int64_t bytes)
+/*
+ * If there are no intersecting requests, return false. Otherwise, wait for the
+ * first intersecting request found to finish and return true.
+ */
+static bool coroutine_fn block_copy_wait_one(BlockCopyState *s, int64_t offset,
+                                             int64_t bytes)
 {
-    BlockCopyInFlightReq *req;
+    BlockCopyInFlightReq *req = find_conflicting_inflight_req(s, offset, bytes);
 
-    while ((req = find_conflicting_inflight_req(s, offset, bytes))) {
-        qemu_co_queue_wait(&req->wait_queue, NULL);
+    if (!req) {
+        return false;
     }
+
+    qemu_co_queue_wait(&req->wait_queue, NULL);
+
+    return true;
 }
 
+/* Called only on full-dirty region */
 static void block_copy_inflight_req_begin(BlockCopyState *s,
                                           BlockCopyInFlightReq *req,
                                           int64_t offset, int64_t bytes)
 {
+    assert(!find_conflicting_inflight_req(s, offset, bytes));
+
+    bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, bytes);
+    s->in_flight_bytes += bytes;
+
     req->offset = offset;
     req->bytes = bytes;
     qemu_co_queue_init(&req->wait_queue);
     QLIST_INSERT_HEAD(&s->inflight_reqs, req, list);
 }
 
-static void coroutine_fn block_copy_inflight_req_end(BlockCopyInFlightReq *req)
+/*
+ * block_copy_inflight_req_shrink
+ *
+ * Drop the tail of the request to be handled later. Set the dirty bits back
+ * and wake up all requests waiting for us (maybe some of them no longer
+ * intersect with the shrunk request).
+ */
+static void coroutine_fn block_copy_inflight_req_shrink(BlockCopyState *s,
+        BlockCopyInFlightReq *req, int64_t new_bytes)
 {
+    if (new_bytes == req->bytes) {
+        return;
+    }
+
+    assert(new_bytes > 0 && new_bytes < req->bytes);
+
+    bdrv_set_dirty_bitmap(s->copy_bitmap,
+                          req->offset + new_bytes, req->bytes - new_bytes);
+
+    req->bytes = new_bytes;
+    qemu_co_queue_restart_all(&req->wait_queue);
+}
+
+static void coroutine_fn block_copy_inflight_req_end(BlockCopyState *s,
+                                                     BlockCopyInFlightReq *req,
+                                                     int ret)
+{
+    s->in_flight_bytes -= req->bytes;
+    if (ret < 0) {
+        bdrv_set_dirty_bitmap(s->copy_bitmap, req->offset, req->bytes);
+    }
     QLIST_REMOVE(req, list);
     qemu_co_queue_restart_all(&req->wait_queue);
 }
@@ -357,12 +399,19 @@ int64_t block_copy_reset_unallocated(BlockCopyState *s,
     return ret;
 }
 
-int coroutine_fn block_copy(BlockCopyState *s,
-                            int64_t offset, int64_t bytes,
-                            bool *error_is_read)
+/*
+ * block_copy_dirty_clusters
+ *
+ * Copy dirty clusters in @offset/@bytes range.
+ * Returns 1 if dirty clusters were found and successfully copied, 0 if no
+ * dirty clusters were found, and -errno on failure.
+ */
+static int coroutine_fn block_copy_dirty_clusters(BlockCopyState *s,
+                                                  int64_t offset, int64_t bytes,
+                                                  bool *error_is_read)
 {
     int ret = 0;
-    BlockCopyInFlightReq req;
+    bool found_dirty = false;
 
     /*
      * block_copy() user is responsible for keeping source and target in same
@@ -374,10 +423,8 @@ int coroutine_fn block_copy(BlockCopyState *s,
     assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
     assert(QEMU_IS_ALIGNED(bytes, s->cluster_size));
 
-    block_copy_wait_inflight_reqs(s, offset, bytes);
-    block_copy_inflight_req_begin(s, &req, offset, bytes);
-
     while (bytes) {
+        BlockCopyInFlightReq req;
         int64_t next_zero, cur_bytes, status_bytes;
 
         if (!bdrv_dirty_bitmap_get(s->copy_bitmap, offset)) {
@@ -387,6 +434,8 @@ int coroutine_fn block_copy(BlockCopyState *s,
             continue; /* already copied */
         }
 
+        found_dirty = true;
+
         cur_bytes = MIN(bytes, s->copy_size);
 
         next_zero = bdrv_dirty_bitmap_next_zero(s->copy_bitmap, offset,
@@ -396,10 +445,14 @@ int coroutine_fn block_copy(BlockCopyState *s,
             assert(next_zero < offset + cur_bytes); /* no need to do MIN() */
             cur_bytes = next_zero - offset;
         }
+        block_copy_inflight_req_begin(s, &req, offset, cur_bytes);
 
         ret = block_copy_block_status(s, offset, cur_bytes, &status_bytes);
+        assert(ret >= 0); /* never fail */
+        cur_bytes = MIN(cur_bytes, status_bytes);
+        block_copy_inflight_req_shrink(s, &req, cur_bytes);
         if (s->skip_unallocated && !(ret & BDRV_BLOCK_ALLOCATED)) {
-            bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, status_bytes);
+            block_copy_inflight_req_end(s, &req, 0);
             progress_set_remaining(s->progress,
                                    bdrv_get_dirty_count(s->copy_bitmap) +
                                    s->in_flight_bytes);
@@ -409,21 +462,15 @@ int coroutine_fn block_copy(BlockCopyState *s,
             continue;
         }
 
-        cur_bytes = MIN(cur_bytes, status_bytes);
-
         trace_block_copy_process(s, offset);
 
-        bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, cur_bytes);
-        s->in_flight_bytes += cur_bytes;
-
         co_get_from_shres(s->mem, cur_bytes);
         ret = block_copy_do_copy(s, offset, cur_bytes, ret & BDRV_BLOCK_ZERO,
                                  error_is_read);
         co_put_to_shres(s->mem, cur_bytes);
-        s->in_flight_bytes -= cur_bytes;
+        block_copy_inflight_req_end(s, &req, ret);
         if (ret < 0) {
-            bdrv_set_dirty_bitmap(s->copy_bitmap, offset, cur_bytes);
-            break;
+            return ret;
         }
 
         progress_work_done(s->progress, cur_bytes);
@@ -432,7 +479,40 @@ int coroutine_fn block_copy(BlockCopyState *s,
         bytes -= cur_bytes;
     }
 
-    block_copy_inflight_req_end(&req);
+    return found_dirty;
+}
+
+/*
+ * block_copy
+ *
+ * Copy requested region, accordingly to dirty bitmap.
+ * Collaborate with parallel block_copy requests: if they succeed, it helps us.
+ * If they fail, we retry the not-yet-copied regions. So, if we return an error,
+ * it means that an I/O operation failed in the context of _this_ block_copy
+ * call, not of some parallel operation.
+ */
+int coroutine_fn block_copy(BlockCopyState *s, int64_t offset, int64_t bytes,
+                            bool *error_is_read)
+{
+    int ret;
+
+    do {
+        ret = block_copy_dirty_clusters(s, offset, bytes, error_is_read);
+
+        if (ret == 0) {
+            ret = block_copy_wait_one(s, offset, bytes);
+        }
+
+        /*
+         * We retry in two cases:
+         * 1. Some progress done
+         *    Something was copied, which means that there were yield points
+         *    and some new dirty bits may have appeared (due to failed parallel
+         *    block-copy requests).
+         * 2. We have waited for some intersecting block-copy request
+         *    It may have failed and produced new dirty bits.
+         */
+    } while (ret > 0);
 
     return ret;
 }
-- 
2.21.0




* [PATCH v3 9/9] block/block-copy: hide structure definitions
  2020-03-06  7:38 [PATCH v3 0/9] block-copy improvements: part I Vladimir Sementsov-Ogievskiy
                   ` (7 preceding siblings ...)
  2020-03-06  7:38 ` [PATCH v3 8/9] block/block-copy: reduce intersecting request lock Vladimir Sementsov-Ogievskiy
@ 2020-03-06  7:38 ` Vladimir Sementsov-Ogievskiy
  2020-03-10 14:55   ` Andrey Shinkevich
  2020-03-10 15:38   ` Max Reitz
  8 siblings, 2 replies; 25+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-03-06  7:38 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, vsementsov, qemu-devel, mreitz, andrey.shinkevich, jsnow

Hide the structure definitions and add an explicit API instead, to keep an
eye on the scope of the shared fields.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 include/block/block-copy.h | 52 +++----------------------------
 block/backup-top.c         |  6 ++--
 block/backup.c             | 25 +++++++--------
 block/block-copy.c         | 63 ++++++++++++++++++++++++++++++++++++--
 4 files changed, 82 insertions(+), 64 deletions(-)

diff --git a/include/block/block-copy.h b/include/block/block-copy.h
index c46b382cc6..9718c241c9 100644
--- a/include/block/block-copy.h
+++ b/include/block/block-copy.h
@@ -18,56 +18,9 @@
 #include "block/block.h"
 #include "qemu/co-shared-resource.h"
 
-typedef struct BlockCopyInFlightReq {
-    int64_t offset;
-    int64_t bytes;
-    QLIST_ENTRY(BlockCopyInFlightReq) list;
-    CoQueue wait_queue; /* coroutines blocked on this request */
-} BlockCopyInFlightReq;
-
 typedef void (*ProgressBytesCallbackFunc)(int64_t bytes, void *opaque);
 typedef void (*ProgressResetCallbackFunc)(void *opaque);
-typedef struct BlockCopyState {
-    /*
-     * BdrvChild objects are not owned or managed by block-copy. They are
-     * provided by block-copy user and user is responsible for appropriate
-     * permissions on these children.
-     */
-    BdrvChild *source;
-    BdrvChild *target;
-    BdrvDirtyBitmap *copy_bitmap;
-    int64_t in_flight_bytes;
-    int64_t cluster_size;
-    bool use_copy_range;
-    int64_t copy_size;
-    uint64_t len;
-    QLIST_HEAD(, BlockCopyInFlightReq) inflight_reqs;
-
-    BdrvRequestFlags write_flags;
-
-    /*
-     * skip_unallocated:
-     *
-     * Used by sync=top jobs, which first scan the source node for unallocated
-     * areas and clear them in the copy_bitmap.  During this process, the bitmap
-     * is thus not fully initialized: It may still have bits set for areas that
-     * are unallocated and should actually not be copied.
-     *
-     * This is indicated by skip_unallocated.
-     *
-     * In this case, block_copy() will query the source’s allocation status,
-     * skip unallocated regions, clear them in the copy_bitmap, and invoke
-     * block_copy_reset_unallocated() every time it does.
-     */
-    bool skip_unallocated;
-
-    ProgressMeter *progress;
-    /* progress_bytes_callback: called when some copying progress is done. */
-    ProgressBytesCallbackFunc progress_bytes_callback;
-    void *progress_opaque;
-
-    SharedResource *mem;
-} BlockCopyState;
+typedef struct BlockCopyState BlockCopyState;
 
 BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
                                      int64_t cluster_size,
@@ -89,4 +42,7 @@ int64_t block_copy_reset_unallocated(BlockCopyState *s,
 int coroutine_fn block_copy(BlockCopyState *s, int64_t offset, int64_t bytes,
                             bool *error_is_read);
 
+BdrvDirtyBitmap *block_copy_dirty_bitmap(BlockCopyState *s);
+void block_copy_set_skip_unallocated(BlockCopyState *s, bool skip);
+
 #endif /* BLOCK_COPY_H */
diff --git a/block/backup-top.c b/block/backup-top.c
index 1bfb360bd3..3b50c06e2c 100644
--- a/block/backup-top.c
+++ b/block/backup-top.c
@@ -38,6 +38,7 @@ typedef struct BDRVBackupTopState {
     BlockCopyState *bcs;
     BdrvChild *target;
     bool active;
+    int64_t cluster_size;
 } BDRVBackupTopState;
 
 static coroutine_fn int backup_top_co_preadv(
@@ -57,8 +58,8 @@ static coroutine_fn int backup_top_cbw(BlockDriverState *bs, uint64_t offset,
         return 0;
     }
 
-    off = QEMU_ALIGN_DOWN(offset, s->bcs->cluster_size);
-    end = QEMU_ALIGN_UP(offset + bytes, s->bcs->cluster_size);
+    off = QEMU_ALIGN_DOWN(offset, s->cluster_size);
+    end = QEMU_ALIGN_UP(offset + bytes, s->cluster_size);
 
     return block_copy(s->bcs, off, end - off, NULL);
 }
@@ -238,6 +239,7 @@ BlockDriverState *bdrv_backup_top_append(BlockDriverState *source,
         goto fail;
     }
 
+    state->cluster_size = cluster_size;
     state->bcs = block_copy_state_new(top->backing, state->target,
                                       cluster_size, write_flags, &local_err);
     if (local_err) {
diff --git a/block/backup.c b/block/backup.c
index 8694e0394b..7430ca5883 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -102,7 +102,7 @@ static void backup_cleanup_sync_bitmap(BackupBlockJob *job, int ret)
 
     if (ret < 0 && job->bitmap_mode == BITMAP_SYNC_MODE_ALWAYS) {
         /* If we failed and synced, merge in the bits we didn't copy: */
-        bdrv_dirty_bitmap_merge_internal(bm, job->bcs->copy_bitmap,
+        bdrv_dirty_bitmap_merge_internal(bm, block_copy_dirty_bitmap(job->bcs),
                                          NULL, true);
     }
 }
@@ -145,7 +145,8 @@ void backup_do_checkpoint(BlockJob *job, Error **errp)
         return;
     }
 
-    bdrv_set_dirty_bitmap(backup_job->bcs->copy_bitmap, 0, backup_job->len);
+    bdrv_set_dirty_bitmap(block_copy_dirty_bitmap(backup_job->bcs), 0,
+                          backup_job->len);
 }
 
 static BlockErrorAction backup_error_action(BackupBlockJob *job,
@@ -190,7 +191,7 @@ static int coroutine_fn backup_loop(BackupBlockJob *job)
     BdrvDirtyBitmapIter *bdbi;
     int ret = 0;
 
-    bdbi = bdrv_dirty_iter_new(job->bcs->copy_bitmap);
+    bdbi = bdrv_dirty_iter_new(block_copy_dirty_bitmap(job->bcs));
     while ((offset = bdrv_dirty_iter_next(bdbi)) != -1) {
         do {
             if (yield_and_check(job)) {
@@ -210,14 +211,14 @@ static int coroutine_fn backup_loop(BackupBlockJob *job)
     return ret;
 }
 
-static void backup_init_copy_bitmap(BackupBlockJob *job)
+static void backup_init_bcs_bitmap(BackupBlockJob *job)
 {
     bool ret;
     uint64_t estimate;
+    BdrvDirtyBitmap *bcs_bitmap = block_copy_dirty_bitmap(job->bcs);
 
     if (job->sync_mode == MIRROR_SYNC_MODE_BITMAP) {
-        ret = bdrv_dirty_bitmap_merge_internal(job->bcs->copy_bitmap,
-                                               job->sync_bitmap,
+        ret = bdrv_dirty_bitmap_merge_internal(bcs_bitmap, job->sync_bitmap,
                                                NULL, true);
         assert(ret);
     } else {
@@ -226,12 +227,12 @@ static void backup_init_copy_bitmap(BackupBlockJob *job)
              * We can't hog the coroutine to initialize this thoroughly.
              * Set a flag and resume work when we are able to yield safely.
              */
-            job->bcs->skip_unallocated = true;
+            block_copy_set_skip_unallocated(job->bcs, true);
         }
-        bdrv_set_dirty_bitmap(job->bcs->copy_bitmap, 0, job->len);
+        bdrv_set_dirty_bitmap(bcs_bitmap, 0, job->len);
     }
 
-    estimate = bdrv_get_dirty_count(job->bcs->copy_bitmap);
+    estimate = bdrv_get_dirty_count(bcs_bitmap);
     job_progress_set_remaining(&job->common.job, estimate);
 }
 
@@ -240,7 +241,7 @@ static int coroutine_fn backup_run(Job *job, Error **errp)
     BackupBlockJob *s = container_of(job, BackupBlockJob, common.job);
     int ret = 0;
 
-    backup_init_copy_bitmap(s);
+    backup_init_bcs_bitmap(s);
 
     if (s->sync_mode == MIRROR_SYNC_MODE_TOP) {
         int64_t offset = 0;
@@ -259,12 +260,12 @@ static int coroutine_fn backup_run(Job *job, Error **errp)
 
             offset += count;
         }
-        s->bcs->skip_unallocated = false;
+        block_copy_set_skip_unallocated(s->bcs, false);
     }
 
     if (s->sync_mode == MIRROR_SYNC_MODE_NONE) {
         /*
-         * All bits are set in copy_bitmap to allow any cluster to be copied.
+         * All bits are set in bcs bitmap to allow any cluster to be copied.
          * This does not actually require them to be copied.
          */
         while (!job_is_cancelled(job)) {
diff --git a/block/block-copy.c b/block/block-copy.c
index d66b8eb691..a2d8579ca0 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -24,9 +24,58 @@
 #define BLOCK_COPY_MAX_BUFFER (1 * MiB)
 #define BLOCK_COPY_MAX_MEM (128 * MiB)
 
+typedef struct BlockCopyInFlightReq {
+    int64_t offset;
+    int64_t bytes;
+    QLIST_ENTRY(BlockCopyInFlightReq) list;
+    CoQueue wait_queue; /* coroutines blocked on this request */
+} BlockCopyInFlightReq;
+
+typedef struct BlockCopyState {
+    /*
+     * BdrvChild objects are not owned or managed by block-copy. They are
+     * provided by block-copy user and user is responsible for appropriate
+     * permissions on these children.
+     */
+    BdrvChild *source;
+    BdrvChild *target;
+    BdrvDirtyBitmap *copy_bitmap;
+    int64_t in_flight_bytes;
+    int64_t cluster_size;
+    bool use_copy_range;
+    int64_t copy_size;
+    uint64_t len;
+    QLIST_HEAD(, BlockCopyInFlightReq) inflight_reqs;
+
+    BdrvRequestFlags write_flags;
+
+    /*
+     * skip_unallocated:
+     *
+     * Used by sync=top jobs, which first scan the source node for unallocated
+     * areas and clear them in the copy_bitmap.  During this process, the bitmap
+     * is thus not fully initialized: It may still have bits set for areas that
+     * are unallocated and should actually not be copied.
+     *
+     * This is indicated by skip_unallocated.
+     *
+     * In this case, block_copy() will query the source’s allocation status,
+     * skip unallocated regions, clear them in the copy_bitmap, and invoke
+     * block_copy_reset_unallocated() every time it does.
+     */
+    bool skip_unallocated;
+
+    ProgressMeter *progress;
+    /* progress_bytes_callback: called when some copying progress is done. */
+    ProgressBytesCallbackFunc progress_bytes_callback;
+    void *progress_opaque;
+
+    SharedResource *mem;
+} BlockCopyState;
+
 static BlockCopyInFlightReq *find_conflicting_inflight_req(BlockCopyState *s,
-                                                          int64_t offset,
-                                                          int64_t bytes)
+                                                           int64_t offset,
+                                                           int64_t bytes)
 {
     BlockCopyInFlightReq *req;
 
@@ -516,3 +565,13 @@ int coroutine_fn block_copy(BlockCopyState *s, int64_t offset, int64_t bytes,
 
     return ret;
 }
+
+BdrvDirtyBitmap *block_copy_dirty_bitmap(BlockCopyState *s)
+{
+    return s->copy_bitmap;
+}
+
+void block_copy_set_skip_unallocated(BlockCopyState *s, bool skip)
+{
+    s->skip_unallocated = skip;
+}
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 1/9] job: refactor progress to separate object
  2020-03-06  7:38 ` [PATCH v3 1/9] job: refactor progress to separate object Vladimir Sementsov-Ogievskiy
@ 2020-03-10 12:22   ` Andrey Shinkevich
  2020-03-10 12:40   ` Max Reitz
  1 sibling, 0 replies; 25+ messages in thread
From: Andrey Shinkevich @ 2020-03-10 12:22 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: kwolf, qemu-stable, jsnow, qemu-devel, mreitz

On 06/03/2020 10:38, Vladimir Sementsov-Ogievskiy wrote:
> We need it as a separate object to pass to the block-copy object in the
> next commit.
> 
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
> ---
>   include/qemu/job.h            | 11 ++-----
>   include/qemu/progress_meter.h | 58 +++++++++++++++++++++++++++++++++++
>   blockjob.c                    | 16 +++++-----
>   job-qmp.c                     |  4 +--
>   job.c                         |  6 ++--
>   qemu-img.c                    |  6 ++--
>   6 files changed, 76 insertions(+), 25 deletions(-)
>   create mode 100644 include/qemu/progress_meter.h
> 
> diff --git a/include/qemu/job.h b/include/qemu/job.h
> index bd59cd8944..32aabb1c60 100644
> --- a/include/qemu/job.h
> +++ b/include/qemu/job.h
> @@ -28,6 +28,7 @@
>   
>   #include "qapi/qapi-types-job.h"
>   #include "qemu/queue.h"
> +#include "qemu/progress_meter.h"
>   #include "qemu/coroutine.h"
>   #include "block/aio.h"
>   
> @@ -117,15 +118,7 @@ typedef struct Job {
>       /** True if this job should automatically dismiss itself */
>       bool auto_dismiss;
>   
> -    /**
> -     * Current progress. The unit is arbitrary as long as the ratio between
> -     * progress_current and progress_total represents the estimated percentage
> -     * of work already done.
> -     */
> -    int64_t progress_current;
> -
> -    /** Estimated progress_current value at the completion of the job */
> -    int64_t progress_total;
> +    ProgressMeter progress;
>   
>       /**
>        * Return code from @run and/or @prepare callback(s).
> diff --git a/include/qemu/progress_meter.h b/include/qemu/progress_meter.h
> new file mode 100644
> index 0000000000..9a23ff071c
> --- /dev/null
> +++ b/include/qemu/progress_meter.h
> @@ -0,0 +1,58 @@
> +/*
> + * Helper functionality for some process progress tracking.
> + *
> + * Copyright (c) 2011 IBM Corp.
> + * Copyright (c) 2012, 2018 Red Hat, Inc.
> + * Copyright (c) 2020 Virtuozzo International GmbH
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a copy
> + * of this software and associated documentation files (the "Software"), to deal
> + * in the Software without restriction, including without limitation the rights
> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
> + * copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
> + * THE SOFTWARE.
> + */
> +
> +#ifndef QEMU_PROGRESS_METER_H
> +#define QEMU_PROGRESS_METER_H
> +
> +typedef struct ProgressMeter {
> +    /**
> +     * Current progress. The unit is arbitrary as long as the ratio between
> +     * current and total represents the estimated percentage
> +     * of work already done.
> +     */
> +    uint64_t current;
> +
> +    /** Estimated current value at the completion of the process */
> +    uint64_t total;
> +} ProgressMeter;
> +
> +static inline void progress_work_done(ProgressMeter *pm, uint64_t done)
> +{
> +    pm->current += done;
> +}
> +
> +static inline void progress_set_remaining(ProgressMeter *pm, uint64_t remaining)
> +{
> +    pm->total = pm->current + remaining;
> +}
> +
> +static inline void progress_increase_remaining(ProgressMeter *pm,
> +                                               uint64_t delta)
> +{
> +    pm->total += delta;
> +}
> +
> +#endif /* QEMU_PROGRESS_METER_H */
> diff --git a/blockjob.c b/blockjob.c
> index 5d63b1e89d..fc850312c1 100644
> --- a/blockjob.c
> +++ b/blockjob.c
> @@ -299,8 +299,8 @@ BlockJobInfo *block_job_query(BlockJob *job, Error **errp)
>       info->device    = g_strdup(job->job.id);
>       info->busy      = atomic_read(&job->job.busy);
>       info->paused    = job->job.pause_count > 0;
> -    info->offset    = job->job.progress_current;
> -    info->len       = job->job.progress_total;
> +    info->offset    = job->job.progress.current;
> +    info->len       = job->job.progress.total;
>       info->speed     = job->speed;
>       info->io_status = job->iostatus;
>       info->ready     = job_is_ready(&job->job),
> @@ -330,8 +330,8 @@ static void block_job_event_cancelled(Notifier *n, void *opaque)
>   
>       qapi_event_send_block_job_cancelled(job_type(&job->job),
>                                           job->job.id,
> -                                        job->job.progress_total,
> -                                        job->job.progress_current,
> +                                        job->job.progress.total,
> +                                        job->job.progress.current,
>                                           job->speed);
>   }
>   
> @@ -350,8 +350,8 @@ static void block_job_event_completed(Notifier *n, void *opaque)
>   
>       qapi_event_send_block_job_completed(job_type(&job->job),
>                                           job->job.id,
> -                                        job->job.progress_total,
> -                                        job->job.progress_current,
> +                                        job->job.progress.total,
> +                                        job->job.progress.current,
>                                           job->speed,
>                                           !!msg,
>                                           msg);
> @@ -379,8 +379,8 @@ static void block_job_event_ready(Notifier *n, void *opaque)
>   
>       qapi_event_send_block_job_ready(job_type(&job->job),
>                                       job->job.id,
> -                                    job->job.progress_total,
> -                                    job->job.progress_current,
> +                                    job->job.progress.total,
> +                                    job->job.progress.current,
>                                       job->speed);
>   }
>   
> diff --git a/job-qmp.c b/job-qmp.c
> index fbfed25a00..fecc939ebd 100644
> --- a/job-qmp.c
> +++ b/job-qmp.c
> @@ -143,8 +143,8 @@ static JobInfo *job_query_single(Job *job, Error **errp)
>           .id                 = g_strdup(job->id),
>           .type               = job_type(job),
>           .status             = job->status,
> -        .current_progress   = job->progress_current,
> -        .total_progress     = job->progress_total,
> +        .current_progress   = job->progress.current,
> +        .total_progress     = job->progress.total,
>           .has_error          = !!job->err,
>           .error              = job->err ? \
>                                 g_strdup(error_get_pretty(job->err)) : NULL,
> diff --git a/job.c b/job.c
> index 04409b40aa..134a07b92e 100644
> --- a/job.c
> +++ b/job.c
> @@ -369,17 +369,17 @@ void job_unref(Job *job)
>   
>   void job_progress_update(Job *job, uint64_t done)
>   {
> -    job->progress_current += done;
> +    progress_work_done(&job->progress, done);
>   }
>   
>   void job_progress_set_remaining(Job *job, uint64_t remaining)
>   {
> -    job->progress_total = job->progress_current + remaining;
> +    progress_set_remaining(&job->progress, remaining);
>   }
>   
>   void job_progress_increase_remaining(Job *job, uint64_t delta)
>   {
> -    job->progress_total += delta;
> +    progress_increase_remaining(&job->progress, delta);
>   }
>   
>   void job_event_cancelled(Job *job)
> diff --git a/qemu-img.c b/qemu-img.c
> index 804630a368..59a720abf6 100644
> --- a/qemu-img.c
> +++ b/qemu-img.c
> @@ -882,9 +882,9 @@ static void run_block_job(BlockJob *job, Error **errp)
>       do {
>           float progress = 0.0f;
>           aio_poll(aio_context, true);
> -        if (job->job.progress_total) {
> -            progress = (float)job->job.progress_current /
> -                       job->job.progress_total * 100.f;
> +        if (job->job.progress.total) {
> +            progress = (float)job->job.progress.current /
> +                       job->job.progress.total * 100.f;
>           }
>           qemu_progress_print(progress, 0);
>       } while (!job_is_ready(&job->job) && !job_is_completed(&job->job));
> 

Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
-- 
With the best regards,
Andrey Shinkevich


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 1/9] job: refactor progress to separate object
  2020-03-06  7:38 ` [PATCH v3 1/9] job: refactor progress to separate object Vladimir Sementsov-Ogievskiy
  2020-03-10 12:22   ` Andrey Shinkevich
@ 2020-03-10 12:40   ` Max Reitz
  1 sibling, 0 replies; 25+ messages in thread
From: Max Reitz @ 2020-03-10 12:40 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: kwolf, andrey.shinkevich, jsnow, qemu-devel, qemu-stable


[-- Attachment #1.1: Type: text/plain, Size: 780 bytes --]

On 06.03.20 08:38, Vladimir Sementsov-Ogievskiy wrote:
> We need it as a separate object to pass to the block-copy object in the
> next commit.
> 
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
> ---
>  include/qemu/job.h            | 11 ++-----
>  include/qemu/progress_meter.h | 58 +++++++++++++++++++++++++++++++++++
>  blockjob.c                    | 16 +++++-----
>  job-qmp.c                     |  4 +--
>  job.c                         |  6 ++--
>  qemu-img.c                    |  6 ++--
>  6 files changed, 76 insertions(+), 25 deletions(-)
>  create mode 100644 include/qemu/progress_meter.h

Reviewed-by: Max Reitz <mreitz@redhat.com>


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 2/9] block/block-copy: fix progress calculation
  2020-03-06  7:38 ` [PATCH v3 2/9] block/block-copy: fix progress calculation Vladimir Sementsov-Ogievskiy
@ 2020-03-10 13:32   ` Max Reitz
  0 siblings, 0 replies; 25+ messages in thread
From: Max Reitz @ 2020-03-10 13:32 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: kwolf, andrey.shinkevich, jsnow, qemu-devel, qemu-stable


[-- Attachment #1.1: Type: text/plain, Size: 1700 bytes --]

On 06.03.20 08:38, Vladimir Sementsov-Ogievskiy wrote:
> Assume we have two regions, A and B, and region B is in-flight now,
> region A is not yet touched, but it is unallocated and should be
> skipped.
> 
> Correspondingly, as progress we have
> 
>   total = A + B
>   current = 0
> 
> If we reset unallocated region A and call progress_reset_callback,
> it will calculate 0 bytes dirty in the bitmap and call
> job_progress_set_remaining, which will set
> 
>    total = current + 0 = 0 + 0 = 0
> 
> So, B bytes are actually removed from total accounting. When job
> finishes we'll have
> 
>    total = 0
>    current = B
> 
> , which doesn't sound good.
> 
> This is because we didn't consider in-flight bytes: when calculating
> remaining, we should have set (in_flight + dirty_bytes)
> as remaining, not only dirty_bytes.
> 
> To fix it, let's refactor the progress calculation, moving it into
> block-copy itself instead of fixing the callback. And, of course, track
> the in-flight byte count.
> 
> We still have to keep one callback, to maintain the backup job's
> bytes_read calculation, but it will go away soon, when we turn the whole
> backup process into one block_copy call.
> 
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
> ---
>  include/block/block-copy.h | 14 +++++---------
>  block/backup.c             | 13 ++-----------
>  block/block-copy.c         | 16 ++++++++++++----
>  3 files changed, 19 insertions(+), 24 deletions(-)

Looks good, but I suppose we should also drop the
ProgressResetCallbackFunc type.

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 3/9] block/block-copy: specialcase first copy_range request
  2020-03-06  7:38 ` [PATCH v3 3/9] block/block-copy: specialcase first copy_range request Vladimir Sementsov-Ogievskiy
@ 2020-03-10 13:42   ` Max Reitz
  0 siblings, 0 replies; 25+ messages in thread
From: Max Reitz @ 2020-03-10 13:42 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: kwolf, andrey.shinkevich, jsnow, qemu-devel


[-- Attachment #1.1: Type: text/plain, Size: 1090 bytes --]

On 06.03.20 08:38, Vladimir Sementsov-Ogievskiy wrote:
> In block_copy_do_copy we fall back to read+write if copy_range fails.
> In this case copy_size is larger than defined for buffered IO, and
> there is a corresponding commit. Still, backup copies data cluster by
> cluster, and most requests are limited to one cluster anyway, so the
> only source of such a badly-limited request is the copy-before-write
> operation.
> 
> A further patch will move backup to use block_copy directly; then, for
> cases where copy_range is not supported, the first request would be
> oversized in each backup. That's not good; let's change it now.
> 
> The fix is simple: just limit the first copy_range request like a
> buffer-based request. If it succeeds, set the larger copy_range limit.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
> ---
>  block/block-copy.c | 41 +++++++++++++++++++++++++++++++----------
>  1 file changed, 31 insertions(+), 10 deletions(-)

Reviewed-by: Max Reitz <mreitz@redhat.com>


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 4/9] block/block-copy: use block_status
  2020-03-06  7:38 ` [PATCH v3 4/9] block/block-copy: use block_status Vladimir Sementsov-Ogievskiy
@ 2020-03-10 14:21   ` Max Reitz
  0 siblings, 0 replies; 25+ messages in thread
From: Max Reitz @ 2020-03-10 14:21 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: kwolf, andrey.shinkevich, jsnow, qemu-devel


[-- Attachment #1.1: Type: text/plain, Size: 842 bytes --]

On 06.03.20 08:38, Vladimir Sementsov-Ogievskiy wrote:
> Use bdrv_block_status_above to choose an effective chunk size and to
> handle zeroes effectively.
> 
> This replaces checking only whether data is allocated or not, and drops
> the old code path for it. Assistance by the backup job is dropped too,
> as caching block-status information is more difficult than just caching
> is-allocated information in our dirty bitmap, and the backup job is not
> a good place for this caching anyway.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
> ---
>  block/block-copy.c | 73 +++++++++++++++++++++++++++++++++++++---------
>  block/trace-events |  1 +
>  2 files changed, 61 insertions(+), 13 deletions(-)

Reviewed-by: Max Reitz <mreitz@redhat.com>


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 5/9] block/block-copy: factor out find_conflicting_inflight_req
  2020-03-06  7:38 ` [PATCH v3 5/9] block/block-copy: factor out find_conflicting_inflight_req Vladimir Sementsov-Ogievskiy
@ 2020-03-10 14:27   ` Max Reitz
  0 siblings, 0 replies; 25+ messages in thread
From: Max Reitz @ 2020-03-10 14:27 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: kwolf, andrey.shinkevich, jsnow, qemu-devel


[-- Attachment #1.1: Type: text/plain, Size: 433 bytes --]

On 06.03.20 08:38, Vladimir Sementsov-Ogievskiy wrote:
> Split out find_conflicting_inflight_req so that it can be used separately.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
> ---
>  block/block-copy.c | 31 +++++++++++++++++++------------
>  1 file changed, 19 insertions(+), 12 deletions(-)

Reviewed-by: Max Reitz <mreitz@redhat.com>


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 6/9] block/block-copy: refactor interfaces to use bytes instead of end
  2020-03-06  7:38 ` [PATCH v3 6/9] block/block-copy: refactor interfaces to use bytes instead of end Vladimir Sementsov-Ogievskiy
@ 2020-03-10 14:44   ` Max Reitz
  0 siblings, 0 replies; 25+ messages in thread
From: Max Reitz @ 2020-03-10 14:44 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: kwolf, andrey.shinkevich, jsnow, qemu-devel


[-- Attachment #1.1: Type: text/plain, Size: 706 bytes --]

On 06.03.20 08:38, Vladimir Sementsov-Ogievskiy wrote:
> We have a lot of "chunk_end - start" invocations; let's switch to a
> bytes/cur_bytes scheme instead.
> 
> While here, improve the check on block_copy_do_copy parameters so it
> does not overflow when calculating nbytes, and use int64_t for bytes in
> block_copy for consistency.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
> ---
>  include/block/block-copy.h |  6 +--
>  block/block-copy.c         | 78 ++++++++++++++++++++------------------
>  2 files changed, 44 insertions(+), 40 deletions(-)

Reviewed-by: Max Reitz <mreitz@redhat.com>


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 7/9] block/block-copy: rename start to offset in interfaces
  2020-03-06  7:38 ` [PATCH v3 7/9] block/block-copy: rename start to offset in interfaces Vladimir Sementsov-Ogievskiy
@ 2020-03-10 14:50   ` Max Reitz
  2020-03-10 14:55     ` Andrey Shinkevich
  0 siblings, 1 reply; 25+ messages in thread
From: Max Reitz @ 2020-03-10 14:50 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: kwolf, andrey.shinkevich, jsnow, qemu-devel


[-- Attachment #1.1: Type: text/plain, Size: 1159 bytes --]

On 06.03.20 08:38, Vladimir Sementsov-Ogievskiy wrote:
> The offset/bytes pair is the more usual naming in the block layer; let's
> use it.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
> Reviewed-by: Max Reitz <mreitz@redhat.com>
> ---
>  include/block/block-copy.h |  4 +-
>  block/block-copy.c         | 84 +++++++++++++++++++-------------------
>  2 files changed, 44 insertions(+), 44 deletions(-)

[...]

> diff --git a/block/block-copy.c b/block/block-copy.c
> index 4c947e548b..2b29131653 100644
> --- a/block/block-copy.c
> +++ b/block/block-copy.c
> @@ -25,13 +25,13 @@
>  #define BLOCK_COPY_MAX_MEM (128 * MiB)
>  
>  static BlockCopyInFlightReq *find_conflicting_inflight_req(BlockCopyState *s,
> -                                                           int64_t start,
> -                                                           int64_t bytes)
> +                                                          int64_t offset,
> +                                                          int64_t bytes)

The alignment’s off now.

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 7/9] block/block-copy: rename start to offset in interfaces
  2020-03-10 14:50   ` Max Reitz
@ 2020-03-10 14:55     ` Andrey Shinkevich
  2020-03-10 15:14       ` Max Reitz
  0 siblings, 1 reply; 25+ messages in thread
From: Andrey Shinkevich @ 2020-03-10 14:55 UTC (permalink / raw)
  To: Max Reitz, Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: kwolf, jsnow, qemu-devel



On 10/03/2020 17:50, Max Reitz wrote:
> On 06.03.20 08:38, Vladimir Sementsov-Ogievskiy wrote:
>> offset/bytes pair is more usual naming in block layer, let's use it.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
>> Reviewed-by: Max Reitz <mreitz@redhat.com>
>> ---
>>   include/block/block-copy.h |  4 +-
>>   block/block-copy.c         | 84 +++++++++++++++++++-------------------
>>   2 files changed, 44 insertions(+), 44 deletions(-)
> 
> [...]
> 
>> diff --git a/block/block-copy.c b/block/block-copy.c
>> index 4c947e548b..2b29131653 100644
>> --- a/block/block-copy.c
>> +++ b/block/block-copy.c
>> @@ -25,13 +25,13 @@
>>   #define BLOCK_COPY_MAX_MEM (128 * MiB)
>>   
>>   static BlockCopyInFlightReq *find_conflicting_inflight_req(BlockCopyState *s,
>> -                                                           int64_t start,
>> -                                                           int64_t bytes)
>> +                                                          int64_t offset,
>> +                                                          int64_t bytes)
> 
> The alignment’s off now.
> 
> Max
> 

After applying the patch, it looks aligned in my vim editor.
-- 
With the best regards,
Andrey Shinkevich




* Re: [PATCH v3 9/9] block/block-copy: hide structure definitions
  2020-03-06  7:38 ` [PATCH v3 9/9] block/block-copy: hide structure definitions Vladimir Sementsov-Ogievskiy
@ 2020-03-10 14:55   ` Andrey Shinkevich
  2020-03-10 15:38   ` Max Reitz
  1 sibling, 0 replies; 25+ messages in thread
From: Andrey Shinkevich @ 2020-03-10 14:55 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block; +Cc: kwolf, jsnow, qemu-devel, mreitz



On 06/03/2020 10:38, Vladimir Sementsov-Ogievskiy wrote:
> Hide structure definitions and add explicit API instead, to keep an
> eye on the scope of the shared fields.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy<vsementsov@virtuozzo.com>
> ---
>   include/block/block-copy.h | 52 +++----------------------------
>   block/backup-top.c         |  6 ++--
>   block/backup.c             | 25 +++++++--------
>   block/block-copy.c         | 63 ++++++++++++++++++++++++++++++++++++--
>   4 files changed, 82 insertions(+), 64 deletions(-)
> 
> diff --git a/include/block/block-copy.h b/include/block/block-copy.h
> index c46b382cc6..9718c241c9 100644
> --- a/include/block/block-copy.h
> +++ b/include/block/block-copy.h

Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
-- 
With the best regards,
Andrey Shinkevich



* Re: [PATCH v3 7/9] block/block-copy: rename start to offset in interfaces
  2020-03-10 14:55     ` Andrey Shinkevich
@ 2020-03-10 15:14       ` Max Reitz
  2020-03-10 16:15         ` Andrey Shinkevich
  0 siblings, 1 reply; 25+ messages in thread
From: Max Reitz @ 2020-03-10 15:14 UTC (permalink / raw)
  To: Andrey Shinkevich, Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: kwolf, jsnow, qemu-devel



On 10.03.20 15:55, Andrey Shinkevich wrote:
> 
> 
> On 10/03/2020 17:50, Max Reitz wrote:
>> On 06.03.20 08:38, Vladimir Sementsov-Ogievskiy wrote:
>>> offset/bytes pair is more usual naming in block layer, let's use it.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
>>> Reviewed-by: Max Reitz <mreitz@redhat.com>
>>> ---
>>>   include/block/block-copy.h |  4 +-
>>>   block/block-copy.c         | 84 +++++++++++++++++++-------------------
>>>   2 files changed, 44 insertions(+), 44 deletions(-)
>>
>> [...]
>>
>>> diff --git a/block/block-copy.c b/block/block-copy.c
>>> index 4c947e548b..2b29131653 100644
>>> --- a/block/block-copy.c
>>> +++ b/block/block-copy.c
>>> @@ -25,13 +25,13 @@
>>>   #define BLOCK_COPY_MAX_MEM (128 * MiB)
>>>     static BlockCopyInFlightReq
>>> *find_conflicting_inflight_req(BlockCopyState *s,
>>> -                                                           int64_t
>>> start,
>>> -                                                           int64_t
>>> bytes)
>>> +                                                          int64_t
>>> offset,
>>> +                                                          int64_t
>>> bytes)
>>
>> The alignment’s off now.
>>
>> Max
>>
> 
> After applying the patch, it looks aligned in my vim editor.

I did apply it and it wasn’t aligned for me.

We (Red Hat) have had a new mailing agent for a couple of months now that
for some reason likes to change incoming mails’ encodings (in this case
from, I presume, 8bit to quoted-printable), so I have to use a script to
translate them back.  But judging from what I can see in the archive:

https://lists.nongnu.org/archive/html/qemu-block/2020-03/msg00196.html

the alignment is indeed off.  Otherwise, the second line (with the
@bytes parameter) would not be changed.

Max




* Re: [PATCH v3 8/9] block/block-copy: reduce intersecting request lock
  2020-03-06  7:38 ` [PATCH v3 8/9] block/block-copy: reduce intersecting request lock Vladimir Sementsov-Ogievskiy
@ 2020-03-10 15:32   ` Max Reitz
  2020-03-11  9:39     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 25+ messages in thread
From: Max Reitz @ 2020-03-10 15:32 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: kwolf, andrey.shinkevich, jsnow, qemu-devel



On 06.03.20 08:38, Vladimir Sementsov-Ogievskiy wrote:
> Currently, block_copy operation lock the whole requested region. But
> there is no reason to lock clusters, which are already copied, it will
> disturb other parallel block_copy requests for no reason.
> 
> Let's instead do the following:
> 
> Lock only sub-region, which we are going to operate on. Then, after
> copying all dirty sub-regions, we should wait for intersecting
> requests block-copy, if they failed, we should retry these new dirty
> clusters.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
> ---
>  block/block-copy.c | 128 ++++++++++++++++++++++++++++++++++++---------
>  1 file changed, 104 insertions(+), 24 deletions(-)
> 
> diff --git a/block/block-copy.c b/block/block-copy.c
> index 2b29131653..d66b8eb691 100644
> --- a/block/block-copy.c
> +++ b/block/block-copy.c

[...]

> +/* Called only on full-dirty region */
>  static void block_copy_inflight_req_begin(BlockCopyState *s,
>                                            BlockCopyInFlightReq *req,
>                                            int64_t offset, int64_t bytes)
>  {
> +    assert(!find_conflicting_inflight_req(s, offset, bytes));
> +
> +    bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, bytes);
> +    s->in_flight_bytes += bytes;
> +
>      req->offset = offset;
>      req->bytes = bytes;
>      qemu_co_queue_init(&req->wait_queue);
>      QLIST_INSERT_HEAD(&s->inflight_reqs, req, list);
>  }
>  
> -static void coroutine_fn block_copy_inflight_req_end(BlockCopyInFlightReq *req)
> +/*
> + * block_copy_inflight_req_shrink
> + *
> + * Drop the tail of the request to be handled later. Set dirty bits back and
> + * wake up all requests waiting for us (may be some of them are not intersecting
> + * with shrunk request)
> + */
> +static void coroutine_fn block_copy_inflight_req_shrink(BlockCopyState *s,
> +        BlockCopyInFlightReq *req, int64_t new_bytes)
>  {
> +    if (new_bytes == req->bytes) {
> +        return;
> +    }
> +
> +    assert(new_bytes > 0 && new_bytes < req->bytes);
> +
> +    bdrv_set_dirty_bitmap(s->copy_bitmap,
> +                          req->offset + new_bytes, req->bytes - new_bytes);

I think we need to reduce in_flight_bytes here.

> +
> +    req->bytes = new_bytes;
> +    qemu_co_queue_restart_all(&req->wait_queue);
> +}
> +
> +static void coroutine_fn block_copy_inflight_req_end(BlockCopyState *s,
> +                                                     BlockCopyInFlightReq *req,
> +                                                     int ret)
> +{
> +    s->in_flight_bytes -= req->bytes;
> +    if (ret < 0) {
> +        bdrv_set_dirty_bitmap(s->copy_bitmap, req->offset, req->bytes);
> +    }
>      QLIST_REMOVE(req, list);
>      qemu_co_queue_restart_all(&req->wait_queue);
>  }

[...]

> @@ -432,7 +479,40 @@ int coroutine_fn block_copy(BlockCopyState *s,
>          bytes -= cur_bytes;
>      }
>  
> -    block_copy_inflight_req_end(&req);
> +    return found_dirty;
> +}
> +
> +/*
> + * block_copy
> + *
> + * Copy requested region, accordingly to dirty bitmap.
> + * Collaborate with parallel block_copy requests: if they success it help us. If

s/success/succeed/, s/it help/it will help/

> + * they fail, we retry not-copied regions. So, if we return error, it means that

s/retry/will retry/

(In theory also s/it means/it will mean/, but I suppose that also works
as-is.)

> + * io operation failed in context of _this_ block_copy call, not some parallel

Perhaps rather “some I/O operation failed in the context of […]”?

> + * operation.
> + */
> +int coroutine_fn block_copy(BlockCopyState *s, int64_t offset, int64_t bytes,
> +                            bool *error_is_read)
> +{
> +    int ret;
> +
> +    do {
> +        ret = block_copy_dirty_clusters(s, offset, bytes, error_is_read);
> +
> +        if (ret == 0) {
> +            ret = block_copy_wait_one(s, offset, bytes);
> +        }
> +
> +        /*
> +         * We retry in two cases:
> +         * 1. Some progress done
> +         *    Something was copied, which means that there were yield points
> +         *    and some new dirty bits may have appeared (due to failed parallel
> +         *    block-copy requests).
> +         * 2. We have waited for some intersecting block-copy request
> +         *    It may have failed and produced new dirty bits.
> +         */
> +    } while (ret > 0);
>  
>      return ret;
>  }

This new code looks good.

Max




* Re: [PATCH v3 9/9] block/block-copy: hide structure definitions
  2020-03-06  7:38 ` [PATCH v3 9/9] block/block-copy: hide structure definitions Vladimir Sementsov-Ogievskiy
  2020-03-10 14:55   ` Andrey Shinkevich
@ 2020-03-10 15:38   ` Max Reitz
  1 sibling, 0 replies; 25+ messages in thread
From: Max Reitz @ 2020-03-10 15:38 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: kwolf, andrey.shinkevich, jsnow, qemu-devel



On 06.03.20 08:38, Vladimir Sementsov-Ogievskiy wrote:
> Hide structure definitions and add explicit API instead, to keep an
> eye on the scope of the shared fields.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  include/block/block-copy.h | 52 +++----------------------------
>  block/backup-top.c         |  6 ++--
>  block/backup.c             | 25 +++++++--------
>  block/block-copy.c         | 63 ++++++++++++++++++++++++++++++++++++--
>  4 files changed, 82 insertions(+), 64 deletions(-)

[...]

> diff --git a/block/block-copy.c b/block/block-copy.c
> index d66b8eb691..a2d8579ca0 100644
> --- a/block/block-copy.c
> +++ b/block/block-copy.c

[...]

>  static BlockCopyInFlightReq *find_conflicting_inflight_req(BlockCopyState *s,
> -                                                          int64_t offset,
> -                                                          int64_t bytes)
> +                                                           int64_t offset,
> +                                                           int64_t bytes)

Now I see why Andrey says that the alignment fits; this should be
squashed into patch 7.

With that done:

Reviewed-by: Max Reitz <mreitz@redhat.com>




* Re: [PATCH v3 7/9] block/block-copy: rename start to offset in interfaces
  2020-03-10 15:14       ` Max Reitz
@ 2020-03-10 16:15         ` Andrey Shinkevich
  0 siblings, 0 replies; 25+ messages in thread
From: Andrey Shinkevich @ 2020-03-10 16:15 UTC (permalink / raw)
  To: Max Reitz, Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: kwolf, jsnow, qemu-devel



On 10/03/2020 18:14, Max Reitz wrote:
> On 10.03.20 15:55, Andrey Shinkevich wrote:
>>
>>
>> On 10/03/2020 17:50, Max Reitz wrote:
>>> On 06.03.20 08:38, Vladimir Sementsov-Ogievskiy wrote:
>>>> offset/bytes pair is more usual naming in block layer, let's use it.
>>>>
>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>>> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
>>>> Reviewed-by: Max Reitz <mreitz@redhat.com>
>>>> ---
>>>>    include/block/block-copy.h |  4 +-
>>>>    block/block-copy.c         | 84 +++++++++++++++++++-------------------
>>>>    2 files changed, 44 insertions(+), 44 deletions(-)
>>>
>>> [...]
>>>
>>>> diff --git a/block/block-copy.c b/block/block-copy.c
>>>> index 4c947e548b..2b29131653 100644
>>>> --- a/block/block-copy.c
>>>> +++ b/block/block-copy.c
>>>> @@ -25,13 +25,13 @@
>>>>    #define BLOCK_COPY_MAX_MEM (128 * MiB)
>>>>      static BlockCopyInFlightReq
>>>> *find_conflicting_inflight_req(BlockCopyState *s,
>>>> -                                                           int64_t
>>>> start,
>>>> -                                                           int64_t
>>>> bytes)
>>>> +                                                          int64_t
>>>> offset,
>>>> +                                                          int64_t
>>>> bytes)
>>>
>>> The alignment’s off now.
>>>
>>> Max
>>>
>>
>> After applying the patch, it looks aligned in my vim editor.
> 
> I did apply it and it wasn’t aligned for me.
> 
> We (Red Hat) have had a new mailing agent for a couple of months now that
> for some reason likes to change incoming mails’ encodings (in this case
> from, I presume, 8bit to quoted-printable), so I have to use a script to
> translate them back.  But judging from what I can see in the archive:
> 
> https://lists.nongnu.org/archive/html/qemu-block/2020-03/msg00196.html
> 
> the alignment is indeed off.  Otherwise, the second line (with the
> @bytes parameter) would not be changed.
> 
> Max
> 

Thank you Max for noticing that.

Andrey
-- 
With the best regards,
Andrey Shinkevich




* Re: [PATCH v3 8/9] block/block-copy: reduce intersecting request lock
  2020-03-10 15:32   ` Max Reitz
@ 2020-03-11  9:39     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 25+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-03-11  9:39 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: kwolf, andrey.shinkevich, jsnow, qemu-devel

10.03.2020 18:32, Max Reitz wrote:
> On 06.03.20 08:38, Vladimir Sementsov-Ogievskiy wrote:
>> Currently, block_copy operation lock the whole requested region. But
>> there is no reason to lock clusters, which are already copied, it will
>> disturb other parallel block_copy requests for no reason.
>>
>> Let's instead do the following:
>>
>> Lock only sub-region, which we are going to operate on. Then, after
>> copying all dirty sub-regions, we should wait for intersecting
>> requests block-copy, if they failed, we should retry these new dirty
>> clusters.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
>> ---
>>   block/block-copy.c | 128 ++++++++++++++++++++++++++++++++++++---------
>>   1 file changed, 104 insertions(+), 24 deletions(-)
>>
>> diff --git a/block/block-copy.c b/block/block-copy.c
>> index 2b29131653..d66b8eb691 100644
>> --- a/block/block-copy.c
>> +++ b/block/block-copy.c
> 
> [...]
> 
>> +/* Called only on full-dirty region */
>>   static void block_copy_inflight_req_begin(BlockCopyState *s,
>>                                             BlockCopyInFlightReq *req,
>>                                             int64_t offset, int64_t bytes)
>>   {
>> +    assert(!find_conflicting_inflight_req(s, offset, bytes));
>> +
>> +    bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, bytes);
>> +    s->in_flight_bytes += bytes;
>> +
>>       req->offset = offset;
>>       req->bytes = bytes;
>>       qemu_co_queue_init(&req->wait_queue);
>>       QLIST_INSERT_HEAD(&s->inflight_reqs, req, list);
>>   }
>>   
>> -static void coroutine_fn block_copy_inflight_req_end(BlockCopyInFlightReq *req)
>> +/*
>> + * block_copy_inflight_req_shrink
>> + *
>> + * Drop the tail of the request to be handled later. Set dirty bits back and
>> + * wake up all requests waiting for us (may be some of them are not intersecting
>> + * with shrunk request)
>> + */
>> +static void coroutine_fn block_copy_inflight_req_shrink(BlockCopyState *s,
>> +        BlockCopyInFlightReq *req, int64_t new_bytes)
>>   {
>> +    if (new_bytes == req->bytes) {
>> +        return;
>> +    }
>> +
>> +    assert(new_bytes > 0 && new_bytes < req->bytes);
>> +
>> +    bdrv_set_dirty_bitmap(s->copy_bitmap,
>> +                          req->offset + new_bytes, req->bytes - new_bytes);
> 
> I think we need to reduce in_flight_bytes here.

Yes, you are right, thanks!

> 
>> +
>> +    req->bytes = new_bytes;
>> +    qemu_co_queue_restart_all(&req->wait_queue);
>> +}
>> +
>> +static void coroutine_fn block_copy_inflight_req_end(BlockCopyState *s,
>> +                                                     BlockCopyInFlightReq *req,
>> +                                                     int ret)
>> +{
>> +    s->in_flight_bytes -= req->bytes;
>> +    if (ret < 0) {
>> +        bdrv_set_dirty_bitmap(s->copy_bitmap, req->offset, req->bytes);
>> +    }
>>       QLIST_REMOVE(req, list);
>>       qemu_co_queue_restart_all(&req->wait_queue);
>>   }
> 
> [...]
> 
>> @@ -432,7 +479,40 @@ int coroutine_fn block_copy(BlockCopyState *s,
>>           bytes -= cur_bytes;
>>       }
>>   
>> -    block_copy_inflight_req_end(&req);
>> +    return found_dirty;
>> +}
>> +
>> +/*
>> + * block_copy
>> + *
>> + * Copy requested region, accordingly to dirty bitmap.
>> + * Collaborate with parallel block_copy requests: if they success it help us. If
> 
> s/success/succeed/, s/it help/it will help/
> 
>> + * they fail, we retry not-copied regions. So, if we return error, it means that
> 
> s/retry/will retry/
> 
> (In theory also s/it means/it will mean/, but I suppose that also works
> as-is.)
> 
>> + * io operation failed in context of _this_ block_copy call, not some parallel
> 
> Perhaps rather “some I/O operation failed in the context of […]”?
> 
>> + * operation.
>> + */
>> +int coroutine_fn block_copy(BlockCopyState *s, int64_t offset, int64_t bytes,
>> +                            bool *error_is_read)
>> +{
>> +    int ret;
>> +
>> +    do {
>> +        ret = block_copy_dirty_clusters(s, offset, bytes, error_is_read);
>> +
>> +        if (ret == 0) {
>> +            ret = block_copy_wait_one(s, offset, bytes);
>> +        }
>> +
>> +        /*
>> +         * We retry in two cases:
>> +         * 1. Some progress done
>> +         *    Something was copied, which means that there were yield points
>> +         *    and some new dirty bits may have appeared (due to failed parallel
>> +         *    block-copy requests).
>> +         * 2. We have waited for some intersecting block-copy request
>> +         *    It may have failed and produced new dirty bits.
>> +         */
>> +    } while (ret > 0);
>>   
>>       return ret;
>>   }
> 
> This new code looks good.
> 
> Max
> 


-- 
Best regards,
Vladimir



end of thread, other threads:[~2020-03-11  9:41 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-06  7:38 [PATCH v3 0/9] block-copy improvements: part I Vladimir Sementsov-Ogievskiy
2020-03-06  7:38 ` [PATCH v3 1/9] job: refactor progress to separate object Vladimir Sementsov-Ogievskiy
2020-03-10 12:22   ` Andrey Shinkevich
2020-03-10 12:40   ` Max Reitz
2020-03-06  7:38 ` [PATCH v3 2/9] block/block-copy: fix progress calculation Vladimir Sementsov-Ogievskiy
2020-03-10 13:32   ` Max Reitz
2020-03-06  7:38 ` [PATCH v3 3/9] block/block-copy: specialcase first copy_range request Vladimir Sementsov-Ogievskiy
2020-03-10 13:42   ` Max Reitz
2020-03-06  7:38 ` [PATCH v3 4/9] block/block-copy: use block_status Vladimir Sementsov-Ogievskiy
2020-03-10 14:21   ` Max Reitz
2020-03-06  7:38 ` [PATCH v3 5/9] block/block-copy: factor out find_conflicting_inflight_req Vladimir Sementsov-Ogievskiy
2020-03-10 14:27   ` Max Reitz
2020-03-06  7:38 ` [PATCH v3 6/9] block/block-copy: refactor interfaces to use bytes instead of end Vladimir Sementsov-Ogievskiy
2020-03-10 14:44   ` Max Reitz
2020-03-06  7:38 ` [PATCH v3 7/9] block/block-copy: rename start to offset in interfaces Vladimir Sementsov-Ogievskiy
2020-03-10 14:50   ` Max Reitz
2020-03-10 14:55     ` Andrey Shinkevich
2020-03-10 15:14       ` Max Reitz
2020-03-10 16:15         ` Andrey Shinkevich
2020-03-06  7:38 ` [PATCH v3 8/9] block/block-copy: reduce intersecting request lock Vladimir Sementsov-Ogievskiy
2020-03-10 15:32   ` Max Reitz
2020-03-11  9:39     ` Vladimir Sementsov-Ogievskiy
2020-03-06  7:38 ` [PATCH v3 9/9] block/block-copy: hide structure definitions Vladimir Sementsov-Ogievskiy
2020-03-10 14:55   ` Andrey Shinkevich
2020-03-10 15:38   ` Max Reitz
