* [PATCH v4 0/6] block-copy: protect block-copy internal structures
@ 2021-06-14  7:33 Emanuele Giuseppe Esposito
  2021-06-14  7:33 ` [PATCH v4 1/6] block-copy: small refactor in block_copy_task_entry and block_copy_common Emanuele Giuseppe Esposito
                   ` (5 more replies)
  0 siblings, 6 replies; 30+ messages in thread
From: Emanuele Giuseppe Esposito @ 2021-06-14  7:33 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Emanuele Giuseppe Esposito,
	Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz,
	Stefan Hajnoczi, Paolo Bonzini, John Snow

This series of patches aims to reduce the use of the
AioContext lock in block-copy by introducing finer-grained
locks, as a step toward making the block layer thread-safe.

This series depends on my previous series that brings thread safety
to the smaller APIs used by block-copy: ratelimit, progressmeter
and co-shared-resource.

What's missing for block-copy to be fully thread-safe is fixing
the CoSleep API to allow cross-thread sleep and wakeup.
Paolo is working on it.

Patch 1 provides a small refactoring, and patch 2 introduces the
.method field in BlockCopyState, to be used instead of .use_copy_range,
.copy_size and .zeroes.
Patches 3-4 provide comments and refactoring in preparation for
the lock added in patch 5 on BlockCopyTask, BlockCopyCallState and
BlockCopyState. Patch 6 uses load_acquire/store_release to make sure
the BlockCopyCallState OUT fields are updated before .finished is set
to true.

Based-on: <20210518094058.25952-1-eesposit@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
v4:
* Introduce patch 1 (refactoring)
* Adjust to Vladimir's style comments in patch 2 [Paolo, Vladimir]
* Extend the lock to cover an additional race condition in
  block_copy_common [Vladimir]

Emanuele Giuseppe Esposito (5):
  block-copy: small refactor in block_copy_task_entry and
    block_copy_common
  block-copy: improve comments of BlockCopyTask and BlockCopyState types
    and functions
  block-copy: move progress_set_remaining in block_copy_task_end
  block-copy: add a CoMutex
  block-copy: atomic .cancelled and .finished fields in
    BlockCopyCallState

Paolo Bonzini (1):
  block-copy: streamline choice of copy_range vs. read/write

 block/block-copy.c         | 327 ++++++++++++++++++++++---------------
 include/block/block-copy.h |   2 +
 2 files changed, 197 insertions(+), 132 deletions(-)

-- 
2.31.1




* [PATCH v4 1/6] block-copy: small refactor in block_copy_task_entry and block_copy_common
  2021-06-14  7:33 [PATCH v4 0/6] block-copy: protect block-copy internal structures Emanuele Giuseppe Esposito
@ 2021-06-14  7:33 ` Emanuele Giuseppe Esposito
  2021-06-19 14:33   ` Vladimir Sementsov-Ogievskiy
  2021-06-14  7:33 ` [PATCH v4 2/6] block-copy: streamline choice of copy_range vs. read/write Emanuele Giuseppe Esposito
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 30+ messages in thread
From: Emanuele Giuseppe Esposito @ 2021-06-14  7:33 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Emanuele Giuseppe Esposito,
	Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz,
	Stefan Hajnoczi, Paolo Bonzini, John Snow

Use a local variable instead of referencing BlockCopyState through a
BlockCopyCallState or BlockCopyTask every time.
This is in preparation for the next patches.

No functional change intended.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
 block/block-copy.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/block/block-copy.c b/block/block-copy.c
index 943e30b7e6..f0dbb4912b 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -452,14 +452,15 @@ static void block_copy_handle_copy_range_result(BlockCopyState *s,
 static coroutine_fn int block_copy_task_entry(AioTask *task)
 {
     BlockCopyTask *t = container_of(task, BlockCopyTask, task);
+    BlockCopyState *s = t->s;
     bool error_is_read = false;
     bool copy_range = t->copy_range;
     int ret;
 
-    ret = block_copy_do_copy(t->s, t->offset, t->bytes, t->zeroes,
+    ret = block_copy_do_copy(s, t->offset, t->bytes, t->zeroes,
                              &copy_range, &error_is_read);
     if (t->copy_range) {
-        block_copy_handle_copy_range_result(t->s, copy_range);
+        block_copy_handle_copy_range_result(s, copy_range);
     }
     if (ret < 0) {
         if (!t->call_state->ret) {
@@ -467,9 +468,9 @@ static coroutine_fn int block_copy_task_entry(AioTask *task)
             t->call_state->error_is_read = error_is_read;
         }
     } else {
-        progress_work_done(t->s->progress, t->bytes);
+        progress_work_done(s->progress, t->bytes);
     }
-    co_put_to_shres(t->s->mem, t->bytes);
+    co_put_to_shres(s->mem, t->bytes);
     block_copy_task_end(t, ret);
 
     return ret;
@@ -714,14 +715,15 @@ void block_copy_kick(BlockCopyCallState *call_state)
 static int coroutine_fn block_copy_common(BlockCopyCallState *call_state)
 {
     int ret;
+    BlockCopyState *s = call_state->s;
 
-    QLIST_INSERT_HEAD(&call_state->s->calls, call_state, list);
+    QLIST_INSERT_HEAD(&s->calls, call_state, list);
 
     do {
         ret = block_copy_dirty_clusters(call_state);
 
         if (ret == 0 && !call_state->cancelled) {
-            ret = block_copy_wait_one(call_state->s, call_state->offset,
+            ret = block_copy_wait_one(s, call_state->offset,
                                       call_state->bytes);
         }
 
-- 
2.31.1




* [PATCH v4 2/6] block-copy: streamline choice of copy_range vs. read/write
  2021-06-14  7:33 [PATCH v4 0/6] block-copy: protect block-copy internal structures Emanuele Giuseppe Esposito
  2021-06-14  7:33 ` [PATCH v4 1/6] block-copy: small refactor in block_copy_task_entry and block_copy_common Emanuele Giuseppe Esposito
@ 2021-06-14  7:33 ` Emanuele Giuseppe Esposito
  2021-06-19 15:05   ` Vladimir Sementsov-Ogievskiy
  2021-06-19 18:23   ` Vladimir Sementsov-Ogievskiy
  2021-06-14  7:33 ` [PATCH v4 3/6] block-copy: improve comments of BlockCopyTask and BlockCopyState types and functions Emanuele Giuseppe Esposito
                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 30+ messages in thread
From: Emanuele Giuseppe Esposito @ 2021-06-14  7:33 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Emanuele Giuseppe Esposito,
	Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz,
	Stefan Hajnoczi, Paolo Bonzini, John Snow

From: Paolo Bonzini <pbonzini@redhat.com>

Put the logic to determine the copy size in a separate function, so
that there is a simple state machine for the possible methods of
copying data from one BlockDriverState to the other.

Use .method instead of .copy_range as the in-out argument, and
also include .zeroes as an additional copy method.

While at it, store the common computation of block_copy_max_transfer
into a new field of BlockCopyState, and make sure that we always
obey max_transfer; that's more efficient even for the
COPY_RANGE_READ_WRITE case.
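
For illustration (not part of the patch): the resulting state machine is
small enough to sketch standalone. A minimal compilable C sketch of the
transitions; next_method() is an illustrative helper, not a function from
the patch, and the size thresholds and actual I/O are elided:

#include <stdio.h>

typedef enum {
    COPY_READ_WRITE_CLUSTER, /* read+write, one cluster at a time */
    COPY_READ_WRITE,         /* read+write with the default buffer size */
    COPY_WRITE_ZEROES,       /* the area is zero, just write zeroes */
    COPY_RANGE_SMALL,        /* copy_range with small chunks until it works */
    COPY_RANGE_FULL          /* copy_range with full-size chunks */
} BlockCopyMethod;

/* Transition taken after one copy attempt (sketch of block_copy_do_copy). */
static BlockCopyMethod next_method(BlockCopyMethod m, int copy_range_ok)
{
    switch (m) {
    case COPY_RANGE_SMALL:
    case COPY_RANGE_FULL:
        /* Grow to full-size chunks on success, fall back on failure. */
        return copy_range_ok ? COPY_RANGE_FULL : COPY_READ_WRITE;
    default:
        /* The read/write and write-zeroes states are stable. */
        return m;
    }
}

int main(void)
{
    BlockCopyMethod m = COPY_RANGE_SMALL;
    m = next_method(m, 1);                 /* first copy_range succeeded */
    printf("%d\n", m == COPY_RANGE_FULL);  /* prints 1 */
    return 0;
}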

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block/block-copy.c | 174 +++++++++++++++++++++++----------------------
 1 file changed, 89 insertions(+), 85 deletions(-)

diff --git a/block/block-copy.c b/block/block-copy.c
index f0dbb4912b..3f26be8ddc 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -28,6 +28,14 @@
 #define BLOCK_COPY_MAX_WORKERS 64
 #define BLOCK_COPY_SLICE_TIME 100000000ULL /* ns */
 
+typedef enum {
+    COPY_READ_WRITE_CLUSTER,
+    COPY_READ_WRITE,
+    COPY_WRITE_ZEROES,
+    COPY_RANGE_SMALL,
+    COPY_RANGE_FULL
+} BlockCopyMethod;
+
 static coroutine_fn int block_copy_task_entry(AioTask *task);
 
 typedef struct BlockCopyCallState {
@@ -64,8 +72,7 @@ typedef struct BlockCopyTask {
     BlockCopyCallState *call_state;
     int64_t offset;
     int64_t bytes;
-    bool zeroes;
-    bool copy_range;
+    BlockCopyMethod method;
     QLIST_ENTRY(BlockCopyTask) list;
     CoQueue wait_queue; /* coroutines blocked on this task */
 } BlockCopyTask;
@@ -86,8 +93,8 @@ typedef struct BlockCopyState {
     BdrvDirtyBitmap *copy_bitmap;
     int64_t in_flight_bytes;
     int64_t cluster_size;
-    bool use_copy_range;
-    int64_t copy_size;
+    BlockCopyMethod method;
+    int64_t max_transfer;
     uint64_t len;
     QLIST_HEAD(, BlockCopyTask) tasks; /* All tasks from all block-copy calls */
     QLIST_HEAD(, BlockCopyCallState) calls;
@@ -149,6 +156,24 @@ static bool coroutine_fn block_copy_wait_one(BlockCopyState *s, int64_t offset,
     return true;
 }
 
+static int64_t block_copy_chunk_size(BlockCopyState *s)
+{
+    switch (s->method) {
+    case COPY_READ_WRITE_CLUSTER:
+        return s->cluster_size;
+    case COPY_READ_WRITE:
+    case COPY_RANGE_SMALL:
+        return MIN(MAX(s->cluster_size, BLOCK_COPY_MAX_BUFFER),
+                   s->max_transfer);
+    case COPY_RANGE_FULL:
+        return MIN(MAX(s->cluster_size, BLOCK_COPY_MAX_COPY_RANGE),
+                   s->max_transfer);
+    default:
+        /* Cannot have COPY_WRITE_ZEROES here.  */
+        abort();
+    }
+}
+
 /*
  * Search for the first dirty area in offset/bytes range and create task at
  * the beginning of it.
@@ -158,8 +183,9 @@ static BlockCopyTask *block_copy_task_create(BlockCopyState *s,
                                              int64_t offset, int64_t bytes)
 {
     BlockCopyTask *task;
-    int64_t max_chunk = MIN_NON_ZERO(s->copy_size, call_state->max_chunk);
+    int64_t max_chunk = block_copy_chunk_size(s);
 
+    max_chunk = MIN_NON_ZERO(max_chunk, call_state->max_chunk);
     if (!bdrv_dirty_bitmap_next_dirty_area(s->copy_bitmap,
                                            offset, offset + bytes,
                                            max_chunk, &offset, &bytes))
@@ -183,7 +209,7 @@ static BlockCopyTask *block_copy_task_create(BlockCopyState *s,
         .call_state = call_state,
         .offset = offset,
         .bytes = bytes,
-        .copy_range = s->use_copy_range,
+        .method = s->method,
     };
     qemu_co_queue_init(&task->wait_queue);
     QLIST_INSERT_HEAD(&s->tasks, task, list);
@@ -267,28 +293,28 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
         .len = bdrv_dirty_bitmap_size(copy_bitmap),
         .write_flags = write_flags,
         .mem = shres_create(BLOCK_COPY_MAX_MEM),
+        .max_transfer = QEMU_ALIGN_DOWN(
+                                    block_copy_max_transfer(source, target),
+                                    cluster_size),
     };
 
-    if (block_copy_max_transfer(source, target) < cluster_size) {
+    if (s->max_transfer < cluster_size) {
         /*
          * copy_range does not respect max_transfer. We don't want to bother
          * with requests smaller than block-copy cluster size, so fallback to
          * buffered copying (read and write respect max_transfer on their
          * behalf).
          */
-        s->use_copy_range = false;
-        s->copy_size = cluster_size;
+        s->method = COPY_READ_WRITE_CLUSTER;
     } else if (write_flags & BDRV_REQ_WRITE_COMPRESSED) {
         /* Compression supports only cluster-size writes and no copy-range. */
-        s->use_copy_range = false;
-        s->copy_size = cluster_size;
+        s->method = COPY_READ_WRITE_CLUSTER;
     } else {
         /*
          * We enable copy-range, but keep small copy_size, until first
          * successful copy_range (look at block_copy_do_copy).
          */
-        s->use_copy_range = use_copy_range;
-        s->copy_size = MAX(s->cluster_size, BLOCK_COPY_MAX_BUFFER);
+        s->method = use_copy_range ? COPY_RANGE_SMALL : COPY_READ_WRITE;
     }
 
     ratelimit_init(&s->rate_limit);
@@ -343,17 +369,14 @@ static coroutine_fn int block_copy_task_run(AioTaskPool *pool,
  *
  * No sync here: nor bitmap neighter intersecting requests handling, only copy.
  *
- * @copy_range is an in-out argument: if *copy_range is false, copy_range is not
- * done. If *copy_range is true, copy_range is attempted. If the copy_range
- * attempt fails, the function falls back to the usual read+write and
- * *copy_range is set to false. *copy_range and zeroes must not be true
- * simultaneously.
- *
+ * @method is an in-out argument, so that copy_range can be either extended to
+ * a full-size buffer or disabled if the copy_range attempt fails.  The output
+ * value of @method should be used for subsequent tasks.
  * Returns 0 on success.
  */
 static int coroutine_fn block_copy_do_copy(BlockCopyState *s,
                                            int64_t offset, int64_t bytes,
-                                           bool zeroes, bool *copy_range,
+                                           BlockCopyMethod *method,
                                            bool *error_is_read)
 {
     int ret;
@@ -367,9 +390,9 @@ static int coroutine_fn block_copy_do_copy(BlockCopyState *s,
     assert(offset + bytes <= s->len ||
            offset + bytes == QEMU_ALIGN_UP(s->len, s->cluster_size));
     assert(nbytes < INT_MAX);
-    assert(!(*copy_range && zeroes));
 
-    if (zeroes) {
+    switch (*method) {
+    case COPY_WRITE_ZEROES:
         ret = bdrv_co_pwrite_zeroes(s->target, offset, nbytes, s->write_flags &
                                     ~BDRV_REQ_WRITE_COMPRESSED);
         if (ret < 0) {
@@ -377,76 +400,59 @@ static int coroutine_fn block_copy_do_copy(BlockCopyState *s,
             *error_is_read = false;
         }
         return ret;
-    }
 
-    if (*copy_range) {
+    case COPY_RANGE_SMALL:
+    case COPY_RANGE_FULL:
         ret = bdrv_co_copy_range(s->source, offset, s->target, offset, nbytes,
                                  0, s->write_flags);
-        if (ret < 0) {
-            trace_block_copy_copy_range_fail(s, offset, ret);
-            *copy_range = false;
-            /* Fallback to read+write with allocated buffer */
-        } else {
+        if (ret >= 0) {
+            /* Successful copy-range, increase copy_size.  */
+            *method = COPY_RANGE_FULL;
             return 0;
         }
-    }
 
-    /*
-     * In case of failed copy_range request above, we may proceed with buffered
-     * request larger than BLOCK_COPY_MAX_BUFFER. Still, further requests will
-     * be properly limited, so don't care too much. Moreover the most likely
-     * case (copy_range is unsupported for the configuration, so the very first
-     * copy_range request fails) is handled by setting large copy_size only
-     * after first successful copy_range.
-     */
+        trace_block_copy_copy_range_fail(s, offset, ret);
+        *method = COPY_READ_WRITE;
+        /* Fall through to read+write with allocated buffer */
 
-    bounce_buffer = qemu_blockalign(s->source->bs, nbytes);
+    case COPY_READ_WRITE_CLUSTER:
+    case COPY_READ_WRITE:
+        /*
+         * In case of failed copy_range request above, we may proceed with
+         * buffered request larger than BLOCK_COPY_MAX_BUFFER.
+         * Still, further requests will be properly limited, so don't care too
+         * much. Moreover the most likely case (copy_range is unsupported for
+         * the configuration, so the very first copy_range request fails)
+         * is handled by setting large copy_size only after first successful
+         * copy_range.
+         */
 
-    ret = bdrv_co_pread(s->source, offset, nbytes, bounce_buffer, 0);
-    if (ret < 0) {
-        trace_block_copy_read_fail(s, offset, ret);
-        *error_is_read = true;
-        goto out;
-    }
+        bounce_buffer = qemu_blockalign(s->source->bs, nbytes);
 
-    ret = bdrv_co_pwrite(s->target, offset, nbytes, bounce_buffer,
-                         s->write_flags);
-    if (ret < 0) {
-        trace_block_copy_write_fail(s, offset, ret);
-        *error_is_read = false;
-        goto out;
-    }
+        ret = bdrv_co_pread(s->source, offset, nbytes, bounce_buffer, 0);
+        if (ret < 0) {
+            trace_block_copy_read_fail(s, offset, ret);
+            *error_is_read = true;
+            goto out;
+        }
 
-out:
-    qemu_vfree(bounce_buffer);
+        ret = bdrv_co_pwrite(s->target, offset, nbytes, bounce_buffer,
+                             s->write_flags);
+        if (ret < 0) {
+            trace_block_copy_write_fail(s, offset, ret);
+            *error_is_read = false;
+            goto out;
+        }
 
-    return ret;
-}
+    out:
+        qemu_vfree(bounce_buffer);
+        break;
 
-static void block_copy_handle_copy_range_result(BlockCopyState *s,
-                                                bool is_success)
-{
-    if (!s->use_copy_range) {
-        /* already disabled */
-        return;
+    default:
+        abort();
     }
 
-    if (is_success) {
-        /*
-         * Successful copy-range. Now increase copy_size.  copy_range
-         * does not respect max_transfer (it's a TODO), so we factor
-         * that in here.
-         */
-        s->copy_size =
-                MIN(MAX(s->cluster_size, BLOCK_COPY_MAX_COPY_RANGE),
-                    QEMU_ALIGN_DOWN(block_copy_max_transfer(s->source,
-                                                            s->target),
-                                    s->cluster_size));
-    } else {
-        /* Copy-range failed, disable it. */
-        s->use_copy_range = false;
-        s->copy_size = MAX(s->cluster_size, BLOCK_COPY_MAX_BUFFER);
-    }
+    return ret;
 }
 
 static coroutine_fn int block_copy_task_entry(AioTask *task)
@@ -454,13 +460,12 @@ static coroutine_fn int block_copy_task_entry(AioTask *task)
     BlockCopyTask *t = container_of(task, BlockCopyTask, task);
     BlockCopyState *s = t->s;
     bool error_is_read = false;
-    bool copy_range = t->copy_range;
+    BlockCopyMethod method = t->method;
     int ret;
 
-    ret = block_copy_do_copy(s, t->offset, t->bytes, t->zeroes,
-                             &copy_range, &error_is_read);
-    if (t->copy_range) {
-        block_copy_handle_copy_range_result(s, copy_range);
+    ret = block_copy_do_copy(s, t->offset, t->bytes, &method, &error_is_read);
+    if (s->method == t->method) {
+        s->method = method;
     }
     if (ret < 0) {
         if (!t->call_state->ret) {
@@ -643,8 +648,7 @@ block_copy_dirty_clusters(BlockCopyCallState *call_state)
             continue;
         }
         if (ret & BDRV_BLOCK_ZERO) {
-            task->zeroes = true;
-            task->copy_range = false;
+            task->method = COPY_WRITE_ZEROES;
         }
 
         if (!call_state->ignore_ratelimit) {
-- 
2.31.1




* [PATCH v4 3/6] block-copy: improve comments of BlockCopyTask and BlockCopyState types and functions
  2021-06-14  7:33 [PATCH v4 0/6] block-copy: protect block-copy internal structures Emanuele Giuseppe Esposito
  2021-06-14  7:33 ` [PATCH v4 1/6] block-copy: small refactor in block_copy_task_entry and block_copy_common Emanuele Giuseppe Esposito
  2021-06-14  7:33 ` [PATCH v4 2/6] block-copy: streamline choice of copy_range vs. read/write Emanuele Giuseppe Esposito
@ 2021-06-14  7:33 ` Emanuele Giuseppe Esposito
  2021-06-19 15:23   ` Vladimir Sementsov-Ogievskiy
                     ` (2 more replies)
  2021-06-14  7:33 ` [PATCH v4 4/6] block-copy: move progress_set_remaining in block_copy_task_end Emanuele Giuseppe Esposito
                   ` (2 subsequent siblings)
  5 siblings, 3 replies; 30+ messages in thread
From: Emanuele Giuseppe Esposito @ 2021-06-14  7:33 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Emanuele Giuseppe Esposito,
	Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz,
	Stefan Hajnoczi, Paolo Bonzini, John Snow

As done in BlockCopyCallState, categorize the fields of BlockCopyTask
and BlockCopyState into IN, State and OUT.
This is just to understand which fields have to be protected with a lock.

.sleep_state is handled in the series "coroutine: new sleep/wake API"
and is thus left as a TODO here.
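
For illustration, the convention can be summarized by a hypothetical struct
skeleton (not QEMU code; the field names only echo BlockCopyState):

#include <stdint.h>

typedef struct Example {
    /*
     * IN parameters: initialized before the object is shared and never
     * changed afterwards, so they may be read without synchronization.
     */
    int64_t cluster_size;

    /*
     * State: read and written concurrently, to be guarded by the lock
     * introduced in a later patch.
     */
    int64_t in_flight_bytes;

    /* OUT parameters: written by workers, read by the caller at the end. */
    int ret;
} Example;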

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
 block/block-copy.c | 49 +++++++++++++++++++++++++++++-----------------
 1 file changed, 31 insertions(+), 18 deletions(-)

diff --git a/block/block-copy.c b/block/block-copy.c
index 3f26be8ddc..5ff7764e87 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -52,29 +52,35 @@ typedef struct BlockCopyCallState {
     /* Coroutine where async block-copy is running */
     Coroutine *co;
 
-    /* To reference all call states from BlockCopyState */
-    QLIST_ENTRY(BlockCopyCallState) list;
-
     /* State */
-    int ret;
     bool finished;
-    QemuCoSleep sleep;
-    bool cancelled;
+    QemuCoSleep sleep; /* TODO: protect API with a lock */
+
+    /* To reference all call states from BlockCopyState */
+    QLIST_ENTRY(BlockCopyCallState) list;
 
     /* OUT parameters */
+    bool cancelled;
     bool error_is_read;
+    int ret;
 } BlockCopyCallState;
 
 typedef struct BlockCopyTask {
     AioTask task;
 
+    /*
+     * IN parameters. Initialized in block_copy_task_create()
+     * and never changed.
+     */
     BlockCopyState *s;
     BlockCopyCallState *call_state;
     int64_t offset;
-    int64_t bytes;
     BlockCopyMethod method;
-    QLIST_ENTRY(BlockCopyTask) list;
+
+    /* State */
     CoQueue wait_queue; /* coroutines blocked on this task */
+    int64_t bytes;
+    QLIST_ENTRY(BlockCopyTask) list;
 } BlockCopyTask;
 
 static int64_t task_end(BlockCopyTask *task)
@@ -90,15 +96,25 @@ typedef struct BlockCopyState {
      */
     BdrvChild *source;
     BdrvChild *target;
-    BdrvDirtyBitmap *copy_bitmap;
+
+    /* State */
     int64_t in_flight_bytes;
-    int64_t cluster_size;
     BlockCopyMethod method;
-    int64_t max_transfer;
-    uint64_t len;
     QLIST_HEAD(, BlockCopyTask) tasks; /* All tasks from all block-copy calls */
     QLIST_HEAD(, BlockCopyCallState) calls;
+    /* State fields that use a thread-safe API */
+    BdrvDirtyBitmap *copy_bitmap;
+    ProgressMeter *progress;
+    SharedResource *mem;
+    RateLimit rate_limit;
 
+    /*
+     * IN parameters. Initialized in block_copy_state_new()
+     * and never changed.
+     */
+    int64_t cluster_size;
+    int64_t max_transfer;
+    uint64_t len;
     BdrvRequestFlags write_flags;
 
     /*
@@ -114,14 +130,11 @@ typedef struct BlockCopyState {
      * In this case, block_copy() will query the source’s allocation status,
      * skip unallocated regions, clear them in the copy_bitmap, and invoke
      * block_copy_reset_unallocated() every time it does.
+     *
+     * This field is set in backup_run() before coroutines are run,
+     * therefore is an IN.
      */
     bool skip_unallocated;
-
-    ProgressMeter *progress;
-
-    SharedResource *mem;
-
-    RateLimit rate_limit;
 } BlockCopyState;
 
 static BlockCopyTask *find_conflicting_task(BlockCopyState *s,
-- 
2.31.1




* [PATCH v4 4/6] block-copy: move progress_set_remaining in block_copy_task_end
  2021-06-14  7:33 [PATCH v4 0/6] block-copy: protect block-copy internal structures Emanuele Giuseppe Esposito
                   ` (2 preceding siblings ...)
  2021-06-14  7:33 ` [PATCH v4 3/6] block-copy: improve comments of BlockCopyTask and BlockCopyState types and functions Emanuele Giuseppe Esposito
@ 2021-06-14  7:33 ` Emanuele Giuseppe Esposito
  2021-06-14  7:33 ` [PATCH v4 5/6] block-copy: add a CoMutex Emanuele Giuseppe Esposito
  2021-06-14  7:33 ` [PATCH v4 6/6] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState Emanuele Giuseppe Esposito
  5 siblings, 0 replies; 30+ messages in thread
From: Emanuele Giuseppe Esposito @ 2021-06-14  7:33 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Emanuele Giuseppe Esposito,
	Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz,
	Stefan Hajnoczi, Paolo Bonzini, John Snow

Moving this call into block_copy_task_end ensures that the progress is
updated in any case, even if there is an error.

It also helps the next patch, allowing task_end to have only
one critical section.

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
 block/block-copy.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/block/block-copy.c b/block/block-copy.c
index 5ff7764e87..afa2f484f0 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -261,6 +261,9 @@ static void coroutine_fn block_copy_task_end(BlockCopyTask *task, int ret)
         bdrv_set_dirty_bitmap(task->s->copy_bitmap, task->offset, task->bytes);
     }
     QLIST_REMOVE(task, list);
+    progress_set_remaining(task->s->progress,
+                           bdrv_get_dirty_count(task->s->copy_bitmap) +
+                           task->s->in_flight_bytes);
     qemu_co_queue_restart_all(&task->wait_queue);
 }
 
@@ -651,9 +654,6 @@ block_copy_dirty_clusters(BlockCopyCallState *call_state)
         }
         if (s->skip_unallocated && !(ret & BDRV_BLOCK_ALLOCATED)) {
             block_copy_task_end(task, 0);
-            progress_set_remaining(s->progress,
-                                   bdrv_get_dirty_count(s->copy_bitmap) +
-                                   s->in_flight_bytes);
             trace_block_copy_skip_range(s, task->offset, task->bytes);
             offset = task_end(task);
             bytes = end - offset;
-- 
2.31.1




* [PATCH v4 5/6] block-copy: add a CoMutex
  2021-06-14  7:33 [PATCH v4 0/6] block-copy: protect block-copy internal structures Emanuele Giuseppe Esposito
                   ` (3 preceding siblings ...)
  2021-06-14  7:33 ` [PATCH v4 4/6] block-copy: move progress_set_remaining in block_copy_task_end Emanuele Giuseppe Esposito
@ 2021-06-14  7:33 ` Emanuele Giuseppe Esposito
  2021-06-19 19:34   ` Vladimir Sementsov-Ogievskiy
  2021-06-14  7:33 ` [PATCH v4 6/6] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState Emanuele Giuseppe Esposito
  5 siblings, 1 reply; 30+ messages in thread
From: Emanuele Giuseppe Esposito @ 2021-06-14  7:33 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Emanuele Giuseppe Esposito,
	Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz,
	Stefan Hajnoczi, Paolo Bonzini, John Snow

Add a CoMutex to protect concurrent access of block-copy
data structures.

This mutex also protects .copy_bitmap, because its thread-safe
API does not prevent it from assigning two tasks to the same
bitmap region.

.finished, .cancelled and reads of .ret and .error_is_read will be
protected in the following patch, because they are also used outside
coroutines.

Also mark block_copy_task_create as coroutine_fn because:
1) it is static and only invoked by coroutine functions
2) this patch introduces and uses a CoMutex lock there
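
For illustration (not part of the patch): the CoMutex/CoQueue pattern used
here corresponds to the classic mutex/condition-variable idiom, where a
waiter drops the lock while it sleeps. A minimal standalone sketch in plain
C, with pthreads standing in for the coroutine primitives and all names
hypothetical:

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t task_done = PTHREAD_COND_INITIALIZER; /* ~ wait_queue */
static bool task_in_flight = true;       /* ~ a conflicting BlockCopyTask */

static void *worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    task_in_flight = false;              /* ~ block_copy_task_end() */
    pthread_cond_broadcast(&task_done);  /* ~ qemu_co_queue_restart_all() */
    pthread_mutex_unlock(&lock);
    return NULL;
}

int main(void)
{
    pthread_t t;
    pthread_create(&t, NULL, worker, NULL);

    pthread_mutex_lock(&lock);
    while (task_in_flight) {
        /*
         * Drops the lock while sleeping, just like qemu_co_queue_wait()
         * does with the CoMutex passed as its second argument.
         */
        pthread_cond_wait(&task_done, &lock);
    }
    pthread_mutex_unlock(&lock);

    pthread_join(t, NULL);
    printf("no conflicting task left\n");
    return 0;
}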

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
 block/block-copy.c         | 79 +++++++++++++++++++++++++++++---------
 include/block/block-copy.h |  2 +
 2 files changed, 62 insertions(+), 19 deletions(-)

diff --git a/block/block-copy.c b/block/block-copy.c
index afa2f484f0..6416929abd 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -61,6 +61,7 @@ typedef struct BlockCopyCallState {
 
     /* OUT parameters */
     bool cancelled;
+    /* Fields protected by lock in BlockCopyState */
     bool error_is_read;
     int ret;
 } BlockCopyCallState;
@@ -77,8 +78,12 @@ typedef struct BlockCopyTask {
     int64_t offset;
     BlockCopyMethod method;
 
-    /* State */
+    /* State. Protected by lock in BlockCopyState */
     CoQueue wait_queue; /* coroutines blocked on this task */
+    /*
+     * Only protect the case of parallel read while updating @bytes
+     * value in block_copy_task_shrink().
+     */
     int64_t bytes;
     QLIST_ENTRY(BlockCopyTask) list;
 } BlockCopyTask;
@@ -97,7 +102,8 @@ typedef struct BlockCopyState {
     BdrvChild *source;
     BdrvChild *target;
 
-    /* State */
+    /* State. Protected by lock */
+    CoMutex lock;
     int64_t in_flight_bytes;
     BlockCopyMethod method;
     QLIST_HEAD(, BlockCopyTask) tasks; /* All tasks from all block-copy calls */
@@ -137,6 +143,7 @@ typedef struct BlockCopyState {
     bool skip_unallocated;
 } BlockCopyState;
 
+/* Called with lock held */
 static BlockCopyTask *find_conflicting_task(BlockCopyState *s,
                                             int64_t offset, int64_t bytes)
 {
@@ -154,6 +161,8 @@ static BlockCopyTask *find_conflicting_task(BlockCopyState *s,
 /*
  * If there are no intersecting tasks return false. Otherwise, wait for the
  * first found intersecting tasks to finish and return true.
+ *
+ * Called with lock held.
  */
 static bool coroutine_fn block_copy_wait_one(BlockCopyState *s, int64_t offset,
                                              int64_t bytes)
@@ -164,7 +173,7 @@ static bool coroutine_fn block_copy_wait_one(BlockCopyState *s, int64_t offset,
         return false;
     }
 
-    qemu_co_queue_wait(&task->wait_queue, NULL);
+    qemu_co_queue_wait(&task->wait_queue, &s->lock);
 
     return true;
 }
@@ -191,14 +200,15 @@ static int64_t block_copy_chunk_size(BlockCopyState *s)
  * Search for the first dirty area in offset/bytes range and create task at
  * the beginning of it.
  */
-static BlockCopyTask *block_copy_task_create(BlockCopyState *s,
-                                             BlockCopyCallState *call_state,
-                                             int64_t offset, int64_t bytes)
+static coroutine_fn BlockCopyTask *block_copy_task_create(BlockCopyState *s,
+                                                BlockCopyCallState *call_state,
+                                                int64_t offset, int64_t bytes)
 {
     BlockCopyTask *task;
-    int64_t max_chunk = block_copy_chunk_size(s);
+    int64_t max_chunk;
 
-    max_chunk = MIN_NON_ZERO(max_chunk, call_state->max_chunk);
+    QEMU_LOCK_GUARD(&s->lock);
+    max_chunk = MIN_NON_ZERO(block_copy_chunk_size(s), call_state->max_chunk);
     if (!bdrv_dirty_bitmap_next_dirty_area(s->copy_bitmap,
                                            offset, offset + bytes,
                                            max_chunk, &offset, &bytes))
@@ -240,6 +250,7 @@ static BlockCopyTask *block_copy_task_create(BlockCopyState *s,
 static void coroutine_fn block_copy_task_shrink(BlockCopyTask *task,
                                                 int64_t new_bytes)
 {
+    QEMU_LOCK_GUARD(&task->s->lock);
     if (new_bytes == task->bytes) {
         return;
     }
@@ -256,6 +267,7 @@ static void coroutine_fn block_copy_task_shrink(BlockCopyTask *task,
 
 static void coroutine_fn block_copy_task_end(BlockCopyTask *task, int ret)
 {
+    QEMU_LOCK_GUARD(&task->s->lock);
     task->s->in_flight_bytes -= task->bytes;
     if (ret < 0) {
         bdrv_set_dirty_bitmap(task->s->copy_bitmap, task->offset, task->bytes);
@@ -334,12 +346,14 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
     }
 
     ratelimit_init(&s->rate_limit);
+    qemu_co_mutex_init(&s->lock);
     QLIST_INIT(&s->tasks);
     QLIST_INIT(&s->calls);
 
     return s;
 }
 
+/* Only set before running the job, so it is thread safe to call. */
 void block_copy_set_progress_meter(BlockCopyState *s, ProgressMeter *pm)
 {
     s->progress = pm;
@@ -480,16 +494,20 @@ static coroutine_fn int block_copy_task_entry(AioTask *task)
     int ret;
 
     ret = block_copy_do_copy(s, t->offset, t->bytes, &method, &error_is_read);
-    if (s->method == t->method) {
-        s->method = method;
-    }
-    if (ret < 0) {
-        if (!t->call_state->ret) {
-            t->call_state->ret = ret;
-            t->call_state->error_is_read = error_is_read;
+
+    WITH_QEMU_LOCK_GUARD(&s->lock) {
+        if (s->method == t->method) {
+            s->method = method;
+        }
+
+        if (ret < 0) {
+            if (!t->call_state->ret) {
+                t->call_state->ret = ret;
+                t->call_state->error_is_read = error_is_read;
+            }
+        } else {
+            progress_work_done(s->progress, t->bytes);
         }
-    } else {
-        progress_work_done(s->progress, t->bytes);
     }
     co_put_to_shres(s->mem, t->bytes);
     block_copy_task_end(t, ret);
@@ -591,10 +609,12 @@ int64_t block_copy_reset_unallocated(BlockCopyState *s,
     bytes = clusters * s->cluster_size;
 
     if (!ret) {
+        qemu_co_mutex_lock(&s->lock);
         bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, bytes);
         progress_set_remaining(s->progress,
                                bdrv_get_dirty_count(s->copy_bitmap) +
                                s->in_flight_bytes);
+        qemu_co_mutex_unlock(&s->lock);
     }
 
     *count = bytes;
@@ -734,14 +754,33 @@ static int coroutine_fn block_copy_common(BlockCopyCallState *call_state)
     int ret;
     BlockCopyState *s = call_state->s;
 
+    qemu_co_mutex_lock(&s->lock);
     QLIST_INSERT_HEAD(&s->calls, call_state, list);
+    qemu_co_mutex_unlock(&s->lock);
 
     do {
         ret = block_copy_dirty_clusters(call_state);
 
         if (ret == 0 && !call_state->cancelled) {
-            ret = block_copy_wait_one(s, call_state->offset,
-                                      call_state->bytes);
+            WITH_QEMU_LOCK_GUARD(&s->lock) {
+                /*
+                 * Check that there is no task we still need to
+                 * wait to complete
+                 */
+                ret = block_copy_wait_one(s, call_state->offset,
+                                          call_state->bytes);
+                if (ret == 0) {
+                    /*
+                     * No pending tasks, but check again the bitmap in this
+                     * same critical section, since a task might have failed
+                     * between this and the critical section in
+                     * block_copy_dirty_clusters().
+                     */
+                    ret = bdrv_dirty_bitmap_next_dirty(s->copy_bitmap,
+                                                       call_state->offset,
+                                                       call_state->bytes) > 0;
+                }
+            }
         }
 
         /*
@@ -761,7 +800,9 @@ static int coroutine_fn block_copy_common(BlockCopyCallState *call_state)
         call_state->cb(call_state->cb_opaque);
     }
 
+    qemu_co_mutex_lock(&s->lock);
     QLIST_REMOVE(call_state, list);
+    qemu_co_mutex_unlock(&s->lock);
 
     return ret;
 }
diff --git a/include/block/block-copy.h b/include/block/block-copy.h
index 338f2ea7fd..bf8f0c679f 100644
--- a/include/block/block-copy.h
+++ b/include/block/block-copy.h
@@ -18,6 +18,8 @@
 #include "block/block.h"
 #include "qemu/co-shared-resource.h"
 
+/* All APIs are thread-safe */
+
 typedef void (*BlockCopyAsyncCallbackFunc)(void *opaque);
 typedef struct BlockCopyState BlockCopyState;
 typedef struct BlockCopyCallState BlockCopyCallState;
-- 
2.31.1




* [PATCH v4 6/6] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState
  2021-06-14  7:33 [PATCH v4 0/6] block-copy: protect block-copy internal structures Emanuele Giuseppe Esposito
                   ` (4 preceding siblings ...)
  2021-06-14  7:33 ` [PATCH v4 5/6] block-copy: add a CoMutex Emanuele Giuseppe Esposito
@ 2021-06-14  7:33 ` Emanuele Giuseppe Esposito
  2021-06-19 20:06   ` Vladimir Sementsov-Ogievskiy
  5 siblings, 1 reply; 30+ messages in thread
From: Emanuele Giuseppe Esposito @ 2021-06-14  7:33 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Emanuele Giuseppe Esposito,
	Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz,
	Stefan Hajnoczi, Paolo Bonzini, John Snow

By adding acquire/release pairs, we ensure that the .ret and .error_is_read
fields are written by block_copy_dirty_clusters before .finished is set to
true.

The atomics here are necessary because the fields are also modified
concurrently outside coroutines.
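
For illustration (not part of the patch): the ordering argument is the
standard release/acquire pairing. A minimal standalone sketch with C11
atomics and a hypothetical copy_thread(); QEMU's qatomic_store_release()
and qatomic_load_acquire() provide the same semantics:

#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

static int ret;               /* plain OUT field */
static bool error_is_read;    /* plain OUT field */
static atomic_bool finished;  /* release/acquire flag */

static void *copy_thread(void *arg)
{
    (void)arg;
    ret = -5;                 /* e.g. -EIO */
    error_is_read = true;
    /* Release: the writes above are visible once finished reads true. */
    atomic_store_explicit(&finished, true, memory_order_release);
    return NULL;
}

int main(void)
{
    pthread_t t;
    pthread_create(&t, NULL, copy_thread, NULL);

    /* Acquire: after this reads true, ret/error_is_read are safe to read. */
    while (!atomic_load_explicit(&finished, memory_order_acquire)) {
        /* Busy-wait for brevity; real code would yield or poll. */
    }
    printf("ret=%d error_is_read=%d\n", ret, (int)error_is_read);

    pthread_join(t, NULL);
    return 0;
}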

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
 block/block-copy.c | 33 ++++++++++++++++++---------------
 1 file changed, 18 insertions(+), 15 deletions(-)

diff --git a/block/block-copy.c b/block/block-copy.c
index 6416929abd..5348e1f61b 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -53,14 +53,14 @@ typedef struct BlockCopyCallState {
     Coroutine *co;
 
     /* State */
-    bool finished;
+    bool finished; /* atomic */
     QemuCoSleep sleep; /* TODO: protect API with a lock */
 
     /* To reference all call states from BlockCopyState */
     QLIST_ENTRY(BlockCopyCallState) list;
 
     /* OUT parameters */
-    bool cancelled;
+    bool cancelled; /* atomic */
     /* Fields protected by lock in BlockCopyState */
     bool error_is_read;
     int ret;
@@ -650,7 +650,8 @@ block_copy_dirty_clusters(BlockCopyCallState *call_state)
     assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
     assert(QEMU_IS_ALIGNED(bytes, s->cluster_size));
 
-    while (bytes && aio_task_pool_status(aio) == 0 && !call_state->cancelled) {
+    while (bytes && aio_task_pool_status(aio) == 0 &&
+           !qatomic_read(&call_state->cancelled)) {
         BlockCopyTask *task;
         int64_t status_bytes;
 
@@ -761,7 +762,7 @@ static int coroutine_fn block_copy_common(BlockCopyCallState *call_state)
     do {
         ret = block_copy_dirty_clusters(call_state);
 
-        if (ret == 0 && !call_state->cancelled) {
+        if (ret == 0 && !qatomic_read(&call_state->cancelled)) {
             WITH_QEMU_LOCK_GUARD(&s->lock) {
                 /*
                  * Check that there is no task we still need to
@@ -792,9 +793,9 @@ static int coroutine_fn block_copy_common(BlockCopyCallState *call_state)
          * 2. We have waited for some intersecting block-copy request
          *    It may have failed and produced new dirty bits.
          */
-    } while (ret > 0 && !call_state->cancelled);
+    } while (ret > 0 && !qatomic_read(&call_state->cancelled));
 
-    call_state->finished = true;
+    qatomic_store_release(&call_state->finished, true);
 
     if (call_state->cb) {
         call_state->cb(call_state->cb_opaque);
@@ -857,35 +858,37 @@ void block_copy_call_free(BlockCopyCallState *call_state)
         return;
     }
 
-    assert(call_state->finished);
+    assert(qatomic_load_acquire(&call_state->finished));
     g_free(call_state);
 }
 
 bool block_copy_call_finished(BlockCopyCallState *call_state)
 {
-    return call_state->finished;
+    return qatomic_load_acquire(&call_state->finished);
 }
 
 bool block_copy_call_succeeded(BlockCopyCallState *call_state)
 {
-    return call_state->finished && !call_state->cancelled &&
-        call_state->ret == 0;
+    return qatomic_load_acquire(&call_state->finished) &&
+           !qatomic_read(&call_state->cancelled) &&
+           call_state->ret == 0;
 }
 
 bool block_copy_call_failed(BlockCopyCallState *call_state)
 {
-    return call_state->finished && !call_state->cancelled &&
-        call_state->ret < 0;
+    return qatomic_load_acquire(&call_state->finished) &&
+           !qatomic_read(&call_state->cancelled) &&
+           call_state->ret < 0;
 }
 
 bool block_copy_call_cancelled(BlockCopyCallState *call_state)
 {
-    return call_state->cancelled;
+    return qatomic_read(&call_state->cancelled);
 }
 
 int block_copy_call_status(BlockCopyCallState *call_state, bool *error_is_read)
 {
-    assert(call_state->finished);
+    assert(qatomic_load_acquire(&call_state->finished));
     if (error_is_read) {
         *error_is_read = call_state->error_is_read;
     }
@@ -894,7 +897,7 @@ int block_copy_call_status(BlockCopyCallState *call_state, bool *error_is_read)
 
 void block_copy_call_cancel(BlockCopyCallState *call_state)
 {
-    call_state->cancelled = true;
+    qatomic_set(&call_state->cancelled, true);
     block_copy_kick(call_state);
 }
 
-- 
2.31.1




* Re: [PATCH v4 1/6] block-copy: small refactor in block_copy_task_entry and block_copy_common
  2021-06-14  7:33 ` [PATCH v4 1/6] block-copy: small refactor in block_copy_task_entry and block_copy_common Emanuele Giuseppe Esposito
@ 2021-06-19 14:33   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 30+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-06-19 14:33 UTC (permalink / raw)
  To: Emanuele Giuseppe Esposito, qemu-block
  Cc: John Snow, Kevin Wolf, Max Reitz, Paolo Bonzini, Stefan Hajnoczi,
	qemu-devel

14.06.2021 10:33, Emanuele Giuseppe Esposito wrote:
> Use a local variable instead of referencing BlockCopyState through a
> BlockCopyCallState or BlockCopyTask every time.
> This is in preparation for next patches.
> 
> No functional change intended.
> 
> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>


Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

-- 
Best regards,
Vladimir



* Re: [PATCH v4 2/6] block-copy: streamline choice of copy_range vs. read/write
  2021-06-14  7:33 ` [PATCH v4 2/6] block-copy: streamline choice of copy_range vs. read/write Emanuele Giuseppe Esposito
@ 2021-06-19 15:05   ` Vladimir Sementsov-Ogievskiy
  2021-06-19 18:23   ` Vladimir Sementsov-Ogievskiy
  1 sibling, 0 replies; 30+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-06-19 15:05 UTC (permalink / raw)
  To: Emanuele Giuseppe Esposito, qemu-block
  Cc: John Snow, Kevin Wolf, Max Reitz, Paolo Bonzini, Stefan Hajnoczi,
	qemu-devel

14.06.2021 10:33, Emanuele Giuseppe Esposito wrote:
> From: Paolo Bonzini <pbonzini@redhat.com>
> 
> Put the logic to determine the copy size in a separate function, so
> that there is a simple state machine for the possible methods of
> copying data from one BlockDriverState to the other.
> 
> Use .method instead of .copy_range as in-out argument, and
> include also .zeroes as an additional copy method.
> 
> While at it, store the common computation of block_copy_max_transfer
> into a new field of BlockCopyState, and make sure that we always
> obey max_transfer; that's more efficient even for the
> COPY_RANGE_READ_WRITE case.
> 
> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>   block/block-copy.c | 174 +++++++++++++++++++++++----------------------
>   1 file changed, 89 insertions(+), 85 deletions(-)
> 

[..]

> @@ -267,28 +293,28 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
>           .len = bdrv_dirty_bitmap_size(copy_bitmap),
>           .write_flags = write_flags,
>           .mem = shres_create(BLOCK_COPY_MAX_MEM),
> +        .max_transfer = QEMU_ALIGN_DOWN(
> +                                    block_copy_max_transfer(source, target),
> +                                    cluster_size),
>       };
>   
> -    if (block_copy_max_transfer(source, target) < cluster_size) {
> +    if (s->max_transfer < cluster_size) {
>           /*
>            * copy_range does not respect max_transfer. We don't want to bother
>            * with requests smaller than block-copy cluster size, so fallback to
>            * buffered copying (read and write respect max_transfer on their
>            * behalf).
>            */
> -        s->use_copy_range = false;
> -        s->copy_size = cluster_size;
> +        s->method = COPY_READ_WRITE_CLUSTER;
>       } else if (write_flags & BDRV_REQ_WRITE_COMPRESSED) {
>           /* Compression supports only cluster-size writes and no copy-range. */
> -        s->use_copy_range = false;
> -        s->copy_size = cluster_size;
> +        s->method = COPY_READ_WRITE_CLUSTER;
>       } else {
>           /*
>            * We enable copy-range, but keep small copy_size, until first
>            * successful copy_range (look at block_copy_do_copy).
>            */

The comment is outdated. Maybe:

/* if copy range enabled, start with COPY_RANGE_SMALL, until first successful copy_range (look at block_copy_do_copy). */

> -        s->use_copy_range = use_copy_range;
> -        s->copy_size = MAX(s->cluster_size, BLOCK_COPY_MAX_BUFFER);
> +        s->method = use_copy_range ? COPY_RANGE_SMALL : COPY_READ_WRITE;
>       }
>   
>       ratelimit_init(&s->rate_limit);

[..]

> @@ -377,76 +400,59 @@ static int coroutine_fn block_copy_do_copy(BlockCopyState *s,
>               *error_is_read = false;
>           }
>           return ret;
> -    }
>   
> -    if (*copy_range) {
> +    case COPY_RANGE_SMALL:
> +    case COPY_RANGE_FULL:
>           ret = bdrv_co_copy_range(s->source, offset, s->target, offset, nbytes,
>                                    0, s->write_flags);
> -        if (ret < 0) {
> -            trace_block_copy_copy_range_fail(s, offset, ret);
> -            *copy_range = false;
> -            /* Fallback to read+write with allocated buffer */
> -        } else {
> +        if (ret >= 0) {
> +            /* Successful copy-range, increase copy_size.  */

There is no copy_size anymore, so: "increase chunk size".

> +            *method = COPY_RANGE_FULL;
>               return 0;
>           }
> -    }
>   

The comment tweaks are non-critical; they can be done in-flight.

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

-- 
Best regards,
Vladimir



* Re: [PATCH v4 3/6] block-copy: improve comments of BlockCopyTask and BlockCopyState types and functions
  2021-06-14  7:33 ` [PATCH v4 3/6] block-copy: improve comments of BlockCopyTask and BlockCopyState types and functions Emanuele Giuseppe Esposito
@ 2021-06-19 15:23   ` Vladimir Sementsov-Ogievskiy
  2021-06-19 18:31     ` Vladimir Sementsov-Ogievskiy
  2021-06-21  7:59     ` Emanuele Giuseppe Esposito
  2021-06-19 17:27   ` Vladimir Sementsov-Ogievskiy
  2021-06-19 18:53   ` Vladimir Sementsov-Ogievskiy
  2 siblings, 2 replies; 30+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-06-19 15:23 UTC (permalink / raw)
  To: Emanuele Giuseppe Esposito, qemu-block
  Cc: John Snow, Kevin Wolf, Max Reitz, Paolo Bonzini, Stefan Hajnoczi,
	qemu-devel

14.06.2021 10:33, Emanuele Giuseppe Esposito wrote:
> As done in BlockCopyCallState, categorize BlockCopyTask
> and BlockCopyState in IN, State and OUT fields.
> This is just to understand which field has to be protected with a lock.
> 
> .sleep_state is handled in the series "coroutine: new sleep/wake API"
> and thus here left as TODO.
> 
> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
> ---
>   block/block-copy.c | 49 +++++++++++++++++++++++++++++-----------------
>   1 file changed, 31 insertions(+), 18 deletions(-)
> 
> diff --git a/block/block-copy.c b/block/block-copy.c
> index 3f26be8ddc..5ff7764e87 100644
> --- a/block/block-copy.c
> +++ b/block/block-copy.c
> @@ -52,29 +52,35 @@ typedef struct BlockCopyCallState {
>       /* Coroutine where async block-copy is running */
>       Coroutine *co;
>   
> -    /* To reference all call states from BlockCopyState */
> -    QLIST_ENTRY(BlockCopyCallState) list;
> -
>       /* State */
> -    int ret;
>       bool finished;
> -    QemuCoSleep sleep;
> -    bool cancelled;
> +    QemuCoSleep sleep; /* TODO: protect API with a lock */
> +
> +    /* To reference all call states from BlockCopyState */
> +    QLIST_ENTRY(BlockCopyCallState) list;
>   
>       /* OUT parameters */
> +    bool cancelled;
>       bool error_is_read;
> +    int ret;
>   } BlockCopyCallState;
>   
>   typedef struct BlockCopyTask {
>       AioTask task;
>   
> +    /*
> +     * IN parameters. Initialized in block_copy_task_create()
> +     * and never changed.
> +     */

That's just not true for the method field :(

>       BlockCopyState *s;
>       BlockCopyCallState *call_state;
>       int64_t offset;
> -    int64_t bytes;
>       BlockCopyMethod method;
> -    QLIST_ENTRY(BlockCopyTask) list;
> +
> +    /* State */
>       CoQueue wait_queue; /* coroutines blocked on this task */
> +    int64_t bytes;
> +    QLIST_ENTRY(BlockCopyTask) list;
>   } BlockCopyTask;
>   
>   static int64_t task_end(BlockCopyTask *task)
> @@ -90,15 +96,25 @@ typedef struct BlockCopyState {
>        */
>       BdrvChild *source;
>       BdrvChild *target;
> -    BdrvDirtyBitmap *copy_bitmap;
> +
> +    /* State */
>       int64_t in_flight_bytes;
> -    int64_t cluster_size;
>       BlockCopyMethod method;
> -    int64_t max_transfer;
> -    uint64_t len;
>       QLIST_HEAD(, BlockCopyTask) tasks; /* All tasks from all block-copy calls */
>       QLIST_HEAD(, BlockCopyCallState) calls;
> +    /* State fields that use a thread-safe API */
> +    BdrvDirtyBitmap *copy_bitmap;
> +    ProgressMeter *progress;
> +    SharedResource *mem;
> +    RateLimit rate_limit;
>   
> +    /*
> +     * IN parameters. Initialized in block_copy_state_new()
> +     * and never changed.
> +     */
> +    int64_t cluster_size;
> +    int64_t max_transfer;
> +    uint64_t len;
>       BdrvRequestFlags write_flags;
>   
>       /*
> @@ -114,14 +130,11 @@ typedef struct BlockCopyState {
>        * In this case, block_copy() will query the source’s allocation status,
>        * skip unallocated regions, clear them in the copy_bitmap, and invoke
>        * block_copy_reset_unallocated() every time it does.
> +     *
> +     * This field is set in backup_run() before coroutines are run,
> +     * therefore is an IN.

That's not true: it is modified from backup_run() when block-copy is already initialized, and block_copy() calls may be made from the backup-top filter.

>        */
>       bool skip_unallocated;
> -
> -    ProgressMeter *progress;
> -
> -    SharedResource *mem;
> -
> -    RateLimit rate_limit;
>   } BlockCopyState;
>   
>   static BlockCopyTask *find_conflicting_task(BlockCopyState *s,
> 


-- 
Best regards,
Vladimir



* Re: [PATCH v4 3/6] block-copy: improve comments of BlockCopyTask and BlockCopyState types and functions
  2021-06-14  7:33 ` [PATCH v4 3/6] block-copy: improve comments of BlockCopyTask and BlockCopyState types and functions Emanuele Giuseppe Esposito
  2021-06-19 15:23   ` Vladimir Sementsov-Ogievskiy
@ 2021-06-19 17:27   ` Vladimir Sementsov-Ogievskiy
  2021-06-21  8:21     ` Emanuele Giuseppe Esposito
  2021-06-19 18:53   ` Vladimir Sementsov-Ogievskiy
  2 siblings, 1 reply; 30+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-06-19 17:27 UTC (permalink / raw)
  To: Emanuele Giuseppe Esposito, qemu-block
  Cc: John Snow, Kevin Wolf, Max Reitz, Paolo Bonzini, Stefan Hajnoczi,
	qemu-devel

14.06.2021 10:33, Emanuele Giuseppe Esposito wrote:
> As done in BlockCopyCallState, categorize BlockCopyTask
> and BlockCopyState in IN, State and OUT fields.
> This is just to understand which field has to be protected with a lock.
> 
> .sleep_state is handled in the series "coroutine: new sleep/wake API"
> and thus here left as TODO.
> 
> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
> ---
>   block/block-copy.c | 49 +++++++++++++++++++++++++++++-----------------
>   1 file changed, 31 insertions(+), 18 deletions(-)
> 
> diff --git a/block/block-copy.c b/block/block-copy.c
> index 3f26be8ddc..5ff7764e87 100644
> --- a/block/block-copy.c
> +++ b/block/block-copy.c
> @@ -52,29 +52,35 @@ typedef struct BlockCopyCallState {
>       /* Coroutine where async block-copy is running */
>       Coroutine *co;
>   
> -    /* To reference all call states from BlockCopyState */
> -    QLIST_ENTRY(BlockCopyCallState) list;
> -
>       /* State */
> -    int ret;
>       bool finished;
> -    QemuCoSleep sleep;
> -    bool cancelled;
> +    QemuCoSleep sleep; /* TODO: protect API with a lock */
> +
> +    /* To reference all call states from BlockCopyState */
> +    QLIST_ENTRY(BlockCopyCallState) list;
>   
>       /* OUT parameters */
> +    bool cancelled;

Actually, cancelled is not an OUT field. It's set by the user to cancel the operation, and block-copy checks the field to decide whether it should cancel at the moment or not. So it's part of the state.

Also, I only now realized that "out parameter" sounds strange here, since a "parameter" is generally an input. We may have "out parameters" for functions, as all parameters of a function are called "parameters" anyway, but I think "OUT fields" is more correct here. I don't insist, as I'm not an expert in English (for sure, you'll find mistakes even in this paragraph :\

-- 
Best regards,
Vladimir



* Re: [PATCH v4 2/6] block-copy: streamline choice of copy_range vs. read/write
  2021-06-14  7:33 ` [PATCH v4 2/6] block-copy: streamline choice of copy_range vs. read/write Emanuele Giuseppe Esposito
  2021-06-19 15:05   ` Vladimir Sementsov-Ogievskiy
@ 2021-06-19 18:23   ` Vladimir Sementsov-Ogievskiy
  1 sibling, 0 replies; 30+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-06-19 18:23 UTC (permalink / raw)
  To: Emanuele Giuseppe Esposito, qemu-block
  Cc: John Snow, Kevin Wolf, Max Reitz, Paolo Bonzini, Stefan Hajnoczi,
	qemu-devel

14.06.2021 10:33, Emanuele Giuseppe Esposito wrote:
> @@ -158,8 +183,9 @@ static BlockCopyTask *block_copy_task_create(BlockCopyState *s,
>                                                int64_t offset, int64_t bytes)
>   {
>       BlockCopyTask *task;
> -    int64_t max_chunk = MIN_NON_ZERO(s->copy_size, call_state->max_chunk);
> +    int64_t max_chunk = block_copy_chunk_size(s);
>   
> +    max_chunk = MIN_NON_ZERO(max_chunk, call_state->max_chunk);

Note that you modify this code in patch 4. So, you may write here

int64_t max_chunk;

max_chunk = MIN_NON_ZERO(block_copy_chunk_size(s), call_state->max_chunk);

so that you don't have to modify it later.

-- 
Best regards,
Vladimir



* Re: [PATCH v4 3/6] block-copy: improve comments of BlockCopyTask and BlockCopyState types and functions
  2021-06-19 15:23   ` Vladimir Sementsov-Ogievskiy
@ 2021-06-19 18:31     ` Vladimir Sementsov-Ogievskiy
  2021-06-21  8:13       ` Emanuele Giuseppe Esposito
  2021-06-21  7:59     ` Emanuele Giuseppe Esposito
  1 sibling, 1 reply; 30+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-06-19 18:31 UTC (permalink / raw)
  To: Emanuele Giuseppe Esposito, qemu-block
  Cc: John Snow, Kevin Wolf, Max Reitz, Paolo Bonzini, Stefan Hajnoczi,
	qemu-devel

19.06.2021 18:23, Vladimir Sementsov-Ogievskiy wrote:
>>   typedef struct BlockCopyTask {
>>       AioTask task;
>> +    /*
>> +     * IN parameters. Initialized in block_copy_task_create()
>> +     * and never changed.
>> +     */
> 
> That's just not true for method field :(

I think we just need to document that @method is never accessed concurrently.

-- 
Best regards,
Vladimir



* Re: [PATCH v4 3/6] block-copy: improve comments of BlockCopyTask and BlockCopyState types and functions
  2021-06-14  7:33 ` [PATCH v4 3/6] block-copy: improve comments of BlockCopyTask and BlockCopyState types and functions Emanuele Giuseppe Esposito
  2021-06-19 15:23   ` Vladimir Sementsov-Ogievskiy
  2021-06-19 17:27   ` Vladimir Sementsov-Ogievskiy
@ 2021-06-19 18:53   ` Vladimir Sementsov-Ogievskiy
  2021-06-21  8:28     ` Emanuele Giuseppe Esposito
  2 siblings, 1 reply; 30+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-06-19 18:53 UTC (permalink / raw)
  To: Emanuele Giuseppe Esposito, qemu-block
  Cc: John Snow, Kevin Wolf, Max Reitz, Paolo Bonzini, Stefan Hajnoczi,
	qemu-devel

14.06.2021 10:33, Emanuele Giuseppe Esposito wrote:
> --- a/block/block-copy.c
> +++ b/block/block-copy.c
> @@ -52,29 +52,35 @@ typedef struct BlockCopyCallState {
>       /* Coroutine where async block-copy is running */
>       Coroutine *co;
>   
> -    /* To reference all call states from BlockCopyState */
> -    QLIST_ENTRY(BlockCopyCallState) list;
> -
>       /* State */
> -    int ret;
>       bool finished;
> -    QemuCoSleep sleep;
> -    bool cancelled;
> +    QemuCoSleep sleep; /* TODO: protect API with a lock */
> +
> +    /* To reference all call states from BlockCopyState */
> +    QLIST_ENTRY(BlockCopyCallState) list;
>   
>       /* OUT parameters */
> +    bool cancelled;
>       bool error_is_read;
> +    int ret;

Hmm, about that. Is @ret an "OUT parameter"? Yes it is.

But someone may think, that out parameters doesn't need locking like "State" parameters (otherwise, what is the difference for the person who read these comments?). But that is wrong. And ret is modified under mutex for reason.

Actually, the full description of "ret" field usage may look as follows:

Set concurrently by tasks under the mutex. Only set once, by the first failed task (and untouched if no task failed).
After finish (i.e. once call_state->finished is true) it is not modified anymore and may be read safely without the mutex.

So, before finished, ret is a kind of "State" too: it is both read and written by tasks.

This shows me that dividing fields into "IN", "State" and "OUT" doesn't really help here. In this series we use different policies of concurrent access to fields: some are accessed only under the mutex, others have a more complex usage scenario (like this @ret), and some need atomic access.
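
For example, the whole @ret policy boils down to something like this (a
simplified sketch based on the patch-5 hunk, not the exact code):

WITH_QEMU_LOCK_GUARD(&s->lock) {
    if (ret < 0 && !t->call_state->ret) {
        /* first failed task wins; later failures don't overwrite it */
        t->call_state->ret = ret;
        t->call_state->error_is_read = error_is_read;
    }
}
/* once call_state->finished becomes true, ret is never written again
 * and may be read without the mutex */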

-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 5/6] block-copy: add a CoMutex
  2021-06-14  7:33 ` [PATCH v4 5/6] block-copy: add a CoMutex Emanuele Giuseppe Esposito
@ 2021-06-19 19:34   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 30+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-06-19 19:34 UTC (permalink / raw)
  To: Emanuele Giuseppe Esposito, qemu-block
  Cc: John Snow, Kevin Wolf, Max Reitz, Paolo Bonzini, Stefan Hajnoczi,
	qemu-devel

14.06.2021 10:33, Emanuele Giuseppe Esposito wrote:
> Add a CoMutex to protect concurrent access of block-copy
> data structures.
> 
> This mutex also protects .copy_bitmap, because its thread-safe
> API does not prevent it from assigning two tasks to the same
> bitmap region.
> 
> .finished, .cancelled and reads to .ret and .error_is_read will be
> protected in the following patch, because are used also outside
> coroutines.
> 
> Also set block_copy_task_create as coroutine_fn because:
> 1) it is static and only invoked by coroutine functions
> 2) this patch introduces and uses a CoMutex lock there
> 
> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
> ---
>   block/block-copy.c         | 79 +++++++++++++++++++++++++++++---------
>   include/block/block-copy.h |  2 +
>   2 files changed, 62 insertions(+), 19 deletions(-)
> 
> diff --git a/block/block-copy.c b/block/block-copy.c
> index afa2f484f0..6416929abd 100644
> --- a/block/block-copy.c
> +++ b/block/block-copy.c
> @@ -61,6 +61,7 @@ typedef struct BlockCopyCallState {
>   
>       /* OUT parameters */
>       bool cancelled;
> +    /* Fields protected by lock in BlockCopyState */
>       bool error_is_read;
>       int ret;
>   } BlockCopyCallState;
> @@ -77,8 +78,12 @@ typedef struct BlockCopyTask {
>       int64_t offset;
>       BlockCopyMethod method;
>   
> -    /* State */
> +    /* State. Protected by lock in BlockCopyState */
>       CoQueue wait_queue; /* coroutines blocked on this task */
> +    /*
> +     * Only protect the case of parallel read while updating @bytes
> +     * value in block_copy_task_shrink().
> +     */

So, you have to add new comments and modify a comment added in a previous commit. That's another sign that patch 03 should just be merged here.

>       int64_t bytes;
>       QLIST_ENTRY(BlockCopyTask) list;
>   } BlockCopyTask;
> @@ -97,7 +102,8 @@ typedef struct BlockCopyState {
>       BdrvChild *source;
>       BdrvChild *target;
>   
> -    /* State */
> +    /* State. Protected by lock */
> +    CoMutex lock;
>       int64_t in_flight_bytes;
>       BlockCopyMethod method;
>       QLIST_HEAD(, BlockCopyTask) tasks; /* All tasks from all block-copy calls */
> @@ -137,6 +143,7 @@ typedef struct BlockCopyState {
>       bool skip_unallocated;
>   } BlockCopyState;
>   
> +/* Called with lock held */
>   static BlockCopyTask *find_conflicting_task(BlockCopyState *s,
>                                               int64_t offset, int64_t bytes)
>   {
> @@ -154,6 +161,8 @@ static BlockCopyTask *find_conflicting_task(BlockCopyState *s,
>   /*
>    * If there are no intersecting tasks return false. Otherwise, wait for the
>    * first found intersecting tasks to finish and return true.
> + *
> + * Called with lock held.

Please:

Called with lock held; may temporarily release the lock. A return value of 0 guarantees that the lock was NOT released.
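
For reference, qemu_co_queue_wait() is the only point where the lock can
be dropped, so the flow behind that comment is (condensed from the patch):

static bool coroutine_fn block_copy_wait_one(BlockCopyState *s,
                                             int64_t offset, int64_t bytes)
{
    /* caller holds s->lock */
    BlockCopyTask *task = find_conflicting_task(s, offset, bytes);

    if (!task) {
        return false;  /* the lock was never released on this path */
    }

    /* drops s->lock while sleeping, re-acquires it before returning */
    qemu_co_queue_wait(&task->wait_queue, &s->lock);

    return true;
}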

>    */
>   static bool coroutine_fn block_copy_wait_one(BlockCopyState *s, int64_t offset,
>                                                int64_t bytes)
> @@ -164,7 +173,7 @@ static bool coroutine_fn block_copy_wait_one(BlockCopyState *s, int64_t offset,
>           return false;
>       }
>   
> -    qemu_co_queue_wait(&task->wait_queue, NULL);
> +    qemu_co_queue_wait(&task->wait_queue, &s->lock);
>   
>       return true;
>   }
> @@ -191,14 +200,15 @@ static int64_t block_copy_chunk_size(BlockCopyState *s)

Hmm, you had a /* Called with lock held */ comment for block_copy_chunk_size() in a previous version. Why was it dropped?

>    * Search for the first dirty area in offset/bytes range and create task at
>    * the beginning of it.
>    */
> -static BlockCopyTask *block_copy_task_create(BlockCopyState *s,
> -                                             BlockCopyCallState *call_state,
> -                                             int64_t offset, int64_t bytes)
> +static coroutine_fn BlockCopyTask *block_copy_task_create(BlockCopyState *s,
> +                                                BlockCopyCallState *call_state,
> +                                                int64_t offset, int64_t bytes)

That breaks the QEMU coding style (docs/devel/style.rst, "Multiline Indent").

The function "type" becomes too long. Common practice is to keep the type on a separate line:

static coroutine_fn BlockCopyTask *
block_copy_task_create(BlockCopyState *s, BlockCopyCallState *call_state,
                        int64_t offset, int64_t bytes)

(and this style is already used in one place in this file, so it would be consistent to do the same here)

Another way is 4-spaces indent:

static coroutine_fn BlockCopyTask *block_copy_task_create(BlockCopyState *s,
     BlockCopyCallState *call_state, int64_t offset, int64_t bytes)


choose what you want.

>   {
>       BlockCopyTask *task;
> -    int64_t max_chunk = block_copy_chunk_size(s);
> +    int64_t max_chunk;
>   
> -    max_chunk = MIN_NON_ZERO(max_chunk, call_state->max_chunk);
> +    QEMU_LOCK_GUARD(&s->lock);
> +    max_chunk = MIN_NON_ZERO(block_copy_chunk_size(s), call_state->max_chunk);
>       if (!bdrv_dirty_bitmap_next_dirty_area(s->copy_bitmap,
>                                              offset, offset + bytes,
>                                              max_chunk, &offset, &bytes))
> @@ -240,6 +250,7 @@ static BlockCopyTask *block_copy_task_create(BlockCopyState *s,
>   static void coroutine_fn block_copy_task_shrink(BlockCopyTask *task,
>                                                   int64_t new_bytes)
>   {
> +    QEMU_LOCK_GUARD(&task->s->lock);
>       if (new_bytes == task->bytes) {
>           return;
>       }
> @@ -256,6 +267,7 @@ static void coroutine_fn block_copy_task_shrink(BlockCopyTask *task,
>   
>   static void coroutine_fn block_copy_task_end(BlockCopyTask *task, int ret)
>   {
> +    QEMU_LOCK_GUARD(&task->s->lock);
>       task->s->in_flight_bytes -= task->bytes;
>       if (ret < 0) {
>           bdrv_set_dirty_bitmap(task->s->copy_bitmap, task->offset, task->bytes);
> @@ -334,12 +346,14 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
>       }
>   
>       ratelimit_init(&s->rate_limit);
> +    qemu_co_mutex_init(&s->lock);
>       QLIST_INIT(&s->tasks);
>       QLIST_INIT(&s->calls);
>   
>       return s;
>   }
>   
> +/* Only set before running the job, so it is thread safe to call. */

It's not that it's thread-safe to call; it's just that it's not used concurrently, so we don't have to care about thread-safety. So, maybe:

Only set before running the job, no need for locking.

>   void block_copy_set_progress_meter(BlockCopyState *s, ProgressMeter *pm)
>   {
>       s->progress = pm;
> @@ -480,16 +494,20 @@ static coroutine_fn int block_copy_task_entry(AioTask *task)
>       int ret;
>   
>       ret = block_copy_do_copy(s, t->offset, t->bytes, &method, &error_is_read);
> -    if (s->method == t->method) {
> -        s->method = method;
> -    }
> -    if (ret < 0) {
> -        if (!t->call_state->ret) {
> -            t->call_state->ret = ret;
> -            t->call_state->error_is_read = error_is_read;
> +
> +    WITH_QEMU_LOCK_GUARD(&s->lock) {
> +        if (s->method == t->method) {
> +            s->method = method;
> +        }
> +
> +        if (ret < 0) {
> +            if (!t->call_state->ret) {
> +                t->call_state->ret = ret;
> +                t->call_state->error_is_read = error_is_read;
> +            }
> +        } else {
> +            progress_work_done(s->progress, t->bytes);
>           }
> -    } else {
> -        progress_work_done(s->progress, t->bytes);
>       }
>       co_put_to_shres(s->mem, t->bytes);
>       block_copy_task_end(t, ret);
> @@ -591,10 +609,12 @@ int64_t block_copy_reset_unallocated(BlockCopyState *s,
>       bytes = clusters * s->cluster_size;
>   
>       if (!ret) {
> +        qemu_co_mutex_lock(&s->lock);
>           bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, bytes);
>           progress_set_remaining(s->progress,
>                                  bdrv_get_dirty_count(s->copy_bitmap) +
>                                  s->in_flight_bytes);
> +        qemu_co_mutex_unlock(&s->lock);
>       }
>   
>       *count = bytes;
> @@ -734,14 +754,33 @@ static int coroutine_fn block_copy_common(BlockCopyCallState *call_state)
>       int ret;
>       BlockCopyState *s = call_state->s;
>   
> +    qemu_co_mutex_lock(&s->lock);
>       QLIST_INSERT_HEAD(&s->calls, call_state, list);
> +    qemu_co_mutex_unlock(&s->lock);
>   
>       do {
>           ret = block_copy_dirty_clusters(call_state);
>   
>           if (ret == 0 && !call_state->cancelled) {
> -            ret = block_copy_wait_one(s, call_state->offset,
> -                                      call_state->bytes);
> +            WITH_QEMU_LOCK_GUARD(&s->lock) {
> +                /*
> +                 * Check that there is no task we still need to
> +                 * wait to complete
> +                 */
> +                ret = block_copy_wait_one(s, call_state->offset,
> +                                          call_state->bytes);
> +                if (ret == 0) {
> +                    /*

Please add: a block_copy_wait_one() return value of 0 also means that it didn't release the lock. So, we are still in the same critical section, not interrupted by any concurrent access to the state.

(actually, I started to write a comment that critical section is broken here, but understood that all is OK)

> +                     * No pending tasks, but check again the bitmap in this
> +                     * same critical section, since a task might have failed
> +                     * between this and the critical section in
> +                     * block_copy_dirty_clusters().
> +                     */
> +                    ret = bdrv_dirty_bitmap_next_dirty(s->copy_bitmap,
> +                                                       call_state->offset,
> +                                                       call_state->bytes) > 0;

It should be s/> 0/>= 0/. bdrv_dirty_bitmap_next_dirty() returns -1 if not found, and 0 is a valid offset.
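
I.e. the check would become:

    ret = bdrv_dirty_bitmap_next_dirty(s->copy_bitmap,
                                       call_state->offset,
                                       call_state->bytes) >= 0;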

> +                }
> +            }
>           }
>
>           /*
> @@ -761,7 +800,9 @@ static int coroutine_fn block_copy_common(BlockCopyCallState *call_state)
>           call_state->cb(call_state->cb_opaque);
>       }
>   
> +    qemu_co_mutex_lock(&s->lock);
>       QLIST_REMOVE(call_state, list);
> +    qemu_co_mutex_unlock(&s->lock);
>   
>       return ret;
>   }
> diff --git a/include/block/block-copy.h b/include/block/block-copy.h
> index 338f2ea7fd..bf8f0c679f 100644
> --- a/include/block/block-copy.h
> +++ b/include/block/block-copy.h

Please add

[diff]
     orderFile = /path/to/qemu/scripts/git.orderfile

to your .git/config file, to get a good ordering of files inside a patch. For example, headers would then be at the top of the patch.
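
(Equivalently, run this from inside the QEMU tree:

    git config diff.orderFile scripts/git.orderfile
)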

> @@ -18,6 +18,8 @@
>   #include "block/block.h"
>   #include "qemu/co-shared-resource.h"
>   
> +/* All APIs are thread-safe*/

To be honest, you can add this comment only in the following commit.
Also, a space is missing before the closing */.

> +
>   typedef void (*BlockCopyAsyncCallbackFunc)(void *opaque);
>   typedef struct BlockCopyState BlockCopyState;
>   typedef struct BlockCopyCallState BlockCopyCallState;
> 

Overall, looks close :)

-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 6/6] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState
  2021-06-14  7:33 ` [PATCH v4 6/6] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState Emanuele Giuseppe Esposito
@ 2021-06-19 20:06   ` Vladimir Sementsov-Ogievskiy
  2021-06-21  9:30     ` Emanuele Giuseppe Esposito
  2021-06-22  8:15     ` Paolo Bonzini
  0 siblings, 2 replies; 30+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-06-19 20:06 UTC (permalink / raw)
  To: Emanuele Giuseppe Esposito, qemu-block
  Cc: John Snow, Kevin Wolf, Max Reitz, Paolo Bonzini, Stefan Hajnoczi,
	qemu-devel

14.06.2021 10:33, Emanuele Giuseppe Esposito wrote:
> By adding acquire/release pairs, we ensure that .ret and .error_is_read
> fields are written by block_copy_dirty_clusters before .finished is true.

And that they are read by the API user after .finished is true.

> 
> The atomics here are necessary because the fields are also concurrently
> modified outside coroutines.

To be honest, finished is modified only in the coroutine, and read outside of it.

> 
> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
> ---
>   block/block-copy.c | 33 ++++++++++++++++++---------------
>   1 file changed, 18 insertions(+), 15 deletions(-)
> 
> diff --git a/block/block-copy.c b/block/block-copy.c
> index 6416929abd..5348e1f61b 100644
> --- a/block/block-copy.c
> +++ b/block/block-copy.c
> @@ -53,14 +53,14 @@ typedef struct BlockCopyCallState {
>       Coroutine *co;
>   
>       /* State */
> -    bool finished;
> +    bool finished; /* atomic */

So, logic around finished:

Thread of block_copy does:
0. finished is false
1. tasks set ret and error_is_read
2. qatomic_store_release finished -> true
3. after that point ret and error_is_read are not modified

Other threads can:

- qatomic_read finished, just to check whether we are finished or not

- if finished, they can read ret and error_is_read safely. If you are not sure that block-copy has finished, use qatomic_load_acquire() on finished first, to be sure that you read ret and error_is_read AFTER finished has been read and checked to be true.
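
In other words, the intended pairing is (a minimal sketch, with the reads
pulled into locals just for illustration):

/* block-copy coroutine (writer) */
call_state->ret = ret;                       /* plain writes...          */
call_state->error_is_read = error_is_read;
qatomic_store_release(&call_state->finished, true); /* ...published here */

/* API user (reader, possibly another thread) */
if (qatomic_load_acquire(&call_state->finished)) {
    /* everything written before the store_release is visible here */
    int ret = call_state->ret;
    bool error_is_read = call_state->error_is_read;
}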

>       QemuCoSleep sleep; /* TODO: protect API with a lock */
>   
>       /* To reference all call states from BlockCopyState */
>       QLIST_ENTRY(BlockCopyCallState) list;
>   
>       /* OUT parameters */
> -    bool cancelled;
> +    bool cancelled; /* atomic */

Logic around cancelled is simpler:

- false at start

- qatomic_read is allowed from any thread

- qatomic_write to true is allowed from any thread

- never write to false

Note that cancelling and finishing are racy: the user can cancel a block-copy that has already finished. We could probably change that, but I'm not sure it's worth doing. Still, maybe leave a comment in the API documentation.
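
A sketch of that policy, using the hunks below:

/* any thread may request cancellation; the flag never goes back to false */
void block_copy_call_cancel(BlockCopyCallState *call_state)
{
    qatomic_set(&call_state->cancelled, true);
    block_copy_kick(call_state);
}

/* block-copy itself only ever reads the flag */
while (bytes && aio_task_pool_status(aio) == 0 &&
       !qatomic_read(&call_state->cancelled)) {
    /* copy the next cluster range */
}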

>       /* Fields protected by lock in BlockCopyState */
>       bool error_is_read;
>       int ret;
> @@ -650,7 +650,8 @@ block_copy_dirty_clusters(BlockCopyCallState *call_state)
>       assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
>       assert(QEMU_IS_ALIGNED(bytes, s->cluster_size));
>   
> -    while (bytes && aio_task_pool_status(aio) == 0 && !call_state->cancelled) {
> +    while (bytes && aio_task_pool_status(aio) == 0 &&
> +           !qatomic_read(&call_state->cancelled)) {
>           BlockCopyTask *task;
>           int64_t status_bytes;
>   
> @@ -761,7 +762,7 @@ static int coroutine_fn block_copy_common(BlockCopyCallState *call_state)
>       do {
>           ret = block_copy_dirty_clusters(call_state);
>   
> -        if (ret == 0 && !call_state->cancelled) {
> +        if (ret == 0 && !qatomic_read(&call_state->cancelled)) {
>               WITH_QEMU_LOCK_GUARD(&s->lock) {
>                   /*
>                    * Check that there is no task we still need to
> @@ -792,9 +793,9 @@ static int coroutine_fn block_copy_common(BlockCopyCallState *call_state)
>            * 2. We have waited for some intersecting block-copy request
>            *    It may have failed and produced new dirty bits.
>            */
> -    } while (ret > 0 && !call_state->cancelled);
> +    } while (ret > 0 && !qatomic_read(&call_state->cancelled));
>   
> -    call_state->finished = true;
> +    qatomic_store_release(&call_state->finished, true);

so, all writes to ret and error_is_read are finished by this point.

>   
>       if (call_state->cb) {
>           call_state->cb(call_state->cb_opaque);
> @@ -857,35 +858,37 @@ void block_copy_call_free(BlockCopyCallState *call_state)
>           return;
>       }
>   
> -    assert(call_state->finished);
> +    assert(qatomic_load_acquire(&call_state->finished));

Here we don't need load_acquire, as we don't read other fields; qatomic_read is enough.

>       g_free(call_state);
>   }
>   
>   bool block_copy_call_finished(BlockCopyCallState *call_state)
>   {
> -    return call_state->finished;
> +    return qatomic_load_acquire(&call_state->finished);

and here

>   }
>   
>   bool block_copy_call_succeeded(BlockCopyCallState *call_state)
>   {
> -    return call_state->finished && !call_state->cancelled &&
> -        call_state->ret == 0;
> +    return qatomic_load_acquire(&call_state->finished) &&
> +           !qatomic_read(&call_state->cancelled) &&
> +           call_state->ret == 0;

Here qatomic_load_acquire() is reasonable: it protects read of ->ret

>   }
>   
>   bool block_copy_call_failed(BlockCopyCallState *call_state)
>   {
> -    return call_state->finished && !call_state->cancelled &&
> -        call_state->ret < 0;
> +    return qatomic_load_acquire(&call_state->finished) &&
> +           !qatomic_read(&call_state->cancelled) &&
> +           call_state->ret < 0;

Here reasonable.

>   }
>   
>   bool block_copy_call_cancelled(BlockCopyCallState *call_state)
>   {
> -    return call_state->cancelled;
> +    return qatomic_read(&call_state->cancelled);
>   }
>   
>   int block_copy_call_status(BlockCopyCallState *call_state, bool *error_is_read)
>   {
> -    assert(call_state->finished);
> +    assert(qatomic_load_acquire(&call_state->finished));

Hmm. Here qatomic_load_acquire protects nothing (the assertion will crash if we are not yet finished anyway). So, the caller is doubly sure that block-copy is finished.

Also, it's misleading: if we think that it does some protection, we are doing the wrong thing: assertions may simply be compiled out, so we can't rely on statements inside assert() to be executed.

So, let's use a simple qatomic_read here too.

>       if (error_is_read) {
>           *error_is_read = call_state->error_is_read;
>       }
> @@ -894,7 +897,7 @@ int block_copy_call_status(BlockCopyCallState *call_state, bool *error_is_read)
>   
>   void block_copy_call_cancel(BlockCopyCallState *call_state)
>   {
> -    call_state->cancelled = true;
> +    qatomic_set(&call_state->cancelled, true);
>       block_copy_kick(call_state);
>   }
>   
> 

Uhh :)

Ok, that looks close too. Or in other words, I feel that I have good enough understanding of all the thread-safe logic that you have implemented :)

-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 3/6] block-copy: improve comments of BlockCopyTask and BlockCopyState types and functions
  2021-06-19 15:23   ` Vladimir Sementsov-Ogievskiy
  2021-06-19 18:31     ` Vladimir Sementsov-Ogievskiy
@ 2021-06-21  7:59     ` Emanuele Giuseppe Esposito
  2021-06-22  9:16       ` Vladimir Sementsov-Ogievskiy
  1 sibling, 1 reply; 30+ messages in thread
From: Emanuele Giuseppe Esposito @ 2021-06-21  7:59 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: Kevin Wolf, qemu-devel, Max Reitz, Stefan Hajnoczi,
	Paolo Bonzini, John Snow



On 19/06/2021 17:23, Vladimir Sementsov-Ogievskiy wrote:
> 14.06.2021 10:33, Emanuele Giuseppe Esposito wrote:
>> As done in BlockCopyCallState, categorize BlockCopyTask
>> and BlockCopyState in IN, State and OUT fields.
>> This is just to understand which field has to be protected with a lock.
>>
>> .sleep_state is handled in the series "coroutine: new sleep/wake API"
>> and thus here left as TODO.
>>
>> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
>> ---
>>   block/block-copy.c | 49 +++++++++++++++++++++++++++++-----------------
>>   1 file changed, 31 insertions(+), 18 deletions(-)
>>
>> diff --git a/block/block-copy.c b/block/block-copy.c
>> index 3f26be8ddc..5ff7764e87 100644
>> --- a/block/block-copy.c
>> +++ b/block/block-copy.c
>> @@ -52,29 +52,35 @@ typedef struct BlockCopyCallState {
>>       /* Coroutine where async block-copy is running */
>>       Coroutine *co;
>> -    /* To reference all call states from BlockCopyState */
>> -    QLIST_ENTRY(BlockCopyCallState) list;
>> -
>>       /* State */
>> -    int ret;
>>       bool finished;
>> -    QemuCoSleep sleep;
>> -    bool cancelled;
>> +    QemuCoSleep sleep; /* TODO: protect API with a lock */
>> +
>> +    /* To reference all call states from BlockCopyState */
>> +    QLIST_ENTRY(BlockCopyCallState) list;
>>       /* OUT parameters */
>> +    bool cancelled;
>>       bool error_is_read;
>> +    int ret;
>>   } BlockCopyCallState;
>>   typedef struct BlockCopyTask {
>>       AioTask task;
>> +    /*
>> +     * IN parameters. Initialized in block_copy_task_create()
>> +     * and never changed.
>> +     */
> 
> That's just not true for method field :(

You're right, because it is modified later in the while loop of 
block_copy_dirty_clusters, but the task is already in the list.
Will move it to state field.

> 
>>       BlockCopyState *s;
>>       BlockCopyCallState *call_state;
>>       int64_t offset;
>> -    int64_t bytes;
>>       BlockCopyMethod method;
>> -    QLIST_ENTRY(BlockCopyTask) list;
>> +
>> +    /* State */
>>       CoQueue wait_queue; /* coroutines blocked on this task */
>> +    int64_t bytes;
>> +    QLIST_ENTRY(BlockCopyTask) list;
>>   } BlockCopyTask;
>>   static int64_t task_end(BlockCopyTask *task)
>> @@ -90,15 +96,25 @@ typedef struct BlockCopyState {
>>        */
>>       BdrvChild *source;
>>       BdrvChild *target;
>> -    BdrvDirtyBitmap *copy_bitmap;
>> +
>> +    /* State */
>>       int64_t in_flight_bytes;
>> -    int64_t cluster_size;
>>       BlockCopyMethod method;
>> -    int64_t max_transfer;
>> -    uint64_t len;
>>       QLIST_HEAD(, BlockCopyTask) tasks; /* All tasks from all 
>> block-copy calls */
>>       QLIST_HEAD(, BlockCopyCallState) calls;
>> +    /* State fields that use a thread-safe API */
>> +    BdrvDirtyBitmap *copy_bitmap;
>> +    ProgressMeter *progress;
>> +    SharedResource *mem;
>> +    RateLimit rate_limit;
>> +    /*
>> +     * IN parameters. Initialized in block_copy_state_new()
>> +     * and never changed.
>> +     */
>> +    int64_t cluster_size;
>> +    int64_t max_transfer;
>> +    uint64_t len;
>>       BdrvRequestFlags write_flags;
>>       /*
>> @@ -114,14 +130,11 @@ typedef struct BlockCopyState {
>>        * In this case, block_copy() will query the source’s allocation 
>> status,
>>        * skip unallocated regions, clear them in the copy_bitmap, and 
>> invoke
>>        * block_copy_reset_unallocated() every time it does.
>> +     *
>> +     * This field is set in backup_run() before coroutines are run,
>> +     * therefore is an IN.
> 
That's not true: it is modified from backup_run(), when block-copy is 
already initialized, and block_copy() calls may be made from the 
backup-top filter.
> 

Ok, I am not very familiar with the backup code, so I did not see it.
This means we need to protect this field too.

At this point, I think we can coarsen the granularity of the lock in 
block_copy_dirty_clusters:
instead of having, in each while-loop iteration,

block_copy_task_create(); // locks and releases internally
block_copy_block_status(); // no lock used, but uses skip_unallocated so 
it will need one
if()
	block_copy_task_shrink(); // locks and releases internally
if(s->skip_unallocated) // will need lock
	block_copy_task_end(); // locks and releases internally
	[..]
if()
	task->method = COPY_WRITE_ZEROS; // needs lock
[..]

we can just lock the whole while section and, where needed, create 
_locked() versions of task_end and similar functions that are also used 
in non-locked code blocks. A rough sketch of that split is below.
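
Something like this, taking block_copy_task_end() as an example (the
_locked() name and the elided body are just illustration):

static void coroutine_fn block_copy_task_end_locked(BlockCopyTask *task,
                                                    int ret)
{
    /* caller holds task->s->lock */
    task->s->in_flight_bytes -= task->bytes;
    if (ret < 0) {
        bdrv_set_dirty_bitmap(task->s->copy_bitmap, task->offset,
                              task->bytes);
    }
    /* ... rest of the existing body, unchanged ... */
}

static void coroutine_fn block_copy_task_end(BlockCopyTask *task, int ret)
{
    QEMU_LOCK_GUARD(&task->s->lock);
    block_copy_task_end_locked(task, ret);
}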


Emanuele

>>        */
>>       bool skip_unallocated;
>> -
>> -    ProgressMeter *progress;
>> -
>> -    SharedResource *mem;
>> -
>> -    RateLimit rate_limit;
>>   } BlockCopyState;
>>   static BlockCopyTask *find_conflicting_task(BlockCopyState *s,
>>
> 
> 



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 3/6] block-copy: improve comments of BlockCopyTask and BlockCopyState types and functions
  2021-06-19 18:31     ` Vladimir Sementsov-Ogievskiy
@ 2021-06-21  8:13       ` Emanuele Giuseppe Esposito
  2021-06-22  9:20         ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 30+ messages in thread
From: Emanuele Giuseppe Esposito @ 2021-06-21  8:13 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: Kevin Wolf, qemu-devel, Max Reitz, Stefan Hajnoczi,
	Paolo Bonzini, John Snow



On 19/06/2021 20:31, Vladimir Sementsov-Ogievskiy wrote:
> 19.06.2021 18:23, Vladimir Sementsov-Ogievskiy wrote:
>>>   typedef struct BlockCopyTask {
>>>       AioTask task;
>>> +    /*
>>> +     * IN parameters. Initialized in block_copy_task_create()
>>> +     * and never changed.
>>> +     */
>>
>> That's just not true for method field :(
> 
> I think, we just need to document that @method is never accessed 
> concurrently
> 
Ok, I got confused in the last patch. @method is read by 
block_copy_task_entry only after it is re-set in the 
block_copy_dirty_clusters loop. Sorry for that.

Will leave it as IN and document it better.

Still, moving the lock granularity inside the while loop might not be 
too bad. Not sure though. At this point skip_unallocated could also be 
made atomic, even though I sense that you won't like that :)

Emanuele



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 3/6] block-copy: improve comments of BlockCopyTask and BlockCopyState types and functions
  2021-06-19 17:27   ` Vladimir Sementsov-Ogievskiy
@ 2021-06-21  8:21     ` Emanuele Giuseppe Esposito
  0 siblings, 0 replies; 30+ messages in thread
From: Emanuele Giuseppe Esposito @ 2021-06-21  8:21 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: Kevin Wolf, qemu-devel, Max Reitz, Stefan Hajnoczi,
	Paolo Bonzini, John Snow



On 19/06/2021 19:27, Vladimir Sementsov-Ogievskiy wrote:
> 14.06.2021 10:33, Emanuele Giuseppe Esposito wrote:
>> As done in BlockCopyCallState, categorize BlockCopyTask
>> and BlockCopyState in IN, State and OUT fields.
>> This is just to understand which field has to be protected with a lock.
>>
>> .sleep_state is handled in the series "coroutine: new sleep/wake API"
>> and thus here left as TODO.
>>
>> Signed-off-by: Emanuele Giuseppe Esposito<eesposit@redhat.com>
>> ---
>>   block/block-copy.c | 49 +++++++++++++++++++++++++++++-----------------
>>   1 file changed, 31 insertions(+), 18 deletions(-)
>>
>> diff --git a/block/block-copy.c b/block/block-copy.c
>> index 3f26be8ddc..5ff7764e87 100644
>> --- a/block/block-copy.c
>> +++ b/block/block-copy.c
>> @@ -52,29 +52,35 @@ typedef struct BlockCopyCallState {
>>       /* Coroutine where async block-copy is running */
>>       Coroutine *co;
>> -    /* To reference all call states from BlockCopyState */
>> -    QLIST_ENTRY(BlockCopyCallState) list;
>> -
>>       /* State */
>> -    int ret;
>>       bool finished;
>> -    QemuCoSleep sleep;
>> -    bool cancelled;
>> +    QemuCoSleep sleep; /* TODO: protect API with a lock */
>> +
>> +    /* To reference all call states from BlockCopyState */
>> +    QLIST_ENTRY(BlockCopyCallState) list;
>>       /* OUT parameters */
>> +    bool cancelled;
> 
> Actually, cancelled is not an OUT field. It's set by the user to cancel 
> the operation, and block-copy tracks the field to understand whether it 
> should cancel at the moment or not. So, it's part of the state.

Makes sense.

> 
> Also, I just now realize that "out parameter" sounds strange here, as a 
> "parameter" is generally an input. We may have "out parameters" for 
> functions, since all parameters of a function are generally called 
> "parameters" anyway, but I think "OUT fields" is more correct here. I 
> don't insist, as I'm not an expert in English (for sure, you'll find 
> mistakes even in this paragraph :\ )
> 
Actually, I am using the terminology that was already there :)
Anyway, I am no expert here either, but "fields" does sound better.
I will apply the parameter -> field replacement in this patch.

Emanuele



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 3/6] block-copy: improve comments of BlockCopyTask and BlockCopyState types and functions
  2021-06-19 18:53   ` Vladimir Sementsov-Ogievskiy
@ 2021-06-21  8:28     ` Emanuele Giuseppe Esposito
  0 siblings, 0 replies; 30+ messages in thread
From: Emanuele Giuseppe Esposito @ 2021-06-21  8:28 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: Kevin Wolf, qemu-devel, Max Reitz, Stefan Hajnoczi,
	Paolo Bonzini, John Snow



On 19/06/2021 20:53, Vladimir Sementsov-Ogievskiy wrote:
> 14.06.2021 10:33, Emanuele Giuseppe Esposito wrote:
>> --- a/block/block-copy.c
>> +++ b/block/block-copy.c
>> @@ -52,29 +52,35 @@ typedef struct BlockCopyCallState {
>>       /* Coroutine where async block-copy is running */
>>       Coroutine *co;
>> -    /* To reference all call states from BlockCopyState */
>> -    QLIST_ENTRY(BlockCopyCallState) list;
>> -
>>       /* State */
>> -    int ret;
>>       bool finished;
>> -    QemuCoSleep sleep;
>> -    bool cancelled;
>> +    QemuCoSleep sleep; /* TODO: protect API with a lock */
>> +
>> +    /* To reference all call states from BlockCopyState */
>> +    QLIST_ENTRY(BlockCopyCallState) list;
>>       /* OUT parameters */
>> +    bool cancelled;
>>       bool error_is_read;
>> +    int ret;
> 
> Hmm, about that. Is @ret an "OUT parameter"? Yes it is.
> 
> But someone may think, that out parameters doesn't need locking like 
> "State" parameters (otherwise, what is the difference for the person who 
> read these comments?). But that is wrong. And ret is modified under 
> mutex for reason.

In patch 5 I added a comment above @ret and @error_is_read:
/* Fields protected by lock in BlockCopyState */

I can add your explanation too.

> 
> Actually, the full description of "ret" field usage may look as follows:
> 
> Set concurrently by tasks under mutex. Only set once by first failed 
> task (and untouched if no task failed).
> After finish (if call_state->finished is true) not modified anymore and 
> may be read safely without mutex.
> 
> So, before finished, ret is a kind of "State" too: it is both read and 
> written by tasks.
> 
> This shows to me that dividing fields into "IN", "State" and "OUT", 
> doesn't really help here. In this series we use different policies of 
> concurrent access to fields: some are accessed only under mutex, other 
> has more complex usage scenario (like this @ret), some needs atomic access.
> 
Yes, but I think the IN vs State division in particular helps a lot in 
understanding what needs a lock and what doesn't.

Emanuele



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 6/6] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState
  2021-06-19 20:06   ` Vladimir Sementsov-Ogievskiy
@ 2021-06-21  9:30     ` Emanuele Giuseppe Esposito
  2021-06-22  9:56       ` Vladimir Sementsov-Ogievskiy
  2021-06-22  8:15     ` Paolo Bonzini
  1 sibling, 1 reply; 30+ messages in thread
From: Emanuele Giuseppe Esposito @ 2021-06-21  9:30 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: Kevin Wolf, qemu-devel, Max Reitz, Stefan Hajnoczi,
	Paolo Bonzini, John Snow



On 19/06/2021 22:06, Vladimir Sementsov-Ogievskiy wrote:
> 14.06.2021 10:33, Emanuele Giuseppe Esposito wrote:
>> By adding acquire/release pairs, we ensure that .ret and .error_is_read
>> fields are written by block_copy_dirty_clusters before .finished is true.
> 
> And that they are read by the API user after .finished is true.
> 
>>
>> The atomics here are necessary because the fields are also concurrently 
>> modified outside coroutines.
> 
> To be honest, finished is modified only in the coroutine, and read outside of it.
> 
>>
>> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
>> ---
>>   block/block-copy.c | 33 ++++++++++++++++++---------------
>>   1 file changed, 18 insertions(+), 15 deletions(-)
>>
>> diff --git a/block/block-copy.c b/block/block-copy.c
>> index 6416929abd..5348e1f61b 100644
>> --- a/block/block-copy.c
>> +++ b/block/block-copy.c
>> @@ -53,14 +53,14 @@ typedef struct BlockCopyCallState {
>>       Coroutine *co;
>>       /* State */
>> -    bool finished;
>> +    bool finished; /* atomic */
> 
> So, logic around finished:
> 
> Thread of block_copy does:
> 0. finished is false
> 1. tasks set ret and error_is_read
> 2. qatomic_store_release finished -> true
> 3. after that point ret and error_is_read are not modified
> 
> Other threads can:
> 
> - qatomic_read finished, just to check whether we are finished or not
> 
> - if finished, they can read ret and error_is_read safely. If you are not 
> sure that block-copy has finished, use qatomic_load_acquire() on finished 
> first, to be sure that you read ret and error_is_read AFTER finished has 
> been read and checked to be true.
> 
>>       QemuCoSleep sleep; /* TODO: protect API with a lock */
>>       /* To reference all call states from BlockCopyState */
>>       QLIST_ENTRY(BlockCopyCallState) list;
>>       /* OUT parameters */
>> -    bool cancelled;
>> +    bool cancelled; /* atomic */
> 
> Logic around cancelled is simpler:
> 
> - false at start
> 
> - qatomic_read is allowed from any thread
> 
> - qatomic_write to true is allowed from any thread
> 
> - never write to false
> 
> Note that cancelling and finishing are racy: the user can cancel a 
> block-copy that has already finished. We could probably change that, but 
> I'm not sure it's worth doing. Still, maybe leave a comment in the API 
> documentation.
> 
>>       /* Fields protected by lock in BlockCopyState */
>>       bool error_is_read;
>>       int ret;
>> @@ -650,7 +650,8 @@ block_copy_dirty_clusters(BlockCopyCallState 
>> *call_state)
>>       assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
>>       assert(QEMU_IS_ALIGNED(bytes, s->cluster_size));
>> -    while (bytes && aio_task_pool_status(aio) == 0 && 
>> !call_state->cancelled) {
>> +    while (bytes && aio_task_pool_status(aio) == 0 &&
>> +           !qatomic_read(&call_state->cancelled)) {
>>           BlockCopyTask *task;
>>           int64_t status_bytes;
>> @@ -761,7 +762,7 @@ static int coroutine_fn 
>> block_copy_common(BlockCopyCallState *call_state)
>>       do {
>>           ret = block_copy_dirty_clusters(call_state);
>> -        if (ret == 0 && !call_state->cancelled) {
>> +        if (ret == 0 && !qatomic_read(&call_state->cancelled)) {
>>               WITH_QEMU_LOCK_GUARD(&s->lock) {
>>                   /*
>>                    * Check that there is no task we still need to
>> @@ -792,9 +793,9 @@ static int coroutine_fn 
>> block_copy_common(BlockCopyCallState *call_state)
>>            * 2. We have waited for some intersecting block-copy request
>>            *    It may have failed and produced new dirty bits.
>>            */
>> -    } while (ret > 0 && !call_state->cancelled);
>> +    } while (ret > 0 && !qatomic_read(&call_state->cancelled));
>> -    call_state->finished = true;
>> +    qatomic_store_release(&call_state->finished, true);
> 
> so, all writes to ret and error_is_read are finished by this point.
> 
>>       if (call_state->cb) {
>>           call_state->cb(call_state->cb_opaque);
>> @@ -857,35 +858,37 @@ void block_copy_call_free(BlockCopyCallState 
>> *call_state)
>>           return;
>>       }
>> -    assert(call_state->finished);
>> +    assert(qatomic_load_acquire(&call_state->finished));
> 
>> Here we don't need load_acquire, as we don't read other fields; 
>> qatomic_read is enough.

So, what you say makes sense; the only thing I wonder is: wouldn't it be 
better to do the acquire outside the assertion (or assert afterwards), 
just to be sure that we free only when finished is true?

[...]

> 
>>   }
>>   bool block_copy_call_cancelled(BlockCopyCallState *call_state)
>>   {
>> -    return call_state->cancelled;
>> +    return qatomic_read(&call_state->cancelled);
>>   }
>>   int block_copy_call_status(BlockCopyCallState *call_state, bool 
>> *error_is_read)
>>   {
>> -    assert(call_state->finished);
>> +    assert(qatomic_load_acquire(&call_state->finished));
> 
> Hmm. Here qatomic_load_acquire protects nothing (the assertion will crash 
> if we are not yet finished anyway). So, the caller is doubly sure that 
> block-copy is finished.
> 
> Also, it's misleading: if we think that it does some protection, we are 
> doing the wrong thing: assertions may simply be compiled out, so we can't 
> rely on statements inside assert() to be executed.
> 
> So, let's use a simple qatomic_read here too.

Same applies here.

> 
>>       if (error_is_read) {
>>           *error_is_read = call_state->error_is_read;
>>       }
>> @@ -894,7 +897,7 @@ int block_copy_call_status(BlockCopyCallState 
>> *call_state, bool *error_is_read)
>>   void block_copy_call_cancel(BlockCopyCallState *call_state)
>>   {
>> -    call_state->cancelled = true;
>> +    qatomic_set(&call_state->cancelled, true);
>>       block_copy_kick(call_state);
>>   }
>>
> 
> Uhh :)
> 
> Ok, that looks close too. Or in other words, I feel that I have good 
> enough understanding of all the thread-safe logic that you have 
> implemented :)

Good! :)

Emanuele



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 6/6] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState
  2021-06-19 20:06   ` Vladimir Sementsov-Ogievskiy
  2021-06-21  9:30     ` Emanuele Giuseppe Esposito
@ 2021-06-22  8:15     ` Paolo Bonzini
  2021-06-22  9:36       ` Vladimir Sementsov-Ogievskiy
  1 sibling, 1 reply; 30+ messages in thread
From: Paolo Bonzini @ 2021-06-22  8:15 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, Emanuele Giuseppe Esposito, qemu-block
  Cc: Kevin Wolf, John Snow, qemu-devel, Stefan Hajnoczi, Max Reitz

On 19/06/21 22:06, Vladimir Sementsov-Ogievskiy wrote:
>>
>> -    assert(call_state->finished);
>> +    assert(qatomic_load_acquire(&call_state->finished));
> 
> Hmm. Here qatomic_load_acquire protects nothing (the assertion will crash 
> if we are not yet finished anyway). So, the caller is doubly sure that 
> block-copy is finished.

It does.  If it returns true, you still want the load of finished to 
happen before the reads that follow.

Otherwise I agree with your remarks.

Paolo

> Also, it's misleading: if we think that it does some protection, we are 
> doing the wrong thing: assertions may simply be compiled out, so we can't 
> rely on statements inside assert() to be executed.
> 
> So, let's use a simple qatomic_read here too.
> 
>>       if (error_is_read) {
>>           *error_is_read = call_state->error_is_read;
>>       } 



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 3/6] block-copy: improve comments of BlockCopyTask and BlockCopyState types and functions
  2021-06-21  7:59     ` Emanuele Giuseppe Esposito
@ 2021-06-22  9:16       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 30+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-06-22  9:16 UTC (permalink / raw)
  To: Emanuele Giuseppe Esposito, qemu-block
  Cc: John Snow, Kevin Wolf, Max Reitz, Paolo Bonzini, Stefan Hajnoczi,
	qemu-devel

21.06.2021 10:59, Emanuele Giuseppe Esposito wrote:
> 
> 
> On 19/06/2021 17:23, Vladimir Sementsov-Ogievskiy wrote:
>> 14.06.2021 10:33, Emanuele Giuseppe Esposito wrote:
>>> As done in BlockCopyCallState, categorize BlockCopyTask
>>> and BlockCopyState in IN, State and OUT fields.
>>> This is just to understand which field has to be protected with a lock.
>>>
>>> .sleep_state is handled in the series "coroutine: new sleep/wake API"
>>> and thus here left as TODO.
>>>
>>> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
>>> ---
>>>   block/block-copy.c | 49 +++++++++++++++++++++++++++++-----------------
>>>   1 file changed, 31 insertions(+), 18 deletions(-)
>>>
>>> diff --git a/block/block-copy.c b/block/block-copy.c
>>> index 3f26be8ddc..5ff7764e87 100644
>>> --- a/block/block-copy.c
>>> +++ b/block/block-copy.c
>>> @@ -52,29 +52,35 @@ typedef struct BlockCopyCallState {
>>>       /* Coroutine where async block-copy is running */
>>>       Coroutine *co;
>>> -    /* To reference all call states from BlockCopyState */
>>> -    QLIST_ENTRY(BlockCopyCallState) list;
>>> -
>>>       /* State */
>>> -    int ret;
>>>       bool finished;
>>> -    QemuCoSleep sleep;
>>> -    bool cancelled;
>>> +    QemuCoSleep sleep; /* TODO: protect API with a lock */
>>> +
>>> +    /* To reference all call states from BlockCopyState */
>>> +    QLIST_ENTRY(BlockCopyCallState) list;
>>>       /* OUT parameters */
>>> +    bool cancelled;
>>>       bool error_is_read;
>>> +    int ret;
>>>   } BlockCopyCallState;
>>>   typedef struct BlockCopyTask {
>>>       AioTask task;
>>> +    /*
>>> +     * IN parameters. Initialized in block_copy_task_create()
>>> +     * and never changed.
>>> +     */
>>
>> That's just not true for method field :(
> 
> You're right, because it is modified later in the while loop of block_copy_dirty_clusters, but the task is already in the list.
> Will move it to state field.
> 
>>
>>>       BlockCopyState *s;
>>>       BlockCopyCallState *call_state;
>>>       int64_t offset;
>>> -    int64_t bytes;
>>>       BlockCopyMethod method;
>>> -    QLIST_ENTRY(BlockCopyTask) list;
>>> +
>>> +    /* State */
>>>       CoQueue wait_queue; /* coroutines blocked on this task */
>>> +    int64_t bytes;
>>> +    QLIST_ENTRY(BlockCopyTask) list;
>>>   } BlockCopyTask;
>>>   static int64_t task_end(BlockCopyTask *task)
>>> @@ -90,15 +96,25 @@ typedef struct BlockCopyState {
>>>        */
>>>       BdrvChild *source;
>>>       BdrvChild *target;
>>> -    BdrvDirtyBitmap *copy_bitmap;
>>> +
>>> +    /* State */
>>>       int64_t in_flight_bytes;
>>> -    int64_t cluster_size;
>>>       BlockCopyMethod method;
>>> -    int64_t max_transfer;
>>> -    uint64_t len;
>>>       QLIST_HEAD(, BlockCopyTask) tasks; /* All tasks from all block-copy calls */
>>>       QLIST_HEAD(, BlockCopyCallState) calls;
>>> +    /* State fields that use a thread-safe API */
>>> +    BdrvDirtyBitmap *copy_bitmap;
>>> +    ProgressMeter *progress;
>>> +    SharedResource *mem;
>>> +    RateLimit rate_limit;
>>> +    /*
>>> +     * IN parameters. Initialized in block_copy_state_new()
>>> +     * and never changed.
>>> +     */
>>> +    int64_t cluster_size;
>>> +    int64_t max_transfer;
>>> +    uint64_t len;
>>>       BdrvRequestFlags write_flags;
>>>       /*
>>> @@ -114,14 +130,11 @@ typedef struct BlockCopyState {
>>>        * In this case, block_copy() will query the source’s allocation status,
>>>        * skip unallocated regions, clear them in the copy_bitmap, and invoke
>>>        * block_copy_reset_unallocated() every time it does.
>>> +     *
>>> +     * This field is set in backup_run() before coroutines are run,
>>> +     * therefore is an IN.
>>
>> That's not true: it is modified from backup_run(), when block-copy is already initialized, and block_copy() calls may be made from the backup-top filter.
>>
> 
> Ok, I am not very familiar with the backup code, so I did not see it.
> This means we need to protect this field too.
> 
> At this point, I think we can coarsen the granularity of the lock in block_copy_dirty_clusters:
> instead of having, in each while-loop iteration,
> 
> block_copy_task_create(); // locks and releases internally
> block_copy_block_status(); // no lock used, but uses skip_unallocated so it will need one
> if()
>      block_copy_task_shrink(); // locks and releases internally
> if(s->skip_unallocated) // will need lock
>      block_copy_task_end(); // locks and releases internally
>      [..]
> if()
>      task->method = COPY_WRITE_ZEROS; // needs lock
> [..]
> 
> we can just lock the whole while section and, where needed, create _locked() versions of task_end and similar functions that are also used in non-locked code blocks.

No, holding the lock during block_copy_block_status is a bad idea, as the function may yield (it calls block_status).

I tend to agree that making access to skip_unallocated atomic is probably the simplest way.
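
E.g. (a sketch, if we go the atomic way; the concrete value written is
whatever the backup logic decides):

/* writer (backup code, while block-copy is already running) */
qatomic_set(&s->skip_unallocated, false);

/* readers in the copy loop */
if (qatomic_read(&s->skip_unallocated)) {
    block_copy_task_end(task, 0);
    /* skip the unallocated range */
}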

> 
>>>        */
>>>       bool skip_unallocated;
>>> -
>>> -    ProgressMeter *progress;
>>> -
>>> -    SharedResource *mem;
>>> -
>>> -    RateLimit rate_limit;
>>>   } BlockCopyState;
>>>   static BlockCopyTask *find_conflicting_task(BlockCopyState *s,
>>>
>>
>>
> 


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 3/6] block-copy: improve comments of BlockCopyTask and BlockCopyState types and functions
  2021-06-21  8:13       ` Emanuele Giuseppe Esposito
@ 2021-06-22  9:20         ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 30+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-06-22  9:20 UTC (permalink / raw)
  To: Emanuele Giuseppe Esposito, qemu-block
  Cc: John Snow, Kevin Wolf, Max Reitz, Paolo Bonzini, Stefan Hajnoczi,
	qemu-devel

21.06.2021 11:13, Emanuele Giuseppe Esposito wrote:
> 
> 
> On 19/06/2021 20:31, Vladimir Sementsov-Ogievskiy wrote:
>> 19.06.2021 18:23, Vladimir Sementsov-Ogievskiy wrote:
>>>>   typedef struct BlockCopyTask {
>>>>       AioTask task;
>>>> +    /*
>>>> +     * IN parameters. Initialized in block_copy_task_create()
>>>> +     * and never changed.
>>>> +     */
>>>
>>> That's just not true for method field :(
>>
>> I think, we just need to document that @method is never accessed concurrently
>>
> Ok, I got confused in the last patch. @method is read by block_copy_task_entry only after it is re-set in the block_copy_dirty_clusters loop. Sorry for that.
> 
> Will leave it as IN and document it better.

IN/State/OUT doesn't really help IMHO, as most fields of interest don't cleanly belong to any of the categories. Better documentation is good; if several variables share the same access rules, document them as a group.

> 
> Still, moving the lock granularity inside the while loop might not be too bad. Not sure though. At this point skip_unallocated could also be made atomic, even though I sense that you won't like that :)
> 

As I said in a parallel email, we can't keep the lock held during block_status. So, make it atomic if that is convenient.


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 6/6] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState
  2021-06-22  8:15     ` Paolo Bonzini
@ 2021-06-22  9:36       ` Vladimir Sementsov-Ogievskiy
  2021-06-22 10:20         ` Paolo Bonzini
  0 siblings, 1 reply; 30+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-06-22  9:36 UTC (permalink / raw)
  To: Paolo Bonzini, Emanuele Giuseppe Esposito, qemu-block
  Cc: John Snow, Kevin Wolf, Max Reitz, Stefan Hajnoczi, qemu-devel

22.06.2021 11:15, Paolo Bonzini wrote:
> On 19/06/21 22:06, Vladimir Sementsov-Ogievskiy wrote:
>>>
>>> -    assert(call_state->finished);
>>> +    assert(qatomic_load_acquire(&call_state->finished));
>>
>> Hmm. Here qatomic_load_acquire protects nothing (the assertion will crash if we are not yet finished anyway). So, the caller is doubly sure that block-copy is finished.
> 
> It does.  If it returns true, you still want the load of finished to happen before the reads that follow.
> 

Hmm. The worst case if we use just qatomic_read is that the assertion will not crash when it actually should. That doesn't break the logic, but it's not good anyway.

OK, I agree, let's keep it.

> Otherwise I agree with your remarks.
> 
> Paolo
> 
>> Also, it's misleading: if we think that it does some protection, we are doing the wrong thing: assertions may simply be compiled out, so we can't rely on statements inside assert() to be executed.
>>
>> So, let's use a simple qatomic_read here too.
>>
>>>       if (error_is_read) {
>>>           *error_is_read = call_state->error_is_read;
>>>       } 
> 


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 6/6] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState
  2021-06-21  9:30     ` Emanuele Giuseppe Esposito
@ 2021-06-22  9:56       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 30+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-06-22  9:56 UTC (permalink / raw)
  To: Emanuele Giuseppe Esposito, qemu-block
  Cc: John Snow, Kevin Wolf, Max Reitz, Paolo Bonzini, Stefan Hajnoczi,
	qemu-devel

21.06.2021 12:30, Emanuele Giuseppe Esposito wrote:
> 
> 
> On 19/06/2021 22:06, Vladimir Sementsov-Ogievskiy wrote:
>> 14.06.2021 10:33, Emanuele Giuseppe Esposito wrote:
>>> By adding acquire/release pairs, we ensure that .ret and .error_is_read
>>> fields are written by block_copy_dirty_clusters before .finished is true.
>>
>> And that they are read by the API user after .finished is true.
>>
>>>
>>> The atomics here are necessary because the fields are also concurrently
>>> modified outside coroutines.
>>
>> To be honest, finished is modified only in the coroutine, and read outside of it.
>>
>>>
>>> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
>>> ---
>>>   block/block-copy.c | 33 ++++++++++++++++++---------------
>>>   1 file changed, 18 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/block/block-copy.c b/block/block-copy.c
>>> index 6416929abd..5348e1f61b 100644
>>> --- a/block/block-copy.c
>>> +++ b/block/block-copy.c
>>> @@ -53,14 +53,14 @@ typedef struct BlockCopyCallState {
>>>       Coroutine *co;
>>>       /* State */
>>> -    bool finished;
>>> +    bool finished; /* atomic */
>>
>> So, logic around finished:
>>
>> Thread of block_copy does:
>> 0. finished is false
>> 1. tasks set ret and error_is_read
>> 2. qatomic_store_release finished -> true
>> 3. after that point ret and error_is_read are not modified
>>
>> Other threads can:
>>
>> - qatomic_read finished, just to check whether we are finished or not
>>
>> - if finished, they can read ret and error_is_read safely. If you are not sure that block-copy has finished, use qatomic_load_acquire() on finished first, to be sure that you read ret and error_is_read AFTER finished has been read and checked to be true.
>>
>>>       QemuCoSleep sleep; /* TODO: protect API with a lock */
>>>       /* To reference all call states from BlockCopyState */
>>>       QLIST_ENTRY(BlockCopyCallState) list;
>>>       /* OUT parameters */
>>> -    bool cancelled;
>>> +    bool cancelled; /* atomic */
>>
>> Logic around cancelled is simpler:
>>
>> - false at start
>>
>> - qatomic_read is allowed from any thread
>>
>> - qatomic_write to true is allowed from any thread
>>
>> - never write to false
>>
>> Note that cancelling and finishing are racy: the user can cancel a block-copy that has already finished. We could probably change that, but I'm not sure it's worth doing. Still, maybe leave a comment in the API documentation.
>>
>>>       /* Fields protected by lock in BlockCopyState */
>>>       bool error_is_read;
>>>       int ret;
>>> @@ -650,7 +650,8 @@ block_copy_dirty_clusters(BlockCopyCallState *call_state)
>>>       assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
>>>       assert(QEMU_IS_ALIGNED(bytes, s->cluster_size));
>>> -    while (bytes && aio_task_pool_status(aio) == 0 && !call_state->cancelled) {
>>> +    while (bytes && aio_task_pool_status(aio) == 0 &&
>>> +           !qatomic_read(&call_state->cancelled)) {
>>>           BlockCopyTask *task;
>>>           int64_t status_bytes;
>>> @@ -761,7 +762,7 @@ static int coroutine_fn block_copy_common(BlockCopyCallState *call_state)
>>>       do {
>>>           ret = block_copy_dirty_clusters(call_state);
>>> -        if (ret == 0 && !call_state->cancelled) {
>>> +        if (ret == 0 && !qatomic_read(&call_state->cancelled)) {
>>>               WITH_QEMU_LOCK_GUARD(&s->lock) {
>>>                   /*
>>>                    * Check that there is no task we still need to
>>> @@ -792,9 +793,9 @@ static int coroutine_fn block_copy_common(BlockCopyCallState *call_state)
>>>            * 2. We have waited for some intersecting block-copy request
>>>            *    It may have failed and produced new dirty bits.
>>>            */
>>> -    } while (ret > 0 && !call_state->cancelled);
>>> +    } while (ret > 0 && !qatomic_read(&call_state->cancelled));
>>> -    call_state->finished = true;
>>> +    qatomic_store_release(&call_state->finished, true);
>>
>> so, all writes to ret and error_is_read are finished by this point.
>>
>>>       if (call_state->cb) {
>>>           call_state->cb(call_state->cb_opaque);
>>> @@ -857,35 +858,37 @@ void block_copy_call_free(BlockCopyCallState *call_state)
>>>           return;
>>>       }
>>> -    assert(call_state->finished);
>>> +    assert(qatomic_load_acquire(&call_state->finished));
>>
>> Here we don't need load_acquire, as we don't read other fields; qatomic_read is enough.
> 
> So what you say makes sense, the only thing that I wonder is: wouldn't it be better to have the acquire without assertion (or assert afterwards), just to be sure that we delete when finished is true?
> 

Hmm. I think neither the compiler nor the processor should reorder a read of a structure field with the free() call on the whole structure :)

And anyway, for block_copy_call_free() the caller is responsible for ensuring the structure is not being used by another thread.

> 
>>
>>>   }
>>>   bool block_copy_call_cancelled(BlockCopyCallState *call_state)
>>>   {
>>> -    return call_state->cancelled;
>>> +    return qatomic_read(&call_state->cancelled);
>>>   }
>>>   int block_copy_call_status(BlockCopyCallState *call_state, bool *error_is_read)
>>>   {
>>> -    assert(call_state->finished);
>>> +    assert(qatomic_load_acquire(&call_state->finished));
>>
>> Hmm. Here qatomic_load_acquire protects nothing (the assertion will crash if we are not yet finished anyway). So, the caller is doubly sure that block-copy is finished.
>>
>> Also, it's misleading: if we think that it does some protection, we are doing the wrong thing: assertions may simply be compiled out, so we can't rely on statements inside assert() to be executed.
>>
>> So, let's use a simple qatomic_read here too.
> 
> Same applies here.

Here I agree with Paolo: the assertion works better as written.

So we can just keep it as is.

> 
>>
>>>       if (error_is_read) {
>>>           *error_is_read = call_state->error_is_read;
>>>       }
>>> @@ -894,7 +897,7 @@ int block_copy_call_status(BlockCopyCallState *call_state, bool *error_is_read)
>>>   void block_copy_call_cancel(BlockCopyCallState *call_state)
>>>   {
>>> -    call_state->cancelled = true;
>>> +    qatomic_set(&call_state->cancelled, true);
>>>       block_copy_kick(call_state);
>>>   }
>>>
>>
>> Uhh :)
>>
>> Ok, that looks close too. Or in other words, I feel that I have good enough understanding of all the thread-safe logic that you have implemented :)
> 
> Good! :)
> 
> Emanuele
> 


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 6/6] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState
  2021-06-22  9:36       ` Vladimir Sementsov-Ogievskiy
@ 2021-06-22 10:20         ` Paolo Bonzini
  2021-06-22 10:39           ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 30+ messages in thread
From: Paolo Bonzini @ 2021-06-22 10:20 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, Emanuele Giuseppe Esposito, qemu-block
  Cc: Kevin Wolf, John Snow, qemu-devel, Stefan Hajnoczi, Max Reitz

On 22/06/21 11:36, Vladimir Sementsov-Ogievskiy wrote:
>> It does.  If it returns true, you still want the load of finished to 
>> happen before the reads that follow.
> 
> Hmm. The worst case if we use just qatomic_read is that the assertion 
> will not crash when it actually should. That doesn't break the logic, 
> but it's not good anyway.
> 
> OK, I agree, let's keep it.

You can also have a finished job, but get a stale value for 
error_is_read or ret.  The issue is not in getting the stale value per 
se, but in block_copy_call_status's caller not expecting it.

(I understand you agree, but I guess it can be interesting to learn 
about this too).

Paolo



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 6/6] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState
  2021-06-22 10:20         ` Paolo Bonzini
@ 2021-06-22 10:39           ` Vladimir Sementsov-Ogievskiy
  2021-06-22 20:57             ` Emanuele Giuseppe Esposito
  2021-06-23 10:06             ` Paolo Bonzini
  0 siblings, 2 replies; 30+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-06-22 10:39 UTC (permalink / raw)
  To: Paolo Bonzini, Emanuele Giuseppe Esposito, qemu-block
  Cc: John Snow, Kevin Wolf, Max Reitz, Stefan Hajnoczi, qemu-devel

22.06.2021 13:20, Paolo Bonzini wrote:
> On 22/06/21 11:36, Vladimir Sementsov-Ogievskiy wrote:
>>> It does.  If it returns true, you still want the load of finished to happen before the reads that follow.
>>
>> Hmm.. The worst case, if we use just qatomic_read, is that the assertion will not crash when it actually should. That doesn't break the logic, but that's not good anyway.
>>
>> OK, I agree, let's keep it.
> 
> You can also have a finished job, but get a stale value for error_is_read or ret.  The issue is not in getting the stale value per se, but in block_copy_call_status's caller not expecting it.
> 
> (I understand you agree, but I guess it can be interesting to learn about this too).
> 

Hmm. So, do you mean that we can read ret and error_is_read ONLY after explicitly doing load_acquire(finished) and checking that it's true?

That means that we must do it outside the assertion (so that it is not compiled out):

bool finished = qatomic_load_acquire(&call_state->finished);

assert(finished);

... read ret and error_is_read ...


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 6/6] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState
  2021-06-22 10:39           ` Vladimir Sementsov-Ogievskiy
@ 2021-06-22 20:57             ` Emanuele Giuseppe Esposito
  2021-06-23 10:06             ` Paolo Bonzini
  1 sibling, 0 replies; 30+ messages in thread
From: Emanuele Giuseppe Esposito @ 2021-06-22 20:57 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, Paolo Bonzini, qemu-block
  Cc: Kevin Wolf, John Snow, qemu-devel, Stefan Hajnoczi, Max Reitz



On 22/06/2021 12:39, Vladimir Sementsov-Ogievskiy wrote:
> 22.06.2021 13:20, Paolo Bonzini wrote:
>> On 22/06/21 11:36, Vladimir Sementsov-Ogievskiy wrote:
>>>> It does.  If it returns true, you still want the load of finished to 
>>>> happen before the reads that follow.
>>>
>>> Hmm.. The worst case, if we use just qatomic_read, is that the assertion 
>>> will not crash when it actually should. That doesn't break the logic, 
>>> but that's not good anyway.
>>>
>>> OK, I agree, let's keep it.
>>
>> You can also have a finished job, but get a stale value for 
>> error_is_read or ret.  The issue is not in getting the stale value per 
>> se, but in block_copy_call_status's caller not expecting it.
>>
>> (I understand you agree, but I guess it can be interesting to learn 
>> about this too).
>>
> 
> Hmm. So, do you mean that we can read ret and error_is_read ONLY after 
> explicitly doing load_acquire(finished) and checking that it's true?
> 
> That means that we must do it outside the assertion (so that it is not compiled out):
> 
> bool finished = qatomic_load_acquire(&call_state->finished);
> 
> assert(finished);
> 
> ... read ret and error_is_read ...
> 
> 

If I understand correctly, this is what I was trying to say before: 
maybe it's better to make sure that @finished is set before reading 
@ret and @error_is_read. And because assert() can be compiled out, we 
can do it as you wrote above.

Anyway, let's wait for Paolo's answer on this. Once this is settled, I 
will send v5.

Emanuele



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 6/6] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState
  2021-06-22 10:39           ` Vladimir Sementsov-Ogievskiy
  2021-06-22 20:57             ` Emanuele Giuseppe Esposito
@ 2021-06-23 10:06             ` Paolo Bonzini
  1 sibling, 0 replies; 30+ messages in thread
From: Paolo Bonzini @ 2021-06-23 10:06 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, Emanuele Giuseppe Esposito, qemu-block
  Cc: Kevin Wolf, John Snow, qemu-devel, Stefan Hajnoczi, Max Reitz

On 22/06/21 12:39, Vladimir Sementsov-Ogievskiy wrote:
> 22.06.2021 13:20, Paolo Bonzini wrote:
>> On 22/06/21 11:36, Vladimir Sementsov-Ogievskiy wrote:
>>> OK, I agree, let's keep it.
>>
>> You can also have a finished job, but get a stale value for 
>> error_is_read or ret.  The issue is not in getting the stale value per 
>> se, but in block_copy_call_status's caller not expecting it.
> 
> Hmm. So, do you mean that we can read ret and error_is_read ONLY after 
> explicitly doing load_acquire(finished) and checking that it's true?
> 
> That means that we must do it outside the assertion (so that it is not compiled out):
> 
> bool finished = qatomic_load_acquire(&call_state->finished);
> 
> assert(finished);
> 
> ... read ret and error_is_read ...

In reality you must have synchronized in some other way; that 
synchronization outside block-copy.c is what guarantees that the 
assertion does not fail.  The simplest cases are:

- a mutex: "unlock" counts as release, "lock" counts as acquire;

- a bottom half: "schedule" counts as release, running counts as acquire.

Therefore, removing the assertion would work just fine because the 
synchronization would be like this:

    block-copy side                         caller side
    write ret/error_is_read
    write finished
    trigger bottom half or something -->    bottom half runs
                                            read ret/error_is_read

So there is no need at all to read ->finished, much less to load it 
outside the assertion.  At the same time there are two problems with 
"assert(qatomic_read(&call_state->finished))".  Note that they are not 
logic issues, they are maintenance issues.

First, if *there is a bug elsewhere* and the above synchronization 
doesn't happen, it may manifest sometimes as an assertion failure and 
sometimes as a memory reordering.  This is what I was talking about in 
the previous message, and it is probably the last thing that you want 
when debugging; since we're adding asserts defensively, we might as well 
do it well.

Second, if somebody later carelessly changes the function to

   if (qatomic_read(&call_state->finished)) {
       ...
   } else {
       error_setg(...);
   }

that would be broken.  Using qatomic_load_acquire makes the code more 
future-proof against a change like the one above.
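
(To make that concrete, here is a hypothetical sketch of the
bottom-half pairing described above; it is not code from the series.
copy_finished, completion_bh and the ctx field are invented for the
example, while aio_bh_schedule_oneshot, block_copy_call_status and
block_copy_call_free are the existing APIs:)

    /* Caller side: the bottom half running is the "acquire". */
    static void completion_bh(void *opaque)
    {
        BlockCopyCallState *call_state = opaque;
        bool error_is_read;

        /* Safe: the BH ordering guarantees these reads see the final
         * values written by the block-copy side. */
        int ret = block_copy_call_status(call_state, &error_is_read);
        if (ret < 0 && !error_is_read) {
            /* e.g. report a write error back to the job */
        }
        block_copy_call_free(call_state);
    }

    /* Block-copy side: scheduling the bottom half is the "release". */
    static void copy_finished(BlockCopyCallState *call_state, int ret)
    {
        call_state->ret = ret;
        qatomic_store_release(&call_state->finished, true);
        /* The schedule also orders the writes above before the BH body. */
        aio_bh_schedule_oneshot(call_state->ctx, completion_bh, call_state);
    }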

Paolo



^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread

Thread overview: 30+ messages
2021-06-14  7:33 [PATCH v4 0/6] block-copy: protect block-copy internal structures Emanuele Giuseppe Esposito
2021-06-14  7:33 ` [PATCH v4 1/6] block-copy: small refactor in block_copy_task_entry and block_copy_common Emanuele Giuseppe Esposito
2021-06-19 14:33   ` Vladimir Sementsov-Ogievskiy
2021-06-14  7:33 ` [PATCH v4 2/6] block-copy: streamline choice of copy_range vs. read/write Emanuele Giuseppe Esposito
2021-06-19 15:05   ` Vladimir Sementsov-Ogievskiy
2021-06-19 18:23   ` Vladimir Sementsov-Ogievskiy
2021-06-14  7:33 ` [PATCH v4 3/6] block-copy: improve comments of BlockCopyTask and BlockCopyState types and functions Emanuele Giuseppe Esposito
2021-06-19 15:23   ` Vladimir Sementsov-Ogievskiy
2021-06-19 18:31     ` Vladimir Sementsov-Ogievskiy
2021-06-21  8:13       ` Emanuele Giuseppe Esposito
2021-06-22  9:20         ` Vladimir Sementsov-Ogievskiy
2021-06-21  7:59     ` Emanuele Giuseppe Esposito
2021-06-22  9:16       ` Vladimir Sementsov-Ogievskiy
2021-06-19 17:27   ` Vladimir Sementsov-Ogievskiy
2021-06-21  8:21     ` Emanuele Giuseppe Esposito
2021-06-19 18:53   ` Vladimir Sementsov-Ogievskiy
2021-06-21  8:28     ` Emanuele Giuseppe Esposito
2021-06-14  7:33 ` [PATCH v4 4/6] block-copy: move progress_set_remaining in block_copy_task_end Emanuele Giuseppe Esposito
2021-06-14  7:33 ` [PATCH v4 5/6] block-copy: add a CoMutex Emanuele Giuseppe Esposito
2021-06-19 19:34   ` Vladimir Sementsov-Ogievskiy
2021-06-14  7:33 ` [PATCH v4 6/6] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState Emanuele Giuseppe Esposito
2021-06-19 20:06   ` Vladimir Sementsov-Ogievskiy
2021-06-21  9:30     ` Emanuele Giuseppe Esposito
2021-06-22  9:56       ` Vladimir Sementsov-Ogievskiy
2021-06-22  8:15     ` Paolo Bonzini
2021-06-22  9:36       ` Vladimir Sementsov-Ogievskiy
2021-06-22 10:20         ` Paolo Bonzini
2021-06-22 10:39           ` Vladimir Sementsov-Ogievskiy
2021-06-22 20:57             ` Emanuele Giuseppe Esposito
2021-06-23 10:06             ` Paolo Bonzini
