* [PATCH v2 0/7] block-copy: protect block-copy internal structures
@ 2021-05-18 10:07 Emanuele Giuseppe Esposito
  2021-05-18 10:07 ` [PATCH v2 1/7] block-copy: streamline choice of copy_range vs. read/write Emanuele Giuseppe Esposito
                   ` (8 more replies)
  0 siblings, 9 replies; 48+ messages in thread
From: Emanuele Giuseppe Esposito @ 2021-05-18 10:07 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Emanuele Giuseppe Esposito,
	Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz,
	Stefan Hajnoczi, Paolo Bonzini, John Snow

This series of patches aims to reduce the usage of the global
AioContext lock in block-copy by introducing smaller-granularity
locks, thus making the block layer thread safe.

This series depends on Paolo's coroutine_sleep API and on my previous
series that brings thread safety to the smaller APIs used by block-copy,
like ratelimit, progressmeter and co-shared-resource.

What's missing for block-copy to be fully thread-safe is fixing
the CoSleep API to allow cross-thread sleep and wakeup.
Paolo is working on it and will post the patches once his new
CoSleep API is accepted.

Patch 1 introduces the .method field to replace .use_copy_range
and .copy_size, so that it can later be used atomically.
Patches 2-3 provide comments and refactoring in preparation for
the locks added in patch 4 on BlockCopyTask, patches 5-6 on
BlockCopyCallState and patch 7 on BlockCopyState.

Based-on: <20210517100548.28806-1-pbonzini@redhat.com>
Based-on: <20210518094058.25952-1-eesposit@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
v1 -> v2:
* More fields categorized as IN/State/OUT in the various structs, better
  documentation in the structs
* Fix a couple of places where I missed locks [Vladimir, Paolo]


Emanuele Giuseppe Esposito (6):
  block-copy: improve documentation of BlockCopyTask and BlockCopyState
    types and functions
  block-copy: move progress_set_remaining in block_copy_task_end
  block-copy: add a CoMutex to the BlockCopyTask list
  block-copy: add QemuMutex lock for BlockCopyCallState list
  block-copy: atomic .cancelled and .finished fields in
    BlockCopyCallState
  block-copy: protect BlockCopyState .method fields

Paolo Bonzini (1):
  block-copy: streamline choice of copy_range vs. read/write

 block/block-copy.c | 234 +++++++++++++++++++++++++++++----------------
 1 file changed, 150 insertions(+), 84 deletions(-)

-- 
2.30.2




* [PATCH v2 1/7] block-copy: streamline choice of copy_range vs. read/write
  2021-05-18 10:07 [PATCH v2 0/7] block-copy: protect block-copy internal structures Emanuele Giuseppe Esposito
@ 2021-05-18 10:07 ` Emanuele Giuseppe Esposito
  2021-05-20 14:42   ` Vladimir Sementsov-Ogievskiy
  2021-05-27  8:20   ` Stefan Hajnoczi
  2021-05-18 10:07 ` [PATCH v2 2/7] block-copy: improve documentation of BlockCopyTask and BlockCopyState types and functions Emanuele Giuseppe Esposito
                   ` (7 subsequent siblings)
  8 siblings, 2 replies; 48+ messages in thread
From: Emanuele Giuseppe Esposito @ 2021-05-18 10:07 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Emanuele Giuseppe Esposito,
	Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz,
	Stefan Hajnoczi, Paolo Bonzini, John Snow

From: Paolo Bonzini <pbonzini@redhat.com>

Put the logic to determine the copy size in a separate function, so
that there is a simple state machine for the possible methods of
copying data from one BlockDriverState to the other.

While at it, store the common computation of block_copy_max_transfer
into a new field of BlockCopyState, and make sure that we always
obey max_transfer; that's more efficient even in the
copy_range cases.
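
For illustration, a minimal self-contained sketch of the resulting state
machine (the enum matches the patch below; next_method is a hypothetical
helper that condenses the transitions block_copy_do_copy performs):

    #include <stdbool.h>

    typedef enum {
        COPY_READ_WRITE_CLUSTER, /* fallback: cluster-sized buffered copy */
        COPY_READ_WRITE,         /* buffered copy, larger chunks allowed */
        COPY_RANGE_SMALL,        /* copy_range enabled, conservative size */
        COPY_RANGE_FULL          /* copy_range proven to work, full size */
    } BlockCopyMethod;

    /* Transition taken after each copy attempt. */
    static BlockCopyMethod next_method(BlockCopyMethod m, bool range_ok)
    {
        if (m < COPY_RANGE_SMALL) {
            return m;               /* read/write methods never change */
        }
        if (!range_ok) {
            return COPY_READ_WRITE; /* copy_range failed: buffered fallback */
        }
        return COPY_RANGE_FULL;     /* first success: grow the chunk size */
    }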

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block/block-copy.c | 59 ++++++++++++++++++++++++++++++----------------
 1 file changed, 39 insertions(+), 20 deletions(-)

diff --git a/block/block-copy.c b/block/block-copy.c
index 37c8e8504b..10ce51a244 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -28,6 +28,13 @@
 #define BLOCK_COPY_MAX_WORKERS 64
 #define BLOCK_COPY_SLICE_TIME 100000000ULL /* ns */
 
+typedef enum {
+    COPY_READ_WRITE_CLUSTER,
+    COPY_READ_WRITE,
+    COPY_RANGE_SMALL,
+    COPY_RANGE_FULL
+} BlockCopyMethod;
+
 static coroutine_fn int block_copy_task_entry(AioTask *task);
 
 typedef struct BlockCopyCallState {
@@ -85,8 +92,8 @@ typedef struct BlockCopyState {
     BdrvDirtyBitmap *copy_bitmap;
     int64_t in_flight_bytes;
     int64_t cluster_size;
-    bool use_copy_range;
-    int64_t copy_size;
+    BlockCopyMethod method;
+    int64_t max_transfer;
     uint64_t len;
     QLIST_HEAD(, BlockCopyTask) tasks; /* All tasks from all block-copy calls */
     QLIST_HEAD(, BlockCopyCallState) calls;
@@ -148,6 +155,23 @@ static bool coroutine_fn block_copy_wait_one(BlockCopyState *s, int64_t offset,
     return true;
 }
 
+static inline int64_t block_copy_chunk_size(BlockCopyState *s)
+{
+    switch (s->method) {
+    case COPY_READ_WRITE_CLUSTER:
+        return s->cluster_size;
+    case COPY_READ_WRITE:
+    case COPY_RANGE_SMALL:
+        return MIN(MAX(s->cluster_size, BLOCK_COPY_MAX_BUFFER),
+                   s->max_transfer);
+    case COPY_RANGE_FULL:
+        return MIN(MAX(s->cluster_size, BLOCK_COPY_MAX_COPY_RANGE),
+                   s->max_transfer);
+    default:
+        abort();
+    }
+}
+
 /*
  * Search for the first dirty area in offset/bytes range and create task at
  * the beginning of it.
@@ -157,8 +181,9 @@ static BlockCopyTask *block_copy_task_create(BlockCopyState *s,
                                              int64_t offset, int64_t bytes)
 {
     BlockCopyTask *task;
-    int64_t max_chunk = MIN_NON_ZERO(s->copy_size, call_state->max_chunk);
+    int64_t max_chunk = block_copy_chunk_size(s);
 
+    max_chunk = MIN_NON_ZERO(max_chunk, call_state->max_chunk);
     if (!bdrv_dirty_bitmap_next_dirty_area(s->copy_bitmap,
                                            offset, offset + bytes,
                                            max_chunk, &offset, &bytes))
@@ -265,28 +290,27 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
         .len = bdrv_dirty_bitmap_size(copy_bitmap),
         .write_flags = write_flags,
         .mem = shres_create(BLOCK_COPY_MAX_MEM),
+        .max_transfer = QEMU_ALIGN_DOWN(block_copy_max_transfer(source, target)
+                                        , cluster_size),
     };
 
-    if (block_copy_max_transfer(source, target) < cluster_size) {
+    if (s->max_transfer < cluster_size) {
         /*
          * copy_range does not respect max_transfer. We don't want to bother
          * with requests smaller than block-copy cluster size, so fallback to
          * buffered copying (read and write respect max_transfer on their
          * behalf).
          */
-        s->use_copy_range = false;
-        s->copy_size = cluster_size;
+        s->method = COPY_READ_WRITE_CLUSTER;
     } else if (write_flags & BDRV_REQ_WRITE_COMPRESSED) {
         /* Compression supports only cluster-size writes and no copy-range. */
-        s->use_copy_range = false;
-        s->copy_size = cluster_size;
+        s->method = COPY_READ_WRITE_CLUSTER;
     } else {
         /*
          * We enable copy-range, but keep small copy_size, until first
          * successful copy_range (look at block_copy_do_copy).
          */
-        s->use_copy_range = use_copy_range;
-        s->copy_size = MAX(s->cluster_size, BLOCK_COPY_MAX_BUFFER);
+        s->method = use_copy_range ? COPY_RANGE_SMALL : COPY_READ_WRITE;
     }
 
     ratelimit_init(&s->rate_limit);
@@ -369,30 +393,25 @@ static int coroutine_fn block_copy_do_copy(BlockCopyState *s,
         return ret;
     }
 
-    if (s->use_copy_range) {
+    if (s->method >= COPY_RANGE_SMALL) {
         ret = bdrv_co_copy_range(s->source, offset, s->target, offset, nbytes,
                                  0, s->write_flags);
         if (ret < 0) {
             trace_block_copy_copy_range_fail(s, offset, ret);
-            s->use_copy_range = false;
-            s->copy_size = MAX(s->cluster_size, BLOCK_COPY_MAX_BUFFER);
+            s->method = COPY_READ_WRITE;
             /* Fallback to read+write with allocated buffer */
         } else {
-            if (s->use_copy_range) {
+            if (s->method == COPY_RANGE_SMALL) {
                 /*
                  * Successful copy-range. Now increase copy_size.  copy_range
                  * does not respect max_transfer (it's a TODO), so we factor
                  * that in here.
                  *
-                 * Note: we double-check s->use_copy_range for the case when
+                 * Note: we double-check s->method for the case when
                  * parallel block-copy request unsets it during previous
                  * bdrv_co_copy_range call.
                  */
-                s->copy_size =
-                        MIN(MAX(s->cluster_size, BLOCK_COPY_MAX_COPY_RANGE),
-                            QEMU_ALIGN_DOWN(block_copy_max_transfer(s->source,
-                                                                    s->target),
-                                            s->cluster_size));
+                s->method = COPY_RANGE_FULL;
             }
             goto out;
         }
-- 
2.30.2




* [PATCH v2 2/7] block-copy: improve documentation of BlockCopyTask and BlockCopyState types and functions
  2021-05-18 10:07 [PATCH v2 0/7] block-copy: protect block-copy internal structures Emanuele Giuseppe Esposito
  2021-05-18 10:07 ` [PATCH v2 1/7] block-copy: streamline choice of copy_range vs. read/write Emanuele Giuseppe Esposito
@ 2021-05-18 10:07 ` Emanuele Giuseppe Esposito
  2021-05-20 15:00   ` Vladimir Sementsov-Ogievskiy
  2021-05-18 10:07 ` [PATCH v2 3/7] block-copy: move progress_set_remaining in block_copy_task_end Emanuele Giuseppe Esposito
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 48+ messages in thread
From: Emanuele Giuseppe Esposito @ 2021-05-18 10:07 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Emanuele Giuseppe Esposito,
	Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz,
	Stefan Hajnoczi, Paolo Bonzini, John Snow

As done in BlockCopyCallState, categorize the BlockCopyTask
and BlockCopyState fields into IN, State and OUT groups.
This is just to clarify which fields have to be protected with a lock.

BlockCopyTask .zeroes is a special case, because it is only initialized
and then read by the coroutine in block_copy_task_entry.

Also mark block_copy_task_create as coroutine_fn because:
1) it is static and only invoked by coroutine functions
2) the next patches will introduce and use a CoMutex lock there
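
As a condensed illustration of the IN/State/OUT convention above
(Example is a hypothetical struct; the real layout is in the diff below):

    #include <stdint.h>

    typedef struct Example {
        /* IN: set once at creation time and never changed again,
         * so it can be read without any lock. */
        int64_t cluster_size;

        /* State: mutated while the copy runs; the following patches
         * put fields like this one under a lock. */
        int64_t in_flight_bytes;

        /* OUT: written by the worker, read back by the caller once
         * the operation has finished. */
        int ret;
    } Example;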

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
 block/block-copy.c | 49 ++++++++++++++++++++++++++++++++--------------
 1 file changed, 34 insertions(+), 15 deletions(-)

diff --git a/block/block-copy.c b/block/block-copy.c
index 10ce51a244..d2d3839dec 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -67,13 +67,28 @@ typedef struct BlockCopyCallState {
 typedef struct BlockCopyTask {
     AioTask task;
 
+    /*
+     * IN parameters. Initialized in block_copy_task_create()
+     * and never changed.
+     */
     BlockCopyState *s;
     BlockCopyCallState *call_state;
     int64_t offset;
-    int64_t bytes;
+    int64_t bytes; /* only re-set in task_shrink, before running the task */
+
+    /*
+     * "local" parameter: used only to communicate between
+     * the caller (block_copy_dirty_clusters) and the aiotask
+     * coroutine running block_copy_task_entry
+     */
     bool zeroes;
-    QLIST_ENTRY(BlockCopyTask) list;
+
+    /* State */
     CoQueue wait_queue; /* coroutines blocked on this task */
+
+    /* To reference all call states from BlockCopyState */
+    QLIST_ENTRY(BlockCopyTask) list;
+
 } BlockCopyTask;
 
 static int64_t task_end(BlockCopyTask *task)
@@ -89,15 +104,25 @@ typedef struct BlockCopyState {
      */
     BdrvChild *source;
     BdrvChild *target;
-    BdrvDirtyBitmap *copy_bitmap;
+
+    /* State */
     int64_t in_flight_bytes;
-    int64_t cluster_size;
     BlockCopyMethod method;
-    int64_t max_transfer;
-    uint64_t len;
     QLIST_HEAD(, BlockCopyTask) tasks; /* All tasks from all block-copy calls */
     QLIST_HEAD(, BlockCopyCallState) calls;
+    /* State fields that use a thread-safe API */
+    BdrvDirtyBitmap *copy_bitmap;
+    ProgressMeter *progress;
+    SharedResource *mem;
+    RateLimit rate_limit;
 
+    /*
+     * IN parameters. Initialized in block_copy_state_new()
+     * and never changed.
+     */
+    int64_t cluster_size;
+    int64_t max_transfer;
+    uint64_t len;
     BdrvRequestFlags write_flags;
 
     /*
@@ -115,12 +140,6 @@ typedef struct BlockCopyState {
      * block_copy_reset_unallocated() every time it does.
      */
     bool skip_unallocated;
-
-    ProgressMeter *progress;
-
-    SharedResource *mem;
-
-    RateLimit rate_limit;
 } BlockCopyState;
 
 static BlockCopyTask *find_conflicting_task(BlockCopyState *s,
@@ -176,9 +195,9 @@ static inline int64_t block_copy_chunk_size(BlockCopyState *s)
  * Search for the first dirty area in offset/bytes range and create task at
  * the beginning of it.
  */
-static BlockCopyTask *block_copy_task_create(BlockCopyState *s,
-                                             BlockCopyCallState *call_state,
-                                             int64_t offset, int64_t bytes)
+static coroutine_fn BlockCopyTask *block_copy_task_create(BlockCopyState *s,
+                                                BlockCopyCallState *call_state,
+                                                int64_t offset, int64_t bytes)
 {
     BlockCopyTask *task;
     int64_t max_chunk = block_copy_chunk_size(s);
-- 
2.30.2




* [PATCH v2 3/7] block-copy: move progress_set_remaining in block_copy_task_end
  2021-05-18 10:07 [PATCH v2 0/7] block-copy: protect block-copy internal structures Emanuele Giuseppe Esposito
  2021-05-18 10:07 ` [PATCH v2 1/7] block-copy: streamline choice of copy_range vs. read/write Emanuele Giuseppe Esposito
  2021-05-18 10:07 ` [PATCH v2 2/7] block-copy: improve documentation of BlockCopyTask and BlockCopyState types and functions Emanuele Giuseppe Esposito
@ 2021-05-18 10:07 ` Emanuele Giuseppe Esposito
  2021-05-20 15:03   ` Vladimir Sementsov-Ogievskiy
  2021-05-18 10:07 ` [PATCH v2 4/7] block-copy: add a CoMutex to the BlockCopyTask list Emanuele Giuseppe Esposito
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 48+ messages in thread
From: Emanuele Giuseppe Esposito @ 2021-05-18 10:07 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Emanuele Giuseppe Esposito,
	Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz,
	Stefan Hajnoczi, Paolo Bonzini, John Snow

Moving this call into block_copy_task_end ensures that the progress
is updated in any case, even if there is an error.

It also helps the next patch, allowing task_end to have only
one critical section.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
 block/block-copy.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/block/block-copy.c b/block/block-copy.c
index d2d3839dec..2e610b4142 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -264,6 +264,9 @@ static void coroutine_fn block_copy_task_end(BlockCopyTask *task, int ret)
         bdrv_set_dirty_bitmap(task->s->copy_bitmap, task->offset, task->bytes);
     }
     QLIST_REMOVE(task, list);
+    progress_set_remaining(task->s->progress,
+                           bdrv_get_dirty_count(task->s->copy_bitmap) +
+                           task->s->in_flight_bytes);
     qemu_co_queue_restart_all(&task->wait_queue);
 }
 
@@ -645,9 +648,6 @@ block_copy_dirty_clusters(BlockCopyCallState *call_state)
         }
         if (s->skip_unallocated && !(ret & BDRV_BLOCK_ALLOCATED)) {
             block_copy_task_end(task, 0);
-            progress_set_remaining(s->progress,
-                                   bdrv_get_dirty_count(s->copy_bitmap) +
-                                   s->in_flight_bytes);
             trace_block_copy_skip_range(s, task->offset, task->bytes);
             offset = task_end(task);
             bytes = end - offset;
-- 
2.30.2




* [PATCH v2 4/7] block-copy: add a CoMutex to the BlockCopyTask list
  2021-05-18 10:07 [PATCH v2 0/7] block-copy: protect block-copy internal structures Emanuele Giuseppe Esposito
                   ` (2 preceding siblings ...)
  2021-05-18 10:07 ` [PATCH v2 3/7] block-copy: move progress_set_remaining in block_copy_task_end Emanuele Giuseppe Esposito
@ 2021-05-18 10:07 ` Emanuele Giuseppe Esposito
  2021-05-20 15:19   ` Vladimir Sementsov-Ogievskiy
  2021-05-27  9:07   ` Stefan Hajnoczi
  2021-05-18 10:07 ` [PATCH v2 5/7] block-copy: add QemuMutex lock for BlockCopyCallState list Emanuele Giuseppe Esposito
                   ` (4 subsequent siblings)
  8 siblings, 2 replies; 48+ messages in thread
From: Emanuele Giuseppe Esposito @ 2021-05-18 10:07 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Emanuele Giuseppe Esposito,
	Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz,
	Stefan Hajnoczi, Paolo Bonzini, John Snow

Because the list of tasks is only modified by coroutine
functions, add a CoMutex in order to protect it.

Use the same mutex to also protect the BlockCopyState in_flight_bytes
field, to avoid adding additional synchronization primitives.
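
A rough sketch of the pattern this patch applies (it assumes QEMU's
coroutine and lockable headers, so it only builds inside the QEMU tree;
Counter, counter_init and counter_add are hypothetical names):

    #include "qemu/osdep.h"
    #include "qemu/coroutine.h"
    #include "qemu/lockable.h"

    typedef struct Counter {
        CoMutex lock;
        int64_t in_flight_bytes; /* protected by lock */
    } Counter;

    static void counter_init(Counter *c)
    {
        qemu_co_mutex_init(&c->lock);
        c->in_flight_bytes = 0;
    }

    /*
     * coroutine_fn: qemu_co_mutex_lock may yield instead of blocking the
     * thread, so callers must be coroutines -- the same reason
     * block_copy_task_create was marked coroutine_fn in patch 2.
     */
    static void coroutine_fn counter_add(Counter *c, int64_t bytes)
    {
        QEMU_LOCK_GUARD(&c->lock); /* released automatically at scope exit */
        c->in_flight_bytes += bytes;
    }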

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
 block/block-copy.c | 55 +++++++++++++++++++++++++++++-----------------
 1 file changed, 35 insertions(+), 20 deletions(-)

diff --git a/block/block-copy.c b/block/block-copy.c
index 2e610b4142..3a949fab64 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -83,7 +83,7 @@ typedef struct BlockCopyTask {
      */
     bool zeroes;
 
-    /* State */
+    /* State. Protected by tasks_lock */
     CoQueue wait_queue; /* coroutines blocked on this task */
 
     /* To reference all call states from BlockCopyState */
@@ -106,8 +106,9 @@ typedef struct BlockCopyState {
     BdrvChild *target;
 
     /* State */
-    int64_t in_flight_bytes;
+    int64_t in_flight_bytes; /* protected by tasks_lock */
     BlockCopyMethod method;
+    CoMutex tasks_lock;
     QLIST_HEAD(, BlockCopyTask) tasks; /* All tasks from all block-copy calls */
     QLIST_HEAD(, BlockCopyCallState) calls;
     /* State fields that use a thread-safe API */
@@ -142,8 +143,10 @@ typedef struct BlockCopyState {
     bool skip_unallocated;
 } BlockCopyState;
 
-static BlockCopyTask *find_conflicting_task(BlockCopyState *s,
-                                            int64_t offset, int64_t bytes)
+/* Called with lock held */
+static BlockCopyTask *find_conflicting_task_locked(BlockCopyState *s,
+                                                   int64_t offset,
+                                                   int64_t bytes)
 {
     BlockCopyTask *t;
 
@@ -163,13 +166,16 @@ static BlockCopyTask *find_conflicting_task(BlockCopyState *s,
 static bool coroutine_fn block_copy_wait_one(BlockCopyState *s, int64_t offset,
                                              int64_t bytes)
 {
-    BlockCopyTask *task = find_conflicting_task(s, offset, bytes);
+    BlockCopyTask *task;
+
+    QEMU_LOCK_GUARD(&s->tasks_lock);
+    task = find_conflicting_task_locked(s, offset, bytes);
 
     if (!task) {
         return false;
     }
 
-    qemu_co_queue_wait(&task->wait_queue, NULL);
+    qemu_co_queue_wait(&task->wait_queue, &s->tasks_lock);
 
     return true;
 }
@@ -213,11 +219,7 @@ static coroutine_fn BlockCopyTask *block_copy_task_create(BlockCopyState *s,
     assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
     bytes = QEMU_ALIGN_UP(bytes, s->cluster_size);
 
-    /* region is dirty, so no existent tasks possible in it */
-    assert(!find_conflicting_task(s, offset, bytes));
-
     bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, bytes);
-    s->in_flight_bytes += bytes;
 
     task = g_new(BlockCopyTask, 1);
     *task = (BlockCopyTask) {
@@ -228,7 +230,13 @@ static coroutine_fn BlockCopyTask *block_copy_task_create(BlockCopyState *s,
         .bytes = bytes,
     };
     qemu_co_queue_init(&task->wait_queue);
-    QLIST_INSERT_HEAD(&s->tasks, task, list);
+
+    WITH_QEMU_LOCK_GUARD(&s->tasks_lock) {
+        s->in_flight_bytes += bytes;
+        /* region is dirty, so no existent tasks possible in it */
+        assert(!find_conflicting_task_locked(s, offset, bytes));
+        QLIST_INSERT_HEAD(&s->tasks, task, list);
+    }
 
     return task;
 }
@@ -249,25 +257,29 @@ static void coroutine_fn block_copy_task_shrink(BlockCopyTask *task,
 
     assert(new_bytes > 0 && new_bytes < task->bytes);
 
-    task->s->in_flight_bytes -= task->bytes - new_bytes;
     bdrv_set_dirty_bitmap(task->s->copy_bitmap,
                           task->offset + new_bytes, task->bytes - new_bytes);
 
-    task->bytes = new_bytes;
-    qemu_co_queue_restart_all(&task->wait_queue);
+    WITH_QEMU_LOCK_GUARD(&task->s->tasks_lock) {
+        task->s->in_flight_bytes -= task->bytes - new_bytes;
+        task->bytes = new_bytes;
+        qemu_co_queue_restart_all(&task->wait_queue);
+    }
 }
 
 static void coroutine_fn block_copy_task_end(BlockCopyTask *task, int ret)
 {
-    task->s->in_flight_bytes -= task->bytes;
     if (ret < 0) {
         bdrv_set_dirty_bitmap(task->s->copy_bitmap, task->offset, task->bytes);
     }
-    QLIST_REMOVE(task, list);
-    progress_set_remaining(task->s->progress,
-                           bdrv_get_dirty_count(task->s->copy_bitmap) +
-                           task->s->in_flight_bytes);
-    qemu_co_queue_restart_all(&task->wait_queue);
+    WITH_QEMU_LOCK_GUARD(&task->s->tasks_lock) {
+        task->s->in_flight_bytes -= task->bytes;
+        QLIST_REMOVE(task, list);
+        progress_set_remaining(task->s->progress,
+                               bdrv_get_dirty_count(task->s->copy_bitmap) +
+                               task->s->in_flight_bytes);
+        qemu_co_queue_restart_all(&task->wait_queue);
+    }
 }
 
 void block_copy_state_free(BlockCopyState *s)
@@ -336,6 +348,7 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
     }
 
     ratelimit_init(&s->rate_limit);
+    qemu_co_mutex_init(&s->tasks_lock);
     QLIST_INIT(&s->tasks);
     QLIST_INIT(&s->calls);
 
@@ -586,9 +599,11 @@ int64_t block_copy_reset_unallocated(BlockCopyState *s,
 
     if (!ret) {
         bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, bytes);
+        qemu_co_mutex_lock(&s->tasks_lock);
         progress_set_remaining(s->progress,
                                bdrv_get_dirty_count(s->copy_bitmap) +
                                s->in_flight_bytes);
+        qemu_co_mutex_unlock(&s->tasks_lock);
     }
 
     *count = bytes;
-- 
2.30.2




* [PATCH v2 5/7] block-copy: add QemuMutex lock for BlockCopyCallState list
  2021-05-18 10:07 [PATCH v2 0/7] block-copy: protect block-copy internal structures Emanuele Giuseppe Esposito
                   ` (3 preceding siblings ...)
  2021-05-18 10:07 ` [PATCH v2 4/7] block-copy: add a CoMutex to the BlockCopyTask list Emanuele Giuseppe Esposito
@ 2021-05-18 10:07 ` Emanuele Giuseppe Esposito
  2021-05-20 15:30   ` Vladimir Sementsov-Ogievskiy
  2021-05-28 10:53   ` Paolo Bonzini
  2021-05-18 10:07 ` [PATCH v2 6/7] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState Emanuele Giuseppe Esposito
                   ` (3 subsequent siblings)
  8 siblings, 2 replies; 48+ messages in thread
From: Emanuele Giuseppe Esposito @ 2021-05-18 10:07 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Emanuele Giuseppe Esposito,
	Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz,
	Stefan Hajnoczi, Paolo Bonzini, John Snow

As for BlockCopyTask, add a lock to protect the BlockCopyCallState
ret and sleep fields. Also move ret, finished and cancelled
into the OUT fields of BlockCopyCallState.

A QemuMutex is used here to protect the QemuCoSleep field, since it
can also be invoked concurrently from outside threads.

.finished, .cancelled and reads of .ret and .error_is_read will be
protected in the following patch.

The .sleep field is handled in the series "coroutine: new sleep/wake API".
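
The choice of lock type matters here; a sketch of the distinction
(it assumes QEMU's thread headers, so it only builds in-tree; Calls and
calls_register are hypothetical names):

    #include "qemu/osdep.h"
    #include "qemu/thread.h"

    typedef struct Calls {
        QemuMutex lock;
        int n_calls; /* protected by lock */
    } Calls;

    /*
     * Unlike a CoMutex, a QemuMutex can be taken from any thread, with or
     * without a coroutine context -- which is what the calls list and the
     * cross-thread sleep/wake path need. The critical section must stay
     * short and must never yield.
     */
    static void calls_register(Calls *c)
    {
        qemu_mutex_lock(&c->lock);
        c->n_calls++;
        qemu_mutex_unlock(&c->lock);
    }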

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
 block/block-copy.c | 27 +++++++++++++++++++--------
 1 file changed, 19 insertions(+), 8 deletions(-)

diff --git a/block/block-copy.c b/block/block-copy.c
index 3a949fab64..d5ed5932b0 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -55,13 +55,14 @@ typedef struct BlockCopyCallState {
     QLIST_ENTRY(BlockCopyCallState) list;
 
     /* State */
-    int ret;
     bool finished;
-    QemuCoSleep sleep;
-    bool cancelled;
+    QemuCoSleep sleep; /* TODO: protect API with a lock */
 
     /* OUT parameters */
+    bool cancelled;
+    /* Fields protected by calls_lock in BlockCopyState */
     bool error_is_read;
+    int ret;
 } BlockCopyCallState;
 
 typedef struct BlockCopyTask {
@@ -110,6 +111,7 @@ typedef struct BlockCopyState {
     BlockCopyMethod method;
     CoMutex tasks_lock;
     QLIST_HEAD(, BlockCopyTask) tasks; /* All tasks from all block-copy calls */
+    QemuMutex calls_lock;
     QLIST_HEAD(, BlockCopyCallState) calls;
     /* State fields that use a thread-safe API */
     BdrvDirtyBitmap *copy_bitmap;
@@ -289,6 +291,7 @@ void block_copy_state_free(BlockCopyState *s)
     }
 
     ratelimit_destroy(&s->rate_limit);
+    qemu_mutex_destroy(&s->calls_lock);
     bdrv_release_dirty_bitmap(s->copy_bitmap);
     shres_destroy(s->mem);
     g_free(s);
@@ -349,6 +352,7 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
 
     ratelimit_init(&s->rate_limit);
     qemu_co_mutex_init(&s->tasks_lock);
+    qemu_mutex_init(&s->calls_lock);
     QLIST_INIT(&s->tasks);
     QLIST_INIT(&s->calls);
 
@@ -492,11 +496,14 @@ static coroutine_fn int block_copy_task_entry(AioTask *task)
 
     ret = block_copy_do_copy(t->s, t->offset, t->bytes, t->zeroes,
                              &error_is_read);
-    if (ret < 0 && !t->call_state->ret) {
-        t->call_state->ret = ret;
-        t->call_state->error_is_read = error_is_read;
-    } else {
-        progress_work_done(t->s->progress, t->bytes);
+
+    WITH_QEMU_LOCK_GUARD(&t->s->calls_lock) {
+        if (ret < 0 && !t->call_state->ret) {
+            t->call_state->ret = ret;
+            t->call_state->error_is_read = error_is_read;
+        } else {
+            progress_work_done(t->s->progress, t->bytes);
+        }
     }
     co_put_to_shres(t->s->mem, t->bytes);
     block_copy_task_end(t, ret);
@@ -740,7 +747,9 @@ static int coroutine_fn block_copy_common(BlockCopyCallState *call_state)
 {
     int ret;
 
+    qemu_mutex_lock(&call_state->s->calls_lock);
     QLIST_INSERT_HEAD(&call_state->s->calls, call_state, list);
+    qemu_mutex_unlock(&call_state->s->calls_lock);
 
     do {
         ret = block_copy_dirty_clusters(call_state);
@@ -767,7 +776,9 @@ static int coroutine_fn block_copy_common(BlockCopyCallState *call_state)
         call_state->cb(call_state->cb_opaque);
     }
 
+    qemu_mutex_lock(&call_state->s->calls_lock);
     QLIST_REMOVE(call_state, list);
+    qemu_mutex_unlock(&call_state->s->calls_lock);
 
     return ret;
 }
-- 
2.30.2




* [PATCH v2 6/7] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState
  2021-05-18 10:07 [PATCH v2 0/7] block-copy: protect block-copy internal structures Emanuele Giuseppe Esposito
                   ` (4 preceding siblings ...)
  2021-05-18 10:07 ` [PATCH v2 5/7] block-copy: add QemuMutex lock for BlockCopyCallState list Emanuele Giuseppe Esposito
@ 2021-05-18 10:07 ` Emanuele Giuseppe Esposito
  2021-05-20 15:34   ` Vladimir Sementsov-Ogievskiy
  2021-05-18 10:07 ` [PATCH v2 7/7] block-copy: protect BlockCopyState .method fields Emanuele Giuseppe Esposito
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 48+ messages in thread
From: Emanuele Giuseppe Esposito @ 2021-05-18 10:07 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Emanuele Giuseppe Esposito,
	Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz,
	Stefan Hajnoczi, Paolo Bonzini, John Snow

By adding acquire/release pairs, we ensure that the .ret and .error_is_read
fields are written by block_copy_dirty_clusters before .finished is set to
true.
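
A sketch of the publication pattern (it assumes qemu/atomic.h; Result,
publish and poll_result are hypothetical names standing in for
BlockCopyCallState and its accessors):

    #include "qemu/osdep.h"
    #include "qemu/atomic.h"

    typedef struct Result {
        int ret;              /* plain field, published by finished */
        bool error_is_read;   /* plain field, published by finished */
        bool finished;        /* accessed with acquire/release primitives */
    } Result;

    static void publish(Result *r, int ret, bool error_is_read)
    {
        r->ret = ret;            /* ordinary writes ... */
        r->error_is_read = error_is_read;
        /* ... ordered before the flag by the release store */
        qatomic_store_release(&r->finished, true);
    }

    static bool poll_result(Result *r, int *ret)
    {
        if (!qatomic_load_acquire(&r->finished)) {
            return false;        /* not done: ret must not be read yet */
        }
        *ret = r->ret;           /* guaranteed to observe the final value */
        return true;
    }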

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
 block/block-copy.c | 33 ++++++++++++++++++---------------
 1 file changed, 18 insertions(+), 15 deletions(-)

diff --git a/block/block-copy.c b/block/block-copy.c
index d5ed5932b0..573e96fefb 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -55,11 +55,11 @@ typedef struct BlockCopyCallState {
     QLIST_ENTRY(BlockCopyCallState) list;
 
     /* State */
-    bool finished;
+    bool finished; /* atomic */
     QemuCoSleep sleep; /* TODO: protect API with a lock */
 
     /* OUT parameters */
-    bool cancelled;
+    bool cancelled; /* atomic */
     /* Fields protected by calls_lock in BlockCopyState */
     bool error_is_read;
     int ret;
@@ -646,7 +646,8 @@ block_copy_dirty_clusters(BlockCopyCallState *call_state)
     assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
     assert(QEMU_IS_ALIGNED(bytes, s->cluster_size));
 
-    while (bytes && aio_task_pool_status(aio) == 0 && !call_state->cancelled) {
+    while (bytes && aio_task_pool_status(aio) == 0 &&
+           !qatomic_read(&call_state->cancelled)) {
         BlockCopyTask *task;
         int64_t status_bytes;
 
@@ -754,7 +755,7 @@ static int coroutine_fn block_copy_common(BlockCopyCallState *call_state)
     do {
         ret = block_copy_dirty_clusters(call_state);
 
-        if (ret == 0 && !call_state->cancelled) {
+        if (ret == 0 && !qatomic_read(&call_state->cancelled)) {
             ret = block_copy_wait_one(call_state->s, call_state->offset,
                                       call_state->bytes);
         }
@@ -768,9 +769,9 @@ static int coroutine_fn block_copy_common(BlockCopyCallState *call_state)
          * 2. We have waited for some intersecting block-copy request
          *    It may have failed and produced new dirty bits.
          */
-    } while (ret > 0 && !call_state->cancelled);
+    } while (ret > 0 && !qatomic_read(&call_state->cancelled));
 
-    call_state->finished = true;
+    qatomic_store_release(&call_state->finished, true);
 
     if (call_state->cb) {
         call_state->cb(call_state->cb_opaque);
@@ -833,35 +834,37 @@ void block_copy_call_free(BlockCopyCallState *call_state)
         return;
     }
 
-    assert(call_state->finished);
+    assert(qatomic_load_acquire(&call_state->finished));
     g_free(call_state);
 }
 
 bool block_copy_call_finished(BlockCopyCallState *call_state)
 {
-    return call_state->finished;
+    return qatomic_load_acquire(&call_state->finished);
 }
 
 bool block_copy_call_succeeded(BlockCopyCallState *call_state)
 {
-    return call_state->finished && !call_state->cancelled &&
-        call_state->ret == 0;
+    return qatomic_load_acquire(&call_state->finished) &&
+           !qatomic_read(&call_state->cancelled) &&
+           call_state->ret == 0;
 }
 
 bool block_copy_call_failed(BlockCopyCallState *call_state)
 {
-    return call_state->finished && !call_state->cancelled &&
-        call_state->ret < 0;
+    return qatomic_load_acquire(&call_state->finished) &&
+           !qatomic_read(&call_state->cancelled) &&
+           call_state->ret < 0;
 }
 
 bool block_copy_call_cancelled(BlockCopyCallState *call_state)
 {
-    return call_state->cancelled;
+    return qatomic_read(&call_state->cancelled);
 }
 
 int block_copy_call_status(BlockCopyCallState *call_state, bool *error_is_read)
 {
-    assert(call_state->finished);
+    assert(qatomic_load_acquire(&call_state->finished));
     if (error_is_read) {
         *error_is_read = call_state->error_is_read;
     }
@@ -870,7 +873,7 @@ int block_copy_call_status(BlockCopyCallState *call_state, bool *error_is_read)
 
 void block_copy_call_cancel(BlockCopyCallState *call_state)
 {
-    call_state->cancelled = true;
+    qatomic_set(&call_state->cancelled, true);
     block_copy_kick(call_state);
 }
 
-- 
2.30.2




* [PATCH v2 7/7] block-copy: protect BlockCopyState .method fields
  2021-05-18 10:07 [PATCH v2 0/7] block-copy: protect block-copy internal structures Emanuele Giuseppe Esposito
                   ` (5 preceding siblings ...)
  2021-05-18 10:07 ` [PATCH v2 6/7] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState Emanuele Giuseppe Esposito
@ 2021-05-18 10:07 ` Emanuele Giuseppe Esposito
  2021-05-21 17:10   ` Vladimir Sementsov-Ogievskiy
  2021-05-20 13:47 ` [PATCH v2 0/7] block-copy: protect block-copy internal structures Vladimir Sementsov-Ogievskiy
  2021-05-27 10:31 ` Stefan Hajnoczi
  8 siblings, 1 reply; 48+ messages in thread
From: Emanuele Giuseppe Esposito @ 2021-05-18 10:07 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Emanuele Giuseppe Esposito,
	Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz,
	Stefan Hajnoczi, Paolo Bonzini, John Snow

With the tasks and calls locks protecting all the other State fields,
.method is the last BlockCopyState field left unprotected.
Make it atomic.
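
A sketch of the race the qatomic_cmpxchg below closes (method_upgrade is
a hypothetical helper; BlockCopyMethod is the enum from patch 1 and
qatomic_cmpxchg comes from qemu/atomic.h):

    /*
     * A plain "if (*method == COPY_RANGE_SMALL) *method = COPY_RANGE_FULL"
     * could overwrite a concurrent fallback to COPY_READ_WRITE that lands
     * between the check and the store; cmpxchg makes the check-and-set a
     * single atomic step.
     */
    static void method_upgrade(BlockCopyMethod *method)
    {
        /* Upgrade only if nobody downgraded the method in the meantime. */
        qatomic_cmpxchg(method, COPY_RANGE_SMALL, COPY_RANGE_FULL);
    }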

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
 block/block-copy.c | 37 ++++++++++++++++++-------------------
 1 file changed, 18 insertions(+), 19 deletions(-)

diff --git a/block/block-copy.c b/block/block-copy.c
index 573e96fefb..ebccb7fbc6 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -108,7 +108,7 @@ typedef struct BlockCopyState {
 
     /* State */
     int64_t in_flight_bytes; /* protected by tasks_lock */
-    BlockCopyMethod method;
+    BlockCopyMethod method; /* atomic */
     CoMutex tasks_lock;
     QLIST_HEAD(, BlockCopyTask) tasks; /* All tasks from all block-copy calls */
     QemuMutex calls_lock;
@@ -184,7 +184,7 @@ static bool coroutine_fn block_copy_wait_one(BlockCopyState *s, int64_t offset,
 
 static inline int64_t block_copy_chunk_size(BlockCopyState *s)
 {
-    switch (s->method) {
+    switch (qatomic_read(&s->method)) {
     case COPY_READ_WRITE_CLUSTER:
         return s->cluster_size;
     case COPY_READ_WRITE:
@@ -338,16 +338,17 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
          * buffered copying (read and write respect max_transfer on their
          * behalf).
          */
-        s->method = COPY_READ_WRITE_CLUSTER;
+        qatomic_set(&s->method, COPY_READ_WRITE_CLUSTER);
     } else if (write_flags & BDRV_REQ_WRITE_COMPRESSED) {
         /* Compression supports only cluster-size writes and no copy-range. */
-        s->method = COPY_READ_WRITE_CLUSTER;
+        qatomic_set(&s->method, COPY_READ_WRITE_CLUSTER);
     } else {
         /*
          * We enable copy-range, but keep small copy_size, until first
          * successful copy_range (look at block_copy_do_copy).
          */
-        s->method = use_copy_range ? COPY_RANGE_SMALL : COPY_READ_WRITE;
+        qatomic_set(&s->method, use_copy_range ? COPY_RANGE_SMALL :
+                                                 COPY_READ_WRITE);
     }
 
     ratelimit_init(&s->rate_limit);
@@ -432,26 +433,24 @@ static int coroutine_fn block_copy_do_copy(BlockCopyState *s,
         return ret;
     }
 
-    if (s->method >= COPY_RANGE_SMALL) {
+    if (qatomic_read(&s->method) >= COPY_RANGE_SMALL) {
         ret = bdrv_co_copy_range(s->source, offset, s->target, offset, nbytes,
                                  0, s->write_flags);
         if (ret < 0) {
             trace_block_copy_copy_range_fail(s, offset, ret);
-            s->method = COPY_READ_WRITE;
+            qatomic_set(&s->method, COPY_READ_WRITE);
             /* Fallback to read+write with allocated buffer */
         } else {
-            if (s->method == COPY_RANGE_SMALL) {
-                /*
-                 * Successful copy-range. Now increase copy_size.  copy_range
-                 * does not respect max_transfer (it's a TODO), so we factor
-                 * that in here.
-                 *
-                 * Note: we double-check s->method for the case when
-                 * parallel block-copy request unsets it during previous
-                 * bdrv_co_copy_range call.
-                 */
-                s->method = COPY_RANGE_FULL;
-            }
+            /*
+             * Successful copy-range. Now increase copy_size.  copy_range
+             * does not respect max_transfer (it's a TODO), so we factor
+             * that in here.
+             *
+             * Note: we double-check s->method for the case when
+             * parallel block-copy request unsets it during previous
+             * bdrv_co_copy_range call.
+             */
+            qatomic_cmpxchg(&s->method, COPY_RANGE_SMALL, COPY_RANGE_FULL);
             goto out;
         }
     }
-- 
2.30.2




* Re: [PATCH v2 0/7] block-copy: protect block-copy internal structures
  2021-05-18 10:07 [PATCH v2 0/7] block-copy: protect block-copy internal structures Emanuele Giuseppe Esposito
                   ` (6 preceding siblings ...)
  2021-05-18 10:07 ` [PATCH v2 7/7] block-copy: protect BlockCopyState .method fields Emanuele Giuseppe Esposito
@ 2021-05-20 13:47 ` Vladimir Sementsov-Ogievskiy
  2021-05-20 14:33   ` Emanuele Giuseppe Esposito
  2021-05-27 10:31 ` Stefan Hajnoczi
  8 siblings, 1 reply; 48+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-05-20 13:47 UTC (permalink / raw)
  To: Emanuele Giuseppe Esposito, qemu-block
  Cc: John Snow, Kevin Wolf, Max Reitz, Paolo Bonzini, Stefan Hajnoczi,
	qemu-devel

18.05.2021 13:07, Emanuele Giuseppe Esposito wrote:
> This series of patches aims to reduce the usage of the global
> AioContext lock in block-copy by introducing smaller-granularity
> locks, thus making the block layer thread safe.
> 
> This series depends on Paolo's coroutine_sleep API and on my previous
> series that brings thread safety to the smaller APIs used by block-copy,
> like ratelimit, progressmeter and co-shared-resource.
> 
> What's missing for block-copy to be fully thread-safe is fixing
> the CoSleep API to allow cross-thread sleep and wakeup.
> Paolo is working on it and will post the patches once his new
> CoSleep API is accepted.
> 
> Patch 1 introduces the .method field to replace .use_copy_range
> and .copy_size, so that it can later be used atomically.
> Patches 2-3 provide comments and refactoring in preparation for
> the locks added in patch 4 on BlockCopyTask, patches 5-6 on
> BlockCopyCallState and patch 7 on BlockCopyState.
> 
> Based-on: <20210517100548.28806-1-pbonzini@redhat.com>
> Based-on: <20210518094058.25952-1-eesposit@redhat.com>

Hi! I failed to apply all of this. Could you please export your branch with your patches to some public git repo?

> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
> ---
> v1 -> v2:
> * More fields categorized as IN/State/OUT in the various structs, better
>    documentation in the structs
> * Fix a couple of places where I missed locks [Vladimir, Paolo]
> 
> 
> Emanuele Giuseppe Esposito (6):
>    block-copy: improve documentation of BlockCopyTask and BlockCopyState
>      types and functions
>    block-copy: move progress_set_remaining in block_copy_task_end
>    block-copy: add a CoMutex to the BlockCopyTask list
>    block-copy: add QemuMutex lock for BlockCopyCallState list
>    block-copy: atomic .cancelled and .finished fields in
>      BlockCopyCallState
>    block-copy: protect BlockCopyState .method fields
> 
> Paolo Bonzini (1):
>    block-copy: streamline choice of copy_range vs. read/write
> 
>   block/block-copy.c | 234 +++++++++++++++++++++++++++++----------------
>   1 file changed, 150 insertions(+), 84 deletions(-)
> 


-- 
Best regards,
Vladimir



* Re: [PATCH v2 0/7] block-copy: protect block-copy internal structures
  2021-05-20 13:47 ` [PATCH v2 0/7] block-copy: protect block-copy internal structures Vladimir Sementsov-Ogievskiy
@ 2021-05-20 14:33   ` Emanuele Giuseppe Esposito
  0 siblings, 0 replies; 48+ messages in thread
From: Emanuele Giuseppe Esposito @ 2021-05-20 14:33 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: Kevin Wolf, qemu-devel, Max Reitz, Stefan Hajnoczi,
	Paolo Bonzini, John Snow



On 20/05/2021 15:47, Vladimir Sementsov-Ogievskiy wrote:
> 18.05.2021 13:07, Emanuele Giuseppe Esposito wrote:
>> This series of patches aims to reduce the usage of the global
>> AioContext lock in block-copy by introducing smaller-granularity
>> locks, thus making the block layer thread safe.
>>
>> This series depends on Paolo's coroutine_sleep API and on my previous
>> series that brings thread safety to the smaller APIs used by block-copy,
>> like ratelimit, progressmeter and co-shared-resource.
>>
>> What's missing for block-copy to be fully thread-safe is fixing
>> the CoSleep API to allow cross-thread sleep and wakeup.
>> Paolo is working on it and will post the patches once his new
>> CoSleep API is accepted.
>>
>> Patch 1 introduces the .method field to replace .use_copy_range
>> and .copy_size, so that it can later be used atomically.
>> Patches 2-3 provide comments and refactoring in preparation for
>> the locks added in patch 4 on BlockCopyTask, patches 5-6 on
>> BlockCopyCallState and patch 7 on BlockCopyState.
>>
>> Based-on: <20210517100548.28806-1-pbonzini@redhat.com>
>> Based-on: <20210518094058.25952-1-eesposit@redhat.com>
> 
> Hi! I failed to apply all of this. Could you please export your branch 
> with your patches to some public git repo?

Hi, thank you for trying to apply the patches. My branch is here:

https://gitlab.com/eesposit/qemu/-/commits/dataplane_new

Emanuele
> 
>> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
>> ---
>> v1 -> v2:
>> * More fields categorized as IN/State/OUT in the various structs, better
>>    documentation in the structs
>> * Fix a couple of places where I missed locks [Vladimir, Paolo]
>>
>>
>> Emanuele Giuseppe Esposito (6):
>>    block-copy: improve documentation of BlockCopyTask and BlockCopyState
>>      types and functions
>>    block-copy: move progress_set_remaining in block_copy_task_end
>>    block-copy: add a CoMutex to the BlockCopyTask list
>>    block-copy: add QemuMutex lock for BlockCopyCallState list
>>    block-copy: atomic .cancelled and .finished fields in
>>      BlockCopyCallState
>>    block-copy: protect BlockCopyState .method fields
>>
>> Paolo Bonzini (1):
>>    block-copy: streamline choice of copy_range vs. read/write
>>
>>   block/block-copy.c | 234 +++++++++++++++++++++++++++++----------------
>>   1 file changed, 150 insertions(+), 84 deletions(-)
>>
> 
> 




* Re: [PATCH v2 1/7] block-copy: streamline choice of copy_range vs. read/write
  2021-05-18 10:07 ` [PATCH v2 1/7] block-copy: streamline choice of copy_range vs. read/write Emanuele Giuseppe Esposito
@ 2021-05-20 14:42   ` Vladimir Sementsov-Ogievskiy
  2021-05-20 15:06     ` Emanuele Giuseppe Esposito
  2021-05-21 15:10     ` Paolo Bonzini
  2021-05-27  8:20   ` Stefan Hajnoczi
  1 sibling, 2 replies; 48+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-05-20 14:42 UTC (permalink / raw)
  To: Emanuele Giuseppe Esposito, qemu-block
  Cc: John Snow, Kevin Wolf, Max Reitz, Paolo Bonzini, Stefan Hajnoczi,
	qemu-devel

18.05.2021 13:07, Emanuele Giuseppe Esposito wrote:
> From: Paolo Bonzini <pbonzini@redhat.com>
> 
> Put the logic to determine the copy size in a separate function, so
> that there is a simple state machine for the possible methods of
> copying data from one BlockDriverState to the other.

Honestly, to me a 4-state state machine plus a function to determine the copy size doesn't seem better than two simple variables, copy_size and use_copy_range.

What's the benefit of it?

> 
> While at it, store the common computation of block_copy_max_transfer
> into a new field of BlockCopyState, and make sure that we always
> obey max_transfer; that's more efficient even in the
> copy_range cases.

hmm, maybe. It could be a separate patch.

> 
> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>   block/block-copy.c | 59 ++++++++++++++++++++++++++++++----------------

The diff stats agree with me that it's not a simplification.

>   1 file changed, 39 insertions(+), 20 deletions(-)
> 
> diff --git a/block/block-copy.c b/block/block-copy.c
> index 37c8e8504b..10ce51a244 100644
> --- a/block/block-copy.c
> +++ b/block/block-copy.c
> @@ -28,6 +28,13 @@
>   #define BLOCK_COPY_MAX_WORKERS 64
>   #define BLOCK_COPY_SLICE_TIME 100000000ULL /* ns */
>   
> +typedef enum {
> +    COPY_READ_WRITE_CLUSTER,
> +    COPY_READ_WRITE,
> +    COPY_RANGE_SMALL,
> +    COPY_RANGE_FULL
> +} BlockCopyMethod;
> +
>   static coroutine_fn int block_copy_task_entry(AioTask *task);
>   
>   typedef struct BlockCopyCallState {
> @@ -85,8 +92,8 @@ typedef struct BlockCopyState {
>       BdrvDirtyBitmap *copy_bitmap;
>       int64_t in_flight_bytes;
>       int64_t cluster_size;
> -    bool use_copy_range;
> -    int64_t copy_size;
> +    BlockCopyMethod method;
> +    int64_t max_transfer;
>       uint64_t len;
>       QLIST_HEAD(, BlockCopyTask) tasks; /* All tasks from all block-copy calls */
>       QLIST_HEAD(, BlockCopyCallState) calls;
> @@ -148,6 +155,23 @@ static bool coroutine_fn block_copy_wait_one(BlockCopyState *s, int64_t offset,
>       return true;
>   }
>   
> +static inline int64_t block_copy_chunk_size(BlockCopyState *s)

"inline" word does nothing in static definitions in c files. Compiler make a decision independently of it.

> +{
> +    switch (s->method) {
> +    case COPY_READ_WRITE_CLUSTER:
> +        return s->cluster_size;
> +    case COPY_READ_WRITE:
> +    case COPY_RANGE_SMALL:
> +        return MIN(MAX(s->cluster_size, BLOCK_COPY_MAX_BUFFER),
> +                   s->max_transfer);
> +    case COPY_RANGE_FULL:
> +        return MIN(MAX(s->cluster_size, BLOCK_COPY_MAX_COPY_RANGE),
> +                   s->max_transfer);
> +    default:
> +        abort();
> +    }
> +}
> +
>   /*
>    * Search for the first dirty area in offset/bytes range and create task at
>    * the beginning of it.
> @@ -157,8 +181,9 @@ static BlockCopyTask *block_copy_task_create(BlockCopyState *s,
>                                                int64_t offset, int64_t bytes)
>   {
>       BlockCopyTask *task;
> -    int64_t max_chunk = MIN_NON_ZERO(s->copy_size, call_state->max_chunk);
> +    int64_t max_chunk = block_copy_chunk_size(s);
>   
> +    max_chunk = MIN_NON_ZERO(max_chunk, call_state->max_chunk);
>       if (!bdrv_dirty_bitmap_next_dirty_area(s->copy_bitmap,
>                                              offset, offset + bytes,
>                                              max_chunk, &offset, &bytes))
> @@ -265,28 +290,27 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
>           .len = bdrv_dirty_bitmap_size(copy_bitmap),
>           .write_flags = write_flags,
>           .mem = shres_create(BLOCK_COPY_MAX_MEM),
> +        .max_transfer = QEMU_ALIGN_DOWN(block_copy_max_transfer(source, target)
> +                                        , cluster_size),
>       };
>   
> -    if (block_copy_max_transfer(source, target) < cluster_size) {
> +    if (s->max_transfer < cluster_size) {
>           /*
>            * copy_range does not respect max_transfer. We don't want to bother
>            * with requests smaller than block-copy cluster size, so fallback to
>            * buffered copying (read and write respect max_transfer on their
>            * behalf).
>            */
> -        s->use_copy_range = false;
> -        s->copy_size = cluster_size;
> +        s->method = COPY_READ_WRITE_CLUSTER;
>       } else if (write_flags & BDRV_REQ_WRITE_COMPRESSED) {
>           /* Compression supports only cluster-size writes and no copy-range. */
> -        s->use_copy_range = false;
> -        s->copy_size = cluster_size;
> +        s->method = COPY_READ_WRITE_CLUSTER;
>       } else {
>           /*
>            * We enable copy-range, but keep small copy_size, until first
>            * successful copy_range (look at block_copy_do_copy).
>            */
> -        s->use_copy_range = use_copy_range;
> -        s->copy_size = MAX(s->cluster_size, BLOCK_COPY_MAX_BUFFER);
> +        s->method = use_copy_range ? COPY_RANGE_SMALL : COPY_READ_WRITE;
>       }
>   
>       ratelimit_init(&s->rate_limit);
> @@ -369,30 +393,25 @@ static int coroutine_fn block_copy_do_copy(BlockCopyState *s,
>           return ret;
>       }
>   
> -    if (s->use_copy_range) {
> +    if (s->method >= COPY_RANGE_SMALL) {

I don't like such a condition:
1. it forces me to go to the enum definition to understand the logic
2. it's error-prone: it's very possible to forget to update it when updating the enum, and the logic will be broken.

No, I don't like moving to a state machine for this simple thing.

>           ret = bdrv_co_copy_range(s->source, offset, s->target, offset, nbytes,
>                                    0, s->write_flags);
>           if (ret < 0) {
>               trace_block_copy_copy_range_fail(s, offset, ret);
> -            s->use_copy_range = false;
> -            s->copy_size = MAX(s->cluster_size, BLOCK_COPY_MAX_BUFFER);
> +            s->method = COPY_READ_WRITE;
>               /* Fallback to read+write with allocated buffer */
>           } else {
> -            if (s->use_copy_range) {
> +            if (s->method == COPY_RANGE_SMALL) {
>                   /*
>                    * Successful copy-range. Now increase copy_size.  copy_range
>                    * does not respect max_transfer (it's a TODO), so we factor
>                    * that in here.
>                    *
> -                 * Note: we double-check s->use_copy_range for the case when
> +                 * Note: we double-check s->method for the case when
>                    * parallel block-copy request unsets it during previous
>                    * bdrv_co_copy_range call.
>                    */
> -                s->copy_size =
> -                        MIN(MAX(s->cluster_size, BLOCK_COPY_MAX_COPY_RANGE),
> -                            QEMU_ALIGN_DOWN(block_copy_max_transfer(s->source,
> -                                                                    s->target),
> -                                            s->cluster_size));
> +                s->method = COPY_RANGE_FULL;
>               }
>               goto out;
>           }
> 


-- 
Best regards,
Vladimir



* Re: [PATCH v2 2/7] block-copy: improve documentation of BlockCopyTask and BlockCopyState types and functions
  2021-05-18 10:07 ` [PATCH v2 2/7] block-copy: improve documentation of BlockCopyTask and BlockCopyState types and functions Emanuele Giuseppe Esposito
@ 2021-05-20 15:00   ` Vladimir Sementsov-Ogievskiy
  2021-05-20 15:15     ` Emanuele Giuseppe Esposito
  0 siblings, 1 reply; 48+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-05-20 15:00 UTC (permalink / raw)
  To: Emanuele Giuseppe Esposito, qemu-block
  Cc: John Snow, Kevin Wolf, Max Reitz, Paolo Bonzini, Stefan Hajnoczi,
	qemu-devel

18.05.2021 13:07, Emanuele Giuseppe Esposito wrote:
> As done in BlockCopyCallState, categorize the BlockCopyTask
> and BlockCopyState fields into IN, State and OUT groups.
> This is just to clarify which fields have to be protected with a lock.
> 
> BlockCopyTask .zeroes is a special case, because it is only initialized
> and then read by the coroutine in block_copy_task_entry.
> 
> Also mark block_copy_task_create as coroutine_fn because:
> 1) it is static and only invoked by coroutine functions
> 2) the next patches will introduce and use a CoMutex lock there

This change is unrelated; why not put it into the commit that adds the use of the CoMutex in that function?

> 
> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
> ---
>   block/block-copy.c | 49 ++++++++++++++++++++++++++++++++--------------
>   1 file changed, 34 insertions(+), 15 deletions(-)
> 
> diff --git a/block/block-copy.c b/block/block-copy.c
> index 10ce51a244..d2d3839dec 100644
> --- a/block/block-copy.c
> +++ b/block/block-copy.c
> @@ -67,13 +67,28 @@ typedef struct BlockCopyCallState {
>   typedef struct BlockCopyTask {
>       AioTask task;
>   
> +    /*
> +     * IN parameters. Initialized in block_copy_task_create()
> +     * and never changed.
> +     */
>       BlockCopyState *s;
>       BlockCopyCallState *call_state;
>       int64_t offset;
> -    int64_t bytes;
> +    int64_t bytes; /* only re-set in task_shrink, before running the task */
> +
> +    /*
> +     * "local" parameter: used only to communicate between
> +     * the caller (block_copy_dirty_clusters) and the aiotask
> +     * coroutine running block_copy_task_entry
> +     */

I don't quite follow. bytes and offset are used for the same thing... and bytes is modified in the same way, before running the task, as you write in the comment. Why is zeroes in a separate group?

>       bool zeroes;
> -    QLIST_ENTRY(BlockCopyTask) list;
> +
> +    /* State */
>       CoQueue wait_queue; /* coroutines blocked on this task */
> +
> +    /* To reference all call states from BlockCopyState */
> +    QLIST_ENTRY(BlockCopyTask) list;
> +

extra newline?

>   } BlockCopyTask;
>   
>   static int64_t task_end(BlockCopyTask *task)
> @@ -89,15 +104,25 @@ typedef struct BlockCopyState {
>        */
>       BdrvChild *source;
>       BdrvChild *target;
> -    BdrvDirtyBitmap *copy_bitmap;
> +

You add an empty line before the group, which looks good

> +    /* State */
>       int64_t in_flight_bytes;
> -    int64_t cluster_size;
>       BlockCopyMethod method;
> -    int64_t max_transfer;
> -    uint64_t len;
>       QLIST_HEAD(, BlockCopyTask) tasks; /* All tasks from all block-copy calls */
>       QLIST_HEAD(, BlockCopyCallState) calls;

but not here...

> +    /* State fields that use a thread-safe API */
> +    BdrvDirtyBitmap *copy_bitmap;
> +    ProgressMeter *progress;
> +    SharedResource *mem;
> +    RateLimit rate_limit;
>   
> +    /*
> +     * IN parameters. Initialized in block_copy_state_new()
> +     * and never changed.
> +     */
> +    int64_t cluster_size;
> +    int64_t max_transfer;
> +    uint64_t len;
>       BdrvRequestFlags write_flags;
>   
>       /*
> @@ -115,12 +140,6 @@ typedef struct BlockCopyState {
>        * block_copy_reset_unallocated() every time it does.
>        */
>       bool skip_unallocated;
> -
> -    ProgressMeter *progress;
> -
> -    SharedResource *mem;
> -
> -    RateLimit rate_limit;
>   } BlockCopyState;
>   
>   static BlockCopyTask *find_conflicting_task(BlockCopyState *s,
> @@ -176,9 +195,9 @@ static inline int64_t block_copy_chunk_size(BlockCopyState *s)
>    * Search for the first dirty area in offset/bytes range and create task at
>    * the beginning of it.
>    */
> -static BlockCopyTask *block_copy_task_create(BlockCopyState *s,
> -                                             BlockCopyCallState *call_state,
> -                                             int64_t offset, int64_t bytes)
> +static coroutine_fn BlockCopyTask *block_copy_task_create(BlockCopyState *s,
> +                                                BlockCopyCallState *call_state,
> +                                                int64_t offset, int64_t bytes)
>   {
>       BlockCopyTask *task;
>       int64_t max_chunk = block_copy_chunk_size(s);
> 


-- 
Best regards,
Vladimir



* Re: [PATCH v2 3/7] block-copy: move progress_set_remaining in block_copy_task_end
  2021-05-18 10:07 ` [PATCH v2 3/7] block-copy: move progress_set_remaining in block_copy_task_end Emanuele Giuseppe Esposito
@ 2021-05-20 15:03   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 48+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-05-20 15:03 UTC (permalink / raw)
  To: Emanuele Giuseppe Esposito, qemu-block
  Cc: John Snow, Kevin Wolf, Max Reitz, Paolo Bonzini, Stefan Hajnoczi,
	qemu-devel

18.05.2021 13:07, Emanuele Giuseppe Esposito wrote:
> Moving this call into task_end ensures that the progress is updated
> anyway, even if there is an error.
> 
> It also helps the next patch, allowing task_end to have only
> one critical section.
> 
> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

> ---
>   block/block-copy.c | 6 +++---
>   1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/block/block-copy.c b/block/block-copy.c
> index d2d3839dec..2e610b4142 100644
> --- a/block/block-copy.c
> +++ b/block/block-copy.c
> @@ -264,6 +264,9 @@ static void coroutine_fn block_copy_task_end(BlockCopyTask *task, int ret)
>           bdrv_set_dirty_bitmap(task->s->copy_bitmap, task->offset, task->bytes);
>       }
>       QLIST_REMOVE(task, list);
> +    progress_set_remaining(task->s->progress,
> +                           bdrv_get_dirty_count(task->s->copy_bitmap) +
> +                           task->s->in_flight_bytes);
>       qemu_co_queue_restart_all(&task->wait_queue);
>   }
>   
> @@ -645,9 +648,6 @@ block_copy_dirty_clusters(BlockCopyCallState *call_state)
>           }
>           if (s->skip_unallocated && !(ret & BDRV_BLOCK_ALLOCATED)) {
>               block_copy_task_end(task, 0);
> -            progress_set_remaining(s->progress,
> -                                   bdrv_get_dirty_count(s->copy_bitmap) +
> -                                   s->in_flight_bytes);
>               trace_block_copy_skip_range(s, task->offset, task->bytes);
>               offset = task_end(task);
>               bytes = end - offset;
> 


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 1/7] block-copy: streamline choice of copy_range vs. read/write
  2021-05-20 14:42   ` Vladimir Sementsov-Ogievskiy
@ 2021-05-20 15:06     ` Emanuele Giuseppe Esposito
  2021-05-20 15:24       ` Vladimir Sementsov-Ogievskiy
  2021-05-21 15:10     ` Paolo Bonzini
  1 sibling, 1 reply; 48+ messages in thread
From: Emanuele Giuseppe Esposito @ 2021-05-20 15:06 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: Kevin Wolf, qemu-devel, Max Reitz, Stefan Hajnoczi,
	Paolo Bonzini, John Snow



On 20/05/2021 16:42, Vladimir Sementsov-Ogievskiy wrote:
> 18.05.2021 13:07, Emanuele Giuseppe Esposito wrote:
>> From: Paolo Bonzini <pbonzini@redhat.com>
>>
>> Put the logic to determine the copy size in a separate function, so
>> that there is a simple state machine for the possible methods of
>> copying data from one BlockDriverState to the other.
> 
> Honestly, for me a 4-state state machine + a function to determine the copy 
> size doesn't seem better than two simple variables copy_size and use_copy_range.
> 
> What's the benefit of it?

It helps to have a single field (method) instead of two (use_copy_range, 
copy_size), especially if we need to take care of protecting it for 
concurrent access. See patch 7.
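
For illustration, roughly the shape this takes in patch 7 (a sketch, not 
the actual hunks; qatomic_read()/qatomic_set() are QEMU's existing atomic 
helpers):

    /* writer, e.g. on copy_range fallback: */
    qatomic_set(&s->method, COPY_READ_WRITE);

    /* reader, e.g. in block_copy_chunk_size(): */
    BlockCopyMethod method = qatomic_read(&s->method);

With two separate fields, a reader could observe use_copy_range already 
cleared but copy_size not yet updated (or vice versa), so every read and 
write of the pair would need a lock instead.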
> 
>>
>> While at it, store the common computation of block_copy_max_transfer
>> into a new field of BlockCopyState, and make sure that we always
>> obey max_transfer; that's more efficient even for the
>> COPY_RANGE_READ_WRITE case.
> 
> hmm, maybe. It could be a separate patch.
> 
>>
>> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>>   block/block-copy.c | 59 ++++++++++++++++++++++++++++++----------------
> 
> stats agree with me that it's not a simplification.
> 
>>   1 file changed, 39 insertions(+), 20 deletions(-)
>>
>> diff --git a/block/block-copy.c b/block/block-copy.c
>> index 37c8e8504b..10ce51a244 100644
>> --- a/block/block-copy.c
>> +++ b/block/block-copy.c
>> @@ -28,6 +28,13 @@
>>   #define BLOCK_COPY_MAX_WORKERS 64
>>   #define BLOCK_COPY_SLICE_TIME 100000000ULL /* ns */
>> +typedef enum {
>> +    COPY_READ_WRITE_CLUSTER,
>> +    COPY_READ_WRITE,
>> +    COPY_RANGE_SMALL,
>> +    COPY_RANGE_FULL
>> +} BlockCopyMethod;
>> +
>>   static coroutine_fn int block_copy_task_entry(AioTask *task);
>>   typedef struct BlockCopyCallState {
>> @@ -85,8 +92,8 @@ typedef struct BlockCopyState {
>>       BdrvDirtyBitmap *copy_bitmap;
>>       int64_t in_flight_bytes;
>>       int64_t cluster_size;
>> -    bool use_copy_range;
>> -    int64_t copy_size;
>> +    BlockCopyMethod method;
>> +    int64_t max_transfer;
>>       uint64_t len;
>>       QLIST_HEAD(, BlockCopyTask) tasks; /* All tasks from all 
>> block-copy calls */
>>       QLIST_HEAD(, BlockCopyCallState) calls;
>> @@ -148,6 +155,23 @@ static bool coroutine_fn 
>> block_copy_wait_one(BlockCopyState *s, int64_t offset,
>>       return true;
>>   }
>> +static inline int64_t block_copy_chunk_size(BlockCopyState *s)
> 
> "inline" word does nothing in static definitions in c files. Compiler 
> make a decision independently of it.

Typo
> 
>> +{
>> +    switch (s->method) {
>> +    case COPY_READ_WRITE_CLUSTER:
>> +        return s->cluster_size;
>> +    case COPY_READ_WRITE:
>> +    case COPY_RANGE_SMALL:
>> +        return MIN(MAX(s->cluster_size, BLOCK_COPY_MAX_BUFFER),
>> +                   s->max_transfer);
>> +    case COPY_RANGE_FULL:
>> +        return MIN(MAX(s->cluster_size, BLOCK_COPY_MAX_COPY_RANGE),
>> +                   s->max_transfer);
>> +    default:
>> +        abort();
>> +    }
>> +}
>> +
>>   /*
>>    * Search for the first dirty area in offset/bytes range and create 
>> task at
>>    * the beginning of it.
>> @@ -157,8 +181,9 @@ static BlockCopyTask 
>> *block_copy_task_create(BlockCopyState *s,
>>                                                int64_t offset, int64_t 
>> bytes)
>>   {
>>       BlockCopyTask *task;
>> -    int64_t max_chunk = MIN_NON_ZERO(s->copy_size, 
>> call_state->max_chunk);
>> +    int64_t max_chunk = block_copy_chunk_size(s);
>> +    max_chunk = MIN_NON_ZERO(max_chunk, call_state->max_chunk);
>>       if (!bdrv_dirty_bitmap_next_dirty_area(s->copy_bitmap,
>>                                              offset, offset + bytes,
>>                                              max_chunk, &offset, &bytes))
>> @@ -265,28 +290,27 @@ BlockCopyState *block_copy_state_new(BdrvChild 
>> *source, BdrvChild *target,
>>           .len = bdrv_dirty_bitmap_size(copy_bitmap),
>>           .write_flags = write_flags,
>>           .mem = shres_create(BLOCK_COPY_MAX_MEM),
>> +        .max_transfer = 
>> QEMU_ALIGN_DOWN(block_copy_max_transfer(source, target)
>> +                                        , cluster_size),
>>       };
>> -    if (block_copy_max_transfer(source, target) < cluster_size) {
>> +    if (s->max_transfer < cluster_size) {
>>           /*
>>            * copy_range does not respect max_transfer. We don't want 
>> to bother
>>            * with requests smaller than block-copy cluster size, so 
>> fallback to
>>            * buffered copying (read and write respect max_transfer on 
>> their
>>            * behalf).
>>            */
>> -        s->use_copy_range = false;
>> -        s->copy_size = cluster_size;
>> +        s->method = COPY_READ_WRITE_CLUSTER;
>>       } else if (write_flags & BDRV_REQ_WRITE_COMPRESSED) {
>>           /* Compression supports only cluster-size writes and no 
>> copy-range. */
>> -        s->use_copy_range = false;
>> -        s->copy_size = cluster_size;
>> +        s->method = COPY_READ_WRITE_CLUSTER;
>>       } else {
>>           /*
>>            * We enable copy-range, but keep small copy_size, until first
>>            * successful copy_range (look at block_copy_do_copy).
>>            */
>> -        s->use_copy_range = use_copy_range;
>> -        s->copy_size = MAX(s->cluster_size, BLOCK_COPY_MAX_BUFFER);
>> +        s->method = use_copy_range ? COPY_RANGE_SMALL : COPY_READ_WRITE;
>>       }
>>       ratelimit_init(&s->rate_limit);
>> @@ -369,30 +393,25 @@ static int coroutine_fn 
>> block_copy_do_copy(BlockCopyState *s,
>>           return ret;
>>       }
>> -    if (s->use_copy_range) {
>> +    if (s->method >= COPY_RANGE_SMALL) {
> 
> I don't like such a condition:
> 1. it forces me to go to the enum definition to understand the logic
> 2. it's error prone: it's very easy to forget to update it when 
> updating the enum, and the logic will be broken.
> 
> No, I don't like moving to a state machine for this simple thing.
> 
>>           ret = bdrv_co_copy_range(s->source, offset, s->target, 
>> offset, nbytes,
>>                                    0, s->write_flags);
>>           if (ret < 0) {
>>               trace_block_copy_copy_range_fail(s, offset, ret);
>> -            s->use_copy_range = false;
>> -            s->copy_size = MAX(s->cluster_size, BLOCK_COPY_MAX_BUFFER);
>> +            s->method = COPY_READ_WRITE;
>>               /* Fallback to read+write with allocated buffer */
>>           } else {
>> -            if (s->use_copy_range) {
>> +            if (s->method == COPY_RANGE_SMALL) {
>>                   /*
>>                    * Successful copy-range. Now increase copy_size.  
>> copy_range
>>                    * does not respect max_transfer (it's a TODO), so 
>> we factor
>>                    * that in here.
>>                    *
>> -                 * Note: we double-check s->use_copy_range for the 
>> case when
>> +                 * Note: we double-check s->method for the case when
>>                    * parallel block-copy request unsets it during 
>> previous
>>                    * bdrv_co_copy_range call.
>>                    */
>> -                s->copy_size =
>> -                        MIN(MAX(s->cluster_size, BLOCK_COPY_MAX_COPY_RANGE),
>> -                            QEMU_ALIGN_DOWN(block_copy_max_transfer(s->source,
>> -                                                                    s->target),
>> -                                            s->cluster_size));
>> +                s->method = COPY_RANGE_FULL;
>>               }
>>               goto out;
>>           }
>>
> 
> 



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 2/7] block-copy: improve documentation of BlockCopyTask and BlockCopyState types and functions
  2021-05-20 15:00   ` Vladimir Sementsov-Ogievskiy
@ 2021-05-20 15:15     ` Emanuele Giuseppe Esposito
  0 siblings, 0 replies; 48+ messages in thread
From: Emanuele Giuseppe Esposito @ 2021-05-20 15:15 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: Kevin Wolf, qemu-devel, Max Reitz, Stefan Hajnoczi,
	Paolo Bonzini, John Snow



On 20/05/2021 17:00, Vladimir Sementsov-Ogievskiy wrote:
> 18.05.2021 13:07, Emanuele Giuseppe Esposito wrote:
>> As done in BlockCopyCallState, categorize BlockCopyTask
>> and BlockCopyState in IN, State and OUT fields.
>> This is just to understand which field has to be protected with a lock.
>>
>> BlockCopyTask .zeroes is a special case, because it is only initialized
>> and then read by the coroutine in block_copy_task_entry.
>>
>> Also set block_copy_task_create as coroutine_fn because:
>> 1) it is static and only invoked by coroutine functions
>> 2) next patches will introduce and use a CoMutex lock there
> 
> this change is unrelated; why not put it into the commit which adds the 
> use of the CoMutex in that function?

OK, I will move it to patch 4.

> 
>>
>> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>

[...]
> 
> you add an empty line before the group, which looks good
> 
>> +    /* State */
>>       int64_t in_flight_bytes;
>> -    int64_t cluster_size;
>>       BlockCopyMethod method;
>> -    int64_t max_transfer;
>> -    uint64_t len;
>>       QLIST_HEAD(, BlockCopyTask) tasks; /* All tasks from all 
>> block-copy calls */
>>       QLIST_HEAD(, BlockCopyCallState) calls;
> 
> but not here..

These are still State fields, so they stay in the same State group; it is 
just a different sub-category.

> 
>> +    /* State fields that use a thread-safe API */
>> +    BdrvDirtyBitmap *copy_bitmap;
>> +    ProgressMeter *progress;
>> +    SharedResource *mem;
>> +    RateLimit rate_limit;
>> +    /*
>> +     * IN parameters. Initialized in block_copy_state_new()
>> +     * and never changed.
>> +     */
>> +    int64_t cluster_size;
>> +    int64_t max_transfer;
>> +    uint64_t len;
>>       BdrvRequestFlags write_flags;
>>       /*

Thank you,
Emanuele



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 4/7] block-copy: add a CoMutex to the BlockCopyTask list
  2021-05-18 10:07 ` [PATCH v2 4/7] block-copy: add a CoMutex to the BlockCopyTask list Emanuele Giuseppe Esposito
@ 2021-05-20 15:19   ` Vladimir Sementsov-Ogievskiy
  2021-05-25 10:07     ` Emanuele Giuseppe Esposito
  2021-05-27  9:07   ` Stefan Hajnoczi
  1 sibling, 1 reply; 48+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-05-20 15:19 UTC (permalink / raw)
  To: Emanuele Giuseppe Esposito, qemu-block
  Cc: John Snow, Kevin Wolf, Max Reitz, Paolo Bonzini, Stefan Hajnoczi,
	qemu-devel

18.05.2021 13:07, Emanuele Giuseppe Esposito wrote:
> Because the list of tasks is only modified by coroutine
> functions, add a CoMutex in order to protect it.
> 
> Use the same mutex to also protect the BlockCopyState in_flight_bytes
> field, to avoid adding additional synchronization primitives.
> 
> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
> ---
>   block/block-copy.c | 55 +++++++++++++++++++++++++++++-----------------
>   1 file changed, 35 insertions(+), 20 deletions(-)
> 
> diff --git a/block/block-copy.c b/block/block-copy.c
> index 2e610b4142..3a949fab64 100644
> --- a/block/block-copy.c
> +++ b/block/block-copy.c
> @@ -83,7 +83,7 @@ typedef struct BlockCopyTask {
>        */
>       bool zeroes;
>   
> -    /* State */
> +    /* State. Protected by tasks_lock */
>       CoQueue wait_queue; /* coroutines blocked on this task */
>   
>       /* To reference all call states from BlockCopyState */
> @@ -106,8 +106,9 @@ typedef struct BlockCopyState {
>       BdrvChild *target;
>   
>       /* State */
> -    int64_t in_flight_bytes;
> +    int64_t in_flight_bytes; /* protected by tasks_lock */

Is only this field protected, or the whole State section?

>       BlockCopyMethod method;
> +    CoMutex tasks_lock;
>       QLIST_HEAD(, BlockCopyTask) tasks; /* All tasks from all block-copy calls */
>       QLIST_HEAD(, BlockCopyCallState) calls;
>       /* State fields that use a thread-safe API */
> @@ -142,8 +143,10 @@ typedef struct BlockCopyState {
>       bool skip_unallocated;
>   } BlockCopyState;
>   
> -static BlockCopyTask *find_conflicting_task(BlockCopyState *s,
> -                                            int64_t offset, int64_t bytes)
> +/* Called with lock held */
> +static BlockCopyTask *find_conflicting_task_locked(BlockCopyState *s,
> +                                                   int64_t offset,
> +                                                   int64_t bytes)
>   {
>       BlockCopyTask *t;
>   
> @@ -163,13 +166,16 @@ static BlockCopyTask *find_conflicting_task(BlockCopyState *s,
>   static bool coroutine_fn block_copy_wait_one(BlockCopyState *s, int64_t offset,
>                                                int64_t bytes)
>   {
> -    BlockCopyTask *task = find_conflicting_task(s, offset, bytes);
> +    BlockCopyTask *task;
> +
> +    QEMU_LOCK_GUARD(&s->tasks_lock);
> +    task = find_conflicting_task_locked(s, offset, bytes);
>   
>       if (!task) {
>           return false;
>       }
>   
> -    qemu_co_queue_wait(&task->wait_queue, NULL);
> +    qemu_co_queue_wait(&task->wait_queue, &s->tasks_lock);
>   
>       return true;
>   }
> @@ -213,11 +219,7 @@ static coroutine_fn BlockCopyTask *block_copy_task_create(BlockCopyState *s,
>       assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
>       bytes = QEMU_ALIGN_UP(bytes, s->cluster_size);
>   

Looking at block_copy_task_create():

First, bdrv_dirty_bitmap_next_dirty_area() doesn't take the bitmap's lock, so it's not protected at all.

Next, even if we take the bitmap's lock in bdrv_dirty_bitmap_next_dirty_area() or around it, that doesn't bring thread-safety to block_copy_task_create():

Imagine block_copy_task_create() is called from two threads simultaneously. Both calls will get the same dirty area and create equal tasks. Then "assert(!find_conflicting_task_locked(s, offset, bytes));" will crash.
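
A sketch of the problematic interleaving (hypothetical: two block-copy 
calls on one BlockCopyState, running in different threads):

    /*
     * Thread A                             Thread B
     * --------                             --------
     * bdrv_dirty_bitmap_next_dirty_area()
     *   -> finds [offset, offset+bytes)
     *                                      bdrv_dirty_bitmap_next_dirty_area()
     *                                        -> finds the same dirty area
     * bdrv_reset_dirty_bitmap()
     * lock tasks_lock
     * QLIST_INSERT_HEAD(&s->tasks, task_a)
     * unlock tasks_lock
     *                                      lock tasks_lock
     *                                      assert(!find_conflicting_task_locked())
     *                                        -> fires, task_a covers the range
     */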


> -    /* region is dirty, so no existent tasks possible in it */
> -    assert(!find_conflicting_task(s, offset, bytes));
> -
>       bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, bytes);
> -    s->in_flight_bytes += bytes;
>   
>       task = g_new(BlockCopyTask, 1);
>       *task = (BlockCopyTask) {
> @@ -228,7 +230,13 @@ static coroutine_fn BlockCopyTask *block_copy_task_create(BlockCopyState *s,
>           .bytes = bytes,
>       };
>       qemu_co_queue_init(&task->wait_queue);
> -    QLIST_INSERT_HEAD(&s->tasks, task, list);
> +
> +    WITH_QEMU_LOCK_GUARD(&s->tasks_lock) {
> +        s->in_flight_bytes += bytes;
> +        /* region is dirty, so no existent tasks possible in it */
> +        assert(!find_conflicting_task_locked(s, offset, bytes));
> +        QLIST_INSERT_HEAD(&s->tasks, task, list);
> +    }
>   
>       return task;
>   }
> @@ -249,25 +257,29 @@ static void coroutine_fn block_copy_task_shrink(BlockCopyTask *task,
>   
>       assert(new_bytes > 0 && new_bytes < task->bytes);
>   
> -    task->s->in_flight_bytes -= task->bytes - new_bytes;
>       bdrv_set_dirty_bitmap(task->s->copy_bitmap,
>                             task->offset + new_bytes, task->bytes - new_bytes);

Then look here. Imagine that, immediately after bdrv_set_dirty_bitmap(), a new task is created in a parallel thread, consuming these new dirty bits. Again, it will find a conflicting task (this one, since task->bytes is not modified yet) and crash.
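
One way to close that window, just as a sketch (it assumes 
bdrv_set_dirty_bitmap() may be called with tasks_lock held):

    WITH_QEMU_LOCK_GUARD(&task->s->tasks_lock) {
        /*
         * Re-dirty the tail and shrink the task in one critical section,
         * so a parallel block_copy_task_create() that consumes the new
         * dirty bits already sees the shrunk task->bytes when it takes
         * the lock for its conflict check.
         */
        bdrv_set_dirty_bitmap(task->s->copy_bitmap,
                              task->offset + new_bytes,
                              task->bytes - new_bytes);
        task->s->in_flight_bytes -= task->bytes - new_bytes;
        task->bytes = new_bytes;
        qemu_co_queue_restart_all(&task->wait_queue);
    }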

>   
> -    task->bytes = new_bytes;
> -    qemu_co_queue_restart_all(&task->wait_queue);
> +    WITH_QEMU_LOCK_GUARD(&task->s->tasks_lock) {
> +        task->s->in_flight_bytes -= task->bytes - new_bytes;
> +        task->bytes = new_bytes;
> +        qemu_co_queue_restart_all(&task->wait_queue);
> +    }
>   }
>   
>   static void coroutine_fn block_copy_task_end(BlockCopyTask *task, int ret)
>   {
> -    task->s->in_flight_bytes -= task->bytes;
>       if (ret < 0) {
>           bdrv_set_dirty_bitmap(task->s->copy_bitmap, task->offset, task->bytes);
>       }
> -    QLIST_REMOVE(task, list);
> -    progress_set_remaining(task->s->progress,
> -                           bdrv_get_dirty_count(task->s->copy_bitmap) +
> -                           task->s->in_flight_bytes);
> -    qemu_co_queue_restart_all(&task->wait_queue);
> +    WITH_QEMU_LOCK_GUARD(&task->s->tasks_lock) {
> +        task->s->in_flight_bytes -= task->bytes;
> +        QLIST_REMOVE(task, list);
> +        progress_set_remaining(task->s->progress,
> +                               bdrv_get_dirty_count(task->s->copy_bitmap) +
> +                               task->s->in_flight_bytes);
> +        qemu_co_queue_restart_all(&task->wait_queue);
> +    }
>   }
>   
>   void block_copy_state_free(BlockCopyState *s)
> @@ -336,6 +348,7 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
>       }
>   
>       ratelimit_init(&s->rate_limit);
> +    qemu_co_mutex_init(&s->tasks_lock);
>       QLIST_INIT(&s->tasks);
>       QLIST_INIT(&s->calls);
>   
> @@ -586,9 +599,11 @@ int64_t block_copy_reset_unallocated(BlockCopyState *s,
>   
>       if (!ret) {
>           bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, bytes);
> +        qemu_co_mutex_lock(&s->tasks_lock);
>           progress_set_remaining(s->progress,
>                                  bdrv_get_dirty_count(s->copy_bitmap) +
>                                  s->in_flight_bytes);
> +        qemu_co_mutex_unlock(&s->tasks_lock);
>       }
>   
>       *count = bytes;
> 

-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 1/7] block-copy: streamline choice of copy_range vs. read/write
  2021-05-20 15:06     ` Emanuele Giuseppe Esposito
@ 2021-05-20 15:24       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 48+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-05-20 15:24 UTC (permalink / raw)
  To: Emanuele Giuseppe Esposito, qemu-block
  Cc: John Snow, Kevin Wolf, Max Reitz, Paolo Bonzini, Stefan Hajnoczi,
	qemu-devel

20.05.2021 18:06, Emanuele Giuseppe Esposito wrote:
> 
> 
> On 20/05/2021 16:42, Vladimir Sementsov-Ogievskiy wrote:
>> 18.05.2021 13:07, Emanuele Giuseppe Esposito wrote:
>>> From: Paolo Bonzini <pbonzini@redhat.com>
>>>
>>> Put the logic to determine the copy size in a separate function, so
>>> that there is a simple state machine for the possible methods of
>>> copying data from one BlockDriverState to the other.
>>
>> Honestly, for me a 4-state state machine + a function to determine the copy size doesn't seem better than two simple variables copy_size and use_copy_range.
>>
>> What's the benefit of it?
> 
> It helps to have a single field (method) instead of two (use_copy_range, copy_size), especially if we need to take care of protecting it for concurrent access. See patch 7.

What's the problem with protecting two fields?

(me looking at patch 7) What's the reason for introducing atomic operations? It makes things more complicated (we have two synchronization mechanisms, mutex + atomics, instead of one, the mutex), with no benefit.
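
In other words, something like this hypothetical helper would keep a 
single mechanism (a sketch only; QEMU_LOCK_GUARD works on a CoMutex via 
QemuLockable):

    /* Hypothetical: update .method under the existing tasks_lock
     * instead of making the field atomic. */
    static void coroutine_fn block_copy_set_method(BlockCopyState *s,
                                                   BlockCopyMethod method)
    {
        QEMU_LOCK_GUARD(&s->tasks_lock);
        s->method = method;
    }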

>>
>>>
>>> While at it, store the common computation of block_copy_max_transfer
>>> into a new field of BlockCopyState, and make sure that we always
>>> obey max_transfer; that's more efficient even for the
>>> COPY_RANGE_READ_WRITE case.
>>
>> hmm, maybe. It could be a separate patch.
>>
>>>
>>> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
>>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>>> ---
>>>   block/block-copy.c | 59 ++++++++++++++++++++++++++++++----------------
>>
>> stats agree with me that it's not a simplification.
>>
>>>   1 file changed, 39 insertions(+), 20 deletions(-)
>>>
>>> diff --git a/block/block-copy.c b/block/block-copy.c
>>> index 37c8e8504b..10ce51a244 100644
>>> --- a/block/block-copy.c
>>> +++ b/block/block-copy.c
>>> @@ -28,6 +28,13 @@
>>>   #define BLOCK_COPY_MAX_WORKERS 64
>>>   #define BLOCK_COPY_SLICE_TIME 100000000ULL /* ns */
>>> +typedef enum {
>>> +    COPY_READ_WRITE_CLUSTER,
>>> +    COPY_READ_WRITE,
>>> +    COPY_RANGE_SMALL,
>>> +    COPY_RANGE_FULL
>>> +} BlockCopyMethod;
>>> +
>>>   static coroutine_fn int block_copy_task_entry(AioTask *task);
>>>   typedef struct BlockCopyCallState {
>>> @@ -85,8 +92,8 @@ typedef struct BlockCopyState {
>>>       BdrvDirtyBitmap *copy_bitmap;
>>>       int64_t in_flight_bytes;
>>>       int64_t cluster_size;
>>> -    bool use_copy_range;
>>> -    int64_t copy_size;
>>> +    BlockCopyMethod method;
>>> +    int64_t max_transfer;
>>>       uint64_t len;
>>>       QLIST_HEAD(, BlockCopyTask) tasks; /* All tasks from all block-copy calls */
>>>       QLIST_HEAD(, BlockCopyCallState) calls;
>>> @@ -148,6 +155,23 @@ static bool coroutine_fn block_copy_wait_one(BlockCopyState *s, int64_t offset,
>>>       return true;
>>>   }
>>> +static inline int64_t block_copy_chunk_size(BlockCopyState *s)
>>
>> "inline" word does nothing in static definitions in c files. Compiler make a decision independently of it.
> 
> Typo
>>
>>> +{
>>> +    switch (s->method) {
>>> +    case COPY_READ_WRITE_CLUSTER:
>>> +        return s->cluster_size;
>>> +    case COPY_READ_WRITE:
>>> +    case COPY_RANGE_SMALL:
>>> +        return MIN(MAX(s->cluster_size, BLOCK_COPY_MAX_BUFFER),
>>> +                   s->max_transfer);
>>> +    case COPY_RANGE_FULL:
>>> +        return MIN(MAX(s->cluster_size, BLOCK_COPY_MAX_COPY_RANGE),
>>> +                   s->max_transfer);
>>> +    default:
>>> +        abort();
>>> +    }
>>> +}
>>> +
>>>   /*
>>>    * Search for the first dirty area in offset/bytes range and create task at
>>>    * the beginning of it.
>>> @@ -157,8 +181,9 @@ static BlockCopyTask *block_copy_task_create(BlockCopyState *s,
>>>                                                int64_t offset, int64_t bytes)
>>>   {
>>>       BlockCopyTask *task;
>>> -    int64_t max_chunk = MIN_NON_ZERO(s->copy_size, call_state->max_chunk);
>>> +    int64_t max_chunk = block_copy_chunk_size(s);
>>> +    max_chunk = MIN_NON_ZERO(max_chunk, call_state->max_chunk);
>>>       if (!bdrv_dirty_bitmap_next_dirty_area(s->copy_bitmap,
>>>                                              offset, offset + bytes,
>>>                                              max_chunk, &offset, &bytes))
>>> @@ -265,28 +290,27 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
>>>           .len = bdrv_dirty_bitmap_size(copy_bitmap),
>>>           .write_flags = write_flags,
>>>           .mem = shres_create(BLOCK_COPY_MAX_MEM),
>>> +        .max_transfer = QEMU_ALIGN_DOWN(block_copy_max_transfer(source, target)
>>> +                                        , cluster_size),
>>>       };
>>> -    if (block_copy_max_transfer(source, target) < cluster_size) {
>>> +    if (s->max_transfer < cluster_size) {
>>>           /*
>>>            * copy_range does not respect max_transfer. We don't want to bother
>>>            * with requests smaller than block-copy cluster size, so fallback to
>>>            * buffered copying (read and write respect max_transfer on their
>>>            * behalf).
>>>            */
>>> -        s->use_copy_range = false;
>>> -        s->copy_size = cluster_size;
>>> +        s->method = COPY_READ_WRITE_CLUSTER;
>>>       } else if (write_flags & BDRV_REQ_WRITE_COMPRESSED) {
>>>           /* Compression supports only cluster-size writes and no copy-range. */
>>> -        s->use_copy_range = false;
>>> -        s->copy_size = cluster_size;
>>> +        s->method = COPY_READ_WRITE_CLUSTER;
>>>       } else {
>>>           /*
>>>            * We enable copy-range, but keep small copy_size, until first
>>>            * successful copy_range (look at block_copy_do_copy).
>>>            */
>>> -        s->use_copy_range = use_copy_range;
>>> -        s->copy_size = MAX(s->cluster_size, BLOCK_COPY_MAX_BUFFER);
>>> +        s->method = use_copy_range ? COPY_RANGE_SMALL : COPY_READ_WRITE;
>>>       }
>>>       ratelimit_init(&s->rate_limit);
>>> @@ -369,30 +393,25 @@ static int coroutine_fn block_copy_do_copy(BlockCopyState *s,
>>>           return ret;
>>>       }
>>> -    if (s->use_copy_range) {
>>> +    if (s->method >= COPY_RANGE_SMALL) {
>>
>> I don't like such a condition:
>> 1. it forces me to go to the enum definition to understand the logic
>> 2. it's error prone: it's very easy to forget to update it when updating the enum, and the logic will be broken.
>>
>> No, I don't like moving to a state machine for this simple thing.
>>
>>>           ret = bdrv_co_copy_range(s->source, offset, s->target, offset, nbytes,
>>>                                    0, s->write_flags);
>>>           if (ret < 0) {
>>>               trace_block_copy_copy_range_fail(s, offset, ret);
>>> -            s->use_copy_range = false;
>>> -            s->copy_size = MAX(s->cluster_size, BLOCK_COPY_MAX_BUFFER);
>>> +            s->method = COPY_READ_WRITE;
>>>               /* Fallback to read+write with allocated buffer */
>>>           } else {
>>> -            if (s->use_copy_range) {
>>> +            if (s->method == COPY_RANGE_SMALL) {
>>>                   /*
>>>                    * Successful copy-range. Now increase copy_size. copy_range
>>>                    * does not respect max_transfer (it's a TODO), so we factor
>>>                    * that in here.
>>>                    *
>>> -                 * Note: we double-check s->use_copy_range for the case when
>>> +                 * Note: we double-check s->method for the case when
>>>                    * parallel block-copy request unsets it during previous
>>>                    * bdrv_co_copy_range call.
>>>                    */
>>> -                s->copy_size =
>>> -                        MIN(MAX(s->cluster_size, BLOCK_COPY_MAX_COPY_RANGE),
>>> -                            QEMU_ALIGN_DOWN(block_copy_max_transfer(s->source,
>>> -                                                                    s->target),
>>> -                                            s->cluster_size));
>>> +                s->method = COPY_RANGE_FULL;
>>>               }
>>>               goto out;
>>>           }
>>>
>>
>>
> 


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 5/7] block-copy: add QemuMutex lock for BlockCopyCallState list
  2021-05-18 10:07 ` [PATCH v2 5/7] block-copy: add QemuMutex lock for BlockCopyCallState list Emanuele Giuseppe Esposito
@ 2021-05-20 15:30   ` Vladimir Sementsov-Ogievskiy
  2021-05-21 15:01     ` Paolo Bonzini
  2021-05-28 10:53   ` Paolo Bonzini
  1 sibling, 1 reply; 48+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-05-20 15:30 UTC (permalink / raw)
  To: Emanuele Giuseppe Esposito, qemu-block
  Cc: John Snow, Kevin Wolf, Max Reitz, Paolo Bonzini, Stefan Hajnoczi,
	qemu-devel

18.05.2021 13:07, Emanuele Giuseppe Esposito wrote:
> As for BlockCopyTask, add a lock to protect the BlockCopyCallState
> ret and sleep_state fields. Also move ret, finished and cancelled
> into the OUT fields of BlockCopyCallState.
> 
> Here a QemuMutex is used to protect the QemuCoSleep field, since it
> can also be invoked concurrently from other threads.
> 
> .finished, .cancelled and reads of .ret and .error_is_read will be
> protected in the following patch.
> 
> .sleep state is handled in the series "coroutine: new sleep/wake API"

Could we live with one mutex for all needs? Why add one more?

> 
> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
> ---
>   block/block-copy.c | 27 +++++++++++++++++++--------
>   1 file changed, 19 insertions(+), 8 deletions(-)
> 
> diff --git a/block/block-copy.c b/block/block-copy.c
> index 3a949fab64..d5ed5932b0 100644
> --- a/block/block-copy.c
> +++ b/block/block-copy.c
> @@ -55,13 +55,14 @@ typedef struct BlockCopyCallState {
>       QLIST_ENTRY(BlockCopyCallState) list;
>   
>       /* State */
> -    int ret;
>       bool finished;
> -    QemuCoSleep sleep;
> -    bool cancelled;
> +    QemuCoSleep sleep; /* TODO: protect API with a lock */
>   
>       /* OUT parameters */
> +    bool cancelled;
> +    /* Fields protected by calls_lock in BlockCopyState */
>       bool error_is_read;
> +    int ret;
>   } BlockCopyCallState;
>   
>   typedef struct BlockCopyTask {
> @@ -110,6 +111,7 @@ typedef struct BlockCopyState {
>       BlockCopyMethod method;
>       CoMutex tasks_lock;
>       QLIST_HEAD(, BlockCopyTask) tasks; /* All tasks from all block-copy calls */
> +    QemuMutex calls_lock;
>       QLIST_HEAD(, BlockCopyCallState) calls;
>       /* State fields that use a thread-safe API */
>       BdrvDirtyBitmap *copy_bitmap;
> @@ -289,6 +291,7 @@ void block_copy_state_free(BlockCopyState *s)
>       }
>   
>       ratelimit_destroy(&s->rate_limit);
> +    qemu_mutex_destroy(&s->calls_lock);
>       bdrv_release_dirty_bitmap(s->copy_bitmap);
>       shres_destroy(s->mem);
>       g_free(s);
> @@ -349,6 +352,7 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
>   
>       ratelimit_init(&s->rate_limit);
>       qemu_co_mutex_init(&s->tasks_lock);
> +    qemu_mutex_init(&s->calls_lock);
>       QLIST_INIT(&s->tasks);
>       QLIST_INIT(&s->calls);
>   
> @@ -492,11 +496,14 @@ static coroutine_fn int block_copy_task_entry(AioTask *task)
>   
>       ret = block_copy_do_copy(t->s, t->offset, t->bytes, t->zeroes,
>                                &error_is_read);
> -    if (ret < 0 && !t->call_state->ret) {
> -        t->call_state->ret = ret;
> -        t->call_state->error_is_read = error_is_read;
> -    } else {
> -        progress_work_done(t->s->progress, t->bytes);
> +
> +    WITH_QEMU_LOCK_GUARD(&t->s->calls_lock) {
> +        if (ret < 0 && !t->call_state->ret) {
> +            t->call_state->ret = ret;
> +            t->call_state->error_is_read = error_is_read;
> +        } else {
> +            progress_work_done(t->s->progress, t->bytes);
> +        }
>       }
>       co_put_to_shres(t->s->mem, t->bytes);
>       block_copy_task_end(t, ret);
> @@ -740,7 +747,9 @@ static int coroutine_fn block_copy_common(BlockCopyCallState *call_state)
>   {
>       int ret;
>   
> +    qemu_mutex_lock(&call_state->s->calls_lock);
>       QLIST_INSERT_HEAD(&call_state->s->calls, call_state, list);
> +    qemu_mutex_unlock(&call_state->s->calls_lock);
>   
>       do {
>           ret = block_copy_dirty_clusters(call_state);
> @@ -767,7 +776,9 @@ static int coroutine_fn block_copy_common(BlockCopyCallState *call_state)
>           call_state->cb(call_state->cb_opaque);
>       }
>   
> +    qemu_mutex_lock(&call_state->s->calls_lock);
>       QLIST_REMOVE(call_state, list);
> +    qemu_mutex_unlock(&call_state->s->calls_lock);
>   
>       return ret;
>   }
> 


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 6/7] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState
  2021-05-18 10:07 ` [PATCH v2 6/7] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState Emanuele Giuseppe Esposito
@ 2021-05-20 15:34   ` Vladimir Sementsov-Ogievskiy
  2021-05-21 15:02     ` Paolo Bonzini
  0 siblings, 1 reply; 48+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-05-20 15:34 UTC (permalink / raw)
  To: Emanuele Giuseppe Esposito, qemu-block
  Cc: John Snow, Kevin Wolf, Max Reitz, Paolo Bonzini, Stefan Hajnoczi,
	qemu-devel

18.05.2021 13:07, Emanuele Giuseppe Esposito wrote:
> By adding acquire/release pairs, we ensure that .ret and .error_is_read
> fields are written by block_copy_dirty_clusters before .finished is true.

As I already said, please, can we live with one mutex for now? finished, ret, error_is_read: all these variables change rarely. I doubt that performance is improved by these atomic operations, but the complexity of the architecture increases exponentially.

> 
> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
> ---
>   block/block-copy.c | 33 ++++++++++++++++++---------------
>   1 file changed, 18 insertions(+), 15 deletions(-)
> 
> diff --git a/block/block-copy.c b/block/block-copy.c
> index d5ed5932b0..573e96fefb 100644
> --- a/block/block-copy.c
> +++ b/block/block-copy.c
> @@ -55,11 +55,11 @@ typedef struct BlockCopyCallState {
>       QLIST_ENTRY(BlockCopyCallState) list;
>   
>       /* State */
> -    bool finished;
> +    bool finished; /* atomic */
>       QemuCoSleep sleep; /* TODO: protect API with a lock */
>   
>       /* OUT parameters */
> -    bool cancelled;
> +    bool cancelled; /* atomic */
>       /* Fields protected by calls_lock in BlockCopyState */
>       bool error_is_read;
>       int ret;
> @@ -646,7 +646,8 @@ block_copy_dirty_clusters(BlockCopyCallState *call_state)
>       assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
>       assert(QEMU_IS_ALIGNED(bytes, s->cluster_size));
>   
> -    while (bytes && aio_task_pool_status(aio) == 0 && !call_state->cancelled) {
> +    while (bytes && aio_task_pool_status(aio) == 0 &&
> +           !qatomic_read(&call_state->cancelled)) {
>           BlockCopyTask *task;
>           int64_t status_bytes;
>   
> @@ -754,7 +755,7 @@ static int coroutine_fn block_copy_common(BlockCopyCallState *call_state)
>       do {
>           ret = block_copy_dirty_clusters(call_state);
>   
> -        if (ret == 0 && !call_state->cancelled) {
> +        if (ret == 0 && !qatomic_read(&call_state->cancelled)) {
>               ret = block_copy_wait_one(call_state->s, call_state->offset,
>                                         call_state->bytes);
>           }
> @@ -768,9 +769,9 @@ static int coroutine_fn block_copy_common(BlockCopyCallState *call_state)
>            * 2. We have waited for some intersecting block-copy request
>            *    It may have failed and produced new dirty bits.
>            */
> -    } while (ret > 0 && !call_state->cancelled);
> +    } while (ret > 0 && !qatomic_read(&call_state->cancelled));
>   
> -    call_state->finished = true;
> +    qatomic_store_release(&call_state->finished, true);
>   
>       if (call_state->cb) {
>           call_state->cb(call_state->cb_opaque);
> @@ -833,35 +834,37 @@ void block_copy_call_free(BlockCopyCallState *call_state)
>           return;
>       }
>   
> -    assert(call_state->finished);
> +    assert(qatomic_load_acquire(&call_state->finished));
>       g_free(call_state);
>   }
>   
>   bool block_copy_call_finished(BlockCopyCallState *call_state)
>   {
> -    return call_state->finished;
> +    return qatomic_load_acquire(&call_state->finished);
>   }
>   
>   bool block_copy_call_succeeded(BlockCopyCallState *call_state)
>   {
> -    return call_state->finished && !call_state->cancelled &&
> -        call_state->ret == 0;
> +    return qatomic_load_acquire(&call_state->finished) &&
> +           !qatomic_read(&call_state->cancelled) &&
> +           call_state->ret == 0;
>   }
>   
>   bool block_copy_call_failed(BlockCopyCallState *call_state)
>   {
> -    return call_state->finished && !call_state->cancelled &&
> -        call_state->ret < 0;
> +    return qatomic_load_acquire(&call_state->finished) &&
> +           !qatomic_read(&call_state->cancelled) &&
> +           call_state->ret < 0;
>   }
>   
>   bool block_copy_call_cancelled(BlockCopyCallState *call_state)
>   {
> -    return call_state->cancelled;
> +    return qatomic_read(&call_state->cancelled);
>   }
>   
>   int block_copy_call_status(BlockCopyCallState *call_state, bool *error_is_read)
>   {
> -    assert(call_state->finished);
> +    assert(qatomic_load_acquire(&call_state->finished));
>       if (error_is_read) {
>           *error_is_read = call_state->error_is_read;
>       }
> @@ -870,7 +873,7 @@ int block_copy_call_status(BlockCopyCallState *call_state, bool *error_is_read)
>   
>   void block_copy_call_cancel(BlockCopyCallState *call_state)
>   {
> -    call_state->cancelled = true;
> +    qatomic_set(&call_state->cancelled, true);
>       block_copy_kick(call_state);
>   }
>   
> 


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 5/7] block-copy: add QemuMutex lock for BlockCopyCallState list
  2021-05-20 15:30   ` Vladimir Sementsov-Ogievskiy
@ 2021-05-21 15:01     ` Paolo Bonzini
  2021-05-25 10:58       ` Emanuele Giuseppe Esposito
  0 siblings, 1 reply; 48+ messages in thread
From: Paolo Bonzini @ 2021-05-21 15:01 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, Emanuele Giuseppe Esposito, qemu-block
  Cc: Kevin Wolf, John Snow, qemu-devel, Stefan Hajnoczi, Max Reitz

On 20/05/21 17:30, Vladimir Sementsov-Ogievskiy wrote:
> 18.05.2021 13:07, Emanuele Giuseppe Esposito wrote:
>> As for BlockCopyTask, add a lock to protect the BlockCopyCallState
>> ret and sleep_state fields. Also move ret, finished and cancelled
>> into the OUT fields of BlockCopyCallState.
>>
>> Here a QemuMutex is used to protect the QemuCoSleep field, since it
>> can also be invoked concurrently from other threads.
>>
>> .finished, .cancelled and reads of .ret and .error_is_read will be
>> protected in the following patch.
>>
>> .sleep state is handled in the series "coroutine: new sleep/wake API"
> 
> Could we live with one mutex for all needs? Why add one more?

This patch should just go away; the QemuMutex will not be needed once 
QemuCoSleep is thread safe, while right now it is still racy.

Paolo



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 6/7] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState
  2021-05-20 15:34   ` Vladimir Sementsov-Ogievskiy
@ 2021-05-21 15:02     ` Paolo Bonzini
  2021-05-21 15:53       ` Vladimir Sementsov-Ogievskiy
  2021-05-21 15:56       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 2 replies; 48+ messages in thread
From: Paolo Bonzini @ 2021-05-21 15:02 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, Emanuele Giuseppe Esposito, qemu-block
  Cc: Kevin Wolf, John Snow, qemu-devel, Stefan Hajnoczi, Max Reitz

On 20/05/21 17:34, Vladimir Sementsov-Ogievskiy wrote:
> 
>> By adding acquire/release pairs, we ensure that .ret and .error_is_read
>> fields are written by block_copy_dirty_clusters before .finished is true.
> 
> As I already said, please, can we live with one mutex for now? finished, 
> ret, error_is_read: all these variables change rarely. I doubt 
> that performance is improved by these atomic operations, but the complexity 
> of the architecture increases exponentially.

The problem is that these are used outside coroutines. 
load-acquire/store-release is really the simplest way to handle a 
"finished" flag.
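
For illustration, the intended pairing, condensed from the hunks quoted 
above (a sketch, not the exact patch):

    /* writer, end of block_copy_common(), coroutine context: */
    call_state->ret = ret;                              /* plain stores... */
    call_state->error_is_read = error_is_read;
    qatomic_store_release(&call_state->finished, true); /* ...published here */

    /* reader, possibly a non-coroutine thread: */
    if (qatomic_load_acquire(&call_state->finished)) {
        /* pairs with the release above: ->ret and ->error_is_read
         * written before it are guaranteed to be visible here */
        ret = call_state->ret;
    }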

Paolo



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 1/7] block-copy: streamline choice of copy_range vs. read/write
  2021-05-20 14:42   ` Vladimir Sementsov-Ogievskiy
  2021-05-20 15:06     ` Emanuele Giuseppe Esposito
@ 2021-05-21 15:10     ` Paolo Bonzini
  2021-05-21 15:48       ` Vladimir Sementsov-Ogievskiy
  2021-05-21 17:51       ` Vladimir Sementsov-Ogievskiy
  1 sibling, 2 replies; 48+ messages in thread
From: Paolo Bonzini @ 2021-05-21 15:10 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, Emanuele Giuseppe Esposito, qemu-block
  Cc: Kevin Wolf, John Snow, qemu-devel, Stefan Hajnoczi, Max Reitz

On 20/05/21 16:42, Vladimir Sementsov-Ogievskiy wrote:
> 18.05.2021 13:07, Emanuele Giuseppe Esposito wrote:
>> From: Paolo Bonzini <pbonzini@redhat.com>
>>
>> Put the logic to determine the copy size in a separate function, so
>> that there is a simple state machine for the possible methods of
>> copying data from one BlockDriverState to the other.
> 
> Honestly, for me a 4-state state machine + a function to determine the copy 
> size doesn't seem better than two simple variables copy_size and use_copy_range.

There were six states before (two for s->use_copy_range, three for s->copy_size),
of which two were unused.  The heuristics for switching between copy_range and
read/write were quite illegible.
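
Spelled out, the mapping as I read the pre-patch code (a sketch; the 
BLOCK_COPY_ prefixes of the constants are abbreviated):

    /*
     * (use_copy_range, copy_size)                   new s->method
     * ---------------------------                   -------------
     * (false, cluster_size)                      -> COPY_READ_WRITE_CLUSTER
     * (false, MAX(cluster_size, MAX_BUFFER))     -> COPY_READ_WRITE
     * (true,  MAX(cluster_size, MAX_BUFFER))     -> COPY_RANGE_SMALL
     * (true,  MAX(cluster_size, MAX_COPY_RANGE)) -> COPY_RANGE_FULL
     * (true,  cluster_size)                      -> unused
     * (false, MAX(cluster_size, MAX_COPY_RANGE)) -> unused
     */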

> What's the benefit of it?

Less duplication here, for example:

+    if (s->max_transfer < cluster_size) {
           /*
            * copy_range does not respect max_transfer. We don't want to bother
            * with requests smaller than block-copy cluster size, so fallback to
            * buffered copying (read and write respect max_transfer on their
            * behalf).
            */
-        s->use_copy_range = false;
-        s->copy_size = cluster_size;
+        s->method = COPY_READ_WRITE_CLUSTER;
       } else if (write_flags & BDRV_REQ_WRITE_COMPRESSED) {
           /* Compression supports only cluster-size writes and no copy-range. */
-        s->use_copy_range = false;
-        s->copy_size = cluster_size;
+        s->method = COPY_READ_WRITE_CLUSTER;

and here:

               trace_block_copy_copy_range_fail(s, offset, ret);
-            s->use_copy_range = false;
-            s->copy_size = MAX(s->cluster_size, BLOCK_COPY_MAX_BUFFER);
+            s->method = COPY_READ_WRITE;
...
           /*
            * We enable copy-range, but keep small copy_size, until first
            * successful copy_range (look at block_copy_do_copy).
            */
-        s->use_copy_range = use_copy_range;
-        s->copy_size = MAX(s->cluster_size, BLOCK_COPY_MAX_BUFFER);
+        s->method = use_copy_range ? COPY_RANGE_SMALL : COPY_READ_WRITE;

where it's not obvious that the two assignments to copy_size should be
the same (and they're suboptimal, too, since they don't obey max_transfer).

... plus...

>> While at it, store the common computation of block_copy_max_transfer
>> into a new field of BlockCopyState, and make sure that we always
>> obey max_transfer; that's more efficient even for the
>> COPY_RANGE_READ_WRITE case.

having a function makes it easier to spot slight differences that are
just bugs, such as this one.

>> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>>   block/block-copy.c | 59 ++++++++++++++++++++++++++++++----------------
> 
> stats agree with me that it's not a simplification.

Stats don't say everything.  Not having something like this:

                 s->copy_size =
                         MIN(MAX(s->cluster_size, BLOCK_COPY_MAX_COPY_RANGE),
                             QEMU_ALIGN_DOWN(block_copy_max_transfer(s->source,
                                                                     s->target),
                                             s->cluster_size));

in the inner loop is already worth the extra lines for the
function declaration, for example.

Paolo

>>   1 file changed, 39 insertions(+), 20 deletions(-)
>>
>> diff --git a/block/block-copy.c b/block/block-copy.c
>> index 37c8e8504b..10ce51a244 100644
>> --- a/block/block-copy.c
>> +++ b/block/block-copy.c
>> @@ -28,6 +28,13 @@
>>   #define BLOCK_COPY_MAX_WORKERS 64
>>   #define BLOCK_COPY_SLICE_TIME 100000000ULL /* ns */
>> +typedef enum {
>> +    COPY_READ_WRITE_CLUSTER,
>> +    COPY_READ_WRITE,
>> +    COPY_RANGE_SMALL,
>> +    COPY_RANGE_FULL
>> +} BlockCopyMethod;
>> +
>>   static coroutine_fn int block_copy_task_entry(AioTask *task);
>>   typedef struct BlockCopyCallState {
>> @@ -85,8 +92,8 @@ typedef struct BlockCopyState {
>>       BdrvDirtyBitmap *copy_bitmap;
>>       int64_t in_flight_bytes;
>>       int64_t cluster_size;
>> -    bool use_copy_range;
>> -    int64_t copy_size;
>> +    BlockCopyMethod method;
>> +    int64_t max_transfer;
>>       uint64_t len;
>>       QLIST_HEAD(, BlockCopyTask) tasks; /* All tasks from all 
>> block-copy calls */
>>       QLIST_HEAD(, BlockCopyCallState) calls;
>> @@ -148,6 +155,23 @@ static bool coroutine_fn 
>> block_copy_wait_one(BlockCopyState *s, int64_t offset,
>>       return true;
>>   }
>> +static inline int64_t block_copy_chunk_size(BlockCopyState *s)
> 
> "inline" word does nothing in static definitions in c files. Compiler 
> make a decision independently of it.
> 
>> +{
>> +    switch (s->method) {
>> +    case COPY_READ_WRITE_CLUSTER:
>> +        return s->cluster_size;
>> +    case COPY_READ_WRITE:
>> +    case COPY_RANGE_SMALL:
>> +        return MIN(MAX(s->cluster_size, BLOCK_COPY_MAX_BUFFER),
>> +                   s->max_transfer);
>> +    case COPY_RANGE_FULL:
>> +        return MIN(MAX(s->cluster_size, BLOCK_COPY_MAX_COPY_RANGE),
>> +                   s->max_transfer);
>> +    default:
>> +        abort();
>> +    }
>> +}
>> +
>>   /*
>>    * Search for the first dirty area in offset/bytes range and create 
>> task at
>>    * the beginning of it.
>> @@ -157,8 +181,9 @@ static BlockCopyTask 
>> *block_copy_task_create(BlockCopyState *s,
>>                                                int64_t offset, int64_t 
>> bytes)
>>   {
>>       BlockCopyTask *task;
>> -    int64_t max_chunk = MIN_NON_ZERO(s->copy_size, 
>> call_state->max_chunk);
>> +    int64_t max_chunk = block_copy_chunk_size(s);
>> +    max_chunk = MIN_NON_ZERO(max_chunk, call_state->max_chunk);
>>       if (!bdrv_dirty_bitmap_next_dirty_area(s->copy_bitmap,
>>                                              offset, offset + bytes,
>>                                              max_chunk, &offset, &bytes))
>> @@ -265,28 +290,27 @@ BlockCopyState *block_copy_state_new(BdrvChild 
>> *source, BdrvChild *target,
>>           .len = bdrv_dirty_bitmap_size(copy_bitmap),
>>           .write_flags = write_flags,
>>           .mem = shres_create(BLOCK_COPY_MAX_MEM),
>> +        .max_transfer = 
>> QEMU_ALIGN_DOWN(block_copy_max_transfer(source, target)
>> +                                        , cluster_size),
>>       };
>> -    if (block_copy_max_transfer(source, target) < cluster_size) {
>> +    if (s->max_transfer < cluster_size) {
>>           /*
>>            * copy_range does not respect max_transfer. We don't want 
>> to bother
>>            * with requests smaller than block-copy cluster size, so 
>> fallback to
>>            * buffered copying (read and write respect max_transfer on 
>> their
>>            * behalf).
>>            */
>> -        s->use_copy_range = false;
>> -        s->copy_size = cluster_size;
>> +        s->method = COPY_READ_WRITE_CLUSTER;
>>       } else if (write_flags & BDRV_REQ_WRITE_COMPRESSED) {
>>           /* Compression supports only cluster-size writes and no 
>> copy-range. */
>> -        s->use_copy_range = false;
>> -        s->copy_size = cluster_size;
>> +        s->method = COPY_READ_WRITE_CLUSTER;
>>       } else {
>>           /*
>>            * We enable copy-range, but keep small copy_size, until first
>>            * successful copy_range (look at block_copy_do_copy).
>>            */
>> -        s->use_copy_range = use_copy_range;
>> -        s->copy_size = MAX(s->cluster_size, BLOCK_COPY_MAX_BUFFER);
>> +        s->method = use_copy_range ? COPY_RANGE_SMALL : COPY_READ_WRITE;
>>       }
>>       ratelimit_init(&s->rate_limit);
>> @@ -369,30 +393,25 @@ static int coroutine_fn 
>> block_copy_do_copy(BlockCopyState *s,
>>           return ret;
>>       }
>> -    if (s->use_copy_range) {
>> +    if (s->method >= COPY_RANGE_SMALL) {
> 
> I don't like such a condition:
> 1. it forces me to go to the enum definition to understand the logic
> 2. it's error prone: it's very easy to forget to update it when 
> updating the enum, and the logic will be broken.
> 
> No, I don't like moving to a state machine for this simple thing.
> 
>>           ret = bdrv_co_copy_range(s->source, offset, s->target, 
>> offset, nbytes,
>>                                    0, s->write_flags);
>>           if (ret < 0) {
>>               trace_block_copy_copy_range_fail(s, offset, ret);
>> -            s->use_copy_range = false;
>> -            s->copy_size = MAX(s->cluster_size, BLOCK_COPY_MAX_BUFFER);
>> +            s->method = COPY_READ_WRITE;
>>               /* Fallback to read+write with allocated buffer */
>>           } else {
>> -            if (s->use_copy_range) {
>> +            if (s->method == COPY_RANGE_SMALL) {
>>                   /*
>>                    * Successful copy-range. Now increase copy_size.  
>> copy_range
>>                    * does not respect max_transfer (it's a TODO), so 
>> we factor
>>                    * that in here.
>>                    *
>> -                 * Note: we double-check s->use_copy_range for the 
>> case when
>> +                 * Note: we double-check s->method for the case when
>>                    * parallel block-copy request unsets it during 
>> previous
>>                    * bdrv_co_copy_range call.
>>                    */
>> -                s->copy_size =
>> -                        MIN(MAX(s->cluster_size, BLOCK_COPY_MAX_COPY_RANGE),
>> -                            QEMU_ALIGN_DOWN(block_copy_max_transfer(s->source,
>> -                                                                    s->target),
>> -                                            s->cluster_size));
>> +                s->method = COPY_RANGE_FULL;
>>               }
>>               goto out;
>>           }
>>
> 
> 



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 1/7] block-copy: streamline choice of copy_range vs. read/write
  2021-05-21 15:10     ` Paolo Bonzini
@ 2021-05-21 15:48       ` Vladimir Sementsov-Ogievskiy
  2021-05-21 16:43         ` Paolo Bonzini
  2021-05-21 17:51       ` Vladimir Sementsov-Ogievskiy
  1 sibling, 1 reply; 48+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-05-21 15:48 UTC (permalink / raw)
  To: Paolo Bonzini, Emanuele Giuseppe Esposito, qemu-block
  Cc: John Snow, Kevin Wolf, Max Reitz, Stefan Hajnoczi, qemu-devel

21.05.2021 18:10, Paolo Bonzini wrote:
> On 20/05/21 16:42, Vladimir Sementsov-Ogievskiy wrote:
>> 18.05.2021 13:07, Emanuele Giuseppe Esposito wrote:
>>> From: Paolo Bonzini <pbonzini@redhat.com>
>>>
>>> Put the logic to determine the copy size in a separate function, so
>>> that there is a simple state machine for the possible methods of
>>> copying data from one BlockDriverState to the other.
>>
>> Honestly, for me a 4-state state machine + a function to determine the copy size doesn't seem better than two simple variables copy_size and use_copy_range.
> 
> There were six states before (two for s->use_copy_range, three for s->copy_size),
> of which two were unused.  The heuristics for switching between copy_range and
> read/write were quite illegible.
> 
>> What's the benefit of it?
> 
> Less duplication here, for example:
> 
> +    if (s->max_transfer < cluster_size) {
>            /*
>             * copy_range does not respect max_transfer. We don't want to bother
>             * with requests smaller than block-copy cluster size, so fallback to
>             * buffered copying (read and write respect max_transfer on their
>             * behalf).
>             */
> -        s->use_copy_range = false;
> -        s->copy_size = cluster_size;
> +        s->method = COPY_READ_WRITE_CLUSTER;
>        } else if (write_flags & BDRV_REQ_WRITE_COMPRESSED) {
>            /* Compression supports only cluster-size writes and no copy-range. */
> -        s->use_copy_range = false;
> -        s->copy_size = cluster_size;
> +        s->method = COPY_READ_WRITE_CLUSTER;
> 
> and here:
> 
>                trace_block_copy_copy_range_fail(s, offset, ret);
> -            s->use_copy_range = false;
> -            s->copy_size = MAX(s->cluster_size, BLOCK_COPY_MAX_BUFFER);
> +            s->method = COPY_READ_WRITE;
> ...
>            /*
>             * We enable copy-range, but keep small copy_size, until first
>             * successful copy_range (look at block_copy_do_copy).
>             */
> -        s->use_copy_range = use_copy_range;
> -        s->copy_size = MAX(s->cluster_size, BLOCK_COPY_MAX_BUFFER);
> +        s->method = use_copy_range ? COPY_RANGE_SMALL : COPY_READ_WRITE;
> 
> where it's not obvious that the two assignments to copy_size should be
> the same (and they're suboptimal, too, since they don't obey max_transfer).
> 
> ... plus...
> 
>>> While at it, store the common computation of block_copy_max_transfer
>>> into a new field of BlockCopyState, and make sure that we always
>>> obey max_transfer; that's more efficient even for the
>>> COPY_RANGE_READ_WRITE case.
> 
> having a function makes it easier to spot slight differences that are
> just bugs, such as this one.
> 
>>> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
>>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>>> ---
>>>   block/block-copy.c | 59 ++++++++++++++++++++++++++++++----------------
>>
>> stats agree with me that it's not a simplification.
> 
> Stats don't say everything.  Not having something like this:
> 
>                  s->copy_size =
>                          MIN(MAX(s->cluster_size, BLOCK_COPY_MAX_COPY_RANGE),
>                              QEMU_ALIGN_DOWN(block_copy_max_transfer(s->source,
>                                                                      s->target),
>                                              s->cluster_size));
> 
> in the inner loop is already worth the extra lines for the
> function declaration, for example.


After my "[PATCH v2 00/33] block: publish backup-top filter" copy_range path becomes unused. I keep it thinking about further moving qemu-img convert to block-copy. But I don't even have a plan when to start this work. So, if we want to do something around copy_range here to prepare for thread-safety, let's just drop it for now as unused on top of "[PATCH v2 00/33] block: publish backup-top filter" (you can take one patch that drop copy_range support in backup to your series to not make a dependency). It's not difficult to reimplement it later.


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 6/7] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState
  2021-05-21 15:02     ` Paolo Bonzini
@ 2021-05-21 15:53       ` Vladimir Sementsov-Ogievskiy
  2021-05-21 15:56       ` Vladimir Sementsov-Ogievskiy
  1 sibling, 0 replies; 48+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-05-21 15:53 UTC (permalink / raw)
  To: Paolo Bonzini, Emanuele Giuseppe Esposito, qemu-block
  Cc: John Snow, Kevin Wolf, Max Reitz, Stefan Hajnoczi, qemu-devel

21.05.2021 18:02, Paolo Bonzini wrote:
> On 20/05/21 17:34, Vladimir Sementsov-Ogievskiy wrote:
>>
>>> By adding acquire/release pairs, we ensure that .ret and .error_is_read
>>> fields are written by block_copy_dirty_clusters before .finished is true.
>>
>> As I already said, please, can we live with one mutex for now? finished, ret, error_is_read: all these variables change rarely. I doubt that performance is improved by these atomic operations, but the complexity of the architecture increases exponentially.
> 
> The problem is that these are used outside coroutines.

Hmm. I think that if some bit of block-copy is not in a coroutine yet, it's simple to move it into one.

> load-acquire/store-release is the simplest way to handle a "finished" flag really.
> 
> Paolo
> 


-- 
Best regards,
Vladimir



* Re: [PATCH v2 6/7] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState
  2021-05-21 15:02     ` Paolo Bonzini
  2021-05-21 15:53       ` Vladimir Sementsov-Ogievskiy
@ 2021-05-21 15:56       ` Vladimir Sementsov-Ogievskiy
  2021-05-21 16:47         ` Paolo Bonzini
  1 sibling, 1 reply; 48+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-05-21 15:56 UTC (permalink / raw)
  To: Paolo Bonzini, Emanuele Giuseppe Esposito, qemu-block
  Cc: John Snow, Kevin Wolf, Max Reitz, Stefan Hajnoczi, qemu-devel

21.05.2021 18:02, Paolo Bonzini wrote:
> On 20/05/21 17:34, Vladimir Sementsov-Ogievskiy wrote:
>>
>>> By adding acquire/release pairs, we ensure that .ret and .error_is_read
>>> fields are written by block_copy_dirty_clusters before .finished is true.
>>
>> As I already said, please, can we live with one mutex for now? finished, ret, error_is_read, all these variables are changing rarely. I doubt that performance is improved by these atomic operations. But complexity of the architecture increases exponentially.
> 
> The problem is that these are used outside coroutines. load-acquire/store-release is the simplest way to handle a "finished" flag really.
> 

Related, maybe we can support CoMutex outside of coroutine context?

Create a coroutine that locks the mutex and unlocks it for us... or something like that. It would help a lot with the task of making everything thread-safe.


-- 
Best regards,
Vladimir



* Re: [PATCH v2 1/7] block-copy: streamline choice of copy_range vs. read/write
  2021-05-21 15:48       ` Vladimir Sementsov-Ogievskiy
@ 2021-05-21 16:43         ` Paolo Bonzini
  0 siblings, 0 replies; 48+ messages in thread
From: Paolo Bonzini @ 2021-05-21 16:43 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, Emanuele Giuseppe Esposito, qemu-block
  Cc: Kevin Wolf, John Snow, qemu-devel, Stefan Hajnoczi, Max Reitz

On 21/05/21 17:48, Vladimir Sementsov-Ogievskiy wrote:
> I keep it thinking about further moving qemu-img convert to block-copy. 
> But I don't even have a plan when to start this work. So, if we want to 
> do something around copy_range here to prepare for thread-safety, let's 
> just drop it for now as unused on top of "[PATCH v2 00/33] block: 
> publish backup-top filter" (you can take one patch that drop copy_range 
> support in backup to your series to not make a dependency). It's not 
> difficult to reimplement it later.

Then we'll have this same conversation later.  I would hope that this 
series goes in before yours.

Paolo




* Re: [PATCH v2 6/7] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState
  2021-05-21 15:56       ` Vladimir Sementsov-Ogievskiy
@ 2021-05-21 16:47         ` Paolo Bonzini
  0 siblings, 0 replies; 48+ messages in thread
From: Paolo Bonzini @ 2021-05-21 16:47 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, Emanuele Giuseppe Esposito, qemu-block
  Cc: Kevin Wolf, John Snow, qemu-devel, Stefan Hajnoczi, Max Reitz

On 21/05/21 17:56, Vladimir Sementsov-Ogievskiy wrote:
> 21.05.2021 18:02, Paolo Bonzini wrote:
>> On 20/05/21 17:34, Vladimir Sementsov-Ogievskiy wrote:
>>>
>>>> By adding acquire/release pairs, we ensure that .ret and .error_is_read
>>>> fields are written by block_copy_dirty_clusters before .finished is 
>>>> true.
>>>
>>> As I already said, please, can we live with one mutex for now? 
>>> finished, ret, error_is_read, all these variables are changing 
>>> rarely. I doubt that performance is improved by these atomic 
>>> operations. But complexity of the architecture increases exponentially.
>>
>> The problem is that these are used outside coroutines. 
>> load-acquire/store-release is the simplest way to handle a "finished" 
>> flag really.
>>
> 
> Related, maybe we can support CoMutex outside of coroutine context?
> 
> Create a coroutine, which will lock the mutex and unlock it for us... Or 
> something like this.. It will help the task of making everything 
> thread-safe a lot

No, it's not possible because the coroutine will yield to the caller if 
the mutex is contended, but the caller will not be able to use the data 
that is protected by the mutex.

There is no reason to have stuff like block_copy_call_status be a 
coroutine_fn.  Even if it's only called from a coroutine today, I'd 
rather have the code be more future-proof.
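
For illustration, a minimal sketch of such plain (non-coroutine_fn)
accessors, relying on the acquire/release pairing that patch 6
introduces -- a sketch under the series' assumptions, not the final code:

    /* Copy side, at the end of block_copy_common(): write the result
     * fields first, then publish them.  The store-release pairs with
     * the load-acquire in the accessors below. */
    call_state->ret = ret;
    call_state->error_is_read = error_is_read;
    qatomic_store_release(&call_state->finished, true);

    /* Any thread (e.g. the monitor) can then poll: */
    bool block_copy_call_finished(BlockCopyCallState *call_state)
    {
        return qatomic_load_acquire(&call_state->finished);
    }

    bool block_copy_call_succeeded(BlockCopyCallState *call_state)
    {
        /* .ret may be read only after .finished is observed as true;
         * the cancellation check is omitted here for brevity. */
        return qatomic_load_acquire(&call_state->finished) &&
               call_state->ret == 0;
    }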

Paolo




* Re: [PATCH v2 7/7] block-copy: protect BlockCopyState .method fields
  2021-05-18 10:07 ` [PATCH v2 7/7] block-copy: protect BlockCopyState .method fields Emanuele Giuseppe Esposito
@ 2021-05-21 17:10   ` Vladimir Sementsov-Ogievskiy
  2021-05-25 10:18     ` Emanuele Giuseppe Esposito
  0 siblings, 1 reply; 48+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-05-21 17:10 UTC (permalink / raw)
  To: Emanuele Giuseppe Esposito, qemu-block
  Cc: John Snow, Kevin Wolf, Max Reitz, Paolo Bonzini, Stefan Hajnoczi,
	qemu-devel

18.05.2021 13:07, Emanuele Giuseppe Esposito wrote:
> With tasks and calls lock protecting all State fields,
> .method is the last BlockCopyState field left unprotected.
> Set it as atomic.
> 
> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>

OK, in patch 06 some things are outside coroutine context. Here, could we just reuse the mutex?

I believe we don't need any kind of protection for .method inside block_copy_state_new(), as it's just the creation and initialization of a new structure.

And the other things are called from coroutines. So there seems to be no reason for additional atomic access logic?

> ---
>   block/block-copy.c | 37 ++++++++++++++++++-------------------
>   1 file changed, 18 insertions(+), 19 deletions(-)
> 
> diff --git a/block/block-copy.c b/block/block-copy.c
> index 573e96fefb..ebccb7fbc6 100644
> --- a/block/block-copy.c
> +++ b/block/block-copy.c
> @@ -108,7 +108,7 @@ typedef struct BlockCopyState {
>   
>       /* State */
>       int64_t in_flight_bytes; /* protected by tasks_lock */
> -    BlockCopyMethod method;
> +    BlockCopyMethod method; /* atomic */
>       CoMutex tasks_lock;
>       QLIST_HEAD(, BlockCopyTask) tasks; /* All tasks from all block-copy calls */
>       QemuMutex calls_lock;
> @@ -184,7 +184,7 @@ static bool coroutine_fn block_copy_wait_one(BlockCopyState *s, int64_t offset,
>   
>   static inline int64_t block_copy_chunk_size(BlockCopyState *s)
>   {
> -    switch (s->method) {
> +    switch (qatomic_read(&s->method)) {
>       case COPY_READ_WRITE_CLUSTER:
>           return s->cluster_size;
>       case COPY_READ_WRITE:
> @@ -338,16 +338,17 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
>            * buffered copying (read and write respect max_transfer on their
>            * behalf).
>            */
> -        s->method = COPY_READ_WRITE_CLUSTER;
> +        qatomic_set(&s->method, COPY_READ_WRITE_CLUSTER);
>       } else if (write_flags & BDRV_REQ_WRITE_COMPRESSED) {
>           /* Compression supports only cluster-size writes and no copy-range. */
> -        s->method = COPY_READ_WRITE_CLUSTER;
> +        qatomic_set(&s->method, COPY_READ_WRITE_CLUSTER);
>       } else {
>           /*
>            * We enable copy-range, but keep small copy_size, until first
>            * successful copy_range (look at block_copy_do_copy).
>            */
> -        s->method = use_copy_range ? COPY_RANGE_SMALL : COPY_READ_WRITE;
> +        qatomic_set(&s->method, use_copy_range ? COPY_RANGE_SMALL :
> +                                                 COPY_READ_WRITE);
>       }
>   
>       ratelimit_init(&s->rate_limit);
> @@ -432,26 +433,24 @@ static int coroutine_fn block_copy_do_copy(BlockCopyState *s,
>           return ret;
>       }
>   
> -    if (s->method >= COPY_RANGE_SMALL) {
> +    if (qatomic_read(&s->method) >= COPY_RANGE_SMALL) {
>           ret = bdrv_co_copy_range(s->source, offset, s->target, offset, nbytes,
>                                    0, s->write_flags);
>           if (ret < 0) {
>               trace_block_copy_copy_range_fail(s, offset, ret);
> -            s->method = COPY_READ_WRITE;
> +            qatomic_set(&s->method, COPY_READ_WRITE);
>               /* Fallback to read+write with allocated buffer */
>           } else {
> -            if (s->method == COPY_RANGE_SMALL) {
> -                /*
> -                 * Successful copy-range. Now increase copy_size.  copy_range
> -                 * does not respect max_transfer (it's a TODO), so we factor
> -                 * that in here.
> -                 *
> -                 * Note: we double-check s->method for the case when
> -                 * parallel block-copy request unsets it during previous
> -                 * bdrv_co_copy_range call.
> -                 */
> -                s->method = COPY_RANGE_FULL;
> -            }
> +            /*
> +             * Successful copy-range. Now increase copy_size.  copy_range
> +             * does not respect max_transfer (it's a TODO), so we factor
> +             * that in here.
> +             *
> +             * Note: we double-check s->method for the case when
> +             * parallel block-copy request unsets it during previous
> +             * bdrv_co_copy_range call.
> +             */
> +            qatomic_cmpxchg(&s->method, COPY_RANGE_SMALL, COPY_RANGE_FULL);
>               goto out;
>           }
>       }
> 


-- 
Best regards,
Vladimir



* Re: [PATCH v2 1/7] block-copy: streamline choice of copy_range vs. read/write
  2021-05-21 15:10     ` Paolo Bonzini
  2021-05-21 15:48       ` Vladimir Sementsov-Ogievskiy
@ 2021-05-21 17:51       ` Vladimir Sementsov-Ogievskiy
  1 sibling, 0 replies; 48+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-05-21 17:51 UTC (permalink / raw)
  To: Paolo Bonzini, Emanuele Giuseppe Esposito, qemu-block
  Cc: John Snow, Kevin Wolf, Max Reitz, Stefan Hajnoczi, qemu-devel

21.05.2021 18:10, Paolo Bonzini wrote:
> Stats don't say everything.  Not having something like this:
> 
>                  s->copy_size =
>                          MIN(MAX(s->cluster_size, BLOCK_COPY_MAX_COPY_RANGE),
>                              QEMU_ALIGN_DOWN(block_copy_max_transfer(s->source,
>                                                                      s->target),
>                                              s->cluster_size));


Yes, this one always looked bad..

-- 
Best regards,
Vladimir



* Re: [PATCH v2 4/7] block-copy: add a CoMutex to the BlockCopyTask list
  2021-05-20 15:19   ` Vladimir Sementsov-Ogievskiy
@ 2021-05-25 10:07     ` Emanuele Giuseppe Esposito
  2021-05-25 10:25       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 48+ messages in thread
From: Emanuele Giuseppe Esposito @ 2021-05-25 10:07 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: Kevin Wolf, qemu-devel, Max Reitz, Stefan Hajnoczi,
	Paolo Bonzini, John Snow



On 20/05/2021 17:19, Vladimir Sementsov-Ogievskiy wrote:
> 18.05.2021 13:07, Emanuele Giuseppe Esposito wrote:
>> Because the list of tasks is only modified by coroutine
>> functions, add a CoMutex in order to protect them.
>>
>> Use the same mutex to protect also BlockCopyState in_flight_bytes
>> field to avoid adding additional syncronization primitives.
>>
>> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
>> ---
>>   block/block-copy.c | 55 +++++++++++++++++++++++++++++-----------------
>>   1 file changed, 35 insertions(+), 20 deletions(-)
>>
>> diff --git a/block/block-copy.c b/block/block-copy.c
>> index 2e610b4142..3a949fab64 100644
>> --- a/block/block-copy.c
>> +++ b/block/block-copy.c
>> @@ -83,7 +83,7 @@ typedef struct BlockCopyTask {
>>        */
>>       bool zeroes;
>> -    /* State */
>> +    /* State. Protected by tasks_lock */
>>       CoQueue wait_queue; /* coroutines blocked on this task */
>>       /* To reference all call states from BlockCopyState */
>> @@ -106,8 +106,9 @@ typedef struct BlockCopyState {
>>       BdrvChild *target;
>>       /* State */
>> -    int64_t in_flight_bytes;
>> +    int64_t in_flight_bytes; /* protected by tasks_lock */
> 
> only this field is protected? or the whole State section?

I guess you figured it out already: here it is only in_flight_bytes 
because the next patches take care of the others.

[...]

>>   }
>> @@ -213,11 +219,7 @@ static coroutine_fn BlockCopyTask 
>> *block_copy_task_create(BlockCopyState *s,
>>       assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
>>       bytes = QEMU_ALIGN_UP(bytes, s->cluster_size);
> 
> Looking at block_copy_task_create():
> 
> First, !bdrv_dirty_bitmap_next_dirty_area() doesn't take bitmaps lock, 
> so it's not protected at all.
> 
> Next, even if we take bitmaps lock in 
> bdrv_dirty_bitmap_next_dirty_area() or around it, it doesn't bring 
> thread-safety to block_copy_task_create():

The simplest solution here seems to be protecting 
bdrv_dirty_bitmap_next_dirty_area and also bdrv_reset_dirty_bitmap with 
the tasks lock, so that by the time it is released the bitmap has been 
updated accordingly and, from my understanding, no other task can get 
that region.
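
Something like this (a sketch composed from the hunks quoted in this
thread, with the bitmap calls moved inside the guard; untested):

    WITH_QEMU_LOCK_GUARD(&s->tasks_lock) {
        /* find the next dirty area via
         * bdrv_dirty_bitmap_next_dirty_area() (arguments elided),
         * bailing out if there is none */
        /* ... allocate and fill in the task ... */
        bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, bytes);
        s->in_flight_bytes += bytes;
        /* region was dirty, so no existing task can overlap it */
        assert(!find_conflicting_task_locked(s, offset, bytes));
        QLIST_INSERT_HEAD(&s->tasks, task, list);
    }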

Btw, out of curiosity, why is bdrv_dirty_bitmap_next_dirty_area not 
protected by a lock? Can't we have a case where two threads (like you 
just mention below) check the bitmap? Or am I missing something?

> 
> Imagine block_copy_task_create() is called from two threads 
> simultaneously. Both calls will get same dirty area and create equal 
> tasks. Then, "assert(!find_conflicting_task_locked(s, offset, bytes));" 
> will crash.
> 


> 
>> -    /* region is dirty, so no existent tasks possible in it */
>> -    assert(!find_conflicting_task(s, offset, bytes));
>> -
>>       bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, bytes);
>> -    s->in_flight_bytes += bytes;
>>       task = g_new(BlockCopyTask, 1);
>>       *task = (BlockCopyTask) {
>> @@ -228,7 +230,13 @@ static coroutine_fn BlockCopyTask 
>> *block_copy_task_create(BlockCopyState *s,
>>           .bytes = bytes,
>>       };
>>       qemu_co_queue_init(&task->wait_queue);
>> -    QLIST_INSERT_HEAD(&s->tasks, task, list);
>> +
>> +    WITH_QEMU_LOCK_GUARD(&s->tasks_lock) {
>> +        s->in_flight_bytes += bytes;
>> +        /* region is dirty, so no existent tasks possible in it */
>> +        assert(!find_conflicting_task_locked(s, offset, bytes));
>> +        QLIST_INSERT_HEAD(&s->tasks, task, list);
>> +    }
>>       return task;
>>   }
>> @@ -249,25 +257,29 @@ static void coroutine_fn 
>> block_copy_task_shrink(BlockCopyTask *task,
>>       assert(new_bytes > 0 && new_bytes < task->bytes);
>> -    task->s->in_flight_bytes -= task->bytes - new_bytes;
>>       bdrv_set_dirty_bitmap(task->s->copy_bitmap,
>>                             task->offset + new_bytes, task->bytes - 
>> new_bytes);
> 
> Then look here. Imagine, immediately after bdrv_set_dirty_bitmap() in 
> parallel thread the new task is created, which consumes these new dirty 
> bits. Again, it will find conflicting task (this one, as task->bytes is 
> not modified yet) and crash.

Here too, the lock guard should be enlarged to also cover the dirty 
bitmap. Thank you for pointing out these cases!
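
Concretely, something like this for block_copy_task_shrink() (a sketch
based on the hunks quoted in this mail, moving the
bdrv_set_dirty_bitmap() call inside the guard):

    WITH_QEMU_LOCK_GUARD(&task->s->tasks_lock) {
        task->s->in_flight_bytes -= task->bytes - new_bytes;
        bdrv_set_dirty_bitmap(task->s->copy_bitmap,
                              task->offset + new_bytes,
                              task->bytes - new_bytes);
        task->bytes = new_bytes;
        qemu_co_queue_restart_all(&task->wait_queue);
    }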

Emanuele

> 
>> -    task->bytes = new_bytes;
>> -    qemu_co_queue_restart_all(&task->wait_queue);
>> +    WITH_QEMU_LOCK_GUARD(&task->s->tasks_lock) {
>> +        task->s->in_flight_bytes -= task->bytes - new_bytes;
>> +        task->bytes = new_bytes;
>> +        qemu_co_queue_restart_all(&task->wait_queue);
>> +    }
>>   }




* Re: [PATCH v2 7/7] block-copy: protect BlockCopyState .method fields
  2021-05-21 17:10   ` Vladimir Sementsov-Ogievskiy
@ 2021-05-25 10:18     ` Emanuele Giuseppe Esposito
  2021-05-25 11:00       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 48+ messages in thread
From: Emanuele Giuseppe Esposito @ 2021-05-25 10:18 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: Kevin Wolf, qemu-devel, Max Reitz, Stefan Hajnoczi,
	Paolo Bonzini, John Snow



On 21/05/2021 19:10, Vladimir Sementsov-Ogievskiy wrote:
> 18.05.2021 13:07, Emanuele Giuseppe Esposito wrote:
>> With tasks and calls lock protecting all State fields,
>> .method is the last BlockCopyState field left unprotected.
>> Set it as atomic.
>>
>> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
> 
> OK, in 06 some things are out of coroutine. Here could we just reuse mutex?
> 
> I believe, that we don't need any kind of protection for .method inside 
> block_copy_state_new(), as it's just a creation and initialization of 
> new structure.

I agree here; I will remove the qatomic_set in block_copy_state_new().
> 
> And other things are called from coroutines. So, seems no reasons for 
> additional atomic access logic?

But... why should I use a mutex? I think the .method usage is pretty
straightforward; adding a lock (which one? tasks_lock does not seem 
appropriate) would also cover functions that do not need it, since 
the field is modified in if-else statements (see block_copy_do_copy).
It looks to me that an atomic here won't hurt, and it's pretty 
straightforward to understand.

Thank you,
Emanuele




* Re: [PATCH v2 4/7] block-copy: add a CoMutex to the BlockCopyTask list
  2021-05-25 10:07     ` Emanuele Giuseppe Esposito
@ 2021-05-25 10:25       ` Vladimir Sementsov-Ogievskiy
  2021-05-26 14:58         ` Paolo Bonzini
  0 siblings, 1 reply; 48+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-05-25 10:25 UTC (permalink / raw)
  To: Emanuele Giuseppe Esposito, qemu-block
  Cc: John Snow, Kevin Wolf, Max Reitz, Paolo Bonzini, Stefan Hajnoczi,
	qemu-devel

25.05.2021 13:07, Emanuele Giuseppe Esposito wrote:
> 
> 
> On 20/05/2021 17:19, Vladimir Sementsov-Ogievskiy wrote:
>> 18.05.2021 13:07, Emanuele Giuseppe Esposito wrote:
>>> Because the list of tasks is only modified by coroutine
>>> functions, add a CoMutex in order to protect them.
>>>
>>> Use the same mutex to protect also BlockCopyState in_flight_bytes
>>> field to avoid adding additional syncronization primitives.
>>>
>>> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
>>> ---
>>>   block/block-copy.c | 55 +++++++++++++++++++++++++++++-----------------
>>>   1 file changed, 35 insertions(+), 20 deletions(-)
>>>
>>> diff --git a/block/block-copy.c b/block/block-copy.c
>>> index 2e610b4142..3a949fab64 100644
>>> --- a/block/block-copy.c
>>> +++ b/block/block-copy.c
>>> @@ -83,7 +83,7 @@ typedef struct BlockCopyTask {
>>>        */
>>>       bool zeroes;
>>> -    /* State */
>>> +    /* State. Protected by tasks_lock */
>>>       CoQueue wait_queue; /* coroutines blocked on this task */
>>>       /* To reference all call states from BlockCopyState */
>>> @@ -106,8 +106,9 @@ typedef struct BlockCopyState {
>>>       BdrvChild *target;
>>>       /* State */
>>> -    int64_t in_flight_bytes;
>>> +    int64_t in_flight_bytes; /* protected by tasks_lock */
>>
>> only this field is protected? or the whole State section?
> 
> I guess you figured it already, here there is only in_flight_bytes because in next patches I take care of the others.

Honestly, I don't like this way of introducing thread-safety. As I show below, we should protect not only the structures, but also the logic working with those structures; simply protecting access to some variable doesn't make much sense. Critical sections should be wide enough to protect the logic. So I'd prefer to introduce all the critical sections together with the mutex in one patch, to have the whole picture.

> 
> [...]
> 
>>>   }
>>> @@ -213,11 +219,7 @@ static coroutine_fn BlockCopyTask *block_copy_task_create(BlockCopyState *s,
>>>       assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
>>>       bytes = QEMU_ALIGN_UP(bytes, s->cluster_size);
>>
>> Looking at block_copy_task_create():
>>
>> First, !bdrv_dirty_bitmap_next_dirty_area() doesn't take bitmaps lock, so it's not protected at all.
>>
>> Next, even if we take bitmaps lock in bdrv_dirty_bitmap_next_dirty_area() or around it, it doesn't bring thread-safety to block_copy_task_create():
> 
> The simplest solution here seems to protect bdrv_dirty_bitmap_next_dirty_area and also bdrv_reset_dirty_bitmap with the tasks lock, so that once it is released the bitmap is updated accordingly and from my understanding no other task can get that region.
> 

Yes, we just need to protect larger areas with an outer lock, to protect the logic, not only the structures themselves.

> Btw, out of curiosity, why is bdrv_dirty_bitmap_next_dirty_area not protected by a lock? Can't we have a case where two threads (like you just mention below) check the bitmap? Or am I missing something?

Hmm, I don't know :) Probably it's a bug, maybe not. We need to check its callers.

> 
>>
>> Imagine block_copy_task_create() is called from two threads simultaneously. Both calls will get same dirty area and create equal tasks. Then, "assert(!find_conflicting_task_locked(s, offset, bytes));" will crash.
>>
> 
> 
>>
>>> -    /* region is dirty, so no existent tasks possible in it */
>>> -    assert(!find_conflicting_task(s, offset, bytes));
>>> -
>>>       bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, bytes);
>>> -    s->in_flight_bytes += bytes;
>>>       task = g_new(BlockCopyTask, 1);
>>>       *task = (BlockCopyTask) {
>>> @@ -228,7 +230,13 @@ static coroutine_fn BlockCopyTask *block_copy_task_create(BlockCopyState *s,
>>>           .bytes = bytes,
>>>       };
>>>       qemu_co_queue_init(&task->wait_queue);
>>> -    QLIST_INSERT_HEAD(&s->tasks, task, list);
>>> +
>>> +    WITH_QEMU_LOCK_GUARD(&s->tasks_lock) {
>>> +        s->in_flight_bytes += bytes;
>>> +        /* region is dirty, so no existent tasks possible in it */
>>> +        assert(!find_conflicting_task_locked(s, offset, bytes));
>>> +        QLIST_INSERT_HEAD(&s->tasks, task, list);
>>> +    }
>>>       return task;
>>>   }
>>> @@ -249,25 +257,29 @@ static void coroutine_fn block_copy_task_shrink(BlockCopyTask *task,
>>>       assert(new_bytes > 0 && new_bytes < task->bytes);
>>> -    task->s->in_flight_bytes -= task->bytes - new_bytes;
>>>       bdrv_set_dirty_bitmap(task->s->copy_bitmap,
>>>                             task->offset + new_bytes, task->bytes - new_bytes);
>>
>> Then look here. Imagine, immediately after bdrv_set_dirty_bitmap() in parallel thread the new task is created, which consumes these new dirty bits. Again, it will find conflicting task (this one, as task->bytes is not modified yet) and crash.
> 
> Also here, the lock guard should be enlarged to cover also the dirty bitmap. Thank you for pointing these cases!
> 

Yes. And then we'll end up with most of the logic covered by the mutex, and we won't need atomic operations. There is nothing bad in doing simple operations on variables under a mutex; it's a lot simpler than bothering with atomic operations.

> 
>>
>>> -    task->bytes = new_bytes;
>>> -    qemu_co_queue_restart_all(&task->wait_queue);
>>> +    WITH_QEMU_LOCK_GUARD(&task->s->tasks_lock) {
>>> +        task->s->in_flight_bytes -= task->bytes - new_bytes;
>>> +        task->bytes = new_bytes;
>>> +        qemu_co_queue_restart_all(&task->wait_queue);
>>> +    }
>>>   }
> 


-- 
Best regards,
Vladimir



* Re: [PATCH v2 5/7] block-copy: add QemuMutex lock for BlockCopyCallState list
  2021-05-21 15:01     ` Paolo Bonzini
@ 2021-05-25 10:58       ` Emanuele Giuseppe Esposito
  2021-05-26 14:49         ` Paolo Bonzini
  0 siblings, 1 reply; 48+ messages in thread
From: Emanuele Giuseppe Esposito @ 2021-05-25 10:58 UTC (permalink / raw)
  To: Paolo Bonzini, Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: Kevin Wolf, John Snow, qemu-devel, Stefan Hajnoczi, Max Reitz



On 21/05/2021 17:01, Paolo Bonzini wrote:
> On 20/05/21 17:30, Vladimir Sementsov-Ogievskiy wrote:
>> 18.05.2021 13:07, Emanuele Giuseppe Esposito wrote:
>>> As for BlockCopyTask, add a lock to protect BlockCopyCallState
>>> ret and sleep_state fields. Also move ret, finished and cancelled
>>> in the OUT fields of BlockCopyCallState.
>>>
>>> Here a QemuMutex is used to protect QemuCoSleep field, since it
>>> can be concurrently invoked also from outside threads.

Actually I don't even protect it here; I should have deleted the line 
above. I left a TODO for the QemuCoSleep field.

>>>
>>> .finished, .cancelled and reads to .ret and .error_is_read will be
>>> protected in the following patch.
>>>
>>> .sleep state is handled in the series "coroutine: new sleep/wake API"
>>
>> Could we live with one mutex for all needs? Why to add one more?
> 
> This patch should just go away; the QemuMutex will not be needed once 
> QemuCoSleep is thread safe, while right now it is still racy.

At this point, I would just rename the other lock (tasks_lock) to "lock" 
or "state_lock", and substitute it for the calls_lock usages in this 
patch. Depending on how it comes out, I may merge this with the previous 
patch.

Thank you,
Emanuele




* Re: [PATCH v2 7/7] block-copy: protect BlockCopyState .method fields
  2021-05-25 10:18     ` Emanuele Giuseppe Esposito
@ 2021-05-25 11:00       ` Vladimir Sementsov-Ogievskiy
  2021-05-26 14:48         ` Paolo Bonzini
  0 siblings, 1 reply; 48+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-05-25 11:00 UTC (permalink / raw)
  To: Emanuele Giuseppe Esposito, qemu-block
  Cc: John Snow, Kevin Wolf, Max Reitz, Paolo Bonzini, Stefan Hajnoczi,
	qemu-devel

25.05.2021 13:18, Emanuele Giuseppe Esposito wrote:
> 
> 
> On 21/05/2021 19:10, Vladimir Sementsov-Ogievskiy wrote:
>> 18.05.2021 13:07, Emanuele Giuseppe Esposito wrote:
>>> With tasks and calls lock protecting all State fields,
>>> .method is the last BlockCopyState field left unprotected.
>>> Set it as atomic.
>>>
>>> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
>>
>> OK, in 06 some things are out of coroutine. Here could we just reuse mutex?
>>
>> I believe, that we don't need any kind of protection for .method inside block_copy_state_new(), as it's just a creation and initialization of new structure.
> 
> I agree here, will remove the atomic_set in block_copy_state_new.
>>
>> And other things are called from coroutines. So, seems no reasons for additional atomic access logic?
> 
> But... why should I use a mutex? I think the .method usage is pretty
> straightforward, adding a lock (which one, tasks_lock? does not seem appropriate)

Paolo said patch 05 should go away, so we have only one mutex. We can name it just "lock" and use it for everything, like in qcow2.

> would just cover also functions that do not need it, since the field is modified in if-else statements (see block_copy_do_copy).
> It looks to me that an atomic here won't hurt, and it's pretty straightforward to understand.
> 
> Thank you,
> Emanuele
> 

Hmm. OK, let me think:

First, look at block_copy_do_copy(). It's called only from block_copy_task_entry(), and block_copy_task_entry() has a mutex critical section anyway around handling the return value. That means we can simply move the s->method modification logic into this already existing critical section.

Next, block_copy_chunk_size() is called only from block_copy_task_create(), where we should have a critical section too.

So, no reason for atomics, as we already have critical sections.


I think it's significant:

Drawbacks of atomics:

1. Code becomes harder to read, just because instead of simple accesses to a variable we have to call atomic access functions, and the code becomes a mess of different qatomic_* calls.

2. The thread-safety is harder to analyze. You argue that the usage is straightforward: yes, it's obvious that atomic access protects the variable itself. But what about the logic? It's the same as the questions I asked about critical sections in patch 04. With critical sections things are clear: just protect the whole logic with a critical section and you are sure that no other critical section intersects. With atomics you have to analyze, for example, whether the existing critical sections are OK with the fact that the variable may be atomically changed by another thread that doesn't take the mutex. That's not a simple question in general.

Probable benefits of atomics:

1. Performance... anything else?

So, if we have some lockless performance-critical mechanism, atomics are needed. Still, in general lockless algorithms are a lot trickier and harder to understand than simple critical sections; sometimes it's worth the complexity.

But if we already have a mutex protecting most of the logic inside some subsystem (block-copy, for example), it's better to just protect the remaining bits of the subsystem with the same mutex than to take on the drawbacks of atomics for no reason, especially when those remaining accesses directly follow or precede an existing critical section. I don't believe atomics can bring better performance in this case; I even think performance may become worse (remember that atomic operations are not free, and plain accesses to a variable may be faster).

And I really doubt that anyone can demonstrate a performance benefit of atomic accesses in the block layer: I/O operations take a lot longer anyway than all these simple variable accesses.

So, I'm against adding atomics just because they won't hurt :)

-- 
Best regards,
Vladimir



* Re: [PATCH v2 7/7] block-copy: protect BlockCopyState .method fields
  2021-05-25 11:00       ` Vladimir Sementsov-Ogievskiy
@ 2021-05-26 14:48         ` Paolo Bonzini
  2021-05-26 17:18           ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 48+ messages in thread
From: Paolo Bonzini @ 2021-05-26 14:48 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, Emanuele Giuseppe Esposito, qemu-block
  Cc: Kevin Wolf, John Snow, qemu-devel, Stefan Hajnoczi, Max Reitz

On 25/05/21 13:00, Vladimir Sementsov-Ogievskiy wrote:
> 
> Hmm. OK, let me think:
> 
> First look at block_copy_do_copy(). It's called only from 
> block_copy_task_entry. block_copy_task_entry() has 
> mutex-critical-section anyway around handling return value. That means 
> that we can simply move s->method modification logic to this already 
> existing critical section.
> 
> Next, block_copy_chunk_size() is called only from 
> block_copy_task_create(), where we should have critical section too.

block_copy_do_copy would have to release the mutex around the 
reads/writes (including the bdrv_co_pwrite_zeroes that has nothing to do 
with s->method).

There's also the "goto out" if the copy-range operation succeeds, which 
makes things a bit more complicated.  The goto suggests using 
WITH_QEMU_LOCK_GUARD, but that doesn't work too well either, because 
there are two accesses (one before the bdrv_co_copy_range and one after).

So I understand why you want to avoid atomics, and I agree that they are 
more complicated than the other solutions; on the other hand, I think 
this patch is the simplest code.

Paolo

> So, no reason for atomics, as we already have critical sections.




* Re: [PATCH v2 5/7] block-copy: add QemuMutex lock for BlockCopyCallState list
  2021-05-25 10:58       ` Emanuele Giuseppe Esposito
@ 2021-05-26 14:49         ` Paolo Bonzini
  0 siblings, 0 replies; 48+ messages in thread
From: Paolo Bonzini @ 2021-05-26 14:49 UTC (permalink / raw)
  To: Emanuele Giuseppe Esposito, Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: Kevin Wolf, John Snow, qemu-devel, Stefan Hajnoczi, Max Reitz

On 25/05/21 12:58, Emanuele Giuseppe Esposito wrote:
> 
> At this point, I would just rename the other lock (tasks_lock) in "lock" 
> or "state_lock", and substitute it in the calls_lock usages of this 
> patch. Depending on how it comes out, I may merge this with the previous 
> patch.

Renaming the lock is a good idea indeed.

Paolo




* Re: [PATCH v2 4/7] block-copy: add a CoMutex to the BlockCopyTask list
  2021-05-25 10:25       ` Vladimir Sementsov-Ogievskiy
@ 2021-05-26 14:58         ` Paolo Bonzini
  2021-05-26 16:13           ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 48+ messages in thread
From: Paolo Bonzini @ 2021-05-26 14:58 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, Emanuele Giuseppe Esposito, qemu-block
  Cc: Kevin Wolf, John Snow, qemu-devel, Stefan Hajnoczi, Max Reitz

On 25/05/21 12:25, Vladimir Sementsov-Ogievskiy wrote:
>>> Next, even if we take bitmaps lock in 
>>> bdrv_dirty_bitmap_next_dirty_area() or around it, it doesn't bring 
>>> thread-safety to block_copy_task_create():
>>
>> The simplest solution here seems to protect 
>> bdrv_dirty_bitmap_next_dirty_area and also bdrv_reset_dirty_bitmap 
>> with the tasks lock, so that once it is released the bitmap is updated 
>> accordingly and from my understanding no other task can get that region.
> 
> Yes, we just need to protect larger areas by outer lock, to protect the 
> logic, not only structures itself.

Locks protect data, not code; code must ensure invariants are respected 
at the end of critical sections.  Here we have both problems:

- it's protecting too little data (the bitmap is not protected).  This 
is a block/dirty-bitmap.c bug.

- it's not respecting the invariant that tasks always reflect a 
superset of what is set in the dirty bitmap.  This is solved by making 
the critical section larger.

Paolo




* Re: [PATCH v2 4/7] block-copy: add a CoMutex to the BlockCopyTask list
  2021-05-26 14:58         ` Paolo Bonzini
@ 2021-05-26 16:13           ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 48+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-05-26 16:13 UTC (permalink / raw)
  To: Paolo Bonzini, Emanuele Giuseppe Esposito, qemu-block
  Cc: John Snow, Kevin Wolf, Max Reitz, Stefan Hajnoczi, qemu-devel

26.05.2021 17:58, Paolo Bonzini wrote:
> On 25/05/21 12:25, Vladimir Sementsov-Ogievskiy wrote:
>>>> Next, even if we take bitmaps lock in bdrv_dirty_bitmap_next_dirty_area() or around it, it doesn't bring thread-safety to block_copy_task_create():
>>>
>>> The simplest solution here seems to protect bdrv_dirty_bitmap_next_dirty_area and also bdrv_reset_dirty_bitmap with the tasks lock, so that once it is released the bitmap is updated accordingly and from my understanding no other task can get that region.
>>
>> Yes, we just need to protect larger areas by outer lock, to protect the logic, not only structures itself.
> 
> Locks protects data, not code; code must ensure invariants are respected at the end of critical sections.  Here we have both problems:

Hmm. Anyway, if code doesn't work with data that is potentially shared with other threads, it doesn't need any protection. So, I agree... I just wanted to say that if we have two data structures A and B, each protected by its own lock, that doesn't mean our code is thread-safe. In your terminology we could say that the whole data (the combination of A and B) is not protected by the partial locks; we need another lock to protect the combination.

> 
> - it's protecting too little data (the bitmap is not protected).  This is a block/dirty-bitmap.c bug.
> 
> - it's not respecting the invariant that tasks always reflected a superset of what is set in the dirty bitmap.  This is solved by making the critical section larger.
> 



-- 
Best regards,
Vladimir



* Re: [PATCH v2 7/7] block-copy: protect BlockCopyState .method fields
  2021-05-26 14:48         ` Paolo Bonzini
@ 2021-05-26 17:18           ` Vladimir Sementsov-Ogievskiy
  2021-05-28 10:24             ` Paolo Bonzini
  0 siblings, 1 reply; 48+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-05-26 17:18 UTC (permalink / raw)
  To: Paolo Bonzini, Emanuele Giuseppe Esposito, qemu-block
  Cc: John Snow, Kevin Wolf, Max Reitz, Stefan Hajnoczi, qemu-devel

26.05.2021 17:48, Paolo Bonzini wrote:
> On 25/05/21 13:00, Vladimir Sementsov-Ogievskiy wrote:
>>
>> Hmm. OK, let me think:
>>
>> First look at block_copy_do_copy(). It's called only from block_copy_task_entry. block_copy_task_entry() has mutex-critical-section anyway around handling return value. That means that we can simply move s->method modification logic to this already existing critical section.
>>
>> Next, block_copy_chunk_size() is called only from block_copy_task_create(), where we should have critical section too.
> 
> block_copy_do_copy would have to release the mutex around the reads/writes (including the bdrv_co_pwrite_zeroes that has nothing to do with s->method).
> 
> There's also the "goto out" if the copy-range operation succeeds, which makes things a bit more complicated.  The goto suggests using QEMU_WITH_LOCK_GUARD, but that doesn't work too well either, because there are two accesses (one before the bdrv_co_copy_range and one after).

Hmm, this "goto out" makes no sense actually, and should be removed.. It's a mistake or a kind of forgotten thing to refactor after some changes.

> 
> So I understand why you want to avoid atomics and I agree that they are more complicated than the other solutions, on the other hand I think this patch is the simplest code.
> 

I think it's better to pass s->method as a parameter to block_copy_do_copy(). Then block_copy_do_copy() doesn't need any kind of synchronization.

In block_copy_task_entry() we'll have the whole logic for handling the result of block_copy_do_copy() (including changing s->method). Then only one question remains:

before calling block_copy_do_copy() we need to read s->method, either with an atomic operation or under a separate critical section. I'd be OK either way.
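
A rough sketch of that shape (the extra method parameter is the proposed
change, not code from this series):

    BlockCopyMethod method;

    WITH_QEMU_LOCK_GUARD(&s->tasks_lock) {  /* or an atomic read */
        method = s->method;
    }

    ret = block_copy_do_copy(s, offset, bytes, method, &error_is_read);

    WITH_QEMU_LOCK_GUARD(&s->tasks_lock) {
        if (method == s->method) {
            /* compute the new s->method from ret, the way
             * block_copy_do_copy() updates it today */
        }
        /* ... existing handling of ret / error_is_read ... */
    }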


That's actually the original idea of the block_copy_do_copy() function: do only the simple copy, don't interact with the state. It's a kind of wrapper on top of the bdrv_co_* I/O functions.

So, actually updating s->use_copy_range here was a bad idea.

Note also the comment above block_copy_do_copy():

  " No sync here: nor bitmap neighter intersecting requests handling, only copy."

Somehow, the comment conflicts with introducing synchronization primitives inside the function, although it was about other things :))

-- 
Best regards,
Vladimir



* Re: [PATCH v2 1/7] block-copy: streamline choice of copy_range vs. read/write
  2021-05-18 10:07 ` [PATCH v2 1/7] block-copy: streamline choice of copy_range vs. read/write Emanuele Giuseppe Esposito
  2021-05-20 14:42   ` Vladimir Sementsov-Ogievskiy
@ 2021-05-27  8:20   ` Stefan Hajnoczi
  2021-05-27 19:04     ` Paolo Bonzini
  1 sibling, 1 reply; 48+ messages in thread
From: Stefan Hajnoczi @ 2021-05-27  8:20 UTC (permalink / raw)
  To: Emanuele Giuseppe Esposito
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-block, qemu-devel,
	Max Reitz, Paolo Bonzini, John Snow


On Tue, May 18, 2021 at 12:07:51PM +0200, Emanuele Giuseppe Esposito wrote:
>      } else {
>          /*
>           * We enable copy-range, but keep small copy_size, until first
>           * successful copy_range (look at block_copy_do_copy).
>           */

Is this comment still correct? It appears that this branch of the if
statement does not always enable copy-range (the !use_copy_range case).

> -        s->use_copy_range = use_copy_range;
> -        s->copy_size = MAX(s->cluster_size, BLOCK_COPY_MAX_BUFFER);
> +        s->method = use_copy_range ? COPY_RANGE_SMALL : COPY_READ_WRITE;



* Re: [PATCH v2 4/7] block-copy: add a CoMutex to the BlockCopyTask list
  2021-05-18 10:07 ` [PATCH v2 4/7] block-copy: add a CoMutex to the BlockCopyTask list Emanuele Giuseppe Esposito
  2021-05-20 15:19   ` Vladimir Sementsov-Ogievskiy
@ 2021-05-27  9:07   ` Stefan Hajnoczi
  1 sibling, 0 replies; 48+ messages in thread
From: Stefan Hajnoczi @ 2021-05-27  9:07 UTC (permalink / raw)
  To: Emanuele Giuseppe Esposito
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-block, qemu-devel,
	Max Reitz, Paolo Bonzini, John Snow


On Tue, May 18, 2021 at 12:07:54PM +0200, Emanuele Giuseppe Esposito wrote:
> -static BlockCopyTask *find_conflicting_task(BlockCopyState *s,
> -                                            int64_t offset, int64_t bytes)
> +/* Called with lock held */

s/lock/tasks_lock/



* Re: [PATCH v2 0/7] block-copy: protect block-copy internal structures
  2021-05-18 10:07 [PATCH v2 0/7] block-copy: protect block-copy internal structures Emanuele Giuseppe Esposito
                   ` (7 preceding siblings ...)
  2021-05-20 13:47 ` [PATCH v2 0/7] block-copy: protect block-copy internal structures Vladimir Sementsov-Ogievskiy
@ 2021-05-27 10:31 ` Stefan Hajnoczi
  8 siblings, 0 replies; 48+ messages in thread
From: Stefan Hajnoczi @ 2021-05-27 10:31 UTC (permalink / raw)
  To: Emanuele Giuseppe Esposito
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-block, qemu-devel,
	Max Reitz, Paolo Bonzini, John Snow


On Tue, May 18, 2021 at 12:07:50PM +0200, Emanuele Giuseppe Esposito wrote:
> This serie of patches aims to reduce the usage of the global
> AioContexlock in block-copy, by introducing smaller granularity
> locks thus on making the block layer thread safe. 
> 
> This serie depends on Paolo's coroutine_sleep API and my previous
> serie that brings thread safety to the smaller API used by block-copy,
> like ratelimit, progressmeter abd co-shared-resource.
> 
> What's missing for block-copy to be fully thread-safe is fixing
> the CoSleep API to allow cross-thread sleep and wakeup.
> Paolo is working on it and will post the patches once his new
> CoSleep API is accepted.
> 
> Patch 1 introduces the .method field instead of .use_copy_range
> and .copy_size, so that it can be later used as atomic.
> Patch 2-3 provide comments and refactoring in preparation to
> the locks added in patch 4 on BlockCopyTask, patch 5-6 on
> BlockCopyCallState and 7 BlockCopyState.
> 
> Based-on: <20210517100548.28806-1-pbonzini@redhat.com>
> Based-on: <20210518094058.25952-1-eesposit@redhat.com>
> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>

Here is my understanding of thread-safety in your
https://gitlab.com/eesposit/qemu.git dataplane_new branch:

Reading the code was much more satisfying than trying to review the
patches. That's probably because I'm not familiar with the block-copy.c
implementation and needed the context. I noticed you already addressed
some of Vladimir's comments in your git branch, so that may have helped
too.

The backup block job and the backup-top filter driver have a
BlockCopyState. QEMU threads that call bdrv_co_preadv/pwritev() invoke
the block_copy() coroutine function, so BlockCopyState needs to be
protected between threads. Additionally, the backup block job invokes
block_copy_async() to perform a background copy operation.

The relationships are as follows:

BlockCopyState - shared state that any thread can access
BlockCopyCallState - per-block_copy()/block_copy_async() state, not
                     accessed by other coroutines/threads
BlockCopyTask - per-aiotask state, find_conflicting_task_locked()
                accesses this for a given BlockCopyState

What is the purpose of the BlockCopyState->calls list?

Existing issue: the various
block_copy_call_finished/succeeded/failed/cancelled/status() APIs are a
bit extreme. They all end up being called by block/backup.c in
succession when it seems like a single call should be enough to report
the status.
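
For example, something along these lines (purely hypothetical, just to
illustrate the point):

    typedef enum BlockCopyCallStatus {
        BLOCK_COPY_CALL_RUNNING,
        BLOCK_COPY_CALL_SUCCEEDED,
        BLOCK_COPY_CALL_FAILED,
        BLOCK_COPY_CALL_CANCELLED,
    } BlockCopyCallStatus;

    /* One call reporting everything block/backup.c needs. */
    BlockCopyCallStatus block_copy_call_status(BlockCopyCallState *call_state,
                                               int *ret,
                                               bool *error_is_read);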

It's not immediately obvious to me that BlockCopyCallState needs to be
thread-safe. So I wondered why finished needs to be atomic. Then I
realized the set_speed/cancel code runs in the monitor, so at least
block_copy_call_cancel() and block_copy_kick() need to be thread-safe. I
guess the BlockCopyCallState status APIs were made thread-safe for
consistency even though it's not needed at the moment?

Please add doc comments to block-copy.h explaining the thread-safety of
the APIs (it might be as simple as "all APIs are thread-safe" at the top
of the file).
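
Something as simple as this at the top of the header would do (the
wording is just a suggestion):

    /*
     * block-copy API
     *
     * All functions in this header are thread-safe: BlockCopyState
     * internals are protected by a lock, and BlockCopyCallState status
     * fields are accessed atomically, so they may be queried from a
     * thread other than the one running the copy.
     */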

Summarizing everything, this series adds BlockCopyState->lock to protect
shared state and makes BlockCopyCallState's status atomic so it can be
queried from threads other than the one performing the copy operation.

Stefan



* Re: [PATCH v2 1/7] block-copy: streamline choice of copy_range vs. read/write
  2021-05-27  8:20   ` Stefan Hajnoczi
@ 2021-05-27 19:04     ` Paolo Bonzini
  0 siblings, 0 replies; 48+ messages in thread
From: Paolo Bonzini @ 2021-05-27 19:04 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Emanuele Giuseppe Esposito, Kevin Wolf,
	Vladimir Sementsov-Ogievskiy, open list:Block layer core,
	qemu-devel, Max Reitz, John Snow


On Thu, May 27, 2021 at 14:07 Stefan Hajnoczi <stefanha@redhat.com> wrote:

> On Tue, May 18, 2021 at 12:07:51PM +0200, Emanuele Giuseppe Esposito wrote:
> >      } else {
> >          /*
> >           * We enable copy-range, but keep small copy_size, until first
> >           * successful copy_range (look at block_copy_do_copy).
> >           */
>
> Is this comment still correct? It appears that this branch of the if
> statement does not always enable copy-range (the !use_copy_range case).
>

It is correct, but it applies only if use_copy_range is true (the patch
doesn't change this).

Paolo

> -        s->use_copy_range = use_copy_range;
> > -        s->copy_size = MAX(s->cluster_size, BLOCK_COPY_MAX_BUFFER);
> > +        s->method = use_copy_range ? COPY_RANGE_SMALL : COPY_READ_WRITE;
>



* Re: [PATCH v2 7/7] block-copy: protect BlockCopyState .method fields
  2021-05-26 17:18           ` Vladimir Sementsov-Ogievskiy
@ 2021-05-28 10:24             ` Paolo Bonzini
  2021-05-28 11:01               ` Paolo Bonzini
  0 siblings, 1 reply; 48+ messages in thread
From: Paolo Bonzini @ 2021-05-28 10:24 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, Emanuele Giuseppe Esposito, qemu-block
  Cc: Kevin Wolf, John Snow, qemu-devel, Stefan Hajnoczi, Max Reitz

On 26/05/21 19:18, Vladimir Sementsov-Ogievskiy wrote:
> 
> It's actually the original idea of block_copy_do_copy() function: do 
> only simple copy, don't interact with the state. It's a kind of wrapper 
> on top of bdrv_co<io functions>.
> 
> So, actually updating s->use_copy_range here was a bad idea.

It's still more complicated, because you need to add some kind of

    method = s->method;
    ret = block_copy_do_copy(..., method);
    if (ret < 0 && method >= COPY_RANGE_SMALL) {
        /* copy_range failed: fall back to read/write for this task */
        method = COPY_READ_WRITE;
        ret = block_copy_do_copy(..., method);
    }
    lock();
    if (method == s->method) {
        /* compute new method */
    }
    unlock();

which makes it more complicated than this patch IMO.  But yeah at least 
it's a viable alternative to the atomics.

Paolo




* Re: [PATCH v2 5/7] block-copy: add QemuMutex lock for BlockCopyCallState list
  2021-05-18 10:07 ` [PATCH v2 5/7] block-copy: add QemuMutex lock for BlockCopyCallState list Emanuele Giuseppe Esposito
  2021-05-20 15:30   ` Vladimir Sementsov-Ogievskiy
@ 2021-05-28 10:53   ` Paolo Bonzini
  1 sibling, 0 replies; 48+ messages in thread
From: Paolo Bonzini @ 2021-05-28 10:53 UTC (permalink / raw)
  To: Emanuele Giuseppe Esposito, qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz,
	Stefan Hajnoczi, John Snow

On 18/05/21 12:07, Emanuele Giuseppe Esposito wrote:
> +    qemu_mutex_lock(&call_state->s->calls_lock);
>       QLIST_INSERT_HEAD(&call_state->s->calls, call_state, list);
> +    qemu_mutex_unlock(&call_state->s->calls_lock);

Let's just use tasks_lock here (maybe even rename it to just "lock").

Paolo




* Re: [PATCH v2 7/7] block-copy: protect BlockCopyState .method fields
  2021-05-28 10:24             ` Paolo Bonzini
@ 2021-05-28 11:01               ` Paolo Bonzini
  2021-05-28 11:52                 ` Vladimir Sementsov-Ogievskiy
  2021-05-28 12:44                 ` Vladimir Sementsov-Ogievskiy
  0 siblings, 2 replies; 48+ messages in thread
From: Paolo Bonzini @ 2021-05-28 11:01 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, Emanuele Giuseppe Esposito, qemu-block
  Cc: Kevin Wolf, John Snow, qemu-devel, Stefan Hajnoczi, Max Reitz

On 28/05/21 12:24, Paolo Bonzini wrote:
> 
> It's still more complicated, because you need to add some kind of
> 
>      method = s->method;

This would even have to be a separate, one-line critical section...

Paolo

>      ret = block_copy_do_copy(..., method);
>      if (ret < 0 && method >= COPY_RANGE_SMALL) {
>          method = COPY_READ_WRITE;
>          ret = block_copy_do_copy(..., method);
>          }
>      lock();
>          if (method == s->method) {
>              /* compute new method */
>          }
> 
> which makes it more complicated than this patch IMO.  But yeah at least 
> it's a viable alternative to the atomics.





* Re: [PATCH v2 7/7] block-copy: protect BlockCopyState .method fields
  2021-05-28 11:01               ` Paolo Bonzini
@ 2021-05-28 11:52                 ` Vladimir Sementsov-Ogievskiy
  2021-05-28 12:44                 ` Vladimir Sementsov-Ogievskiy
  1 sibling, 0 replies; 48+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-05-28 11:52 UTC (permalink / raw)
  To: Paolo Bonzini, Emanuele Giuseppe Esposito, qemu-block
  Cc: John Snow, Kevin Wolf, Max Reitz, Stefan Hajnoczi, qemu-devel

28.05.2021 14:01, Paolo Bonzini wrote:
> On 28/05/21 12:24, Paolo Bonzini wrote:
>>
>> It's still more complicated, because you need to add some kind of
>>
>>      method = s->method;
> 
> This would even have to be a separate, one-line critical section...
> 

Or an atomic operation... What I don't like is that all this trouble is for unused code. Many things may change by the time we actually reuse this for qemu-img convert.

And qemu-img convert probably doesn't need this complicated logic with different methods. It may be better to just return an error if copy_range failed. What's a good reason to fall back to a normal write when copy-range was explicitly requested by the user?

> 
>>      ret = block_copy_do_copy(..., method);
>>      if (ret < 0 && method >= COPY_RANGE_SMALL) {
>>          method = COPY_READ_WRITE;
>>          ret = block_copy_do_copy(..., method);
>>          }
>>      lock();
>>          if (method == s->method) {
>>              /* compute new method */
>>          }
>>
>> which makes it more complicated than this patch IMO.  But yeah at least it's a viable alternative to the atomics.
> 
> 


-- 
Best regards,
Vladimir



* Re: [PATCH v2 7/7] block-copy: protect BlockCopyState .method fields
  2021-05-28 11:01               ` Paolo Bonzini
  2021-05-28 11:52                 ` Vladimir Sementsov-Ogievskiy
@ 2021-05-28 12:44                 ` Vladimir Sementsov-Ogievskiy
  1 sibling, 0 replies; 48+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-05-28 12:44 UTC (permalink / raw)
  To: Paolo Bonzini, Emanuele Giuseppe Esposito, qemu-block
  Cc: John Snow, Kevin Wolf, Max Reitz, Stefan Hajnoczi, qemu-devel

28.05.2021 14:01, Paolo Bonzini wrote:
> On 28/05/21 12:24, Paolo Bonzini wrote:
>>
>> It's still more complicated, because you need to add some kind of
>>
>>      method = s->method;
> 
> This would even have to be a separate, one-line critical section...
> 

Hm, so we should set .use_copy_range in the task when it is initialized.
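
Roughly (a sketch; with this series the bool has become the method enum,
and the field name is an assumption):

    typedef struct BlockCopyTask {
        /* ... existing fields ... */

        /* Snapshot of s->method, taken under the lock in
         * block_copy_task_create(), so that block_copy_do_copy()
         * never has to look at shared state. */
        BlockCopyMethod method;
    } BlockCopyTask;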

> 
>>      ret = block_copy_do_copy(..., method);
>>      if (ret < 0 && method >= COPY_RANGE_SMALL) {
>>          method = COPY_READ_WRITE;
>>          ret = block_copy_do_copy(..., method);
>>          }
>>      lock();
>>          if (method == s->method) {
>>              /* compute new method */
>>          }
>>
>> which makes it more complicated than this patch IMO.  But yeah at least it's a viable alternative to the atomics.
> 
> 

OK, I'm OK with the patch as is; I can always refactor it later on top if needed... I'll try to do some refactoring now; you'll probably want to base on it, or vice versa, I'll rebase later on top of these patches.

-- 
Best regards,
Vladimir




Thread overview: 48+ messages
2021-05-18 10:07 [PATCH v2 0/7] block-copy: protect block-copy internal structures Emanuele Giuseppe Esposito
2021-05-18 10:07 ` [PATCH v2 1/7] block-copy: streamline choice of copy_range vs. read/write Emanuele Giuseppe Esposito
2021-05-20 14:42   ` Vladimir Sementsov-Ogievskiy
2021-05-20 15:06     ` Emanuele Giuseppe Esposito
2021-05-20 15:24       ` Vladimir Sementsov-Ogievskiy
2021-05-21 15:10     ` Paolo Bonzini
2021-05-21 15:48       ` Vladimir Sementsov-Ogievskiy
2021-05-21 16:43         ` Paolo Bonzini
2021-05-21 17:51       ` Vladimir Sementsov-Ogievskiy
2021-05-27  8:20   ` Stefan Hajnoczi
2021-05-27 19:04     ` Paolo Bonzini
2021-05-18 10:07 ` [PATCH v2 2/7] block-copy: improve documentation of BlockCopyTask and BlockCopyState types and functions Emanuele Giuseppe Esposito
2021-05-20 15:00   ` Vladimir Sementsov-Ogievskiy
2021-05-20 15:15     ` Emanuele Giuseppe Esposito
2021-05-18 10:07 ` [PATCH v2 3/7] block-copy: move progress_set_remaining in block_copy_task_end Emanuele Giuseppe Esposito
2021-05-20 15:03   ` Vladimir Sementsov-Ogievskiy
2021-05-18 10:07 ` [PATCH v2 4/7] block-copy: add a CoMutex to the BlockCopyTask list Emanuele Giuseppe Esposito
2021-05-20 15:19   ` Vladimir Sementsov-Ogievskiy
2021-05-25 10:07     ` Emanuele Giuseppe Esposito
2021-05-25 10:25       ` Vladimir Sementsov-Ogievskiy
2021-05-26 14:58         ` Paolo Bonzini
2021-05-26 16:13           ` Vladimir Sementsov-Ogievskiy
2021-05-27  9:07   ` Stefan Hajnoczi
2021-05-18 10:07 ` [PATCH v2 5/7] block-copy: add QemuMutex lock for BlockCopyCallState list Emanuele Giuseppe Esposito
2021-05-20 15:30   ` Vladimir Sementsov-Ogievskiy
2021-05-21 15:01     ` Paolo Bonzini
2021-05-25 10:58       ` Emanuele Giuseppe Esposito
2021-05-26 14:49         ` Paolo Bonzini
2021-05-28 10:53   ` Paolo Bonzini
2021-05-18 10:07 ` [PATCH v2 6/7] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState Emanuele Giuseppe Esposito
2021-05-20 15:34   ` Vladimir Sementsov-Ogievskiy
2021-05-21 15:02     ` Paolo Bonzini
2021-05-21 15:53       ` Vladimir Sementsov-Ogievskiy
2021-05-21 15:56       ` Vladimir Sementsov-Ogievskiy
2021-05-21 16:47         ` Paolo Bonzini
2021-05-18 10:07 ` [PATCH v2 7/7] block-copy: protect BlockCopyState .method fields Emanuele Giuseppe Esposito
2021-05-21 17:10   ` Vladimir Sementsov-Ogievskiy
2021-05-25 10:18     ` Emanuele Giuseppe Esposito
2021-05-25 11:00       ` Vladimir Sementsov-Ogievskiy
2021-05-26 14:48         ` Paolo Bonzini
2021-05-26 17:18           ` Vladimir Sementsov-Ogievskiy
2021-05-28 10:24             ` Paolo Bonzini
2021-05-28 11:01               ` Paolo Bonzini
2021-05-28 11:52                 ` Vladimir Sementsov-Ogievskiy
2021-05-28 12:44                 ` Vladimir Sementsov-Ogievskiy
2021-05-20 13:47 ` [PATCH v2 0/7] block-copy: protect block-copy internal structures Vladimir Sementsov-Ogievskiy
2021-05-20 14:33   ` Emanuele Giuseppe Esposito
2021-05-27 10:31 ` Stefan Hajnoczi
