All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/5] blkdebug: fix racing condition when iterating on
@ 2021-05-17 14:50 Emanuele Giuseppe Esposito
  2021-05-17 14:50 ` [PATCH v3 1/5] blkdebug: refactor removal of a suspended request Emanuele Giuseppe Esposito
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Emanuele Giuseppe Esposito @ 2021-05-17 14:50 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Emanuele Giuseppe Esposito, qemu-devel, Max Reitz,
	Paolo Bonzini

When qemu_coroutine_enter is executed in a loop
(even QEMU_FOREACH_SAFE), the new routine can modify the list,
for example removing an element, causing problem when control
is given back to the caller that continues iterating on the same list. 

Patch 1 solves the issue in blkdebug_debug_resume by restarting
the list walk after every coroutine_enter if list has to be fully iterated.
Patches 2,3,4 aim to fix blkdebug_debug_event by gathering
all actions that the rules make in a counter and invoking 
the respective coroutine_yeld only after processing all requests.

Patch 5 is somewhat independent of the others, it adds a lock to
protect rules and suspended_reqs; right now everything works because
it's protected by the AioContext lock.
This is a preparation for the current proposal of removing the AioContext
lock and instead using smaller granularity locks to allow multiple
iothread execution in the same block device.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
v2 -> v3
* Fix "yeld"->"yield" in patches 3-4 [Eric]
* Use lock guard instead of lock/unlock in patch 5 [Eric]

Emanuele Giuseppe Esposito (5):
  blkdebug: refactor removal of a suspended request
  blkdebug: move post-resume handling to resume_req_by_tag
  blkdebug: track all actions
  blkdebug: do not suspend in the middle of QLIST_FOREACH_SAFE
  blkdebug: protect rules and suspended_reqs with a lock

 block/blkdebug.c | 124 +++++++++++++++++++++++++++++++----------------
 1 file changed, 83 insertions(+), 41 deletions(-)

-- 
2.30.2



^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v3 1/5] blkdebug: refactor removal of a suspended request
  2021-05-17 14:50 [PATCH v3 0/5] blkdebug: fix racing condition when iterating on Emanuele Giuseppe Esposito
@ 2021-05-17 14:50 ` Emanuele Giuseppe Esposito
  2021-05-31 19:44   ` Vladimir Sementsov-Ogievskiy
  2021-05-17 14:50 ` [PATCH v3 2/5] blkdebug: move post-resume handling to resume_req_by_tag Emanuele Giuseppe Esposito
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Emanuele Giuseppe Esposito @ 2021-05-17 14:50 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Emanuele Giuseppe Esposito, qemu-devel, Max Reitz,
	Paolo Bonzini

Extract to a separate function.  Do not rely on FOREACH_SAFE, which is
only "safe" if the *current* node is removed---not if another node is
removed.  Instead, just walk the entire list from the beginning when
asked to resume all suspended requests with a given tag.

Co-developed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
 block/blkdebug.c | 28 +++++++++++++++++-----------
 1 file changed, 17 insertions(+), 11 deletions(-)

diff --git a/block/blkdebug.c b/block/blkdebug.c
index 2c0b9b0ee8..8f19d991fa 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -793,7 +793,6 @@ static void suspend_request(BlockDriverState *bs, BlkdebugRule *rule)
         printf("blkdebug: Resuming request '%s'\n", r.tag);
     }
 
-    QLIST_REMOVE(&r, next);
     g_free(r.tag);
 }
 
@@ -869,25 +868,35 @@ static int blkdebug_debug_breakpoint(BlockDriverState *bs, const char *event,
     return 0;
 }
 
-static int blkdebug_debug_resume(BlockDriverState *bs, const char *tag)
+static int resume_req_by_tag(BDRVBlkdebugState *s, const char *tag, bool all)
 {
-    BDRVBlkdebugState *s = bs->opaque;
-    BlkdebugSuspendedReq *r, *next;
+    BlkdebugSuspendedReq *r;
 
-    QLIST_FOREACH_SAFE(r, &s->suspended_reqs, next, next) {
+retry:
+    QLIST_FOREACH(r, &s->suspended_reqs, next) {
         if (!strcmp(r->tag, tag)) {
+            QLIST_REMOVE(r, next);
             qemu_coroutine_enter(r->co);
+            if (all) {
+                goto retry;
+            }
             return 0;
         }
     }
     return -ENOENT;
 }
 
+static int blkdebug_debug_resume(BlockDriverState *bs, const char *tag)
+{
+    BDRVBlkdebugState *s = bs->opaque;
+
+    return resume_req_by_tag(s, tag, false);
+}
+
 static int blkdebug_debug_remove_breakpoint(BlockDriverState *bs,
                                             const char *tag)
 {
     BDRVBlkdebugState *s = bs->opaque;
-    BlkdebugSuspendedReq *r, *r_next;
     BlkdebugRule *rule, *next;
     int i, ret = -ENOENT;
 
@@ -900,11 +909,8 @@ static int blkdebug_debug_remove_breakpoint(BlockDriverState *bs,
             }
         }
     }
-    QLIST_FOREACH_SAFE(r, &s->suspended_reqs, next, r_next) {
-        if (!strcmp(r->tag, tag)) {
-            qemu_coroutine_enter(r->co);
-            ret = 0;
-        }
+    if (resume_req_by_tag(s, tag, true) == 0) {
+        ret = 0;
     }
     return ret;
 }
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v3 2/5] blkdebug: move post-resume handling to resume_req_by_tag
  2021-05-17 14:50 [PATCH v3 0/5] blkdebug: fix racing condition when iterating on Emanuele Giuseppe Esposito
  2021-05-17 14:50 ` [PATCH v3 1/5] blkdebug: refactor removal of a suspended request Emanuele Giuseppe Esposito
@ 2021-05-17 14:50 ` Emanuele Giuseppe Esposito
  2021-05-31 19:50   ` Vladimir Sementsov-Ogievskiy
  2021-05-17 14:50 ` [PATCH v3 3/5] blkdebug: track all actions Emanuele Giuseppe Esposito
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Emanuele Giuseppe Esposito @ 2021-05-17 14:50 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Emanuele Giuseppe Esposito, qemu-devel, Max Reitz,
	Paolo Bonzini

We want to move qemu_coroutine_yield() after the loop on rules,
because QLIST_FOREACH_SAFE is wrong if the rule list is modified
while the coroutine has yielded.  Therefore move the suspended
request to the heap and clean it up from the remove side.
All that is left is for blkdebug_debug_event to handle the
yielding.

Co-developed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
 block/blkdebug.c | 31 ++++++++++++++++++-------------
 1 file changed, 18 insertions(+), 13 deletions(-)

diff --git a/block/blkdebug.c b/block/blkdebug.c
index 8f19d991fa..e37f999254 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -775,25 +775,20 @@ static void blkdebug_close(BlockDriverState *bs)
 static void suspend_request(BlockDriverState *bs, BlkdebugRule *rule)
 {
     BDRVBlkdebugState *s = bs->opaque;
-    BlkdebugSuspendedReq r;
+    BlkdebugSuspendedReq *r;
 
-    r = (BlkdebugSuspendedReq) {
-        .co         = qemu_coroutine_self(),
-        .tag        = g_strdup(rule->options.suspend.tag),
-    };
+    r = g_new(BlkdebugSuspendedReq, 1);
+
+    r->co         = qemu_coroutine_self();
+    r->tag        = g_strdup(rule->options.suspend.tag);
 
     remove_rule(rule);
-    QLIST_INSERT_HEAD(&s->suspended_reqs, &r, next);
+    QLIST_INSERT_HEAD(&s->suspended_reqs, r, next);
 
     if (!qtest_enabled()) {
-        printf("blkdebug: Suspended request '%s'\n", r.tag);
+        printf("blkdebug: Suspended request '%s'\n", r->tag);
     }
     qemu_coroutine_yield();
-    if (!qtest_enabled()) {
-        printf("blkdebug: Resuming request '%s'\n", r.tag);
-    }
-
-    g_free(r.tag);
 }
 
 static bool process_rule(BlockDriverState *bs, struct BlkdebugRule *rule,
@@ -875,8 +870,18 @@ static int resume_req_by_tag(BDRVBlkdebugState *s, const char *tag, bool all)
 retry:
     QLIST_FOREACH(r, &s->suspended_reqs, next) {
         if (!strcmp(r->tag, tag)) {
+            Coroutine *co = r->co;
+
+            if (!qtest_enabled()) {
+                printf("blkdebug: Resuming request '%s'\n", r->tag);
+            }
+
             QLIST_REMOVE(r, next);
-            qemu_coroutine_enter(r->co);
+            g_free(r->tag);
+            g_free(r);
+
+            qemu_coroutine_enter(co);
+
             if (all) {
                 goto retry;
             }
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v3 3/5] blkdebug: track all actions
  2021-05-17 14:50 [PATCH v3 0/5] blkdebug: fix racing condition when iterating on Emanuele Giuseppe Esposito
  2021-05-17 14:50 ` [PATCH v3 1/5] blkdebug: refactor removal of a suspended request Emanuele Giuseppe Esposito
  2021-05-17 14:50 ` [PATCH v3 2/5] blkdebug: move post-resume handling to resume_req_by_tag Emanuele Giuseppe Esposito
@ 2021-05-17 14:50 ` Emanuele Giuseppe Esposito
  2021-05-31 19:57   ` Vladimir Sementsov-Ogievskiy
  2021-05-17 14:50 ` [PATCH v3 4/5] blkdebug: do not suspend in the middle of QLIST_FOREACH_SAFE Emanuele Giuseppe Esposito
  2021-05-17 14:50 ` [PATCH v3 5/5] blkdebug: protect rules and suspended_reqs with a lock Emanuele Giuseppe Esposito
  4 siblings, 1 reply; 13+ messages in thread
From: Emanuele Giuseppe Esposito @ 2021-05-17 14:50 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Emanuele Giuseppe Esposito, qemu-devel, Max Reitz,
	Paolo Bonzini

Add a counter for each action that a rule can trigger.
This is mainly used to keep track of how many coroutine_yield()
we need to perform after processing all rules in the list.

Co-developed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
 block/blkdebug.c | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/block/blkdebug.c b/block/blkdebug.c
index e37f999254..388b5ed615 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -74,6 +74,7 @@ enum {
     ACTION_INJECT_ERROR,
     ACTION_SET_STATE,
     ACTION_SUSPEND,
+    ACTION__MAX,
 };
 
 typedef struct BlkdebugRule {
@@ -791,22 +792,22 @@ static void suspend_request(BlockDriverState *bs, BlkdebugRule *rule)
     qemu_coroutine_yield();
 }
 
-static bool process_rule(BlockDriverState *bs, struct BlkdebugRule *rule,
-    bool injected)
+static void process_rule(BlockDriverState *bs, struct BlkdebugRule *rule,
+                         int *action_count)
 {
     BDRVBlkdebugState *s = bs->opaque;
 
     /* Only process rules for the current state */
     if (rule->state && rule->state != s->state) {
-        return injected;
+        return;
     }
 
     /* Take the action */
+    action_count[rule->action]++;
     switch (rule->action) {
     case ACTION_INJECT_ERROR:
-        if (!injected) {
+        if (action_count[ACTION_INJECT_ERROR] == 1) {
             QSIMPLEQ_INIT(&s->active_rules);
-            injected = true;
         }
         QSIMPLEQ_INSERT_HEAD(&s->active_rules, rule, active_next);
         break;
@@ -819,21 +820,19 @@ static bool process_rule(BlockDriverState *bs, struct BlkdebugRule *rule,
         suspend_request(bs, rule);
         break;
     }
-    return injected;
 }
 
 static void blkdebug_debug_event(BlockDriverState *bs, BlkdebugEvent event)
 {
     BDRVBlkdebugState *s = bs->opaque;
     struct BlkdebugRule *rule, *next;
-    bool injected;
+    int actions_count[ACTION__MAX] = { 0 };
 
     assert((int)event >= 0 && event < BLKDBG__MAX);
 
-    injected = false;
     s->new_state = s->state;
     QLIST_FOREACH_SAFE(rule, &s->rules[event], next, next) {
-        injected = process_rule(bs, rule, injected);
+        process_rule(bs, rule, actions_count);
     }
     s->state = s->new_state;
 }
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v3 4/5] blkdebug: do not suspend in the middle of QLIST_FOREACH_SAFE
  2021-05-17 14:50 [PATCH v3 0/5] blkdebug: fix racing condition when iterating on Emanuele Giuseppe Esposito
                   ` (2 preceding siblings ...)
  2021-05-17 14:50 ` [PATCH v3 3/5] blkdebug: track all actions Emanuele Giuseppe Esposito
@ 2021-05-17 14:50 ` Emanuele Giuseppe Esposito
  2021-06-01  7:58   ` Vladimir Sementsov-Ogievskiy
  2021-05-17 14:50 ` [PATCH v3 5/5] blkdebug: protect rules and suspended_reqs with a lock Emanuele Giuseppe Esposito
  4 siblings, 1 reply; 13+ messages in thread
From: Emanuele Giuseppe Esposito @ 2021-05-17 14:50 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Emanuele Giuseppe Esposito, qemu-devel, Max Reitz,
	Paolo Bonzini

That would be unsafe in case a rule other than the current one
is removed while the coroutine has yielded.
Keep FOREACH_SAFE because suspend_request deletes the current rule.

After this patch, *all* matching rules are deleted before suspending
the coroutine, rather than just one.
This doesn't affect the existing testcases.

Use actions_count to see how many yield to issue.

Co-developed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
 block/blkdebug.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/block/blkdebug.c b/block/blkdebug.c
index 388b5ed615..dffd869b32 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -789,7 +789,6 @@ static void suspend_request(BlockDriverState *bs, BlkdebugRule *rule)
     if (!qtest_enabled()) {
         printf("blkdebug: Suspended request '%s'\n", r->tag);
     }
-    qemu_coroutine_yield();
 }
 
 static void process_rule(BlockDriverState *bs, struct BlkdebugRule *rule,
@@ -834,6 +833,12 @@ static void blkdebug_debug_event(BlockDriverState *bs, BlkdebugEvent event)
     QLIST_FOREACH_SAFE(rule, &s->rules[event], next, next) {
         process_rule(bs, rule, actions_count);
     }
+
+    while (actions_count[ACTION_SUSPEND] > 0) {
+        qemu_coroutine_yield();
+        actions_count[ACTION_SUSPEND]--;
+    }
+
     s->state = s->new_state;
 }
 
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v3 5/5] blkdebug: protect rules and suspended_reqs with a lock
  2021-05-17 14:50 [PATCH v3 0/5] blkdebug: fix racing condition when iterating on Emanuele Giuseppe Esposito
                   ` (3 preceding siblings ...)
  2021-05-17 14:50 ` [PATCH v3 4/5] blkdebug: do not suspend in the middle of QLIST_FOREACH_SAFE Emanuele Giuseppe Esposito
@ 2021-05-17 14:50 ` Emanuele Giuseppe Esposito
  2021-06-01  8:39   ` Vladimir Sementsov-Ogievskiy
  4 siblings, 1 reply; 13+ messages in thread
From: Emanuele Giuseppe Esposito @ 2021-05-17 14:50 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Emanuele Giuseppe Esposito, qemu-devel, Max Reitz,
	Paolo Bonzini

Co-developed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
 block/blkdebug.c | 53 ++++++++++++++++++++++++++++++++++++------------
 1 file changed, 40 insertions(+), 13 deletions(-)

diff --git a/block/blkdebug.c b/block/blkdebug.c
index dffd869b32..cf8b088ce7 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -54,6 +54,7 @@ typedef struct BDRVBlkdebugState {
     /* For blkdebug_refresh_filename() */
     char *config_file;
 
+    QemuMutex lock;
     QLIST_HEAD(, BlkdebugRule) rules[BLKDBG__MAX];
     QSIMPLEQ_HEAD(, BlkdebugRule) active_rules;
     QLIST_HEAD(, BlkdebugSuspendedReq) suspended_reqs;
@@ -245,7 +246,9 @@ static int add_rule(void *opaque, QemuOpts *opts, Error **errp)
     };
 
     /* Add the rule */
+    qemu_mutex_lock(&s->lock);
     QLIST_INSERT_HEAD(&s->rules[event], rule, next);
+    qemu_mutex_unlock(&s->lock);
 
     return 0;
 }
@@ -468,6 +471,7 @@ static int blkdebug_open(BlockDriverState *bs, QDict *options, int flags,
     int ret;
     uint64_t align;
 
+    qemu_mutex_init(&s->lock);
     opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
     if (!qemu_opts_absorb_qdict(opts, options, errp)) {
         ret = -EINVAL;
@@ -568,6 +572,7 @@ static int blkdebug_open(BlockDriverState *bs, QDict *options, int flags,
     ret = 0;
 out:
     if (ret < 0) {
+        qemu_mutex_destroy(&s->lock);
         g_free(s->config_file);
     }
     qemu_opts_del(opts);
@@ -582,6 +587,7 @@ static int rule_check(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
     int error;
     bool immediately;
 
+    qemu_mutex_lock(&s->lock);
     QSIMPLEQ_FOREACH(rule, &s->active_rules, active_next) {
         uint64_t inject_offset = rule->options.inject.offset;
 
@@ -595,6 +601,7 @@ static int rule_check(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
     }
 
     if (!rule || !rule->options.inject.error) {
+        qemu_mutex_unlock(&s->lock);
         return 0;
     }
 
@@ -606,6 +613,7 @@ static int rule_check(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
         remove_rule(rule);
     }
 
+    qemu_mutex_unlock(&s->lock);
     if (!immediately) {
         aio_co_schedule(qemu_get_current_aio_context(), qemu_coroutine_self());
         qemu_coroutine_yield();
@@ -771,8 +779,10 @@ static void blkdebug_close(BlockDriverState *bs)
     }
 
     g_free(s->config_file);
+    qemu_mutex_destroy(&s->lock);
 }
 
+/* Called with lock held.  */
 static void suspend_request(BlockDriverState *bs, BlkdebugRule *rule)
 {
     BDRVBlkdebugState *s = bs->opaque;
@@ -791,6 +801,7 @@ static void suspend_request(BlockDriverState *bs, BlkdebugRule *rule)
     }
 }
 
+/* Called with lock held.  */
 static void process_rule(BlockDriverState *bs, struct BlkdebugRule *rule,
                          int *action_count)
 {
@@ -829,9 +840,11 @@ static void blkdebug_debug_event(BlockDriverState *bs, BlkdebugEvent event)
 
     assert((int)event >= 0 && event < BLKDBG__MAX);
 
-    s->new_state = s->state;
-    QLIST_FOREACH_SAFE(rule, &s->rules[event], next, next) {
-        process_rule(bs, rule, actions_count);
+    WITH_QEMU_LOCK_GUARD(&s->lock) {
+        s->new_state = s->state;
+        QLIST_FOREACH_SAFE(rule, &s->rules[event], next, next) {
+            process_rule(bs, rule, actions_count);
+        }
     }
 
     while (actions_count[ACTION_SUSPEND] > 0) {
@@ -839,7 +852,9 @@ static void blkdebug_debug_event(BlockDriverState *bs, BlkdebugEvent event)
         actions_count[ACTION_SUSPEND]--;
     }
 
+    qemu_mutex_lock(&s->lock);
     s->state = s->new_state;
+    qemu_mutex_unlock(&s->lock);
 }
 
 static int blkdebug_debug_breakpoint(BlockDriverState *bs, const char *event,
@@ -862,11 +877,14 @@ static int blkdebug_debug_breakpoint(BlockDriverState *bs, const char *event,
         .options.suspend.tag = g_strdup(tag),
     };
 
+    qemu_mutex_lock(&s->lock);
     QLIST_INSERT_HEAD(&s->rules[blkdebug_event], rule, next);
+    qemu_mutex_unlock(&s->lock);
 
     return 0;
 }
 
+/* Called with lock held.  */
 static int resume_req_by_tag(BDRVBlkdebugState *s, const char *tag, bool all)
 {
     BlkdebugSuspendedReq *r;
@@ -884,7 +902,9 @@ retry:
             g_free(r->tag);
             g_free(r);
 
+            qemu_mutex_unlock(&s->lock);
             qemu_coroutine_enter(co);
+            qemu_mutex_lock(&s->lock);
 
             if (all) {
                 goto retry;
@@ -898,8 +918,12 @@ retry:
 static int blkdebug_debug_resume(BlockDriverState *bs, const char *tag)
 {
     BDRVBlkdebugState *s = bs->opaque;
+    int rc;
 
-    return resume_req_by_tag(s, tag, false);
+    qemu_mutex_lock(&s->lock);
+    rc = resume_req_by_tag(s, tag, false);
+    qemu_mutex_unlock(&s->lock);
+    return rc;
 }
 
 static int blkdebug_debug_remove_breakpoint(BlockDriverState *bs,
@@ -909,17 +933,19 @@ static int blkdebug_debug_remove_breakpoint(BlockDriverState *bs,
     BlkdebugRule *rule, *next;
     int i, ret = -ENOENT;
 
-    for (i = 0; i < BLKDBG__MAX; i++) {
-        QLIST_FOREACH_SAFE(rule, &s->rules[i], next, next) {
-            if (rule->action == ACTION_SUSPEND &&
-                !strcmp(rule->options.suspend.tag, tag)) {
-                remove_rule(rule);
-                ret = 0;
+    WITH_QEMU_LOCK_GUARD(&s->lock) {
+        for (i = 0; i < BLKDBG__MAX; i++) {
+            QLIST_FOREACH_SAFE(rule, &s->rules[i], next, next) {
+                if (rule->action == ACTION_SUSPEND &&
+                    !strcmp(rule->options.suspend.tag, tag)) {
+                    remove_rule(rule);
+                    ret = 0;
+                }
             }
         }
-    }
-    if (resume_req_by_tag(s, tag, true) == 0) {
-        ret = 0;
+        if (resume_req_by_tag(s, tag, true) == 0) {
+            ret = 0;
+        }
     }
     return ret;
 }
@@ -929,6 +955,7 @@ static bool blkdebug_debug_is_suspended(BlockDriverState *bs, const char *tag)
     BDRVBlkdebugState *s = bs->opaque;
     BlkdebugSuspendedReq *r;
 
+    QEMU_LOCK_GUARD(&s->lock);
     QLIST_FOREACH(r, &s->suspended_reqs, next) {
         if (!strcmp(r->tag, tag)) {
             return true;
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 1/5] blkdebug: refactor removal of a suspended request
  2021-05-17 14:50 ` [PATCH v3 1/5] blkdebug: refactor removal of a suspended request Emanuele Giuseppe Esposito
@ 2021-05-31 19:44   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 13+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-05-31 19:44 UTC (permalink / raw)
  To: Emanuele Giuseppe Esposito, qemu-block
  Cc: Kevin Wolf, qemu-devel, Max Reitz, Paolo Bonzini

17.05.2021 17:50, Emanuele Giuseppe Esposito wrote:
> Extract to a separate function.  Do not rely on FOREACH_SAFE, which is
> only "safe" if the *current* node is removed---not if another node is
> removed.  Instead, just walk the entire list from the beginning when
> asked to resume all suspended requests with a given tag.
> 
> Co-developed-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
> ---
>   block/blkdebug.c | 28 +++++++++++++++++-----------
>   1 file changed, 17 insertions(+), 11 deletions(-)
> 
> diff --git a/block/blkdebug.c b/block/blkdebug.c
> index 2c0b9b0ee8..8f19d991fa 100644
> --- a/block/blkdebug.c
> +++ b/block/blkdebug.c
> @@ -793,7 +793,6 @@ static void suspend_request(BlockDriverState *bs, BlkdebugRule *rule)
>           printf("blkdebug: Resuming request '%s'\n", r.tag);
>       }
>   
> -    QLIST_REMOVE(&r, next);
>       g_free(r.tag);
>   }
>   
> @@ -869,25 +868,35 @@ static int blkdebug_debug_breakpoint(BlockDriverState *bs, const char *event,
>       return 0;
>   }
>   
> -static int blkdebug_debug_resume(BlockDriverState *bs, const char *tag)
> +static int resume_req_by_tag(BDRVBlkdebugState *s, const char *tag, bool all)
>   {
> -    BDRVBlkdebugState *s = bs->opaque;
> -    BlkdebugSuspendedReq *r, *next;
> +    BlkdebugSuspendedReq *r;
>   
> -    QLIST_FOREACH_SAFE(r, &s->suspended_reqs, next, next) {
> +retry:
> +    QLIST_FOREACH(r, &s->suspended_reqs, next) {
>           if (!strcmp(r->tag, tag)) {
> +            QLIST_REMOVE(r, next);
>               qemu_coroutine_enter(r->co);
> +            if (all) {
> +                goto retry;
> +            }
>               return 0;
>           }
>       }
>       return -ENOENT;
>   }
>   
> +static int blkdebug_debug_resume(BlockDriverState *bs, const char *tag)
> +{
> +    BDRVBlkdebugState *s = bs->opaque;
> +
> +    return resume_req_by_tag(s, tag, false);
> +}
> +
>   static int blkdebug_debug_remove_breakpoint(BlockDriverState *bs,
>                                               const char *tag)
>   {
>       BDRVBlkdebugState *s = bs->opaque;
> -    BlkdebugSuspendedReq *r, *r_next;
>       BlkdebugRule *rule, *next;
>       int i, ret = -ENOENT;
>   
> @@ -900,11 +909,8 @@ static int blkdebug_debug_remove_breakpoint(BlockDriverState *bs,
>               }
>           }
>       }
> -    QLIST_FOREACH_SAFE(r, &s->suspended_reqs, next, r_next) {
> -        if (!strcmp(r->tag, tag)) {
> -            qemu_coroutine_enter(r->co);
> -            ret = 0;
> -        }
> +    if (resume_req_by_tag(s, tag, true) == 0) {
> +        ret = 0;
>       }
>       return ret;
>   }
> 

Interesting, could we really have several suspended_reqs with same tag, keeping in mind suspend_requests() removes rule before creating suspended_req with same tag.. Probably user could create new rule with same tag, when existing requests is suspended? Not sure is such behavior expected or should be abandoned. Still, it's all not about that patch:

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 2/5] blkdebug: move post-resume handling to resume_req_by_tag
  2021-05-17 14:50 ` [PATCH v3 2/5] blkdebug: move post-resume handling to resume_req_by_tag Emanuele Giuseppe Esposito
@ 2021-05-31 19:50   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 13+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-05-31 19:50 UTC (permalink / raw)
  To: Emanuele Giuseppe Esposito, qemu-block
  Cc: Kevin Wolf, qemu-devel, Max Reitz, Paolo Bonzini

17.05.2021 17:50, Emanuele Giuseppe Esposito wrote:
> We want to move qemu_coroutine_yield() after the loop on rules,
> because QLIST_FOREACH_SAFE is wrong if the rule list is modified
> while the coroutine has yielded.  Therefore move the suspended
> request to the heap and clean it up from the remove side.
> All that is left is for blkdebug_debug_event to handle the
> yielding.
> 
> Co-developed-by: Paolo Bonzini<pbonzini@redhat.com>
> Signed-off-by: Emanuele Giuseppe Esposito<eesposit@redhat.com>


Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 3/5] blkdebug: track all actions
  2021-05-17 14:50 ` [PATCH v3 3/5] blkdebug: track all actions Emanuele Giuseppe Esposito
@ 2021-05-31 19:57   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 13+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-05-31 19:57 UTC (permalink / raw)
  To: Emanuele Giuseppe Esposito, qemu-block
  Cc: Kevin Wolf, qemu-devel, Max Reitz, Paolo Bonzini

17.05.2021 17:50, Emanuele Giuseppe Esposito wrote:
> Add a counter for each action that a rule can trigger.
> This is mainly used to keep track of how many coroutine_yield()
> we need to perform after processing all rules in the list.
> 
> Co-developed-by: Paolo Bonzini<pbonzini@redhat.com>
> Signed-off-by: Emanuele Giuseppe Esposito<eesposit@redhat.com>


Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 4/5] blkdebug: do not suspend in the middle of QLIST_FOREACH_SAFE
  2021-05-17 14:50 ` [PATCH v3 4/5] blkdebug: do not suspend in the middle of QLIST_FOREACH_SAFE Emanuele Giuseppe Esposito
@ 2021-06-01  7:58   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 13+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-06-01  7:58 UTC (permalink / raw)
  To: Emanuele Giuseppe Esposito, qemu-block
  Cc: Kevin Wolf, qemu-devel, Max Reitz, Paolo Bonzini

17.05.2021 17:50, Emanuele Giuseppe Esposito wrote:
> That would be unsafe in case a rule other than the current one
> is removed while the coroutine has yielded.
> Keep FOREACH_SAFE because suspend_request deletes the current rule.
> 
> After this patch, *all* matching rules are deleted before suspending
> the coroutine, rather than just one.
> This doesn't affect the existing testcases.
> 
> Use actions_count to see how many yield to issue.
> 
> Co-developed-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
> ---
>   block/blkdebug.c | 7 ++++++-
>   1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/block/blkdebug.c b/block/blkdebug.c
> index 388b5ed615..dffd869b32 100644
> --- a/block/blkdebug.c
> +++ b/block/blkdebug.c
> @@ -789,7 +789,6 @@ static void suspend_request(BlockDriverState *bs, BlkdebugRule *rule)
>       if (!qtest_enabled()) {
>           printf("blkdebug: Suspended request '%s'\n", r->tag);
>       }
> -    qemu_coroutine_yield();
>   }
>   
>   static void process_rule(BlockDriverState *bs, struct BlkdebugRule *rule,
> @@ -834,6 +833,12 @@ static void blkdebug_debug_event(BlockDriverState *bs, BlkdebugEvent event)
>       QLIST_FOREACH_SAFE(rule, &s->rules[event], next, next) {
>           process_rule(bs, rule, actions_count);
>       }
> +
> +    while (actions_count[ACTION_SUSPEND] > 0) {
> +        qemu_coroutine_yield();
> +        actions_count[ACTION_SUSPEND]--;
> +    }
> +
>       s->state = s->new_state;
>   }
>   
> 

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

two thoughts:

  - suspend_request() should probably be renamed
  - actions_count logic is a bit overcomplicated, actually we need only suspend_count.

-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 5/5] blkdebug: protect rules and suspended_reqs with a lock
  2021-05-17 14:50 ` [PATCH v3 5/5] blkdebug: protect rules and suspended_reqs with a lock Emanuele Giuseppe Esposito
@ 2021-06-01  8:39   ` Vladimir Sementsov-Ogievskiy
  2021-06-02 12:59     ` Emanuele Giuseppe Esposito
  0 siblings, 1 reply; 13+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-06-01  8:39 UTC (permalink / raw)
  To: Emanuele Giuseppe Esposito, qemu-block
  Cc: Kevin Wolf, qemu-devel, Max Reitz, Paolo Bonzini

17.05.2021 17:50, Emanuele Giuseppe Esposito wrote:
> Co-developed-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
> ---
>   block/blkdebug.c | 53 ++++++++++++++++++++++++++++++++++++------------
>   1 file changed, 40 insertions(+), 13 deletions(-)
> 
> diff --git a/block/blkdebug.c b/block/blkdebug.c
> index dffd869b32..cf8b088ce7 100644
> --- a/block/blkdebug.c
> +++ b/block/blkdebug.c
> @@ -54,6 +54,7 @@ typedef struct BDRVBlkdebugState {
>       /* For blkdebug_refresh_filename() */
>       char *config_file;
>   
> +    QemuMutex lock;

we'll need a comments, which fields are protected by the lock, like in other your series.

and also a comment which functions are thread-safe after the patch.

remove_rule() lacks comment, like "Called with lock held or from .bdrv_close"

>       QLIST_HEAD(, BlkdebugRule) rules[BLKDBG__MAX];
>       QSIMPLEQ_HEAD(, BlkdebugRule) active_rules;
>       QLIST_HEAD(, BlkdebugSuspendedReq) suspended_reqs;
> @@ -245,7 +246,9 @@ static int add_rule(void *opaque, QemuOpts *opts, Error **errp)
>       };
>   
>       /* Add the rule */
> +    qemu_mutex_lock(&s->lock);
>       QLIST_INSERT_HEAD(&s->rules[event], rule, next);
> +    qemu_mutex_unlock(&s->lock);
>   
>       return 0;
>   }
> @@ -468,6 +471,7 @@ static int blkdebug_open(BlockDriverState *bs, QDict *options, int flags,
>       int ret;
>       uint64_t align;
>   
> +    qemu_mutex_init(&s->lock);
>       opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
>       if (!qemu_opts_absorb_qdict(opts, options, errp)) {
>           ret = -EINVAL;
> @@ -568,6 +572,7 @@ static int blkdebug_open(BlockDriverState *bs, QDict *options, int flags,
>       ret = 0;
>   out:
>       if (ret < 0) {
> +        qemu_mutex_destroy(&s->lock);
>           g_free(s->config_file);
>       }
>       qemu_opts_del(opts);
> @@ -582,6 +587,7 @@ static int rule_check(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
>       int error;
>       bool immediately;
>   
> +    qemu_mutex_lock(&s->lock);
>       QSIMPLEQ_FOREACH(rule, &s->active_rules, active_next) {
>           uint64_t inject_offset = rule->options.inject.offset;
>   
> @@ -595,6 +601,7 @@ static int rule_check(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
>       }
>   
>       if (!rule || !rule->options.inject.error) {
> +        qemu_mutex_unlock(&s->lock);
>           return 0;
>       }
>   
> @@ -606,6 +613,7 @@ static int rule_check(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
>           remove_rule(rule);
>       }
>   
> +    qemu_mutex_unlock(&s->lock);
>       if (!immediately) {
>           aio_co_schedule(qemu_get_current_aio_context(), qemu_coroutine_self());
>           qemu_coroutine_yield();
> @@ -771,8 +779,10 @@ static void blkdebug_close(BlockDriverState *bs)
>       }
>   
>       g_free(s->config_file);
> +    qemu_mutex_destroy(&s->lock);
>   }
>   
> +/* Called with lock held.  */
>   static void suspend_request(BlockDriverState *bs, BlkdebugRule *rule)
>   {
>       BDRVBlkdebugState *s = bs->opaque;
> @@ -791,6 +801,7 @@ static void suspend_request(BlockDriverState *bs, BlkdebugRule *rule)
>       }
>   }
>   
> +/* Called with lock held.  */
>   static void process_rule(BlockDriverState *bs, struct BlkdebugRule *rule,
>                            int *action_count)
>   {
> @@ -829,9 +840,11 @@ static void blkdebug_debug_event(BlockDriverState *bs, BlkdebugEvent event)
>   
>       assert((int)event >= 0 && event < BLKDBG__MAX);
>   
> -    s->new_state = s->state;
> -    QLIST_FOREACH_SAFE(rule, &s->rules[event], next, next) {
> -        process_rule(bs, rule, actions_count);
> +    WITH_QEMU_LOCK_GUARD(&s->lock) {
> +        s->new_state = s->state;
> +        QLIST_FOREACH_SAFE(rule, &s->rules[event], next, next) {
> +            process_rule(bs, rule, actions_count);
> +        }
>       }
>   
>       while (actions_count[ACTION_SUSPEND] > 0) {
> @@ -839,7 +852,9 @@ static void blkdebug_debug_event(BlockDriverState *bs, BlkdebugEvent event)
>           actions_count[ACTION_SUSPEND]--;
>       }
>   
> +    qemu_mutex_lock(&s->lock);
>       s->state = s->new_state;
> +    qemu_mutex_unlock(&s->lock);
>   }

Preexising: honestly, I don't understand the logic around state and new_state.. (and therefore, I'm not sure, is it in good relation with patch 04)

It come in the commit

commit 8f96b5be92fbd74798b97b1dc1ff5fbbe249ed11
Author: Paolo Bonzini <pbonzini@redhat.com>
Date:   Fri Sep 28 17:23:00 2012 +0200

     blkdebug: process all set_state rules in the old state
     
     Currently it is impossible to write a blkdebug script that ping-pongs
     between two states, because the second set-state rule will use the
     state that is set in the first.  If you have
     
         [set-state]
         event = "..."
         state = "1"
         new_state = "2"
     
         [set-state]
         event = "..."
         state = "2"
         new_state = "1"
     
     for example the state will remain locked at 1.  This can be fixed
     by first processing all rules, and then setting the state.

But I don't understand, whey it should remain locked..

And this logic is not safe: another event may happen during the yield, and will operate on the same s->state and s->new_state, leading the the mess.

>   
>   static int blkdebug_debug_breakpoint(BlockDriverState *bs, const char *event,
> @@ -862,11 +877,14 @@ static int blkdebug_debug_breakpoint(BlockDriverState *bs, const char *event,
>           .options.suspend.tag = g_strdup(tag),
>       };
>   
> +    qemu_mutex_lock(&s->lock);
>       QLIST_INSERT_HEAD(&s->rules[blkdebug_event], rule, next);
> +    qemu_mutex_unlock(&s->lock);
>   
>       return 0;
>   }
>   
> +/* Called with lock held.  */

I think, it would be a very good practice to include into convention:

May temporarily release lock.

>   static int resume_req_by_tag(BDRVBlkdebugState *s, const char *tag, bool all)
>   {
>       BlkdebugSuspendedReq *r;
> @@ -884,7 +902,9 @@ retry:
>               g_free(r->tag);
>               g_free(r);
>   
> +            qemu_mutex_unlock(&s->lock);
>               qemu_coroutine_enter(co);
> +            qemu_mutex_lock(&s->lock);
>   
>               if (all) {
>                   goto retry;
> @@ -898,8 +918,12 @@ retry:
>   static int blkdebug_debug_resume(BlockDriverState *bs, const char *tag)
>   {
>       BDRVBlkdebugState *s = bs->opaque;
> +    int rc;

Hmm, you can simply use QEMU_LOCK_GUARD

>   
> -    return resume_req_by_tag(s, tag, false);
> +    qemu_mutex_lock(&s->lock);
> +    rc = resume_req_by_tag(s, tag, false);
> +    qemu_mutex_unlock(&s->lock);
> +    return rc;
>   }
>   
>   static int blkdebug_debug_remove_breakpoint(BlockDriverState *bs,
> @@ -909,17 +933,19 @@ static int blkdebug_debug_remove_breakpoint(BlockDriverState *bs,
>       BlkdebugRule *rule, *next;
>       int i, ret = -ENOENT;
>   
> -    for (i = 0; i < BLKDBG__MAX; i++) {
> -        QLIST_FOREACH_SAFE(rule, &s->rules[i], next, next) {
> -            if (rule->action == ACTION_SUSPEND &&
> -                !strcmp(rule->options.suspend.tag, tag)) {
> -                remove_rule(rule);
> -                ret = 0;
> +    WITH_QEMU_LOCK_GUARD(&s->lock) {

And here, instead of increasing nesting, let's use QEMU_LOCK_GUARD()

> +        for (i = 0; i < BLKDBG__MAX; i++) {
> +            QLIST_FOREACH_SAFE(rule, &s->rules[i], next, next) {
> +                if (rule->action == ACTION_SUSPEND &&
> +                    !strcmp(rule->options.suspend.tag, tag)) {
> +                    remove_rule(rule);
> +                    ret = 0;
> +                }
>               }
>           }
> -    }
> -    if (resume_req_by_tag(s, tag, true) == 0) {
> -        ret = 0;
> +        if (resume_req_by_tag(s, tag, true) == 0) {
> +            ret = 0;
> +        }
>       }
>       return ret;
>   }
> @@ -929,6 +955,7 @@ static bool blkdebug_debug_is_suspended(BlockDriverState *bs, const char *tag)
>       BDRVBlkdebugState *s = bs->opaque;
>       BlkdebugSuspendedReq *r;
>   
> +    QEMU_LOCK_GUARD(&s->lock);
>       QLIST_FOREACH(r, &s->suspended_reqs, next) {
>           if (!strcmp(r->tag, tag)) {
>               return true;
> 


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 5/5] blkdebug: protect rules and suspended_reqs with a lock
  2021-06-01  8:39   ` Vladimir Sementsov-Ogievskiy
@ 2021-06-02 12:59     ` Emanuele Giuseppe Esposito
  2021-06-02 13:24       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 13+ messages in thread
From: Emanuele Giuseppe Esposito @ 2021-06-02 12:59 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: Kevin Wolf, Paolo Bonzini, qemu-devel, Max Reitz



On 01/06/2021 10:39, Vladimir Sementsov-Ogievskiy wrote:
> 17.05.2021 17:50, Emanuele Giuseppe Esposito wrote:
>> Co-developed-by: Paolo Bonzini <pbonzini@redhat.com>
>> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
>> ---
>>   block/blkdebug.c | 53 ++++++++++++++++++++++++++++++++++++------------
>>   1 file changed, 40 insertions(+), 13 deletions(-)
>>
>> diff --git a/block/blkdebug.c b/block/blkdebug.c
>> index dffd869b32..cf8b088ce7 100644
>> --- a/block/blkdebug.c
>> +++ b/block/blkdebug.c
>> @@ -54,6 +54,7 @@ typedef struct BDRVBlkdebugState {
>>       /* For blkdebug_refresh_filename() */
>>       char *config_file;
>> +    QemuMutex lock;
> 
> we'll need a comments, which fields are protected by the lock, like in 
> other your series.
> 
> and also a comment which functions are thread-safe after the patch.
> 
> remove_rule() lacks comment, like "Called with lock held or from 
> .bdrv_close"

Will apply these feedback in next version, thanks.

[...]

>> +/* Called with lock held.  */
>>   static void process_rule(BlockDriverState *bs, struct BlkdebugRule 
>> *rule,
>>                            int *action_count)
>>   {
>> @@ -829,9 +840,11 @@ static void blkdebug_debug_event(BlockDriverState 
>> *bs, BlkdebugEvent event)
>>       assert((int)event >= 0 && event < BLKDBG__MAX);
>> -    s->new_state = s->state;
>> -    QLIST_FOREACH_SAFE(rule, &s->rules[event], next, next) {
>> -        process_rule(bs, rule, actions_count);
>> +    WITH_QEMU_LOCK_GUARD(&s->lock) {
>> +        s->new_state = s->state;
>> +        QLIST_FOREACH_SAFE(rule, &s->rules[event], next, next) {
>> +            process_rule(bs, rule, actions_count);
>> +        }
>>       }
>>       while (actions_count[ACTION_SUSPEND] > 0) {
>> @@ -839,7 +852,9 @@ static void blkdebug_debug_event(BlockDriverState 
>> *bs, BlkdebugEvent event)
>>           actions_count[ACTION_SUSPEND]--;
>>       }
>> +    qemu_mutex_lock(&s->lock);
>>       s->state = s->new_state;
>> +    qemu_mutex_unlock(&s->lock);
>>   }
> 
> Preexising: honestly, I don't understand the logic around state and 
> new_state.. (and therefore, I'm not sure, is it in good relation with 
> patch 04)
> 
> It come in the commit
> 
> commit 8f96b5be92fbd74798b97b1dc1ff5fbbe249ed11
> Author: Paolo Bonzini <pbonzini@redhat.com>
> Date:   Fri Sep 28 17:23:00 2012 +0200
> 
>      blkdebug: process all set_state rules in the old state
>      Currently it is impossible to write a blkdebug script that ping-pongs
>      between two states, because the second set-state rule will use the
>      state that is set in the first.  If you have
>          [set-state]
>          event = "..."
>          state = "1"
>          new_state = "2"
>          [set-state]
>          event = "..."
>          state = "2"
>          new_state = "1"
>      for example the state will remain locked at 1.  This can be fixed
>      by first processing all rules, and then setting the state.
> 
> But I don't understand, whey it should remain locked..

 From what I understand, when bdrv_debug_event is invoked the code 
before this patch will simply change the state in 1 - 2 - 1 in the same 
rule iteration. So following Paolo's example, having these two rules in 
the same rules["..."] list, will make only one bdrv_debug_event change 
the state from 1 to 2, and 2 to 1 (if they are ordered in this way), 
removing both rules from the list.

This is not the expected behavior: we want two bdrv_debug_event to 
trigger the two state changes, so in the first bdrv_debug_event call we 
have 1-2 and next 2-1.

> 
> And this logic is not safe: another event may happen during the yield, 
> and will operate on the same s->state and s->new_state, leading the the 
> mess.

Yes, I think you are right. The state update should go in the same lock 
guard above, like this:

WITH_QEMU_LOCK_GUARD(&s->lock) {
         s->new_state = s->state;
         QLIST_FOREACH_SAFE(rule, &s->rules[event], next, next) {
             process_rule(bs, rule, actions_count);
         }
+       s->state = s->new_state;
     }

     while (actions_count[ACTION_SUSPEND] > 0) {
         qemu_coroutine_yield();
         actions_count[ACTION_SUSPEND]--;
     }

-    qemu_mutex_lock(&s->lock);
-    s->state = s->new_state;
-    qemu_mutex_unlock(&s->lock);

The other comments below make sense to me, will also apply them.

Thank you,
Emanuele

> 
>>   static int blkdebug_debug_breakpoint(BlockDriverState *bs, const 
>> char *event,
>> @@ -862,11 +877,14 @@ static int 
>> blkdebug_debug_breakpoint(BlockDriverState *bs, const char *event,
>>           .options.suspend.tag = g_strdup(tag),
>>       };
>> +    qemu_mutex_lock(&s->lock);
>>       QLIST_INSERT_HEAD(&s->rules[blkdebug_event], rule, next);
>> +    qemu_mutex_unlock(&s->lock);
>>       return 0;
>>   }
>> +/* Called with lock held.  */
> 
> I think, it would be a very good practice to include into convention:
> 
> May temporarily release lock. >
>>   static int resume_req_by_tag(BDRVBlkdebugState *s, const char *tag, 
>> bool all)
>>   {
>>       BlkdebugSuspendedReq *r;
>> @@ -884,7 +902,9 @@ retry:
>>               g_free(r->tag);
>>               g_free(r);
>> +            qemu_mutex_unlock(&s->lock);
>>               qemu_coroutine_enter(co);
>> +            qemu_mutex_lock(&s->lock);
>>               if (all) {
>>                   goto retry;
>> @@ -898,8 +918,12 @@ retry:
>>   static int blkdebug_debug_resume(BlockDriverState *bs, const char *tag)
>>   {
>>       BDRVBlkdebugState *s = bs->opaque;
>> +    int rc;
> 
> Hmm, you can simply use QEMU_LOCK_GUARD

> 
>> -    return resume_req_by_tag(s, tag, false);
>> +    qemu_mutex_lock(&s->lock);
>> +    rc = resume_req_by_tag(s, tag, false);
>> +    qemu_mutex_unlock(&s->lock);
>> +    return rc;
>>   }
>>   static int blkdebug_debug_remove_breakpoint(BlockDriverState *bs,
>> @@ -909,17 +933,19 @@ static int 
>> blkdebug_debug_remove_breakpoint(BlockDriverState *bs,
>>       BlkdebugRule *rule, *next;
>>       int i, ret = -ENOENT;
>> -    for (i = 0; i < BLKDBG__MAX; i++) {
>> -        QLIST_FOREACH_SAFE(rule, &s->rules[i], next, next) {
>> -            if (rule->action == ACTION_SUSPEND &&
>> -                !strcmp(rule->options.suspend.tag, tag)) {
>> -                remove_rule(rule);
>> -                ret = 0;
>> +    WITH_QEMU_LOCK_GUARD(&s->lock) {
> 
> And here, instead of increasing nesting, let's use QEMU_LOCK_GUARD()
> 
>> +        for (i = 0; i < BLKDBG__MAX; i++) {
>> +            QLIST_FOREACH_SAFE(rule, &s->rules[i], next, next) {
>> +                if (rule->action == ACTION_SUSPEND &&
>> +                    !strcmp(rule->options.suspend.tag, tag)) {
>> +                    remove_rule(rule);
>> +                    ret = 0;
>> +                }
>>               }
>>           }
>> -    }
>> -    if (resume_req_by_tag(s, tag, true) == 0) {
>> -        ret = 0;
>> +        if (resume_req_by_tag(s, tag, true) == 0) {
>> +            ret = 0;
>> +        }
>>       }
>>       return ret;
>>   }
>> @@ -929,6 +955,7 @@ static bool 
>> blkdebug_debug_is_suspended(BlockDriverState *bs, const char *tag)
>>       BDRVBlkdebugState *s = bs->opaque;
>>       BlkdebugSuspendedReq *r;
>> +    QEMU_LOCK_GUARD(&s->lock);
>>       QLIST_FOREACH(r, &s->suspended_reqs, next) {
>>           if (!strcmp(r->tag, tag)) {
>>               return true;
>>
> 
> 



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 5/5] blkdebug: protect rules and suspended_reqs with a lock
  2021-06-02 12:59     ` Emanuele Giuseppe Esposito
@ 2021-06-02 13:24       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 13+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-06-02 13:24 UTC (permalink / raw)
  To: Emanuele Giuseppe Esposito, qemu-block
  Cc: Kevin Wolf, qemu-devel, Max Reitz, Paolo Bonzini

02.06.2021 15:59, Emanuele Giuseppe Esposito wrote:
> 
> 
> On 01/06/2021 10:39, Vladimir Sementsov-Ogievskiy wrote:
>> 17.05.2021 17:50, Emanuele Giuseppe Esposito wrote:
>>> Co-developed-by: Paolo Bonzini <pbonzini@redhat.com>
>>> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
>>> ---
>>>   block/blkdebug.c | 53 ++++++++++++++++++++++++++++++++++++------------
>>>   1 file changed, 40 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/block/blkdebug.c b/block/blkdebug.c
>>> index dffd869b32..cf8b088ce7 100644
>>> --- a/block/blkdebug.c
>>> +++ b/block/blkdebug.c
>>> @@ -54,6 +54,7 @@ typedef struct BDRVBlkdebugState {
>>>       /* For blkdebug_refresh_filename() */
>>>       char *config_file;
>>> +    QemuMutex lock;
>>
>> we'll need a comments, which fields are protected by the lock, like in other your series.
>>
>> and also a comment which functions are thread-safe after the patch.
>>
>> remove_rule() lacks comment, like "Called with lock held or from .bdrv_close"
> 
> Will apply these feedback in next version, thanks.
> 
> [...]
> 
>>> +/* Called with lock held.  */
>>>   static void process_rule(BlockDriverState *bs, struct BlkdebugRule *rule,
>>>                            int *action_count)
>>>   {
>>> @@ -829,9 +840,11 @@ static void blkdebug_debug_event(BlockDriverState *bs, BlkdebugEvent event)
>>>       assert((int)event >= 0 && event < BLKDBG__MAX);
>>> -    s->new_state = s->state;
>>> -    QLIST_FOREACH_SAFE(rule, &s->rules[event], next, next) {
>>> -        process_rule(bs, rule, actions_count);
>>> +    WITH_QEMU_LOCK_GUARD(&s->lock) {
>>> +        s->new_state = s->state;
>>> +        QLIST_FOREACH_SAFE(rule, &s->rules[event], next, next) {
>>> +            process_rule(bs, rule, actions_count);
>>> +        }
>>>       }
>>>       while (actions_count[ACTION_SUSPEND] > 0) {
>>> @@ -839,7 +852,9 @@ static void blkdebug_debug_event(BlockDriverState *bs, BlkdebugEvent event)
>>>           actions_count[ACTION_SUSPEND]--;
>>>       }
>>> +    qemu_mutex_lock(&s->lock);
>>>       s->state = s->new_state;
>>> +    qemu_mutex_unlock(&s->lock);
>>>   }
>>
>> Preexising: honestly, I don't understand the logic around state and new_state.. (and therefore, I'm not sure, is it in good relation with patch 04)
>>
>> It come in the commit
>>
>> commit 8f96b5be92fbd74798b97b1dc1ff5fbbe249ed11
>> Author: Paolo Bonzini <pbonzini@redhat.com>
>> Date:   Fri Sep 28 17:23:00 2012 +0200
>>
>>      blkdebug: process all set_state rules in the old state
>>      Currently it is impossible to write a blkdebug script that ping-pongs
>>      between two states, because the second set-state rule will use the
>>      state that is set in the first.  If you have
>>          [set-state]
>>          event = "..."
>>          state = "1"
>>          new_state = "2"
>>          [set-state]
>>          event = "..."
>>          state = "2"
>>          new_state = "1"
>>      for example the state will remain locked at 1.  This can be fixed
>>      by first processing all rules, and then setting the state.
>>
>> But I don't understand, whey it should remain locked..
> 
>  From what I understand, when bdrv_debug_event is invoked the code before this patch will simply change the state in 1 - 2 - 1 in the same rule iteration. So following Paolo's example, having these two rules in the same rules["..."] list, will make only one bdrv_debug_event change the state from 1 to 2, and 2 to 1 (if they are ordered in this way), removing both rules from the list.
> 
> This is not the expected behavior: we want two bdrv_debug_event to trigger the two state changes, so in the first bdrv_debug_event call we have 1-2 and next 2-1.

Oh, understand now, thanks.

> 
>>
>> And this logic is not safe: another event may happen during the yield, and will operate on the same s->state and s->new_state, leading the the mess.
> 
> Yes, I think you are right. The state update should go in the same lock guard above, like this:

Agreed.

Probably good refactoring would be return new_state from process_rule, this way we can drop extra state variable s->new_state and use local variable instead (like we do for actions_count)

> 
> WITH_QEMU_LOCK_GUARD(&s->lock) {
>          s->new_state = s->state;
>          QLIST_FOREACH_SAFE(rule, &s->rules[event], next, next) {
>              process_rule(bs, rule, actions_count);
>          }
> +       s->state = s->new_state;
>      }
> 
>      while (actions_count[ACTION_SUSPEND] > 0) {
>          qemu_coroutine_yield();
>          actions_count[ACTION_SUSPEND]--;
>      }
> 
> -    qemu_mutex_lock(&s->lock);
> -    s->state = s->new_state;
> -    qemu_mutex_unlock(&s->lock);
> 
> The other comments below make sense to me, will also apply them.
> 
> Thank you,
> Emanuele
> 
>>
>>>   static int blkdebug_debug_breakpoint(BlockDriverState *bs, const char *event,
>>> @@ -862,11 +877,14 @@ static int blkdebug_debug_breakpoint(BlockDriverState *bs, const char *event,
>>>           .options.suspend.tag = g_strdup(tag),
>>>       };
>>> +    qemu_mutex_lock(&s->lock);
>>>       QLIST_INSERT_HEAD(&s->rules[blkdebug_event], rule, next);
>>> +    qemu_mutex_unlock(&s->lock);
>>>       return 0;
>>>   }
>>> +/* Called with lock held.  */
>>
>> I think, it would be a very good practice to include into convention:
>>
>> May temporarily release lock. >
>>>   static int resume_req_by_tag(BDRVBlkdebugState *s, const char *tag, bool all)
>>>   {
>>>       BlkdebugSuspendedReq *r;
>>> @@ -884,7 +902,9 @@ retry:
>>>               g_free(r->tag);
>>>               g_free(r);
>>> +            qemu_mutex_unlock(&s->lock);
>>>               qemu_coroutine_enter(co);
>>> +            qemu_mutex_lock(&s->lock);
>>>               if (all) {
>>>                   goto retry;
>>> @@ -898,8 +918,12 @@ retry:
>>>   static int blkdebug_debug_resume(BlockDriverState *bs, const char *tag)
>>>   {
>>>       BDRVBlkdebugState *s = bs->opaque;
>>> +    int rc;
>>
>> Hmm, you can simply use QEMU_LOCK_GUARD
> 
>>
>>> -    return resume_req_by_tag(s, tag, false);
>>> +    qemu_mutex_lock(&s->lock);
>>> +    rc = resume_req_by_tag(s, tag, false);
>>> +    qemu_mutex_unlock(&s->lock);
>>> +    return rc;
>>>   }
>>>   static int blkdebug_debug_remove_breakpoint(BlockDriverState *bs,
>>> @@ -909,17 +933,19 @@ static int blkdebug_debug_remove_breakpoint(BlockDriverState *bs,
>>>       BlkdebugRule *rule, *next;
>>>       int i, ret = -ENOENT;
>>> -    for (i = 0; i < BLKDBG__MAX; i++) {
>>> -        QLIST_FOREACH_SAFE(rule, &s->rules[i], next, next) {
>>> -            if (rule->action == ACTION_SUSPEND &&
>>> -                !strcmp(rule->options.suspend.tag, tag)) {
>>> -                remove_rule(rule);
>>> -                ret = 0;
>>> +    WITH_QEMU_LOCK_GUARD(&s->lock) {
>>
>> And here, instead of increasing nesting, let's use QEMU_LOCK_GUARD()
>>
>>> +        for (i = 0; i < BLKDBG__MAX; i++) {
>>> +            QLIST_FOREACH_SAFE(rule, &s->rules[i], next, next) {
>>> +                if (rule->action == ACTION_SUSPEND &&
>>> +                    !strcmp(rule->options.suspend.tag, tag)) {
>>> +                    remove_rule(rule);
>>> +                    ret = 0;
>>> +                }
>>>               }
>>>           }
>>> -    }
>>> -    if (resume_req_by_tag(s, tag, true) == 0) {
>>> -        ret = 0;
>>> +        if (resume_req_by_tag(s, tag, true) == 0) {
>>> +            ret = 0;
>>> +        }
>>>       }
>>>       return ret;
>>>   }
>>> @@ -929,6 +955,7 @@ static bool blkdebug_debug_is_suspended(BlockDriverState *bs, const char *tag)
>>>       BDRVBlkdebugState *s = bs->opaque;
>>>       BlkdebugSuspendedReq *r;
>>> +    QEMU_LOCK_GUARD(&s->lock);
>>>       QLIST_FOREACH(r, &s->suspended_reqs, next) {
>>>           if (!strcmp(r->tag, tag)) {
>>>               return true;
>>>
>>
>>
> 


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2021-06-02 13:37 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-17 14:50 [PATCH v3 0/5] blkdebug: fix racing condition when iterating on Emanuele Giuseppe Esposito
2021-05-17 14:50 ` [PATCH v3 1/5] blkdebug: refactor removal of a suspended request Emanuele Giuseppe Esposito
2021-05-31 19:44   ` Vladimir Sementsov-Ogievskiy
2021-05-17 14:50 ` [PATCH v3 2/5] blkdebug: move post-resume handling to resume_req_by_tag Emanuele Giuseppe Esposito
2021-05-31 19:50   ` Vladimir Sementsov-Ogievskiy
2021-05-17 14:50 ` [PATCH v3 3/5] blkdebug: track all actions Emanuele Giuseppe Esposito
2021-05-31 19:57   ` Vladimir Sementsov-Ogievskiy
2021-05-17 14:50 ` [PATCH v3 4/5] blkdebug: do not suspend in the middle of QLIST_FOREACH_SAFE Emanuele Giuseppe Esposito
2021-06-01  7:58   ` Vladimir Sementsov-Ogievskiy
2021-05-17 14:50 ` [PATCH v3 5/5] blkdebug: protect rules and suspended_reqs with a lock Emanuele Giuseppe Esposito
2021-06-01  8:39   ` Vladimir Sementsov-Ogievskiy
2021-06-02 12:59     ` Emanuele Giuseppe Esposito
2021-06-02 13:24       ` Vladimir Sementsov-Ogievskiy

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.