All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v5 0/6] blkdebug: fix racing condition when iterating on
@ 2021-06-14  8:29 Emanuele Giuseppe Esposito
  2021-06-14  8:29 ` [PATCH v5 1/6] blkdebug: refactor removal of a suspended request Emanuele Giuseppe Esposito
                   ` (6 more replies)
  0 siblings, 7 replies; 12+ messages in thread
From: Emanuele Giuseppe Esposito @ 2021-06-14  8:29 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Emanuele Giuseppe Esposito,
	Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz,
	Paolo Bonzini, Eric Blake

When qemu_coroutine_enter is executed in a loop
(even QEMU_FOREACH_SAFE), the new routine can modify the list,
for example removing an element, causing problem when control
is given back to the caller that continues iterating on the same list. 

Patch 1 solves the issue in blkdebug_debug_resume by restarting
the list walk after every coroutine_enter if list has to be fully iterated.
Patches 2,3,4 aim to fix blkdebug_debug_event by gathering
all actions that the rules make in a counter and invoking 
the respective coroutine_yeld only after processing all requests.

Patch 5-6 are somewhat independent of the others, patch 5 removes the need
of new_state field, and patch 6 adds a lock to
protect rules and suspended_reqs; right now everything works because
it's protected by the AioContext lock.
This is a preparation for the current proposal of removing the AioContext
lock and instead using smaller granularity locks to allow multiple
iothread execution in the same block device.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
v5:
* Add comment in patch 1 to explain why we don't need _SAFE in for loop
* Move the state update (s->state = new_state) in patch 5, to maintain
  the same existing effect in all patches

Emanuele Giuseppe Esposito (6):
  blkdebug: refactor removal of a suspended request
  blkdebug: move post-resume handling to resume_req_by_tag
  blkdebug: track all actions
  blkdebug: do not suspend in the middle of QLIST_FOREACH_SAFE
  block/blkdebug: remove new_state field and instead use a local
    variable
  blkdebug: protect rules and suspended_reqs with a lock

 block/blkdebug.c | 136 ++++++++++++++++++++++++++++++++---------------
 1 file changed, 92 insertions(+), 44 deletions(-)

-- 
2.31.1



^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v5 1/6] blkdebug: refactor removal of a suspended request
  2021-06-14  8:29 [PATCH v5 0/6] blkdebug: fix racing condition when iterating on Emanuele Giuseppe Esposito
@ 2021-06-14  8:29 ` Emanuele Giuseppe Esposito
  2021-06-14  8:29 ` [PATCH v5 2/6] blkdebug: move post-resume handling to resume_req_by_tag Emanuele Giuseppe Esposito
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Emanuele Giuseppe Esposito @ 2021-06-14  8:29 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Emanuele Giuseppe Esposito,
	Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz,
	Paolo Bonzini, Eric Blake

Extract to a separate function.  Do not rely on FOREACH_SAFE, which is
only "safe" if the *current* node is removed---not if another node is
removed.  Instead, just walk the entire list from the beginning when
asked to resume all suspended requests with a given tag.

Co-developed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/blkdebug.c | 33 ++++++++++++++++++++++-----------
 1 file changed, 22 insertions(+), 11 deletions(-)

diff --git a/block/blkdebug.c b/block/blkdebug.c
index 2c0b9b0ee8..5ccbfcab42 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -793,7 +793,6 @@ static void suspend_request(BlockDriverState *bs, BlkdebugRule *rule)
         printf("blkdebug: Resuming request '%s'\n", r.tag);
     }
 
-    QLIST_REMOVE(&r, next);
     g_free(r.tag);
 }
 
@@ -869,25 +868,40 @@ static int blkdebug_debug_breakpoint(BlockDriverState *bs, const char *event,
     return 0;
 }
 
-static int blkdebug_debug_resume(BlockDriverState *bs, const char *tag)
+static int resume_req_by_tag(BDRVBlkdebugState *s, const char *tag, bool all)
 {
-    BDRVBlkdebugState *s = bs->opaque;
-    BlkdebugSuspendedReq *r, *next;
+    BlkdebugSuspendedReq *r;
 
-    QLIST_FOREACH_SAFE(r, &s->suspended_reqs, next, next) {
+retry:
+    /*
+     * No need for _SAFE, since a different coroutine can remove another node
+     * (not the current one) in this list, and when the current one is removed
+     * the iteration starts back from beginning anyways.
+     */
+    QLIST_FOREACH(r, &s->suspended_reqs, next) {
         if (!strcmp(r->tag, tag)) {
+            QLIST_REMOVE(r, next);
             qemu_coroutine_enter(r->co);
+            if (all) {
+                goto retry;
+            }
             return 0;
         }
     }
     return -ENOENT;
 }
 
+static int blkdebug_debug_resume(BlockDriverState *bs, const char *tag)
+{
+    BDRVBlkdebugState *s = bs->opaque;
+
+    return resume_req_by_tag(s, tag, false);
+}
+
 static int blkdebug_debug_remove_breakpoint(BlockDriverState *bs,
                                             const char *tag)
 {
     BDRVBlkdebugState *s = bs->opaque;
-    BlkdebugSuspendedReq *r, *r_next;
     BlkdebugRule *rule, *next;
     int i, ret = -ENOENT;
 
@@ -900,11 +914,8 @@ static int blkdebug_debug_remove_breakpoint(BlockDriverState *bs,
             }
         }
     }
-    QLIST_FOREACH_SAFE(r, &s->suspended_reqs, next, r_next) {
-        if (!strcmp(r->tag, tag)) {
-            qemu_coroutine_enter(r->co);
-            ret = 0;
-        }
+    if (resume_req_by_tag(s, tag, true) == 0) {
+        ret = 0;
     }
     return ret;
 }
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v5 2/6] blkdebug: move post-resume handling to resume_req_by_tag
  2021-06-14  8:29 [PATCH v5 0/6] blkdebug: fix racing condition when iterating on Emanuele Giuseppe Esposito
  2021-06-14  8:29 ` [PATCH v5 1/6] blkdebug: refactor removal of a suspended request Emanuele Giuseppe Esposito
@ 2021-06-14  8:29 ` Emanuele Giuseppe Esposito
  2021-07-15  9:59   ` Max Reitz
  2021-06-14  8:29 ` [PATCH v5 3/6] blkdebug: track all actions Emanuele Giuseppe Esposito
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 12+ messages in thread
From: Emanuele Giuseppe Esposito @ 2021-06-14  8:29 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Emanuele Giuseppe Esposito,
	Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz,
	Paolo Bonzini, Eric Blake

We want to move qemu_coroutine_yield() after the loop on rules,
because QLIST_FOREACH_SAFE is wrong if the rule list is modified
while the coroutine has yielded.  Therefore move the suspended
request to the heap and clean it up from the remove side.
All that is left is for blkdebug_debug_event to handle the
yielding.

Co-developed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/blkdebug.c | 31 ++++++++++++++++++-------------
 1 file changed, 18 insertions(+), 13 deletions(-)

diff --git a/block/blkdebug.c b/block/blkdebug.c
index 5ccbfcab42..e8fdf7b056 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -775,25 +775,20 @@ static void blkdebug_close(BlockDriverState *bs)
 static void suspend_request(BlockDriverState *bs, BlkdebugRule *rule)
 {
     BDRVBlkdebugState *s = bs->opaque;
-    BlkdebugSuspendedReq r;
+    BlkdebugSuspendedReq *r;
 
-    r = (BlkdebugSuspendedReq) {
-        .co         = qemu_coroutine_self(),
-        .tag        = g_strdup(rule->options.suspend.tag),
-    };
+    r = g_new(BlkdebugSuspendedReq, 1);
+
+    r->co         = qemu_coroutine_self();
+    r->tag        = g_strdup(rule->options.suspend.tag);
 
     remove_rule(rule);
-    QLIST_INSERT_HEAD(&s->suspended_reqs, &r, next);
+    QLIST_INSERT_HEAD(&s->suspended_reqs, r, next);
 
     if (!qtest_enabled()) {
-        printf("blkdebug: Suspended request '%s'\n", r.tag);
+        printf("blkdebug: Suspended request '%s'\n", r->tag);
     }
     qemu_coroutine_yield();
-    if (!qtest_enabled()) {
-        printf("blkdebug: Resuming request '%s'\n", r.tag);
-    }
-
-    g_free(r.tag);
 }
 
 static bool process_rule(BlockDriverState *bs, struct BlkdebugRule *rule,
@@ -880,8 +875,18 @@ retry:
      */
     QLIST_FOREACH(r, &s->suspended_reqs, next) {
         if (!strcmp(r->tag, tag)) {
+            Coroutine *co = r->co;
+
+            if (!qtest_enabled()) {
+                printf("blkdebug: Resuming request '%s'\n", r->tag);
+            }
+
             QLIST_REMOVE(r, next);
-            qemu_coroutine_enter(r->co);
+            g_free(r->tag);
+            g_free(r);
+
+            qemu_coroutine_enter(co);
+
             if (all) {
                 goto retry;
             }
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v5 3/6] blkdebug: track all actions
  2021-06-14  8:29 [PATCH v5 0/6] blkdebug: fix racing condition when iterating on Emanuele Giuseppe Esposito
  2021-06-14  8:29 ` [PATCH v5 1/6] blkdebug: refactor removal of a suspended request Emanuele Giuseppe Esposito
  2021-06-14  8:29 ` [PATCH v5 2/6] blkdebug: move post-resume handling to resume_req_by_tag Emanuele Giuseppe Esposito
@ 2021-06-14  8:29 ` Emanuele Giuseppe Esposito
  2021-07-15  9:59   ` Max Reitz
  2021-06-14  8:29 ` [PATCH v5 4/6] blkdebug: do not suspend in the middle of QLIST_FOREACH_SAFE Emanuele Giuseppe Esposito
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 12+ messages in thread
From: Emanuele Giuseppe Esposito @ 2021-06-14  8:29 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Emanuele Giuseppe Esposito,
	Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz,
	Paolo Bonzini, Eric Blake

Add a counter for each action that a rule can trigger.
This is mainly used to keep track of how many coroutine_yield()
we need to perform after processing all rules in the list.

Co-developed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/blkdebug.c | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/block/blkdebug.c b/block/blkdebug.c
index e8fdf7b056..6bdeb2c7b3 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -74,6 +74,7 @@ enum {
     ACTION_INJECT_ERROR,
     ACTION_SET_STATE,
     ACTION_SUSPEND,
+    ACTION__MAX,
 };
 
 typedef struct BlkdebugRule {
@@ -791,22 +792,22 @@ static void suspend_request(BlockDriverState *bs, BlkdebugRule *rule)
     qemu_coroutine_yield();
 }
 
-static bool process_rule(BlockDriverState *bs, struct BlkdebugRule *rule,
-    bool injected)
+static void process_rule(BlockDriverState *bs, struct BlkdebugRule *rule,
+                         int *action_count)
 {
     BDRVBlkdebugState *s = bs->opaque;
 
     /* Only process rules for the current state */
     if (rule->state && rule->state != s->state) {
-        return injected;
+        return;
     }
 
     /* Take the action */
+    action_count[rule->action]++;
     switch (rule->action) {
     case ACTION_INJECT_ERROR:
-        if (!injected) {
+        if (action_count[ACTION_INJECT_ERROR] == 1) {
             QSIMPLEQ_INIT(&s->active_rules);
-            injected = true;
         }
         QSIMPLEQ_INSERT_HEAD(&s->active_rules, rule, active_next);
         break;
@@ -819,21 +820,19 @@ static bool process_rule(BlockDriverState *bs, struct BlkdebugRule *rule,
         suspend_request(bs, rule);
         break;
     }
-    return injected;
 }
 
 static void blkdebug_debug_event(BlockDriverState *bs, BlkdebugEvent event)
 {
     BDRVBlkdebugState *s = bs->opaque;
     struct BlkdebugRule *rule, *next;
-    bool injected;
+    int actions_count[ACTION__MAX] = { 0 };
 
     assert((int)event >= 0 && event < BLKDBG__MAX);
 
-    injected = false;
     s->new_state = s->state;
     QLIST_FOREACH_SAFE(rule, &s->rules[event], next, next) {
-        injected = process_rule(bs, rule, injected);
+        process_rule(bs, rule, actions_count);
     }
     s->state = s->new_state;
 }
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v5 4/6] blkdebug: do not suspend in the middle of QLIST_FOREACH_SAFE
  2021-06-14  8:29 [PATCH v5 0/6] blkdebug: fix racing condition when iterating on Emanuele Giuseppe Esposito
                   ` (2 preceding siblings ...)
  2021-06-14  8:29 ` [PATCH v5 3/6] blkdebug: track all actions Emanuele Giuseppe Esposito
@ 2021-06-14  8:29 ` Emanuele Giuseppe Esposito
  2021-06-14  8:29 ` [PATCH v5 5/6] block/blkdebug: remove new_state field and instead use a local variable Emanuele Giuseppe Esposito
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Emanuele Giuseppe Esposito @ 2021-06-14  8:29 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Emanuele Giuseppe Esposito,
	Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz,
	Paolo Bonzini, Eric Blake

That would be unsafe in case a rule other than the current one
is removed while the coroutine has yielded.
Keep FOREACH_SAFE because suspend_request deletes the current rule.

After this patch, *all* matching rules are deleted before suspending
the coroutine, rather than just one.
This doesn't affect the existing testcases.

Use actions_count to see how many yield to issue.

Co-developed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/blkdebug.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/block/blkdebug.c b/block/blkdebug.c
index 6bdeb2c7b3..dd82131d1e 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -789,7 +789,6 @@ static void suspend_request(BlockDriverState *bs, BlkdebugRule *rule)
     if (!qtest_enabled()) {
         printf("blkdebug: Suspended request '%s'\n", r->tag);
     }
-    qemu_coroutine_yield();
 }
 
 static void process_rule(BlockDriverState *bs, struct BlkdebugRule *rule,
@@ -834,6 +833,12 @@ static void blkdebug_debug_event(BlockDriverState *bs, BlkdebugEvent event)
     QLIST_FOREACH_SAFE(rule, &s->rules[event], next, next) {
         process_rule(bs, rule, actions_count);
     }
+
+    while (actions_count[ACTION_SUSPEND] > 0) {
+        qemu_coroutine_yield();
+        actions_count[ACTION_SUSPEND]--;
+    }
+
     s->state = s->new_state;
 }
 
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v5 5/6] block/blkdebug: remove new_state field and instead use a local variable
  2021-06-14  8:29 [PATCH v5 0/6] blkdebug: fix racing condition when iterating on Emanuele Giuseppe Esposito
                   ` (3 preceding siblings ...)
  2021-06-14  8:29 ` [PATCH v5 4/6] blkdebug: do not suspend in the middle of QLIST_FOREACH_SAFE Emanuele Giuseppe Esposito
@ 2021-06-14  8:29 ` Emanuele Giuseppe Esposito
  2021-06-19 12:38   ` Vladimir Sementsov-Ogievskiy
  2021-06-14  8:29 ` [PATCH v5 6/6] blkdebug: protect rules and suspended_reqs with a lock Emanuele Giuseppe Esposito
  2021-07-15 10:06 ` [PATCH v5 0/6] blkdebug: fix racing condition when iterating on Max Reitz
  6 siblings, 1 reply; 12+ messages in thread
From: Emanuele Giuseppe Esposito @ 2021-06-14  8:29 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Emanuele Giuseppe Esposito,
	Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz,
	Paolo Bonzini, Eric Blake

There seems to be no benefit in using a field. Replace it with a local
variable, and move the state update before the yields.

The state update has do be done before the yields because now using
a local variable does not allow the new updated state to be visible
by the other yields.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
 block/blkdebug.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/block/blkdebug.c b/block/blkdebug.c
index dd82131d1e..b47c3fd97c 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -40,7 +40,6 @@
 
 typedef struct BDRVBlkdebugState {
     int state;
-    int new_state;
     uint64_t align;
     uint64_t max_transfer;
     uint64_t opt_write_zero;
@@ -792,7 +791,7 @@ static void suspend_request(BlockDriverState *bs, BlkdebugRule *rule)
 }
 
 static void process_rule(BlockDriverState *bs, struct BlkdebugRule *rule,
-                         int *action_count)
+                         int *action_count, int *new_state)
 {
     BDRVBlkdebugState *s = bs->opaque;
 
@@ -812,7 +811,7 @@ static void process_rule(BlockDriverState *bs, struct BlkdebugRule *rule,
         break;
 
     case ACTION_SET_STATE:
-        s->new_state = rule->options.set_state.new_state;
+        *new_state = rule->options.set_state.new_state;
         break;
 
     case ACTION_SUSPEND:
@@ -825,21 +824,21 @@ static void blkdebug_debug_event(BlockDriverState *bs, BlkdebugEvent event)
 {
     BDRVBlkdebugState *s = bs->opaque;
     struct BlkdebugRule *rule, *next;
+    int new_state;
     int actions_count[ACTION__MAX] = { 0 };
 
     assert((int)event >= 0 && event < BLKDBG__MAX);
 
-    s->new_state = s->state;
+    new_state = s->state;
     QLIST_FOREACH_SAFE(rule, &s->rules[event], next, next) {
-        process_rule(bs, rule, actions_count);
+        process_rule(bs, rule, actions_count, &new_state);
     }
+    s->state = new_state;
 
     while (actions_count[ACTION_SUSPEND] > 0) {
         qemu_coroutine_yield();
         actions_count[ACTION_SUSPEND]--;
     }
-
-    s->state = s->new_state;
 }
 
 static int blkdebug_debug_breakpoint(BlockDriverState *bs, const char *event,
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v5 6/6] blkdebug: protect rules and suspended_reqs with a lock
  2021-06-14  8:29 [PATCH v5 0/6] blkdebug: fix racing condition when iterating on Emanuele Giuseppe Esposito
                   ` (4 preceding siblings ...)
  2021-06-14  8:29 ` [PATCH v5 5/6] block/blkdebug: remove new_state field and instead use a local variable Emanuele Giuseppe Esposito
@ 2021-06-14  8:29 ` Emanuele Giuseppe Esposito
  2021-06-19 12:42   ` Vladimir Sementsov-Ogievskiy
  2021-07-15 10:06 ` [PATCH v5 0/6] blkdebug: fix racing condition when iterating on Max Reitz
  6 siblings, 1 reply; 12+ messages in thread
From: Emanuele Giuseppe Esposito @ 2021-06-14  8:29 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Emanuele Giuseppe Esposito,
	Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz,
	Paolo Bonzini, Eric Blake

First, categorize the structure fields to identify what needs
to be protected and what doesn't.

We essentially need to protect only .state, and the 3 lists in
BDRVBlkdebugState.

Then, add the lock and mark the functions accordingly.

Co-developed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
 block/blkdebug.c | 49 ++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 39 insertions(+), 10 deletions(-)

diff --git a/block/blkdebug.c b/block/blkdebug.c
index b47c3fd97c..8b67554bec 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -38,24 +38,27 @@
 #include "qapi/qobject-input-visitor.h"
 #include "sysemu/qtest.h"
 
+/* All APIs are thread-safe */
+
 typedef struct BDRVBlkdebugState {
-    int state;
+    /* IN: initialized in blkdebug_open() and never changed */
     uint64_t align;
     uint64_t max_transfer;
     uint64_t opt_write_zero;
     uint64_t max_write_zero;
     uint64_t opt_discard;
     uint64_t max_discard;
-
+    char *config_file; /* For blkdebug_refresh_filename() */
+    /* initialized in blkdebug_parse_perms() */
     uint64_t take_child_perms;
     uint64_t unshare_child_perms;
 
-    /* For blkdebug_refresh_filename() */
-    char *config_file;
-
+    /* State. Protected by lock */
+    int state;
     QLIST_HEAD(, BlkdebugRule) rules[BLKDBG__MAX];
     QSIMPLEQ_HEAD(, BlkdebugRule) active_rules;
     QLIST_HEAD(, BlkdebugSuspendedReq) suspended_reqs;
+    QemuMutex lock;
 } BDRVBlkdebugState;
 
 typedef struct BlkdebugAIOCB {
@@ -64,8 +67,11 @@ typedef struct BlkdebugAIOCB {
 } BlkdebugAIOCB;
 
 typedef struct BlkdebugSuspendedReq {
+    /* IN: initialized in suspend_request() */
     Coroutine *co;
     char *tag;
+
+    /* List entry protected BDRVBlkdebugState's lock */
     QLIST_ENTRY(BlkdebugSuspendedReq) next;
 } BlkdebugSuspendedReq;
 
@@ -77,6 +83,7 @@ enum {
 };
 
 typedef struct BlkdebugRule {
+    /* IN: initialized in add_rule() or blkdebug_debug_breakpoint() */
     BlkdebugEvent event;
     int action;
     int state;
@@ -95,6 +102,8 @@ typedef struct BlkdebugRule {
             char *tag;
         } suspend;
     } options;
+
+    /* List entries protected BDRVBlkdebugState's lock */
     QLIST_ENTRY(BlkdebugRule) next;
     QSIMPLEQ_ENTRY(BlkdebugRule) active_next;
 } BlkdebugRule;
@@ -244,11 +253,14 @@ static int add_rule(void *opaque, QemuOpts *opts, Error **errp)
     };
 
     /* Add the rule */
+    qemu_mutex_lock(&s->lock);
     QLIST_INSERT_HEAD(&s->rules[event], rule, next);
+    qemu_mutex_unlock(&s->lock);
 
     return 0;
 }
 
+/* Called with lock held or from .bdrv_close */
 static void remove_rule(BlkdebugRule *rule)
 {
     switch (rule->action) {
@@ -467,6 +479,7 @@ static int blkdebug_open(BlockDriverState *bs, QDict *options, int flags,
     int ret;
     uint64_t align;
 
+    qemu_mutex_init(&s->lock);
     opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
     if (!qemu_opts_absorb_qdict(opts, options, errp)) {
         ret = -EINVAL;
@@ -567,6 +580,7 @@ static int blkdebug_open(BlockDriverState *bs, QDict *options, int flags,
     ret = 0;
 out:
     if (ret < 0) {
+        qemu_mutex_destroy(&s->lock);
         g_free(s->config_file);
     }
     qemu_opts_del(opts);
@@ -581,6 +595,7 @@ static int rule_check(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
     int error;
     bool immediately;
 
+    qemu_mutex_lock(&s->lock);
     QSIMPLEQ_FOREACH(rule, &s->active_rules, active_next) {
         uint64_t inject_offset = rule->options.inject.offset;
 
@@ -594,6 +609,7 @@ static int rule_check(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
     }
 
     if (!rule || !rule->options.inject.error) {
+        qemu_mutex_unlock(&s->lock);
         return 0;
     }
 
@@ -605,6 +621,7 @@ static int rule_check(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
         remove_rule(rule);
     }
 
+    qemu_mutex_unlock(&s->lock);
     if (!immediately) {
         aio_co_schedule(qemu_get_current_aio_context(), qemu_coroutine_self());
         qemu_coroutine_yield();
@@ -770,8 +787,10 @@ static void blkdebug_close(BlockDriverState *bs)
     }
 
     g_free(s->config_file);
+    qemu_mutex_destroy(&s->lock);
 }
 
+/* Called with lock held.  */
 static void suspend_request(BlockDriverState *bs, BlkdebugRule *rule)
 {
     BDRVBlkdebugState *s = bs->opaque;
@@ -790,6 +809,7 @@ static void suspend_request(BlockDriverState *bs, BlkdebugRule *rule)
     }
 }
 
+/* Called with lock held.  */
 static void process_rule(BlockDriverState *bs, struct BlkdebugRule *rule,
                          int *action_count, int *new_state)
 {
@@ -829,11 +849,13 @@ static void blkdebug_debug_event(BlockDriverState *bs, BlkdebugEvent event)
 
     assert((int)event >= 0 && event < BLKDBG__MAX);
 
-    new_state = s->state;
-    QLIST_FOREACH_SAFE(rule, &s->rules[event], next, next) {
-        process_rule(bs, rule, actions_count, &new_state);
+    WITH_QEMU_LOCK_GUARD(&s->lock) {
+        new_state = s->state;
+        QLIST_FOREACH_SAFE(rule, &s->rules[event], next, next) {
+            process_rule(bs, rule, actions_count, &new_state);
+        }
+        s->state = new_state;
     }
-    s->state = new_state;
 
     while (actions_count[ACTION_SUSPEND] > 0) {
         qemu_coroutine_yield();
@@ -861,11 +883,14 @@ static int blkdebug_debug_breakpoint(BlockDriverState *bs, const char *event,
         .options.suspend.tag = g_strdup(tag),
     };
 
+    qemu_mutex_lock(&s->lock);
     QLIST_INSERT_HEAD(&s->rules[blkdebug_event], rule, next);
+    qemu_mutex_unlock(&s->lock);
 
     return 0;
 }
 
+/* Called with lock held. May temporarily release lock. */
 static int resume_req_by_tag(BDRVBlkdebugState *s, const char *tag, bool all)
 {
     BlkdebugSuspendedReq *r;
@@ -888,7 +913,9 @@ retry:
             g_free(r->tag);
             g_free(r);
 
+            qemu_mutex_unlock(&s->lock);
             qemu_coroutine_enter(co);
+            qemu_mutex_lock(&s->lock);
 
             if (all) {
                 goto retry;
@@ -902,7 +929,7 @@ retry:
 static int blkdebug_debug_resume(BlockDriverState *bs, const char *tag)
 {
     BDRVBlkdebugState *s = bs->opaque;
-
+    QEMU_LOCK_GUARD(&s->lock);
     return resume_req_by_tag(s, tag, false);
 }
 
@@ -913,6 +940,7 @@ static int blkdebug_debug_remove_breakpoint(BlockDriverState *bs,
     BlkdebugRule *rule, *next;
     int i, ret = -ENOENT;
 
+    QEMU_LOCK_GUARD(&s->lock);
     for (i = 0; i < BLKDBG__MAX; i++) {
         QLIST_FOREACH_SAFE(rule, &s->rules[i], next, next) {
             if (rule->action == ACTION_SUSPEND &&
@@ -933,6 +961,7 @@ static bool blkdebug_debug_is_suspended(BlockDriverState *bs, const char *tag)
     BDRVBlkdebugState *s = bs->opaque;
     BlkdebugSuspendedReq *r;
 
+    QEMU_LOCK_GUARD(&s->lock);
     QLIST_FOREACH(r, &s->suspended_reqs, next) {
         if (!strcmp(r->tag, tag)) {
             return true;
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v5 5/6] block/blkdebug: remove new_state field and instead use a local variable
  2021-06-14  8:29 ` [PATCH v5 5/6] block/blkdebug: remove new_state field and instead use a local variable Emanuele Giuseppe Esposito
@ 2021-06-19 12:38   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 12+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-06-19 12:38 UTC (permalink / raw)
  To: Emanuele Giuseppe Esposito, qemu-block
  Cc: Kevin Wolf, Max Reitz, Paolo Bonzini, Eric Blake, qemu-devel

14.06.2021 11:29, Emanuele Giuseppe Esposito wrote:
> There seems to be no benefit in using a field. Replace it with a local
> variable, and move the state update before the yields.
> 
> The state update has do be done before the yields because now using
> a local variable does not allow the new updated state to be visible
> by the other yields.

Not sure that this sentence helps.. I just can't imagine how state update should work keeping in mind these yields :) And making state update more atomically makes sense to me just because it's simpler to understand.

> 
> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>



-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v5 6/6] blkdebug: protect rules and suspended_reqs with a lock
  2021-06-14  8:29 ` [PATCH v5 6/6] blkdebug: protect rules and suspended_reqs with a lock Emanuele Giuseppe Esposito
@ 2021-06-19 12:42   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 12+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-06-19 12:42 UTC (permalink / raw)
  To: Emanuele Giuseppe Esposito, qemu-block
  Cc: Kevin Wolf, Max Reitz, Paolo Bonzini, Eric Blake, qemu-devel

14.06.2021 11:29, Emanuele Giuseppe Esposito wrote:
> First, categorize the structure fields to identify what needs
> to be protected and what doesn't.
> 
> We essentially need to protect only .state, and the 3 lists in
> BDRVBlkdebugState.
> 
> Then, add the lock and mark the functions accordingly.
> 
> Co-developed-by: Paolo Bonzini<pbonzini@redhat.com>
> Signed-off-by: Emanuele Giuseppe Esposito<eesposit@redhat.com>


Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v5 2/6] blkdebug: move post-resume handling to resume_req_by_tag
  2021-06-14  8:29 ` [PATCH v5 2/6] blkdebug: move post-resume handling to resume_req_by_tag Emanuele Giuseppe Esposito
@ 2021-07-15  9:59   ` Max Reitz
  0 siblings, 0 replies; 12+ messages in thread
From: Max Reitz @ 2021-07-15  9:59 UTC (permalink / raw)
  To: Emanuele Giuseppe Esposito, qemu-block
  Cc: Kevin Wolf, Paolo Bonzini, Vladimir Sementsov-Ogievskiy,
	Eric Blake, qemu-devel

On 14.06.21 10:29, Emanuele Giuseppe Esposito wrote:
> We want to move qemu_coroutine_yield() after the loop on rules,
> because QLIST_FOREACH_SAFE is wrong if the rule list is modified
> while the coroutine has yielded.  Therefore move the suspended
> request to the heap and clean it up from the remove side.
> All that is left is for blkdebug_debug_event to handle the
> yielding.
>
> Co-developed-by: Paolo Bonzini<pbonzini@redhat.com>
> Signed-off-by: Emanuele Giuseppe Esposito<eesposit@redhat.com>
> Reviewed-by: Vladimir Sementsov-Ogievskiy<vsementsov@virtuozzo.com>
> ---
>   block/blkdebug.c | 31 ++++++++++++++++++-------------
>   1 file changed, 18 insertions(+), 13 deletions(-)
>
> diff --git a/block/blkdebug.c b/block/blkdebug.c
> index 5ccbfcab42..e8fdf7b056 100644
> --- a/block/blkdebug.c
> +++ b/block/blkdebug.c
> @@ -775,25 +775,20 @@ static void blkdebug_close(BlockDriverState *bs)
>   static void suspend_request(BlockDriverState *bs, BlkdebugRule *rule)
>   {
>       BDRVBlkdebugState *s = bs->opaque;
> -    BlkdebugSuspendedReq r;
> +    BlkdebugSuspendedReq *r;
>   
> -    r = (BlkdebugSuspendedReq) {
> -        .co         = qemu_coroutine_self(),
> -        .tag        = g_strdup(rule->options.suspend.tag),
> -    };
> +    r = g_new(BlkdebugSuspendedReq, 1);
> +
> +    r->co         = qemu_coroutine_self();
> +    r->tag        = g_strdup(rule->options.suspend.tag);

Not wrong, but just as a note: I personally would have done the 
initialization like

*r = (BlkdebugSuspendedReq) {
     .co = ...,
     .tag = ...,
};

The advantage is that this sets all fields that aren’t mentioned to zero 
(kind of important, because you don’t use g_new0(), and so now I have to 
manually verify that there are no other fields that would need to be 
initialized (which there aren’t)), and in this special case the diff 
stat also would have been smaller. (But that’s a rare coincidence.)

There are no other fields besides the list entry object (which is fully 
overwritten by QLIST_INSERT_HEAD()), though, so this patch is correct 
and I’m happy with it as-is.

Max



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v5 3/6] blkdebug: track all actions
  2021-06-14  8:29 ` [PATCH v5 3/6] blkdebug: track all actions Emanuele Giuseppe Esposito
@ 2021-07-15  9:59   ` Max Reitz
  0 siblings, 0 replies; 12+ messages in thread
From: Max Reitz @ 2021-07-15  9:59 UTC (permalink / raw)
  To: Emanuele Giuseppe Esposito, qemu-block
  Cc: Kevin Wolf, Paolo Bonzini, Vladimir Sementsov-Ogievskiy,
	Eric Blake, qemu-devel

On 14.06.21 10:29, Emanuele Giuseppe Esposito wrote:
> Add a counter for each action that a rule can trigger.
> This is mainly used to keep track of how many coroutine_yield()
> we need to perform after processing all rules in the list.
>
> Co-developed-by: Paolo Bonzini<pbonzini@redhat.com>
> Signed-off-by: Emanuele Giuseppe Esposito<eesposit@redhat.com>
> Reviewed-by: Vladimir Sementsov-Ogievskiy<vsementsov@virtuozzo.com>
> ---
>   block/blkdebug.c | 17 ++++++++---------
>   1 file changed, 8 insertions(+), 9 deletions(-)
>
> diff --git a/block/blkdebug.c b/block/blkdebug.c
> index e8fdf7b056..6bdeb2c7b3 100644
> --- a/block/blkdebug.c
> +++ b/block/blkdebug.c
> @@ -74,6 +74,7 @@ enum {
>       ACTION_INJECT_ERROR,
>       ACTION_SET_STATE,
>       ACTION_SUSPEND,
> +    ACTION__MAX,
>   };
>   
>   typedef struct BlkdebugRule {
> @@ -791,22 +792,22 @@ static void suspend_request(BlockDriverState *bs, BlkdebugRule *rule)
>       qemu_coroutine_yield();
>   }
>   
> -static bool process_rule(BlockDriverState *bs, struct BlkdebugRule *rule,
> -    bool injected)
> +static void process_rule(BlockDriverState *bs, struct BlkdebugRule *rule,
> +                         int *action_count)

I would have liked a comment above this function explaining that 
`action_count` is not merely an int pointer, but actually an 
int[ACTION__MAX] pointer.

But it’s too late to complain about that now. O:)

>   {
>       BDRVBlkdebugState *s = bs->opaque;
>   
>       /* Only process rules for the current state */
>       if (rule->state && rule->state != s->state) {
> -        return injected;
> +        return;
>       }
>   
>       /* Take the action */
> +    action_count[rule->action]++;
>       switch (rule->action) {
>       case ACTION_INJECT_ERROR:
> -        if (!injected) {
> +        if (action_count[ACTION_INJECT_ERROR] == 1) {
>               QSIMPLEQ_INIT(&s->active_rules);

(I don’t quite understand this part – why do we clear the list of active 
rules here?  And why only if a new error is injected?  For example, if I 
have an inject-error rule that should only fire on state 1, and then the 
state changes to state 2, it stays active until a new error is injected, 
which doesn’t make sense to me.  But that has nothing to do with this 
series, of course.  I’m just wondering.)

Max

> -            injected = true;
>           }
>           QSIMPLEQ_INSERT_HEAD(&s->active_rules, rule, active_next);
>           break;
>
>    
>



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v5 0/6] blkdebug: fix racing condition when iterating on
  2021-06-14  8:29 [PATCH v5 0/6] blkdebug: fix racing condition when iterating on Emanuele Giuseppe Esposito
                   ` (5 preceding siblings ...)
  2021-06-14  8:29 ` [PATCH v5 6/6] blkdebug: protect rules and suspended_reqs with a lock Emanuele Giuseppe Esposito
@ 2021-07-15 10:06 ` Max Reitz
  6 siblings, 0 replies; 12+ messages in thread
From: Max Reitz @ 2021-07-15 10:06 UTC (permalink / raw)
  To: Emanuele Giuseppe Esposito, qemu-block
  Cc: Kevin Wolf, Paolo Bonzini, Vladimir Sementsov-Ogievskiy,
	Eric Blake, qemu-devel

On 14.06.21 10:29, Emanuele Giuseppe Esposito wrote:
> When qemu_coroutine_enter is executed in a loop
> (even QEMU_FOREACH_SAFE), the new routine can modify the list,
> for example removing an element, causing problem when control
> is given back to the caller that continues iterating on the same list.
>
> Patch 1 solves the issue in blkdebug_debug_resume by restarting
> the list walk after every coroutine_enter if list has to be fully iterated.
> Patches 2,3,4 aim to fix blkdebug_debug_event by gathering
> all actions that the rules make in a counter and invoking
> the respective coroutine_yeld only after processing all requests.
>
> Patch 5-6 are somewhat independent of the others, patch 5 removes the need
> of new_state field, and patch 6 adds a lock to
> protect rules and suspended_reqs; right now everything works because
> it's protected by the AioContext lock.
> This is a preparation for the current proposal of removing the AioContext
> lock and instead using smaller granularity locks to allow multiple
> iothread execution in the same block device.
>
> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
> ---
> v5:
> * Add comment in patch 1 to explain why we don't need _SAFE in for loop
> * Move the state update (s->state = new_state) in patch 5, to maintain
>    the same existing effect in all patches

I’m not sure whether this actually fixes a user-visible bug…?  The first 
paragraph makes it sound like it, but there is no test, so I’m not sure.

I’m mostly asking because of freeze; but you make it sound like there’s 
a bug, and as this only concerns blkdebug (i.e., a block driver used 
only for testing), I feel like applying this series after soft freeze 
should be fine, so:

Thanks, I’ve applied this series to my block branch:

https://github.com/XanClic/qemu/commits/block

Max



^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2021-07-15 10:09 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-14  8:29 [PATCH v5 0/6] blkdebug: fix racing condition when iterating on Emanuele Giuseppe Esposito
2021-06-14  8:29 ` [PATCH v5 1/6] blkdebug: refactor removal of a suspended request Emanuele Giuseppe Esposito
2021-06-14  8:29 ` [PATCH v5 2/6] blkdebug: move post-resume handling to resume_req_by_tag Emanuele Giuseppe Esposito
2021-07-15  9:59   ` Max Reitz
2021-06-14  8:29 ` [PATCH v5 3/6] blkdebug: track all actions Emanuele Giuseppe Esposito
2021-07-15  9:59   ` Max Reitz
2021-06-14  8:29 ` [PATCH v5 4/6] blkdebug: do not suspend in the middle of QLIST_FOREACH_SAFE Emanuele Giuseppe Esposito
2021-06-14  8:29 ` [PATCH v5 5/6] block/blkdebug: remove new_state field and instead use a local variable Emanuele Giuseppe Esposito
2021-06-19 12:38   ` Vladimir Sementsov-Ogievskiy
2021-06-14  8:29 ` [PATCH v5 6/6] blkdebug: protect rules and suspended_reqs with a lock Emanuele Giuseppe Esposito
2021-06-19 12:42   ` Vladimir Sementsov-Ogievskiy
2021-07-15 10:06 ` [PATCH v5 0/6] blkdebug: fix racing condition when iterating on Max Reitz

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.