All of lore.kernel.org
 help / color / mirror / Atom feed
From: Emanuele Giuseppe Esposito <eesposit@redhat.com>
To: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
	qemu-block@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	qemu-devel@nongnu.org, Max Reitz <mreitz@redhat.com>
Subject: Re: [PATCH v3 5/5] blkdebug: protect rules and suspended_reqs with a lock
Date: Wed, 2 Jun 2021 14:59:22 +0200	[thread overview]
Message-ID: <0bd35c32-d987-f007-392a-08c116e026c2@redhat.com> (raw)
In-Reply-To: <43b0217f-4d3a-9380-c91e-ee86fe461c40@virtuozzo.com>



On 01/06/2021 10:39, Vladimir Sementsov-Ogievskiy wrote:
> 17.05.2021 17:50, Emanuele Giuseppe Esposito wrote:
>> Co-developed-by: Paolo Bonzini <pbonzini@redhat.com>
>> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
>> ---
>>   block/blkdebug.c | 53 ++++++++++++++++++++++++++++++++++++------------
>>   1 file changed, 40 insertions(+), 13 deletions(-)
>>
>> diff --git a/block/blkdebug.c b/block/blkdebug.c
>> index dffd869b32..cf8b088ce7 100644
>> --- a/block/blkdebug.c
>> +++ b/block/blkdebug.c
>> @@ -54,6 +54,7 @@ typedef struct BDRVBlkdebugState {
>>       /* For blkdebug_refresh_filename() */
>>       char *config_file;
>> +    QemuMutex lock;
> 
> we'll need a comments, which fields are protected by the lock, like in 
> other your series.
> 
> and also a comment which functions are thread-safe after the patch.
> 
> remove_rule() lacks comment, like "Called with lock held or from 
> .bdrv_close"

Will apply these feedback in next version, thanks.

[...]

>> +/* Called with lock held.  */
>>   static void process_rule(BlockDriverState *bs, struct BlkdebugRule 
>> *rule,
>>                            int *action_count)
>>   {
>> @@ -829,9 +840,11 @@ static void blkdebug_debug_event(BlockDriverState 
>> *bs, BlkdebugEvent event)
>>       assert((int)event >= 0 && event < BLKDBG__MAX);
>> -    s->new_state = s->state;
>> -    QLIST_FOREACH_SAFE(rule, &s->rules[event], next, next) {
>> -        process_rule(bs, rule, actions_count);
>> +    WITH_QEMU_LOCK_GUARD(&s->lock) {
>> +        s->new_state = s->state;
>> +        QLIST_FOREACH_SAFE(rule, &s->rules[event], next, next) {
>> +            process_rule(bs, rule, actions_count);
>> +        }
>>       }
>>       while (actions_count[ACTION_SUSPEND] > 0) {
>> @@ -839,7 +852,9 @@ static void blkdebug_debug_event(BlockDriverState 
>> *bs, BlkdebugEvent event)
>>           actions_count[ACTION_SUSPEND]--;
>>       }
>> +    qemu_mutex_lock(&s->lock);
>>       s->state = s->new_state;
>> +    qemu_mutex_unlock(&s->lock);
>>   }
> 
> Preexising: honestly, I don't understand the logic around state and 
> new_state.. (and therefore, I'm not sure, is it in good relation with 
> patch 04)
> 
> It come in the commit
> 
> commit 8f96b5be92fbd74798b97b1dc1ff5fbbe249ed11
> Author: Paolo Bonzini <pbonzini@redhat.com>
> Date:   Fri Sep 28 17:23:00 2012 +0200
> 
>      blkdebug: process all set_state rules in the old state
>      Currently it is impossible to write a blkdebug script that ping-pongs
>      between two states, because the second set-state rule will use the
>      state that is set in the first.  If you have
>          [set-state]
>          event = "..."
>          state = "1"
>          new_state = "2"
>          [set-state]
>          event = "..."
>          state = "2"
>          new_state = "1"
>      for example the state will remain locked at 1.  This can be fixed
>      by first processing all rules, and then setting the state.
> 
> But I don't understand, whey it should remain locked..

 From what I understand, when bdrv_debug_event is invoked the code 
before this patch will simply change the state in 1 - 2 - 1 in the same 
rule iteration. So following Paolo's example, having these two rules in 
the same rules["..."] list, will make only one bdrv_debug_event change 
the state from 1 to 2, and 2 to 1 (if they are ordered in this way), 
removing both rules from the list.

This is not the expected behavior: we want two bdrv_debug_event to 
trigger the two state changes, so in the first bdrv_debug_event call we 
have 1-2 and next 2-1.

> 
> And this logic is not safe: another event may happen during the yield, 
> and will operate on the same s->state and s->new_state, leading the the 
> mess.

Yes, I think you are right. The state update should go in the same lock 
guard above, like this:

WITH_QEMU_LOCK_GUARD(&s->lock) {
         s->new_state = s->state;
         QLIST_FOREACH_SAFE(rule, &s->rules[event], next, next) {
             process_rule(bs, rule, actions_count);
         }
+       s->state = s->new_state;
     }

     while (actions_count[ACTION_SUSPEND] > 0) {
         qemu_coroutine_yield();
         actions_count[ACTION_SUSPEND]--;
     }

-    qemu_mutex_lock(&s->lock);
-    s->state = s->new_state;
-    qemu_mutex_unlock(&s->lock);

The other comments below make sense to me, will also apply them.

Thank you,
Emanuele

> 
>>   static int blkdebug_debug_breakpoint(BlockDriverState *bs, const 
>> char *event,
>> @@ -862,11 +877,14 @@ static int 
>> blkdebug_debug_breakpoint(BlockDriverState *bs, const char *event,
>>           .options.suspend.tag = g_strdup(tag),
>>       };
>> +    qemu_mutex_lock(&s->lock);
>>       QLIST_INSERT_HEAD(&s->rules[blkdebug_event], rule, next);
>> +    qemu_mutex_unlock(&s->lock);
>>       return 0;
>>   }
>> +/* Called with lock held.  */
> 
> I think, it would be a very good practice to include into convention:
> 
> May temporarily release lock. >
>>   static int resume_req_by_tag(BDRVBlkdebugState *s, const char *tag, 
>> bool all)
>>   {
>>       BlkdebugSuspendedReq *r;
>> @@ -884,7 +902,9 @@ retry:
>>               g_free(r->tag);
>>               g_free(r);
>> +            qemu_mutex_unlock(&s->lock);
>>               qemu_coroutine_enter(co);
>> +            qemu_mutex_lock(&s->lock);
>>               if (all) {
>>                   goto retry;
>> @@ -898,8 +918,12 @@ retry:
>>   static int blkdebug_debug_resume(BlockDriverState *bs, const char *tag)
>>   {
>>       BDRVBlkdebugState *s = bs->opaque;
>> +    int rc;
> 
> Hmm, you can simply use QEMU_LOCK_GUARD

> 
>> -    return resume_req_by_tag(s, tag, false);
>> +    qemu_mutex_lock(&s->lock);
>> +    rc = resume_req_by_tag(s, tag, false);
>> +    qemu_mutex_unlock(&s->lock);
>> +    return rc;
>>   }
>>   static int blkdebug_debug_remove_breakpoint(BlockDriverState *bs,
>> @@ -909,17 +933,19 @@ static int 
>> blkdebug_debug_remove_breakpoint(BlockDriverState *bs,
>>       BlkdebugRule *rule, *next;
>>       int i, ret = -ENOENT;
>> -    for (i = 0; i < BLKDBG__MAX; i++) {
>> -        QLIST_FOREACH_SAFE(rule, &s->rules[i], next, next) {
>> -            if (rule->action == ACTION_SUSPEND &&
>> -                !strcmp(rule->options.suspend.tag, tag)) {
>> -                remove_rule(rule);
>> -                ret = 0;
>> +    WITH_QEMU_LOCK_GUARD(&s->lock) {
> 
> And here, instead of increasing nesting, let's use QEMU_LOCK_GUARD()
> 
>> +        for (i = 0; i < BLKDBG__MAX; i++) {
>> +            QLIST_FOREACH_SAFE(rule, &s->rules[i], next, next) {
>> +                if (rule->action == ACTION_SUSPEND &&
>> +                    !strcmp(rule->options.suspend.tag, tag)) {
>> +                    remove_rule(rule);
>> +                    ret = 0;
>> +                }
>>               }
>>           }
>> -    }
>> -    if (resume_req_by_tag(s, tag, true) == 0) {
>> -        ret = 0;
>> +        if (resume_req_by_tag(s, tag, true) == 0) {
>> +            ret = 0;
>> +        }
>>       }
>>       return ret;
>>   }
>> @@ -929,6 +955,7 @@ static bool 
>> blkdebug_debug_is_suspended(BlockDriverState *bs, const char *tag)
>>       BDRVBlkdebugState *s = bs->opaque;
>>       BlkdebugSuspendedReq *r;
>> +    QEMU_LOCK_GUARD(&s->lock);
>>       QLIST_FOREACH(r, &s->suspended_reqs, next) {
>>           if (!strcmp(r->tag, tag)) {
>>               return true;
>>
> 
> 



  reply	other threads:[~2021-06-02 13:01 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-17 14:50 [PATCH v3 0/5] blkdebug: fix racing condition when iterating on Emanuele Giuseppe Esposito
2021-05-17 14:50 ` [PATCH v3 1/5] blkdebug: refactor removal of a suspended request Emanuele Giuseppe Esposito
2021-05-31 19:44   ` Vladimir Sementsov-Ogievskiy
2021-05-17 14:50 ` [PATCH v3 2/5] blkdebug: move post-resume handling to resume_req_by_tag Emanuele Giuseppe Esposito
2021-05-31 19:50   ` Vladimir Sementsov-Ogievskiy
2021-05-17 14:50 ` [PATCH v3 3/5] blkdebug: track all actions Emanuele Giuseppe Esposito
2021-05-31 19:57   ` Vladimir Sementsov-Ogievskiy
2021-05-17 14:50 ` [PATCH v3 4/5] blkdebug: do not suspend in the middle of QLIST_FOREACH_SAFE Emanuele Giuseppe Esposito
2021-06-01  7:58   ` Vladimir Sementsov-Ogievskiy
2021-05-17 14:50 ` [PATCH v3 5/5] blkdebug: protect rules and suspended_reqs with a lock Emanuele Giuseppe Esposito
2021-06-01  8:39   ` Vladimir Sementsov-Ogievskiy
2021-06-02 12:59     ` Emanuele Giuseppe Esposito [this message]
2021-06-02 13:24       ` Vladimir Sementsov-Ogievskiy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0bd35c32-d987-f007-392a-08c116e026c2@redhat.com \
    --to=eesposit@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=vsementsov@virtuozzo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.