All of lore.kernel.org
 help / color / mirror / Atom feed
From: Paolo Bonzini <pbonzini@redhat.com>
To: Stefan Hajnoczi <stefanha@redhat.com>,
	Emanuele Giuseppe Esposito <eesposit@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>, Fam Zheng <fam@euphon.net>,
	qemu-block@nongnu.org, Markus Armbruster <armbru@redhat.com>,
	qemu-devel@nongnu.org, Max Reitz <mreitz@redhat.com>,
	John Snow <jsnow@redhat.com>
Subject: Re: [PATCH v2 2/8] block: protect write threshold QMP commands from concurrent requests
Date: Wed, 5 May 2021 13:29:26 +0200	[thread overview]
Message-ID: <36588682-2390-9d1e-dfe1-33827b1fcf90@redhat.com> (raw)
In-Reply-To: <YJJdl+wNvxgl83Km@stefanha-x1.localdomain>

On 05/05/21 10:55, Stefan Hajnoczi wrote:
>> @@ -119,10 +118,8 @@ void qmp_block_set_write_threshold(const char *node_name,
>>           return;
>>       }
>>   
>> -    aio_context = bdrv_get_aio_context(bs);
>> -    aio_context_acquire(aio_context);
>> -
>> +    /* Avoid a concurrent write_threshold_disable.  */
>> +    bdrv_drained_begin(bs);
> 
> Is there documentation that says it's safe to call
> bdrv_drained_begin(bs) without the AioContext acquired? AIO_WAIT_WHILE()
> contradicts this:
> 
>    The caller's thread must be the IOThread that owns @ctx or the main loop
>    thread (with @ctx acquired exactly once).
>            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 

Well, I cannot remember why this patch was correct at the time I was
working on them.  Patches 7/8 on the other hand were okay because back
then bdrv_reopen_multiple() called bdrv_drain_all_begin().

Overall, what survives of this series is patches 5 and 6, plus Vladimir's
take on the write threshold code.

Anyway, things have gotten _more_ broken in the meanwhile, and this is
probably what causes the deadlocks that Emanuele has seen with the
remainder of the branch.  Since this patch:

     commit aa1361d54aac43094b98024b8b6c804eb6e41661
     Author: Kevin Wolf <kwolf@redhat.com>
     Date:   Fri Aug 17 18:54:18 2018 +0200

     block: Add missing locking in bdrv_co_drain_bh_cb()
     
     bdrv_do_drained_begin/end() assume that they are called with the
     AioContext lock of bs held. If we call drain functions from a coroutine
     with the AioContext lock held, we yield and schedule a BH to move out of
     coroutine context. This means that the lock for the home context of the
     coroutine is released and must be re-acquired in the bottom half.
     
     Signed-off-by: Kevin Wolf <kwolf@redhat.com>
     Reviewed-by: Max Reitz <mreitz@redhat.com>

AIO_WAIT_WHILE  is going down a path that does not do the release/acquire of
the AioContext, which can and will cause deadlocks when the main thread
tries to acquire the AioContext and the I/O thread is in bdrv_co_drain.

The message that introduced it does not help very much
(https://mail.gnu.org/archive/html/qemu-devel/2018-09/msg01687.html)
but I think this must be reverted.

The next steps however should have less problems with bitrot, in particular
the snapshots have changed very little.  Block job code is very different
but it is all in the main thread.

Paolo



  reply	other threads:[~2021-05-05 11:34 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-19  8:55 [PATCH v2 0/8] Block layer thread-safety, continued Emanuele Giuseppe Esposito
2021-04-19  8:55 ` [PATCH v2 1/8] block: prepare write threshold code for thread safety Emanuele Giuseppe Esposito
2021-05-05  8:50   ` Stefan Hajnoczi
2021-05-05  9:54     ` Vladimir Sementsov-Ogievskiy
2021-05-06  9:04       ` Stefan Hajnoczi
2021-05-06  9:14         ` Vladimir Sementsov-Ogievskiy
2021-04-19  8:55 ` [PATCH v2 2/8] block: protect write threshold QMP commands from concurrent requests Emanuele Giuseppe Esposito
2021-05-05  8:55   ` Stefan Hajnoczi
2021-05-05 11:29     ` Paolo Bonzini [this message]
2021-04-19  8:55 ` [PATCH v2 3/8] util: use RCU accessors for notifiers Emanuele Giuseppe Esposito
2021-05-05  9:47   ` Stefan Hajnoczi
2021-04-19  8:55 ` [PATCH v2 4/8] block: make before-write notifiers thread-safe Emanuele Giuseppe Esposito
2021-04-21 21:23   ` Vladimir Sementsov-Ogievskiy
2021-04-21 22:17     ` Vladimir Sementsov-Ogievskiy
2021-04-19  8:55 ` [PATCH v2 5/8] block: add a few more notes on locking Emanuele Giuseppe Esposito
2021-05-05  9:53   ` Stefan Hajnoczi
2021-04-19  8:55 ` [PATCH v2 6/8] block: do not acquire AioContext in check_to_replace_node Emanuele Giuseppe Esposito
2021-05-05 10:10   ` Paolo Bonzini
2021-04-19  8:55 ` [PATCH v2 7/8] block/replication: do not acquire AioContext Emanuele Giuseppe Esposito
2021-05-05  9:57   ` Stefan Hajnoczi
2021-05-05 10:33   ` Paolo Bonzini
2021-04-19  8:55 ` [PATCH v2 8/8] block: do not take AioContext around reopen Emanuele Giuseppe Esposito
2021-04-21 12:24   ` Paolo Bonzini
2021-05-05 10:01   ` Stefan Hajnoczi
2021-04-21 12:25 ` [PATCH v2 0/8] Block layer thread-safety, continued Paolo Bonzini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=36588682-2390-9d1e-dfe1-33827b1fcf90@redhat.com \
    --to=pbonzini@redhat.com \
    --cc=armbru@redhat.com \
    --cc=eesposit@redhat.com \
    --cc=fam@euphon.net \
    --cc=jsnow@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.