All of lore.kernel.org
 help / color / mirror / Atom feed
From: Denis Plotnikov <dplotnikov@virtuozzo.com>
To: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [Qemu-block] [PATCH v0 0/2] Postponed actions
Date: Mon, 20 Aug 2018 10:40:00 +0300	[thread overview]
Message-ID: <9ff040b2-8dcf-2cf0-76e6-fbef4e1da01e@virtuozzo.com> (raw)
In-Reply-To: <ae864bb4-257e-240c-eeab-3c9c13462c0d@virtuozzo.com>

ping ping!

On 14.08.2018 10:08, Denis Plotnikov wrote:
> 
> 
> On 13.08.2018 19:30, Kevin Wolf wrote:
>> Am 13.08.2018 um 10:32 hat Denis Plotnikov geschrieben:
>>> Ping ping!
>>>
>>> On 16.07.2018 21:59, John Snow wrote:
>>>>
>>>>
>>>> On 07/16/2018 11:01 AM, Denis Plotnikov wrote:
>>>>> Ping!
>>>>>
>>>>
>>>> I never saw a reply to Stefan's question on July 2nd, did you reply
>>>> off-list?
>>>>
>>>> --js
>>> Yes, I did. I talked to Stefan why the patch set appeared.
>>
>> The rest of us still don't know the answer. I had the same question.
>>
>> Kevin
> Yes, that's my fault. I should have post it earlier.
> 
> I reviewed the problem once again and come up with the following 
> explanation.
> Indeed, if the global lock has been taken by the main thread the vCPU 
> threads won't be able to execute mmio ide.
> But, if the main thread will release the lock then nothing will prevent
> vCPU threads form execution what they want, e.g writing to the block 
> device.
> 
> In case of running the mirroring it is possible. Let's take a look
> at the following snippet of mirror_run. This is a part the mirroring 
> completion part.
> 
>              bdrv_drained_begin(bs);
>              cnt = bdrv_get_dirty_count(s->dirty_bitmap);
>  >>>>>>      if (cnt > 0 || mirror_flush(s) < 0) {
>                  bdrv_drained_end(bs);
>                  continue;
>              }
> 
> (X) >>>>    assert(QLIST_EMPTY(&bs->tracked_requests));
> 
> mirror_flush here can yield the current coroutine so nothing more can be 
> executed.
> We could end up with the situation when the main loop have to revolve to 
> poll for another timer/bh to process. While revolving it releases the 
> global lock. If the global lock is waited for by a vCPU (any other) 
> thread, the waiting thread will get the lock and make what it intends.
> 
> This is something that I can observe:
> 
> mirror_flush yields coroutine, the main thread revolves and locks 
> because a vCPU was waiting for the lock. Now the vCPU thread owns the 
> lock and the main thread waits for the lock releasing.
> The vCPU thread does cmd_write_dma and releases the lock. Then, the main
> thread gets the lock and continues to run eventually proceeding with the 
> coroutine yeiled.
> If the vCPU requests aren't completed by the moment we will assert at 
> (X). If the vCPU requests are completed we won't even notice that we had 
> some writes while in the drained section.
> 
> Denis
>>
>>>>> On 29.06.2018 15:40, Denis Plotnikov wrote:
>>>>>> There are cases when a request to a block driver state shouldn't have
>>>>>> appeared producing dangerous race conditions.
>>>>>> This misbehaviour is usually happens with storage devices emulated
>>>>>> without eventfd for guest to host notifications like IDE.
>>>>>>
>>>>>> The issue arises when the context is in the "drained" section
>>>>>> and doesn't expect the request to come, but request comes from the
>>>>>> device not using iothread and which context is processed by the main
>>>>>> loop.
>>>>>>
>>>>>> The main loop apart of the iothread event loop isn't blocked by the
>>>>>> "drained" section.
>>>>>> The request coming and processing while in "drained" section can 
>>>>>> spoil
>>>>>> the
>>>>>> block driver state consistency.
>>>>>>
>>>>>> This behavior can be observed in the following KVM-based case:
>>>>>>
>>>>>> 1. Setup a VM with an IDE disk.
>>>>>> 2. Inside a VM start a disk writing load for the IDE device
>>>>>>      e.g: dd if=<file> of=<file> bs=X count=Y oflag=direct
>>>>>> 3. On the host create a mirroring block job for the IDE device
>>>>>>      e.g: drive_mirror <your_IDE> <your_path>
>>>>>> 4. On the host finish the block job
>>>>>>      e.g: block_job_complete <your_IDE>
>>>>>>     Having done the 4th action, you could get an assert:
>>>>>> assert(QLIST_EMPTY(&bs->tracked_requests)) from mirror_run.
>>>>>> On my setup, the assert is 1/3 reproducible.
>>>>>>
>>>>>> The patch series introduces the mechanism to postpone the requests
>>>>>> until the BDS leaves "drained" section for the devices not using
>>>>>> iothreads.
>>>>>> Also, it modifies the asynchronous block backend infrastructure to 
>>>>>> use
>>>>>> that mechanism to release the assert bug for IDE devices.
>>>>>>
>>>>>> Denis Plotnikov (2):
>>>>>>      async: add infrastructure for postponed actions
>>>>>>      block: postpone the coroutine executing if the BDS's is drained
>>>>>>
>>>>>>     block/block-backend.c | 58 
>>>>>> ++++++++++++++++++++++++++++++---------
>>>>>>     include/block/aio.h   | 63 
>>>>>> +++++++++++++++++++++++++++++++++++++++++++
>>>>>>     util/async.c          | 33 +++++++++++++++++++++++
>>>>>>     3 files changed, 142 insertions(+), 12 deletions(-)
>>>>>>
>>>>>
>>>
>>> -- 
>>> Best,
>>> Denis
> 

-- 
Best,
Denis

  reply	other threads:[~2018-08-20  7:40 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-06-29 12:40 [Qemu-devel] [PATCH v0 0/2] Postponed actions Denis Plotnikov
2018-06-29 12:40 ` [Qemu-devel] [PATCH v0 1/2] async: add infrastructure for postponed actions Denis Plotnikov
2018-06-29 12:40 ` [Qemu-devel] [PATCH v0 2/2] block: postpone the coroutine executing if the BDS's is drained Denis Plotnikov
2018-09-10 12:41   ` Kevin Wolf
2018-09-12 12:03     ` Denis Plotnikov
2018-09-12 13:15       ` Kevin Wolf
2018-09-12 14:53         ` Denis Plotnikov
2018-09-12 15:09           ` Kevin Wolf
2018-09-12 17:03         ` Denis V. Lunev
2018-09-13  8:44           ` Kevin Wolf
2018-07-02  1:47 ` [Qemu-devel] [PATCH v0 0/2] Postponed actions no-reply
2018-07-02 15:18 ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
2018-07-17 10:31   ` Stefan Hajnoczi
2018-07-16 15:01 ` [Qemu-devel] " Denis Plotnikov
2018-07-16 18:59   ` [Qemu-devel] [Qemu-block] " John Snow
2018-07-18  7:53     ` Denis Plotnikov
2018-08-13  8:32     ` Denis Plotnikov
2018-08-13 16:30       ` Kevin Wolf
2018-08-14  7:08         ` Denis Plotnikov
2018-08-20  7:40           ` Denis Plotnikov [this message]
2018-08-20  7:42           ` Denis Plotnikov
2018-08-27  7:05           ` Denis Plotnikov
2018-08-27 16:05             ` John Snow
2018-08-28 10:23               ` Denis Plotnikov
2018-09-10 10:11                 ` Denis Plotnikov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9ff040b2-8dcf-2cf0-76e6-fbef4e1da01e@virtuozzo.com \
    --to=dplotnikov@virtuozzo.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.