All of lore.kernel.org
 help / color / mirror / Atom feed
From: Stefan Hajnoczi <stefanha@redhat.com>
To: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Cc: Emanuele Giuseppe Esposito <eesposit@redhat.com>,
	Fam Zheng <fam@euphon.net>,
	qemu-block@nongnu.org, qemu-devel@nongnu.org,
	Hanna Reitz <hreitz@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Kevin Wolf <kwolf@redhat.com>, John Snow <jsnow@redhat.com>
Subject: Re: [RFC PATCH 0/5] Removal of AioContext lock, bs->parents and ->children: proof of concept
Date: Wed, 2 Mar 2022 16:20:39 +0000	[thread overview]
Message-ID: <Yh+ZV4zqgLvF0/Sw@stefanha-x1.localdomain> (raw)
In-Reply-To: <516a480e-15a0-896f-6ff8-4303e110210e@virtuozzo.com>

[-- Attachment #1: Type: text/plain, Size: 3683 bytes --]

On Wed, Mar 02, 2022 at 02:07:30PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> 01.03.2022 17:21, Emanuele Giuseppe Esposito wrote:
> > This serie tries to provide a proof of concept and a clear explanation
> > on why we need to use drains (and more precisely subtree_drains)
> > to replace the aiocontext lock, especially to protect BlockDriverState
> > ->children and ->parent lists.
> > 
> > Just a small recap on the key concepts:
> > * We split block layer APIs in "global state" (GS), "I/O", and
> > "global state or I/O".
> >    GS are running in the main loop, under BQL, and are the only
> >    one allowed to modify the BlockDriverState graph.
> > 
> >    I/O APIs are thread safe and can run in any thread
> > 
> >    "global state or I/O" are essentially all APIs that use
> >    BDRV_POLL_WHILE. This is because there can be only 2 threads
> >    that can use BDRV_POLL_WHILE: main loop and the iothread that
> >    runs the aiocontext.
> > 
> > * Drains allow the caller (either main loop or iothread running
> > the context) to wait all in_flights requests and operations
> > of a BDS: normal drains target a given node and is parents, while
> > subtree ones also include the subgraph of the node. Siblings are
> > not affected by any of these two kind of drains.
> > After bdrv_drained_begin, no more request is allowed to come
> > from the affected nodes. Therefore the only actor left working
> > on a drained part of the graph should be the main loop.
> > 
> > What do we intend to do
> > -----------------------
> > We want to remove the AioContext lock. It is not 100% clear on how
> > many things we are protecting with it, and why.
> > As a starter, we want to protect BlockDriverState ->parents and
> > ->children lists, since they are read by main loop and I/O but
> > only written by main loop under BQL. The function that modifies
> > these lists is bdrv_replace_child_common().
> > 
> > How do we want to do it
> > -----------------------
> > We individuated as ideal subtitute of AioContext lock
> > the subtree_drain API. The reason is simple: draining prevents the iothread to read or write the nodes, so once the main loop finishes
> 
> I'm not sure it's ideal. Unfortunately I'm not really good in all that BQL, AioContext locks and drains. So, I can't give good advice. But here are my doubts:
> 
> Draining is very restrictive measure: even if drained section is very short, at least on bdrv_drained_begin() we have to wait for all current requests, and don't start new ones. That slows down the guest. In the same time there are operations that don't require to stop guest IO requests. For example manipulation with dirty bitmaps - qmp commands block-dirty-bitmap-add block-dirty-bitmap-remove. Or different query requests..
> 
> I see only two real cases, where we do need drain:
> 
> 1. When we need a consistent "point-in-time". For example, when we start backup in transaction with some dirty-bitmap manipulation commands.
> 
> 2. When we need to modify block-graph: if we are going to break relation A -> B, there must not be any in-flight request that want to use this relation.
> 
> All other operations, for which we want some kind of lock (like AioContext lock or something) we actually don't want to stop guest IO.

Yes, I think so too.

Block drivers should already be at a point where block-dirty-bitmap-add
and similar commands can be synchronized with just the qcow2 CoMutex.

Not every aio_context_acquire() needs to be replace by a drained
section. Some can safely be dropped because there are fine-grained locks
in place that provide sufficient protection.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

  reply	other threads:[~2022-03-02 16:23 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-01 14:21 [RFC PATCH 0/5] Removal of AioContext lock, bs->parents and ->children: proof of concept Emanuele Giuseppe Esposito
2022-03-01 14:21 ` [RFC PATCH 1/5] aio-wait.h: introduce AIO_WAIT_WHILE_UNLOCKED Emanuele Giuseppe Esposito
2022-03-02 16:21   ` Stefan Hajnoczi
2022-03-01 14:21 ` [RFC PATCH 2/5] introduce BDRV_POLL_WHILE_UNLOCKED Emanuele Giuseppe Esposito
2022-03-02 16:22   ` Stefan Hajnoczi
2022-03-09 13:49   ` Eric Blake
2022-03-01 14:21 ` [RFC PATCH 3/5] block/io.c: introduce bdrv_subtree_drained_{begin/end}_unlocked Emanuele Giuseppe Esposito
2022-03-02 16:25   ` Stefan Hajnoczi
2022-03-01 14:21 ` [RFC PATCH 4/5] child_job_drained_poll: override polling condition only when in home thread Emanuele Giuseppe Esposito
2022-03-02 16:37   ` Stefan Hajnoczi
2022-03-01 14:21 ` [RFC PATCH 5/5] test-bdrv-drain: ensure draining from main loop stops iothreads Emanuele Giuseppe Esposito
2022-03-01 14:26 ` [RFC PATCH 0/5] Removal of AioContext lock, bs->parents and ->children: proof of concept Emanuele Giuseppe Esposito
2022-03-02  9:47 ` Stefan Hajnoczi
2022-03-09 13:26   ` Emanuele Giuseppe Esposito
2022-03-10 15:54     ` Stefan Hajnoczi
2022-03-17 16:23     ` Emanuele Giuseppe Esposito
2022-03-30 10:53       ` Hanna Reitz
2022-03-30 11:55         ` Emanuele Giuseppe Esposito
2022-03-30 14:12           ` Hanna Reitz
2022-03-30 16:02         ` Paolo Bonzini
2022-03-31  9:59           ` Paolo Bonzini
2022-03-31 13:51             ` Emanuele Giuseppe Esposito
2022-03-31 16:40               ` Paolo Bonzini
2022-04-01  8:05                 ` Emanuele Giuseppe Esposito
2022-04-01 11:01                   ` Paolo Bonzini
2022-04-04  9:25                     ` Stefan Hajnoczi
2022-04-04  9:41                       ` Paolo Bonzini
2022-04-04  9:51                         ` Emanuele Giuseppe Esposito
2022-04-04 10:07                           ` Paolo Bonzini
2022-04-05  9:39                         ` Stefan Hajnoczi
2022-04-05 10:43                         ` Kevin Wolf
2022-04-13 13:43                     ` Emanuele Giuseppe Esposito
2022-04-13 14:51                       ` Kevin Wolf
2022-04-13 15:14                         ` Emanuele Giuseppe Esposito
2022-04-13 15:22                           ` Emanuele Giuseppe Esposito
2022-04-13 16:29                           ` Kevin Wolf
2022-04-13 20:43                             ` Paolo Bonzini
2022-04-13 20:46                         ` Paolo Bonzini
2022-03-02 11:07 ` Vladimir Sementsov-Ogievskiy
2022-03-02 16:20   ` Stefan Hajnoczi [this message]
2022-03-09 13:26   ` Emanuele Giuseppe Esposito
2022-03-16 21:55     ` Emanuele Giuseppe Esposito
2022-03-21 12:22       ` Vladimir Sementsov-Ogievskiy
2022-03-21 15:24     ` Vladimir Sementsov-Ogievskiy
2022-03-21 15:44     ` Vladimir Sementsov-Ogievskiy
2022-03-30  9:09       ` Emanuele Giuseppe Esposito
2022-03-30  9:52         ` Vladimir Sementsov-Ogievskiy
2022-03-30  9:58           ` Emanuele Giuseppe Esposito
2022-04-05 10:55             ` Kevin Wolf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Yh+ZV4zqgLvF0/Sw@stefanha-x1.localdomain \
    --to=stefanha@redhat.com \
    --cc=eesposit@redhat.com \
    --cc=fam@euphon.net \
    --cc=hreitz@redhat.com \
    --cc=jsnow@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=vsementsov@virtuozzo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.