From: Emanuele Giuseppe Esposito <eesposit@redhat.com>
To: Vladimir Sementsov-Ogievskiy <v.sementsov-og@mail.ru>,
	qemu-block@nongnu.org
Cc: Fam Zheng <fam@euphon.net>, Kevin Wolf <kwolf@redhat.com>,
	qemu-devel@nongnu.org, Hanna Reitz <hreitz@redhat.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>, John Snow <jsnow@redhat.com>
Subject: Re: [RFC PATCH 0/5] Removal of AioContext lock, bs->parents and ->children: proof of concept
Date: Wed, 30 Mar 2022 11:09:37 +0200
Message-ID: <dd644d13-720f-c720-83bc-bab291b45d7b@redhat.com>
In-Reply-To: <160b0358-b96b-c1ff-e08f-7488366a1755@mail.ru>

>
> Ah, it seems I understand what you mean.
>
> One of my arguments is that "drain" is not a lock against other
> clients who want to modify the graph, because a drained section allows
> nested drained sections.
>
> And you try to solve that by draining more things; this way, we also
> drain the job, which is a possible client that may want to modify the
> graph in parallel.
>
> So, in other words, when we want to modify the graph, we drain the
> whole connected component of the graph, and we assume we are safe from
> other graph modifications because all related jobs are drained.
> Interesting: is it possible that some non-drained job from another
> connected component will want to connect some node from our drained
> component?

You mean another job, or whatever else, calling bdrv_find_node() on a
random graph? Yes, that is not protected. But can this happen?

That's the question. What are the invariants here? Can anything happen?

>
> I still feel that draining is the wrong mechanism to avoid
> interaction with other clients who want to modify the graph, because:
>
> 1. we stop all I/O on the whole subgraph, which is not necessary
> 2. draining is not a mutex: it allows nesting, and it is fine for two
> different clients to drain the same nodes. Draining is just a
> requirement to do no I/O at these nodes.
>
> And with your approach, it seems that to be absolutely safe we would
> need to drain everything...
>
> My feeling is that it's better to keep draining what it is now: a
> requirement to have no I/O requests. To isolate graph modifications
> from each other, add a new synchronization mechanism, something like
> a global queue, where clients who want access to graph modifications
> wait for their turn.

This is a matter of definitions. Subtree drains can theoretically work;
I managed to answer my own doubts in the last email I sent.

Yes, there are still completely arbitrary cases like the one I wrote
above, but I think it is more a matter of which mechanism we want to
use and what meaning we want to give to drains.

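To make it concrete, the pattern I mean by "drains as the lock" around
a graph write is roughly the sketch below. This is only an
illustration: swap_node() is a made-up helper and error handling is
simplified; only bdrv_subtree_drained_begin/end() and
bdrv_replace_node() are the existing APIs.

#include "qemu/osdep.h"
#include "qapi/error.h"
#include "block/block.h"

/* Made-up helper, just to show the shape of a graph write under drain. */
static void swap_node(BlockDriverState *bs, BlockDriverState *new_bs,
                      Error **errp)
{
    bdrv_subtree_drained_begin(bs);
    /*
     * All I/O in bs's subtree is quiesced here, so ->parents and
     * ->children can be modified without racing against requests.
     */
    bdrv_replace_node(bs, new_bs, errp);
    bdrv_subtree_drained_end(bs);
}
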
Global queue is what Kevin proposes, I will try to implement it.

> 
> 
> As I understand it:
>
> You want to make the drained section a kind of lock, so that if we
> take this lock, we can modify the graph and be sure that no other
> client will modify it in parallel.

Yes

> 
> But drained sections can be nested. So, to solve the problem, you try
> to drain more nodes: include the subtree, for example, or maybe drain
> the whole connected component of the graph, or (to be safer) the
> whole block layer (to be sure that, during a drained section in one
> connected component, some non-drained block job from another
> connected component will not try to attach some node from our drained
> component)...
> 
> I still feel that draining is the wrong tool for isolating
> graph-modifying operations from each other:
>
> 1. Drained sections can be nested, so by nature they are not a kind
> of lock. They are just a requirement to have no I/O requests, and
> several clients may want this condition on the same set of nodes.
>
> 2. Blocking I/O on the whole connected subgraph, or even on the whole
> block layer graph, is not necessary, so that's extra blocking.
> 
> 
> Could we instead do the following?
>
> 1. Keep draining as it is: a mechanism to stop I/O on some nodes.
>
> 2. To isolate graph-modifying operations, implement another
> mechanism: something like a global queue, where clients wait until
> they get access to modify the block layer.
> 
> 
> This way, any graph-modifying process would look like this:
>
> 1. drained_begin() (only where necessary, not the whole subgraph in
> general)
>
> 2. wait in the global queue
>
> 3. OK, now we can do all the modifications
>
> 4. kick the global queue, so that the next client gets access
>
> 5. drained_end()
> 
> 

Please have a look at what Kevin proposed (as described by me); I think
it's the same as what you are suggesting. I am pasting it below.
I will try to implement it and see whether it is doable.

I think the advantage of drains is that they aren't so complicated and
don't add any complexity to the existing code.
But we'll see how it goes with the global queue.
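For concreteness, below is a very rough sketch of how I currently
picture the global queue. Everything in it is invented (the names, the
single active-writer flag, the CoQueue-based waiting); it is only meant
to illustrate steps 1-5 above and it glosses over AioContext and
coroutine details.

#include "qemu/osdep.h"
#include "qemu/coroutine.h"
#include "block/block.h"

/* Invented: one active "graph writer" at a time, plus a queue of
 * waiting writers.  graph_mod_queue must be initialized somewhere
 * with qemu_co_queue_init(). */
static bool graph_writer_active;
static CoQueue graph_mod_queue;

static void coroutine_fn graph_mod_lock(void)
{
    while (graph_writer_active) {
        qemu_co_queue_wait(&graph_mod_queue, NULL);
    }
    graph_writer_active = true;
}

static void graph_mod_unlock(void)
{
    graph_writer_active = false;
    qemu_co_queue_next(&graph_mod_queue);   /* wake the next waiter */
}

/* What a graph-modifying client would then do (steps 1-5 above): */
static void coroutine_fn modify_graph_example(BlockDriverState *bs)
{
    bdrv_drained_begin(bs);     /* 1. stop I/O only where necessary */
    graph_mod_lock();           /* 2. wait for our turn             */
    /* 3. ... modify ->parents / ->children ... */
    graph_mod_unlock();         /* 4. let the next client in        */
    bdrv_drained_end(bs);       /* 5. resume I/O                    */
}

Whether draining "only where necessary" in step 1 is enough to also
cover the bdrv_find_node()-style accesses you mention above is exactly
what I hope to find out while prototyping this.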

> His idea is to replicate what blk_wait_while_drained() currently
> does, but on a larger scale. It is something in between this
> subtree_drains logic and a rwlock.
>
> Basically, if I understood correctly, we could implement
> bdrv_wait_while_drained() and put it in all the places where we would
> put a read lock: all the reads of ->parents and ->children.
> This function detects whether the bdrv is under drain, and if so it
> stops and waits until the drain (i.e. the graph modification)
> finishes.
> On the other side, each writer would probably just need to drain both
> nodes (a simple drain) to signal that it is modifying the graph. Once
> bdrv_drained_begin() finishes, we are sure all coroutines are
> stopped. Once bdrv_drained_end() finishes, we automatically let all
> coroutines restart and continue where they left off.
> 
> It seems a good compromise between drains and a rwlock. What do you
> think?
>
> I am not sure how painful it will be to implement, though.

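To show the reader side of that idea, here is how I picture
bdrv_wait_while_drained(), modeled on what blk_wait_while_drained()
does today. Again, only a sketch: bs->quiesce_counter is a real field,
but queued_clients is invented, and the real version would have to
deal with in-flight counting, AioContexts and so on.

#include "qemu/osdep.h"
#include "qemu/coroutine.h"
#include "block/block_int.h"

/* Sketch: called before every read of bs->parents / bs->children.
 * queued_clients is an invented per-node CoQueue of parked readers. */
static void coroutine_fn bdrv_wait_while_drained(BlockDriverState *bs)
{
    while (bs->quiesce_counter > 0) {
        /* A drain (i.e. a graph modification) is in progress: park
         * this coroutine until bdrv_drained_end() kicks the queue. */
        qemu_co_queue_wait(&bs->queued_clients, NULL);
    }
}

bdrv_drained_end() would then just do a qemu_co_queue_restart_all() on
that queue. But as said above, I am not sure how painful the details
will be.
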
Emanuele


