From: Kevin Wolf <kwolf@redhat.com>
To: Denis Plotnikov <dplotnikov@virtuozzo.com>
Cc: mreitz@redhat.com, stefanha@redhat.com, den@openvz.org,
	vsementsov@virtuozzo.com, famz@redhat.com,
	qemu-stable@nongnu.org, qemu-block@nongnu.org,
	qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH v0 2/2] block: postpone the coroutine executing if the BDS's is drained
Date: Wed, 12 Sep 2018 15:15:03 +0200
Message-ID: <20180912131503.GD5846@localhost.localdomain>
In-Reply-To: <81346e2a-5ce2-50d3-791e-3cab5e2464d9@virtuozzo.com>

On 12.09.2018 at 14:03, Denis Plotnikov wrote:
> On 10.09.2018 15:41, Kevin Wolf wrote:
> > On 29.06.2018 at 14:40, Denis Plotnikov wrote:
> > > Fixes the problem of ide request appearing when the BDS is in
> > > the "drained section".
> > > 
> > > Without the patch the request can come and be processed by the main
> > > event loop, as the ide requests are processed by the main event loop
> > > and the main event loop doesn't stop when its context is in the
> > > "drained section".
> > > The request execution is postponed until the end of "drained section".
> > > 
> > > The patch modifies neither ide-specific code nor any other device
> > > code. Instead, it modifies the infrastructure of asynchronous
> > > Block Backend requests so that requests arriving during a
> > > "drained section" are postponed, which removes the possibility of
> > > such requests appearing for all clients of the infrastructure.
> > > 
> > > This approach doesn't make the vCPU processing the request wait until
> > > the end of request processing.
> > > 
> > > Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
> > 
> > I generally agree with the idea that requests should be queued during a
> > drained section. However, I think there are a few fundamental problems
> > with the implementation in this series:
> > 
> > 1) aio_disable_external() is already a layering violation and we'd like
> >     to get rid of it (by replacing it with a BlockDevOps callback from
> >     BlockBackend to the devices), so adding more functionality there
> >     feels like a step in the wrong direction.
> > 
> > 2) Only blk_aio_* are fixed, while we also have synchronous public
> >     interfaces (blk_pread/pwrite) as well as coroutine-based ones
> >     (blk_co_*). They need to be postponed as well.
> Good point! Thanks!
> > 
> >     blk_co_preadv/pwritev() are the common point in the call chain for
> >     all of these variants, so this is where the fix needs to live.
> Using the common point might be a good idea, but in the case of aio
> requests we also have to mane completions, which are out of the scope of
> blk_co_p(read|write)v:

I don't understand what you mean here (possibly because I fail to
understand the word "mane") and what completions have to do with
queueing of requests.

Just to clarify, we are talking about the following situation, right?
bdrv_drain_all_begin() has returned, so all the old requests have
already been drained and their completion callback has already been
called. For any new requests that come in, we need to queue them until
the drained section ends. In other words, they won't reach the point
where they could possibly complete before .drained_end.

> static void blk_aio_write_entry(void *opaque) {
>     ...
>     rwco->ret = blk_co_pwritev(...);
> 
>     blk_aio_complete(acb);
>     ...
> }
> 
> This makes the difference.
> I would suggest making the synchronous read/write in blk_prw wait
> until "drained_end" is done.

It is possible, but then the management becomes a bit more complicated
because you have more than just a list of Coroutines that you need to
wake up.

One thing that could be problematic in blk_co_preadv/pwritev is that
blk->in_flight would count even requests that are queued if we're not
careful. Then a nested drain would deadlock because the BlockBackend
would never say that it's quiesced.
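
The accounting concern above can be illustrated with a small toy model.
This is only a sketch, not QEMU code: ToyBackend and all toy_* functions
are hypothetical names, and a plain counter of parked requests stands in
for the list of coroutines. The point it shows is that a request parked
during a drained section is not counted in in_flight, so a nested drain
still sees the backend as quiesced.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a BlockBackend; not the real QEMU struct. */
typedef struct ToyBackend {
    int quiesce_counter;   /* > 0 while inside a drained section */
    int in_flight;         /* requests that are actually being processed */
    size_t n_queued;       /* requests parked until the drain ends */
} ToyBackend;

/* Returns 1 if the request started, 0 if it was queued instead. */
static int toy_submit(ToyBackend *blk)
{
    if (blk->quiesce_counter > 0) {
        /* Queued requests must NOT be counted as in flight. */
        blk->n_queued++;
        return 0;
    }
    blk->in_flight++;
    return 1;
}

/* A started request finished. */
static void toy_complete(ToyBackend *blk)
{
    assert(blk->in_flight > 0);
    blk->in_flight--;
}

/* What a drain polls for: nothing is being processed any more. */
static int toy_is_quiesced(const ToyBackend *blk)
{
    return blk->in_flight == 0;
}

static void toy_drained_begin(ToyBackend *blk)
{
    blk->quiesce_counter++;
}

/* Returns the number of queued requests resumed by this call. */
static size_t toy_drained_end(ToyBackend *blk)
{
    size_t resumed = 0;

    assert(blk->quiesce_counter > 0);
    if (--blk->quiesce_counter == 0) {
        /* Outermost drain ended: parked requests start for real now. */
        resumed = blk->n_queued;
        blk->in_flight += (int)blk->n_queued;
        blk->n_queued = 0;
    }
    return resumed;
}
```

If toy_submit() incremented in_flight before checking quiesce_counter,
toy_is_quiesced() would stay false for as long as the request is parked,
which is the deadlock described above.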

>
> > 3) Within a drained section, you want requests from other users to be
> >     blocked, but not your own ones (essentially you want exclusive
> >     access). We don't have blk_drained_begin/end() yet, so this is not
> >     something to implement right now, but let's keep this requirement in
> >     mind and choose a design that allows this.
> There is an idea to distinguish the requests that should be done without
> respect to "drained section" by using a flag in BdrvRequestFlags. The
> requests with a flag set should be processed anyway.

I don't think that would work because the accesses can be nested quite
deeply in functions that can be called from anywhere.

But possibly all of the interesting cases are directly calling BDS
functions anyway and not BlockBackend.

> > I believe the whole logic should be kept local to BlockBackend, and
> > blk_root_drained_begin/end() should be the functions that start queuing
> > requests or let queued requests resume.
> > 
> > As we are already in coroutine context in blk_co_preadv/pwritev(), after
> > checking that blk->quiesce_counter > 0, we can enter the coroutine
> > object into a list and yield. blk_root_drained_end() calls aio_co_wake()
> > for each of the queued coroutines. This should be all that we need to
> > manage.
> In my understanding, by using bdrv_drained_begin/end we want to protect a
> certain BlockDriverState from external access, but not the whole BlockBackend,
> which may involve using a number of BlockDriverState-s.
> I thought so because we could possibly change a backing file for some
> BlockDriverState. And for the time of changing we need to prevent external
> access to it but keep the io going.
> By using blk_root_drained_begin/end() we put to "drained section" all the
> BlockDriverState-s linked to that root.
> Does it have to be so?

It's the other way round, actually.

In order for a BDS to be fully drained, it must make sure that it
doesn't get new requests from its parents any more. So drain propagates
towards the parents, not towards the children.

blk_root_drained_begin/end() are functions that are called when
blk->root.bs is drained.
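
As a toy illustration of that direction (again only a sketch with
hypothetical names, not the QEMU graph code, and assuming an acyclic
graph): draining a node notifies its parents recursively and leaves its
children alone.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_PARENTS 4

/* Hypothetical graph node; stands in for a BDS or a BlockBackend. */
typedef struct ToyNode ToyNode;
struct ToyNode {
    int quiesce_counter;                 /* > 0 while drained */
    ToyNode *parents[TOY_MAX_PARENTS];   /* users that send us requests */
    size_t n_parents;
};

/* Record that 'parent' uses 'child'. Returns 0 on success, -1 if full. */
static int toy_add_parent(ToyNode *child, ToyNode *parent)
{
    if (child->n_parents == TOY_MAX_PARENTS) {
        return -1;
    }
    child->parents[child->n_parents++] = parent;
    return 0;
}

/* Drain a node: every parent must stop sending requests, recursively. */
static void toy_node_drained_begin(ToyNode *node)
{
    node->quiesce_counter++;
    for (size_t i = 0; i < node->n_parents; i++) {
        toy_node_drained_begin(node->parents[i]);
    }
}

/* Undo one toy_node_drained_begin() on the same node. */
static void toy_node_drained_end(ToyNode *node)
{
    assert(node->quiesce_counter > 0);
    node->quiesce_counter--;
    for (size_t i = 0; i < node->n_parents; i++) {
        toy_node_drained_end(node->parents[i]);
    }
}
```

With a chain backend -> overlay -> backing file, draining the backing
file quiesces the overlay and the backend, while draining only the
overlay leaves the backing file untouched.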

Kevin
