* Re: How to improve downtime of Live-Migration caused by bdrv_drain_all()
       [not found] ` <20190328170759.GH18536@stefanha-x1.localdomain>
@ 2019-12-26  9:40   ` 张海斌
  2020-01-02 15:07     ` Stefan Hajnoczi
  0 siblings, 1 reply; 3+ messages in thread
From: 张海斌 @ 2019-12-26  9:40 UTC (permalink / raw)
  To: qemu-devel; +Cc: stefanha

Stefan Hajnoczi <stefanha@redhat.com> wrote on Fri, Mar 29, 2019 at 1:08 AM:
>
> On Thu, Mar 28, 2019 at 05:53:34PM +0800, 张海斌 wrote:
> > Hi, Stefan,
> >
> > I have run into the same problem you described in
> > https://lists.gnu.org/archive/html/qemu-devel/2016-08/msg04025.html
> >
> > Steps to reproduce:
> > 1. Clone the QEMU code from https://git.qemu.org/git/qemu.git, add some
> > debug information, and compile
> > 2. Start a new VM
> > 3. In the VM, run fio randwrite to put the disk under heavy write load
> > 4. Start a live migration (example commands below)
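> >
> > For reference, a fio job and a migration command along these lines are
> > enough to reproduce this (the exact parameters, disk path and destination
> > address are just an example):
> >
> >     # inside the guest: sustained 4k random writes against the data disk
> >     fio --name=randwrite --rw=randwrite --bs=4k --iodepth=64 \
> >         --ioengine=libaio --direct=1 --filename=/dev/vdb \
> >         --time_based --runtime=300
> >
> >     # in the source QEMU monitor: start the migration in the background
> >     (qemu) migrate -d tcp:192.168.0.2:4444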
> >
> > The log shows:
> > [2019-03-28 15:10:40.206] /data/qemu/cpus.c:1086: enter do_vm_stop
> > [2019-03-28 15:10:40.212] /data/qemu/cpus.c:1097: call bdrv_drain_all
> > [2019-03-28 15:10:40.989] /data/qemu/cpus.c:1099: call replay_disable_events
> > [2019-03-28 15:10:40.989] /data/qemu/cpus.c:1101: call bdrv_flush_all
> > [2019-03-28 15:10:41.004] /data/qemu/cpus.c:1104: done do_vm_stop
> >
> > The bdrv_drain_all() call takes 792 milliseconds.
> > I tried adding a bdrv_drain_all() at the start of do_vm_stop(), before
> > pause_all_vcpus(), but it does not help.
> > Is there any way to reduce the live-migration downtime caused by bdrv_drain_all()?
> >
> > haibin
>
> Thanks for your email.  Please send technical questions to
> qemu-devel@nongnu.org and CC me.
>
> That way the discussion is archived and searchable for the future.  It
> also allows others in the community to participate and double-check any
> answers that I give.
>
> Stefan



* Re: How to improve downtime of Live-Migration caused by bdrv_drain_all()
  2019-12-26  9:40   ` How to improve downtime of Live-Migration caused by bdrv_drain_all() 张海斌
@ 2020-01-02 15:07     ` Stefan Hajnoczi
  2020-01-02 15:27       ` Felipe Franciosi
  0 siblings, 1 reply; 3+ messages in thread
From: Stefan Hajnoczi @ 2020-01-02 15:07 UTC (permalink / raw)
  To: 张海斌
  Cc: qemu-block, Vladimir Sementsov-Ogievskiy, qemu-devel, stefanha,
	Felipe Franciosi


On Thu, Dec 26, 2019 at 05:40:22PM +0800, 张海斌 wrote:
> Stefan Hajnoczi <stefanha@redhat.com> wrote on Fri, Mar 29, 2019 at 1:08 AM:
> >
> > On Thu, Mar 28, 2019 at 05:53:34PM +0800, 张海斌 wrote:
> > > Hi, Stefan,
> > >
> > > I have run into the same problem you described in
> > > https://lists.gnu.org/archive/html/qemu-devel/2016-08/msg04025.html
> > >
> > > Steps to reproduce:
> > > 1. Clone the QEMU code from https://git.qemu.org/git/qemu.git, add some
> > > debug information, and compile
> > > 2. Start a new VM
> > > 3. In the VM, run fio randwrite to put the disk under heavy write load
> > > 4. Start a live migration
> > >
> > > The log shows:
> > > [2019-03-28 15:10:40.206] /data/qemu/cpus.c:1086: enter do_vm_stop
> > > [2019-03-28 15:10:40.212] /data/qemu/cpus.c:1097: call bdrv_drain_all
> > > [2019-03-28 15:10:40.989] /data/qemu/cpus.c:1099: call replay_disable_events
> > > [2019-03-28 15:10:40.989] /data/qemu/cpus.c:1101: call bdrv_flush_all
> > > [2019-03-28 15:10:41.004] /data/qemu/cpus.c:1104: done do_vm_stop
> > >
> > > The bdrv_drain_all() call takes 792 milliseconds.
> > > I tried adding a bdrv_drain_all() at the start of do_vm_stop(), before
> > > pause_all_vcpus(), but it does not help.
> > > Is there any way to reduce the live-migration downtime caused by bdrv_drain_all()?

I believe there were ideas about throttling storage controller devices
during the later phases of live migration to reduce the number of
pending I/Os.

In other words, if QEMU's virtio-blk/scsi emulation code reduces the
queue depth as live migration nears the handover point, bdrv_drain_all()
should become cheaper because fewer I/O requests will be in-flight.

A simple solution would reduce the queue depth during live migration
(e.g. queue depth 1).  A smart solution would look at I/O request
latency to decide what queue depth is acceptable.  For example, if
requests are taking 4 ms to complete then we might allow 2 or 3 requests
to achieve a ~10 ms bdrv_drain_all() downtime target.
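
A minimal sketch of the latency-based variant, assuming a hypothetical
helper (nothing like this exists in QEMU today) that is fed the block
layer's measured average request latency:

    #include <stdint.h>

    /* Hypothetical sketch, not existing QEMU code: pick the largest
     * queue depth whose expected drain time still fits within the
     * downtime budget, based on recent average request latency. */
    static unsigned int migration_queue_depth(int64_t avg_latency_ns,
                                              int64_t downtime_budget_ns)
    {
        int64_t depth;

        if (avg_latency_ns <= 0) {
            return 1;          /* no latency data yet, stay conservative */
        }

        depth = downtime_budget_ns / avg_latency_ns;
        return depth < 1 ? 1 : (unsigned int)depth;
    }

With a 10 ms downtime budget and 4 ms requests this yields a depth of 2,
matching the numbers above.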

As far as I know this has not been implemented.

Do you want to try implementing this?

Stefan



* Re: How to improve downtime of Live-Migration caused by bdrv_drain_all()
  2020-01-02 15:07     ` Stefan Hajnoczi
@ 2020-01-02 15:27       ` Felipe Franciosi
  0 siblings, 0 replies; 3+ messages in thread
From: Felipe Franciosi @ 2020-01-02 15:27 UTC (permalink / raw)
  To: Stefan Hajnoczi, 张海斌
  Cc: qemu-block, Vladimir Sementsov-Ogievskiy, qemu-devel, Stefan Hajnoczi



> On Jan 2, 2020, at 3:07 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote:
> 
> On Thu, Dec 26, 2019 at 05:40:22PM +0800, 张海斌 wrote:
>> Stefan Hajnoczi <stefanha@redhat.com> wrote on Fri, Mar 29, 2019 at 1:08 AM:
>>> 
>>> On Thu, Mar 28, 2019 at 05:53:34PM +0800, 张海斌 wrote:
>>>> Hi, Stefan,
>>>> 
>>>> I have run into the same problem you described in
>>>> https://lists.gnu.org/archive/html/qemu-devel/2016-08/msg04025.html
>>>> 
>>>> Steps to reproduce:
>>>> 1. Clone the QEMU code from https://git.qemu.org/git/qemu.git, add some
>>>> debug information, and compile
>>>> 2. Start a new VM
>>>> 3. In the VM, run fio randwrite to put the disk under heavy write load
>>>> 4. Start a live migration
>>>> 
>>>> The log shows:
>>>> [2019-03-28 15:10:40.206] /data/qemu/cpus.c:1086: enter do_vm_stop
>>>> [2019-03-28 15:10:40.212] /data/qemu/cpus.c:1097: call bdrv_drain_all
>>>> [2019-03-28 15:10:40.989] /data/qemu/cpus.c:1099: call replay_disable_events
>>>> [2019-03-28 15:10:40.989] /data/qemu/cpus.c:1101: call bdrv_flush_all
>>>> [2019-03-28 15:10:41.004] /data/qemu/cpus.c:1104: done do_vm_stop
>>>> 
>>>> The bdrv_drain_all() call takes 792 milliseconds.
>>>> I tried adding a bdrv_drain_all() at the start of do_vm_stop(), before
>>>> pause_all_vcpus(), but it does not help.
>>>> Is there any way to reduce the live-migration downtime caused by bdrv_drain_all()?
> 
> I believe there were ideas about throttling storage controller devices
> during the later phases of live migration to reduce the number of
> pending I/Os.
> 
> In other words, if QEMU's virtio-blk/scsi emulation code reduces the
> queue depth as live migration nears the handover point, bdrv_drain_all()
> should become cheaper because fewer I/O requests will be in-flight.
> 
> A simple solution would reduce the queue depth during live migration
> (e.g. queue depth 1).  A smart solution would look at I/O request
> latency to decide what queue depth is acceptable.  For example, if
> requests are taking 4 ms to complete then we might allow 2 or 3 requests
> to achieve a ~10 ms bdrv_drain_all() downtime target.
> 
> As far as I know this has not been implemented.
> 
> Do you want to try implementing this?
> 
> Stefan

It is a really hard problem to solve. Ultimately, if guarantees are
needed about the blackout period, I don't see any viable solution
other than aborting all pending storage commands.

Starting with a "go to QD=1 mode" approach is probably sensible.
Vhost-based backends could even switch modes in response to the "you
need to log" message, given that those messages are only used during
migration.
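
As a very rough sketch of that gate (the struct, its fields, and the
migration-phase flag below are assumptions, not existing QEMU or vhost
interfaces):

    #include <stdbool.h>

    /* Hypothetical sketch: refuse to start a new guest request once the
     * final migration phase has begun and one request is already in
     * flight, so the eventual drain only ever waits for one request. */
    typedef struct {
        unsigned int in_flight;     /* requests currently submitted */
        bool final_migration_phase; /* set when handover is imminent */
    } DeviceQueueState;

    static bool may_submit_request(const DeviceQueueState *s)
    {
        if (s->final_migration_phase && s->in_flight >= 1) {
            return false;  /* leave it queued until one request completes */
        }
        return true;
    }

The device emulation would simply stop popping new requests while this
returns false and resume once in_flight drops back to zero.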

Having a "you are taking too long, abort everything" command might be
something worth looking into, especially if we can *safely* replay the
aborted commands on the other side. (That may be backend-dependent.)

F.


