From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: "Denis V. Lunev" <den@openvz.org>
Cc: John Snow <jsnow@redhat.com>,
	Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
	qemu-devel@nongnu.org, qemu-block@nongnu.org,
	quintela@redhat.com, stefanha@redhat.com, famz@redhat.com,
	mreitz@redhat.com, kwolf@redhat.com,
	Eric Blake <eblake@redhat.com>
Subject: Re: [Qemu-devel] [PATCH 4/6] dirty-bitmaps: clean-up bitmaps loading and migration logic
Date: Thu, 2 Aug 2018 10:29:08 +0100
Message-ID: <20180802092907.GA2523@work-vm>
In-Reply-To: <6bfdc952-fb17-feae-f367-be710853d829@openvz.org>

* Denis V. Lunev (den@openvz.org) wrote:
> On 08/01/2018 09:55 PM, Dr. David Alan Gilbert wrote:
> > * Denis V. Lunev (den@openvz.org) wrote:
> >> On 08/01/2018 08:40 PM, Dr. David Alan Gilbert wrote:
> >>> * John Snow (jsnow@redhat.com) wrote:
> >>>> On 08/01/2018 06:20 AM, Dr. David Alan Gilbert wrote:
> >>>>> * John Snow (jsnow@redhat.com) wrote:
> >>>>>
> >>>>> <snip>
> >>>>>
> >>>>>> I'd rather do something like this:
> >>>>>> - Always flush bitmaps to disk on inactivate.
> >>>>> Does that increase the time taken by the inactivate measurably?
> >>>>> If it's small relative to everything else that's fine; it's just I
> >>>>> always worry a little since I think this happens after we've stopped the
> >>>>> CPU on the source, so is part of the 'downtime'.
> >>>>>
> >>>>> Dave
> >>>>> --
> >>>>> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> >>>>>
> >>>> I'm worried that if we don't, we're leaving behind unusable, partially
> >>>> complete files behind us. That's a bad design and we shouldn't push for
> >>>> it just because it's theoretically faster.
> >>> Oh I don't care about theoretical speed; but if it's actually unusably
> >>> slow in practice then it needs fixing.
> >>>
> >>> Dave
> >> This is not "theoretical" speed. This is real practical speed and
> >> instability.
> >> EACH IO operation can be performed unpredictably slow and thus with
> >> IO operations in mind you can not even calculate or predict downtime,
> >> which should be done according to the migration protocol.
> > We end up doing some IO anyway, even ignoring these new bitmaps,
> > at the end of the migration when we pause the CPU, we do a
> > bdrv_inactivate_all to flush any outstanding writes; so we've already
> > got that unpredictable slowness.
> >
> > So, not being a block person, but with some interest in making sure
> > downtime doesn't increase, I just wanted to understand whether the
> > amount of writes we're talking about here is comparable to that
> > which already exists or a lot smaller or a lot larger.
> > If the amount of IO you're talking about is much smaller than what
> > we typically already do, then John has a point and you may as well
> > do the write.
> > If the amount of IO for the bitmap is much larger and would slow
> > the downtime a lot then you've got a point and that would be unworkable.
> >
> > Dave
> This is not theoretical difference.
> 
> For 1 Tb drive and 64 kb bitmap granularity the size of bitmap is
> 2 Mb + some metadata (64 Kb). Thus we will have to write
> 2 Mb of data per bitmap.

OK, this was about my starting point; I think your Mb here is MByte, not
Mbit; so assuming a drive of 200MByte/s, that's 2/200 = 1/100th of a
second = 10ms. Now 10ms, I'd say, is small enough not to worry about downtime
increases, since the number we normally hope for is in the 300ms-ish
range.
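
To make that sum concrete, here is a minimal sketch in Python. It assumes
one bit per granularity-sized chunk, binary units (TiB, KiB, MiB), and a
drive that really sustains 200MByte/s; the function names are made up for
illustration and are not QEMU code.

```python
KIB = 1024
MIB = 1024 ** 2
TIB = 1024 ** 4


def bitmap_bytes(disk_bytes, granularity_bytes):
    """Size of a dirty bitmap with one bit per granularity-sized chunk."""
    bits = -(-disk_bytes // granularity_bytes)  # ceiling division
    return -(-bits // 8)                        # round up to whole bytes


def write_time_ms(nbytes, throughput_bytes_per_s):
    """Time to write nbytes at a steady throughput, in milliseconds."""
    return nbytes * 1000 / throughput_bytes_per_s


# 1 TiB disk, 64 KiB granularity, assumed 200 MiB/s drive
size = bitmap_bytes(1 * TIB, 64 * KIB)
time_ms = write_time_ms(size, 200 * MIB)
print(size // MIB, "MiB per bitmap,", time_ms, "ms to write")
```

This ignores the 64 Kb of metadata mentioned above and any seek or flush
latency, so it is a lower bound rather than a prediction.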

> In some cases there are 2-3-5 bitmaps,
> thus we will have 10 Mb of data.

OK, remembering I'm not a block person, can you just explain why
you need 5 bitmaps?
But with 5 bitmaps that's 50ms, which is starting to get worrying.

> With 16 Tb drive the amount of
> data to write will be multiplied by 16 which gives 160 Mb to
> write. More disks and bigger the size - more data to write.

Yes, and at 200MByte/s that's getting on for a second (160/200 = 0.8s),
which is way too big.
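
The scaling can be tabulated with a rough sketch like the one below. The
assumptions are mine: 2 MiB of bitmap per TiB of disk (64 KiB granularity),
a steady 200 MiB/s, and 300ms as the downtime we'd normally hope for; the
names are hypothetical.

```python
MIB = 1024 ** 2
BITMAP_BYTES_PER_TIB = 2 * MIB   # 64 KiB granularity, one bit per chunk
THROUGHPUT = 200 * MIB           # assumed sustained write rate, bytes/s
BUDGET_MS = 300                  # rough downtime target


def flush_ms(disk_tib, n_bitmaps):
    """Time to write every bitmap for one disk in full, in milliseconds."""
    total_bytes = disk_tib * n_bitmaps * BITMAP_BYTES_PER_TIB
    return total_bytes * 1000 / THROUGHPUT


for disk_tib, n_bitmaps in [(1, 1), (1, 5), (16, 5)]:
    ms = flush_ms(disk_tib, n_bitmaps)
    share = 100 * ms / BUDGET_MS
    print(f"{disk_tib:>2} TiB, {n_bitmaps} bitmap(s): "
          f"{ms:.0f} ms ({share:.0f}% of budget)")
```

The cost is linear in both disk size and bitmap count, which is why the
16 Tb case blows well past the budget on its own.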

(Although that feels like you could fix it by adding bitmaps on your
bitmaps hierarchically so you didn't write them all; but that's
getting way more complex).
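
One reading of that idea, as a sketch only: split each bitmap into
fixed-size chunks and keep a second-level record of which chunks changed
since the last flush, so the flush writes just those. Nothing here is
existing QEMU code; the class, the 4 KiB chunk size and the write_chunk
callback are all hypothetical, and a Python set stands in for the
second-level bitmap.

```python
class ChunkedBitmap:
    """Dirty bitmap plus a record of which of its chunks need writing."""

    def __init__(self, nbits, chunk_bytes=4096):
        self.chunk_bytes = chunk_bytes
        self.data = bytearray(-(-nbits // 8))  # one bit per tracked block
        self.changed_chunks = set()            # second level: chunk indexes

    def set_bit(self, i):
        byte, mask = i // 8, 1 << (i % 8)
        if not self.data[byte] & mask:
            self.data[byte] |= mask
            # Remember that this chunk of the bitmap differs from disk.
            self.changed_chunks.add(byte // self.chunk_bytes)

    def flush(self, write_chunk):
        """Write only changed chunks; return the number of bytes written."""
        written = 0
        for chunk in sorted(self.changed_chunks):
            offset = chunk * self.chunk_bytes
            payload = bytes(self.data[offset:offset + self.chunk_bytes])
            write_chunk(offset, payload)
            written += len(payload)
        self.changed_chunks.clear()
        return written


# 2 MiB bitmap (1 TiB disk at 64 KiB granularity), three bits dirtied
bm = ChunkedBitmap(16 * 1024 * 1024)
for bit in (0, 1, 5_000_000):
    bm.set_bit(bit)

writes = []
written = bm.flush(lambda offset, payload: writes.append(offset))
print(written, "bytes written instead of", len(bm.data))
```

Whether this helps depends on how scattered the dirty bits are; a workload
that touches every chunk still writes the whole bitmap.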

> Above amount should be multiplied by 2 - x Mb to be written
> on source, x Mb to be read on target which gives 320 Mb to
> write.
> 
> That is why this is not good - we have linear increase with the
> size and amount of disks.
> 
> There is also some thoughts on normal guest IO. Theoretically
> we can think on replaying IO on the target closing the file
> immediately or block writes to changed areas and notify
> target upon IO completion or invent other fancy dances.
> At least we think right now on these optimizations for regular
> migration paths.
> 
> The problem is that such things are not needed now for CBT
> but will become necessary and pretty much useless upon
> introducing this stuff.

I don't quite understand the last two paragraphs.

However, coming back to my question: my point was that
normal guest IO at the end of the migration will cause
a delay. I'd expect that delay to be fairly unrelated to the size
of the disk and more to do with the workload; so I guess in your case
the worry is large disks giving large bitmaps.

Dave

> Den
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
