From: Max Reitz <mreitz@redhat.com>
To: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
	qemu-devel@nongnu.org, qemu-block@nongnu.org
Cc: kwolf@redhat.com, jsnow@redhat.com, den@openvz.org
Subject: Re: [Qemu-devel] [PATCH v4 for 2.12 0/3] fix bitmaps migration through shared storage
Date: Tue, 3 Apr 2018 18:03:43 +0200	[thread overview]
Message-ID: <d0d202f1-6384-178c-9dd7-989d5f06c48a@redhat.com> (raw)
In-Reply-To: <7be588fb-dd63-d6fd-0e39-81fbee1bdd25@virtuozzo.com>

On 2018-03-30 17:32, Vladimir Sementsov-Ogievskiy wrote:
> 30.03.2018 16:31, Vladimir Sementsov-Ogievskiy wrote:
>> 29.03.2018 18:09, Vladimir Sementsov-Ogievskiy wrote:
>>> 29.03.2018 17:03, Max Reitz wrote:
>>>> On 2018-03-29 10:08, Vladimir Sementsov-Ogievskiy wrote:
>>>>> 28.03.2018 17:53, Max Reitz wrote:
>>>>>> On 2018-03-27 12:11, Vladimir Sementsov-Ogievskiy wrote:
>>>> [...]
>>>>
>>>>>>> Isn't it because of a lot of cat processes? Will check; updating
>>>>>>> the loop to: i=0; while check -qcow2 169; do ((i++)); echo $i OK;
>>>>>>> killall -9 cat; done
>>>>>> Hmm...  I know I tried to kill all of the cats, but for some
>>>>>> reason that
>>>>>> didn't really help yesterday.  Seems to help now, for 2.12.0-rc0 at
>>>>>> least (that is, before this series).
>>>>> Reproduced even with the killing... (without this series, just on master)
>>>>>
>>>>>> After the whole series, I still get a lot of failures in 169
>>>>>> (mismatching bitmap hash, mostly).
>>>>>>
>>>>>> And interestingly, if I add an abort():
>>>>>>
>>>>>> diff --git a/block/qcow2.c b/block/qcow2.c
>>>>>> index 486f3e83b7..9204c1c0ac 100644
>>>>>> --- a/block/qcow2.c
>>>>>> +++ b/block/qcow2.c
>>>>>> @@ -1481,6 +1481,7 @@ static int coroutine_fn qcow2_do_open(BlockDriverState *bs, QDict *options,
>>>>>>      }
>>>>>>
>>>>>>      if (bdrv_dirty_bitmap_next(bs, NULL)) {
>>>>>> +        abort();
>>>>>>          /* It's some kind of reopen with already existing dirty bitmaps. There
>>>>>>           * are no known cases where we need loading bitmaps in such situation,
>>>>>>           * so it's safer don't load them.
>>>>>>
>>>>>> Then this fires for a couple of test cases of 169 even without the
>>>>>> third
>>>>>> patch of this series.
>>>>>>
>>>>>> I guess bdrv_dirty_bitmap_next() reacts to some bitmaps that
>>>>>> migration
>>>>>> adds or something?  Then this would be the wrong condition, because I
>>>>>> guess we still want to load the bitmaps that are in the qcow2 file.
>>>>>>
>>>>>> I'm not sure whether bdrv_has_readonly_bitmaps() is the correct
>>>>>> condition then, either, though.  Maybe let's take a step back: We
>>>>>> want
>>>>>> to load all the bitmaps from the file exactly once, and that is
>>>>>> when it
>>>>>> is opened the first time.  Or that's what I would have thought...  Is
>>>>>> that even correct?
>>>>>>
>>>>>> Why do we load the bitmaps when the device is inactive anyway?
>>>>>> Shouldn't we load them only once the device is activated?
>>>>> Hmm, not sure. Maybe we don't need to. But we still need to load them
>>>>> when opening read-only, and we should correspondingly reopen them in
>>>>> this case.
>>>> Yeah, well, yeah, but the current state seems just wrong. Apparently
>>>> there are cases where a qcow2 node may have bitmaps before we try to
>>>> load them from the file, so the current condition doesn't work.
>>>>
>>>> Furthermore, it seems like the current "state machine" is too complex,
>>>> so we no longer know which cases are possible and what to do in each
>>>> of them.
>>>>
>>>> So the first thing we could do is add a field to the BDRVQCow2State
>>>> that
>>>> tells us whether the bitmaps have been loaded already or not. If not,
>>>> we invoke qcow2_load_dirty_bitmaps() and set the value to true. If the
>>>> value was true already and the BDS is active and R/W now, we call
>>>> qcow2_reopen_bitmaps_rw_hint().  That should solve one problem.
>>>
>>> good idea, will do.
>>>
>>>>
>>>> The other problem of course is the question whether we should call
>>>> qcow2_load_dirty_bitmaps() at all while the drive is still inactive.
>>>> You know the migration model better than me, so I'm asking this
>>>> question
>>>> to you.  We can phrase it differently: Do we need to load the bitmaps
>>>> before the drive is activated?
>>>
>>> Now I think that we don't need to. At least, we don't have such cases
>>> in Virtuozzo (I hope :).
>>>
>>> Why I had doubts:
>>>
>>> 1. We have cases when we want to start the VM as inactive, to be able
>>> to export its drive over NBD, push some data into it and then start
>>> the VM (which involves activating).
>>> 2. We have cases when we want to start the VM stopped and operate on
>>> dirty bitmaps.
>>>
>>> If we just open all images in inactive mode until the VM starts, it
>>> looks like we need bitmaps in inactive mode (for 2.). But that looks
>>> like the wrong approach anyway.
>>> At first, I tried to solve (1.) by simply calling inactivate_all() when
>>> starting the VM in paused mode, but that breaks at least (2.), so in
>>> the end I solved (1.) with an approach similar to "-incoming defer".
>>> So, we have inactive mode in two cases:
>>>  - incoming migration
>>>  - pushing data to the VM before it starts
>>>
>>> and, in these cases, we don't need to load dirty bitmaps.
>>>
>>> Also, there is an inconsistency: we currently remove persistent bitmaps
>>> on inactivation, so it is inconsistent to load them in inactive mode.
>>>
>>> Ok, I'll try to respin.
>>
>> Finally, what cases do we actually have for qcow2_do_open?
>>
>> 1. INACTIVE -> ACTIVE (through invalidate_cache, we obviously should
>> load bitmaps, if we decided that we have no persistent bitmaps in
>> INACTIVE mode)
>> 2. creating a new bdrv state (first open of the image) in INACTIVE mode
>> (will not load bitmaps)
>> 3. creating a new bdrv state (first open of the image) in ACTIVE mode
>> (will load bitmaps, maybe read-only if the disk is RO)
>>
>> If there are only these three cases, it would be enough to just load
>> bitmaps if !INACTIVE and do nothing otherwise.
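
If it really is just these three cases, then (if I understand you
correctly) that would amount to roughly the following in qcow2_do_open()
-- just a sketch on my side, not compile-tested:

/* Sketch: load persistent bitmaps only once the node is active,
 * i.e. skip loading entirely while BDRV_O_INACTIVE is set */
if (!(bdrv_get_flags(bs) & BDRV_O_INACTIVE)) {
    qcow2_load_dirty_bitmaps(bs, errp);
}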

Maybe, but I'd prefer just adding a flag for now.  We can think about
whether we need bitmaps in inactive mode later, and even then managing
the state through a flag doesn't hurt.
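
Roughly what I have in mind, as a sketch (the helper name and the
dirty_bitmaps_loaded field are made up, and I'm only guessing at the
qcow2_reopen_bitmaps_rw_hint() signature from your patch 1):

/* Hypothetical helper; would be called from qcow2_do_open() / the
 * invalidate path, with dirty_bitmaps_loaded a new bool in
 * BDRVQCow2State. */
static void qcow2_update_bitmap_state(BlockDriverState *bs, Error **errp)
{
    BDRVQCow2State *s = bs->opaque;

    if (!s->dirty_bitmaps_loaded) {
        /* First time through for this BDS: load bitmaps from the file */
        qcow2_load_dirty_bitmaps(bs, errp);
        s->dirty_bitmaps_loaded = true;
    } else if (!(bdrv_get_flags(bs) & BDRV_O_INACTIVE) &&
               !bdrv_is_read_only(bs)) {
        /* Bitmaps were loaded earlier (e.g. while inactive or read-only);
         * now that we are active and writable, make them writable too. */
        bool header_updated;

        qcow2_reopen_bitmaps_rw_hint(bs, &header_updated, errp);
    }
}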

>> Or do we have some of the following cases, too?
>>
>> 1?. ACTIVE -> ACTIVE (through invalidate_cache, some kind of no-op, we
>> should not reload bitmaps)
>> 2?. RO -> RW (we should reopen_bitmaps_rw) (or is this possible only
>> through bdrv_reopen, which will not touch qcow2_do_open?)
>> 3?. RW -> RO (reopen_bitmaps_ro ?)
>> ? anything else??

I don't really know, but I just hope having a flag saves us from having
to think about all of this too much. :-)

>>> about 169, how often is it reproducible for you?

More errors than passes, I guess.

But it always leaves cats behind, and that's another sign that something
isn't right.

>> It becomes very interesting.
>>
>> The persistent-migbitmap-online case failed on line 136 with a
>> mismatched bitmap sha. This check is not related to migration; at this
>> point the bitmap has already been migrated successfully. But for some
>> reason it is corrupted after stopping/starting the VM. I can't imagine
>> how that is possible, but it doesn't really look related to
>> migration... Could it be related to a case where postcopy was not
>> finished, or something like that?.. Maybe fixing wait(RESUME) to
>> something more appropriate will help. But it is strange anyway.
>>
>> The persistent-notmigbitmap-offline case failed on line 128, which is
>> "self.vm_b.event_wait("RESUME", timeout=10.0)", with a timeout... Can
>> the RESUME event be skipped for some reason? Maybe just moving to the
>> MIGRATION event will help; will check.

It would still be interesting to know where the RESUME ended up...

> Looks like moving from "RESUME" to "MIGRATION" events fixes the whole
> issue (I've already had 740 successful iterations, which is good enough,
> but I will not stop the loop until at least 2000). I'll submit a patch.
> But I don't understand why it fixes the sha256 mismatch, and why the
> sha256 check fails with "RESUME" (the check succeeds after migration and
> fails after stop/start, o_O)...

Well, if you have the time, it might be worth investigating.

Max

Thread overview: 18+ messages
2018-03-20 17:05 [Qemu-devel] [PATCH v4 for 2.12 0/3] fix bitmaps migration through shared storage Vladimir Sementsov-Ogievskiy
2018-03-20 17:05 ` [Qemu-devel] [PATCH v4 1/3] qcow2-bitmap: add qcow2_reopen_bitmaps_rw_hint() Vladimir Sementsov-Ogievskiy
2018-03-20 17:05 ` [Qemu-devel] [PATCH v4 2/3] qcow2: fix bitmaps loading when bitmaps already exist Vladimir Sementsov-Ogievskiy
2018-03-20 17:05 ` [Qemu-devel] [PATCH v4 2/3] qcow2: handle reopening bitmaps on bdrv_invalidate_cache Vladimir Sementsov-Ogievskiy
2018-03-20 17:07   ` [Qemu-devel] [PATCH v4 2/3] qcow2: handle reopening bitmaps on bdrv_invalidate_cache DROP IT Vladimir Sementsov-Ogievskiy
2018-03-20 17:05 ` [Qemu-devel] [PATCH v4 3/3] iotests: enable shared migration cases in 169 Vladimir Sementsov-Ogievskiy
2018-03-21 13:20 ` [Qemu-devel] [PATCH v4 for 2.12 0/3] fix bitmaps migration through shared storage Max Reitz
2018-03-26 18:06 ` Max Reitz
2018-03-27  9:28   ` Vladimir Sementsov-Ogievskiy
2018-03-27  9:53     ` Vladimir Sementsov-Ogievskiy
2018-03-27 10:11       ` Vladimir Sementsov-Ogievskiy
2018-03-28 14:53         ` Max Reitz
2018-03-29  8:08           ` Vladimir Sementsov-Ogievskiy
2018-03-29 14:03             ` Max Reitz
2018-03-29 15:09               ` Vladimir Sementsov-Ogievskiy
2018-03-30 13:31                 ` Vladimir Sementsov-Ogievskiy
2018-03-30 15:32                   ` Vladimir Sementsov-Ogievskiy
2018-04-03 16:03                     ` Max Reitz [this message]
