qemu-devel.nongnu.org archive mirror
* RFC: tracking valid backing chain issue
@ 2020-10-20  8:21 Nikolay Shirokovskiy
  2020-10-20  8:50 ` Kevin Wolf
  0 siblings, 1 reply; 6+ messages in thread
From: Nikolay Shirokovskiy @ 2020-10-20  8:21 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, Nikolay Shirokovskiy

Hi, all.

I recently found a corner case in which, AFAIK, it is impossible to find out the
valid backing chain after a block commit operation. Imagine committing the top
image. After the commit reaches the ready state, pivot is sent and then mgmt
crashes. So far so good. Upon next start, mgmt can either check the block job
status for a non-autodismissed job or, in case of older qemu, inspect the
backing chain to infer whether the pivot was successful.
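
For illustration, the status check mentioned above could be sketched roughly as
follows. This is only a sketch under assumptions: it expects the reply of the
QMP `query-jobs` command already parsed into a list of dicts with `id`,
`status` and an optional `error` key, and it assumes pivot was the last command
mgmt sent. The function name and the returned labels are made up for this
example and are not libvirt's.

```python
from typing import Optional


def classify_commit_job(jobs: list[dict], job_id: str) -> str:
    """Classify a commit job from an already-parsed query-jobs reply.

    The labels returned here are hypothetical, chosen for this example.
    """
    job: Optional[dict] = next((j for j in jobs if j.get("id") == job_id), None)
    if job is None:
        # The job is not listed (dismissed, or qemu was restarted), so the
        # job status cannot help; mgmt has to inspect the backing chain.
        return "gone"
    status = job.get("status")
    if status == "concluded":
        # A concluded job that was not dismissed still reports its error.
        return "concluded-error" if "error" in job else "concluded-ok"
    if status in ("ready", "waiting", "pending"):
        # Assumption: the graph has not been switched yet in these states.
        return "awaiting-pivot"
    return "running"
```

Note that when the job is "gone" because qemu itself died, this check gives no
answer, which is exactly the case discussed in the rest of this mail.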

But imagine that after the mgmt crash the qemu process was destroyed too. In
this case there is no way to know what the valid backing chain is. Yes, libvirt
starts the qemu process with the -no-shutdown flag so the process is not
destroyed in case of shutdown, but the process can still crash.

So the corner case is very rare: mgmt crashes at a specific short moment and
then qemu crashes before mgmt is up again.

I guess some 'invalidated' flag for the image would help. Qemu itself could
also use this flag to check that mgmt is not trying to run on an invalid
backing chain.

Nikolay



* Re: RFC: tracking valid backing chain issue
  2020-10-20  8:21 RFC: tracking valid backing chain issue Nikolay Shirokovskiy
@ 2020-10-20  8:50 ` Kevin Wolf
  2020-10-20 10:23   ` Nikolay Shirokovskiy
  0 siblings, 1 reply; 6+ messages in thread
From: Kevin Wolf @ 2020-10-20  8:50 UTC (permalink / raw)
  To: Nikolay Shirokovskiy; +Cc: qemu-devel, qemu-block

On 20.10.2020 at 10:21, Nikolay Shirokovskiy wrote:
> Hi, all.
> 
> I recently found a corner case in which, AFAIK, it is impossible to find out the
> valid backing chain after a block commit operation. Imagine committing the top
> image. After the commit reaches the ready state, pivot is sent and then mgmt
> crashes. So far so good. Upon next start, mgmt can either check the block job
> status for a non-autodismissed job or, in case of older qemu, inspect the
> backing chain to infer whether the pivot was successful.
> 
> But imagine that after the mgmt crash the qemu process was destroyed too. In
> this case there is no way to know what the valid backing chain is. Yes, libvirt
> starts the qemu process with the -no-shutdown flag so the process is not
> destroyed in case of shutdown, but the process can still crash.

I don't think this is a problem.

Between completion of the job and finalising it, both the base node and
the top node are equivalent. You can access either and you'll always get
the same data.

So if libvirt didn't save that the job was already completed, it will
use the old image file, and it's fine. And if libvirt already sent the
job-finalize command, it will first have saved that the job was
completed and therefore use the new image, and it's fine, too.
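
The argument can be checked by enumerating the possible crash points. The
sketch below is a toy model, not libvirt or QEMU code: the crash point names,
the two booleans and the helper functions are all made up for illustration. It
assumes that before the switch both images are usable, that after the switch
only the base is, and that the management layer always persists "job completed"
before it sends job-finalize.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CrashPoint:
    name: str
    saved_completed: bool   # mgmt persisted "job completed" before crashing
    finalize_applied: bool  # QEMU already switched the active image to base


def image_after_restart(saved_completed: bool) -> str:
    """Image mgmt starts QEMU with, based only on its own saved state."""
    return "base" if saved_completed else "top"


def valid_images(finalize_applied: bool) -> set[str]:
    """Images holding current guest data (assumption of this toy model)."""
    return {"base"} if finalize_applied else {"top", "base"}


# Saving always happens before job-finalize is sent, so the combination
# "not saved, but finalize applied" cannot occur.
CRASH_POINTS = [
    CrashPoint("job completed, nothing saved yet", False, False),
    CrashPoint("saved, job-finalize not sent", True, False),
    CrashPoint("saved, job-finalize sent but not applied", True, False),
    CrashPoint("saved, job-finalize applied", True, True),
]


def all_recoveries_valid() -> bool:
    return all(
        image_after_restart(p.saved_completed) in valid_images(p.finalize_applied)
        for p in CRASH_POINTS
    )
```

In this model every reachable crash point recovers onto a valid image. The only
combination that would break, finalize applied without the completion having
been saved, is excluded by the save-then-send ordering.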

Kevin





* Re: RFC: tracking valid backing chain issue
  2020-10-20  8:50 ` Kevin Wolf
@ 2020-10-20 10:23   ` Nikolay Shirokovskiy
  2020-10-20 10:29     ` Nikolay Shirokovskiy
  0 siblings, 1 reply; 6+ messages in thread
From: Nikolay Shirokovskiy @ 2020-10-20 10:23 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, qemu-devel



On 20.10.2020 11:50, Kevin Wolf wrote:
> I don't think this is a problem.
> 
> Between completion of the job and finalising it, both the base node and
> the top node are equivalent. You can access either and you'll always get
> the same data.
> 
> So if libvirt didn't save that the job was already completed, it will
> use the old image file, and it's fine. And if libvirt already sent the
> job-finalize command, it will first have saved that the job was
> completed and therefore use the new image, and it's fine, too.

So finalizing can't fail? Otherwise libvirt could save that the job is completed
and the graph is changed when it actually wasn't.

Nikolay




* Re: RFC: tracking valid backing chain issue
  2020-10-20 10:23   ` Nikolay Shirokovskiy
@ 2020-10-20 10:29     ` Nikolay Shirokovskiy
  2020-10-21 10:56       ` Kevin Wolf
  0 siblings, 1 reply; 6+ messages in thread
From: Nikolay Shirokovskiy @ 2020-10-20 10:29 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, qemu-devel



On 20.10.2020 13:23, Nikolay Shirokovskiy wrote:
> So finalizing can't fail? Otherwise libvirt could save that the job is completed
> and the graph is changed when it actually wasn't.
> 

Hmm, it is not even a matter of qemu. Libvirt can save that the job is completed
and then crash before sending the finalize command to qemu. So after a qemu
crash and a libvirt restart, libvirt would think that the valid backing chain is
the one without the top image, which is not true.




* Re: RFC: tracking valid backing chain issue
  2020-10-20 10:29     ` Nikolay Shirokovskiy
@ 2020-10-21 10:56       ` Kevin Wolf
  2020-10-22 15:54         ` Nikolay Shirokovskiy
  0 siblings, 1 reply; 6+ messages in thread
From: Kevin Wolf @ 2020-10-21 10:56 UTC (permalink / raw)
  To: Nikolay Shirokovskiy; +Cc: qemu-devel, qemu-block

On 20.10.2020 at 12:29, Nikolay Shirokovskiy wrote:
> Hmm, it is not even a matter of qemu. Libvirt can save that the job is completed
> and then crash before sending the finalize command to qemu. So after a qemu
> crash and a libvirt restart, libvirt would think that the valid backing chain is
> the one without the top image, which is not true.

Why not? During this time the top and base image are equally valid to be
used as the active image.

If QEMU hadn't switched from top to base yet when it crashed, it's still
no problem if libvirt does the switch when restarting QEMU.
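
A toy model may help to see why both images are equally valid here. The sketch
below is not how QEMU stores data: images are modelled as plain dicts mapping a
cluster index to its contents, and `read_chain` and `commit` are hypothetical
helpers written for this example. It also ignores guest writes arriving while
the job is ready, on the assumption that those reach the base as well.

```python
def read_chain(chain: list[dict[int, bytes]], cluster: int) -> bytes:
    """Read a cluster from the first image in the chain that has it allocated."""
    for image in chain:  # chain[0] is the active (topmost) image
        if cluster in image:
            return image[cluster]
    return b"\0"  # unallocated in every image: reads as zeros


def commit(top: dict[int, bytes], base: dict[int, bytes]) -> None:
    """Copy every cluster allocated in top down into base."""
    base.update(top)


base = {0: b"A", 1: b"B"}
top = {1: b"b", 2: b"c"}  # overlay: rewrote cluster 1, newly wrote cluster 2

commit(top, base)  # the job has copied everything; the graph is not switched yet

clusters = range(4)
via_top = [read_chain([top, base], c) for c in clusters]  # old chain: top -> base
via_base = [read_chain([base], c) for c in clusters]      # new chain: base only
```

After `commit`, `via_top` and `via_base` are identical, so starting QEMU on
either chain gives the guest the same disk contents.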

Kevin




* Re: RFC: tracking valid backing chain issue
  2020-10-21 10:56       ` Kevin Wolf
@ 2020-10-22 15:54         ` Nikolay Shirokovskiy
  0 siblings, 0 replies; 6+ messages in thread
From: Nikolay Shirokovskiy @ 2020-10-22 15:54 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, qemu-devel



On 21.10.2020 13:56, Kevin Wolf wrote:
> Why not? During this time the top and base image are equally valid to be
> used as the active image.
> 
> If QEMU hadn't switched from top to base yet when it crashed, it's still
> no problem if libvirt does the switch when restarting QEMU.
> 

Now it's clear. Thanks for the explanation.

Nikolay


