Re: [Qemu-devel] [RFC] dirty bitmap state uncertainty under certain conditions

* Re: [Qemu-devel] [RFC] dirty bitmap state uncertainty under certain conditions
       [not found] <2fb12281-1023-71c0-7fd9-39e27787c1e9@virtuozzo.com>
@ 2016-11-22 12:34 ` Vladimir Sementsov-Ogievskiy
  2016-11-22 13:38 ` Nikolay Shirokovskiy
  2016-11-22 16:07 ` John Snow
  2 siblings, 0 replies; 7+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2016-11-22 12:34 UTC (permalink / raw)
  To: Nikolay Shirokovskiy, qemu-devel, qemu block
  Cc: Denis Lunev, Maxim Nestratov, John Snow, Eric Blake, Fam Zheng

22.11.2016 15:01, Nikolay Shirokovskiy wrote:
> Hi, everyone.
>
>    There is a problem with current incremental backups. Imagine I ask qemu to
> make an incremental backup then go away and return back when backup
> job is finished. Qemu process dismisses the job completely and I missed
> all the events so I don't know the result of the operation and what is
> most important I don't know the base for dirty bitmap now. In case of failure
> it is previous backup and in case of success it is the last backup. Qemu does
> not track dirty bitmap base for me so I have no choice other than clear
> dirty bitmap and make full backup which would be rather unexpected from user
> POV (The situation of going away/coming back is libvirt crash/restart of course.)
>
>    I guess problem has wider scope. In case I miss successful completion of full
> backup my only option is to drop backup file and redo the backup completely
> which is rather wasteful. AFAIU I can not query backup completion result from
> backup file itself. I guess there can be similar issues for other qemu jobs.
>
> Nikolay

I suggest adding an int field to BdrvDirtyBitmap that counts the number of
bdrv_dirty_bitmap_abdicate() calls and is exposed to the user through QMP
query-block. So, for the user, <counter incremented> <=> <backup was
successful and bitmap updated>.
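
To make the idea concrete, here is a minimal sketch of how a client could
use such a counter. It is a toy model in Python, not QEMU code: the class,
the field name successful_backups and the helper backup_completed() are all
hypothetical, and the model ignores writes that arrive while the backup runs.
It assumes the client stored the counter value before starting the backup.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDirtyBitmap:
    """Toy stand-in for a dirty bitmap; this is not QEMU's BdrvDirtyBitmap."""
    name: str
    dirty: set = field(default_factory=set)  # indexes of dirty clusters
    successful_backups: int = 0              # proposed counter (hypothetical name)

    def backup_succeeded(self) -> None:
        # Models the point where bdrv_dirty_bitmap_abdicate() would run:
        # the backed-up bits are dropped and the counter is incremented.
        # Toy simplification: no writes happen during the backup.
        self.dirty.clear()
        self.successful_backups += 1

    def backup_failed(self) -> None:
        # On failure the bitmap keeps its bits and the counter is unchanged.
        pass


def backup_completed(counter_before: int, bitmap: ToyDirtyBitmap) -> bool:
    """What a restarted client could infer from the counter alone."""
    return bitmap.successful_backups > counter_before
```

A client that crashes and restarts would compare the stored value with the
one reported now: if the counter grew, the bitmap is based on the new
backup; otherwise it is still based on the previous one.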


-- 
Best regards,
Vladimir


* Re: [Qemu-devel] [RFC] dirty bitmap state uncertainty under certain conditions
       [not found] <2fb12281-1023-71c0-7fd9-39e27787c1e9@virtuozzo.com>
  2016-11-22 12:34 ` [Qemu-devel] [RFC] dirty bitmap state uncertainty under certain conditions Vladimir Sementsov-Ogievskiy
@ 2016-11-22 13:38 ` Nikolay Shirokovskiy
  2016-11-22 16:07 ` John Snow
  2 siblings, 0 replies; 7+ messages in thread
From: Nikolay Shirokovskiy @ 2016-11-22 13:38 UTC (permalink / raw)
  To: qemu-devel

Resending to the mailing list with the correct address.

On 22.11.2016 15:01, Nikolay Shirokovskiy wrote:
> Hi, everyone.
> 
>   There is a problem with current incremental backups. Imagine I ask qemu to
> make an incremental backup then go away and return back when backup
> job is finished. Qemu process dismisses the job completely and I missed
> all the events so I don't know the result of the operation and what is
> most important I don't know the base for dirty bitmap now. In case of failure
> it is previous backup and in case of success it is the last backup. Qemu does
> not track dirty bitmap base for me so I have no choice other than clear
> dirty bitmap and make full backup which would be rather unexpected from user 
> POV (The situation of going away/coming back is libvirt crash/restart of course.)
> 
>   I guess problem has wider scope. In case I miss successful completion of full
> backup my only option is to drop backup file and redo the backup completely
> which is rather wasteful. AFAIU I can not query backup completion result from
> backup file itself. I guess there can be similar issues for other qemu jobs.
> 
> Nikolay
> 


* Re: [Qemu-devel] [RFC] dirty bitmap state uncertainty under certain conditions
       [not found] <2fb12281-1023-71c0-7fd9-39e27787c1e9@virtuozzo.com>
  2016-11-22 12:34 ` [Qemu-devel] [RFC] dirty bitmap state uncertainty under certain conditions Vladimir Sementsov-Ogievskiy
  2016-11-22 13:38 ` Nikolay Shirokovskiy
@ 2016-11-22 16:07 ` John Snow
  2016-11-22 16:16   ` Vladimir Sementsov-Ogievskiy
  2016-11-22 16:16   ` Eric Blake
  2 siblings, 2 replies; 7+ messages in thread
From: John Snow @ 2016-11-22 16:07 UTC (permalink / raw)
  To: Nikolay Shirokovskiy, qemu-devel
  Cc: Denis Lunev, Vladimir Sementsov-Ogievskiy, Maxim Nestratov,
	Eric Blake, Jeff Cody

On 11/22/2016 07:01 AM, Nikolay Shirokovskiy wrote:
> Hi, everyone.
>
>   There is a problem with current incremental backups. Imagine I ask qemu to
> make an incremental backup then go away and return back when backup
> job is finished. Qemu process dismisses the job completely and I missed
> all the events so I don't know the result of the operation and what is
> most important I don't know the base for dirty bitmap now. In case of failure
> it is previous backup and in case of success it is the last backup. Qemu does
> not track dirty bitmap base for me so I have no choice other than clear
> dirty bitmap and make full backup which would be rather unexpected from user
> POV (The situation of going away/coming back is libvirt crash/restart of course.)
>

Why was the completion/failure event missed? Is there some reason why 
you cannot guarantee that you will observe the completion?

>   I guess problem has wider scope. In case I miss successful completion of full
> backup my only option is to drop backup file and redo the backup completely
> which is rather wasteful. AFAIU I can not query backup completion result from
> backup file itself. I guess there can be similar issues for other qemu jobs.
>
> Nikolay
>

I would personally advocate for a job-neutral solution where jobs can be 
given a parameter such that the job persists in memory in a new 
"completed" state until such time that it is queried explicitly, then it 
can be dropped.
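
As a rough illustration of that behavior (a toy model in Python, not a
proposed QEMU interface; the class names, the "persistent" parameter and
the status strings are all hypothetical):

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ToyJob:
    job_id: str
    status: str = "running"      # "running" or "completed"
    error: Optional[str] = None  # None means the job succeeded
    persistent: bool = False     # the per-job parameter (hypothetical name)


class ToyJobRegistry:
    def __init__(self) -> None:
        self._jobs: Dict[str, ToyJob] = {}

    def start(self, job_id: str, persistent: bool = False) -> None:
        self._jobs[job_id] = ToyJob(job_id, persistent=persistent)

    def finish(self, job_id: str, error: Optional[str] = None) -> None:
        job = self._jobs[job_id]
        job.status, job.error = "completed", error
        if not job.persistent:
            # Behavior described in this thread: the job simply vanishes.
            del self._jobs[job_id]

    def query(self, job_id: str) -> Optional[ToyJob]:
        job = self._jobs.get(job_id)
        if job is not None and job.status == "completed":
            # Explicitly querying a completed job releases it.
            del self._jobs[job_id]
        return job
```

With persistent=True, a client that missed the completion event can still
call query() once after reconnecting and learn whether the job failed.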

I am not sure if we can make this the default behavior, as it might 
confuse libvirt to occasionally see jobs that have already completed.

Talking to Kevin off-list, he suggested that we might be able to make 
this the default behavior if we pivot to the new jobs API that I have 
been proposing, accompanied by a new explicit command to put a job 
to rest.

I can work on this for 2.9; though we may still need a "temporary" 
solution for the old jobs API until we're ready to officially deprecate 
the older interface.


* Re: [Qemu-devel] [RFC] dirty bitmap state uncertainty under certain conditions
  2016-11-22 16:07 ` John Snow
@ 2016-11-22 16:16   ` Vladimir Sementsov-Ogievskiy
  2016-11-22 16:16   ` Eric Blake
  1 sibling, 0 replies; 7+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2016-11-22 16:16 UTC (permalink / raw)
  To: John Snow, Nikolay Shirokovskiy, qemu-devel
  Cc: Denis Lunev, Maxim Nestratov, Eric Blake, Jeff Cody

22.11.2016 19:07, John Snow wrote:
>
>
> On 11/22/2016 07:01 AM, Nikolay Shirokovskiy wrote:
>> Hi, everyone.
>>
>>   There is a problem with current incremental backups. Imagine I ask 
>> qemu to
>> make an incremental backup then go away and return back when backup
>> job is finished. Qemu process dismisses the job completely and I missed
>> all the events so I don't know the result of the operation and what is
>> most important I don't know the base for dirty bitmap now. In case of 
>> failure
>> it is previous backup and in case of success it is the last backup. 
>> Qemu does
>> not track dirty bitmap base for me so I have no choice other than clear
>> dirty bitmap and make full backup which would be rather unexpected 
>> from user
>> POV (The situation of going away/coming back is libvirt crash/restart 
>> of course.)
>>
>
> Why was the completion/failure event missed? Is there some reason why 
> you cannot guarantee that you will observe the completion?
>
>>   I guess problem has wider scope. In case I miss successful 
>> completion of full
>> backup my only option is to drop backup file and redo the backup 
>> completely
>> which is rather wasteful. AFAIU I can not query backup completion 
>> result from
>> backup file itself. I guess there can be similar issues for other 
>> qemu jobs.
>>
>> Nikolay
>>
>
> I would personally advocate for a job-neutral solution where jobs can 
> be given a parameter such that the job persists in memory in a new 
> "completed" state until such time that it is queried explicitly, then 
> it can be dropped.
>
> I am not sure if we can make this the default behavior, as it might 
> confuse libvirt to occasionally see jobs that have already completed.
>
> Talking to Kevin off-list, he suggested that we might be able to make 
> this the default behavior if we pivot to the new jobs API that I have 
> been proposing, accompanied by a new explicit command to put a command 
> to rest.
>
> I can work on this for 2.9; though we may still need a "temporary" 
> solution for the old jobs API until we're ready to officially 
> deprecate the older interface.
>

Jobs in a completed state sound good to me; I've thought about that too.


-- 
Best regards,
Vladimir


* Re: [Qemu-devel] [RFC] dirty bitmap state uncertainty under certain conditions
  2016-11-22 16:07 ` John Snow
  2016-11-22 16:16   ` Vladimir Sementsov-Ogievskiy
@ 2016-11-22 16:16   ` Eric Blake
  2016-11-22 17:26     ` John Snow
  1 sibling, 1 reply; 7+ messages in thread
From: Eric Blake @ 2016-11-22 16:16 UTC (permalink / raw)
  To: John Snow, Nikolay Shirokovskiy, qemu-devel
  Cc: Denis Lunev, Vladimir Sementsov-Ogievskiy, Maxim Nestratov, Jeff Cody

On 11/22/2016 10:07 AM, John Snow wrote:
> 
> 
> On 11/22/2016 07:01 AM, Nikolay Shirokovskiy wrote:
>> Hi, everyone.
>>
>>   There is a problem with current incremental backups. Imagine I ask
>> qemu to
>> make an incremental backup then go away and return back when backup
>> job is finished. Qemu process dismisses the job completely and I missed
>> all the events so I don't know the result of the operation and what is
>> most important I don't know the base for dirty bitmap now. In case of
>> failure
>> it is previous backup and in case of success it is the last backup.
>> Qemu does
>> not track dirty bitmap base for me so I have no choice other than clear
>> dirty bitmap and make full backup which would be rather unexpected
>> from user
>> POV (The situation of going away/coming back is libvirt crash/restart
>> of course.)
>>
> 
> Why was the completion/failure event missed? Is there some reason why
> you cannot guarantee that you will observe the completion?

I think the intent of some of the on-error parameters is to make it so
that the job can't go away on error, only on success.  Admittedly,
libvirt isn't using those policies as well as it could.
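
For illustration, a sketch of a client building a QMP drive-backup request
with the "stop" policy on both error parameters, so that an I/O error
should pause the job rather than end it. The helper name and the device,
target and bitmap values are placeholders, and whether this covers every
failure path is an assumption to verify against the QMP documentation.

```python
import json


def incremental_backup_command(device: str, target: str, bitmap: str) -> str:
    """Build a QMP drive-backup request that asks for 'stop' on I/O errors."""
    command = {
        "execute": "drive-backup",
        "arguments": {
            "device": device,            # placeholder drive name
            "target": target,            # placeholder backup file path
            "sync": "incremental",
            "bitmap": bitmap,            # placeholder dirty bitmap name
            "on-source-error": "stop",
            "on-target-error": "stop",
        },
    }
    return json.dumps(command)
```

The resulting string would be sent over the QMP socket; this sketch only
builds the request and does not talk to QEMU.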

> 
>>   I guess problem has wider scope. In case I miss successful
>> completion of full
>> backup my only option is to drop backup file and redo the backup
>> completely
>> which is rather wasteful. AFAIU I can not query backup completion
>> result from
>> backup file itself. I guess there can be similar issues for other qemu
>> jobs.
>>
>> Nikolay
>>
> 
> I would personally advocate for a job-neutral solution where jobs can be
> given a parameter such that the job persists in memory in a new
> "completed" state until such time that it is queried explicitly, then it
> can be dropped.
> 
> I am not sure if we can make this the default behavior, as it might
> confuse libvirt to occasionally see jobs that have already completed.
> 
> Talking to Kevin off-list, he suggested that we might be able to make
> this the default behavior if we pivot to the new jobs API that I have
> been proposing, accompanied by a new explicit command to put a command
> to rest.

Yeah, revisiting the overall job API will require some overhaul in
libvirt as well, but it is probably worth it.

> 
> I can work on this for 2.9; though we may still need a "temporary"
> solution for the old jobs API until we're ready to officially deprecate
> the older interface.
> 
> 

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org



* Re: [Qemu-devel] [RFC] dirty bitmap state uncertainty under certain conditions
  2016-11-22 16:16   ` Eric Blake
@ 2016-11-22 17:26     ` John Snow
  2016-11-23  9:40       ` Stefan Hajnoczi
  0 siblings, 1 reply; 7+ messages in thread
From: John Snow @ 2016-11-22 17:26 UTC (permalink / raw)
  To: Eric Blake, Nikolay Shirokovskiy, qemu-devel
  Cc: Denis Lunev, Vladimir Sementsov-Ogievskiy, Maxim Nestratov, Jeff Cody



On 11/22/2016 11:16 AM, Eric Blake wrote:
> On 11/22/2016 10:07 AM, John Snow wrote:
>>
>>
>> On 11/22/2016 07:01 AM, Nikolay Shirokovskiy wrote:
>>> Hi, everyone.
>>>
>>>   There is a problem with current incremental backups. Imagine I ask
>>> qemu to
>>> make an incremental backup then go away and return back when backup
>>> job is finished. Qemu process dismisses the job completely and I missed
>>> all the events so I don't know the result of the operation and what is
>>> most important I don't know the base for dirty bitmap now. In case of
>>> failure
>>> it is previous backup and in case of success it is the last backup.
>>> Qemu does
>>> not track dirty bitmap base for me so I have no choice other than clear
>>> dirty bitmap and make full backup which would be rather unexpected
>>> from user
>>> POV (The situation of going away/coming back is libvirt crash/restart
>>> of course.)
>>>
>>
>> Why was the completion/failure event missed? Is there some reason why
>> you cannot guarantee that you will observe the completion?
>
> I think the intent of some of the on-error parameters is to make it so
> that the job can't go away on error, only on success.  Admittedly,
> libvirt isn't using those policies as well as it could.
>
>>
>>>   I guess problem has wider scope. In case I miss successful
>>> completion of full
>>> backup my only option is to drop backup file and redo the backup
>>> completely
>>> which is rather wasteful. AFAIU I can not query backup completion
>>> result from
>>> backup file itself. I guess there can be similar issues for other qemu
>>> jobs.
>>>
>>> Nikolay
>>>
>>
>> I would personally advocate for a job-neutral solution where jobs can be
>> given a parameter such that the job persists in memory in a new
>> "completed" state until such time that it is queried explicitly, then it
>> can be dropped.
>>
>> I am not sure if we can make this the default behavior, as it might
>> confuse libvirt to occasionally see jobs that have already completed.
>>
>> Talking to Kevin off-list, he suggested that we might be able to make
>> this the default behavior if we pivot to the new jobs API that I have
>> been proposing, accompanied by a new explicit command to put a command
>> to rest.
>
> Yeah, revisiting the overall job API will require some overhaul in
> libvirt as well, but it is probably worth it.
>

I wonder if I should try to rectify this temporarily for 2.9, or just 
jump straight into a new interface.

>>
>> I can work on this for 2.9; though we may still need a "temporary"
>> solution for the old jobs API until we're ready to officially deprecate
>> the older interface.
>>
>>
>

-- 
—js


* Re: [Qemu-devel] [RFC] dirty bitmap state uncertainty under certain conditions
  2016-11-22 17:26     ` John Snow
@ 2016-11-23  9:40       ` Stefan Hajnoczi
  0 siblings, 0 replies; 7+ messages in thread
From: Stefan Hajnoczi @ 2016-11-23  9:40 UTC (permalink / raw)
  To: John Snow
  Cc: Eric Blake, Nikolay Shirokovskiy, qemu-devel, Jeff Cody,
	Vladimir Sementsov-Ogievskiy, Denis Lunev, Maxim Nestratov

On Tue, Nov 22, 2016 at 12:26:34PM -0500, John Snow wrote:
> 
> 
> On 11/22/2016 11:16 AM, Eric Blake wrote:
> > On 11/22/2016 10:07 AM, John Snow wrote:
> > > 
> > > 
> > > On 11/22/2016 07:01 AM, Nikolay Shirokovskiy wrote:
> > > > Hi, everyone.
> > > > 
> > > >   There is a problem with current incremental backups. Imagine I ask
> > > > qemu to
> > > > make an incremental backup then go away and return back when backup
> > > > job is finished. Qemu process dismisses the job completely and I missed
> > > > all the events so I don't know the result of the operation and what is
> > > > most important I don't know the base for dirty bitmap now. In case of
> > > > failure
> > > > it is previous backup and in case of success it is the last backup.
> > > > Qemu does
> > > > not track dirty bitmap base for me so I have no choice other than clear
> > > > dirty bitmap and make full backup which would be rather unexpected
> > > > from user
> > > > POV (The situation of going away/coming back is libvirt crash/restart
> > > > of course.)
> > > > 
> > > 
> > > Why was the completion/failure event missed? Is there some reason why
> > > you cannot guarantee that you will observe the completion?
> > 
> > I think the intent of some of the on-error parameters is to make it so
> > that the job can't go away on error, only on success.  Admittedly,
> > libvirt isn't using those policies as well as it could.
> > 
> > > 
> > > >   I guess problem has wider scope. In case I miss successful
> > > > completion of full
> > > > backup my only option is to drop backup file and redo the backup
> > > > completely
> > > > which is rather wasteful. AFAIU I can not query backup completion
> > > > result from
> > > > backup file itself. I guess there can be similar issues for other qemu
> > > > jobs.
> > > > 
> > > > Nikolay
> > > > 
> > > 
> > > I would personally advocate for a job-neutral solution where jobs can be
> > > given a parameter such that the job persists in memory in a new
> > > "completed" state until such time that it is queried explicitly, then it
> > > can be dropped.
> > > 
> > > I am not sure if we can make this the default behavior, as it might
> > > confuse libvirt to occasionally see jobs that have already completed.
> > > 
> > > Talking to Kevin off-list, he suggested that we might be able to make
> > > this the default behavior if we pivot to the new jobs API that I have
> > > been proposing, accompanied by a new explicit command to put a command
> > > to rest.
> > 
> > Yeah, revisiting the overall job API will require some overhaul in
> > libvirt as well, but it is probably worth it.
> > 
> 
> I wonder if I should try to rectify this temporarily for 2.9, or just jump
> straight into a new interface.

I suggest drafting the "proper" API fix.  If it turns out to be a major
undertaking then maybe a sub-problem can be solved more easily instead.
But attacking the full problem first seems like a good approach - the
QEMU 2.9 development cycle hasn't even opened yet :).

Stefan
