* [Qemu-devel] Making QMP 'block-job-cancel' transactionable @ 2017-03-24 12:34 Kashyap Chamarthy 2017-03-28 14:49 ` Eric Blake 2017-04-03 20:29 ` John Snow 0 siblings, 2 replies; 16+ messages in thread From: Kashyap Chamarthy @ 2017-03-24 12:34 UTC (permalink / raw) To: qemu-devel; +Cc: qemu-block, eblake While debugging some other issue, I happened to stumble across an old libvirt commit[*] that adds support for pivot (whether QEMU should switch to a target copy or not) operation as a result of issuing QMP 'block-job-cancel' to a 'drive-mirror' (in libvirt parlance, "block copy"). In the libvirt commit message[*] Eric Blake writes: "[...] There may be potential improvements to the snapshot code to exploit block copy over multiple disks all at one point in time. And, if 'block-job-cancel' were made part of 'transaction', you could copy multiple disks at the same point in time without pausing the domain. [...]" I realize that 'block-job-cancel' is currently not part of the @TransactionAction. Is it worthwhile to do so? Given the current behavior of QMP 'drive-mirror': - Upon 'block-job-complete', synchronization will end, and live QEMU pivots to the target (i.e. the copy) - Upon 'block-job-cancel', a point-in-time (at the time of cancel) copy gets created, and live QEMU will _not_ pivot. I realize that it is not possible to perform a "block copy" of multiple disks at the same point in time without pausing QEMU. From a brief chat with Stefan Hajnoczi on IRC, he does confirm that there's no current API for doing it atomically across multiple disks. Since Stefan asked if a bug exists for adding 'transaction' support to 'block-job-cancel', thought I might bring it up here first. [*] https://libvirt.org/git/?p=libvirt.git;a=commit;h=eaba79d -- blockjob: support pivot operation on cancel -- /kashyap ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Qemu-devel] Making QMP 'block-job-cancel' transactionable 2017-03-24 12:34 [Qemu-devel] Making QMP 'block-job-cancel' transactionable Kashyap Chamarthy @ 2017-03-28 14:49 ` Eric Blake 2017-03-28 15:29 ` Kashyap Chamarthy 2017-04-03 20:29 ` John Snow 1 sibling, 1 reply; 16+ messages in thread From: Eric Blake @ 2017-03-28 14:49 UTC (permalink / raw) To: Kashyap Chamarthy, qemu-devel; +Cc: qemu-block [-- Attachment #1: Type: text/plain, Size: 1920 bytes --] On 03/24/2017 07:34 AM, Kashyap Chamarthy wrote: > While debugging some other issue, I happened to stumble across an old > libvirt commit[*] that adds support for pivot (whether QEMU should > switch to a target copy or not) operation as a result of issuing QMP > 'block-job-cancel' to a 'drive-mirror' (in libvirt parlance, "block > copy"). > > In the libvirt commit message[*] Eric Blake writes: > > "[...] There may be potential improvements to the snapshot code to > exploit block copy over multiple disks all at one point in time. > And, if 'block-job-cancel' were made part of 'transaction', you > could copy multiple disks at the same point in time without pausing > the domain. [...]" > > I realize that 'block-job-cancel' is currently not part of the > @TransactionAction. Is it worthwhile to do so? Yes, it is still worthwhile to make that improvement, although it is now 2.10 material. > > Given the current behavior of QMP 'drive-mirror': > > - Upon 'block-job-complete', synchronization will end, and live QEMU > pivots to the target (i.e. the copy) > > - Upon 'block-job-cancel', a point-in-time (at the time of cancel) > copy gets created, and live QEMU will _not_ pivot. > > I realize that it is not possible to perform a "block copy" of multiple > disks at the same point in time without pausing QEMU. From a brief chat > with Stefan Hajnoczi on IRC, he does confirm that there's no current API > for doing it atomically across multiple disks. > > Since Stefan asked if a bug exists for adding 'transaction' support to > 'block-job-cancel', thought I might bring it up here first. > > > [*] https://libvirt.org/git/?p=libvirt.git;a=commit;h=eaba79d -- > blockjob: support pivot operation on cancel > -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 604 bytes --] ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Qemu-devel] Making QMP 'block-job-cancel' transactionable 2017-03-28 14:49 ` Eric Blake @ 2017-03-28 15:29 ` Kashyap Chamarthy 2017-04-03 14:38 ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi 0 siblings, 1 reply; 16+ messages in thread From: Kashyap Chamarthy @ 2017-03-28 15:29 UTC (permalink / raw) To: Eric Blake; +Cc: qemu-devel, qemu-block On Tue, Mar 28, 2017 at 09:49:40AM -0500, Eric Blake wrote: > On 03/24/2017 07:34 AM, Kashyap Chamarthy wrote: > > While debugging some other issue, I happened to stumble across an old > > libvirt commit[*] that adds support for pivot (whether QEMU should > > switch to a target copy or not) operation as a result of issuing QMP > > 'block-job-cancel' to a 'drive-mirror' (in libvirt parlance, "block > > copy"). > > > > In the libvirt commit message[*] Eric Blake writes: > > > > "[...] There may be potential improvements to the snapshot code to > > exploit block copy over multiple disks all at one point in time. > > And, if 'block-job-cancel' were made part of 'transaction', you > > could copy multiple disks at the same point in time without pausing > > the domain. [...]" > > > > I realize that 'block-job-cancel' is currently not part of the > > @TransactionAction. Is it worthwhile to do so? > > Yes, it is still worthwhile to make that improvement, although it is > now 2.10 material. Thanks for looking, didn't imply that it's for 2.9 (which is in Release Candidate phase) -- I realize most people are busy with it. Further discussion could be had when 2.10 tree opens up. [...] -- /kashyap ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Making QMP 'block-job-cancel' transactionable 2017-03-28 15:29 ` Kashyap Chamarthy @ 2017-04-03 14:38 ` Stefan Hajnoczi 0 siblings, 0 replies; 16+ messages in thread From: Stefan Hajnoczi @ 2017-04-03 14:38 UTC (permalink / raw) To: Kashyap Chamarthy; +Cc: Eric Blake, qemu-devel, qemu-block [-- Attachment #1: Type: text/plain, Size: 1507 bytes --] On Tue, Mar 28, 2017 at 05:29:34PM +0200, Kashyap Chamarthy wrote: > On Tue, Mar 28, 2017 at 09:49:40AM -0500, Eric Blake wrote: > > On 03/24/2017 07:34 AM, Kashyap Chamarthy wrote: > > > While debugging some other issue, I happened to stumble across an old > > > libvirt commit[*] that adds support for pivot (whether QEMU should > > > switch to a target copy or not) operation as a result of issuing QMP > > > 'block-job-cancel' to a 'drive-mirror' (in libvirt parlance, "block > > > copy"). > > > > > > In the libvirt commit message[*] Eric Blake writes: > > > > > > "[...] There may be potential improvements to the snapshot code to > > > exploit block copy over multiple disks all at one point in time. > > > And, if 'block-job-cancel' were made part of 'transaction', you > > > could copy multiple disks at the same point in time without pausing > > > the domain. [...]" > > > > > > I realize that 'block-job-cancel' is currently not part of the > > > @TransactionAction. Is it worthwhile to do so? > > > > Yes, it is still worthwhile to make that improvement, although it is > > now 2.10 material. > > Thanks for looking, didn't imply that it's for 2.9 (which is in Release > Candidate phase) -- I realize most people are busy with it. Further > discussion could be had when 2.10 tree opens up. Thanks for the discussion. I have added this task to the block layer todo list: http://wiki.qemu-project.org/ToDo/Block#Atomic_block-job-cancel [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 455 bytes --] ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Making QMP 'block-job-cancel' transactionable 2017-03-24 12:34 [Qemu-devel] Making QMP 'block-job-cancel' transactionable Kashyap Chamarthy 2017-03-28 14:49 ` Eric Blake @ 2017-04-03 20:29 ` John Snow 2017-04-03 20:38 ` Eric Blake 2017-04-11 12:05 ` Kevin Wolf 1 sibling, 2 replies; 16+ messages in thread From: John Snow @ 2017-04-03 20:29 UTC (permalink / raw) To: Kashyap Chamarthy, qemu-devel; +Cc: qemu-block On 03/24/2017 08:34 AM, Kashyap Chamarthy wrote: > While debugging some other issue, I happened to stumble across an old > libvirt commit[*] that adds support for pivot (whether QEMU should > switch to a target copy or not) operation as a result of issuing QMP > 'block-job-cancel' to a 'drive-mirror' (in libvirt parlance, "block > copy"). > > In the libvirt commit message[*] Eric Blake writes: > > "[...] There may be potential improvements to the snapshot code to > exploit block copy over multiple disks all at one point in time. > And, if 'block-job-cancel' were made part of 'transaction', you > could copy multiple disks at the same point in time without pausing > the domain. [...]" > Oh, you want a transactional cancel to basically capitalize on the second completion mode of the mirror job. I have never really cared for the way this job works, because I don't think "canceling" a ready job is semantically valid (it's not canceled! We completed successfully, just using a different completion mode) -- but if I am in the minority here I would cede that a transactional cancel would be a worthwhile thing to have. I think at other points we have discussed the concept of having a configurable completion mode that jobs could have (and allowing this setting to be adjusted at runtime) that changes which completion mode they'll pursue. This would make a cancel unambiguously a cancellation. It would make a non-pivot completion to a mirror action an unambiguous success, too. Minor nit, perhaps, but I want to be sure before we cement the semantics of how mirror can be "successful." --js > I realize that 'block-job-cancel' is currently not part of the > @TransactionAction. Is it worthwhile to do so? > > Given the current behavior of QMP 'drive-mirror': > > - Upon 'block-job-complete', synchronization will end, and live QEMU > pivots to the target (i.e. the copy) > > - Upon 'block-job-cancel', a point-in-time (at the time of cancel) > copy gets created, and live QEMU will _not_ pivot. > > I realize that it is not possible to perform a "block copy" of multiple > disks at the same point in time without pausing QEMU. From a brief chat > with Stefan Hajnoczi on IRC, he does confirm that there's no current API > for doing it atomically across multiple disks. > > Since Stefan asked if a bug exists for adding 'transaction' support to > 'block-job-cancel', thought I might bring it up here first. > > > [*] https://libvirt.org/git/?p=libvirt.git;a=commit;h=eaba79d -- > blockjob: support pivot operation on cancel > ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Making QMP 'block-job-cancel' transactionable 2017-04-03 20:29 ` John Snow @ 2017-04-03 20:38 ` Eric Blake 2017-04-04 13:28 ` Kashyap Chamarthy 2017-04-11 12:05 ` Kevin Wolf 1 sibling, 1 reply; 16+ messages in thread From: Eric Blake @ 2017-04-03 20:38 UTC (permalink / raw) To: John Snow, Kashyap Chamarthy, qemu-devel; +Cc: qemu-block [-- Attachment #1: Type: text/plain, Size: 2735 bytes --] On 04/03/2017 03:29 PM, John Snow wrote: > > > On 03/24/2017 08:34 AM, Kashyap Chamarthy wrote: >> While debugging some other issue, I happened to stumble across an old >> libvirt commit[*] that adds support for pivot (whether QEMU should >> switch to a target copy or not) operation as a result of issuing QMP >> 'block-job-cancel' to a 'drive-mirror' (in libvirt parlance, "block >> copy"). >> >> In the libvirt commit message[*] Eric Blake writes: >> >> "[...] There may be potential improvements to the snapshot code to >> exploit block copy over multiple disks all at one point in time. >> And, if 'block-job-cancel' were made part of 'transaction', you >> could copy multiple disks at the same point in time without pausing >> the domain. [...]" >> > > Oh, you want a transactional cancel to basically capitalize on the > second completion mode of the mirror job. > > I have never really cared for the way this job works, because I don't > think "canceling" a ready job is semantically valid (it's not canceled! > We completed successfully, just using a different completion mode) -- > but if I am in the minority here I would cede that a transactional > cancel would be a worthwhile thing to have. > > I think at other points we have discussed the concept of having a > configurable completion mode that jobs could have (and allowing this > setting to be adjusted at runtime) that changes which completion mode > they'll pursue. Indeed, having a runtime-adjustable completion mode would allow what libvirt wants: libvirt doesn't know what mode the user wants until they request virDomainBlockJobAbort() (the name is scary, but it merely means that they are stopping what is otherwise an unending job), and pass a flag that says whether they want pivot or end-point-in-time copy semantics. If they request pivot semantics, libvirt invokes block-job-complete to do its default completion mode, if they request copy semantics, libvirt then switches the completion mode and still calls block-job-complete (which _is_ valid in a transaction). > > This would make a cancel unambiguously a cancellation. It would make a > non-pivot completion to a mirror action an unambiguous success, too. > > Minor nit, perhaps, but I want to be sure before we cement the semantics > of how mirror can be "successful." Minor or not, it is a useful viewpoint. Either way, as long as the new way of getting a transactional non-pivot successful completion is something that libvirt can learn via introspection, it should solve what we are hoping for here. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 604 bytes --] ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Making QMP 'block-job-cancel' transactionable 2017-04-03 20:38 ` Eric Blake @ 2017-04-04 13:28 ` Kashyap Chamarthy 2017-04-04 13:54 ` Eric Blake 0 siblings, 1 reply; 16+ messages in thread From: Kashyap Chamarthy @ 2017-04-04 13:28 UTC (permalink / raw) To: Eric Blake; +Cc: John Snow, qemu-devel, qemu-block On Mon, Apr 03, 2017 at 03:38:36PM -0500, Eric Blake wrote: > On 04/03/2017 03:29 PM, John Snow wrote: > > On 03/24/2017 08:34 AM, Kashyap Chamarthy wrote: [...] > >> "[...] There may be potential improvements to the snapshot code to > >> exploit block copy over multiple disks all at one point in time. > >> And, if 'block-job-cancel' were made part of 'transaction', you > >> could copy multiple disks at the same point in time without pausing > >> the domain. [...]" > >> > > > > Oh, you want a transactional cancel to basically capitalize on the > > second completion mode of the mirror job. > > > > I have never really cared for the way this job works, because I don't > > think "canceling" a ready job is semantically valid (it's not canceled! > > We completed successfully, just using a different completion mode) -- > > but if I am in the minority here I would cede that a transactional > > cancel would be a worthwhile thing to have. > > > > I think at other points we have discussed the concept of having a > > configurable completion mode that jobs could have (and allowing this > > setting to be adjusted at runtime) that changes which completion mode > > they'll pursue. > > Indeed, having a runtime-adjustable completion mode would allow what > libvirt wants: libvirt doesn't know what mode the user wants until they > request virDomainBlockJobAbort() (the name is scary, but it merely means > that they are stopping what is otherwise an unending job), and pass a > flag that says whether they want pivot or end-point-in-time copy > semantics. If they request pivot semantics, libvirt invokes > block-job-complete to do its default completion mode, if they request > copy semantics, libvirt then switches the completion mode and still > calls block-job-complete (which _is_ valid in a transaction). Thanks for the nice articulation of the problem at hand, and yes -- configurable / "runtime-adjustable completion mode" would be useful for long-running block operations. > > This would make a cancel unambiguously a cancellation. It would make > > a non-pivot completion to a mirror action an unambiguous success, > > too. > > > > Minor nit, perhaps, but I want to be sure before we cement the > > semantics of how mirror can be "successful." > > Minor or not, it is a useful viewpoint. Either way, as long as the new > way of getting a transactional non-pivot successful completion is > something that libvirt can learn via introspection, Can you elaborate a little more on the above, for my own edification -- how might it be possible for "libvirt can learn via introspection"? Is it via some method using the QMP 'query-commands' / 'query-command-line-options'? > it should solve what we are hoping for here. -- /kashyap ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Making QMP 'block-job-cancel' transactionable 2017-04-04 13:28 ` Kashyap Chamarthy @ 2017-04-04 13:54 ` Eric Blake 2017-04-11 9:42 ` Markus Armbruster 0 siblings, 1 reply; 16+ messages in thread From: Eric Blake @ 2017-04-04 13:54 UTC (permalink / raw) To: Kashyap Chamarthy; +Cc: John Snow, qemu-devel, qemu-block [-- Attachment #1: Type: text/plain, Size: 1049 bytes --] On 04/04/2017 08:28 AM, Kashyap Chamarthy wrote: >> Minor or not, it is a useful viewpoint. Either way, as long as the new >> way of getting a transactional non-pivot successful completion is >> something that libvirt can learn via introspection, > > Can you elaborate a little more on the above, for my own edification -- > how might it be possible for "libvirt can learn via introspection"? Is > it via some method using the QMP 'query-commands' / > 'query-command-line-options'? Those, and query-qmp-schema. If the change includes the addition of something new in the .json files describing QMP, then query-qmp-schema will let libvirt probe whether the version of qemu it it talking to has that new field or not (presumably, in this case it would be a new optional bool member to select the mode when creating a job, as well as a new command visible through query-commands to change the mode of a running job). -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 604 bytes --] ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Making QMP 'block-job-cancel' transactionable 2017-04-04 13:54 ` Eric Blake @ 2017-04-11 9:42 ` Markus Armbruster 2017-04-11 10:30 ` Kashyap Chamarthy 0 siblings, 1 reply; 16+ messages in thread From: Markus Armbruster @ 2017-04-11 9:42 UTC (permalink / raw) To: Eric Blake; +Cc: Kashyap Chamarthy, John Snow, qemu-devel, qemu-block Eric Blake <eblake@redhat.com> writes: > On 04/04/2017 08:28 AM, Kashyap Chamarthy wrote: > >>> Minor or not, it is a useful viewpoint. Either way, as long as the new >>> way of getting a transactional non-pivot successful completion is >>> something that libvirt can learn via introspection, >> >> Can you elaborate a little more on the above, for my own edification -- >> how might it be possible for "libvirt can learn via introspection"? Is >> it via some method using the QMP 'query-commands' / >> 'query-command-line-options'? > > Those, and query-qmp-schema. Avoid query-command-line-options if you can. If you can't, know its limitations and quirks. > If the change includes the addition of > something new in the .json files describing QMP, then query-qmp-schema > will let libvirt probe whether the version of qemu it it talking to has > that new field or not (presumably, in this case it would be a new > optional bool member to select the mode when creating a job, as well as > a new command visible through query-commands to change the mode of a > running job). ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Making QMP 'block-job-cancel' transactionable 2017-04-11 9:42 ` Markus Armbruster @ 2017-04-11 10:30 ` Kashyap Chamarthy 0 siblings, 0 replies; 16+ messages in thread From: Kashyap Chamarthy @ 2017-04-11 10:30 UTC (permalink / raw) To: Markus Armbruster; +Cc: Eric Blake, John Snow, qemu-devel, qemu-block On Tue, Apr 11, 2017 at 11:42:28AM +0200, Markus Armbruster wrote: > Eric Blake <eblake@redhat.com> writes: > > > On 04/04/2017 08:28 AM, Kashyap Chamarthy wrote: > > > >>> Minor or not, it is a useful viewpoint. Either way, as long as the new > >>> way of getting a transactional non-pivot successful completion is > >>> something that libvirt can learn via introspection, > >> > >> Can you elaborate a little more on the above, for my own edification -- > >> how might it be possible for "libvirt can learn via introspection"? Is > >> it via some method using the QMP 'query-commands' / > >> 'query-command-line-options'? > > > > Those, and query-qmp-schema. > > Avoid query-command-line-options if you can. If you can't, know its > limitations and quirks. Thanks; I actually meant to mention 'query-qmp-schema' (thanks to Eric for correcting). And I didn't fortet your reminder of the incompleteness / limitations of 'query-command-line-options' from your KVMForum 2015 presentation :-) > > > If the change includes the addition of > > something new in the .json files describing QMP, then query-qmp-schema > > will let libvirt probe whether the version of qemu it it talking to has > > that new field or not (presumably, in this case it would be a new > > optional bool member to select the mode when creating a job, as well as > > a new command visible through query-commands to change the mode of a > > running job). -- /kashyap ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Making QMP 'block-job-cancel' transactionable 2017-04-03 20:29 ` John Snow 2017-04-03 20:38 ` Eric Blake @ 2017-04-11 12:05 ` Kevin Wolf 2017-04-11 13:14 ` Eric Blake 1 sibling, 1 reply; 16+ messages in thread From: Kevin Wolf @ 2017-04-11 12:05 UTC (permalink / raw) To: John Snow; +Cc: Kashyap Chamarthy, qemu-devel, qemu-block Am 03.04.2017 um 22:29 hat John Snow geschrieben: > On 03/24/2017 08:34 AM, Kashyap Chamarthy wrote: > > While debugging some other issue, I happened to stumble across an old > > libvirt commit[*] that adds support for pivot (whether QEMU should > > switch to a target copy or not) operation as a result of issuing QMP > > 'block-job-cancel' to a 'drive-mirror' (in libvirt parlance, "block > > copy"). > > > > In the libvirt commit message[*] Eric Blake writes: > > > > "[...] There may be potential improvements to the snapshot code to > > exploit block copy over multiple disks all at one point in time. > > And, if 'block-job-cancel' were made part of 'transaction', you > > could copy multiple disks at the same point in time without pausing > > the domain. [...]" > > > > Oh, you want a transactional cancel to basically capitalize on the > second completion mode of the mirror job. > > I have never really cared for the way this job works, because I don't > think "canceling" a ready job is semantically valid (it's not canceled! > We completed successfully, just using a different completion mode) -- > but if I am in the minority here I would cede that a transactional > cancel would be a worthwhile thing to have. Note that job completion/cancellation aren't synchronous QMP commands. The job works something like this, where '...' means that the VM can run and submit new writes etc.: 1. Start job: mirror_start ... 2. Bulk has completed: BLOCK_JOB_READY event ... 3. Request completion/cancellation: block-job-completet/cancel ... 4. Actual completion/cancellation: BLOCK_JOB_COMPLETED The last one is the actual job completion that we want to be atomic for a group of nodes. Just making step 3 atomic (which is what including block-job-complete/cancel in transaction would mean) doesn't really buy us anything because the data will still change between step 3 and 4. Now step 4 is reached for each job individually, and unless you stop the VM (or at least the processing of I/O requests), I don't see how you could reach it at the same time for all jobs. Kevin ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Making QMP 'block-job-cancel' transactionable 2017-04-11 12:05 ` Kevin Wolf @ 2017-04-11 13:14 ` Eric Blake 2017-04-11 13:30 ` Kevin Wolf 0 siblings, 1 reply; 16+ messages in thread From: Eric Blake @ 2017-04-11 13:14 UTC (permalink / raw) To: Kevin Wolf, John Snow; +Cc: qemu-devel, qemu-block [-- Attachment #1: Type: text/plain, Size: 1669 bytes --] On 04/11/2017 07:05 AM, Kevin Wolf wrote: > > Note that job completion/cancellation aren't synchronous QMP commands. > The job works something like this, where '...' means that the VM can run > and submit new writes etc.: > > 1. Start job: mirror_start > ... > 2. Bulk has completed: BLOCK_JOB_READY event > ... > 3. Request completion/cancellation: block-job-completet/cancel > ... > 4. Actual completion/cancellation: BLOCK_JOB_COMPLETED > > The last one is the actual job completion that we want to be atomic for > a group of nodes. Just making step 3 atomic (which is what including > block-job-complete/cancel in transaction would mean) doesn't really buy > us anything because the data will still change between step 3 and 4. But as long as the data changes between steps 3 and 4 are written to only one of the two devices, rather than both, then the disk contents atomicity is guaranteed at the point where we stopped the mirroring (ie. during step 3). > > Now step 4 is reached for each job individually, and unless you stop the > VM (or at least the processing of I/O requests), I don't see how you > could reach it at the same time for all jobs. The fact that the jobs complete independently (based on different amounts of data to flush) is not problematic, if we are still guaranteed that issuing the request altered the graph so that future writes by the guest only go to one side, and the delay in closing is only due to flushing write requests that pre-dated the job end request. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 604 bytes --] ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Making QMP 'block-job-cancel' transactionable 2017-04-11 13:14 ` Eric Blake @ 2017-04-11 13:30 ` Kevin Wolf 2017-04-12 8:42 ` Fam Zheng 0 siblings, 1 reply; 16+ messages in thread From: Kevin Wolf @ 2017-04-11 13:30 UTC (permalink / raw) To: Eric Blake; +Cc: John Snow, qemu-devel, qemu-block [-- Attachment #1: Type: text/plain, Size: 2194 bytes --] Am 11.04.2017 um 15:14 hat Eric Blake geschrieben: > On 04/11/2017 07:05 AM, Kevin Wolf wrote: > > Note that job completion/cancellation aren't synchronous QMP commands. > > The job works something like this, where '...' means that the VM can run > > and submit new writes etc.: > > > > 1. Start job: mirror_start > > ... > > 2. Bulk has completed: BLOCK_JOB_READY event > > ... > > 3. Request completion/cancellation: block-job-completet/cancel > > ... > > 4. Actual completion/cancellation: BLOCK_JOB_COMPLETED > > > > The last one is the actual job completion that we want to be atomic for > > a group of nodes. Just making step 3 atomic (which is what including > > block-job-complete/cancel in transaction would mean) doesn't really buy > > us anything because the data will still change between step 3 and 4. > > But as long as the data changes between steps 3 and 4 are written to > only one of the two devices, rather than both, then the disk contents > atomicity is guaranteed at the point where we stopped the mirroring (ie. > during step 3). But that's not how things work. Essentially requesting completion means that the next time that source and target are in sync, we complete the block job. This means that still new copying requests can be issued before the job actually completes (and it's even likely to happen). > > Now step 4 is reached for each job individually, and unless you stop the > > VM (or at least the processing of I/O requests), I don't see how you > > could reach it at the same time for all jobs. > > The fact that the jobs complete independently (based on different > amounts of data to flush) is not problematic, if we are still guaranteed > that issuing the request altered the graph so that future writes by the > guest only go to one side, and the delay in closing is only due to > flushing write requests that pre-dated the job end request. This is not easily possible without implementing something like a backup job for the final stage of the mirror job... Otherwise the guest can always overwrite a sector that is still marked dirty and then that new sector will definitely be copied still. Kevin [-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Making QMP 'block-job-cancel' transactionable 2017-04-11 13:30 ` Kevin Wolf @ 2017-04-12 8:42 ` Fam Zheng 2017-04-12 8:59 ` Kevin Wolf 0 siblings, 1 reply; 16+ messages in thread From: Fam Zheng @ 2017-04-12 8:42 UTC (permalink / raw) To: Kevin Wolf; +Cc: Eric Blake, John Snow, qemu-devel, qemu-block On Tue, 04/11 15:30, Kevin Wolf wrote: > Am 11.04.2017 um 15:14 hat Eric Blake geschrieben: > > On 04/11/2017 07:05 AM, Kevin Wolf wrote: > > > Note that job completion/cancellation aren't synchronous QMP commands. > > > The job works something like this, where '...' means that the VM can run > > > and submit new writes etc.: > > > > > > 1. Start job: mirror_start > > > ... > > > 2. Bulk has completed: BLOCK_JOB_READY event > > > ... > > > 3. Request completion/cancellation: block-job-completet/cancel > > > ... > > > 4. Actual completion/cancellation: BLOCK_JOB_COMPLETED > > > > > > The last one is the actual job completion that we want to be atomic for > > > a group of nodes. Just making step 3 atomic (which is what including > > > block-job-complete/cancel in transaction would mean) doesn't really buy > > > us anything because the data will still change between step 3 and 4. > > > > But as long as the data changes between steps 3 and 4 are written to > > only one of the two devices, rather than both, then the disk contents > > atomicity is guaranteed at the point where we stopped the mirroring (ie. > > during step 3). > > But that's not how things work. Essentially requesting completion means > that the next time that source and target are in sync, we complete the > block job. This means that still new copying requests can be issued > before the job actually completes (and it's even likely to happen). > > > > Now step 4 is reached for each job individually, and unless you stop the > > > VM (or at least the processing of I/O requests), I don't see how you > > > could reach it at the same time for all jobs. > > > > The fact that the jobs complete independently (based on different > > amounts of data to flush) is not problematic, if we are still guaranteed > > that issuing the request altered the graph so that future writes by the > > guest only go to one side, and the delay in closing is only due to > > flushing write requests that pre-dated the job end request. > > This is not easily possible without implementing something like a backup > job for the final stage of the mirror job... Otherwise the guest can > always overwrite a sector that is still marked dirty and then that new > sector will definitely be copied still. We can add a block-job-cancel-sync command and call block_job_cancel_sync(). If added to transaction, this does what Eric is asking, right? Fam ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Making QMP 'block-job-cancel' transactionable 2017-04-12 8:42 ` Fam Zheng @ 2017-04-12 8:59 ` Kevin Wolf 2017-04-12 9:12 ` Fam Zheng 0 siblings, 1 reply; 16+ messages in thread From: Kevin Wolf @ 2017-04-12 8:59 UTC (permalink / raw) To: Fam Zheng; +Cc: Eric Blake, John Snow, qemu-devel, qemu-block Am 12.04.2017 um 10:42 hat Fam Zheng geschrieben: > On Tue, 04/11 15:30, Kevin Wolf wrote: > > Am 11.04.2017 um 15:14 hat Eric Blake geschrieben: > > > On 04/11/2017 07:05 AM, Kevin Wolf wrote: > > > > Note that job completion/cancellation aren't synchronous QMP commands. > > > > The job works something like this, where '...' means that the VM can run > > > > and submit new writes etc.: > > > > > > > > 1. Start job: mirror_start > > > > ... > > > > 2. Bulk has completed: BLOCK_JOB_READY event > > > > ... > > > > 3. Request completion/cancellation: block-job-completet/cancel > > > > ... > > > > 4. Actual completion/cancellation: BLOCK_JOB_COMPLETED > > > > > > > > The last one is the actual job completion that we want to be atomic for > > > > a group of nodes. Just making step 3 atomic (which is what including > > > > block-job-complete/cancel in transaction would mean) doesn't really buy > > > > us anything because the data will still change between step 3 and 4. > > > > > > But as long as the data changes between steps 3 and 4 are written to > > > only one of the two devices, rather than both, then the disk contents > > > atomicity is guaranteed at the point where we stopped the mirroring (ie. > > > during step 3). > > > > But that's not how things work. Essentially requesting completion means > > that the next time that source and target are in sync, we complete the > > block job. This means that still new copying requests can be issued > > before the job actually completes (and it's even likely to happen). > > > > > > Now step 4 is reached for each job individually, and unless you stop the > > > > VM (or at least the processing of I/O requests), I don't see how you > > > > could reach it at the same time for all jobs. > > > > > > The fact that the jobs complete independently (based on different > > > amounts of data to flush) is not problematic, if we are still guaranteed > > > that issuing the request altered the graph so that future writes by the > > > guest only go to one side, and the delay in closing is only due to > > > flushing write requests that pre-dated the job end request. > > > > This is not easily possible without implementing something like a backup > > job for the final stage of the mirror job... Otherwise the guest can > > always overwrite a sector that is still marked dirty and then that new > > sector will definitely be copied still. > > We can add a block-job-cancel-sync command and call block_job_cancel_sync(). If > added to transaction, this does what Eric is asking, right? But why is this better than issuing an explicit stop/cont? The VM will still be stopped for the same time, but additionally the monitor will block and the qemu process won't be responsive until the job has completed. I assumed that we'd all agree that long running synchronous monitor commands are bad. Kevin ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Making QMP 'block-job-cancel' transactionable 2017-04-12 8:59 ` Kevin Wolf @ 2017-04-12 9:12 ` Fam Zheng 0 siblings, 0 replies; 16+ messages in thread From: Fam Zheng @ 2017-04-12 9:12 UTC (permalink / raw) To: Kevin Wolf; +Cc: John Snow, qemu-devel, qemu-block On Wed, 04/12 10:59, Kevin Wolf wrote: > Am 12.04.2017 um 10:42 hat Fam Zheng geschrieben: > > On Tue, 04/11 15:30, Kevin Wolf wrote: > > > Am 11.04.2017 um 15:14 hat Eric Blake geschrieben: > > > > On 04/11/2017 07:05 AM, Kevin Wolf wrote: > > > > > Note that job completion/cancellation aren't synchronous QMP commands. > > > > > The job works something like this, where '...' means that the VM can run > > > > > and submit new writes etc.: > > > > > > > > > > 1. Start job: mirror_start > > > > > ... > > > > > 2. Bulk has completed: BLOCK_JOB_READY event > > > > > ... > > > > > 3. Request completion/cancellation: block-job-completet/cancel > > > > > ... > > > > > 4. Actual completion/cancellation: BLOCK_JOB_COMPLETED > > > > > > > > > > The last one is the actual job completion that we want to be atomic for > > > > > a group of nodes. Just making step 3 atomic (which is what including > > > > > block-job-complete/cancel in transaction would mean) doesn't really buy > > > > > us anything because the data will still change between step 3 and 4. > > > > > > > > But as long as the data changes between steps 3 and 4 are written to > > > > only one of the two devices, rather than both, then the disk contents > > > > atomicity is guaranteed at the point where we stopped the mirroring (ie. > > > > during step 3). > > > > > > But that's not how things work. Essentially requesting completion means > > > that the next time that source and target are in sync, we complete the > > > block job. This means that still new copying requests can be issued > > > before the job actually completes (and it's even likely to happen). > > > > > > > > Now step 4 is reached for each job individually, and unless you stop the > > > > > VM (or at least the processing of I/O requests), I don't see how you > > > > > could reach it at the same time for all jobs. > > > > > > > > The fact that the jobs complete independently (based on different > > > > amounts of data to flush) is not problematic, if we are still guaranteed > > > > that issuing the request altered the graph so that future writes by the > > > > guest only go to one side, and the delay in closing is only due to > > > > flushing write requests that pre-dated the job end request. > > > > > > This is not easily possible without implementing something like a backup > > > job for the final stage of the mirror job... Otherwise the guest can > > > always overwrite a sector that is still marked dirty and then that new > > > sector will definitely be copied still. > > > > We can add a block-job-cancel-sync command and call block_job_cancel_sync(). If > > added to transaction, this does what Eric is asking, right? > > But why is this better than issuing an explicit stop/cont? The VM will > still be stopped for the same time, but additionally the monitor will > block and the qemu process won't be responsive until the job has > completed. In practice I don't expect block_job_cancel_sync to be "long running", I expect it in the order of a few bdrv_flush() time, which is not worse than the block job starting commands. (I agree that block-job-complete-sync would be very different.) Fam ^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2017-04-12 9:12 UTC | newest] Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-03-24 12:34 [Qemu-devel] Making QMP 'block-job-cancel' transactionable Kashyap Chamarthy 2017-03-28 14:49 ` Eric Blake 2017-03-28 15:29 ` Kashyap Chamarthy 2017-04-03 14:38 ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi 2017-04-03 20:29 ` John Snow 2017-04-03 20:38 ` Eric Blake 2017-04-04 13:28 ` Kashyap Chamarthy 2017-04-04 13:54 ` Eric Blake 2017-04-11 9:42 ` Markus Armbruster 2017-04-11 10:30 ` Kashyap Chamarthy 2017-04-11 12:05 ` Kevin Wolf 2017-04-11 13:14 ` Eric Blake 2017-04-11 13:30 ` Kevin Wolf 2017-04-12 8:42 ` Fam Zheng 2017-04-12 8:59 ` Kevin Wolf 2017-04-12 9:12 ` Fam Zheng
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.