* [Qemu-devel] [RFC] block-insert-node and block-job-delete @ 2017-07-26 14:19 Manos Pitsidianakis 2017-07-26 15:12 ` Stefan Hajnoczi 0 siblings, 1 reply; 13+ messages in thread From: Manos Pitsidianakis @ 2017-07-26 14:19 UTC (permalink / raw) To: qemu-block; +Cc: qemu-devel, Stefan Hajnoczi, Kevin Wolf, Alberto Garcia [-- Attachment #1: Type: text/plain, Size: 2541 bytes --] This proposal follows a discussion we had with Kevin and Stefan on filter node management. With block filter drivers arises a need to configure filter nodes on runtime with QMP on live graphs. A problem with doing live graph modifications is that some block jobs modify the graph when they are done and don't operate under any synchronisation, resulting in a race condition if we try to insert a filter node in the place of an existing edge. The race can be overcome if we introduce an optional manual-delete flag in the creation of a block job to indicate that they will not be deleted automatically, but rather wait until the user explicitly calls block-job-delete to remove the block job and apply the graph modifications. This makes filter insertion require that there are no active block jobs with manual-delete set to false. The graph operations will be synchronous unlike block-job-complete; the BlockJobDeferToMainLoopFn completion callback will be executed from QMP instead of using block_job_defer_to_main_loop. block-job-delete will be defined as { 'command': 'block-job-delete', 'data': { 'device': 'str' } } With this change we can define block-insert-node. New nodes will be created with blockdev-add with the appropriate children nodes. On calling block-insert-node we specify the node to add, and the edge we wish to replace: { 'command': 'block-insert-node', 'data' : { '*parent' : 'str', 'child' : 'str', 'node' : 'str', '*device' : 'str' } } This will be similar to x-blockdev-change, but instead of bdrv_add_child the parent driver must implement bdrv_reopen_* to change the child. If instead of parent we specify device, the node will be inserted as the root bs of the specified BlockBackend device. If 'child' is not in the bs->children of 'node', we should abort. I'm not certain if there's need to implement an option between bs->file/bs->backing. In the following example we insert a throttle filter node (T) between A and B: A | B { "execute": "blockdev-add", "arguments": { "driver": "throttle", "node-name": "T", "throttling-group": "foobar", "limits" : { "iops-total" : 1000, }, "file": "B" } } A T \ / B { "execute" : "block-insert-node", "arguments" : { "parent" : "A", "child" : "B", "node" : "T" } } A | T | B If bdrv_reopen is ever introduced to QMP this command, except for the BB root case, might be obsolete. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [RFC] block-insert-node and block-job-delete 2017-07-26 14:19 [Qemu-devel] [RFC] block-insert-node and block-job-delete Manos Pitsidianakis @ 2017-07-26 15:12 ` Stefan Hajnoczi 2017-07-26 18:23 ` Manos Pitsidianakis 0 siblings, 1 reply; 13+ messages in thread From: Stefan Hajnoczi @ 2017-07-26 15:12 UTC (permalink / raw) To: Manos Pitsidianakis, qemu-block, qemu-devel, Kevin Wolf, Alberto Garcia Cc: John Snow, Jeff Cody [-- Attachment #1: Type: text/plain, Size: 944 bytes --] On Wed, Jul 26, 2017 at 05:19:24PM +0300, Manos Pitsidianakis wrote: > This proposal follows a discussion we had with Kevin and Stefan on filter > node management. > > With block filter drivers arises a need to configure filter nodes on runtime > with QMP on live graphs. A problem with doing live graph modifications is > that some block jobs modify the graph when they are done and don't operate > under any synchronisation, resulting in a race condition if we try to insert > a filter node in the place of an existing edge. Block jobs *do* operate under synchronization. They only manipulate the graph from the main loop (under the QEMU global mutex just like a monitor command). But maybe you are thinking about higher level race conditions between QMP commands. Can you give an example of the race? CCing John Snow (interested in commands for 'reaping' completed block jobs) and Jeff Cody (block jobs maintainer). [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 455 bytes --] ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [RFC] block-insert-node and block-job-delete 2017-07-26 15:12 ` Stefan Hajnoczi @ 2017-07-26 18:23 ` Manos Pitsidianakis 2017-07-27 10:07 ` Stefan Hajnoczi 2017-07-27 22:09 ` John Snow 0 siblings, 2 replies; 13+ messages in thread From: Manos Pitsidianakis @ 2017-07-26 18:23 UTC (permalink / raw) To: Stefan Hajnoczi Cc: qemu-block, qemu-devel, Kevin Wolf, Alberto Garcia, John Snow, Jeff Cody [-- Attachment #1: Type: text/plain, Size: 2181 bytes --] On Wed, Jul 26, 2017 at 04:12:21PM +0100, Stefan Hajnoczi wrote: >On Wed, Jul 26, 2017 at 05:19:24PM +0300, Manos Pitsidianakis wrote: >> This proposal follows a discussion we had with Kevin and Stefan on filter >> node management. >> >> With block filter drivers arises a need to configure filter nodes on runtime >> with QMP on live graphs. A problem with doing live graph modifications is >> that some block jobs modify the graph when they are done and don't operate >> under any synchronisation, resulting in a race condition if we try to insert >> a filter node in the place of an existing edge. > >But maybe you are thinking about higher level race conditions between >QMP commands. Can you give an example of the race? Exactly, synchronisation between the QMP/user actions. Here's an example, correct me if I'm wrong: User issues a block-commit to this tree to commit A onto B: BB -> A -> B This inserts the dummy mirror node at the top since this is a mirror job (active commit) BB -> dummy -> A -> B User wants to add a filter node F below the BB. First we create the node: BB -> dummy -> A -> B F / Commit finishes before we do block-insert-node, dummy and A is removed from the BB's chain, mirror_exit calls bdrv_replace_node() BB ------------> B F -> / Now inserting F under BB will error since dummy does not exist any more. The proposal doesn't guard against the following: User issues a block-commit to this tree to commit B onto C: BB -> A -> B -> C The dummy node (commit's top bs is A): BB -> A -> dummy -> B -> C blockdev-add of a filter we wish to eventually be between A and C: BB -> A -> dummy -> B -> C F/ if commit finishes before we issue block-insert-node (commit_complete() calls bdrv_set_backing_hd() which only touches A) BB -> A --------------> C F -> dummy -> B / resulting in error when issuing block-insert-node, else if we set the job to manual-delete and insert F: BB -> A -> F -> dummy -> B -> C deleting the job will result in F being lost since the job's top bs wasn't updated. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [RFC] block-insert-node and block-job-delete 2017-07-26 18:23 ` Manos Pitsidianakis @ 2017-07-27 10:07 ` Stefan Hajnoczi 2017-07-28 12:08 ` Kevin Wolf 2017-07-27 22:09 ` John Snow 1 sibling, 1 reply; 13+ messages in thread From: Stefan Hajnoczi @ 2017-07-27 10:07 UTC (permalink / raw) To: Manos Pitsidianakis, qemu-block, qemu-devel, Kevin Wolf, Alberto Garcia, John Snow, Jeff Cody [-- Attachment #1: Type: text/plain, Size: 3378 bytes --] On Wed, Jul 26, 2017 at 09:23:20PM +0300, Manos Pitsidianakis wrote: > On Wed, Jul 26, 2017 at 04:12:21PM +0100, Stefan Hajnoczi wrote: > > On Wed, Jul 26, 2017 at 05:19:24PM +0300, Manos Pitsidianakis wrote: > > > This proposal follows a discussion we had with Kevin and Stefan on filter > > > node management. > > > > > > With block filter drivers arises a need to configure filter nodes on runtime > > > with QMP on live graphs. A problem with doing live graph modifications is > > > that some block jobs modify the graph when they are done and don't operate > > > under any synchronisation, resulting in a race condition if we try to insert > > > a filter node in the place of an existing edge. > > > > But maybe you are thinking about higher level race conditions between > > QMP commands. Can you give an example of the race? > > Exactly, synchronisation between the QMP/user actions. Here's an example, > correct me if I'm wrong: > > User issues a block-commit to this tree to commit A onto B: > BB -> A -> B > > This inserts the dummy mirror node at the top since this is a mirror job > (active commit) > BB -> dummy -> A -> B > > User wants to add a filter node F below the BB. First we create the node: > BB -> dummy -> A -> B > F / > > Commit finishes before we do block-insert-node, dummy and A is removed from > the BB's chain, mirror_exit calls bdrv_replace_node() > > BB ------------> B > F -> / > > Now inserting F under BB will error since dummy does not exist any more. I see the diagrams but the flow isn't clear without the user's QMP commands. F is created using blockdev-add? What are the parameters and how does it know about dummy? I think this is an interesting question in itself because dummy is a transient internal node that QMP users don't necessarily know about. What is the full block-insert-node command? > The proposal doesn't guard against the following: > User issues a block-commit to this tree to commit B onto C: > BB -> A -> B -> C > > The dummy node (commit's top bs is A): > BB -> A -> dummy -> B -> C > > blockdev-add of a filter we wish to eventually be between A and C: > BB -> A -> dummy -> B -> C > F/ > > if commit finishes before we issue block-insert-node (commit_complete() > calls bdrv_set_backing_hd() which only touches A) > BB -> A --------------> C > F -> dummy -> B / > resulting in error when issuing block-insert-node, > > else if we set the job to manual-delete and insert F: > BB -> A -> F -> dummy -> B -> C > deleting the job will result in F being lost since the job's top bs wasn't > updated. manual-delete is a solution for block jobs. The write threshold feature is a plain QMP command that in the future will create a transient internal node (before write notifier). I'm not sure it makes sense to turn the write threshold feature into a block job. I guess it could work, but it seems a little unnatural and maybe there will be other features that use transient internal nodes. What I'm getting at is that there might be alternative to manual-delete that work in the general case. For example, if blockdev-add + block-insert-node are part of a QMP 'transaction' command, can we lock the graph to protect it against modifications? [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 455 bytes --] ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [RFC] block-insert-node and block-job-delete 2017-07-27 10:07 ` Stefan Hajnoczi @ 2017-07-28 12:08 ` Kevin Wolf 2017-07-31 14:53 ` Stefan Hajnoczi 2017-07-31 17:30 ` Manos Pitsidianakis 0 siblings, 2 replies; 13+ messages in thread From: Kevin Wolf @ 2017-07-28 12:08 UTC (permalink / raw) To: Stefan Hajnoczi Cc: Manos Pitsidianakis, qemu-block, qemu-devel, Alberto Garcia, John Snow, Jeff Cody [-- Attachment #1: Type: text/plain, Size: 4691 bytes --] Am 27.07.2017 um 12:07 hat Stefan Hajnoczi geschrieben: > On Wed, Jul 26, 2017 at 09:23:20PM +0300, Manos Pitsidianakis wrote: > > On Wed, Jul 26, 2017 at 04:12:21PM +0100, Stefan Hajnoczi wrote: > > > On Wed, Jul 26, 2017 at 05:19:24PM +0300, Manos Pitsidianakis wrote: > > > > This proposal follows a discussion we had with Kevin and Stefan on filter > > > > node management. > > > > > > > > With block filter drivers arises a need to configure filter nodes on runtime > > > > with QMP on live graphs. A problem with doing live graph modifications is > > > > that some block jobs modify the graph when they are done and don't operate > > > > under any synchronisation, resulting in a race condition if we try to insert > > > > a filter node in the place of an existing edge. > > > > > > But maybe you are thinking about higher level race conditions between > > > QMP commands. Can you give an example of the race? > > > > Exactly, synchronisation between the QMP/user actions. Here's an example, > > correct me if I'm wrong: > > > > User issues a block-commit to this tree to commit A onto B: > > BB -> A -> B > > > > This inserts the dummy mirror node at the top since this is a mirror job > > (active commit) > > BB -> dummy -> A -> B > > > > User wants to add a filter node F below the BB. First we create the node: > > BB -> dummy -> A -> B > > F / > > > > Commit finishes before we do block-insert-node, dummy and A is removed from > > the BB's chain, mirror_exit calls bdrv_replace_node() > > > > BB ------------> B > > F -> / > > > > Now inserting F under BB will error since dummy does not exist any more. > > I see the diagrams but the flow isn't clear without the user's QMP > commands. > > F is created using blockdev-add? What are the parameters and how does > it know about dummy? I think this is an interesting question in itself > because dummy is a transient internal node that QMP users don't > necessarily know about. We can expect that users of block-insert-node also know about block job filter nodes, simply because the former is newer than the latter. (I also don't like the name "dummy", this makes it sound like it's not really a proper node. In reality, there is little reason to treat it specially.) > What is the full block-insert-node command? > > > The proposal doesn't guard against the following: > > User issues a block-commit to this tree to commit B onto C: > > BB -> A -> B -> C > > > > The dummy node (commit's top bs is A): > > BB -> A -> dummy -> B -> C > > > > blockdev-add of a filter we wish to eventually be between A and C: > > BB -> A -> dummy -> B -> C > > F/ > > > > if commit finishes before we issue block-insert-node (commit_complete() > > calls bdrv_set_backing_hd() which only touches A) > > BB -> A --------------> C > > F -> dummy -> B / > > resulting in error when issuing block-insert-node, > > > > else if we set the job to manual-delete and insert F: > > BB -> A -> F -> dummy -> B -> C > > deleting the job will result in F being lost since the job's top bs wasn't > > updated. > > manual-delete is a solution for block jobs. The write threshold > feature is a plain QMP command that in the future will create a > transient internal node (before write notifier). Yes, that becomes a legacy QMP command then. The new way is blockdev-add and block-insert-node for write threshold, too. Apart from that, a write threshold node never disappears by itself, it is only inserted or removed in the context of a QMP command. That makes it a lot less dangerous than automatic block completion. > I'm not sure it makes sense to turn the write threshold feature into a > block job. I guess it could work, but it seems a little unnatural and > maybe there will be other features that use transient internal nodes. Yeah, no reason to do so. Just a manually created filter node. > What I'm getting at is that there might be alternative to manual-delete > that work in the general case. For example, if blockdev-add + > block-insert-node are part of a QMP 'transaction' command, can we lock > the graph to protect it against modifications? We thought about this at the last STR meeting. However, not for much longer than a minute. Markus and I both agreed that we wouldn't be the ones to try and make blockdev-add transactionable. It sounds like a huge amount of work and a high risk of bugs for little benefit. Let's just make sure that we provide a way for management tools to avoid any surprise changes in the block graph. Kevin [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 801 bytes --] ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [RFC] block-insert-node and block-job-delete 2017-07-28 12:08 ` Kevin Wolf @ 2017-07-31 14:53 ` Stefan Hajnoczi 2017-07-31 17:30 ` Manos Pitsidianakis 1 sibling, 0 replies; 13+ messages in thread From: Stefan Hajnoczi @ 2017-07-31 14:53 UTC (permalink / raw) To: Kevin Wolf Cc: Manos Pitsidianakis, qemu-block, qemu-devel, Alberto Garcia, John Snow, Jeff Cody [-- Attachment #1: Type: text/plain, Size: 4019 bytes --] On Fri, Jul 28, 2017 at 02:08:43PM +0200, Kevin Wolf wrote: > Am 27.07.2017 um 12:07 hat Stefan Hajnoczi geschrieben: > > On Wed, Jul 26, 2017 at 09:23:20PM +0300, Manos Pitsidianakis wrote: > > > On Wed, Jul 26, 2017 at 04:12:21PM +0100, Stefan Hajnoczi wrote: > > > > On Wed, Jul 26, 2017 at 05:19:24PM +0300, Manos Pitsidianakis wrote: > > > > > This proposal follows a discussion we had with Kevin and Stefan on filter > > > > > node management. > > > > > > > > > > With block filter drivers arises a need to configure filter nodes on runtime > > > > > with QMP on live graphs. A problem with doing live graph modifications is > > > > > that some block jobs modify the graph when they are done and don't operate > > > > > under any synchronisation, resulting in a race condition if we try to insert > > > > > a filter node in the place of an existing edge. > > > > > > > > But maybe you are thinking about higher level race conditions between > > > > QMP commands. Can you give an example of the race? > > > > > > Exactly, synchronisation between the QMP/user actions. Here's an example, > > > correct me if I'm wrong: > > > > > > User issues a block-commit to this tree to commit A onto B: > > > BB -> A -> B > > > > > > This inserts the dummy mirror node at the top since this is a mirror job > > > (active commit) > > > BB -> dummy -> A -> B > > > > > > User wants to add a filter node F below the BB. First we create the node: > > > BB -> dummy -> A -> B > > > F / > > > > > > Commit finishes before we do block-insert-node, dummy and A is removed from > > > the BB's chain, mirror_exit calls bdrv_replace_node() > > > > > > BB ------------> B > > > F -> / > > > > > > Now inserting F under BB will error since dummy does not exist any more. > > > > I see the diagrams but the flow isn't clear without the user's QMP > > commands. > > > > F is created using blockdev-add? What are the parameters and how does > > it know about dummy? I think this is an interesting question in itself > > because dummy is a transient internal node that QMP users don't > > necessarily know about. > > We can expect that users of block-insert-node also know about block job > filter nodes, simply because the former is newer than the latter. > > (I also don't like the name "dummy", this makes it sound like it's not > really a proper node. In reality, there is little reason to treat it > specially.) > > > What is the full block-insert-node command? > > > > > The proposal doesn't guard against the following: > > > User issues a block-commit to this tree to commit B onto C: > > > BB -> A -> B -> C > > > > > > The dummy node (commit's top bs is A): > > > BB -> A -> dummy -> B -> C > > > > > > blockdev-add of a filter we wish to eventually be between A and C: > > > BB -> A -> dummy -> B -> C > > > F/ > > > > > > if commit finishes before we issue block-insert-node (commit_complete() > > > calls bdrv_set_backing_hd() which only touches A) > > > BB -> A --------------> C > > > F -> dummy -> B / > > > resulting in error when issuing block-insert-node, > > > > > > else if we set the job to manual-delete and insert F: > > > BB -> A -> F -> dummy -> B -> C > > > deleting the job will result in F being lost since the job's top bs wasn't > > > updated. > > > > manual-delete is a solution for block jobs. The write threshold > > feature is a plain QMP command that in the future will create a > > transient internal node (before write notifier). > > Yes, that becomes a legacy QMP command then. The new way is blockdev-add > and block-insert-node for write threshold, too. > > Apart from that, a write threshold node never disappears by itself, it > is only inserted or removed in the context of a QMP command. That makes > it a lot less dangerous than automatic block completion. Nice thinking - write threshold can be a node too! [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 455 bytes --] ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [RFC] block-insert-node and block-job-delete 2017-07-28 12:08 ` Kevin Wolf 2017-07-31 14:53 ` Stefan Hajnoczi @ 2017-07-31 17:30 ` Manos Pitsidianakis 2017-08-01 13:50 ` Kevin Wolf 1 sibling, 1 reply; 13+ messages in thread From: Manos Pitsidianakis @ 2017-07-31 17:30 UTC (permalink / raw) To: Kevin Wolf Cc: Stefan Hajnoczi, qemu-block, qemu-devel, Alberto Garcia, John Snow, Jeff Cody [-- Attachment #1: Type: text/plain, Size: 4874 bytes --] On Fri, Jul 28, 2017 at 02:08:43PM +0200, Kevin Wolf wrote: >Am 27.07.2017 um 12:07 hat Stefan Hajnoczi geschrieben: >> On Wed, Jul 26, 2017 at 09:23:20PM +0300, Manos Pitsidianakis wrote: >> > On Wed, Jul 26, 2017 at 04:12:21PM +0100, Stefan Hajnoczi wrote: >> > > On Wed, Jul 26, 2017 at 05:19:24PM +0300, Manos Pitsidianakis wrote: >> > > > This proposal follows a discussion we had with Kevin and Stefan on filter >> > > > node management. >> > > > >> > > > With block filter drivers arises a need to configure filter nodes on runtime >> > > > with QMP on live graphs. A problem with doing live graph modifications is >> > > > that some block jobs modify the graph when they are done and don't operate >> > > > under any synchronisation, resulting in a race condition if we try to insert >> > > > a filter node in the place of an existing edge. >> > > >> > > But maybe you are thinking about higher level race conditions between >> > > QMP commands. Can you give an example of the race? >> > >> > Exactly, synchronisation between the QMP/user actions. Here's an example, >> > correct me if I'm wrong: >> > >> > User issues a block-commit to this tree to commit A onto B: >> > BB -> A -> B >> > >> > This inserts the dummy mirror node at the top since this is a mirror job >> > (active commit) >> > BB -> dummy -> A -> B >> > >> > User wants to add a filter node F below the BB. First we create the node: >> > BB -> dummy -> A -> B >> > F / >> > >> > Commit finishes before we do block-insert-node, dummy and A is removed from >> > the BB's chain, mirror_exit calls bdrv_replace_node() >> > >> > BB ------------> B >> > F -> / >> > >> > Now inserting F under BB will error since dummy does not exist any more. >> >> I see the diagrams but the flow isn't clear without the user's QMP >> commands. >> >> F is created using blockdev-add? What are the parameters and how does >> it know about dummy? I think this is an interesting question in itself >> because dummy is a transient internal node that QMP users don't >> necessarily know about. > >We can expect that users of block-insert-node also know about block job >filter nodes, simply because the former is newer than the latter. > >(I also don't like the name "dummy", this makes it sound like it's not >really a proper node. In reality, there is little reason to treat it >specially.) > >> What is the full block-insert-node command? >> >> > The proposal doesn't guard against the following: >> > User issues a block-commit to this tree to commit B onto C: >> > BB -> A -> B -> C >> > >> > The dummy node (commit's top bs is A): >> > BB -> A -> dummy -> B -> C >> > >> > blockdev-add of a filter we wish to eventually be between A and C: >> > BB -> A -> dummy -> B -> C >> > F/ >> > >> > if commit finishes before we issue block-insert-node (commit_complete() >> > calls bdrv_set_backing_hd() which only touches A) >> > BB -> A --------------> C >> > F -> dummy -> B / >> > resulting in error when issuing block-insert-node, >> > >> > else if we set the job to manual-delete and insert F: >> > BB -> A -> F -> dummy -> B -> C >> > deleting the job will result in F being lost since the job's top bs wasn't >> > updated. >> >> manual-delete is a solution for block jobs. The write threshold >> feature is a plain QMP command that in the future will create a >> transient internal node (before write notifier). > >Yes, that becomes a legacy QMP command then. The new way is blockdev-add >and block-insert-node for write threshold, too. > >Apart from that, a write threshold node never disappears by itself, it >is only inserted or removed in the context of a QMP command. That makes >it a lot less dangerous than automatic block completion. > >> I'm not sure it makes sense to turn the write threshold feature into a >> block job. I guess it could work, but it seems a little unnatural and >> maybe there will be other features that use transient internal nodes. > >Yeah, no reason to do so. Just a manually created filter node. > Can you explain what you have in mind? The current workflow is using block-set-write-threshold on the targetted bs. If we want write threshold to be on node level we would want a new filter driver so that it can take options special to write-threshold. Unless we make the notify filter be a write threshold by default, and when using it internally by the backup job, disable the threshold and add our backup notifier to the node's list. Of course the current write threshold code could be refactored into a new driver so that it doesn't have to rely on notifiers. The way I've currently done the conversion is to add an implicit filter when enabling the threshold, just like in backup jobs. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [RFC] block-insert-node and block-job-delete 2017-07-31 17:30 ` Manos Pitsidianakis @ 2017-08-01 13:50 ` Kevin Wolf 2017-08-01 13:57 ` Manos Pitsidianakis 0 siblings, 1 reply; 13+ messages in thread From: Kevin Wolf @ 2017-08-01 13:50 UTC (permalink / raw) To: Manos Pitsidianakis, Stefan Hajnoczi, qemu-block, qemu-devel, Alberto Garcia, John Snow, Jeff Cody [-- Attachment #1: Type: text/plain, Size: 6005 bytes --] Am 31.07.2017 um 19:30 hat Manos Pitsidianakis geschrieben: > On Fri, Jul 28, 2017 at 02:08:43PM +0200, Kevin Wolf wrote: > > Am 27.07.2017 um 12:07 hat Stefan Hajnoczi geschrieben: > > > On Wed, Jul 26, 2017 at 09:23:20PM +0300, Manos Pitsidianakis wrote: > > > > On Wed, Jul 26, 2017 at 04:12:21PM +0100, Stefan Hajnoczi wrote: > > > > > On Wed, Jul 26, 2017 at 05:19:24PM +0300, Manos Pitsidianakis wrote: > > > > > > This proposal follows a discussion we had with Kevin and Stefan on filter > > > > > > node management. > > > > > > > > > > > > With block filter drivers arises a need to configure filter nodes on runtime > > > > > > with QMP on live graphs. A problem with doing live graph modifications is > > > > > > that some block jobs modify the graph when they are done and don't operate > > > > > > under any synchronisation, resulting in a race condition if we try to insert > > > > > > a filter node in the place of an existing edge. > > > > > > > > > > But maybe you are thinking about higher level race conditions between > > > > > QMP commands. Can you give an example of the race? > > > > > > > > Exactly, synchronisation between the QMP/user actions. Here's an example, > > > > correct me if I'm wrong: > > > > > > > > User issues a block-commit to this tree to commit A onto B: > > > > BB -> A -> B > > > > > > > > This inserts the dummy mirror node at the top since this is a mirror job > > > > (active commit) > > > > BB -> dummy -> A -> B > > > > > > > > User wants to add a filter node F below the BB. First we create the node: > > > > BB -> dummy -> A -> B > > > > F / > > > > > > > > Commit finishes before we do block-insert-node, dummy and A is removed from > > > > the BB's chain, mirror_exit calls bdrv_replace_node() > > > > > > > > BB ------------> B > > > > F -> / > > > > > > > > Now inserting F under BB will error since dummy does not exist any more. > > > > > > I see the diagrams but the flow isn't clear without the user's QMP > > > commands. > > > > > > F is created using blockdev-add? What are the parameters and how does > > > it know about dummy? I think this is an interesting question in itself > > > because dummy is a transient internal node that QMP users don't > > > necessarily know about. > > > > We can expect that users of block-insert-node also know about block job > > filter nodes, simply because the former is newer than the latter. > > > > (I also don't like the name "dummy", this makes it sound like it's not > > really a proper node. In reality, there is little reason to treat it > > specially.) > > > > > What is the full block-insert-node command? > > > > > > > The proposal doesn't guard against the following: > > > > User issues a block-commit to this tree to commit B onto C: > > > > BB -> A -> B -> C > > > > > > > > The dummy node (commit's top bs is A): > > > > BB -> A -> dummy -> B -> C > > > > > > > > blockdev-add of a filter we wish to eventually be between A and C: > > > > BB -> A -> dummy -> B -> C > > > > F/ > > > > > > > > if commit finishes before we issue block-insert-node (commit_complete() > > > > calls bdrv_set_backing_hd() which only touches A) > > > > BB -> A --------------> C > > > > F -> dummy -> B / > > > > resulting in error when issuing block-insert-node, > > > > > > > > else if we set the job to manual-delete and insert F: > > > > BB -> A -> F -> dummy -> B -> C > > > > deleting the job will result in F being lost since the job's top bs wasn't > > > > updated. > > > > > > manual-delete is a solution for block jobs. The write threshold > > > feature is a plain QMP command that in the future will create a > > > transient internal node (before write notifier). > > > > Yes, that becomes a legacy QMP command then. The new way is blockdev-add > > and block-insert-node for write threshold, too. > > > > Apart from that, a write threshold node never disappears by itself, it > > is only inserted or removed in the context of a QMP command. That makes > > it a lot less dangerous than automatic block completion. > > > > > I'm not sure it makes sense to turn the write threshold feature into a > > > block job. I guess it could work, but it seems a little unnatural and > > > maybe there will be other features that use transient internal nodes. > > > > Yeah, no reason to do so. Just a manually created filter node. > > > > Can you explain what you have in mind? The current workflow is using > block-set-write-threshold on the targetted bs. If we want write threshold to > be on node level we would want a new filter driver so that it can take > options special to write-threshold. Yes, I think that's what we want. > Unless we make the notify filter be a write threshold by default, and > when using it internally by the backup job, disable the threshold and > add our backup notifier to the node's list. Of course the current > write threshold code could be refactored into a new driver so that it > doesn't have to rely on notifiers. As discussed on IRC, we shouldn't implement a bad interface (which notifiers are, they bypass normal block device configuration) in a different way, but we should replace it by something that fits better in the block layer design. In other words, don't add magic to provide the same notifier functions as we had, but convert their callers to have a block node of their own (i.e. one backup_top driver, one write-treshold driver, etc.). > The way I've currently done the conversion is to add an implicit > filter when enabling the threshold, just like in backup jobs. This is fine, but it should be specifically a write-threshold filter that a user can also manually insert whereever they like. The old QMP interfaces would be considered legacy commands then, because blockdev-add and blockdev-insert-node can provide the same thing. Kevin [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 801 bytes --] ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [RFC] block-insert-node and block-job-delete 2017-08-01 13:50 ` Kevin Wolf @ 2017-08-01 13:57 ` Manos Pitsidianakis 0 siblings, 0 replies; 13+ messages in thread From: Manos Pitsidianakis @ 2017-08-01 13:57 UTC (permalink / raw) To: Kevin Wolf Cc: Stefan Hajnoczi, qemu-block, qemu-devel, Alberto Garcia, John Snow, Jeff Cody [-- Attachment #1: Type: text/plain, Size: 6095 bytes --] On Tue, Aug 01, 2017 at 03:50:36PM +0200, Kevin Wolf wrote: >Am 31.07.2017 um 19:30 hat Manos Pitsidianakis geschrieben: >> On Fri, Jul 28, 2017 at 02:08:43PM +0200, Kevin Wolf wrote: >> > Am 27.07.2017 um 12:07 hat Stefan Hajnoczi geschrieben: >> > > On Wed, Jul 26, 2017 at 09:23:20PM +0300, Manos Pitsidianakis wrote: >> > > > On Wed, Jul 26, 2017 at 04:12:21PM +0100, Stefan Hajnoczi wrote: >> > > > > On Wed, Jul 26, 2017 at 05:19:24PM +0300, Manos Pitsidianakis wrote: >> > > > > > This proposal follows a discussion we had with Kevin and Stefan on filter >> > > > > > node management. >> > > > > > >> > > > > > With block filter drivers arises a need to configure filter nodes on runtime >> > > > > > with QMP on live graphs. A problem with doing live graph modifications is >> > > > > > that some block jobs modify the graph when they are done and don't operate >> > > > > > under any synchronisation, resulting in a race condition if we try to insert >> > > > > > a filter node in the place of an existing edge. >> > > > > >> > > > > But maybe you are thinking about higher level race conditions between >> > > > > QMP commands. Can you give an example of the race? >> > > > >> > > > Exactly, synchronisation between the QMP/user actions. Here's an example, >> > > > correct me if I'm wrong: >> > > > >> > > > User issues a block-commit to this tree to commit A onto B: >> > > > BB -> A -> B >> > > > >> > > > This inserts the dummy mirror node at the top since this is a mirror job >> > > > (active commit) >> > > > BB -> dummy -> A -> B >> > > > >> > > > User wants to add a filter node F below the BB. First we create the node: >> > > > BB -> dummy -> A -> B >> > > > F / >> > > > >> > > > Commit finishes before we do block-insert-node, dummy and A is removed from >> > > > the BB's chain, mirror_exit calls bdrv_replace_node() >> > > > >> > > > BB ------------> B >> > > > F -> / >> > > > >> > > > Now inserting F under BB will error since dummy does not exist any more. >> > > >> > > I see the diagrams but the flow isn't clear without the user's QMP >> > > commands. >> > > >> > > F is created using blockdev-add? What are the parameters and how does >> > > it know about dummy? I think this is an interesting question in itself >> > > because dummy is a transient internal node that QMP users don't >> > > necessarily know about. >> > >> > We can expect that users of block-insert-node also know about block job >> > filter nodes, simply because the former is newer than the latter. >> > >> > (I also don't like the name "dummy", this makes it sound like it's not >> > really a proper node. In reality, there is little reason to treat it >> > specially.) >> > >> > > What is the full block-insert-node command? >> > > >> > > > The proposal doesn't guard against the following: >> > > > User issues a block-commit to this tree to commit B onto C: >> > > > BB -> A -> B -> C >> > > > >> > > > The dummy node (commit's top bs is A): >> > > > BB -> A -> dummy -> B -> C >> > > > >> > > > blockdev-add of a filter we wish to eventually be between A and C: >> > > > BB -> A -> dummy -> B -> C >> > > > F/ >> > > > >> > > > if commit finishes before we issue block-insert-node (commit_complete() >> > > > calls bdrv_set_backing_hd() which only touches A) >> > > > BB -> A --------------> C >> > > > F -> dummy -> B / >> > > > resulting in error when issuing block-insert-node, >> > > > >> > > > else if we set the job to manual-delete and insert F: >> > > > BB -> A -> F -> dummy -> B -> C >> > > > deleting the job will result in F being lost since the job's top bs wasn't >> > > > updated. >> > > >> > > manual-delete is a solution for block jobs. The write threshold >> > > feature is a plain QMP command that in the future will create a >> > > transient internal node (before write notifier). >> > >> > Yes, that becomes a legacy QMP command then. The new way is blockdev-add >> > and block-insert-node for write threshold, too. >> > >> > Apart from that, a write threshold node never disappears by itself, it >> > is only inserted or removed in the context of a QMP command. That makes >> > it a lot less dangerous than automatic block completion. >> > >> > > I'm not sure it makes sense to turn the write threshold feature into a >> > > block job. I guess it could work, but it seems a little unnatural and >> > > maybe there will be other features that use transient internal nodes. >> > >> > Yeah, no reason to do so. Just a manually created filter node. >> > >> >> Can you explain what you have in mind? The current workflow is using >> block-set-write-threshold on the targetted bs. If we want write threshold to >> be on node level we would want a new filter driver so that it can take >> options special to write-threshold. > >Yes, I think that's what we want. > >> Unless we make the notify filter be a write threshold by default, and >> when using it internally by the backup job, disable the threshold and >> add our backup notifier to the node's list. Of course the current >> write threshold code could be refactored into a new driver so that it >> doesn't have to rely on notifiers. > >As discussed on IRC, we shouldn't implement a bad interface (which >notifiers are, they bypass normal block device configuration) in a >different way, but we should replace it by something that fits better in >the block layer design. > >In other words, don't add magic to provide the same notifier functions >as we had, but convert their callers to have a block node of their own >(i.e. one backup_top driver, one write-treshold driver, etc.). > >> The way I've currently done the conversion is to add an implicit >> filter when enabling the threshold, just like in backup jobs. > >This is fine, but it should be specifically a write-threshold filter >that a user can also manually insert whereever they like. The old QMP >interfaces would be considered legacy commands then, because >blockdev-add and blockdev-insert-node can provide the same thing. > >Kevin Alright thanks, this will make it easier. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [RFC] block-insert-node and block-job-delete 2017-07-26 18:23 ` Manos Pitsidianakis 2017-07-27 10:07 ` Stefan Hajnoczi @ 2017-07-27 22:09 ` John Snow 2017-07-28 8:49 ` Manos Pitsidianakis 2017-07-28 11:55 ` Kevin Wolf 1 sibling, 2 replies; 13+ messages in thread From: John Snow @ 2017-07-27 22:09 UTC (permalink / raw) To: Manos Pitsidianakis, Stefan Hajnoczi, qemu-block, qemu-devel, Kevin Wolf, Alberto Garcia, Jeff Cody On 07/26/2017 02:23 PM, Manos Pitsidianakis wrote: > On Wed, Jul 26, 2017 at 04:12:21PM +0100, Stefan Hajnoczi wrote: >> On Wed, Jul 26, 2017 at 05:19:24PM +0300, Manos Pitsidianakis wrote: >>> This proposal follows a discussion we had with Kevin and Stefan on >>> filter >>> node management. >>> >>> With block filter drivers arises a need to configure filter nodes on >>> runtime >>> with QMP on live graphs. A problem with doing live graph >>> modifications is >>> that some block jobs modify the graph when they are done and don't >>> operate >>> under any synchronisation, resulting in a race condition if we try to >>> insert >>> a filter node in the place of an existing edge. >> >> But maybe you are thinking about higher level race conditions between >> QMP commands. Can you give an example of the race? > > Exactly, synchronisation between the QMP/user actions. Here's an > example, correct me if I'm wrong: > > User issues a block-commit to this tree to commit A onto B: > BB -> A -> B > > This inserts the dummy mirror node at the top since this is a mirror job > (active commit) > BB -> dummy -> A -> B > > User wants to add a filter node F below the BB. First we create the node: > BB -> dummy -> A -> B > F / > > Commit finishes before we do block-insert-node, dummy and A is removed > from the BB's chain, mirror_exit calls bdrv_replace_node() > > BB ------------> B > F -> / > > Now inserting F under BB will error since dummy does not exist any more. > Exactly how much of a problem is this? You, the user, have instructed QEMU to do two conflicting things: (1) Squash A down into B, and (2) Insert a filter above A Shame on you for deleting the node you were trying to reference, no? And as an added bonus, QEMU will even error out on your command instead of trying to do something and getting it wrong. Is this a problem? > The proposal doesn't guard against the following: > User issues a block-commit to this tree to commit B onto C: > BB -> A -> B -> C > > The dummy node (commit's top bs is A): > BB -> A -> dummy -> B -> C > > blockdev-add of a filter we wish to eventually be between A and C: > BB -> A -> dummy -> B -> C > F/ > > if commit finishes before we issue block-insert-node (commit_complete() > calls bdrv_set_backing_hd() which only touches A) > BB -> A --------------> C > F -> dummy -> B / > resulting in error when issuing block-insert-node, Because "B" and "dummy" have already actually been deleted, I assume. (The diagram is confusing.) > > else if we set the job to manual-delete and insert F: > BB -> A -> F -> dummy -> B -> C > deleting the job will result in F being lost since the job's top bs > wasn't updated. > Seems like a fairly good reason not to pursue this avenue then, yes? I think I agree with Stefan's concerns that this is more of a temporary workaround for the benefit of jobs, but that there may well be other reasons (especially in the future) where graph manipulations may occur invisibly to the user. Do you have any use cases for failure here that don't involve the user themselves asking for two conflicting things? That is, QEMU doing some graph reorganization without the user's knowledge coupled with the user asking for a new filter node? Are there any failures that result in anything more dangerous or bad than a "404: node not found" style error where we simply reject the premise of what the user was attempting to accomplish? ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [RFC] block-insert-node and block-job-delete 2017-07-27 22:09 ` John Snow @ 2017-07-28 8:49 ` Manos Pitsidianakis 2017-07-28 11:55 ` Kevin Wolf 1 sibling, 0 replies; 13+ messages in thread From: Manos Pitsidianakis @ 2017-07-28 8:49 UTC (permalink / raw) To: John Snow Cc: Stefan Hajnoczi, qemu-block, qemu-devel, Kevin Wolf, Alberto Garcia, Jeff Cody [-- Attachment #1: Type: text/plain, Size: 4425 bytes --] On Thu, Jul 27, 2017 at 06:09:04PM -0400, John Snow wrote: > > >On 07/26/2017 02:23 PM, Manos Pitsidianakis wrote: >>On Wed, Jul 26, 2017 at 04:12:21PM +0100, Stefan Hajnoczi wrote: >>>On Wed, Jul 26, 2017 at 05:19:24PM +0300, Manos Pitsidianakis wrote: >>>>This proposal follows a discussion we had with Kevin and Stefan >>>>on filter >>>>node management. >>>> >>>>With block filter drivers arises a need to configure filter >>>>nodes on runtime >>>>with QMP on live graphs. A problem with doing live graph >>>>modifications is >>>>that some block jobs modify the graph when they are done and >>>>don't operate >>>>under any synchronisation, resulting in a race condition if we >>>>try to insert >>>>a filter node in the place of an existing edge. >>> >>>But maybe you are thinking about higher level race conditions between >>>QMP commands. Can you give an example of the race? >> >>Exactly, synchronisation between the QMP/user actions. Here's an >>example, correct me if I'm wrong: >> >>User issues a block-commit to this tree to commit A onto B: >> BB -> A -> B >> >>This inserts the dummy mirror node at the top since this is a mirror >>job (active commit) >> BB -> dummy -> A -> B >> >>User wants to add a filter node F below the BB. First we create the node: >> BB -> dummy -> A -> B >> F / >> >>Commit finishes before we do block-insert-node, dummy and A is >>removed from the BB's chain, mirror_exit calls bdrv_replace_node() >> >> BB ------------> B >> F -> / >> >>Now inserting F under BB will error since dummy does not exist any more. >> > >Exactly how much of a problem is this? You, the user, have instructed >QEMU to do two conflicting things: > >(1) Squash A down into B, and >(2) Insert a filter above A > >Shame on you for deleting the node you were trying to reference, no? >And as an added bonus, QEMU will even error out on your command >instead of trying to do something and getting it wrong. > >Is this a problem? > >>The proposal doesn't guard against the following: >>User issues a block-commit to this tree to commit B onto C: >> BB -> A -> B -> C >> >>The dummy node (commit's top bs is A): >> BB -> A -> dummy -> B -> C >> >>blockdev-add of a filter we wish to eventually be between A and C: >> BB -> A -> dummy -> B -> C >> F/ >> >>if commit finishes before we issue block-insert-node (commit_complete() >>calls bdrv_set_backing_hd() which only touches A) >> BB -> A --------------> C >> F -> dummy -> B / >> resulting in error when issuing block-insert-node, > >Because "B" and "dummy" have already actually been deleted, I assume. >(The diagram is confusing.) because we have an edge (A, C) and C is not a child of F, so we cannot insert it as a child to A. (I rotated the usual ascii art by 90 deg to save me from handling whitespace, but I might have done it worse) >> >>else if we set the job to manual-delete and insert F: >> BB -> A -> F -> dummy -> B -> C >>deleting the job will result in F being lost since the job's top bs >>wasn't updated. >> > >Seems like a fairly good reason not to pursue this avenue then, yes? I think this is just a different problem, not a race but of broadcasting graph operations to affected parties (block jobs). > >I think I agree with Stefan's concerns that this is more of a >temporary workaround for the benefit of jobs, but that there may well >be other reasons (especially in the future) where graph manipulations >may occur invisibly to the user. > >Do you have any use cases for failure here that don't involve the user >themselves asking for two conflicting things? That is, QEMU doing some >graph reorganization without the user's knowledge coupled with the >user asking for a new filter node? > >Are there any failures that result in anything more dangerous or bad >than a "404: node not found" style error where we simply reject the >premise of what the user was attempting to accomplish? > I couldn't come up with a case that caused catastrophic failure, no (though I lack familiarity with block jobs). If this isn't a problem after all, good news. block-insert-node needs to work with no dangerous conflicts to other graph operations, basically. Is there a case where there's not just an error and we can't recover? [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [RFC] block-insert-node and block-job-delete 2017-07-27 22:09 ` John Snow 2017-07-28 8:49 ` Manos Pitsidianakis @ 2017-07-28 11:55 ` Kevin Wolf 2017-08-02 10:47 ` Stefan Hajnoczi 1 sibling, 1 reply; 13+ messages in thread From: Kevin Wolf @ 2017-07-28 11:55 UTC (permalink / raw) To: John Snow Cc: Manos Pitsidianakis, Stefan Hajnoczi, qemu-block, qemu-devel, Alberto Garcia, Jeff Cody, armbru Am 28.07.2017 um 00:09 hat John Snow geschrieben: > On 07/26/2017 02:23 PM, Manos Pitsidianakis wrote: > > On Wed, Jul 26, 2017 at 04:12:21PM +0100, Stefan Hajnoczi wrote: > > > On Wed, Jul 26, 2017 at 05:19:24PM +0300, Manos Pitsidianakis wrote: > > > > This proposal follows a discussion we had with Kevin and Stefan > > > > on filter > > > > node management. > > > > > > > > With block filter drivers arises a need to configure filter > > > > nodes on runtime > > > > with QMP on live graphs. A problem with doing live graph > > > > modifications is > > > > that some block jobs modify the graph when they are done and > > > > don't operate > > > > under any synchronisation, resulting in a race condition if we > > > > try to insert > > > > a filter node in the place of an existing edge. > > > > > > But maybe you are thinking about higher level race conditions between > > > QMP commands. Can you give an example of the race? > > > > Exactly, synchronisation between the QMP/user actions. Here's an > > example, correct me if I'm wrong: > > > > User issues a block-commit to this tree to commit A onto B: > > BB -> A -> B > > > > This inserts the dummy mirror node at the top since this is a mirror job > > (active commit) > > BB -> dummy -> A -> B > > > > User wants to add a filter node F below the BB. First we create the node: > > BB -> dummy -> A -> B > > F / > > > > Commit finishes before we do block-insert-node, dummy and A is removed > > from the BB's chain, mirror_exit calls bdrv_replace_node() > > > > BB ------------> B > > F -> / > > > > Now inserting F under BB will error since dummy does not exist any more. So the basic scenario is the one from the last section of the STR block meeting notes: http://lists.gnu.org/archive/html/qemu-block/2016-12/msg00045.html At the time, this scenario seemed to make sense, but in fact, as your explanation show, it's not a real problem. The theory that inserting the filter would bring the mirror_top node back is false because removing the mirror_top node updates _all_ of its parents (including the new throttling node) rather than just the BB to point to the BDS below. However, whether block-insert-node actually fails depends on how you address the edge(s) on which you want to insert a filter node. We would only get an error if you specify it as "insert filter above this node". However, this is not a way to unambiguously reference a single edge, you could only update _all_ incoming edges for a node at once. Therefore I don't think this is a good way to do things. An unambiguous way to identify a single edge is the name of the parent node (usually a node-name, but possibly a BB name - they share a single namespace, so having one QMP argument for both could be enough) and of the child (as in BdrvChild), e.g. "insert as new child=backing of node=myimage", This way, the example scenario would actually just work. > Exactly how much of a problem is this? You, the user, have instructed QEMU > to do two conflicting things: > > (1) Squash A down into B, and > (2) Insert a filter above A > > Shame on you for deleting the node you were trying to reference, no? And as > an added bonus, QEMU will even error out on your command instead of trying > to do something and getting it wrong. > > Is this a problem? I don't think these two actions can reasonably be considered conflicting, adding a filter on top doesn't influence the commit operation at all. We used to just forbid everything while doing block jobs, but these operations are block jobs because they are long-running, and just forbidding everything for a long time isn't very helpful. There is no reason why a user shouldn't be able to take a snapshot while they are creating a backup, etc. As I said above, filter nodes automagically going away at some point without an explicit user command isn't necessarily a problem, but I also don't feel very comfortable with it. It certainly feels like an easy way to introduce bugs. Things should become much easier to manage when all graph modifications are explicit. I'm surprised that you seem to be opposed to an explicit job-delete now as we already discussed this before and it would solve several problems: * Block job completion issues only an event. If the management tools reconnects, it has no way to get the same information with a query-* command. If jobs stay around until job-delete, this is not a problem. * Completion is usually rather complicated with all of the graph changes and involves operations that can fail, but we don't have a way to actually report errors to the user. job-delete could do that. * And finally the management problem we're discussing here. Yes, if I want to snapshot below a mirror node and the mirror goes away immediately before I send the command, libvirt will just get an error and can process all events and update its graph view and then retry with another node name. Though even if it's possible, do you really believe this will be fun to implement for management tools and help to avoid bugs? Or will people even remember to properly implement this? On the other hand, the downsides of job-delete: * Er... Haven't heard of any yet. Maybe inconvenient for human users? But they shouldn't be using QMP anyway. > > The proposal doesn't guard against the following: > > User issues a block-commit to this tree to commit B onto C: > > BB -> A -> B -> C > > > > The dummy node (commit's top bs is A): > > BB -> A -> dummy -> B -> C > > > > blockdev-add of a filter we wish to eventually be between A and C: > > BB -> A -> dummy -> B -> C > > F/ > > > > if commit finishes before we issue block-insert-node (commit_complete() > > calls bdrv_set_backing_hd() which only touches A) > > BB -> A --------------> C > > F -> dummy -> B / > > resulting in error when issuing block-insert-node, > > Because "B" and "dummy" have already actually been deleted, I assume. (The > diagram is confusing.) Why would there be a graph change at all before job-delete? > > else if we set the job to manual-delete and insert F: > > BB -> A -> F -> dummy -> B -> C > > deleting the job will result in F being lost since the job's top bs > > wasn't updated. > > Seems like a fairly good reason not to pursue this avenue then, yes? That's an implementation problem of the commit job. That it requires knowledge about the graph above the commit_top node is just completely wrong. The current implementation works for the trivial case, but not for much more. You don't even have to add filters dynamically for it to break. A future correct commit job would just replace the commit_top node by base when it completes. I have patches to implement this change, but unfortunately they revealed some bugs between bdrv_reopen() and op blockers, so they aren't ready yet to be posted. > I think I agree with Stefan's concerns that this is more of a > temporary workaround for the benefit of jobs, but that there may well > be other reasons (especially in the future) where graph manipulations > may occur invisibly to the user. We should do our best to avoid this. Implicit graph changes only make libvirt's life hard. I agree that we'll have many more graph changes in the future, but let's keep them as explicit as possible. > Do you have any use cases for failure here that don't involve the user > themselves asking for two conflicting things? That is, QEMU doing some graph > reorganization without the user's knowledge coupled with the user asking for > a new filter node? > > Are there any failures that result in anything more dangerous or bad than a > "404: node not found" style error where we simply reject the premise of what > the user was attempting to accomplish? "404: node not found" is already very bad if you want management tools to be able to cope with it. Kevin ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [RFC] block-insert-node and block-job-delete 2017-07-28 11:55 ` Kevin Wolf @ 2017-08-02 10:47 ` Stefan Hajnoczi 0 siblings, 0 replies; 13+ messages in thread From: Stefan Hajnoczi @ 2017-08-02 10:47 UTC (permalink / raw) To: Kevin Wolf Cc: John Snow, Manos Pitsidianakis, qemu-block, qemu-devel, Alberto Garcia, Jeff Cody, armbru [-- Attachment #1: Type: text/plain, Size: 1054 bytes --] On Fri, Jul 28, 2017 at 01:55:38PM +0200, Kevin Wolf wrote: > Am 28.07.2017 um 00:09 hat John Snow geschrieben: > > On 07/26/2017 02:23 PM, Manos Pitsidianakis wrote: > > > On Wed, Jul 26, 2017 at 04:12:21PM +0100, Stefan Hajnoczi wrote: > > > > On Wed, Jul 26, 2017 at 05:19:24PM +0300, Manos Pitsidianakis wrote: > > I think I agree with Stefan's concerns that this is more of a > > temporary workaround for the benefit of jobs, but that there may well > > be other reasons (especially in the future) where graph manipulations > > may occur invisibly to the user. > > We should do our best to avoid this. Implicit graph changes only make > libvirt's life hard. I agree that we'll have many more graph changes in > the future, but let's keep them as explicit as possible. Speaking of implicit graph changes: the COLO replication driver (block/replication.c) uses the backup job internally and that requires before write notifiers. That means there is an implicit graph change when COLO decides to complete the backup job. Stefan [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 455 bytes --] ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2017-08-02 10:47 UTC | newest] Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-07-26 14:19 [Qemu-devel] [RFC] block-insert-node and block-job-delete Manos Pitsidianakis 2017-07-26 15:12 ` Stefan Hajnoczi 2017-07-26 18:23 ` Manos Pitsidianakis 2017-07-27 10:07 ` Stefan Hajnoczi 2017-07-28 12:08 ` Kevin Wolf 2017-07-31 14:53 ` Stefan Hajnoczi 2017-07-31 17:30 ` Manos Pitsidianakis 2017-08-01 13:50 ` Kevin Wolf 2017-08-01 13:57 ` Manos Pitsidianakis 2017-07-27 22:09 ` John Snow 2017-07-28 8:49 ` Manos Pitsidianakis 2017-07-28 11:55 ` Kevin Wolf 2017-08-02 10:47 ` Stefan Hajnoczi
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.