On Wed, Dec 16, 2015 at 01:25:29AM -0500, Jeff Cody wrote: > Background: > ------------ > Block jobs, and other QAPI operations, may modify and impact the > BlockDriverState graph in QEMU. In order to support multiple > operations safely, we need a mechanism to block and gate operations, > > We currently have op blockers, that are attached to each BDS. > However, in practice we check this on the device level, rather than on > the granularity of the individual BDS. Also, due to limitations of > the current system, we only allow a single block job at a time. > > > Proposed New Design Features: > ------------------------------ > This design would supersede the current op blocker system. > > Rather than have the op blockers attached directly to each BDS, Block > Job access will be done through a separate Node Access Control system. > > This new system will: Trying to understand what's new: > * Allow / Disallow operations to BDSs, (generally initiated by QAPI > actions; i.e. BDS node operations other than guest read/writes) The old system allows this. > * Not be limited to "block jobs" (e.g. block-commit, block-stream, > etc..) - will also apply to other operations, such as > blockdev-add, change-backing-file, etc. The old system allows this. > * Allow granularity in options, and provide the safety needed to > support multiple block jobs and operations. The old system allows this although it leads to a proliferation of op blockers. > * Reference each BDS by its node-name The old system is also per-BDS. > * Be independent of the bdrv_states graph (i.e. does not reside in > the BDS structure) This is new but what is benefit? I'm not advocating the old system, just trying to understand what the new requirements are. > Node Access Control: Jobs > -------------------------- > Every QAPI/HMP initiated operation is considered a "job", regardless > if it is implemented via coroutines (e.g. block jobs such as > block-commit), or handled synchronously in a QAPI handler (e.g. > change-backing-file, blockdev-add). Perhaps "operation" is clearer since the term "job" is already associated with block jobs. > A job consists of: > * Unique job ID What is the purpose of the ID? I guess you are planning a QMP interface? > * A list of all node-names affected by the operation > * Action flags (see below) for each node in the above list > > > Node Access Control: Action Flags > ----------------------------------- > Every operation must set action flag pairs, for every node affected by > the operation. > > Action flags are set as a Require/Allow pair. If the Require > flag is set, then the operation requires the ability to take the > specific action. If the Allow flag is set, then the operation will > allow other operations to perform same action in parallel. > > The default is to prohibit, not allow, parallel actions. > > The proposed actions are: > > 1. Modify - Visible Data > * This is modification that would be detected by an external > read (for instance, a hypothetical qemu-io read from the > specific node), inclusive of its backing files. A classic > example would be writing data into another node, as part of > block-commit. > > 2. Modify - Metadata > * This is a write that is not necessarily visible to an external > read, but still modifies the underlying image format behind the > node. I.e., change-backing-file to an identical backing file. > (n.b.: if changing the backing file to a non-identical > backing file, the "Write - Visible Data" action would also > be required). > > 3. Graph Reconfiguration > * Removing, adding, or reordering nodes in the bdrv_state > graph. (n.b.: Write - Visible Data may need to be set, if > appropriate) > > 4. Read - Visible Data > * Read data visible to an external read. > > 5. Read - Metadata > * Read node metadata I think this is the main change from the old system. Instead of relying on operation-specific blockers (e.g. BLOCK_OP_TYPE_COMMIT_TARGET), the new system uses just a few actions. The risk is that this small list of actions won't be fine-grained enough to express the relationships between existing operations. I don't know whether or not this is the case. Have you enumerated existing operations and checked that these actions can represent the necessary blockers? > Node Access Control: Tracking Jobs > ---------------------------------- > Each new NAC job will have a unique ID. Associated with that ID is > a list of each BDS node affected by the operation, alongside its > corresponding Action Flags. > > When a new NAC job is created, a separate list of node-name flags is > updated to provide easy go/no go decisions. > > An operation is prohibited if: These make sense: > * A "Require" Action Flag is set, and > * The logical AND of the "Allow" Action Flag of all NAC-controlled > operations for that node is 0. "You cannot do something that pre-existing operations have prohibited." > > - OR - > > * The operation does not set the "Allow" Action Flag, and > * The logical OR of the corresponding "Require" Action Flag of all > NAC-controlled operations for that node is 1. "You cannot prohibit pre-existing operations from something they are already doing." > When a NAC controlled job completes, the node-name flag list is > updated, and the corresponding NAC job removed from the job list. > > > General Notes: > ----------------- > * This is still a "honor" system, in that each handler / job is > responsible for acquiring the NAC permission, and properly identifying > all nodes affected correctly by their operation. > > This should be done before any action is taken by the handler - that > is, it should be clean to abort the operation if the NAC does not give > consent. > > * It may be possible to later expand the NAC system, to provide > handles for use by low-level operations in block.c. > > * Thoughts? There are still probably some holes in the scheme that need > to be plugged. Which new use cases require a new op blockers system?