All of lore.kernel.org
 help / color / mirror / Atom feed
From: Stefan Hajnoczi <stefanha@redhat.com>
To: Jeff Cody <jcody@redhat.com>
Cc: kwolf@redhat.com, famz@redhat.com, qemu-block@nongnu.org,
	qemu-devel@nongnu.org, armbru@redhat.com, jsnow@redhat.com
Subject: Re: [Qemu-devel] RFC: Operation Blockers in QEMU Block Nodes
Date: Fri, 18 Dec 2015 13:52:26 +0800	[thread overview]
Message-ID: <20151218055226.GD13945@stefanha-x1.localdomain> (raw)
In-Reply-To: <20151216062529.GA25889@localhost.localdomain>

[-- Attachment #1: Type: text/plain, Size: 6457 bytes --]

On Wed, Dec 16, 2015 at 01:25:29AM -0500, Jeff Cody wrote:
> Background:
> ------------
> Block jobs, and other QAPI operations, may modify and impact the
> BlockDriverState graph in QEMU.  In order to support multiple
> operations safely, we need a mechanism to block and gate operations,
> 
> We currently have op blockers, that are attached to each BDS.
> However, in practice we check this on the device level, rather than on
> the granularity of the individual BDS.  Also, due to limitations of
> the current system, we only allow a single block job at a time.
> 
> 
> Proposed New Design Features:
> ------------------------------
> This design would supersede the current op blocker system.
> 
> Rather than have the op blockers attached directly to each BDS, Block
> Job access will be done through a separate Node Access Control system.
> 
> This new system will:

Trying to understand what's new:

>     * Allow / Disallow operations to BDSs, (generally initiated by QAPI
>       actions; i.e. BDS node operations other than guest read/writes)

The old system allows this.

>     * Not be limited to "block jobs" (e.g. block-commit, block-stream,
>       etc..) - will also apply to other operations, such as
>       blockdev-add, change-backing-file, etc.

The old system allows this.

>     * Allow granularity in options, and provide the safety needed to
>       support multiple block jobs and operations.

The old system allows this although it leads to a proliferation of op
blockers.

>     * Reference each BDS by its node-name

The old system is also per-BDS.

>     * Be independent of the bdrv_states graph (i.e. does not reside in
>       the BDS structure)

This is new but what is benefit?

I'm not advocating the old system, just trying to understand what the
new requirements are.

> Node Access Control: Jobs
> --------------------------
> Every QAPI/HMP initiated operation is considered a "job", regardless
> if it is implemented via coroutines (e.g. block jobs such as
> block-commit), or handled synchronously in a QAPI handler (e.g.
> change-backing-file, blockdev-add).

Perhaps "operation" is clearer since the term "job" is already
associated with block jobs.

> A job consists of:
>     * Unique job ID

What is the purpose of the ID?  I guess you are planning a QMP
interface?

>     * A list of all node-names affected by the operation
>     * Action flags (see below) for each node in the above list
> 
> 
> Node Access Control: Action Flags
> -----------------------------------
> Every operation must set action flag pairs, for every node affected by
> the operation.
> 
> Action flags are set as a Require/Allow pair.  If the Require
> flag is set, then the operation requires the ability to take the
> specific action.  If the Allow flag is set, then the operation will
> allow other operations to perform same action in parallel.
> 
> The default is to prohibit, not allow, parallel actions.
> 
> The proposed actions are:
> 
>     1. Modify - Visible Data
>         * This is modification that would be detected by an external
>           read (for instance, a hypothetical qemu-io read from the
>           specific node), inclusive of its backing files.  A classic
>           example would be writing data into another node, as part of
>           block-commit.
> 
>     2. Modify - Metadata
>         * This is a write that is not necessarily visible to an external
>           read, but still modifies the underlying image format behind the
>           node.  I.e., change-backing-file to an identical backing file.
>           (n.b.: if changing the backing file to a non-identical
>           backing file, the "Write - Visible Data" action would also
>           be required).
> 
>     3. Graph Reconfiguration
>         * Removing, adding, or reordering nodes in the bdrv_state
>           graph. (n.b.: Write - Visible Data may need to be set, if
>           appropriate)
> 
>     4. Read - Visible Data
>         * Read data visible to an external read.
> 
>     5. Read - Metadata
>         * Read node metadata

I think this is the main change from the old system.  Instead of relying
on operation-specific blockers (e.g. BLOCK_OP_TYPE_COMMIT_TARGET), the
new system uses just a few actions.

The risk is that this small list of actions won't be fine-grained enough
to express the relationships between existing operations.  I don't know
whether or not this is the case.

Have you enumerated existing operations and checked that these actions
can represent the necessary blockers?

> Node Access Control: Tracking Jobs
> ----------------------------------
> Each new NAC job will have a unique ID. Associated with that ID is
> a list of each BDS node affected by the operation, alongside its
> corresponding Action Flags.
> 
> When a new NAC job is created, a separate list of node-name flags is
> updated to provide easy go/no go decisions.
> 
> An operation is prohibited if:

These make sense:

>     * A "Require" Action Flag is set, and
>     * The logical AND of the "Allow" Action Flag of all NAC-controlled
>       operations for that node is 0.

"You cannot do something that pre-existing operations have prohibited."

> 
>     - OR -
> 
>     * The operation does not set the "Allow" Action Flag, and
>     * The logical OR of the corresponding "Require" Action Flag of all
>       NAC-controlled operations for that node is 1.

"You cannot prohibit pre-existing operations from something they are
already doing."

> When a NAC controlled job completes, the node-name flag list is
> updated, and the corresponding NAC job removed from the job list.
> 
> 
> General Notes:
> -----------------
> * This is still a "honor" system, in that each handler / job is
> responsible for acquiring the NAC permission, and properly identifying
> all nodes affected correctly by their operation.
> 
> This should be done before any action is taken by the handler - that
> is, it should be clean to abort the operation if the NAC does not give
> consent.
> 
> * It may be possible to later expand the NAC system, to provide
>   handles for use by low-level operations in block.c.
> 
> * Thoughts?  There are still probably some holes in the scheme that need
> to be plugged.

Which new use cases require a new op blockers system?

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

  parent reply	other threads:[~2015-12-18  6:57 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-12-16  6:25 [Qemu-devel] RFC: Operation Blockers in QEMU Block Nodes Jeff Cody
2015-12-16 16:31 ` Eric Blake
2015-12-16 18:45   ` Jeff Cody
2015-12-18  5:52 ` Stefan Hajnoczi [this message]
2015-12-18 14:19 ` Kevin Wolf
2015-12-19  5:42   ` Jeff Cody
2016-01-25 14:10     ` [Qemu-devel] [Qemu-block] " Alberto Garcia

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151218055226.GD13945@stefanha-x1.localdomain \
    --to=stefanha@redhat.com \
    --cc=armbru@redhat.com \
    --cc=famz@redhat.com \
    --cc=jcody@redhat.com \
    --cc=jsnow@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.