All of lore.kernel.org
 help / color / mirror / Atom feed
From: Paolo Bonzini <pbonzini@redhat.com>
To: Wouter Verhelst <w@uter.be>, "Denis V. Lunev" <den@openvz.org>
Cc: nbd-general@lists.sourceforge.net, Kevin Wolf <kwolf@redhat.com>,
	qemu-devel@nongnu.org, Stefan Hajnoczi <stefanha@redhat.com>
Subject: Re: [Qemu-devel] [Nbd] [PATCH 2/2] NBD proto: add GET_LBA_STATUS extension
Date: Thu, 24 Mar 2016 12:55:51 +0100	[thread overview]
Message-ID: <56F3D5C7.9070007@redhat.com> (raw)
In-Reply-To: <20160323175834.GC2467@grep.be>



On 23/03/2016 18:58, Wouter Verhelst wrote:
>> +To provide such class of information, `GET_LBA_STATUS` extension adds new
>> +`NBD_CMD_GET_LBA_STATUS` command which returns a list of LBA ranges with
>> +their respective states.
>> +
>> +* `NBD_CMD_GET_LBA_STATUS` (7)
>> +
>> +    An LBA range status query request. Length and offset define the range
>> +    of interest. The server MUST reply with a reply header, followed
>> +    immediately by the following data:
> 
> As Eric noted, please expand LBA at least once.

Let's just use "block" (e.g. NBD_CMD_GET_BLOCK_STATUS).

>> +      - 32 bits, length of parameter data that follow (unsigned)
>> +      - zero or more LBA status descriptors, each having the following
>> +        structure:
>> +
>> +        * 64 bits, offset (unsigned)
>> +        * 32 bits, length (unsigned)
>> +        * 16 bits, status (unsigned)
>> +
>> +    unless an error condition has occurred.
>> +

Can we just return one descriptor?  That would simplify the protocol a bit.

>> +    the provisioning state of the device. The following provisionnig states
>> +    are defined for the command:
>> +
>> +      - `NBD_STATE_ALLOCATED` (0x0), LBA extent is present on the block device;
>> +      - `NBD_STATE_ZEROED` (0x1), LBA extent is present on the block device
>> +        and contains zeroes;
> 
> Presumably this should be "contains only zeroes"?

Yes, or "reads as zeroes".

> Also, this may end up being a fairly expensive call for the server to
> process. Is it really useful?

It's always okay for the server to omit NBD_STATE_ZERO, but it can be
useful if the state is known for some reason.  For example, file system
holes are always zero, but unallocated blocks on a block device are not
always zero.

However, let's make these bits, so that

NBD_STATE_ALLOCATED (0x1), LBA extent is present on the block device
NBD_STATE_ZERO (0x2), LBA extent will read as zeroes

and you can have NBD_STATE_ALLOCATED|NBD_STATE_ZERO.  File systems do
have the concept of unwritten extents which would be represented like
that.  The API to access the information (the FIEMAP ioctl) is broken,
but perhaps in the future a non-broken API could be added---for example
SEEK_ZERO and SEEK_NONZERO values for lseek's "whence" argument.

>> +      - `NBD_STATE_DEALLOCATED` (0x2), LBA extent is not present on the
>> +        block device. A client MUST NOT make any assumptions about the
>> +        contents of the extent.
>> +
>> +    2. Block dirtiness state
>> +
>> +    Upon receiving an `NBD_CMD_GET_LBA_STATUS` command with command flags
>> +    field set to `NBD_FLAG_GET_DIRTY` (0x1), the server MUST return
>> +    the dirtiness status of the device. The following dirtiness states
>> +    are defined for the command:
>> +
>> +      - `NBD_STATE_DIRTY` (0x0), LBA extent is dirty;
>> +      - `NBD_STATE_CLEAN` (0x1), LBA extent is clean.
>> +
>> +    Generic NBD client implementation without knowledge of a particular NBD
>> +    server operation MUST NOT make any assumption on the meaning of the
>> +    NBD_STATE_DIRTY or NBD_STATE_CLEAN states.
> 
> That makes it a useless call. A server can read /dev/random to decide
> whether to send STATE_DIRTY or STATE_CLEAN, and still be compliant with
> this spec.
> 
> Either the spec should define what it means for a block to be in a dirty
> state, or it should not talk about it.

Here is my attempt:

    This command is meant to operate in tandem with other (non-NBD)
    channels to the server.  Generally, a "dirty" block is a block that
    has been written to by someone, but the exact meaning of "has been
    written" is left to the implementation.  For example, a virtual
    machine monitor could provide a (non-NBD) command to start tracking
    blocks written by the virtual machine.  A backup client then can
    connect to an NBD server provided by the virtual machine monitor
    and use NBD_CMD_GET_BLOCK_STATUS only read blocks that the virtual
    machine has changed.

    An implementation that doesn't track the "dirtiness" state of blocks
    MUST either fail this command with EINVAL, or mark all blocks as
    dirty in the descriptor that it returns.

Paolo

  parent reply	other threads:[~2016-03-24 11:56 UTC|newest]

Thread overview: 54+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-03-23 14:16 [Qemu-devel] [PATCH 0/2] NBD protocol extensions: WRITE_ZEROES and GET_LBA_STATUS Denis V. Lunev
2016-03-23 14:16 ` [Qemu-devel] [PATCH 1/2] NBD proto: add WRITE_ZEROES extension Denis V. Lunev
2016-03-23 15:14   ` Eric Blake
2016-03-23 17:40     ` [Qemu-devel] [Nbd] " Wouter Verhelst
2016-03-24  7:16     ` [Qemu-devel] " Pavel Borzenkov
2016-03-24  7:36       ` [Qemu-devel] [Nbd] " Wouter Verhelst
2016-03-23 17:21   ` Wouter Verhelst
2016-03-24  7:57     ` Pavel Borzenkov
2016-03-24  8:26       ` Wouter Verhelst
2016-03-24 11:35         ` Pavel Borzenkov
2016-03-24 11:37         ` Paolo Bonzini
2016-03-24 12:31           ` Wouter Verhelst
2016-03-24 14:53         ` Eric Blake
2016-03-23 14:16 ` [Qemu-devel] [PATCH 2/2] NBD proto: add GET_LBA_STATUS extension Denis V. Lunev
2016-03-23 16:27   ` Eric Blake
2016-03-24 12:30     ` Pavel Borzenkov
2016-03-24 15:04       ` Eric Blake
2016-03-24 16:36         ` Pavel Borzenkov
2016-03-23 17:58   ` [Qemu-devel] [Nbd] " Wouter Verhelst
2016-03-23 18:14     ` Kevin Wolf
2016-03-24  8:25       ` Pavel Borzenkov
2016-03-24  8:41         ` Wouter Verhelst
2016-03-24 11:36           ` Pavel Borzenkov
2016-03-24 12:32             ` Wouter Verhelst
2016-03-24  8:43     ` Pavel Borzenkov
2016-03-24  9:33       ` Wouter Verhelst
2016-03-24 10:32         ` Alex Bligh
2016-03-24 11:58           ` Paolo Bonzini
2016-03-24 12:17             ` Alex Bligh
2016-03-24 12:32               ` Paolo Bonzini
2016-03-24 13:31                 ` Alex Bligh
2016-03-24 13:32                   ` Paolo Bonzini
2016-03-24 11:55     ` Paolo Bonzini [this message]
2016-03-24 12:43       ` Wouter Verhelst
2016-03-24 15:25       ` Eric Blake
2016-03-24 15:33         ` Paolo Bonzini
2016-03-24 15:53           ` Wouter Verhelst
2016-03-24 16:04             ` Eric Blake
2016-03-24 16:07               ` Kevin Wolf
2016-03-24 16:47                 ` Wouter Verhelst
2016-03-29  9:38                   ` Kevin Wolf
2016-03-29  9:53                     ` Wouter Verhelst
2016-03-29 10:25                     ` Paolo Bonzini
2016-03-24 22:08   ` [Qemu-devel] " Eric Blake
2016-03-25  8:49     ` [Qemu-devel] [Nbd] " Wouter Verhelst
2016-03-25  9:01       ` Alex Bligh
2016-03-28 15:58       ` Eric Blake
2016-04-04 10:32         ` Markus Pargmann
2016-04-04 10:18       ` Markus Pargmann
2016-04-04 16:54         ` Eric Blake
2016-04-04 22:17         ` Wouter Verhelst
2016-04-04 16:40   ` [Qemu-devel] " Eric Blake
2016-04-04 20:16   ` Denis V. Lunev
2016-04-04 20:36     ` [Qemu-devel] [Nbd] " Eric Blake

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=56F3D5C7.9070007@redhat.com \
    --to=pbonzini@redhat.com \
    --cc=den@openvz.org \
    --cc=kwolf@redhat.com \
    --cc=nbd-general@lists.sourceforge.net \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    --cc=w@uter.be \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.