All of lore.kernel.org
 help / color / mirror / Atom feed
From: Wouter Verhelst <w@uter.be>
To: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>,
	nbd-general@lists.sourceforge.net, kwolf@redhat.com,
	qemu-devel@nongnu.org, pborzenkov@virtuozzo.com,
	pbonzini@redhat.com, mpa@pengutronix.de, den@openvz.org
Subject: Re: [Qemu-devel] [Nbd] [PATCH v3] doc: Add NBD_CMD_BLOCK_STATUS extension
Date: Tue, 29 Nov 2016 14:08:50 +0100	[thread overview]
Message-ID: <20161129130850.rapxzzt4lrkym4cc@grep.be> (raw)
In-Reply-To: <5d011ebd-1485-8ca2-402d-95beb3146c30@virtuozzo.com>

Hi Vladimir,

On Tue, Nov 29, 2016 at 03:41:10PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> Hi,
> 
> 29.11.2016 13:50, Wouter Verhelst wrote:
> 
>     Hi,
> 
>     On Mon, Nov 28, 2016 at 06:33:24PM +0100, Wouter Verhelst wrote:
> 
>         However, I'm arguing that if we're going to provide information about
>         snapshots, we should be able to properly refer to these snapshots from
>         within an NBD context. My previous mail suggested adding a negotiation
>         message that would essentially ask the server "tell me about the
>         snapshots you know about", giving them an NBD identifier in the process
>         (accompanied by a "foreign" identifier that is decidedly *not* an NBD
>         identifier and that could be used to match the NBD identifier to
>         something implementation-defined). This would be read-only information;
>         the client cannot ask the server to create new snapshots. We can then
>         later in the protocol refer to these snapshots by way of that NBD
>         identifier.
> 
>     To make this a bit more concrete, I've changed the proposal like so:
> 
[...]
>     +##### Allocation contexts
>     +
>     +Allocation context 0 is the basic "exists at all" allocation context. If
>     +an extent is not allocated at allocation context 0, it MUST NOT be
>     +listed as allocated at another allocation context. This supports sparse
> 
> 
> allocated here is range with unset NBD_STATE_HOLE bit?

Eh, yes. I should clarify that a bit further (no time right now though,
but patches are certainly welcome).

>     +file semantics on the server side. If a server has only one allocation
>     +context (the default), then writing to an extent which is allocated in
>     +that allocation context 0 MUST NOT fail with ENOSPC.
>     +
>     +For all other cases, this specification requires no specific semantics
>     +of allocation contexts. Implementations could support allocation
>     +contexts with semantics like the following:
>     +
>     +- Incremental snapshots; if a block is allocated in one allocation
>     +  context, that implies that it is also allocated in the next level up.
>     +- Various bits of data about the backend of the storage; e.g., if the
>     +  storage is written on a RAID array, an allocation context could
>     +  return information about the redundancy level of a given extent
>     +- If the backend implements a write-through cache of some sort, or
>     +  synchronises with other servers, an allocation context could state
>     +  that an extent is "allocated" once it has reached permanent storage
>     +  and/or is synchronized with other servers.
>     +
>     +The only requirement of an allocation context is that it MUST be
>     +representable with the flags as defined for `NBD_CMD_BLOCK_STATUS`.
>     +
>     +Likewise, the syntax of query strings is not specified by this document.
>     +
>     +Server implementations SHOULD document their syntax for query strings
>     +and semantics for resulting allocation contexts in a document like this
>     +one.
> 
> 
> IMHO, redundant paragraph for this spec.

The "SHOULD document" one? I want that there at any rate, just to make
clear that it's probably a good thing to have it (but no, it's not part
of the formal protocol spec).

>     +
>      ### Transmission phase
> 
>      #### Flag fields
>     @@ -983,6 +1065,9 @@ valid may depend on negotiation during the handshake phase.
>         content chunk in reply.  MUST NOT be set unless the transmission
>         flags include `NBD_FLAG_SEND_DF`.  Use of this flag MAY trigger an
>         `EOVERFLOW` error chunk, if the request length is too large.
>     +- bit 3, `NBD_CMD_FLAG_REQ_ONE`; valid during `NBD_CMD_BLOCK_STATUS`. If
>     +  set, the client is interested in only one extent per allocation
>     +  context.
> 
>      ##### Structured reply flags
> 
>     @@ -1371,38 +1456,48 @@ adds a new `NBD_CMD_BLOCK_STATUS` command which returns a list of
>      ranges with their respective states.  This extension is not available
>      unless the client also negotiates the `STRUCTURED_REPLY` extension.
> 
>     -* `NBD_FLAG_SEND_BLOCK_STATUS`
>     -
>     -    The server SHOULD set this transmission flag to 1 if structured
>     -    replies have been negotiated, and the `NBD_CMD_BLOCK_STATUS`
>     -    request is supported.
>     -
>      * `NBD_REPLY_TYPE_BLOCK_STATUS`
> 
>     -    *length* MUST be a positive integer multiple of 8.  This reply
>     +    *length* MUST be 4 + (a positive integer multiple of 8).  This reply
>          represents a series of consecutive block descriptors where the sum
>          of the lengths of the descriptors MUST not be greater than the
>     -    length of the original request.  This chunk type MUST appear at most
>     -    once in a structured reply. Valid as a reply to
>     +    length of the original request. This chunk type MUST appear at most
>     +    once per allocation ID in a structured reply. Valid as a reply to
>          `NBD_CMD_BLOCK_STATUS`.
> 
>     -    The payload is structured as a list of one or more descriptors,
>     -    each with this layout:
>     +    Servers MUST return an `NBD_REPLY_TYPE_BLOCK_STATUS` chunk for every
>     +    allocation context ID, except if the semantics of particular
>     +    allocation contexts mean that the information for one allocation
>     +    context is implied by the information for another.
> 
> 
> So, actually, instead of selecting with a nbd_cmd or with external tool which
> bitmap we want to access, we just reply with all bitmaps (negotiated at the
> beginning). Personally I dislike this. With such approach, we will have to
> export allocation bitmap always, even if we need only dirtyness. Consider, that
> requesting allocation bitmap may be much more expensive in time that requesting
> dirtyness.

Ah, yes, didn't consider that. I suppose more updates will be required,
then.

What would you suggest instead, then?

> Or allocation bitmap information is implied by dirtyness?
> 
> Furthermore, as allocation context semantics defined externally, 'semantics
> mean that information is implied' states nothing, and we actually return to way
> #3, where external tool decides which bitmap to export.
> 
> 
>     +
>     +    The payload starts with:
>     +
>     +        * 32 bits, allocation context ID
>     +
>     +    and is followed by a list of one or more descriptors, each with this
>     +    layout:
> 
>              * 32 bits, length (unsigned, MUST NOT be zero)
>              * 32 bits, status flags
> 
>     -    The definition of the status flags is determined based on the
>     -    flags present in the original request.
>     +    If the client used the `NBD_CMD_FLAG_REQ_ONE` flag in the request,
>     +    then every reply chunk MUST NOT contain more than one descriptor.
>     +
>     +    Even if the client did not use the `NBD_CMD_FLAG_REQ_ONE` flag in
>     +    its request, the server MAY return less descriptors in the reply
>     +    than would be required to fully specify the whole range of requested
>     +    information to the client, if the number of descriptors would be
>     +    over 16 otherwise and looking up the information would be too
>     +    resource-intensive for the server.
> 
> 
> So, if there are <=16 extents, they all MUST be present in reply.. (just note)

That's the proposal, yes. I think it makes sense to have a minimum that
MUST be present (unless the client asked for REQ_ONE), although the
exact count can be different from 16, if needs be.

[...]
>     Thoughts?
> 
> For me considering dirtyness like one of allocation contexts sounds a bit
> weird, as dirtyness is not allocation.. But it is not so important.

I would certainly be willing to change the name, if that helps. The idea
is that you have various types of information that you can query the
server about. I called these "allocation contexts", but I'm certainly
not of the opinion that it *has* to be called that. Allocation is one of
the information types, but there can be more; my proposed spec
explicitly does not go into detail about the others.

-- 
< ron> I mean, the main *practical* problem with C++, is there's like a dozen
       people in the world who think they really understand all of its rules,
       and pretty much all of them are just lying to themselves too.
 -- #debian-devel, OFTC, 2016-02-12

  reply	other threads:[~2016-11-29 13:09 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-11-25 11:28 [Qemu-devel] [PATCH v3] doc: Add NBD_CMD_BLOCK_STATUS extension Vladimir Sementsov-Ogievskiy
2016-11-25 14:02 ` Stefan Hajnoczi
2016-11-27 19:17 ` [Qemu-devel] [Nbd] " Wouter Verhelst
2016-11-28 11:19   ` Stefan Hajnoczi
2016-11-28 17:33     ` Wouter Verhelst
2016-11-29  9:17       ` Stefan Hajnoczi
2016-11-29 10:50       ` Wouter Verhelst
2016-11-29 12:41         ` Vladimir Sementsov-Ogievskiy
2016-11-29 13:08           ` Wouter Verhelst [this message]
2016-11-29 13:07         ` Alex Bligh
2016-12-01 10:14         ` Wouter Verhelst
2016-12-01 11:26           ` Vladimir Sementsov-Ogievskiy
2016-12-02  9:25             ` Wouter Verhelst
2016-11-28 23:15   ` John Snow
2016-11-29 10:18   ` Kevin Wolf
2016-11-29 11:34     ` Vladimir Sementsov-Ogievskiy
2016-11-30 10:41   ` Sergey Talantov
2016-11-29 12:57 ` [Qemu-devel] " Alex Bligh
2016-11-29 14:36   ` Vladimir Sementsov-Ogievskiy
2016-11-29 14:52     ` Alex Bligh
2016-11-29 15:07       ` Vladimir Sementsov-Ogievskiy
2016-11-29 15:17         ` [Qemu-devel] [Nbd] " Wouter Verhelst
2016-12-01 23:42   ` [Qemu-devel] " John Snow
2016-12-02  9:16     ` Vladimir Sementsov-Ogievskiy
2016-12-02 18:45     ` Alex Bligh
2016-12-02 20:39       ` John Snow
2016-12-03 11:08         ` Alex Bligh
2016-12-05  8:36         ` Vladimir Sementsov-Ogievskiy
2016-12-06 13:32         ` [Qemu-devel] [Nbd] " Wouter Verhelst
2016-12-06 16:39           ` John Snow
2016-12-08  3:39       ` [Qemu-devel] " Alex Bligh
2016-12-08  6:58         ` Vladimir Sementsov-Ogievskiy
2016-12-08 14:13           ` Alex Bligh
2016-12-08  9:44         ` [Qemu-devel] [Nbd] " Wouter Verhelst
2016-12-08 14:40           ` Alex Bligh
2016-12-08 15:59             ` Eric Blake
2016-12-08 16:03               ` Alex Bligh

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20161129130850.rapxzzt4lrkym4cc@grep.be \
    --to=w@uter.be \
    --cc=den@openvz.org \
    --cc=kwolf@redhat.com \
    --cc=mpa@pengutronix.de \
    --cc=nbd-general@lists.sourceforge.net \
    --cc=pbonzini@redhat.com \
    --cc=pborzenkov@virtuozzo.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    --cc=vsementsov@virtuozzo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.