All of lore.kernel.org
 help / color / mirror / Atom feed
From: Eric Blake <eblake@redhat.com>
To: "Denis V. Lunev" <den@openvz.org>,
	nbd-general@lists.sourceforge.net, qemu-devel@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Pavel Borzenkov <pborzenkov@virtuozzo.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	Wouter Verhelst <w@uter.be>
Subject: Re: [Qemu-devel] [PATCH 2/2] NBD proto: add GET_LBA_STATUS extension
Date: Thu, 24 Mar 2016 16:08:13 -0600	[thread overview]
Message-ID: <56F4654D.2020700@redhat.com> (raw)
In-Reply-To: <1458742562-30624-3-git-send-email-den@openvz.org>

[-- Attachment #1: Type: text/plain, Size: 4961 bytes --]

On 03/23/2016 08:16 AM, Denis V. Lunev wrote:
> From: Pavel Borzenkov <pborzenkov@virtuozzo.com>
> 
> With the availability of sparse storage formats, it is often needed to
> query status of a particular LBA range and read only those blocks of
> data that are actually present on the block device.
> 
> To provide such information, the patch adds GET_LBA_STATUS extension
> with one new NBD_CMD_GET_LBA_STATUS command.
> 

> +* `NBD_CMD_GET_LBA_STATUS` (7)
> +
> +    An LBA range status query request. Length and offset define the range
> +    of interest. The server MUST reply with a reply header, followed
> +    immediately by the following data:
> +
> +      - 32 bits, length of parameter data that follow (unsigned)
> +      - zero or more LBA status descriptors, each having the following
> +        structure:
> +
> +        * 64 bits, offset (unsigned)
> +        * 32 bits, length (unsigned)
> +        * 16 bits, status (unsigned)
> +
> +    unless an error condition has occurred.

To date, only the NBD_CMD_READ command caused the server to send data
after the handle in its reply.  This would be the second command with a
data field in the response, but it is returning a variable amount of
data, not directly tied to the client's length - but at least you did
make it structured so that the client knows how much to read.  However,
your patch is incomplete; you'll need to edit the "Transmission" section
of the document to mention the rules on the server sending data, as the
existing text would now be wrong:

> The server replies with:
> 
> S: 32 bits, 0x67446698, magic (NBD_REPLY_MAGIC)
> S: 32 bits, error
> S: 64 bits, handle
> S: (length bytes of data if the request is of type NBD_CMD_READ)

You may also want to add a rule that for all future extensions, any
command that requires data in the server response (other than the server
response to NBD_CMD_READ) must include a 32-bit length as the first
field of its data payload, so that the server reply is always structured.

Hmm, I still think it would be hard to write a wireshark dissector for
server replies, since there is no indication whether a given reply will
include data or not.  The _client_ knows (well, any well-written client
that uses a different value for 'handle' for every command sent to the
server), because it can read the returned 'handle' field to know what
command it matches to and therefore what data to expect; but a
third-party observer seeing _just_ the server messages has no idea which
server responses should have payload.  Scanning the stream and assuming
that a new reply starts each time NBD_REPLY_MAGIC is encountered is a
close approximation, but hits false positives if the data being returned
for NBD_CMD_READ contains the same magic number as part of its contents.
 Obviously, back-compat says we can't change the response to
NBD_CMD_READ, but that means that a wireshark dissector has to either
maintain context, or hunt through returned data looking for potential
magic numbers and possibly hitting false positives.

That said, maybe we want to allow global flag negotiation prior to
transmission to add a new flag on both server and client side - the
server would advertise that it is capable of a new reply mode, and the
client then has to reply that it wants to use the new reply mode; in
that mode, all server replies starting with magic 0x67446698
(NBD_REPLY_MAGIC) will never have a data payload, and all commands that
cause a reply with payload (both NBD_CMD_READ and the new
NBD_CMD_GET_LBA_STATUS of this message, or whatever name we give it)
will reply with a NEW magic number:

S: 32 bits, XXXX, magic (NBD_REPLY_MAGIC2)
S: 32 bits, error
S: 64 bits, handle
S: 32 bits, length
S: length bytes of data

so that the server's data stream is fully structured without having to
maintain context of the client's requests.  That is, a dissector can now
merely scan for both magic numbers; and on a stream between a client and
server that have negotiated the new mode, the old magic number will
never have a payload, and the new magic number will always be
accompanied with a payload that describes how much data to read to the
boundary of the next reply.

For that matter, right now, NBD_CMD_READ is required to either end the
session or return length bytes of data even when error is non-zero (but
where those bytes MAY be invalid); but by adding a negotiated flag for
structured length replies, it would be possible to allow for short reads
and/or returning an error with 0 bytes of payload but still keeping the
connection to the client open, without having to send wasted bytes over
the wire.

I could write up a negotiation of global flags for structured reply
lengths as an extension proposal, if you think it is worth it.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

  parent reply	other threads:[~2016-03-24 22:08 UTC|newest]

Thread overview: 54+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-03-23 14:16 [Qemu-devel] [PATCH 0/2] NBD protocol extensions: WRITE_ZEROES and GET_LBA_STATUS Denis V. Lunev
2016-03-23 14:16 ` [Qemu-devel] [PATCH 1/2] NBD proto: add WRITE_ZEROES extension Denis V. Lunev
2016-03-23 15:14   ` Eric Blake
2016-03-23 17:40     ` [Qemu-devel] [Nbd] " Wouter Verhelst
2016-03-24  7:16     ` [Qemu-devel] " Pavel Borzenkov
2016-03-24  7:36       ` [Qemu-devel] [Nbd] " Wouter Verhelst
2016-03-23 17:21   ` Wouter Verhelst
2016-03-24  7:57     ` Pavel Borzenkov
2016-03-24  8:26       ` Wouter Verhelst
2016-03-24 11:35         ` Pavel Borzenkov
2016-03-24 11:37         ` Paolo Bonzini
2016-03-24 12:31           ` Wouter Verhelst
2016-03-24 14:53         ` Eric Blake
2016-03-23 14:16 ` [Qemu-devel] [PATCH 2/2] NBD proto: add GET_LBA_STATUS extension Denis V. Lunev
2016-03-23 16:27   ` Eric Blake
2016-03-24 12:30     ` Pavel Borzenkov
2016-03-24 15:04       ` Eric Blake
2016-03-24 16:36         ` Pavel Borzenkov
2016-03-23 17:58   ` [Qemu-devel] [Nbd] " Wouter Verhelst
2016-03-23 18:14     ` Kevin Wolf
2016-03-24  8:25       ` Pavel Borzenkov
2016-03-24  8:41         ` Wouter Verhelst
2016-03-24 11:36           ` Pavel Borzenkov
2016-03-24 12:32             ` Wouter Verhelst
2016-03-24  8:43     ` Pavel Borzenkov
2016-03-24  9:33       ` Wouter Verhelst
2016-03-24 10:32         ` Alex Bligh
2016-03-24 11:58           ` Paolo Bonzini
2016-03-24 12:17             ` Alex Bligh
2016-03-24 12:32               ` Paolo Bonzini
2016-03-24 13:31                 ` Alex Bligh
2016-03-24 13:32                   ` Paolo Bonzini
2016-03-24 11:55     ` Paolo Bonzini
2016-03-24 12:43       ` Wouter Verhelst
2016-03-24 15:25       ` Eric Blake
2016-03-24 15:33         ` Paolo Bonzini
2016-03-24 15:53           ` Wouter Verhelst
2016-03-24 16:04             ` Eric Blake
2016-03-24 16:07               ` Kevin Wolf
2016-03-24 16:47                 ` Wouter Verhelst
2016-03-29  9:38                   ` Kevin Wolf
2016-03-29  9:53                     ` Wouter Verhelst
2016-03-29 10:25                     ` Paolo Bonzini
2016-03-24 22:08   ` Eric Blake [this message]
2016-03-25  8:49     ` Wouter Verhelst
2016-03-25  9:01       ` Alex Bligh
2016-03-28 15:58       ` Eric Blake
2016-04-04 10:32         ` Markus Pargmann
2016-04-04 10:18       ` Markus Pargmann
2016-04-04 16:54         ` Eric Blake
2016-04-04 22:17         ` Wouter Verhelst
2016-04-04 16:40   ` [Qemu-devel] " Eric Blake
2016-04-04 20:16   ` Denis V. Lunev
2016-04-04 20:36     ` [Qemu-devel] [Nbd] " Eric Blake

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=56F4654D.2020700@redhat.com \
    --to=eblake@redhat.com \
    --cc=den@openvz.org \
    --cc=kwolf@redhat.com \
    --cc=nbd-general@lists.sourceforge.net \
    --cc=pbonzini@redhat.com \
    --cc=pborzenkov@virtuozzo.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    --cc=w@uter.be \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.