All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alex Bligh <alex@alex.org.uk>
To: Wouter Verhelst <w@uter.be>
Cc: "nbd-general@lists.sourceforge.net"
	<nbd-general@lists.sourceforge.net>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	Alex Bligh <alex@alex.org.uk>
Subject: Re: [Qemu-devel] [Nbd] [PATCH 3/1] doc: Propose Structured Replies extension
Date: Tue, 29 Mar 2016 20:39:29 +0100	[thread overview]
Message-ID: <C18AAD54-A6E4-43D5-AA8B-481E8D9DF752@alex.org.uk> (raw)
In-Reply-To: <20160329185157.GC12469@grep.be>


On 29 Mar 2016, at 19:51, Wouter Verhelst <w@uter.be> wrote:

>> 
>> But I was envisioning the opposite: the server must NOT send X bytes
>> unless it knows they are valid; if it encounters a read error at Z,
>> then it sends a structured read of Z-1 bytes before the final normal
>> message that reports overall failure.  The client then assumes that
>> all X bytes received are valid.
> 
> The problem with that approach is that it makes it impossible for a
> server to use a sendfile()-like system call, where you don't know that
> there's a read error until start sending out data to the client (which
> implies that you must've already sent out the header).

I don't think sendfile semantics are ever compatible with reporting
read errors *unless* you pad after the read.

IIRC the way sendfile works is that you specify a pointer to an offset,
and sendfile sends as much as it can read (up to the length specified)
and updates the offset for the length read.

Naturally at the start of the read section, you don't know when the
error is going to occur, so you must say that the length of the data
read is going to be the length of the actual chunk. sendfile then
does its stuff, and fills up either the whole thing, or part of it.
In the case that part of the data (only) is available, you can't
report the error there and then, because the client is expecting
chunk data, so you must either close the connection (potentially
disruptive) or pad the data, and report the error at the end.

Using Eric's current scheme, you have no way of knowing where the error
occurred. Remember the chunks could be out of order, e.g. you get
chunks 1,3,5,7,9 in, and then an error, so you have no idea where
the error was. It could be in chunks 1,3,5,7,9 (and the server might
have padded the rest of the chunk) or in an unread chunk (2,4,6,8,10).
This seems undesirable.

I think we are paying too much attention to trying to keep NBD_RESPONSE
intact. The justification for this was (I think) that it made it easier
for existing protocol analysers. It doesn't, really, as all the data
is going to come BEFORE the NBD_RESPONSE (unlike in NBD_CMD_READ in
other situations).

I think we should therefore look at this the other way around. Here's
a straw man proposal as an alternative for the reply bits. For
a structured reply ALL we get is the chunks. The final chunk
(possibly the only chunk) is marked specially. Each chunk looks something
like:

offset+
0000    32 bit   NBD_STRUCTURED_REPLY_MAGIC
0004    64 bit   handle
000C    32 bit   Flags
0010    32 bit   Payload length


We have a couple of flags defined:

NBD_CHUNK_IS_DATA: the chunk is data, and the payload is a 64 bit offset
plus the data read

NBD_CHUNK_IS_HOLE: the chunk is zeroes, and the payload is a 64 bit offset
(only)

NBD_CHUNK_IS_END: (must be the final chunk). The payload is a 64 bit offset
plus a 32 bit error code, or zero. If no error, the offset must be set to
the total amount read. If there is an error, the offset MAY indicate the
position of the error. If an error occurs, no more chunks should be sent.


The advantages of this scheme are:

1. Only one packet type in the reply (chunks)

2. It's no more difficult to implement wireshark decoding of this (in
   addition to the normal NBD protocol) than the current proposal. I'd
   suggest in fact they could be easier.

3. Chunks that error part way through (sendfile type) must still be
   padded but now can indicate error location.

4. It would be possible to allow EVERY server reply to be a structured
   reply that simply set NBD_CHUNK_IS_END. That gives us a convenient
   route to servers which only implement structured replies. With DF,
   this would be little harder than implementing the current
   protocol.

-- 
Alex Bligh

  parent reply	other threads:[~2016-03-29 19:39 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-03-28 13:59 [Qemu-devel] [PATCH] doc: Mention proper use of handle Eric Blake
2016-03-29  3:56 ` [Qemu-devel] [PATCH 2/1] doc: More details on flag negotiation Eric Blake
2016-03-29  3:56 ` [Qemu-devel] [PATCH 3/1] doc: Propose Structured Replies extension Eric Blake
2016-03-29  7:33   ` [Qemu-devel] [Nbd] " Wouter Verhelst
2016-03-29  8:24   ` Alex Bligh
2016-03-29 14:21     ` Eric Blake
2016-03-29 14:37       ` Alex Bligh
2016-03-29 15:12         ` Eric Blake
2016-03-29 16:37           ` Wouter Verhelst
2016-03-29 17:34           ` Alex Bligh
2016-03-29 17:45             ` Eric Blake
2016-03-29 18:03               ` Wouter Verhelst
2016-03-29 18:07                 ` Eric Blake
2016-03-29 18:19                   ` Wouter Verhelst
2016-03-29 18:25                     ` Eric Blake
2016-03-29 18:09                 ` Alex Bligh
2016-03-29 17:53   ` Wouter Verhelst
2016-03-29 18:23     ` Eric Blake
2016-03-29 18:51       ` Wouter Verhelst
2016-03-29 19:06         ` Wouter Verhelst
2016-03-29 19:39         ` Alex Bligh [this message]
2016-03-29 20:00           ` Eric Blake
2016-03-29 20:18             ` Alex Bligh
2016-03-29 20:44             ` Alex Bligh
2016-03-29 21:05               ` Wouter Verhelst
2016-03-29 22:05                 ` Alex Bligh
2016-03-29 22:45                   ` Wouter Verhelst
2016-03-29 22:53                     ` Alex Bligh
2016-03-29  7:11 ` [Qemu-devel] [PATCH] doc: Mention proper use of handle Wouter Verhelst
2016-03-29 13:59   ` Eric Blake
2016-03-29 23:00 ` [Qemu-devel] [PATCH v2 0/3] NBD Structured Read Eric Blake
2016-03-29 23:00   ` [Qemu-devel] [PATCH v2 1/3] NBD proto: add "Command flags" section Eric Blake
2016-03-29 23:00   ` [Qemu-devel] [PATCH v2 2/3] doc: Mention proper use of handle Eric Blake
2016-03-29 23:01   ` [Qemu-devel] [PATCH v2 3/3] doc: Propose Structured Read extension Eric Blake
2016-03-29 23:29     ` Eric Blake
2016-03-30  6:50     ` Alex Bligh
2016-03-30 17:45       ` Eric Blake
2016-03-30 19:51         ` [Qemu-devel] [Nbd] " Wouter Verhelst
2016-03-30 20:54           ` Eric Blake
2016-03-30 21:26             ` Wouter Verhelst
2016-03-30 22:48         ` [Qemu-devel] " Alex Bligh
2016-03-30 20:44     ` [Qemu-devel] [Nbd] " Wouter Verhelst
2016-03-30  8:09   ` [Qemu-devel] [Nbd] [PATCH v2 0/3] NBD Structured Read Wouter Verhelst

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=C18AAD54-A6E4-43D5-AA8B-481E8D9DF752@alex.org.uk \
    --to=alex@alex.org.uk \
    --cc=nbd-general@lists.sourceforge.net \
    --cc=qemu-devel@nongnu.org \
    --cc=w@uter.be \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.