From: Alex Bligh
Date: Tue, 29 Mar 2016 15:37:09 +0100
Subject: Re: [Qemu-devel] [Nbd] [PATCH 3/1] doc: Propose Structured Replies extension
To: Eric Blake
Cc: "nbd-general@lists.sourceforge.net", Wouter Verhelst, "qemu-devel@nongnu.org", Alex Bligh
Message-Id: <88E5F63B-B036-45C7-B2FD-B555D54E88F4@alex.org.uk>
In-Reply-To: <56FA8F5B.8060800@redhat.com>
References: <1459173555-4890-1-git-send-email-eblake@redhat.com> <1459223796-28474-2-git-send-email-eblake@redhat.com> <55B49D68-2F63-4742-9B60-F6B428ABB3E9@alex.org.uk> <56FA8F5B.8060800@redhat.com>

Eric,

> I guess what I need to add is that in transmission phase, most commands
> have exactly one response per request; but commands may document
> scenarios where there will be multiple responses to a single request.
> NBD_CMD_READ uses the multiple responses to make partial read and error
> handling possible

Yes, this.

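To check we mean the same thing, here is a rough sketch of the client side as I read it: one request, any number of structured responses, then one concluding normal response. The `Response` type and `collect_replies` are made up for illustration and say nothing about the wire format.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class Response:
    # Hypothetical in-memory model of a reply, not the wire layout.
    structured: bool      # True for a structured chunk, False for the normal reply
    error: int = 0        # only meaningful on the concluding normal reply
    payload: bytes = b""  # chunk contents (empty on the normal reply)


def collect_replies(responses: Iterable[Response]) -> Tuple[List[bytes], int]:
    """Consume the responses to ONE request until the normal reply arrives."""
    chunks: List[bytes] = []
    for r in responses:
        if r.structured:
            chunks.append(r.payload)
        else:
            # The normal reply concludes the sequence and carries the error.
            return chunks, r.error
    raise ConnectionError("stream ended before the concluding normal response")
```

A command with exactly one response per request is then just the case where the list of chunks comes back empty.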
> Yeah, but the reconstruction is easy; naively:
>
> while response_magic == structured:
>   copy len-8 bytes of data from response to given offset
> response_magic == normal, read is complete

It's easy if the result is written to memory. It's not easy if the
purpose was (e.g.) to send it to a socket in a sendfile-type way. It
now requires the entire response to be held in memory, which wasn't a
requirement before.

> Detecting overlap or incomplete reads would require more complexity in
> the client, but I don't know that a client has to care (the protocol is
> specifically written that a client MAY detect bad servers, but not MUST;
> a client that assumes the server is well-behaved is still compliant).

Yep.

> However, you DO have a point that the server SHOULD send data in
> reasonable-size chunks; and maybe I should propose a parallel extension
> where, when negotiated between client and server, the server will
> advertise minimum and preferred I/O sizes in response to the export name
> request (for example, a server backed by a real block device may have a
> minimum of 512 bytes or 4096 bytes, and a preferred size of 64k; while a
> server backed by a normal file system may have a minimum of 1 byte);
> then put in restrictions that a server SHOULD reject read/write requests
> where offset and length are not multiples of the minimum, and that the
> server SHOULD send read chunks aligned to the preferred size (with
> exceptions for the head and tail of a larger buffer that meets minimum
> alignment but not preferred alignment).

What I'm really after is something that enables me to read 'nicely',
in a manner where I won't get fragments.

>> Also, given new commands aren't available unless you support structured
>> replies, you now have to support reassembly of replies (if you want
>> to use new features) even if all your reads are (e.g.) 1k.
>
> Are you arguing that there should be a flag that controls whether reads
> must be in-order vs. reassembled? Reassembly must happen either way,
> the question is whether having a way to allow out-of-order for
> efficiency, vs. defaulting to in-order for easier computation, is worth it.

No, that sounds overengineered. More a way of guaranteeing that
'simple' reads are not fragmented. Perhaps a 'DF' bit (don't
fragment)! If the server doesn't like it, it can always error the
command.

>>> The server
>>> + MUST NOT set the error field of a read chunk; if an error occurs, it
>>> + MAY immediately end the sequence of structured response messages,
>>> + MUST send the error in the concluding normal response, and SHOULD
>>> + keep the connection open. The final non-structured response MUST
>>> + set an error unless the sum of data sent by all read chunks totals
>>> + the original client length request.
>>
>> add "and data for the entire range requested has been supplied." (I
>> know this is technically implied by the fact data cannot be duplicated).
>
> Sure. But keep in mind that if (when?) we add a flag for allowing the
> server to skip read chunks on holes, we'll have to tweak the wording to
> allow the server to send fewer chunks than the client's length, where
> the client must then assume zeroes for all chunks not received.

Or alternatively a chunk representing a hole. I wonder whether it
might be better to extend the chunk structure by 4 bytes to allow for
future modifications like this (e.g. NBD_CHUNK_FLAG_HOLE means the
chunk is full of zeroes and has no payload).

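Roughly what I have in mind, as a sketch only: the `flags` field and the value of `NBD_CHUNK_FLAG_HOLE` are the hypothetical extension suggested above, and the `Chunk` layout and `reassemble` helper are invented for illustration rather than taken from the proposal. It also assumes chunks never overlap, so summing their lengths is enough to check that the entire requested range was supplied.

```python
from dataclasses import dataclass
from typing import Iterable

NBD_CHUNK_FLAG_HOLE = 1 << 0  # hypothetical flag value, not from the proposal


@dataclass
class Chunk:
    offset: int        # absolute offset of the chunk within the export
    length: int        # number of bytes the chunk covers
    flags: int = 0     # hypothetical 4-byte flags field
    data: bytes = b""  # payload; empty when NBD_CHUNK_FLAG_HOLE is set


def reassemble(req_offset: int, req_length: int, chunks: Iterable[Chunk]) -> bytes:
    """Rebuild one read from chunks that may arrive in any order."""
    buf = bytearray(req_length)
    covered = 0
    for c in chunks:
        start = c.offset - req_offset
        if start < 0 or start + c.length > req_length:
            raise ValueError("chunk lies outside the requested range")
        if c.flags & NBD_CHUNK_FLAG_HOLE:
            # A hole carries no payload; the range reads as zeroes.
            buf[start:start + c.length] = bytes(c.length)
        else:
            if len(c.data) != c.length:
                raise ValueError("payload size does not match chunk length")
            buf[start:start + c.length] = c.data
        covered += c.length  # valid only under the no-overlap assumption
    if covered != req_length:
        raise ValueError("data for the entire range was not supplied")
    return bytes(buf)
```

For example, a read of 8 bytes at offset 4096 answered by one 4-byte data chunk and one 4-byte hole chunk, in either order, yields the data followed by four zero bytes. Note this still holds the whole response in memory, which is exactly the cost I'd like a 'DF' style read to avoid.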
-- 
Alex Bligh