From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:52260) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aj0fi-0002gu-9F for qemu-devel@nongnu.org; Thu, 24 Mar 2016 04:31:18 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1aj0fe-0001q9-V1 for qemu-devel@nongnu.org; Thu, 24 Mar 2016 04:31:14 -0400 Received: from mail-am1on0102.outbound.protection.outlook.com ([157.56.112.102]:9129 helo=emea01-am1-obe.outbound.protection.outlook.com) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aj0fe-0001pn-5k for qemu-devel@nongnu.org; Thu, 24 Mar 2016 04:31:10 -0400 Date: Thu, 24 Mar 2016 10:57:06 +0300 From: Pavel Borzenkov Message-ID: <20160324075706.GA24831@phobos.sw.ru> References: <1458742562-30624-1-git-send-email-den@openvz.org> <1458742562-30624-2-git-send-email-den@openvz.org> <20160323172116.GA2467@grep.be> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Disposition: inline In-Reply-To: <20160323172116.GA2467@grep.be> Subject: Re: [Qemu-devel] [Nbd] [PATCH 1/2] NBD proto: add WRITE_ZEROES extension List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Wouter Verhelst Cc: nbd-general@lists.sourceforge.net, Kevin Wolf , qemu-devel@nongnu.org, Stefan Hajnoczi , Paolo Bonzini , "Denis V. Lunev" On Wed, Mar 23, 2016 at 06:21:16PM +0100, Wouter Verhelst wrote: > Hi, > > On Wed, Mar 23, 2016 at 05:16:01PM +0300, Denis V. Lunev wrote: > > From: Pavel Borzenkov > > > > There exist some cases when a client knows that the data it is going to > > write is all zeroes. Such cases include mirroring or backing up a device > > implemented by a sparse file. > > > > With current NBD command set, the client has to issue NBD_CMD_WRITE > > command with zeroed payload and transfer these zero bytes through the > > wire. The server has to write the data onto disk, effectively denying > > the sparseness. > > > > To remedy this, the patch adds WRITE_ZEROES extension with one new > > NBD_CMD_WRITE_ZEROES command. > > > > Signed-off-by: Pavel Borzenkov > > Reviewed-by: Roman Kagan > > Signed-off-by: Denis V. Lunev > > CC: Wouter Verhelst > > CC: Paolo Bonzini > > CC: Kevin Wolf > > CC: Stefan Hajnoczi > > --- > > doc/proto.md | 44 ++++++++++++++++++++++++++++++++++++++++++++ > > 1 file changed, 44 insertions(+) > > > > diff --git a/doc/proto.md b/doc/proto.md > > index 463ef8a..cda213c 100644 > > --- a/doc/proto.md > > +++ b/doc/proto.md > > @@ -241,6 +241,8 @@ immediately after the global flags field in oldstyle negotiation: > > schedule I/O accesses as for a rotational medium > > - bit 5, `NBD_FLAG_SEND_TRIM`; should be set to 1 if the server supports > > `NBD_CMD_TRIM` commands > > +- bit 6, `NBD_FLAG_SEND_WRITE_ZEROES`; should be set to 1 if the server > > + supports `NBD_CMD_WRITE_ZEROES` commands > > > > ##### Client flags > > > > @@ -471,6 +473,10 @@ The following request types exist: > > about the contents of the export affected by this command, until > > overwriting it again with `NBD_CMD_WRITE`. > > > > +* `NBD_CMD_WRITE_ZEROES` (6) > > + > > + Defined by the experimental `WRITE_ZEROES` extension; see below. > > + > > * Other requests > > > > Some third-party implementations may require additional protocol > > @@ -594,6 +600,44 @@ option reply type. > > message if they do not also send it as a reply to the > > `NBD_OPT_SELECT` message. > > > > +### `WRITE_ZEROES` extension > > + > > +There exist some cases when a client knows that the data it is going to write > > +is all zeroes. Such cases include mirroring or backing up a device implemented > > +by a sparse file. With current NBD command set, the client has to issue > > +`NBD_CMD_WRITE` command with zeroed payload and transfer these zero bytes > > +through the wire. The server has to write the data onto disk, effectively > > +denying the sparseness. > > + > > +To remedy this, a `WRITE_ZEROES` extension is envisioned. This extension adds > > +one new command with two command flags. > > + > > +* `NBD_CMD_WRITE_ZEROES` (6) > > + > > + A write request with no payload. Length and offset define the location > > + and amount of data to be zeroed. > > + > > + The server MUST zero out the data on disk, and then send the reply > > + message. The server MAY send the reply message before the data has > > + reached permanent storage. > > + > > + If the `NBD_FLAG_SEND_FUA` flag ("Force Unit Access") was set in the > > + export flags field, the client MAY set the flag `NBD_CMD_FLAG_FUA` (bit 0) > > + in the command flags field. If this flag was set, the server MUST NOT send > > + the reply until it has ensured that the newly-zeroed data has reached > > + permanent storage. > > + > > + If the flag `NBD_CMD_FLAG_MAY_TRIM` (bit 1) was set by the client in the > > + command flags field, the server MAY use trimming to zero out the area, > > + but it MUST ensure that the data reads back as zero. > > + > > + If an error occurs, the server SHOULD set the appropriate error code > > + in the error field. The server MAY then close the connection. > > + > > +The server SHOULD return `ENOSPC` if it receives a write zeroes request > > +including one or more sectors beyond the size of the device. It SHOULD > > +return `EPERM` if it receives a write zeroes request on a read-only export. > > + > > So, the semantics of your proposed WRITE_ZEROES are exactly the same as > the WRITE command, except that no payload is sent? > > In that case, I think it's slightly more sensible if we don't add a new > command, but instead just have an NBD_CMD_FLAG_ZEROES added to the WRITE > command instead. After all, they're going to be (mostly) the same > anyway. > > Did you propose a separate command for a specific reason that I'm > missing (or forgetting), or is that just an oversight? No, there is no specific reason. Looks like NBD_CMD_FLAG_ZEROES fits the spec and implementations nicely. So I'll rewrite the extension and add the flag instead of the whole command. > > -- > < ron> I mean, the main *practical* problem with C++, is there's like a dozen > people in the world who think they really understand all of its rules, > and pretty much all of them are just lying to themselves too. > -- #debian-devel, OFTC, 2016-02-12 > > ------------------------------------------------------------------------------ > Transform Data into Opportunity. > Accelerate data analysis in your applications with > Intel Data Analytics Acceleration Library. > Click to learn more. > http://pubads.g.doubleclick.net/gampad/clk?id=278785351&iu=/4140 > _______________________________________________ > Nbd-general mailing list > Nbd-general@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/nbd-general