QEMU-Devel Archive on lore.kernel.org
 help / color / Atom feed
From: "Denis V. Lunev" <den@openvz.org>
To: nbd-general@lists.sourceforge.net, qemu-devel@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>,
	Pavel Borzenkov <pborzenkov@virtuozzo.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>, Wouter Verhelst <w@uter.be>,
	den@openvz.org
Subject: [Qemu-devel] [PATCH 1/2] NBD proto: add WRITE_ZEROES extension
Date: Wed, 23 Mar 2016 17:16:01 +0300
Message-ID: <1458742562-30624-2-git-send-email-den@openvz.org> (raw)
In-Reply-To: <1458742562-30624-1-git-send-email-den@openvz.org>

From: Pavel Borzenkov <pborzenkov@virtuozzo.com>

There exist some cases when a client knows that the data it is going to
write is all zeroes. Such cases include mirroring or backing up a device
implemented by a sparse file.

With current NBD command set, the client has to issue NBD_CMD_WRITE
command with zeroed payload and transfer these zero bytes through the
wire. The server has to write the data onto disk, effectively denying
the sparseness.

To remedy this, the patch adds WRITE_ZEROES extension with one new
NBD_CMD_WRITE_ZEROES command.

Signed-off-by: Pavel Borzenkov <pborzenkov@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Wouter Verhelst <w@uter.be>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
---
 doc/proto.md | 44 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)

diff --git a/doc/proto.md b/doc/proto.md
index 463ef8a..cda213c 100644
--- a/doc/proto.md
+++ b/doc/proto.md
@@ -241,6 +241,8 @@ immediately after the global flags field in oldstyle negotiation:
   schedule I/O accesses as for a rotational medium
 - bit 5, `NBD_FLAG_SEND_TRIM`; should be set to 1 if the server supports
   `NBD_CMD_TRIM` commands
+- bit 6, `NBD_FLAG_SEND_WRITE_ZEROES`; should be set to 1 if the server
+  supports `NBD_CMD_WRITE_ZEROES` commands
 
 ##### Client flags
 
@@ -471,6 +473,10 @@ The following request types exist:
     about the contents of the export affected by this command, until
     overwriting it again with `NBD_CMD_WRITE`.
 
+* `NBD_CMD_WRITE_ZEROES` (6)
+
+    Defined by the experimental `WRITE_ZEROES` extension; see below.
+
 * Other requests
 
     Some third-party implementations may require additional protocol
@@ -594,6 +600,44 @@ option reply type.
       message if they do not also send it as a reply to the
       `NBD_OPT_SELECT` message.
 
+### `WRITE_ZEROES` extension
+
+There exist some cases when a client knows that the data it is going to write
+is all zeroes. Such cases include mirroring or backing up a device implemented
+by a sparse file. With current NBD command set, the client has to issue
+`NBD_CMD_WRITE` command with zeroed payload and transfer these zero bytes
+through the wire. The server has to write the data onto disk, effectively
+denying the sparseness.
+
+To remedy this, a `WRITE_ZEROES` extension is envisioned. This extension adds
+one new command with two command flags.
+
+* `NBD_CMD_WRITE_ZEROES` (6)
+
+    A write request with no payload. Length and offset define the location
+    and amount of data to be zeroed.
+
+    The server MUST zero out the data on disk, and then send the reply
+    message. The server MAY send the reply message before the data has
+    reached permanent storage.
+
+    If the `NBD_FLAG_SEND_FUA` flag ("Force Unit Access") was set in the
+    export flags field, the client MAY set the flag `NBD_CMD_FLAG_FUA` (bit 0)
+    in the command flags field. If this flag was set, the server MUST NOT send
+    the reply until it has ensured that the newly-zeroed data has reached
+    permanent storage.
+
+    If the flag `NBD_CMD_FLAG_MAY_TRIM` (bit 1) was set by the client in the
+    command flags field, the server MAY use trimming to zero out the area,
+    but it MUST ensure that the data reads back as zero.
+
+    If an error occurs, the server SHOULD set the appropriate error code
+    in the error field. The server MAY then close the connection.
+
+The server SHOULD return `ENOSPC` if it receives a write zeroes request
+including one or more sectors beyond the size of the device. It SHOULD
+return `EPERM` if it receives a write zeroes request on a read-only export.
+
 ## About this file
 
 This file tries to document the NBD protocol as it is currently
-- 
2.1.4

  reply index

Thread overview: 54+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-03-23 14:16 [Qemu-devel] [PATCH 0/2] NBD protocol extensions: WRITE_ZEROES and GET_LBA_STATUS Denis V. Lunev
2016-03-23 14:16 ` Denis V. Lunev [this message]
2016-03-23 15:14   ` [Qemu-devel] [PATCH 1/2] NBD proto: add WRITE_ZEROES extension Eric Blake
2016-03-23 17:40     ` [Qemu-devel] [Nbd] " Wouter Verhelst
2016-03-24  7:16     ` [Qemu-devel] " Pavel Borzenkov
2016-03-24  7:36       ` [Qemu-devel] [Nbd] " Wouter Verhelst
2016-03-23 17:21   ` Wouter Verhelst
2016-03-24  7:57     ` Pavel Borzenkov
2016-03-24  8:26       ` Wouter Verhelst
2016-03-24 11:35         ` Pavel Borzenkov
2016-03-24 11:37         ` Paolo Bonzini
2016-03-24 12:31           ` Wouter Verhelst
2016-03-24 14:53         ` Eric Blake
2016-03-23 14:16 ` [Qemu-devel] [PATCH 2/2] NBD proto: add GET_LBA_STATUS extension Denis V. Lunev
2016-03-23 16:27   ` Eric Blake
2016-03-24 12:30     ` Pavel Borzenkov
2016-03-24 15:04       ` Eric Blake
2016-03-24 16:36         ` Pavel Borzenkov
2016-03-23 17:58   ` [Qemu-devel] [Nbd] " Wouter Verhelst
2016-03-23 18:14     ` Kevin Wolf
2016-03-24  8:25       ` Pavel Borzenkov
2016-03-24  8:41         ` Wouter Verhelst
2016-03-24 11:36           ` Pavel Borzenkov
2016-03-24 12:32             ` Wouter Verhelst
2016-03-24  8:43     ` Pavel Borzenkov
2016-03-24  9:33       ` Wouter Verhelst
2016-03-24 10:32         ` Alex Bligh
2016-03-24 11:58           ` Paolo Bonzini
2016-03-24 12:17             ` Alex Bligh
2016-03-24 12:32               ` Paolo Bonzini
2016-03-24 13:31                 ` Alex Bligh
2016-03-24 13:32                   ` Paolo Bonzini
2016-03-24 11:55     ` Paolo Bonzini
2016-03-24 12:43       ` Wouter Verhelst
2016-03-24 15:25       ` Eric Blake
2016-03-24 15:33         ` Paolo Bonzini
2016-03-24 15:53           ` Wouter Verhelst
2016-03-24 16:04             ` Eric Blake
2016-03-24 16:07               ` Kevin Wolf
2016-03-24 16:47                 ` Wouter Verhelst
2016-03-29  9:38                   ` Kevin Wolf
2016-03-29  9:53                     ` Wouter Verhelst
2016-03-29 10:25                     ` Paolo Bonzini
2016-03-24 22:08   ` [Qemu-devel] " Eric Blake
2016-03-25  8:49     ` [Qemu-devel] [Nbd] " Wouter Verhelst
2016-03-25  9:01       ` Alex Bligh
2016-03-28 15:58       ` Eric Blake
2016-04-04 10:32         ` Markus Pargmann
2016-04-04 10:18       ` Markus Pargmann
2016-04-04 16:54         ` Eric Blake
2016-04-04 22:17         ` Wouter Verhelst
2016-04-04 16:40   ` [Qemu-devel] " Eric Blake
2016-04-04 20:16   ` Denis V. Lunev
2016-04-04 20:36     ` [Qemu-devel] [Nbd] " Eric Blake

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1458742562-30624-2-git-send-email-den@openvz.org \
    --to=den@openvz.org \
    --cc=kwolf@redhat.com \
    --cc=nbd-general@lists.sourceforge.net \
    --cc=pbonzini@redhat.com \
    --cc=pborzenkov@virtuozzo.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    --cc=w@uter.be \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

QEMU-Devel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/qemu-devel/0 qemu-devel/git/0.git
	git clone --mirror https://lore.kernel.org/qemu-devel/1 qemu-devel/git/1.git
	git clone --mirror https://lore.kernel.org/qemu-devel/2 qemu-devel/git/2.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 qemu-devel qemu-devel/ https://lore.kernel.org/qemu-devel \
		qemu-devel@nongnu.org
	public-inbox-index qemu-devel

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.nongnu.qemu-devel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git