All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH] doc: Flip bit sense for allowing trim during WRITE_ZEROES
@ 2016-04-01 21:29 Eric Blake
  2016-04-01 22:20 ` Alex Bligh
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Eric Blake @ 2016-04-01 21:29 UTC (permalink / raw)
  To: nbd-general; +Cc: den, qemu-devel, alex, pborzenkov

Rather than requiring allocation by default and allowing trims
only on request during WRITE_ZEROES, it seems like a better
default is to allow server optimizations by default and require
full allocation by specific request.  Since WRITE_ZEROES is
experimental and has not yet been implemented, we can flip the
sense of the command flag so that the default flags of all 0s
is the most efficient, and flags (whether _FUA or _NO_HOLE) are
added to make the operation have tighter constraints but
possibly slower execution.

The name NO_HOLE is also slightly nicer at expressing the fact
that there is no dependency on whether NBD_CMD_TRIM is supported.

Also tweak a couple of formatting issues for consistency (for
example, only reserve a bit number in one place).

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 doc/proto.md | 30 +++++++++++++++++-------------
 1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/doc/proto.md b/doc/proto.md
index 758a661..725af4d 100644
--- a/doc/proto.md
+++ b/doc/proto.md
@@ -288,10 +288,10 @@ immediately after the handshake flags field in oldstyle negotiation:
   schedule I/O accesses as for a rotational medium
 - bit 5, `NBD_FLAG_SEND_TRIM`; should be set to 1 if the server supports
   `NBD_CMD_TRIM` commands
-- bit 6, `NBD_FLAG_SEND_WRITE_ZEROES`; should be set to 1 if the server
-  supports `NBD_CMD_WRITE_ZEROES` commands
-- bit 7, `NBD_FLAG_SEND_DF`; defined by the `STRUCTURED_REPLY` extension;
-  see below.
+- bit 6, `NBD_FLAG_SEND_WRITE_ZEROES`; defined by the experimental
+  `WRITE_ZEROES` extension; see below.
+- bit 7, `NBD_FLAG_SEND_DF`; defined by the experimental `STRUCTURED_REPLY`
+  extension; see below.

 Clients SHOULD ignore unknown flags.

@@ -483,7 +483,7 @@ valid may depend on negotiation during the handshake phase.
   `NBD_CMD_WRITE_ZEROES` commands.  SHOULD be set to 1 if the client requires
   "Force Unit Access" mode of operation.  MUST NOT be set unless transmission
   flags included `NBD_FLAG_SEND_FUA`.
-- bit 1, `NBD_CMD_MAY_TRIM`; defined by the experimental `WRITE_ZEROES`
+- bit 1, `NBD_CMD_NO_HOLE`; defined by the experimental `WRITE_ZEROES`
   extension; see below.
 - bit 2, `NBD_CMD_FLAG_DF`; defined by the experimental `STRUCTURED_REPLY`
   extension; see below
@@ -736,7 +736,7 @@ losing the sparseness.
 To remedy this, a `WRITE_ZEROES` extension is envisioned. This extension adds
 one new command and one new command flag.

-* `NBD_CMD_WRITE_ZEROES` (6)
+* `NBD_CMD_WRITE_ZEROES`

     A write request with no payload. Length and offset define the location
     and amount of data to be zeroed.
@@ -753,9 +753,13 @@ one new command and one new command flag.
     If this flag was set, the server MUST NOT send the reply until it has
     ensured that the newly-zeroed data has reached permanent storage.

-    If the flag `NBD_CMD_FLAG_MAY_TRIM` was set by the client in the command
-    flags field, the server MAY use trimming to zero out the area, but it
-    MUST ensure that the data reads back as zero.
+    By default, the server MAY use trimming to zero out the area, even
+    if it did not advertise `NBD_FLAG_SEND_TRIM`; but it MUST ensure
+    that the data reads back as zero.  However, the client MAY set the
+    command flag `NBD_CMD_FLAG_NO_HOLE` to inform the server that the
+    area MUST be fully provisioned, ensuring that future writes to the
+    same area will not cause fragmentation or cause failure due to
+    insufficient space.

     If an error occurs, the server SHOULD set the appropriate error code
     in the error field. The server MAY then close the connection.
@@ -766,9 +770,9 @@ return `EPERM` if it receives a write zeroes request on a read-only export.

 The extension adds the following new command flag:

-- bit 1, `NBD_CMD_FLAG_MAY_TRIM`; valid during `NBD_CMD_WRITE_ZEROES`.
-  SHOULD be set to 1 if the client allows the server to use trim to perform
-  the requested operation. The client MAY send `NBD_CMD_FLAG_MAY_TRIM` even
+- `NBD_CMD_FLAG_NO_HOLE`; valid during `NBD_CMD_WRITE_ZEROES`.
+  SHOULD be set to 1 if the client wants to ensure that the server does
+  not create a hole. The client MAY send `NBD_CMD_FLAG_NO_HOLE` even
   if `NBD_FLAG_SEND_TRIM` was not set in the transmission flags field.

 ### `STRUCTURED_REPLY` extension
@@ -893,7 +897,7 @@ error, and alters the reply to the `NBD_CMD_READ` request.
       were sent earlier in the structured reply, the server SHOULD NOT
       send multiple distinct offsets that lie within the bounds of a
       single content chunk.  Valid as a reply to `NBD_CMD_READ`,
-      `NBD_CMD_WRITE`, and `NBD_CMD_TRIM`.
+      `NBD_CMD_WRITE`, `NBD_CMD_WRITE_ZEROES`, and `NBD_CMD_TRIM`.

       The payload is structured as:

-- 
2.5.5

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [Qemu-devel] [PATCH] doc: Flip bit sense for allowing trim during WRITE_ZEROES
  2016-04-01 21:29 [Qemu-devel] [PATCH] doc: Flip bit sense for allowing trim during WRITE_ZEROES Eric Blake
@ 2016-04-01 22:20 ` Alex Bligh
  2016-04-02  7:38 ` Denis V. Lunev
  2016-04-04 13:56 ` [Qemu-devel] [Nbd] " Eric Blake
  2 siblings, 0 replies; 5+ messages in thread
From: Alex Bligh @ 2016-04-01 22:20 UTC (permalink / raw)
  To: Eric Blake
  Cc: nbd-general, Denis V. Lunev, qemu-devel, Alex Bligh, pborzenkov


On 1 Apr 2016, at 22:29, Eric Blake <eblake@redhat.com> wrote:

> Rather than requiring allocation by default and allowing trims
> only on request during WRITE_ZEROES, it seems like a better
> default is to allow server optimizations by default and require
> full allocation by specific request.  Since WRITE_ZEROES is
> experimental and has not yet been implemented, we can flip the
> sense of the command flag so that the default flags of all 0s
> is the most efficient, and flags (whether _FUA or _NO_HOLE) are
> added to make the operation have tighter constraints but
> possibly slower execution.
> 
> The name NO_HOLE is also slightly nicer at expressing the fact
> that there is no dependency on whether NBD_CMD_TRIM is supported.
> 
> Also tweak a couple of formatting issues for consistency (for
> example, only reserve a bit number in one place).
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>

Reviewed-by: Alex Bligh <alex@alex.org.uk>

+1

-- 
Alex Bligh


> ---
> doc/proto.md | 30 +++++++++++++++++-------------
> 1 file changed, 17 insertions(+), 13 deletions(-)
> 
> diff --git a/doc/proto.md b/doc/proto.md
> index 758a661..725af4d 100644
> --- a/doc/proto.md
> +++ b/doc/proto.md
> @@ -288,10 +288,10 @@ immediately after the handshake flags field in oldstyle negotiation:
>   schedule I/O accesses as for a rotational medium
> - bit 5, `NBD_FLAG_SEND_TRIM`; should be set to 1 if the server supports
>   `NBD_CMD_TRIM` commands
> -- bit 6, `NBD_FLAG_SEND_WRITE_ZEROES`; should be set to 1 if the server
> -  supports `NBD_CMD_WRITE_ZEROES` commands
> -- bit 7, `NBD_FLAG_SEND_DF`; defined by the `STRUCTURED_REPLY` extension;
> -  see below.
> +- bit 6, `NBD_FLAG_SEND_WRITE_ZEROES`; defined by the experimental
> +  `WRITE_ZEROES` extension; see below.
> +- bit 7, `NBD_FLAG_SEND_DF`; defined by the experimental `STRUCTURED_REPLY`
> +  extension; see below.
> 
> Clients SHOULD ignore unknown flags.
> 
> @@ -483,7 +483,7 @@ valid may depend on negotiation during the handshake phase.
>   `NBD_CMD_WRITE_ZEROES` commands.  SHOULD be set to 1 if the client requires
>   "Force Unit Access" mode of operation.  MUST NOT be set unless transmission
>   flags included `NBD_FLAG_SEND_FUA`.
> -- bit 1, `NBD_CMD_MAY_TRIM`; defined by the experimental `WRITE_ZEROES`
> +- bit 1, `NBD_CMD_NO_HOLE`; defined by the experimental `WRITE_ZEROES`
>   extension; see below.
> - bit 2, `NBD_CMD_FLAG_DF`; defined by the experimental `STRUCTURED_REPLY`
>   extension; see below
> @@ -736,7 +736,7 @@ losing the sparseness.
> To remedy this, a `WRITE_ZEROES` extension is envisioned. This extension adds
> one new command and one new command flag.
> 
> -* `NBD_CMD_WRITE_ZEROES` (6)
> +* `NBD_CMD_WRITE_ZEROES`
> 
>     A write request with no payload. Length and offset define the location
>     and amount of data to be zeroed.
> @@ -753,9 +753,13 @@ one new command and one new command flag.
>     If this flag was set, the server MUST NOT send the reply until it has
>     ensured that the newly-zeroed data has reached permanent storage.
> 
> -    If the flag `NBD_CMD_FLAG_MAY_TRIM` was set by the client in the command
> -    flags field, the server MAY use trimming to zero out the area, but it
> -    MUST ensure that the data reads back as zero.
> +    By default, the server MAY use trimming to zero out the area, even
> +    if it did not advertise `NBD_FLAG_SEND_TRIM`; but it MUST ensure
> +    that the data reads back as zero.  However, the client MAY set the
> +    command flag `NBD_CMD_FLAG_NO_HOLE` to inform the server that the
> +    area MUST be fully provisioned, ensuring that future writes to the
> +    same area will not cause fragmentation or cause failure due to
> +    insufficient space.
> 
>     If an error occurs, the server SHOULD set the appropriate error code
>     in the error field. The server MAY then close the connection.
> @@ -766,9 +770,9 @@ return `EPERM` if it receives a write zeroes request on a read-only export.
> 
> The extension adds the following new command flag:
> 
> -- bit 1, `NBD_CMD_FLAG_MAY_TRIM`; valid during `NBD_CMD_WRITE_ZEROES`.
> -  SHOULD be set to 1 if the client allows the server to use trim to perform
> -  the requested operation. The client MAY send `NBD_CMD_FLAG_MAY_TRIM` even
> +- `NBD_CMD_FLAG_NO_HOLE`; valid during `NBD_CMD_WRITE_ZEROES`.
> +  SHOULD be set to 1 if the client wants to ensure that the server does
> +  not create a hole. The client MAY send `NBD_CMD_FLAG_NO_HOLE` even
>   if `NBD_FLAG_SEND_TRIM` was not set in the transmission flags field.
> 
> ### `STRUCTURED_REPLY` extension
> @@ -893,7 +897,7 @@ error, and alters the reply to the `NBD_CMD_READ` request.
>       were sent earlier in the structured reply, the server SHOULD NOT
>       send multiple distinct offsets that lie within the bounds of a
>       single content chunk.  Valid as a reply to `NBD_CMD_READ`,
> -      `NBD_CMD_WRITE`, and `NBD_CMD_TRIM`.
> +      `NBD_CMD_WRITE`, `NBD_CMD_WRITE_ZEROES`, and `NBD_CMD_TRIM`.
> 
>       The payload is structured as:
> 
> -- 
> 2.5.5
> 

-- 
Alex Bligh

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Qemu-devel] [PATCH] doc: Flip bit sense for allowing trim during WRITE_ZEROES
  2016-04-01 21:29 [Qemu-devel] [PATCH] doc: Flip bit sense for allowing trim during WRITE_ZEROES Eric Blake
  2016-04-01 22:20 ` Alex Bligh
@ 2016-04-02  7:38 ` Denis V. Lunev
  2016-04-04 13:56 ` [Qemu-devel] [Nbd] " Eric Blake
  2 siblings, 0 replies; 5+ messages in thread
From: Denis V. Lunev @ 2016-04-02  7:38 UTC (permalink / raw)
  To: Eric Blake, nbd-general; +Cc: qemu-devel, alex, pborzenkov

On 04/02/2016 12:29 AM, Eric Blake wrote:
> Rather than requiring allocation by default and allowing trims
> only on request during WRITE_ZEROES, it seems like a better
> default is to allow server optimizations by default and require
> full allocation by specific request.  Since WRITE_ZEROES is
> experimental and has not yet been implemented, we can flip the
> sense of the command flag so that the default flags of all 0s
> is the most efficient, and flags (whether _FUA or _NO_HOLE) are
> added to make the operation have tighter constraints but
> possibly slower execution.
>
> The name NO_HOLE is also slightly nicer at expressing the fact
> that there is no dependency on whether NBD_CMD_TRIM is supported.
>
> Also tweak a couple of formatting issues for consistency (for
> example, only reserve a bit number in one place).
>
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
>   doc/proto.md | 30 +++++++++++++++++-------------
>   1 file changed, 17 insertions(+), 13 deletions(-)
>
> diff --git a/doc/proto.md b/doc/proto.md
> index 758a661..725af4d 100644
> --- a/doc/proto.md
> +++ b/doc/proto.md
> @@ -288,10 +288,10 @@ immediately after the handshake flags field in oldstyle negotiation:
>     schedule I/O accesses as for a rotational medium
>   - bit 5, `NBD_FLAG_SEND_TRIM`; should be set to 1 if the server supports
>     `NBD_CMD_TRIM` commands
> -- bit 6, `NBD_FLAG_SEND_WRITE_ZEROES`; should be set to 1 if the server
> -  supports `NBD_CMD_WRITE_ZEROES` commands
> -- bit 7, `NBD_FLAG_SEND_DF`; defined by the `STRUCTURED_REPLY` extension;
> -  see below.
> +- bit 6, `NBD_FLAG_SEND_WRITE_ZEROES`; defined by the experimental
> +  `WRITE_ZEROES` extension; see below.
> +- bit 7, `NBD_FLAG_SEND_DF`; defined by the experimental `STRUCTURED_REPLY`
> +  extension; see below.
>
>   Clients SHOULD ignore unknown flags.
>
> @@ -483,7 +483,7 @@ valid may depend on negotiation during the handshake phase.
>     `NBD_CMD_WRITE_ZEROES` commands.  SHOULD be set to 1 if the client requires
>     "Force Unit Access" mode of operation.  MUST NOT be set unless transmission
>     flags included `NBD_FLAG_SEND_FUA`.
> -- bit 1, `NBD_CMD_MAY_TRIM`; defined by the experimental `WRITE_ZEROES`
> +- bit 1, `NBD_CMD_NO_HOLE`; defined by the experimental `WRITE_ZEROES`
>     extension; see below.
>   - bit 2, `NBD_CMD_FLAG_DF`; defined by the experimental `STRUCTURED_REPLY`
>     extension; see below
> @@ -736,7 +736,7 @@ losing the sparseness.
>   To remedy this, a `WRITE_ZEROES` extension is envisioned. This extension adds
>   one new command and one new command flag.
>
> -* `NBD_CMD_WRITE_ZEROES` (6)
> +* `NBD_CMD_WRITE_ZEROES`
>
>       A write request with no payload. Length and offset define the location
>       and amount of data to be zeroed.
> @@ -753,9 +753,13 @@ one new command and one new command flag.
>       If this flag was set, the server MUST NOT send the reply until it has
>       ensured that the newly-zeroed data has reached permanent storage.
>
> -    If the flag `NBD_CMD_FLAG_MAY_TRIM` was set by the client in the command
> -    flags field, the server MAY use trimming to zero out the area, but it
> -    MUST ensure that the data reads back as zero.
> +    By default, the server MAY use trimming to zero out the area, even
> +    if it did not advertise `NBD_FLAG_SEND_TRIM`; but it MUST ensure
> +    that the data reads back as zero.  However, the client MAY set the
> +    command flag `NBD_CMD_FLAG_NO_HOLE` to inform the server that the
> +    area MUST be fully provisioned, ensuring that future writes to the
> +    same area will not cause fragmentation or cause failure due to
> +    insufficient space.
>
>       If an error occurs, the server SHOULD set the appropriate error code
>       in the error field. The server MAY then close the connection.
> @@ -766,9 +770,9 @@ return `EPERM` if it receives a write zeroes request on a read-only export.
>
>   The extension adds the following new command flag:
>
> -- bit 1, `NBD_CMD_FLAG_MAY_TRIM`; valid during `NBD_CMD_WRITE_ZEROES`.
> -  SHOULD be set to 1 if the client allows the server to use trim to perform
> -  the requested operation. The client MAY send `NBD_CMD_FLAG_MAY_TRIM` even
> +- `NBD_CMD_FLAG_NO_HOLE`; valid during `NBD_CMD_WRITE_ZEROES`.
> +  SHOULD be set to 1 if the client wants to ensure that the server does
> +  not create a hole. The client MAY send `NBD_CMD_FLAG_NO_HOLE` even
>     if `NBD_FLAG_SEND_TRIM` was not set in the transmission flags field.
>
>   ### `STRUCTURED_REPLY` extension
> @@ -893,7 +897,7 @@ error, and alters the reply to the `NBD_CMD_READ` request.
>         were sent earlier in the structured reply, the server SHOULD NOT
>         send multiple distinct offsets that lie within the bounds of a
>         single content chunk.  Valid as a reply to `NBD_CMD_READ`,
> -      `NBD_CMD_WRITE`, and `NBD_CMD_TRIM`.
> +      `NBD_CMD_WRITE`, `NBD_CMD_WRITE_ZEROES`, and `NBD_CMD_TRIM`.
>
>         The payload is structured as:
>
- do not see the problem in this change. It looks correct to me

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Qemu-devel] [Nbd] [PATCH] doc: Flip bit sense for allowing trim during WRITE_ZEROES
  2016-04-01 21:29 [Qemu-devel] [PATCH] doc: Flip bit sense for allowing trim during WRITE_ZEROES Eric Blake
  2016-04-01 22:20 ` Alex Bligh
  2016-04-02  7:38 ` Denis V. Lunev
@ 2016-04-04 13:56 ` Eric Blake
  2016-04-04 15:15   ` Alex Bligh
  2 siblings, 1 reply; 5+ messages in thread
From: Eric Blake @ 2016-04-04 13:56 UTC (permalink / raw)
  To: nbd-general; +Cc: den, qemu-devel, pborzenkov

[-- Attachment #1: Type: text/plain, Size: 2766 bytes --]

On 04/01/2016 03:29 PM, Eric Blake wrote:
> Rather than requiring allocation by default and allowing trims
> only on request during WRITE_ZEROES, it seems like a better
> default is to allow server optimizations by default and require
> full allocation by specific request.  Since WRITE_ZEROES is
> experimental and has not yet been implemented, we can flip the
> sense of the command flag so that the default flags of all 0s
> is the most efficient, and flags (whether _FUA or _NO_HOLE) are
> added to make the operation have tighter constraints but
> possibly slower execution.
> 
> The name NO_HOLE is also slightly nicer at expressing the fact
> that there is no dependency on whether NBD_CMD_TRIM is supported.
> 

> -    If the flag `NBD_CMD_FLAG_MAY_TRIM` was set by the client in the command
> -    flags field, the server MAY use trimming to zero out the area, but it
> -    MUST ensure that the data reads back as zero.
> +    By default, the server MAY use trimming to zero out the area, even
> +    if it did not advertise `NBD_FLAG_SEND_TRIM`; but it MUST ensure
> +    that the data reads back as zero.  However, the client MAY set the
> +    command flag `NBD_CMD_FLAG_NO_HOLE` to inform the server that the
> +    area MUST be fully provisioned, ensuring that future writes to the
> +    same area will not cause fragmentation or cause failure due to
> +    insufficient space.

Question - qemu already has the notion of an nbd server that can be
configured to explicitly check for large repetitions of zero during
NBD_CMD_WRITE, and create holes in the corresponding file as a result
(basically because qemu is compensating for NBD_CMD_WRITE_ZEROES not
being part of the protocol yet, so that the destination file can still
be sparse even though lots of zeroes were sent over the wire). Should we
also allow servers to accept NBD_CMD_FLAG_NO_HOLE during NBD_CMD_WRITE
as an explicit request to NOT perform the magic of searching for large
runs of zeroes, but to just write everything as written?

For older qemu clients that are not aware of the new WRITE_ZEROES
command, the client will also not use the NO_HOLE flag for normal
writes, and everything is at the mercy of the server recognizing zeroes.
 But for newer qemu clients, it can be usefully assumed that the client
will always prefer WRITE_ZEROES (without flags) when it is okay to punch
a hole, and therefore that the new client will also send the NO_HOLE
flag to normal WRITE to inform the server that it doesn't need to waste
time searching for large runs of zeroes.

If this sounds reasonable, I'll go ahead and propose a patch.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Qemu-devel] [Nbd] [PATCH] doc: Flip bit sense for allowing trim during WRITE_ZEROES
  2016-04-04 13:56 ` [Qemu-devel] [Nbd] " Eric Blake
@ 2016-04-04 15:15   ` Alex Bligh
  0 siblings, 0 replies; 5+ messages in thread
From: Alex Bligh @ 2016-04-04 15:15 UTC (permalink / raw)
  To: Eric Blake
  Cc: nbd-general, Denis V. Lunev, qemu-devel, Alex Bligh, pborzenkov

[-- Attachment #1: Type: text/plain, Size: 947 bytes --]


On 4 Apr 2016, at 14:56, Eric Blake <eblake@redhat.com> wrote:

> Question - qemu already has the notion of an nbd server that can be
> configured to explicitly check for large repetitions of zero during
> NBD_CMD_WRITE, and create holes in the corresponding file as a result
> (basically because qemu is compensating for NBD_CMD_WRITE_ZEROES not
> being part of the protocol yet, so that the destination file can still
> be sparse even though lots of zeroes were sent over the wire). Should we
> also allow servers to accept NBD_CMD_FLAG_NO_HOLE during NBD_CMD_WRITE
> as an explicit request to NOT perform the magic of searching for large
> runs of zeroes, but to just write everything as written?

Seems a good idea, but I'd make it a SHOULD rather than a MUST. IE
it's saying "please don't make a hole" but really this is up to the
nbd server. This is true in any case, as the underlying filing system
could create a hole.

--
Alex Bligh





[-- Attachment #2: Message signed with OpenPGP using GPGMail --]
[-- Type: application/pgp-signature, Size: 842 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2016-04-04 15:15 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-04-01 21:29 [Qemu-devel] [PATCH] doc: Flip bit sense for allowing trim during WRITE_ZEROES Eric Blake
2016-04-01 22:20 ` Alex Bligh
2016-04-02  7:38 ` Denis V. Lunev
2016-04-04 13:56 ` [Qemu-devel] [Nbd] " Eric Blake
2016-04-04 15:15   ` Alex Bligh

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.