All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v2 0/5] qcow2_co_write_zeroes and related improvements
@ 2016-05-17  9:15 Denis V. Lunev
  2016-05-17  9:15 ` [Qemu-devel] [PATCH 1/5] block: split write_zeroes always Denis V. Lunev
                   ` (5 more replies)
  0 siblings, 6 replies; 12+ messages in thread
From: Denis V. Lunev @ 2016-05-17  9:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: den, Eric Blake, Kevin Wolf

This series enables tracepoints in qemu-io and improves
bdrv/qcow2_co_write_zeroes framework.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Eric Blake <eblake@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>

Changes from v1:
- patch 1 is separated to additional patchset
- grammar fixes, thank you Eric

Denis V. Lunev (5):
  block: split write_zeroes always
  qcow2: simplify logic in qcow2_co_write_zeroes
  qcow2: add tracepoints for qcow2_co_write_zeroes
  qcow2: fix condition in is_zero_cluster
  qcow2: merge is_zero_cluster helpers into qcow2_co_write_zeroes

 block/io.c    |  6 +++---
 block/qcow2.c | 63 ++++++++++++++++++++++-------------------------------------
 trace-events  |  2 ++
 3 files changed, 28 insertions(+), 43 deletions(-)

-- 
2.1.4

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Qemu-devel] [PATCH 1/5] block: split write_zeroes always
  2016-05-17  9:15 [Qemu-devel] [PATCH v2 0/5] qcow2_co_write_zeroes and related improvements Denis V. Lunev
@ 2016-05-17  9:15 ` Denis V. Lunev
  2016-05-17 16:34   ` Kevin Wolf
  2016-05-17  9:15 ` [Qemu-devel] [PATCH 2/5] qcow2: simplify logic in qcow2_co_write_zeroes Denis V. Lunev
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 12+ messages in thread
From: Denis V. Lunev @ 2016-05-17  9:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: den, Eric Blake, Kevin Wolf

We should split requests even if they are less than write_zeroes_alignment.
For example we can have the following request:
  offset 62k
  size   4k
  write_zeroes_alignment 64k
The original code sent 1 request covering 2 qcow2 clusters, and resulted
in both clusters being allocated. But by splitting the request, we can
cater to the case where one of the two clusters can be zeroed as a
whole, for only 1 cluster allocated after the operation.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Eric Blake <eblake@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
---
 block/io.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/block/io.c b/block/io.c
index cd6d71a..6a24ea8 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1172,13 +1172,13 @@ static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
         /* Align request.  Block drivers can expect the "bulk" of the request
          * to be aligned.
          */
-        if (bs->bl.write_zeroes_alignment
-            && num > bs->bl.write_zeroes_alignment) {
+        if (bs->bl.write_zeroes_alignment) {
             if (sector_num % bs->bl.write_zeroes_alignment != 0) {
                 /* Make a small request up to the first aligned sector.  */
                 num = bs->bl.write_zeroes_alignment;
                 num -= sector_num % bs->bl.write_zeroes_alignment;
-            } else if ((sector_num + num) % bs->bl.write_zeroes_alignment != 0) {
+            } else if (num > bs->bl.write_zeroes_alignment &&
+                (sector_num + num) % bs->bl.write_zeroes_alignment != 0) {
                 /* Shorten the request to the last aligned sector.  num cannot
                  * underflow because num > bs->bl.write_zeroes_alignment.
                  */
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [Qemu-devel] [PATCH 2/5] qcow2: simplify logic in qcow2_co_write_zeroes
  2016-05-17  9:15 [Qemu-devel] [PATCH v2 0/5] qcow2_co_write_zeroes and related improvements Denis V. Lunev
  2016-05-17  9:15 ` [Qemu-devel] [PATCH 1/5] block: split write_zeroes always Denis V. Lunev
@ 2016-05-17  9:15 ` Denis V. Lunev
  2016-05-17  9:15 ` [Qemu-devel] [PATCH 3/5] qcow2: add tracepoints for qcow2_co_write_zeroes Denis V. Lunev
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: Denis V. Lunev @ 2016-05-17  9:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: den, Eric Blake, Kevin Wolf

Unaligned requests will occupy only one cluster. This is true since the
previous commit. Simplify the code taking this consideration into
account.

In other words, the caller is now buggy if it ever passes us an unaligned
request that crosses cluster boundaries (the only requests that can cross
boundaries will be aligned).

There are no other changes so far.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
CC: Eric Blake <eblake@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
---
 block/qcow2.c | 23 +++++------------------
 1 file changed, 5 insertions(+), 18 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 62febfc..9a54bbd 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2436,33 +2436,20 @@ static coroutine_fn int qcow2_co_write_zeroes(BlockDriverState *bs,
     int tail = (sector_num + nb_sectors) % s->cluster_sectors;
 
     if (head != 0 || tail != 0) {
-        int64_t cl_end = -1;
+        int64_t cl_start = sector_num - head;
 
-        sector_num -= head;
-        nb_sectors += head;
+        assert(cl_start + s->cluster_sectors >= sector_num + nb_sectors);
 
-        if (tail != 0) {
-            nb_sectors += s->cluster_sectors - tail;
-        }
+        sector_num = cl_start;
+        nb_sectors = s->cluster_sectors;
 
         if (!is_zero_cluster(bs, sector_num)) {
             return -ENOTSUP;
         }
 
-        if (nb_sectors > s->cluster_sectors) {
-            /* Technically the request can cover 2 clusters, f.e. 4k write
-               at s->cluster_sectors - 2k offset. One of these cluster can
-               be zeroed, one unallocated */
-            cl_end = sector_num + nb_sectors - s->cluster_sectors;
-            if (!is_zero_cluster(bs, cl_end)) {
-                return -ENOTSUP;
-            }
-        }
-
         qemu_co_mutex_lock(&s->lock);
         /* We can have new write after previous check */
-        if (!is_zero_cluster_top_locked(bs, sector_num) ||
-                (cl_end > 0 && !is_zero_cluster_top_locked(bs, cl_end))) {
+        if (!is_zero_cluster_top_locked(bs, sector_num)) {
             qemu_co_mutex_unlock(&s->lock);
             return -ENOTSUP;
         }
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [Qemu-devel] [PATCH 3/5] qcow2: add tracepoints for qcow2_co_write_zeroes
  2016-05-17  9:15 [Qemu-devel] [PATCH v2 0/5] qcow2_co_write_zeroes and related improvements Denis V. Lunev
  2016-05-17  9:15 ` [Qemu-devel] [PATCH 1/5] block: split write_zeroes always Denis V. Lunev
  2016-05-17  9:15 ` [Qemu-devel] [PATCH 2/5] qcow2: simplify logic in qcow2_co_write_zeroes Denis V. Lunev
@ 2016-05-17  9:15 ` Denis V. Lunev
  2016-05-17 15:37   ` Eric Blake
  2016-05-17  9:15 ` [Qemu-devel] [PATCH 4/5] qcow2: fix condition in is_zero_cluster Denis V. Lunev
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 12+ messages in thread
From: Denis V. Lunev @ 2016-05-17  9:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: den, Eric Blake, Kevin Wolf

This patch follows guidelines of all other tracepoints in qcow2, like ones
in qcow2_co_writev. I think that they should dump values in the same
quantities or be changed all to gather.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Eric Blake <eblake@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
---
 block/qcow2.c | 5 +++++
 trace-events  | 2 ++
 2 files changed, 7 insertions(+)

diff --git a/block/qcow2.c b/block/qcow2.c
index 9a54bbd..97bf870 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2435,6 +2435,9 @@ static coroutine_fn int qcow2_co_write_zeroes(BlockDriverState *bs,
     int head = sector_num % s->cluster_sectors;
     int tail = (sector_num + nb_sectors) % s->cluster_sectors;
 
+    trace_qcow2_write_zeroes_start_req(qemu_coroutine_self(), sector_num,
+                                       nb_sectors);
+
     if (head != 0 || tail != 0) {
         int64_t cl_start = sector_num - head;
 
@@ -2457,6 +2460,8 @@ static coroutine_fn int qcow2_co_write_zeroes(BlockDriverState *bs,
         qemu_co_mutex_lock(&s->lock);
     }
 
+    trace_qcow2_write_zeroes(qemu_coroutine_self(), sector_num, nb_sectors);
+
     /* Whatever is left can use real zero clusters */
     ret = qcow2_zero_clusters(bs, sector_num << BDRV_SECTOR_BITS, nb_sectors);
     qemu_co_mutex_unlock(&s->lock);
diff --git a/trace-events b/trace-events
index 4fce005..627f34f 100644
--- a/trace-events
+++ b/trace-events
@@ -612,6 +612,8 @@ qcow2_writev_done_req(void *co, int ret) "co %p ret %d"
 qcow2_writev_start_part(void *co) "co %p"
 qcow2_writev_done_part(void *co, int cur_nr_sectors) "co %p cur_nr_sectors %d"
 qcow2_writev_data(void *co, uint64_t offset) "co %p offset %" PRIx64
+qcow2_write_zeroes_start_req(void *co, int64_t sector, int nb_sectors) "co %p sector %" PRIx64 " nb_sectors %d"
+qcow2_write_zeroes(void *co, int64_t sector, int nb_sectors) "co %p sector %" PRIx64 " nb_sectors %d"
 
 # block/qcow2-cluster.c
 qcow2_alloc_clusters_offset(void *co, uint64_t offset, int num) "co %p offset %" PRIx64 " num %d"
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [Qemu-devel] [PATCH 4/5] qcow2: fix condition in is_zero_cluster
  2016-05-17  9:15 [Qemu-devel] [PATCH v2 0/5] qcow2_co_write_zeroes and related improvements Denis V. Lunev
                   ` (2 preceding siblings ...)
  2016-05-17  9:15 ` [Qemu-devel] [PATCH 3/5] qcow2: add tracepoints for qcow2_co_write_zeroes Denis V. Lunev
@ 2016-05-17  9:15 ` Denis V. Lunev
  2016-05-17 15:11   ` Kevin Wolf
  2016-05-17  9:15 ` [Qemu-devel] [PATCH 5/5] qcow2: merge is_zero_cluster helpers into qcow2_co_write_zeroes Denis V. Lunev
  2016-05-17 15:11 ` [Qemu-devel] [PATCH v2 0/5] qcow2_co_write_zeroes and related improvements Kevin Wolf
  5 siblings, 1 reply; 12+ messages in thread
From: Denis V. Lunev @ 2016-05-17  9:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: den, Kevin Wolf

We should check for (res & BDRV_BLOCK_ZERO) only. The situation when we
will have !(res & BDRV_BLOCK_DATA) and will not have BDRV_BLOCK_ZERO is
not possible.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Kevin Wolf <kwolf@redhat.com>
---
 block/qcow2.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 97bf870..05beb64 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2412,7 +2412,7 @@ static bool is_zero_cluster(BlockDriverState *bs, int64_t start)
     BlockDriverState *file;
     int64_t res = bdrv_get_block_status_above(bs, NULL, start,
                                               s->cluster_sectors, &nr, &file);
-    return res >= 0 && ((res & BDRV_BLOCK_ZERO) || !(res & BDRV_BLOCK_DATA));
+    return res >= 0 && (res & BDRV_BLOCK_ZERO);
 }
 
 static bool is_zero_cluster_top_locked(BlockDriverState *bs, int64_t start)
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [Qemu-devel] [PATCH 5/5] qcow2: merge is_zero_cluster helpers into qcow2_co_write_zeroes
  2016-05-17  9:15 [Qemu-devel] [PATCH v2 0/5] qcow2_co_write_zeroes and related improvements Denis V. Lunev
                   ` (3 preceding siblings ...)
  2016-05-17  9:15 ` [Qemu-devel] [PATCH 4/5] qcow2: fix condition in is_zero_cluster Denis V. Lunev
@ 2016-05-17  9:15 ` Denis V. Lunev
  2016-05-24 18:14   ` Eric Blake
  2016-05-17 15:11 ` [Qemu-devel] [PATCH v2 0/5] qcow2_co_write_zeroes and related improvements Kevin Wolf
  5 siblings, 1 reply; 12+ messages in thread
From: Denis V. Lunev @ 2016-05-17  9:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: den, Kevin Wolf

They are used once only. This makes code more compact.

The patch also improves comments in the code.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Kevin Wolf <kwolf@redhat.com>
---
 block/qcow2.c | 39 +++++++++++++++------------------------
 1 file changed, 15 insertions(+), 24 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 05beb64..feaf146 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2405,27 +2405,6 @@ finish:
 }
 
 
-static bool is_zero_cluster(BlockDriverState *bs, int64_t start)
-{
-    BDRVQcow2State *s = bs->opaque;
-    int nr;
-    BlockDriverState *file;
-    int64_t res = bdrv_get_block_status_above(bs, NULL, start,
-                                              s->cluster_sectors, &nr, &file);
-    return res >= 0 && (res & BDRV_BLOCK_ZERO);
-}
-
-static bool is_zero_cluster_top_locked(BlockDriverState *bs, int64_t start)
-{
-    BDRVQcow2State *s = bs->opaque;
-    int nr = s->cluster_sectors;
-    uint64_t off;
-    int ret;
-
-    ret = qcow2_get_cluster_offset(bs, start << BDRV_SECTOR_BITS, &nr, &off);
-    return ret == QCOW2_CLUSTER_UNALLOCATED || ret == QCOW2_CLUSTER_ZERO;
-}
-
 static coroutine_fn int qcow2_co_write_zeroes(BlockDriverState *bs,
     int64_t sector_num, int nb_sectors, BdrvRequestFlags flags)
 {
@@ -2439,20 +2418,32 @@ static coroutine_fn int qcow2_co_write_zeroes(BlockDriverState *bs,
                                        nb_sectors);
 
     if (head != 0 || tail != 0) {
-        int64_t cl_start = sector_num - head;
+        BlockDriverState *file;
+        uint64_t off;
+        int nr;
+
+        int64_t cl_start = sector_num - head, res;
 
         assert(cl_start + s->cluster_sectors >= sector_num + nb_sectors);
 
         sector_num = cl_start;
         nb_sectors = s->cluster_sectors;
 
-        if (!is_zero_cluster(bs, sector_num)) {
+        /* check that the cluster is zeroed taking into account entire
+           backing chain */
+        nr = s->cluster_sectors;
+        res = bdrv_get_block_status_above(bs, NULL, cl_start,
+                                          s->cluster_sectors, &nr, &file);
+        if (res < 0 || !(res & BDRV_BLOCK_ZERO)) {
             return -ENOTSUP;
         }
 
         qemu_co_mutex_lock(&s->lock);
         /* We can have new write after previous check */
-        if (!is_zero_cluster_top_locked(bs, sector_num)) {
+        nr = s->cluster_sectors;
+        ret = qcow2_get_cluster_offset(bs, cl_start << BDRV_SECTOR_BITS,
+                                       &nr, &off);
+        if (ret != QCOW2_CLUSTER_UNALLOCATED && ret != QCOW2_CLUSTER_ZERO) {
             qemu_co_mutex_unlock(&s->lock);
             return -ENOTSUP;
         }
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH 4/5] qcow2: fix condition in is_zero_cluster
  2016-05-17  9:15 ` [Qemu-devel] [PATCH 4/5] qcow2: fix condition in is_zero_cluster Denis V. Lunev
@ 2016-05-17 15:11   ` Kevin Wolf
  0 siblings, 0 replies; 12+ messages in thread
From: Kevin Wolf @ 2016-05-17 15:11 UTC (permalink / raw)
  To: Denis V. Lunev; +Cc: qemu-devel

Am 17.05.2016 um 11:15 hat Denis V. Lunev geschrieben:
> We should check for (res & BDRV_BLOCK_ZERO) only. The situation when we
> will have !(res & BDRV_BLOCK_DATA) and will not have BDRV_BLOCK_ZERO is
> not possible.

The patch is okay, but I'm correcting this paragraph into:

    We should check for (res & BDRV_BLOCK_ZERO) only. The situation when we
    will have !(res & BDRV_BLOCK_DATA) and will not have BDRV_BLOCK_ZERO is
    not possible for images with bdi.unallocated_blocks_are_zero == true.

    For those images where it's false, however, it can happen and we must
    not consider the data zeroed then or we would corrupt the image.

Kevin

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH v2 0/5] qcow2_co_write_zeroes and related improvements
  2016-05-17  9:15 [Qemu-devel] [PATCH v2 0/5] qcow2_co_write_zeroes and related improvements Denis V. Lunev
                   ` (4 preceding siblings ...)
  2016-05-17  9:15 ` [Qemu-devel] [PATCH 5/5] qcow2: merge is_zero_cluster helpers into qcow2_co_write_zeroes Denis V. Lunev
@ 2016-05-17 15:11 ` Kevin Wolf
  5 siblings, 0 replies; 12+ messages in thread
From: Kevin Wolf @ 2016-05-17 15:11 UTC (permalink / raw)
  To: Denis V. Lunev; +Cc: qemu-devel, Eric Blake

Am 17.05.2016 um 11:15 hat Denis V. Lunev geschrieben:
> This series enables tracepoints in qemu-io and improves
> bdrv/qcow2_co_write_zeroes framework.
> 
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Eric Blake <eblake@redhat.com>
> CC: Kevin Wolf <kwolf@redhat.com>

Thanks, applied to the block branch.

Kevin

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH 3/5] qcow2: add tracepoints for qcow2_co_write_zeroes
  2016-05-17  9:15 ` [Qemu-devel] [PATCH 3/5] qcow2: add tracepoints for qcow2_co_write_zeroes Denis V. Lunev
@ 2016-05-17 15:37   ` Eric Blake
  0 siblings, 0 replies; 12+ messages in thread
From: Eric Blake @ 2016-05-17 15:37 UTC (permalink / raw)
  To: Denis V. Lunev, qemu-devel; +Cc: Kevin Wolf

[-- Attachment #1: Type: text/plain, Size: 448 bytes --]

On 05/17/2016 03:15 AM, Denis V. Lunev wrote:
> This patch follows guidelines of all other tracepoints in qcow2, like ones
> in qcow2_co_writev. I think that they should dump values in the same
> quantities or be changed all to gather.

s/to gather/together/

but only if Kevin wants to tweak what is already on his block branch

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH 1/5] block: split write_zeroes always
  2016-05-17  9:15 ` [Qemu-devel] [PATCH 1/5] block: split write_zeroes always Denis V. Lunev
@ 2016-05-17 16:34   ` Kevin Wolf
  2016-05-25 18:36     ` Eric Blake
  0 siblings, 1 reply; 12+ messages in thread
From: Kevin Wolf @ 2016-05-17 16:34 UTC (permalink / raw)
  To: Denis V. Lunev; +Cc: qemu-devel, Eric Blake

Am 17.05.2016 um 11:15 hat Denis V. Lunev geschrieben:
> We should split requests even if they are less than write_zeroes_alignment.
> For example we can have the following request:
>   offset 62k
>   size   4k
>   write_zeroes_alignment 64k
> The original code sent 1 request covering 2 qcow2 clusters, and resulted
> in both clusters being allocated. But by splitting the request, we can
> cater to the case where one of the two clusters can be zeroed as a
> whole, for only 1 cluster allocated after the operation.
> 
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Eric Blake <eblake@redhat.com>
> CC: Kevin Wolf <kwolf@redhat.com>
> ---
>  block/io.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/block/io.c b/block/io.c
> index cd6d71a..6a24ea8 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -1172,13 +1172,13 @@ static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
>          /* Align request.  Block drivers can expect the "bulk" of the request
>           * to be aligned.
>           */
> -        if (bs->bl.write_zeroes_alignment
> -            && num > bs->bl.write_zeroes_alignment) {
> +        if (bs->bl.write_zeroes_alignment) {
>              if (sector_num % bs->bl.write_zeroes_alignment != 0) {
>                  /* Make a small request up to the first aligned sector.  */
>                  num = bs->bl.write_zeroes_alignment;
>                  num -= sector_num % bs->bl.write_zeroes_alignment;

Turns out this doesn't work. If this is a small request that zeros
something in the middle of a single cluster (i.e. we have untouched data
both before and after the request in the same cluster), then num can now
become greater than nb_sectors, so that we end up zeroing too much.

I'll send a test case that catches this and unstage the series for the
time being.

Kevin

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH 5/5] qcow2: merge is_zero_cluster helpers into qcow2_co_write_zeroes
  2016-05-17  9:15 ` [Qemu-devel] [PATCH 5/5] qcow2: merge is_zero_cluster helpers into qcow2_co_write_zeroes Denis V. Lunev
@ 2016-05-24 18:14   ` Eric Blake
  0 siblings, 0 replies; 12+ messages in thread
From: Eric Blake @ 2016-05-24 18:14 UTC (permalink / raw)
  To: Denis V. Lunev, qemu-devel; +Cc: Kevin Wolf

[-- Attachment #1: Type: text/plain, Size: 2469 bytes --]

On 05/17/2016 03:15 AM, Denis V. Lunev wrote:
> They are used once only. This makes code more compact.
> 
> The patch also improves comments in the code.
> 
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Kevin Wolf <kwolf@redhat.com>
> ---
>  block/qcow2.c | 39 +++++++++++++++------------------------
>  1 file changed, 15 insertions(+), 24 deletions(-)
> 

> @@ -2439,20 +2418,32 @@ static coroutine_fn int qcow2_co_write_zeroes(BlockDriverState *bs,
>                                         nb_sectors);
>  
>      if (head != 0 || tail != 0) {
> -        int64_t cl_start = sector_num - head;
> +        BlockDriverState *file;
> +        uint64_t off;
> +        int nr;
> +
> +        int64_t cl_start = sector_num - head, res;
>  
>          assert(cl_start + s->cluster_sectors >= sector_num + nb_sectors);
>  
>          sector_num = cl_start;
>          nb_sectors = s->cluster_sectors;
>  
> -        if (!is_zero_cluster(bs, sector_num)) {
> +        /* check that the cluster is zeroed taking into account entire
> +           backing chain */
> +        nr = s->cluster_sectors;
> +        res = bdrv_get_block_status_above(bs, NULL, cl_start,
> +                                          s->cluster_sectors, &nr, &file);
> +        if (res < 0 || !(res & BDRV_BLOCK_ZERO)) {
>              return -ENOTSUP;
>          }

This is somewhat pessimistic in certain cases.  Suppose I have the
following cluster (borrowing from Kevin's nice notation in 154):

4k: -- -- XX --

and issue a 'write -z 2k 1k'

As written, the code will see that bdrv_get_block_status_above() does
NOT have all ZERO for the cluster, so it will return -ENOTSUP; we will
then call into the fallbacks and write explicit zeroes, so that the top
file now has an allocated cluster full of zero.  But if we were a bit
smarter, we would only check the allocation status of the backing
sectors that do not overlap with our write_zeroes request, because the
remaining sectors are about to be overwritten; in this case, we can
specifically optimize to write a zero cluster without allocation.

You'll also need to rebase this series on top of my pending work to
rewrite bdrv_co_write_zeroes to instead be bdrv_co_pwrite_zeroes with a
byte interface (patches to be posted later today), if mine gets reviewed
first.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH 1/5] block: split write_zeroes always
  2016-05-17 16:34   ` Kevin Wolf
@ 2016-05-25 18:36     ` Eric Blake
  0 siblings, 0 replies; 12+ messages in thread
From: Eric Blake @ 2016-05-25 18:36 UTC (permalink / raw)
  To: Kevin Wolf, Denis V. Lunev; +Cc: qemu-devel

[-- Attachment #1: Type: text/plain, Size: 2317 bytes --]

On 05/17/2016 10:34 AM, Kevin Wolf wrote:
> Am 17.05.2016 um 11:15 hat Denis V. Lunev geschrieben:
>> We should split requests even if they are less than write_zeroes_alignment.
>> For example we can have the following request:
>>   offset 62k
>>   size   4k
>>   write_zeroes_alignment 64k
>> The original code sent 1 request covering 2 qcow2 clusters, and resulted
>> in both clusters being allocated. But by splitting the request, we can
>> cater to the case where one of the two clusters can be zeroed as a
>> whole, for only 1 cluster allocated after the operation.
>>
>> Signed-off-by: Denis V. Lunev <den@openvz.org>
>> CC: Eric Blake <eblake@redhat.com>
>> CC: Kevin Wolf <kwolf@redhat.com>
>> ---
>>  block/io.c | 6 +++---
>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/block/io.c b/block/io.c
>> index cd6d71a..6a24ea8 100644
>> --- a/block/io.c
>> +++ b/block/io.c
>> @@ -1172,13 +1172,13 @@ static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
>>          /* Align request.  Block drivers can expect the "bulk" of the request
>>           * to be aligned.
>>           */
>> -        if (bs->bl.write_zeroes_alignment
>> -            && num > bs->bl.write_zeroes_alignment) {
>> +        if (bs->bl.write_zeroes_alignment) {
>>              if (sector_num % bs->bl.write_zeroes_alignment != 0) {
>>                  /* Make a small request up to the first aligned sector.  */
>>                  num = bs->bl.write_zeroes_alignment;
>>                  num -= sector_num % bs->bl.write_zeroes_alignment;
> 
> Turns out this doesn't work. If this is a small request that zeros
> something in the middle of a single cluster (i.e. we have untouched data
> both before and after the request in the same cluster), then num can now
> become greater than nb_sectors, so that we end up zeroing too much.

I'm planning on folding in a working version of this patch in my
byte-based write_zeroes conversion series.  As part of the patch, I'm
also hoisting the division out of the loop (no guarantees that the
compiler can spot that bs->bl.write_zeroes_alignment will be a power of
two, to optimize it to a shift).



-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2016-05-25 18:36 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-05-17  9:15 [Qemu-devel] [PATCH v2 0/5] qcow2_co_write_zeroes and related improvements Denis V. Lunev
2016-05-17  9:15 ` [Qemu-devel] [PATCH 1/5] block: split write_zeroes always Denis V. Lunev
2016-05-17 16:34   ` Kevin Wolf
2016-05-25 18:36     ` Eric Blake
2016-05-17  9:15 ` [Qemu-devel] [PATCH 2/5] qcow2: simplify logic in qcow2_co_write_zeroes Denis V. Lunev
2016-05-17  9:15 ` [Qemu-devel] [PATCH 3/5] qcow2: add tracepoints for qcow2_co_write_zeroes Denis V. Lunev
2016-05-17 15:37   ` Eric Blake
2016-05-17  9:15 ` [Qemu-devel] [PATCH 4/5] qcow2: fix condition in is_zero_cluster Denis V. Lunev
2016-05-17 15:11   ` Kevin Wolf
2016-05-17  9:15 ` [Qemu-devel] [PATCH 5/5] qcow2: merge is_zero_cluster helpers into qcow2_co_write_zeroes Denis V. Lunev
2016-05-24 18:14   ` Eric Blake
2016-05-17 15:11 ` [Qemu-devel] [PATCH v2 0/5] qcow2_co_write_zeroes and related improvements Kevin Wolf

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.