* [PATCH v4 0/2] Skip copy-on-write when allocating a zero cluster
From: Alberto Garcia @ 2020-09-21 14:30 UTC
  To: qemu-devel
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, Alberto Garcia,
	qemu-block, Max Reitz

I had to rebase the series due to conflicting changes on master. There
are no other differences.

Berto

v4:
- Fix rebase conflicts after cb8503159a

v3: https://lists.gnu.org/archive/html/qemu-block/2020-09/msg00912.html
- Add a new patch to improve the reporting of BDRV_BLOCK_ZERO [Vladimir]
- Rename function to bdrv_co_is_zero_fast() [Vladimir, Kevin]
- Don't call bdrv_common_block_status_above() if bytes == 0

v2: https://lists.gnu.org/archive/html/qemu-block/2020-08/msg01165.html
- Add new, simpler API: bdrv_is_unallocated_or_zero_above()

v1: https://lists.gnu.org/archive/html/qemu-block/2020-08/msg00403.html

Alberto Garcia (2):
  qcow2: Report BDRV_BLOCK_ZERO more accurately in
    bdrv_co_block_status()
  qcow2: Skip copy-on-write when allocating a zero cluster

 include/block/block.h |  2 ++
 block/io.c            | 35 +++++++++++++++++++++++++++++++----
 block/qcow2.c         | 35 +++++++++++++++++++----------------
 3 files changed, 52 insertions(+), 20 deletions(-)

-- 
2.20.1




* [PATCH v4 1/2] qcow2: Report BDRV_BLOCK_ZERO more accurately in bdrv_co_block_status()
From: Alberto Garcia @ 2020-09-21 14:30 UTC
  To: qemu-devel
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, Alberto Garcia,
	qemu-block, Max Reitz

If a BlockDriverState supports backing files but has none, then any
unallocated area reads back as zeroes.

bdrv_co_block_status() only reports this if want_zero is true, but
the test is inexpensive and there is no reason not to do it in all
cases.

Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Alberto Garcia <berto@igalia.com>
---
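Not part of the commit: a minimal, self-contained sketch of the decision
the hunk below implements, for trying the individual cases without a
QEMU tree. The SK_* flags, struct sk_node and sk_fixup_status() are
hypothetical names invented for this illustration; they are not QEMU's
BDRV_BLOCK_* values or block-layer API, and the backing node is reduced
to a boolean plus a length.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values; QEMU's real BDRV_BLOCK_* constants differ. */
enum {
    SK_DATA      = 1 << 0,
    SK_ZERO      = 1 << 1,
    SK_ALLOCATED = 1 << 2,
};

/* Hypothetical stand-in for the parts of a node this decision looks at. */
struct sk_node {
    bool supports_backing; /* the format can have a backing file */
    bool has_backing;      /* a backing file is actually attached */
    int64_t backing_len;   /* backing file length, negative if unknown */
};

/*
 * Adjust the flags returned by the driver for an area starting at
 * @offset, following the post-patch logic: a node that supports backing
 * files but has none reports ZERO regardless of @want_zero; the backing
 * length is only consulted when @want_zero is set.
 */
int sk_fixup_status(const struct sk_node *n, int ret, int64_t offset,
                    bool want_zero)
{
    assert(offset >= 0);

    if (ret & (SK_DATA | SK_ZERO)) {
        ret |= SK_ALLOCATED;
    } else if (n->supports_backing) {
        if (!n->has_backing) {
            /* Nothing underneath: unallocated areas read as zeroes. */
            ret |= SK_ZERO;
        } else if (want_zero) {
            /* Beyond the end of the backing file: also zeroes. */
            if (n->backing_len >= 0 && offset >= n->backing_len) {
                ret |= SK_ZERO;
            }
        }
    }

    return ret;
}
```

With this model, an unallocated area on a node without a backing file
gets SK_ZERO even when want_zero is false, which is the only behavioural
change in this patch; the backing-length check stays behind want_zero.
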
 block/io.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/block/io.c b/block/io.c
index a2389bb38c..ef1ea806e8 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2391,17 +2391,17 @@ static int coroutine_fn bdrv_co_block_status(BlockDriverState *bs,
 
     if (ret & (BDRV_BLOCK_DATA | BDRV_BLOCK_ZERO)) {
         ret |= BDRV_BLOCK_ALLOCATED;
-    } else if (want_zero && bs->drv->supports_backing) {
+    } else if (bs->drv->supports_backing) {
         BlockDriverState *cow_bs = bdrv_cow_bs(bs);
 
-        if (cow_bs) {
+        if (!cow_bs) {
+            ret |= BDRV_BLOCK_ZERO;
+        } else if (want_zero) {
             int64_t size2 = bdrv_getlength(cow_bs);
 
             if (size2 >= 0 && offset >= size2) {
                 ret |= BDRV_BLOCK_ZERO;
             }
-        } else {
-            ret |= BDRV_BLOCK_ZERO;
         }
     }
 
-- 
2.20.1




* [PATCH v4 2/2] qcow2: Skip copy-on-write when allocating a zero cluster
From: Alberto Garcia @ 2020-09-21 14:30 UTC
  To: qemu-devel
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, Alberto Garcia,
	qemu-block, Max Reitz

Since commit c8bb23cbdbe32f5c326365e0a82e1b0e68cdcd8a, when a write
request results in a new allocation, QEMU first checks whether the
rest of the cluster outside the written area contains only zeroes.

In that case, instead of doing a normal copy-on-write operation and
writing explicit zero buffers to disk, the code zeroes the whole
cluster efficiently using pwrite_zeroes() with BDRV_REQ_NO_FALLBACK.

This improves performance very significantly but it only happens when
we are writing to an area that was completely unallocated before. Zero
clusters (QCOW2_CLUSTER_ZERO_*) are treated like normal clusters and
are therefore slower to allocate.

This happens because the code uses bdrv_is_allocated_above() rather
than bdrv_block_status_above(). The former is not as accurate for this
purpose but it is faster. However, in the case of qcow2 the underlying
call already reports zero clusters, so there is no reason not to use
that information.

In a test with 4KB writes on an image that contains only zero clusters,
this patch results in almost five times more IOPS.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
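Not part of the commit: a minimal sketch of the tri-state check that
bdrv_co_is_zero_fast() and is_zero_cow() perform in the diff below,
using a toy extent list in place of the block layer. All sk_* names and
the extent model are assumptions made for illustration; in particular
sk_status() is only a stand-in for a block-status query, not
bdrv_common_block_status_above().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical image model: a list of extents with a "reads as zero" bit. */
struct sk_extent {
    int64_t start;
    int64_t len;
    bool zero;
};

struct sk_image {
    const struct sk_extent *ext;
    size_t n;
};

/*
 * Stand-in for a block-status query: return 1 if the extent containing
 * @offset reads as zeroes, 0 if not, -EINVAL if no extent covers it.
 * *@pnum is set to the bytes from @offset that share that status within
 * the same extent, capped at @bytes.
 */
int sk_status(const struct sk_image *img, int64_t offset, int64_t bytes,
              int64_t *pnum)
{
    for (size_t i = 0; i < img->n; i++) {
        const struct sk_extent *e = &img->ext[i];

        if (offset >= e->start && offset < e->start + e->len) {
            int64_t left = e->start + e->len - offset;

            *pnum = left < bytes ? left : bytes;
            return e->zero ? 1 : 0;
        }
    }
    return -EINVAL;
}

/* 1 if the whole range is known to be zero, 0 otherwise, < 0 on error. */
int sk_is_zero_fast(const struct sk_image *img, int64_t offset,
                    int64_t bytes)
{
    int64_t pnum = bytes;
    int ret;

    assert(bytes >= 0);
    if (bytes == 0) {
        return 1; /* an empty range is trivially zero; skip the query */
    }

    ret = sk_status(img, offset, bytes, &pnum);
    if (ret < 0) {
        return ret;
    }

    /* A single query only: a short answer counts as "not known zero". */
    return pnum == bytes && ret == 1;
}

/* Check both COW regions; stop at the first "no" or error. */
int sk_is_zero_cow(const struct sk_image *img,
                   int64_t start_off, int64_t start_len,
                   int64_t end_off, int64_t end_len)
{
    int ret = sk_is_zero_fast(img, start_off, start_len);

    if (ret <= 0) {
        return ret;
    }
    return sk_is_zero_fast(img, end_off, end_len);
}
```

In this model a range that spans two adjacent zero extents returns 0,
because the single query answers for the first extent only. That is the
kind of false negative the comments in the patch allow for: 0 means
"not known to be zero", and the caller simply falls back to the normal
copy-on-write path, while a negative value is propagated as an error.
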
 include/block/block.h |  2 ++
 block/io.c            | 27 +++++++++++++++++++++++++++
 block/qcow2.c         | 35 +++++++++++++++++++----------------
 3 files changed, 48 insertions(+), 16 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index 981ab5b314..26ada4445b 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -496,6 +496,8 @@ int bdrv_is_allocated(BlockDriverState *bs, int64_t offset, int64_t bytes,
 int bdrv_is_allocated_above(BlockDriverState *top, BlockDriverState *base,
                             bool include_base, int64_t offset, int64_t bytes,
                             int64_t *pnum);
+int coroutine_fn bdrv_co_is_zero_fast(BlockDriverState *bs, int64_t offset,
+                                      int64_t bytes);
 
 bool bdrv_is_read_only(BlockDriverState *bs);
 int bdrv_can_set_read_only(BlockDriverState *bs, bool read_only,
diff --git a/block/io.c b/block/io.c
index ef1ea806e8..8084dec522 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2541,6 +2541,33 @@ int bdrv_block_status(BlockDriverState *bs, int64_t offset, int64_t bytes,
                                    offset, bytes, pnum, map, file);
 }
 
+/*
+ * Check @bs (and its backing chain) to see if the range defined
+ * by @offset and @bytes is known to read as zeroes.
+ * Return 1 if that is the case, 0 otherwise and -errno on error.
+ * This test is meant to be fast rather than accurate so returning 0
+ * does not guarantee non-zero data.
+ */
+int coroutine_fn bdrv_co_is_zero_fast(BlockDriverState *bs, int64_t offset,
+                                      int64_t bytes)
+{
+    int ret;
+    int64_t pnum = bytes;
+
+    if (!bytes) {
+        return 1;
+    }
+
+    ret = bdrv_common_block_status_above(bs, NULL, false, offset,
+                                         bytes, &pnum, NULL, NULL);
+
+    if (ret < 0) {
+        return ret;
+    }
+
+    return (pnum == bytes) && (ret & BDRV_BLOCK_ZERO);
+}
+
 int coroutine_fn bdrv_is_allocated(BlockDriverState *bs, int64_t offset,
                                    int64_t bytes, int64_t *pnum)
 {
diff --git a/block/qcow2.c b/block/qcow2.c
index b05512718c..114526ce62 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2387,26 +2387,26 @@ static bool merge_cow(uint64_t offset, unsigned bytes,
     return false;
 }
 
-static bool is_unallocated(BlockDriverState *bs, int64_t offset, int64_t bytes)
-{
-    int64_t nr;
-    return !bytes ||
-        (!bdrv_is_allocated_above(bs, NULL, false, offset, bytes, &nr) &&
-         nr == bytes);
-}
-
-static bool is_zero_cow(BlockDriverState *bs, QCowL2Meta *m)
+/*
+ * Return 1 if the COW regions read as zeroes, 0 if not, < 0 on error.
+ * Note that returning 0 does not guarantee non-zero data.
+ */
+static int is_zero_cow(BlockDriverState *bs, QCowL2Meta *m)
 {
     /*
      * This check is designed for optimization shortcut so it must be
      * efficient.
-     * Instead of is_zero(), use is_unallocated() as it is faster (but not
-     * as accurate and can result in false negatives).
+     * Instead of is_zero(), use bdrv_co_is_zero_fast() as it is
+     * faster (but not as accurate and can result in false negatives).
      */
-    return is_unallocated(bs, m->offset + m->cow_start.offset,
-                          m->cow_start.nb_bytes) &&
-           is_unallocated(bs, m->offset + m->cow_end.offset,
-                          m->cow_end.nb_bytes);
+    int ret = bdrv_co_is_zero_fast(bs, m->offset + m->cow_start.offset,
+                                   m->cow_start.nb_bytes);
+    if (ret <= 0) {
+        return ret;
+    }
+
+    return bdrv_co_is_zero_fast(bs, m->offset + m->cow_end.offset,
+                                m->cow_end.nb_bytes);
 }
 
 static int handle_alloc_space(BlockDriverState *bs, QCowL2Meta *l2meta)
@@ -2432,7 +2432,10 @@ static int handle_alloc_space(BlockDriverState *bs, QCowL2Meta *l2meta)
             continue;
         }
 
-        if (!is_zero_cow(bs, m)) {
+        ret = is_zero_cow(bs, m);
+        if (ret < 0) {
+            return ret;
+        } else if (ret == 0) {
             continue;
         }
 
-- 
2.20.1




* Re: [PATCH v4 0/2] Skip copy-on-write when allocating a zero cluster
From: Alberto Garcia @ 2020-10-22  9:02 UTC
  To: qemu-devel
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-block, Max Reitz

ping

On Mon 21 Sep 2020 04:30:48 PM CEST, Alberto Garcia wrote:
> I had to rebase the series due to conflicting changes on master. There
> are no other differences.
>
> Berto
>
> v4:
> - Fix rebase conflicts after cb8503159a
>
> v3: https://lists.gnu.org/archive/html/qemu-block/2020-09/msg00912.html
> - Add a new patch to improve the reporting of BDRV_BLOCK_ZERO [Vladimir]
> - Rename function to bdrv_co_is_zero_fast() [Vladimir, Kevin]
> - Don't call bdrv_common_block_status_above() if bytes == 0
>
> v2: https://lists.gnu.org/archive/html/qemu-block/2020-08/msg01165.html
> - Add new, simpler API: bdrv_is_unallocated_or_zero_above()
>
> v1: https://lists.gnu.org/archive/html/qemu-block/2020-08/msg00403.html



* Re: [PATCH v4 1/2] qcow2: Report BDRV_BLOCK_ZERO more accurately in bdrv_co_block_status()
From: Vladimir Sementsov-Ogievskiy @ 2020-10-22  9:22 UTC
  To: Alberto Garcia, qemu-devel; +Cc: qemu-block, Kevin Wolf, Max Reitz

21.09.2020 17:30, Alberto Garcia wrote:
> If a BlockDriverState supports backing files but has none then any
> unallocated area reads back as zeroes.
> 
> bdrv_co_block_status() only reports this if want_zero is true, but
> the test is inexpensive and there is no reason not to do it in all
> cases.
> 
> Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> Signed-off-by: Alberto Garcia <berto@igalia.com>

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

> ---
>   block/io.c | 8 ++++----
>   1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/block/io.c b/block/io.c
> index a2389bb38c..ef1ea806e8 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -2391,17 +2391,17 @@ static int coroutine_fn bdrv_co_block_status(BlockDriverState *bs,
>   
>       if (ret & (BDRV_BLOCK_DATA | BDRV_BLOCK_ZERO)) {
>           ret |= BDRV_BLOCK_ALLOCATED;
> -    } else if (want_zero && bs->drv->supports_backing) {
> +    } else if (bs->drv->supports_backing) {
>           BlockDriverState *cow_bs = bdrv_cow_bs(bs);
>   
> -        if (cow_bs) {
> +        if (!cow_bs) {
> +            ret |= BDRV_BLOCK_ZERO;
> +        } else if (want_zero) {
>               int64_t size2 = bdrv_getlength(cow_bs);
>   
>               if (size2 >= 0 && offset >= size2) {
>                   ret |= BDRV_BLOCK_ZERO;
>               }
> -        } else {
> -            ret |= BDRV_BLOCK_ZERO;
>           }
>       }
>   
> 


-- 
Best regards,
Vladimir


