All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [RFC PATCH v3 0/6] qemu-img: add preallocation=full
@ 2013-12-19  2:27 Hu Tao
  2013-12-19  2:27 ` [Qemu-devel] [RFC PATCH v3 1/6] block: introduce prealloc_mode Hu Tao
                   ` (6 more replies)
  0 siblings, 7 replies; 18+ messages in thread
From: Hu Tao @ 2013-12-19  2:27 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Fam Zheng, Peter Lieven, hutao

This series implements full image preallocation to create a non-sparse image
file at creation time, both for raw and qcow2 format. The purpose is to avoid
performance deterioration of the guest cause by sparse image.


v3:  - Fix comments to v2 by Fam.
     - qcow2: first fallocate disk space, then allocate metadata. This avoids
       the problem in v2 that bdrv_preallocate may clear all information in
       metadata. This does not necessarily map all data clusters sequentially
       but does keep information in metadata. Peter, is this acceptable?


Hu Tao (6):
  block: introduce prealloc_mode
  block: add BlockDriver.bdrv_preallocate.
  block/raw-posix: implement bdrv_preallocate
  raw-posix: Add full image preallocation option
  qcow2: implement bdrv_preallocate
  qcow2: Add full image preallocation option

 block.c                   | 13 +++++++++++
 block/qcow2.c             | 33 ++++++++++++++++++++++------
 block/raw-posix.c         | 56 +++++++++++++++++++++++++++++++++++++++++++++++
 include/block/block.h     |  7 ++++++
 include/block/block_int.h |  3 +++
 5 files changed, 105 insertions(+), 7 deletions(-)

-- 
1.7.11.7

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Qemu-devel] [RFC PATCH v3 1/6] block: introduce prealloc_mode
  2013-12-19  2:27 [Qemu-devel] [RFC PATCH v3 0/6] qemu-img: add preallocation=full Hu Tao
@ 2013-12-19  2:27 ` Hu Tao
  2013-12-20 10:08   ` Stefan Hajnoczi
  2013-12-20 12:36   ` Eric Blake
  2013-12-19  2:27 ` [Qemu-devel] [RFC PATCH v3 2/6] block: add BlockDriver.bdrv_preallocate Hu Tao
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 18+ messages in thread
From: Hu Tao @ 2013-12-19  2:27 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Fam Zheng, Peter Lieven, hutao

This patch prepares for the subsequent patches.

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
---
 block/qcow2.c         | 6 +++---
 include/block/block.h | 6 ++++++
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index f29aa88..32cb39f 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1614,7 +1614,7 @@ static int qcow2_create(const char *filename, QEMUOptionParameter *options,
     uint64_t sectors = 0;
     int flags = 0;
     size_t cluster_size = DEFAULT_CLUSTER_SIZE;
-    int prealloc = 0;
+    int prealloc = PREALLOC_OFF;
     int version = 3;
     Error *local_err = NULL;
     int ret;
@@ -1635,9 +1635,9 @@ static int qcow2_create(const char *filename, QEMUOptionParameter *options,
             }
         } else if (!strcmp(options->name, BLOCK_OPT_PREALLOC)) {
             if (!options->value.s || !strcmp(options->value.s, "off")) {
-                prealloc = 0;
+                prealloc = PREALLOC_OFF;
             } else if (!strcmp(options->value.s, "metadata")) {
-                prealloc = 1;
+                prealloc = PREALLOC_METADATA;
             } else {
                 error_setg(errp, "Invalid preallocation mode: '%s'",
                            options->value.s);
diff --git a/include/block/block.h b/include/block/block.h
index 36efaea..3732f25 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -527,4 +527,10 @@ int bdrv_debug_remove_breakpoint(BlockDriverState *bs, const char *tag);
 int bdrv_debug_resume(BlockDriverState *bs, const char *tag);
 bool bdrv_debug_is_suspended(BlockDriverState *bs, const char *tag);
 
+enum prealloc_mode {
+    PREALLOC_OFF = 0,
+    PREALLOC_METADATA,
+    PREALLOC_FULL,
+};
+
 #endif
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [Qemu-devel] [RFC PATCH v3 2/6] block: add BlockDriver.bdrv_preallocate.
  2013-12-19  2:27 [Qemu-devel] [RFC PATCH v3 0/6] qemu-img: add preallocation=full Hu Tao
  2013-12-19  2:27 ` [Qemu-devel] [RFC PATCH v3 1/6] block: introduce prealloc_mode Hu Tao
@ 2013-12-19  2:27 ` Hu Tao
  2013-12-20 10:10   ` Stefan Hajnoczi
  2013-12-19  2:27 ` [Qemu-devel] [RFC PATCH v3 3/6] block/raw-posix: implement bdrv_preallocate Hu Tao
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 18+ messages in thread
From: Hu Tao @ 2013-12-19  2:27 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Fam Zheng, Peter Lieven, hutao

This field is used to preallocate disk space for block device.

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
---
 block.c                   | 13 +++++++++++++
 include/block/block.h     |  1 +
 include/block/block_int.h |  3 +++
 3 files changed, 17 insertions(+)

diff --git a/block.c b/block.c
index 64e7d22..b901587 100644
--- a/block.c
+++ b/block.c
@@ -3216,6 +3216,19 @@ bool bdrv_can_write_zeroes_with_unmap(BlockDriverState *bs)
     return false;
 }
 
+int bdrv_preallocate(BlockDriverState *bs, int64_t offset, int64_t length)
+{
+    if (bs->backing_hd) {
+        return -ENOTSUP;
+    }
+
+    if (bs->drv->bdrv_preallocate) {
+        return bs->drv->bdrv_preallocate(bs, offset, length);
+    }
+
+    return -ENOTSUP;
+}
+
 typedef struct BdrvCoGetBlockStatusData {
     BlockDriverState *bs;
     BlockDriverState *base;
diff --git a/include/block/block.h b/include/block/block.h
index 3732f25..bc1f277 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -349,6 +349,7 @@ int bdrv_discard(BlockDriverState *bs, int64_t sector_num, int nb_sectors);
 int bdrv_co_discard(BlockDriverState *bs, int64_t sector_num, int nb_sectors);
 int bdrv_has_zero_init_1(BlockDriverState *bs);
 int bdrv_has_zero_init(BlockDriverState *bs);
+int bdrv_preallocate(BlockDriverState *bs, int64_t offset, int64_t length);
 bool bdrv_unallocated_blocks_are_zero(BlockDriverState *bs);
 bool bdrv_can_write_zeroes_with_unmap(BlockDriverState *bs);
 int64_t bdrv_get_block_status(BlockDriverState *bs, int64_t sector_num,
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 8b132d7..5bb1005 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -232,6 +232,9 @@ struct BlockDriver {
      */
     int (*bdrv_has_zero_init)(BlockDriverState *bs);
 
+    int (*bdrv_preallocate)(BlockDriverState *bs, int64_t offset,
+                            int64_t length);
+
     QLIST_ENTRY(BlockDriver) list;
 };
 
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [Qemu-devel] [RFC PATCH v3 3/6] block/raw-posix: implement bdrv_preallocate
  2013-12-19  2:27 [Qemu-devel] [RFC PATCH v3 0/6] qemu-img: add preallocation=full Hu Tao
  2013-12-19  2:27 ` [Qemu-devel] [RFC PATCH v3 1/6] block: introduce prealloc_mode Hu Tao
  2013-12-19  2:27 ` [Qemu-devel] [RFC PATCH v3 2/6] block: add BlockDriver.bdrv_preallocate Hu Tao
@ 2013-12-19  2:27 ` Hu Tao
  2013-12-20 10:14   ` Stefan Hajnoczi
  2013-12-19  2:27 ` [Qemu-devel] [RFC PATCH v3 4/6] raw-posix: Add full image preallocation option Hu Tao
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 18+ messages in thread
From: Hu Tao @ 2013-12-19  2:27 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Fam Zheng, Peter Lieven, hutao

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
---
 block/raw-posix.c | 34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index 10c6b34..19181f2 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -1160,6 +1160,39 @@ static int64_t raw_get_allocated_file_size(BlockDriverState *bs)
     return (int64_t)st.st_blocks * 512;
 }
 
+#ifdef __linux__
+static int raw_preallocate2(int fd, int64_t offset, int64_t length)
+{
+    int ret = -1;
+
+    ret = fallocate(fd, 0, offset, length);
+
+    /* fallback to posix_fallocate() if fallocate() is not supported */
+    if (ret < 0 && (errno == ENOSYS || errno == EOPNOTSUPP)) {
+        ret = posix_fallocate(fd, offset, length);
+    }
+
+    return ret;
+}
+#else
+static int raw_preallocate2(int fd, int64_t offset, int64_t length)
+{
+    return posix_fallocate(fd, offset, length);
+}
+#endif
+
+static int raw_preallocate(BlockDriverState *bs, int64_t offset, int64_t length)
+{
+    BDRVRawState *s = bs->opaque;
+    int64_t len = bdrv_getlength(bs);
+
+    if (offset + length < 0 || offset + length > len) {
+        return -EINVAL;
+    }
+
+    return raw_preallocate2(s->fd, offset, length);
+}
+
 static int raw_create(const char *filename, QEMUOptionParameter *options,
                       Error **errp)
 {
@@ -1356,6 +1389,7 @@ static BlockDriver bdrv_file = {
     .bdrv_close = raw_close,
     .bdrv_create = raw_create,
     .bdrv_has_zero_init = bdrv_has_zero_init_1,
+    .bdrv_preallocate = raw_preallocate,
     .bdrv_co_get_block_status = raw_co_get_block_status,
     .bdrv_co_write_zeroes = raw_co_write_zeroes,
 
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [Qemu-devel] [RFC PATCH v3 4/6] raw-posix: Add full image preallocation option
  2013-12-19  2:27 [Qemu-devel] [RFC PATCH v3 0/6] qemu-img: add preallocation=full Hu Tao
                   ` (2 preceding siblings ...)
  2013-12-19  2:27 ` [Qemu-devel] [RFC PATCH v3 3/6] block/raw-posix: implement bdrv_preallocate Hu Tao
@ 2013-12-19  2:27 ` Hu Tao
  2013-12-19  2:27 ` [Qemu-devel] [RFC PATCH v3 5/6] qcow2: implement bdrv_preallocate Hu Tao
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 18+ messages in thread
From: Hu Tao @ 2013-12-19  2:27 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Fam Zheng, Peter Lieven, hutao

This patch adds a new option preallocation for raw format, and implements
full preallocation.

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
---
 block/raw-posix.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index 19181f2..e09e170 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -1199,11 +1199,22 @@ static int raw_create(const char *filename, QEMUOptionParameter *options,
     int fd;
     int result = 0;
     int64_t total_size = 0;
+    int prealloc = PREALLOC_OFF;
 
     /* Read out options */
     while (options && options->name) {
         if (!strcmp(options->name, BLOCK_OPT_SIZE)) {
             total_size = options->value.n / BDRV_SECTOR_SIZE;
+        } else if (!strcmp(options->name, BLOCK_OPT_PREALLOC)) {
+            if (!options->value.s || !strcmp(options->value.s, "off")) {
+                prealloc = PREALLOC_OFF;
+            } else if (!strcmp(options->value.s, "full")) {
+                prealloc = PREALLOC_FULL;
+            } else {
+                error_setg(errp, "Invalid preallocation mode: '%s'",
+                           options->value.s);
+                return -EINVAL;
+            }
         }
         options++;
     }
@@ -1218,6 +1229,12 @@ static int raw_create(const char *filename, QEMUOptionParameter *options,
             result = -errno;
             error_setg_errno(errp, -result, "Could not resize file");
         }
+        if (prealloc == PREALLOC_FULL &&
+            raw_preallocate2(fd, 0, total_size * BDRV_SECTOR_SIZE) != 0) {
+            result = -errno;
+            error_setg_errno(errp, -result,
+                             "Could not preallocate data for the new file");
+        }
         if (qemu_close(fd) != 0) {
             result = -errno;
             error_setg_errno(errp, -result, "Could not close the new file");
@@ -1373,6 +1390,11 @@ static QEMUOptionParameter raw_create_options[] = {
         .type = OPT_SIZE,
         .help = "Virtual disk size"
     },
+    {
+        .name = BLOCK_OPT_PREALLOC,
+        .type = OPT_STRING,
+        .help = "Preallocation mode (allowed values: off, full)"
+    },
     { NULL }
 };
 
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [Qemu-devel] [RFC PATCH v3 5/6] qcow2: implement bdrv_preallocate
  2013-12-19  2:27 [Qemu-devel] [RFC PATCH v3 0/6] qemu-img: add preallocation=full Hu Tao
                   ` (3 preceding siblings ...)
  2013-12-19  2:27 ` [Qemu-devel] [RFC PATCH v3 4/6] raw-posix: Add full image preallocation option Hu Tao
@ 2013-12-19  2:27 ` Hu Tao
  2013-12-20 10:19   ` Stefan Hajnoczi
  2013-12-19  2:27 ` [Qemu-devel] [RFC PATCH v3 6/6] qcow2: Add full image preallocation option Hu Tao
  2013-12-20 10:30 ` [Qemu-devel] [RFC PATCH v3 0/6] qemu-img: add preallocation=full Stefan Hajnoczi
  6 siblings, 1 reply; 18+ messages in thread
From: Hu Tao @ 2013-12-19  2:27 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Fam Zheng, Peter Lieven, hutao

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
---
 block/qcow2.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/block/qcow2.c b/block/qcow2.c
index 32cb39f..487a595 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2188,6 +2188,12 @@ static int qcow2_amend_options(BlockDriverState *bs,
     return 0;
 }
 
+static int qcow2_preallocate(BlockDriverState *bs, int64_t offset,
+                             int64_t length)
+{
+    return bdrv_preallocate(bs->file, offset, length);
+}
+
 static QEMUOptionParameter qcow2_create_options[] = {
     {
         .name = BLOCK_OPT_SIZE,
@@ -2242,6 +2248,7 @@ static BlockDriver bdrv_qcow2 = {
     .bdrv_reopen_prepare  = qcow2_reopen_prepare,
     .bdrv_create        = qcow2_create,
     .bdrv_has_zero_init = bdrv_has_zero_init_1,
+    .bdrv_preallocate   = qcow2_preallocate,
     .bdrv_co_get_block_status = qcow2_co_get_block_status,
     .bdrv_set_key       = qcow2_set_key,
     .bdrv_make_empty    = qcow2_make_empty,
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [Qemu-devel] [RFC PATCH v3 6/6] qcow2: Add full image preallocation option
  2013-12-19  2:27 [Qemu-devel] [RFC PATCH v3 0/6] qemu-img: add preallocation=full Hu Tao
                   ` (4 preceding siblings ...)
  2013-12-19  2:27 ` [Qemu-devel] [RFC PATCH v3 5/6] qcow2: implement bdrv_preallocate Hu Tao
@ 2013-12-19  2:27 ` Hu Tao
  2013-12-20 10:21   ` Stefan Hajnoczi
  2013-12-20 10:30 ` [Qemu-devel] [RFC PATCH v3 0/6] qemu-img: add preallocation=full Stefan Hajnoczi
  6 siblings, 1 reply; 18+ messages in thread
From: Hu Tao @ 2013-12-19  2:27 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Fam Zheng, Peter Lieven, hutao

This adds a preallocation=full mode to qcow2 image creation, which
creates a non-sparse image file.

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
---
 block/qcow2.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 487a595..3c41d4a 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1386,7 +1386,7 @@ static int qcow2_change_backing_file(BlockDriverState *bs,
     return qcow2_update_header(bs);
 }
 
-static int preallocate(BlockDriverState *bs)
+static int preallocate(BlockDriverState *bs, enum prealloc_mode mode)
 {
     uint64_t nb_sectors;
     uint64_t offset;
@@ -1395,9 +1395,19 @@ static int preallocate(BlockDriverState *bs)
     int ret;
     QCowL2Meta *meta;
 
+    assert(mode != PREALLOC_OFF);
+
     nb_sectors = bdrv_getlength(bs) >> 9;
     offset = 0;
 
+    if (mode == PREALLOC_FULL) {
+        ret = bdrv_preallocate(bs->file, 0, bdrv_getlength(bs));
+        if (ret < 0) {
+            return ret;
+        }
+    }
+
+    /* allocate metadata */
     while (nb_sectors) {
         num = MIN(nb_sectors, INT_MAX >> 9);
         ret = qcow2_alloc_cluster_offset(bs, offset, 0, num, &num,
@@ -1577,11 +1587,11 @@ static int qcow2_create2(const char *filename, int64_t total_size,
         }
     }
 
-    /* And if we're supposed to preallocate metadata, do that now */
+    /* And if we're supposed to preallocate data, do that now */
     if (prealloc) {
         BDRVQcowState *s = bs->opaque;
         qemu_co_mutex_lock(&s->lock);
-        ret = preallocate(bs);
+        ret = preallocate(bs, prealloc);
         qemu_co_mutex_unlock(&s->lock);
         if (ret < 0) {
             error_setg_errno(errp, -ret, "Could not preallocate metadata");
@@ -1638,6 +1648,8 @@ static int qcow2_create(const char *filename, QEMUOptionParameter *options,
                 prealloc = PREALLOC_OFF;
             } else if (!strcmp(options->value.s, "metadata")) {
                 prealloc = PREALLOC_METADATA;
+            } else if (!strcmp(options->value.s, "full")) {
+                prealloc = PREALLOC_FULL;
             } else {
                 error_setg(errp, "Invalid preallocation mode: '%s'",
                            options->value.s);
@@ -2229,7 +2241,7 @@ static QEMUOptionParameter qcow2_create_options[] = {
     {
         .name = BLOCK_OPT_PREALLOC,
         .type = OPT_STRING,
-        .help = "Preallocation mode (allowed values: off, metadata)"
+        .help = "Preallocation mode (allowed values: off, metadata, full)"
     },
     {
         .name = BLOCK_OPT_LAZY_REFCOUNTS,
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v3 1/6] block: introduce prealloc_mode
  2013-12-19  2:27 ` [Qemu-devel] [RFC PATCH v3 1/6] block: introduce prealloc_mode Hu Tao
@ 2013-12-20 10:08   ` Stefan Hajnoczi
  2013-12-20 12:36   ` Eric Blake
  1 sibling, 0 replies; 18+ messages in thread
From: Stefan Hajnoczi @ 2013-12-20 10:08 UTC (permalink / raw)
  To: Hu Tao; +Cc: Kevin Wolf, Peter Lieven, Fam Zheng, qemu-devel

On Thu, Dec 19, 2013 at 10:27:36AM +0800, Hu Tao wrote:
> diff --git a/include/block/block.h b/include/block/block.h
> index 36efaea..3732f25 100644
> --- a/include/block/block.h
> +++ b/include/block/block.h
> @@ -527,4 +527,10 @@ int bdrv_debug_remove_breakpoint(BlockDriverState *bs, const char *tag);
>  int bdrv_debug_resume(BlockDriverState *bs, const char *tag);
>  bool bdrv_debug_is_suspended(BlockDriverState *bs, const char *tag);
>  
> +enum prealloc_mode {

Please follow modern QEMU coding style and use CamelCase: PreallocMode

It would be nice to use PreallocMode instead int for variables.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v3 2/6] block: add BlockDriver.bdrv_preallocate.
  2013-12-19  2:27 ` [Qemu-devel] [RFC PATCH v3 2/6] block: add BlockDriver.bdrv_preallocate Hu Tao
@ 2013-12-20 10:10   ` Stefan Hajnoczi
  2013-12-24  8:31     ` Hu Tao
  0 siblings, 1 reply; 18+ messages in thread
From: Stefan Hajnoczi @ 2013-12-20 10:10 UTC (permalink / raw)
  To: Hu Tao; +Cc: Kevin Wolf, Peter Lieven, Fam Zheng, qemu-devel

On Thu, Dec 19, 2013 at 10:27:37AM +0800, Hu Tao wrote:
> diff --git a/block.c b/block.c
> index 64e7d22..b901587 100644
> --- a/block.c
> +++ b/block.c
> @@ -3216,6 +3216,19 @@ bool bdrv_can_write_zeroes_with_unmap(BlockDriverState *bs)
>      return false;
>  }
>  
> +int bdrv_preallocate(BlockDriverState *bs, int64_t offset, int64_t length)
> +{
> +    if (bs->backing_hd) {
> +        return -ENOTSUP;
> +    }

Depending on the image file format it may be possible to preallocate
metadata while using a backing file.  Why prevent this?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v3 3/6] block/raw-posix: implement bdrv_preallocate
  2013-12-19  2:27 ` [Qemu-devel] [RFC PATCH v3 3/6] block/raw-posix: implement bdrv_preallocate Hu Tao
@ 2013-12-20 10:14   ` Stefan Hajnoczi
  2013-12-23  1:45     ` Hu Tao
  0 siblings, 1 reply; 18+ messages in thread
From: Stefan Hajnoczi @ 2013-12-20 10:14 UTC (permalink / raw)
  To: Hu Tao; +Cc: Kevin Wolf, Peter Lieven, Fam Zheng, qemu-devel

On Thu, Dec 19, 2013 at 10:27:38AM +0800, Hu Tao wrote:
> +static int raw_preallocate2(int fd, int64_t offset, int64_t length)
> +{
> +    int ret = -1;
> +
> +    ret = fallocate(fd, 0, offset, length);
> +
> +    /* fallback to posix_fallocate() if fallocate() is not supported */
> +    if (ret < 0 && (errno == ENOSYS || errno == EOPNOTSUPP)) {
> +        ret = posix_fallocate(fd, offset, length);
> +    }
> +
> +    return ret;

Return value semantics differ between the two functions:
 * fallocate - return 0 or -1 with errno set
 * posix_fallocate - return 0 or error number (without using errno!)

Please make it consistent.  Usually in QEMU we return 0 on success and
-errno on failure.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v3 5/6] qcow2: implement bdrv_preallocate
  2013-12-19  2:27 ` [Qemu-devel] [RFC PATCH v3 5/6] qcow2: implement bdrv_preallocate Hu Tao
@ 2013-12-20 10:19   ` Stefan Hajnoczi
  0 siblings, 0 replies; 18+ messages in thread
From: Stefan Hajnoczi @ 2013-12-20 10:19 UTC (permalink / raw)
  To: Hu Tao; +Cc: Kevin Wolf, Peter Lieven, Fam Zheng, qemu-devel

On Thu, Dec 19, 2013 at 10:27:40AM +0800, Hu Tao wrote:
> Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
> ---
>  block/qcow2.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/block/qcow2.c b/block/qcow2.c
> index 32cb39f..487a595 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -2188,6 +2188,12 @@ static int qcow2_amend_options(BlockDriverState *bs,
>      return 0;
>  }
>  
> +static int qcow2_preallocate(BlockDriverState *bs, int64_t offset,
> +                             int64_t length)
> +{
> +    return bdrv_preallocate(bs->file, offset, length);
> +}

What about qcow2-level preallocation (metadata)?

I'm not sure what the meaning of offset and length are here - are they
supposed to be virtual disk LBAs.  They are being passed through to
bs->file so they actually become physical file offsets.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v3 6/6] qcow2: Add full image preallocation option
  2013-12-19  2:27 ` [Qemu-devel] [RFC PATCH v3 6/6] qcow2: Add full image preallocation option Hu Tao
@ 2013-12-20 10:21   ` Stefan Hajnoczi
  0 siblings, 0 replies; 18+ messages in thread
From: Stefan Hajnoczi @ 2013-12-20 10:21 UTC (permalink / raw)
  To: Hu Tao; +Cc: Kevin Wolf, Peter Lieven, Fam Zheng, qemu-devel

On Thu, Dec 19, 2013 at 10:27:41AM +0800, Hu Tao wrote:
> -static int preallocate(BlockDriverState *bs)
> +static int preallocate(BlockDriverState *bs, enum prealloc_mode mode)
>  {
>      uint64_t nb_sectors;
>      uint64_t offset;
> @@ -1395,9 +1395,19 @@ static int preallocate(BlockDriverState *bs)
>      int ret;
>      QCowL2Meta *meta;
>  
> +    assert(mode != PREALLOC_OFF);
> +
>      nb_sectors = bdrv_getlength(bs) >> 9;
>      offset = 0;
>  
> +    if (mode == PREALLOC_FULL) {
> +        ret = bdrv_preallocate(bs->file, 0, bdrv_getlength(bs));

This doesn't make sense since it fails to factor in qcow2 metadata and
the file layout.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v3 0/6] qemu-img: add preallocation=full
  2013-12-19  2:27 [Qemu-devel] [RFC PATCH v3 0/6] qemu-img: add preallocation=full Hu Tao
                   ` (5 preceding siblings ...)
  2013-12-19  2:27 ` [Qemu-devel] [RFC PATCH v3 6/6] qcow2: Add full image preallocation option Hu Tao
@ 2013-12-20 10:30 ` Stefan Hajnoczi
  2013-12-20 12:54   ` Eric Blake
  2013-12-23  1:59   ` Hu Tao
  6 siblings, 2 replies; 18+ messages in thread
From: Stefan Hajnoczi @ 2013-12-20 10:30 UTC (permalink / raw)
  To: Hu Tao; +Cc: Kevin Wolf, Peter Lieven, Fam Zheng, qemu-devel

On Thu, Dec 19, 2013 at 10:27:35AM +0800, Hu Tao wrote:
> This series implements full image preallocation to create a non-sparse image
> file at creation time, both for raw and qcow2 format. The purpose is to avoid
> performance deterioration of the guest cause by sparse image.

I'm not sure the new bdrv_preallocate() interface is necessary.  Can
qcow2_create() simply pass a preallocate=full option to
bdrv_create_file()?

It's simpler if we avoid bdrv_preallocate(), just keep it a
.bdrv_create() option.  That way we also avoid the question of how
exactly preallocate functions on an image file that already has blocks
allocated.

So what I imagine is:
1. Add preallocation=full support to raw-posix.c
2. Add preallocation=full support to qcow2.c, calculate image file size
including metadata for bdrv_create_file().

CCing Eric Blake so libvirt can consider how full preallocation
interacts with their safezero() preallocation.  If QEMU handles
preallocation can we skip safezero()?  Is there a concern about full
preallocation in QEMU taking a long time without progress report?

Stefan

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v3 1/6] block: introduce prealloc_mode
  2013-12-19  2:27 ` [Qemu-devel] [RFC PATCH v3 1/6] block: introduce prealloc_mode Hu Tao
  2013-12-20 10:08   ` Stefan Hajnoczi
@ 2013-12-20 12:36   ` Eric Blake
  1 sibling, 0 replies; 18+ messages in thread
From: Eric Blake @ 2013-12-20 12:36 UTC (permalink / raw)
  To: Hu Tao, qemu-devel; +Cc: Kevin Wolf, Peter Lieven, Fam Zheng

[-- Attachment #1: Type: text/plain, Size: 1049 bytes --]

On 12/18/2013 07:27 PM, Hu Tao wrote:
> This patch prepares for the subsequent patches.
> 
> Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
> ---
>  block/qcow2.c         | 6 +++---
>  include/block/block.h | 6 ++++++
>  2 files changed, 9 insertions(+), 3 deletions(-)
> 

> +++ b/include/block/block.h
> @@ -527,4 +527,10 @@ int bdrv_debug_remove_breakpoint(BlockDriverState *bs, const char *tag);
>  int bdrv_debug_resume(BlockDriverState *bs, const char *tag);
>  bool bdrv_debug_is_suspended(BlockDriverState *bs, const char *tag);
>  
> +enum prealloc_mode {
> +    PREALLOC_OFF = 0,
> +    PREALLOC_METADATA,
> +    PREALLOC_FULL,
> +};

Why not make this a QAPI enum:

{ 'enum': 'PreallocMode', 'data': ['off', 'metadata', 'full'] }

After all, this is the sort of thing where we will eventually want to
select the prealloc mode of new files being created by online operations
such as snapshot creation.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v3 0/6] qemu-img: add preallocation=full
  2013-12-20 10:30 ` [Qemu-devel] [RFC PATCH v3 0/6] qemu-img: add preallocation=full Stefan Hajnoczi
@ 2013-12-20 12:54   ` Eric Blake
  2013-12-23  1:59   ` Hu Tao
  1 sibling, 0 replies; 18+ messages in thread
From: Eric Blake @ 2013-12-20 12:54 UTC (permalink / raw)
  To: Stefan Hajnoczi, Hu Tao; +Cc: Kevin Wolf, Peter Lieven, Fam Zheng, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 2295 bytes --]

On 12/20/2013 03:30 AM, Stefan Hajnoczi wrote:
> On Thu, Dec 19, 2013 at 10:27:35AM +0800, Hu Tao wrote:
>> This series implements full image preallocation to create a non-sparse image
>> file at creation time, both for raw and qcow2 format. The purpose is to avoid
>> performance deterioration of the guest cause by sparse image.
> 
> I'm not sure the new bdrv_preallocate() interface is necessary.  Can
> qcow2_create() simply pass a preallocate=full option to
> bdrv_create_file()?
> 
> It's simpler if we avoid bdrv_preallocate(), just keep it a
> .bdrv_create() option.  That way we also avoid the question of how
> exactly preallocate functions on an image file that already has blocks
> allocated.
> 
> So what I imagine is:
> 1. Add preallocation=full support to raw-posix.c
> 2. Add preallocation=full support to qcow2.c, calculate image file size
> including metadata for bdrv_create_file().
> 
> CCing Eric Blake so libvirt can consider how full preallocation
> interacts with their safezero() preallocation.  If QEMU handles
> preallocation can we skip safezero()?  Is there a concern about full
> preallocation in QEMU taking a long time without progress report?

If libvirt can detect that qemu is new enough to do preallocation, then
libvirt can skip its own preallocation.  Which means you need to make
sure that QMP can probe the existence of this new feature (probably
already possible via query-command-line-options, but see also my reply
to 1/6 about starting to put this in QAPI).

The fact that preallocation is sometimes long running is definitely
worth considering; but it raises the more generic issue of how to turn
any long running blocking command into an asynchronous job that gets
started, then has both polling and event-notification progress until
completion, as well as the option of an early abort if it is taking too
long.

And there's my long-standing gripe that qemu-img doesn't support -p
progress meters for enough of its already-existing long-running
commands, which makes it harder for libvirt to provide wrappers around
qemu-img operations when manipulating images not associated with a
running VM.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v3 3/6] block/raw-posix: implement bdrv_preallocate
  2013-12-20 10:14   ` Stefan Hajnoczi
@ 2013-12-23  1:45     ` Hu Tao
  0 siblings, 0 replies; 18+ messages in thread
From: Hu Tao @ 2013-12-23  1:45 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Kevin Wolf, Peter Lieven, Fam Zheng, qemu-devel

On Fri, Dec 20, 2013 at 11:14:05AM +0100, Stefan Hajnoczi wrote:
> On Thu, Dec 19, 2013 at 10:27:38AM +0800, Hu Tao wrote:
> > +static int raw_preallocate2(int fd, int64_t offset, int64_t length)
> > +{
> > +    int ret = -1;
> > +
> > +    ret = fallocate(fd, 0, offset, length);
> > +
> > +    /* fallback to posix_fallocate() if fallocate() is not supported */
> > +    if (ret < 0 && (errno == ENOSYS || errno == EOPNOTSUPP)) {
> > +        ret = posix_fallocate(fd, offset, length);
> > +    }
> > +
> > +    return ret;
> 
> Return value semantics differ between the two functions:
>  * fallocate - return 0 or -1 with errno set
>  * posix_fallocate - return 0 or error number (without using errno!)
> 
> Please make it consistent.  Usually in QEMU we return 0 on success and
> -errno on failure.

Thanks for catching this!

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v3 0/6] qemu-img: add preallocation=full
  2013-12-20 10:30 ` [Qemu-devel] [RFC PATCH v3 0/6] qemu-img: add preallocation=full Stefan Hajnoczi
  2013-12-20 12:54   ` Eric Blake
@ 2013-12-23  1:59   ` Hu Tao
  1 sibling, 0 replies; 18+ messages in thread
From: Hu Tao @ 2013-12-23  1:59 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Kevin Wolf, Peter Lieven, Fam Zheng, qemu-devel

On Fri, Dec 20, 2013 at 11:30:21AM +0100, Stefan Hajnoczi wrote:
> On Thu, Dec 19, 2013 at 10:27:35AM +0800, Hu Tao wrote:
> > This series implements full image preallocation to create a non-sparse image
> > file at creation time, both for raw and qcow2 format. The purpose is to avoid
> > performance deterioration of the guest cause by sparse image.
> 
> I'm not sure the new bdrv_preallocate() interface is necessary.  Can
> qcow2_create() simply pass a preallocate=full option to
> bdrv_create_file()?

Looks simpler.  I'll try this. 

> 
> It's simpler if we avoid bdrv_preallocate(), just keep it a
> .bdrv_create() option.  That way we also avoid the question of how
> exactly preallocate functions on an image file that already has blocks
> allocated.
> 
> So what I imagine is:
> 1. Add preallocation=full support to raw-posix.c
> 2. Add preallocation=full support to qcow2.c, calculate image file size
> including metadata for bdrv_create_file().
> 
> CCing Eric Blake so libvirt can consider how full preallocation
> interacts with their safezero() preallocation.  If QEMU handles
> preallocation can we skip safezero()?  Is there a concern about full
> preallocation in QEMU taking a long time without progress report?
> 
> Stefan

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v3 2/6] block: add BlockDriver.bdrv_preallocate.
  2013-12-20 10:10   ` Stefan Hajnoczi
@ 2013-12-24  8:31     ` Hu Tao
  0 siblings, 0 replies; 18+ messages in thread
From: Hu Tao @ 2013-12-24  8:31 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Kevin Wolf, Fam Zheng, Peter Lieven, qemu-devel

On Fri, Dec 20, 2013 at 11:10:30AM +0100, Stefan Hajnoczi wrote:
> On Thu, Dec 19, 2013 at 10:27:37AM +0800, Hu Tao wrote:
> > diff --git a/block.c b/block.c
> > index 64e7d22..b901587 100644
> > --- a/block.c
> > +++ b/block.c
> > @@ -3216,6 +3216,19 @@ bool bdrv_can_write_zeroes_with_unmap(BlockDriverState *bs)
> >      return false;
> >  }
> >  
> > +int bdrv_preallocate(BlockDriverState *bs, int64_t offset, int64_t length)
> > +{
> > +    if (bs->backing_hd) {
> > +        return -ENOTSUP;
> > +    }
> 
> Depending on the image file format it may be possible to preallocate
> metadata while using a backing file.  Why prevent this?

I thought in the case we have no need to preallocate forbacking file.
But yes, we can also preallocate for bs when there is backing file.
Thanks!

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2013-12-24  8:33 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-12-19  2:27 [Qemu-devel] [RFC PATCH v3 0/6] qemu-img: add preallocation=full Hu Tao
2013-12-19  2:27 ` [Qemu-devel] [RFC PATCH v3 1/6] block: introduce prealloc_mode Hu Tao
2013-12-20 10:08   ` Stefan Hajnoczi
2013-12-20 12:36   ` Eric Blake
2013-12-19  2:27 ` [Qemu-devel] [RFC PATCH v3 2/6] block: add BlockDriver.bdrv_preallocate Hu Tao
2013-12-20 10:10   ` Stefan Hajnoczi
2013-12-24  8:31     ` Hu Tao
2013-12-19  2:27 ` [Qemu-devel] [RFC PATCH v3 3/6] block/raw-posix: implement bdrv_preallocate Hu Tao
2013-12-20 10:14   ` Stefan Hajnoczi
2013-12-23  1:45     ` Hu Tao
2013-12-19  2:27 ` [Qemu-devel] [RFC PATCH v3 4/6] raw-posix: Add full image preallocation option Hu Tao
2013-12-19  2:27 ` [Qemu-devel] [RFC PATCH v3 5/6] qcow2: implement bdrv_preallocate Hu Tao
2013-12-20 10:19   ` Stefan Hajnoczi
2013-12-19  2:27 ` [Qemu-devel] [RFC PATCH v3 6/6] qcow2: Add full image preallocation option Hu Tao
2013-12-20 10:21   ` Stefan Hajnoczi
2013-12-20 10:30 ` [Qemu-devel] [RFC PATCH v3 0/6] qemu-img: add preallocation=full Stefan Hajnoczi
2013-12-20 12:54   ` Eric Blake
2013-12-23  1:59   ` Hu Tao

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.