All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v12 0/6] qcow2, raw: add preallocation=full and preallocation=falloc
@ 2014-07-11  6:09 Hu Tao
  2014-07-11  6:09 ` [Qemu-devel] [PATCH v12 1/6] block: round up file size to nearest sector Hu Tao
                   ` (6 more replies)
  0 siblings, 7 replies; 35+ messages in thread
From: Hu Tao @ 2014-07-11  6:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Yasunori Goto, Stefan Hajnoczi, Max Reitz

This series adds two preallocation mode to qcow2 and raw:

Option preallocation=full preallocates disk space for image by writing
zeros to disk, this ensures disk space in any cases.

Option preallocation=falloc preallocates disk space by calling
posix_fallocate(). This is faster than preallocation=full.

This series depends on patches 1-3 of Max's series 'qemu-img: Implement
commit like QMP'. Specifically, patch 6 'qcow2: Add falloc and full
preallocation option' uses minimal_blob_size() introduced by Max's patch
'qcow2: Optimize bdrv_make_empty()'.

The series is also at https://github.com/taohu/qemu/commits/preallocation-v12
for you to check out, including depended patches from Max.

Eric, I'm afraid now we missed qemu 2.1, so patch 1 is still sent
with this series.

changes to v11:

 - fix test case 049 (patch 4)
 - unsigned nl2e -> uint64_t nl2e (patch 6)
 - use >> instead of / (patch 6)


Hu Tao (6):
  block: round up file size to nearest sector
  raw, qcow2: don't convert file size to sector size
  rename parse_enum_option to qapi_enum_parse and make it public
  qapi: introduce PreallocMode and a new PreallocMode full.
  raw-posix: Add falloc and full preallocation option
  qcow2: Add falloc and full preallocation option

 block/qcow2.c              | 56 +++++++++++++++++++++-------
 block/raw-posix.c          | 92 +++++++++++++++++++++++++++++++++++-----------
 block/raw-win32.c          |  6 +--
 blockdev.c                 | 30 +++------------
 include/qapi/util.h        | 17 +++++++++
 qapi/Makefile.objs         |  1 +
 qapi/block-core.json       | 17 +++++++++
 qapi/qapi-util.c           | 32 ++++++++++++++++
 tests/qemu-iotests/049.out |  2 +-
 tests/qemu-iotests/082.out | 54 +++++++++++++--------------
 tests/qemu-iotests/096     | 64 ++++++++++++++++++++++++++++++++
 tests/qemu-iotests/096.out | 14 +++++++
 tests/qemu-iotests/group   |  1 +
 13 files changed, 296 insertions(+), 90 deletions(-)
 create mode 100644 include/qapi/util.h
 create mode 100644 qapi/qapi-util.c
 create mode 100755 tests/qemu-iotests/096
 create mode 100644 tests/qemu-iotests/096.out

-- 
1.9.3

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v12 1/6] block: round up file size to nearest sector
  2014-07-11  6:09 [Qemu-devel] [PATCH v12 0/6] qcow2, raw: add preallocation=full and preallocation=falloc Hu Tao
@ 2014-07-11  6:09 ` Hu Tao
  2014-08-22 10:55   ` Kevin Wolf
  2014-07-11  6:09 ` [Qemu-devel] [PATCH v12 2/6] raw, qcow2: don't convert file size to sector size Hu Tao
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 35+ messages in thread
From: Hu Tao @ 2014-07-11  6:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Yasunori Goto, Stefan Hajnoczi, Max Reitz

Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
---
 block/qcow2.c              |  3 ++-
 block/raw-posix.c          |  4 +--
 block/raw-win32.c          |  4 +--
 tests/qemu-iotests/096     | 64 ++++++++++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/096.out | 14 ++++++++++
 tests/qemu-iotests/group   |  1 +
 6 files changed, 85 insertions(+), 5 deletions(-)
 create mode 100755 tests/qemu-iotests/096
 create mode 100644 tests/qemu-iotests/096.out

diff --git a/block/qcow2.c b/block/qcow2.c
index 0aba8dd..0dfbb9a 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1777,7 +1777,8 @@ static int qcow2_create(const char *filename, QemuOpts *opts, Error **errp)
     int ret;
 
     /* Read out options */
-    sectors = qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0) / 512;
+    sectors = DIV_ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
+                           BDRV_SECTOR_SIZE);
     backing_file = qemu_opt_get_del(opts, BLOCK_OPT_BACKING_FILE);
     backing_fmt = qemu_opt_get_del(opts, BLOCK_OPT_BACKING_FMT);
     if (qemu_opt_get_bool_del(opts, BLOCK_OPT_ENCRYPT, false)) {
diff --git a/block/raw-posix.c b/block/raw-posix.c
index a857def..822522c 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -1332,8 +1332,8 @@ static int raw_create(const char *filename, QemuOpts *opts, Error **errp)
     strstart(filename, "file:", &filename);
 
     /* Read out options */
-    total_size =
-        qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0) / BDRV_SECTOR_SIZE;
+    total_size = DIV_ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
+                              BDRV_SECTOR_SIZE);
     nocow = qemu_opt_get_bool(opts, BLOCK_OPT_NOCOW, false);
 
     fd = qemu_open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
diff --git a/block/raw-win32.c b/block/raw-win32.c
index 902eab6..1e1880d 100644
--- a/block/raw-win32.c
+++ b/block/raw-win32.c
@@ -511,8 +511,8 @@ static int raw_create(const char *filename, QemuOpts *opts, Error **errp)
     strstart(filename, "file:", &filename);
 
     /* Read out options */
-    total_size =
-        qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0) / 512;
+    total_size = DIV_ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
+                              BDRV_SECTOR_SIZE);
 
     fd = qemu_open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
                    0644);
diff --git a/tests/qemu-iotests/096 b/tests/qemu-iotests/096
new file mode 100755
index 0000000..1c2f6a3
--- /dev/null
+++ b/tests/qemu-iotests/096
@@ -0,0 +1,64 @@
+#!/bin/bash
+#
+# Test qcow2 creation with aligned and unaligned sizes
+#
+# Copyright (C) 2014 Fujitsu.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+# creator
+owner=hutao@cn.fujitsu.com
+
+seq=`basename $0`
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1	# failure is the default!
+
+_cleanup()
+{
+	_cleanup_test_img
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+
+_supported_fmt qcow2
+_supported_proto file
+_supported_os Linux
+
+function test_qemu_img()
+{
+    echo qemu-img "$@" | _filter_testdir
+    $QEMU_IMG "$@" 2>&1 | _filter_testdir
+    echo
+}
+
+echo "=== Check qemu-img info output ==="
+echo
+image_sizes="1024 1234"
+
+for s in $image_sizes; do
+    _make_test_img $s
+    _img_info
+done
+
+# success, all done
+echo "*** done"
+rm -f $seq.full
+status=0
diff --git a/tests/qemu-iotests/096.out b/tests/qemu-iotests/096.out
new file mode 100644
index 0000000..5f6c262
--- /dev/null
+++ b/tests/qemu-iotests/096.out
@@ -0,0 +1,14 @@
+QA output created by 096
+=== Check qemu-img info output ===
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1024 
+image: TEST_DIR/t.IMGFMT
+file format: IMGFMT
+virtual size: 1.0K (1024 bytes)
+cluster_size: 65536
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1234 
+image: TEST_DIR/t.IMGFMT
+file format: IMGFMT
+virtual size: 1.5K (1536 bytes)
+cluster_size: 65536
+***done
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index 6e67f61..714158f 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -100,3 +100,4 @@
 091 rw auto quick
 092 rw auto quick
 095 rw auto quick
+096 rw auto
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v12 2/6] raw, qcow2: don't convert file size to sector size
  2014-07-11  6:09 [Qemu-devel] [PATCH v12 0/6] qcow2, raw: add preallocation=full and preallocation=falloc Hu Tao
  2014-07-11  6:09 ` [Qemu-devel] [PATCH v12 1/6] block: round up file size to nearest sector Hu Tao
@ 2014-07-11  6:09 ` Hu Tao
  2014-07-11  6:10 ` [Qemu-devel] [PATCH v12 3/6] rename parse_enum_option to qapi_enum_parse and make it public Hu Tao
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 35+ messages in thread
From: Hu Tao @ 2014-07-11  6:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Yasunori Goto, Stefan Hajnoczi, Max Reitz

and avoid converting it back later.

Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
---
 block/qcow2.c     | 10 +++++-----
 block/raw-posix.c |  6 +++---
 block/raw-win32.c |  6 +++---
 3 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 0dfbb9a..05df4b9 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1715,7 +1715,7 @@ static int qcow2_create2(const char *filename, int64_t total_size,
     }
 
     /* Okay, now that we have a valid image, let's give it the right size */
-    ret = bdrv_truncate(bs, total_size * BDRV_SECTOR_SIZE);
+    ret = bdrv_truncate(bs, total_size);
     if (ret < 0) {
         error_setg_errno(errp, -ret, "Could not resize image");
         goto out;
@@ -1768,7 +1768,7 @@ static int qcow2_create(const char *filename, QemuOpts *opts, Error **errp)
     char *backing_file = NULL;
     char *backing_fmt = NULL;
     char *buf = NULL;
-    uint64_t sectors = 0;
+    uint64_t size = 0;
     int flags = 0;
     size_t cluster_size = DEFAULT_CLUSTER_SIZE;
     int prealloc = 0;
@@ -1777,8 +1777,8 @@ static int qcow2_create(const char *filename, QemuOpts *opts, Error **errp)
     int ret;
 
     /* Read out options */
-    sectors = DIV_ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
-                           BDRV_SECTOR_SIZE);
+    size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
+                    BDRV_SECTOR_SIZE);
     backing_file = qemu_opt_get_del(opts, BLOCK_OPT_BACKING_FILE);
     backing_fmt = qemu_opt_get_del(opts, BLOCK_OPT_BACKING_FMT);
     if (qemu_opt_get_bool_del(opts, BLOCK_OPT_ENCRYPT, false)) {
@@ -1828,7 +1828,7 @@ static int qcow2_create(const char *filename, QemuOpts *opts, Error **errp)
         goto finish;
     }
 
-    ret = qcow2_create2(filename, sectors, backing_file, backing_fmt, flags,
+    ret = qcow2_create2(filename, size, backing_file, backing_fmt, flags,
                         cluster_size, prealloc, opts, version, &local_err);
     if (local_err) {
         error_propagate(errp, local_err);
diff --git a/block/raw-posix.c b/block/raw-posix.c
index 822522c..2af24af 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -1332,8 +1332,8 @@ static int raw_create(const char *filename, QemuOpts *opts, Error **errp)
     strstart(filename, "file:", &filename);
 
     /* Read out options */
-    total_size = DIV_ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
-                              BDRV_SECTOR_SIZE);
+    total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
+                          BDRV_SECTOR_SIZE);
     nocow = qemu_opt_get_bool(opts, BLOCK_OPT_NOCOW, false);
 
     fd = qemu_open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
@@ -1357,7 +1357,7 @@ static int raw_create(const char *filename, QemuOpts *opts, Error **errp)
 #endif
         }
 
-        if (ftruncate(fd, total_size * BDRV_SECTOR_SIZE) != 0) {
+        if (ftruncate(fd, total_size) != 0) {
             result = -errno;
             error_setg_errno(errp, -result, "Could not resize file");
         }
diff --git a/block/raw-win32.c b/block/raw-win32.c
index 1e1880d..9bf8225 100644
--- a/block/raw-win32.c
+++ b/block/raw-win32.c
@@ -511,8 +511,8 @@ static int raw_create(const char *filename, QemuOpts *opts, Error **errp)
     strstart(filename, "file:", &filename);
 
     /* Read out options */
-    total_size = DIV_ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
-                              BDRV_SECTOR_SIZE);
+    total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
+                          BDRV_SECTOR_SIZE);
 
     fd = qemu_open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
                    0644);
@@ -521,7 +521,7 @@ static int raw_create(const char *filename, QemuOpts *opts, Error **errp)
         return -EIO;
     }
     set_sparse(fd);
-    ftruncate(fd, total_size * 512);
+    ftruncate(fd, total_size);
     qemu_close(fd);
     return 0;
 }
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v12 3/6] rename parse_enum_option to qapi_enum_parse and make it public
  2014-07-11  6:09 [Qemu-devel] [PATCH v12 0/6] qcow2, raw: add preallocation=full and preallocation=falloc Hu Tao
  2014-07-11  6:09 ` [Qemu-devel] [PATCH v12 1/6] block: round up file size to nearest sector Hu Tao
  2014-07-11  6:09 ` [Qemu-devel] [PATCH v12 2/6] raw, qcow2: don't convert file size to sector size Hu Tao
@ 2014-07-11  6:10 ` Hu Tao
  2014-07-11  6:10 ` [Qemu-devel] [PATCH v12 4/6] qapi: introduce PreallocMode and a new PreallocMode full Hu Tao
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 35+ messages in thread
From: Hu Tao @ 2014-07-11  6:10 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Yasunori Goto, Stefan Hajnoczi, Max Reitz

Suggested-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
---
 blockdev.c          | 30 ++++++------------------------
 include/qapi/util.h | 17 +++++++++++++++++
 qapi/Makefile.objs  |  1 +
 qapi/qapi-util.c    | 32 ++++++++++++++++++++++++++++++++
 4 files changed, 56 insertions(+), 24 deletions(-)
 create mode 100644 include/qapi/util.h
 create mode 100644 qapi/qapi-util.c

diff --git a/blockdev.c b/blockdev.c
index 48bd9a3..9b93cf2 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -39,6 +39,7 @@
 #include "qapi/qmp/types.h"
 #include "qapi-visit.h"
 #include "qapi/qmp-output-visitor.h"
+#include "qapi/util.h"
 #include "sysemu/sysemu.h"
 #include "block/block_int.h"
 #include "qmp-commands.h"
@@ -274,25 +275,6 @@ static int parse_block_error_action(const char *buf, bool is_read, Error **errp)
     }
 }
 
-static inline int parse_enum_option(const char *lookup[], const char *buf,
-                                    int max, int def, Error **errp)
-{
-    int i;
-
-    if (!buf) {
-        return def;
-    }
-
-    for (i = 0; i < max; i++) {
-        if (!strcmp(buf, lookup[i])) {
-            return i;
-        }
-    }
-
-    error_setg(errp, "invalid parameter value: %s", buf);
-    return def;
-}
-
 static bool check_throttle_config(ThrottleConfig *cfg, Error **errp)
 {
     if (throttle_conflicting(cfg)) {
@@ -456,11 +438,11 @@ static DriveInfo *blockdev_init(const char *file, QDict *bs_opts,
     }
 
     detect_zeroes =
-        parse_enum_option(BlockdevDetectZeroesOptions_lookup,
-                          qemu_opt_get(opts, "detect-zeroes"),
-                          BLOCKDEV_DETECT_ZEROES_OPTIONS_MAX,
-                          BLOCKDEV_DETECT_ZEROES_OPTIONS_OFF,
-                          &error);
+        qapi_enum_parse(BlockdevDetectZeroesOptions_lookup,
+                        qemu_opt_get(opts, "detect-zeroes"),
+                        BLOCKDEV_DETECT_ZEROES_OPTIONS_MAX,
+                        BLOCKDEV_DETECT_ZEROES_OPTIONS_OFF,
+                        &error);
     if (error) {
         error_propagate(errp, error);
         goto early_err;
diff --git a/include/qapi/util.h b/include/qapi/util.h
new file mode 100644
index 0000000..de9238b
--- /dev/null
+++ b/include/qapi/util.h
@@ -0,0 +1,17 @@
+/*
+ * QAPI util functions
+ *
+ * Copyright Fujitsu, Inc. 2014
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2.1 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ *
+ */
+
+#ifndef QAPI_UTIL_H
+#define QAPI_UTIL_H
+
+int qapi_enum_parse(const char *lookup[], const char *buf,
+                    int max, int def, Error **errp);
+
+#endif
diff --git a/qapi/Makefile.objs b/qapi/Makefile.objs
index d14b769..ffd88a6 100644
--- a/qapi/Makefile.objs
+++ b/qapi/Makefile.objs
@@ -4,3 +4,4 @@ util-obj-y += string-input-visitor.o string-output-visitor.o
 
 util-obj-y += opts-visitor.o
 util-obj-y += qmp-event.o
+util-obj-y += qapi-util.o
diff --git a/qapi/qapi-util.c b/qapi/qapi-util.c
new file mode 100644
index 0000000..3e4e07c
--- /dev/null
+++ b/qapi/qapi-util.c
@@ -0,0 +1,32 @@
+/*
+ * QAPI util functions
+ *
+ * Copyright Fujitsu, Inc. 2014
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2.1 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ *
+ */
+
+#include "qemu-common.h"
+#include "qapi/error.h"
+#include "qapi/util.h"
+
+int qapi_enum_parse(const char *lookup[], const char *buf,
+                    int max, int def, Error **errp)
+{
+    int i;
+
+    if (!buf) {
+        return def;
+    }
+
+    for (i = 0; i < max; i++) {
+        if (!strcmp(buf, lookup[i])) {
+            return i;
+        }
+    }
+
+    error_setg(errp, "invalid parameter value: %s", buf);
+    return def;
+}
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v12 4/6] qapi: introduce PreallocMode and a new PreallocMode full.
  2014-07-11  6:09 [Qemu-devel] [PATCH v12 0/6] qcow2, raw: add preallocation=full and preallocation=falloc Hu Tao
                   ` (2 preceding siblings ...)
  2014-07-11  6:10 ` [Qemu-devel] [PATCH v12 3/6] rename parse_enum_option to qapi_enum_parse and make it public Hu Tao
@ 2014-07-11  6:10 ` Hu Tao
  2014-08-22 10:57   ` Kevin Wolf
  2014-07-11  6:10 ` [Qemu-devel] [PATCH v12 5/6] raw-posix: Add falloc and full preallocation option Hu Tao
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 35+ messages in thread
From: Hu Tao @ 2014-07-11  6:10 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Yasunori Goto, Stefan Hajnoczi, Max Reitz

This patch prepares for the subsequent patches.

Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
---
 block/qcow2.c              | 16 ++++++++--------
 qapi/block-core.json       | 17 +++++++++++++++++
 tests/qemu-iotests/049.out |  2 +-
 3 files changed, 26 insertions(+), 9 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 05df4b9..cfba93b 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -30,6 +30,7 @@
 #include "qemu/error-report.h"
 #include "qapi/qmp/qerror.h"
 #include "qapi/qmp/qbool.h"
+#include "qapi/util.h"
 #include "trace.h"
 #include "qemu/option_int.h"
 
@@ -1594,7 +1595,7 @@ static int preallocate(BlockDriverState *bs)
 
 static int qcow2_create2(const char *filename, int64_t total_size,
                          const char *backing_file, const char *backing_format,
-                         int flags, size_t cluster_size, int prealloc,
+                         int flags, size_t cluster_size, PreallocMode prealloc,
                          QemuOpts *opts, int version,
                          Error **errp)
 {
@@ -1771,7 +1772,7 @@ static int qcow2_create(const char *filename, QemuOpts *opts, Error **errp)
     uint64_t size = 0;
     int flags = 0;
     size_t cluster_size = DEFAULT_CLUSTER_SIZE;
-    int prealloc = 0;
+    PreallocMode prealloc = PREALLOC_MODE_OFF;
     int version = 3;
     Error *local_err = NULL;
     int ret;
@@ -1787,12 +1788,11 @@ static int qcow2_create(const char *filename, QemuOpts *opts, Error **errp)
     cluster_size = qemu_opt_get_size_del(opts, BLOCK_OPT_CLUSTER_SIZE,
                                          DEFAULT_CLUSTER_SIZE);
     buf = qemu_opt_get_del(opts, BLOCK_OPT_PREALLOC);
-    if (!buf || !strcmp(buf, "off")) {
-        prealloc = 0;
-    } else if (!strcmp(buf, "metadata")) {
-        prealloc = 1;
-    } else {
-        error_setg(errp, "Invalid preallocation mode: '%s'", buf);
+    prealloc = qapi_enum_parse(PreallocMode_lookup, buf,
+                               PREALLOC_MODE_MAX, PREALLOC_MODE_OFF,
+                               &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
         ret = -EINVAL;
         goto finish;
     }
diff --git a/qapi/block-core.json b/qapi/block-core.json
index e378653..78d804f 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1643,3 +1643,20 @@
             'len'   : 'int',
             'offset': 'int',
             'speed' : 'int' } }
+
+# @PreallocMode
+#
+# Preallocation mode of QEMU image file
+#
+# @off: no preallocation
+# @metadata: preallocate only for metadata
+# @falloc: like @full preallocation but allocate disk space by
+#          posix_fallocate() rather than writing zeros.
+# @full: preallocate all data by writing zeros to device to ensure disk
+#        space is really available. @full preallocation also sets up
+#        metadata correctly.
+#
+# Since 2.2
+##
+{ 'enum': 'PreallocMode',
+  'data': [ 'off', 'metadata', 'falloc', 'full' ] }
diff --git a/tests/qemu-iotests/049.out b/tests/qemu-iotests/049.out
index 71ca44d..09ca0ae 100644
--- a/tests/qemu-iotests/049.out
+++ b/tests/qemu-iotests/049.out
@@ -179,7 +179,7 @@ qemu-img create -f qcow2 -o preallocation=metadata TEST_DIR/t.qcow2 64M
 Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 encryption=off cluster_size=65536 preallocation='metadata' lazy_refcounts=off 
 
 qemu-img create -f qcow2 -o preallocation=1234 TEST_DIR/t.qcow2 64M
-qemu-img: TEST_DIR/t.qcow2: Invalid preallocation mode: '1234'
+qemu-img: TEST_DIR/t.qcow2: invalid parameter value: 1234
 Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 encryption=off cluster_size=65536 preallocation='1234' lazy_refcounts=off 
 
 == Check encryption option ==
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v12 5/6] raw-posix: Add falloc and full preallocation option
  2014-07-11  6:09 [Qemu-devel] [PATCH v12 0/6] qcow2, raw: add preallocation=full and preallocation=falloc Hu Tao
                   ` (3 preceding siblings ...)
  2014-07-11  6:10 ` [Qemu-devel] [PATCH v12 4/6] qapi: introduce PreallocMode and a new PreallocMode full Hu Tao
@ 2014-07-11  6:10 ` Hu Tao
  2014-08-22 10:58   ` Kevin Wolf
  2014-07-11  6:10 ` [Qemu-devel] [PATCH v12 6/6] qcow2: " Hu Tao
  2014-07-28  8:48 ` [Qemu-devel] [PATCH v12 0/6] qcow2, raw: add preallocation=full and preallocation=falloc Hu Tao
  6 siblings, 1 reply; 35+ messages in thread
From: Hu Tao @ 2014-07-11  6:10 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Yasunori Goto, Stefan Hajnoczi, Max Reitz

This patch adds a new option preallocation for raw format, and implements
falloc and full preallocation.

Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
---
 block/raw-posix.c | 88 +++++++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 69 insertions(+), 19 deletions(-)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index 2af24af..834ce2a 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -30,6 +30,7 @@
 #include "block/thread-pool.h"
 #include "qemu/iov.h"
 #include "raw-aio.h"
+#include "qapi/util.h"
 
 #if defined(__APPLE__) && (__MACH__)
 #include <paths.h>
@@ -1328,6 +1329,9 @@ static int raw_create(const char *filename, QemuOpts *opts, Error **errp)
     int result = 0;
     int64_t total_size = 0;
     bool nocow = false;
+    PreallocMode prealloc = PREALLOC_MODE_OFF;
+    char *buf = NULL;
+    Error *local_err = NULL;
 
     strstart(filename, "file:", &filename);
 
@@ -1335,37 +1339,78 @@ static int raw_create(const char *filename, QemuOpts *opts, Error **errp)
     total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
                           BDRV_SECTOR_SIZE);
     nocow = qemu_opt_get_bool(opts, BLOCK_OPT_NOCOW, false);
+    buf = qemu_opt_get_del(opts, BLOCK_OPT_PREALLOC);
+    prealloc = qapi_enum_parse(PreallocMode_lookup, buf,
+                               PREALLOC_MODE_MAX, PREALLOC_MODE_OFF,
+                               &local_err);
+    g_free(buf);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        result = -EINVAL;
+        goto out;
+    }
 
     fd = qemu_open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
                    0644);
     if (fd < 0) {
         result = -errno;
         error_setg_errno(errp, -result, "Could not create file");
-    } else {
-        if (nocow) {
+        goto out;
+    }
+
+    if (nocow) {
 #ifdef __linux__
-            /* Set NOCOW flag to solve performance issue on fs like btrfs.
-             * This is an optimisation. The FS_IOC_SETFLAGS ioctl return value
-             * will be ignored since any failure of this operation should not
-             * block the left work.
-             */
-            int attr;
-            if (ioctl(fd, FS_IOC_GETFLAGS, &attr) == 0) {
-                attr |= FS_NOCOW_FL;
-                ioctl(fd, FS_IOC_SETFLAGS, &attr);
-            }
-#endif
+        /* Set NOCOW flag to solve performance issue on fs like btrfs.
+         * This is an optimisation. The FS_IOC_SETFLAGS ioctl return value
+         * will be ignored since any failure of this operation should not
+         * block the left work.
+         */
+        int attr;
+        if (ioctl(fd, FS_IOC_GETFLAGS, &attr) == 0) {
+            attr |= FS_NOCOW_FL;
+            ioctl(fd, FS_IOC_SETFLAGS, &attr);
         }
+#endif
+    }
+
+    if (ftruncate(fd, total_size) != 0) {
+        result = -errno;
+        error_setg_errno(errp, -result, "Could not resize file");
+        goto out_close;
+    }
 
-        if (ftruncate(fd, total_size) != 0) {
-            result = -errno;
-            error_setg_errno(errp, -result, "Could not resize file");
+    if (prealloc == PREALLOC_MODE_FALLOC) {
+        /* posix_fallocate() doesn't set errno. */
+        result = -posix_fallocate(fd, 0, total_size);
+        if (result != 0) {
+            error_setg_errno(errp, -result,
+                             "Could not preallocate data for the new file");
         }
-        if (qemu_close(fd) != 0) {
-            result = -errno;
-            error_setg_errno(errp, -result, "Could not close the new file");
+    } else if (prealloc == PREALLOC_MODE_FULL) {
+        buf = g_malloc0(65536);
+        int64_t num = 0, left = total_size;
+
+        while (left > 0) {
+            num = MIN(left, 65536);
+            result = write(fd, buf, num);
+            if (result < 0) {
+                result = -errno;
+                error_setg_errno(errp, -result,
+                                 "Could not write to the new file");
+                g_free(buf);
+                goto out_close;
+            }
+            left -= num;
         }
+        fsync(fd);
+        g_free(buf);
     }
+out_close:
+    if (qemu_close(fd) != 0 && result == 0) {
+        result = -errno;
+        error_setg_errno(errp, -result, "Could not close the new file");
+    }
+out:
     return result;
 }
 
@@ -1548,6 +1593,11 @@ static QemuOptsList raw_create_opts = {
             .type = QEMU_OPT_BOOL,
             .help = "Turn off copy-on-write (valid only on btrfs)"
         },
+        {
+            .name = BLOCK_OPT_PREALLOC,
+            .type = QEMU_OPT_STRING,
+            .help = "Preallocation mode (allowed values: off, falloc, full)"
+        },
         { /* end of list */ }
     }
 };
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v12 6/6] qcow2: Add falloc and full preallocation option
  2014-07-11  6:09 [Qemu-devel] [PATCH v12 0/6] qcow2, raw: add preallocation=full and preallocation=falloc Hu Tao
                   ` (4 preceding siblings ...)
  2014-07-11  6:10 ` [Qemu-devel] [PATCH v12 5/6] raw-posix: Add falloc and full preallocation option Hu Tao
@ 2014-07-11  6:10 ` Hu Tao
  2014-07-11 21:07   ` Max Reitz
  2014-08-22 11:00   ` Kevin Wolf
  2014-07-28  8:48 ` [Qemu-devel] [PATCH v12 0/6] qcow2, raw: add preallocation=full and preallocation=falloc Hu Tao
  6 siblings, 2 replies; 35+ messages in thread
From: Hu Tao @ 2014-07-11  6:10 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Yasunori Goto, Stefan Hajnoczi, Max Reitz

This adds preallocation=falloc and preallocation=full mode to qcow2
image creation.

preallocation=full allocates disk space by writing zeros to disk to
ensure disk space in any cases.

preallocation=falloc likes preallocation=full, but allocates disk space
by posix_fallocate().

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
---
 block/qcow2.c              | 31 ++++++++++++++++++++++++--
 tests/qemu-iotests/082.out | 54 +++++++++++++++++++++++-----------------------
 2 files changed, 56 insertions(+), 29 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index cfba93b..f48e915 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1593,6 +1593,9 @@ static int preallocate(BlockDriverState *bs)
     return 0;
 }
 
+static uint64_t minimal_blob_size(uint64_t ts, int cb, int spcb,
+                                  uint64_t overhead);
+
 static int qcow2_create2(const char *filename, int64_t total_size,
                          const char *backing_file, const char *backing_format,
                          int flags, size_t cluster_size, PreallocMode prealloc,
@@ -1628,6 +1631,29 @@ static int qcow2_create2(const char *filename, int64_t total_size,
     Error *local_err = NULL;
     int ret;
 
+    if (prealloc == PREALLOC_MODE_FULL || prealloc == PREALLOC_MODE_FALLOC) {
+        int64_t meta_size = 0;
+        uint64_t nl2e;
+
+        total_size = align_offset(total_size, cluster_size);
+
+        /* total size of L2 tables */
+        nl2e = total_size >> cluster_bits;
+        nl2e = align_offset(nl2e, cluster_size / sizeof(uint64_t));
+        uint64_t l2_clusters = nl2e * sizeof(uint64_t) >> cluster_bits;
+
+        meta_size =
+            (1 +
+             minimal_blob_size(total_size >> BDRV_SECTOR_BITS,
+                               cluster_bits, cluster_bits - BDRV_SECTOR_BITS,
+                               1 + l2_clusters +
+                               (total_size >> cluster_bits)) +
+             l2_clusters) << cluster_bits;
+
+        qemu_opt_set_number(opts, BLOCK_OPT_SIZE, total_size + meta_size);
+        qemu_opt_set(opts, BLOCK_OPT_PREALLOC, PreallocMode_lookup[prealloc]);
+    }
+
     ret = bdrv_create_file(filename, opts, &local_err);
     if (ret < 0) {
         error_propagate(errp, local_err);
@@ -1733,7 +1759,7 @@ static int qcow2_create2(const char *filename, int64_t total_size,
     }
 
     /* And if we're supposed to preallocate metadata, do that now */
-    if (prealloc) {
+    if (prealloc != PREALLOC_MODE_OFF) {
         BDRVQcowState *s = bs->opaque;
         qemu_co_mutex_lock(&s->lock);
         ret = preallocate(bs);
@@ -2760,7 +2786,8 @@ static QemuOptsList qcow2_create_opts = {
         {
             .name = BLOCK_OPT_PREALLOC,
             .type = QEMU_OPT_STRING,
-            .help = "Preallocation mode (allowed values: off, metadata)"
+            .help = "Preallocation mode (allowed values: off, metadata, "
+                    "falloc, full)"
         },
         {
             .name = BLOCK_OPT_LAZY_REFCOUNTS,
diff --git a/tests/qemu-iotests/082.out b/tests/qemu-iotests/082.out
index 413e7ef..90c21c8 100644
--- a/tests/qemu-iotests/082.out
+++ b/tests/qemu-iotests/082.out
@@ -64,7 +64,7 @@ backing_file     File name of a base image
 backing_fmt      Image format of the base image
 encryption       Encrypt the image
 cluster_size     qcow2 cluster size
-preallocation    Preallocation mode (allowed values: off, metadata)
+preallocation    Preallocation mode (allowed values: off, metadata, falloc, full)
 lazy_refcounts   Postpone refcount updates
 nocow            Turn off copy-on-write (valid only on btrfs)
 
@@ -76,7 +76,7 @@ backing_file     File name of a base image
 backing_fmt      Image format of the base image
 encryption       Encrypt the image
 cluster_size     qcow2 cluster size
-preallocation    Preallocation mode (allowed values: off, metadata)
+preallocation    Preallocation mode (allowed values: off, metadata, falloc, full)
 lazy_refcounts   Postpone refcount updates
 nocow            Turn off copy-on-write (valid only on btrfs)
 
@@ -88,7 +88,7 @@ backing_file     File name of a base image
 backing_fmt      Image format of the base image
 encryption       Encrypt the image
 cluster_size     qcow2 cluster size
-preallocation    Preallocation mode (allowed values: off, metadata)
+preallocation    Preallocation mode (allowed values: off, metadata, falloc, full)
 lazy_refcounts   Postpone refcount updates
 nocow            Turn off copy-on-write (valid only on btrfs)
 
@@ -100,7 +100,7 @@ backing_file     File name of a base image
 backing_fmt      Image format of the base image
 encryption       Encrypt the image
 cluster_size     qcow2 cluster size
-preallocation    Preallocation mode (allowed values: off, metadata)
+preallocation    Preallocation mode (allowed values: off, metadata, falloc, full)
 lazy_refcounts   Postpone refcount updates
 nocow            Turn off copy-on-write (valid only on btrfs)
 
@@ -112,7 +112,7 @@ backing_file     File name of a base image
 backing_fmt      Image format of the base image
 encryption       Encrypt the image
 cluster_size     qcow2 cluster size
-preallocation    Preallocation mode (allowed values: off, metadata)
+preallocation    Preallocation mode (allowed values: off, metadata, falloc, full)
 lazy_refcounts   Postpone refcount updates
 nocow            Turn off copy-on-write (valid only on btrfs)
 
@@ -124,7 +124,7 @@ backing_file     File name of a base image
 backing_fmt      Image format of the base image
 encryption       Encrypt the image
 cluster_size     qcow2 cluster size
-preallocation    Preallocation mode (allowed values: off, metadata)
+preallocation    Preallocation mode (allowed values: off, metadata, falloc, full)
 lazy_refcounts   Postpone refcount updates
 nocow            Turn off copy-on-write (valid only on btrfs)
 
@@ -136,7 +136,7 @@ backing_file     File name of a base image
 backing_fmt      Image format of the base image
 encryption       Encrypt the image
 cluster_size     qcow2 cluster size
-preallocation    Preallocation mode (allowed values: off, metadata)
+preallocation    Preallocation mode (allowed values: off, metadata, falloc, full)
 lazy_refcounts   Postpone refcount updates
 nocow            Turn off copy-on-write (valid only on btrfs)
 
@@ -148,7 +148,7 @@ backing_file     File name of a base image
 backing_fmt      Image format of the base image
 encryption       Encrypt the image
 cluster_size     qcow2 cluster size
-preallocation    Preallocation mode (allowed values: off, metadata)
+preallocation    Preallocation mode (allowed values: off, metadata, falloc, full)
 lazy_refcounts   Postpone refcount updates
 nocow            Turn off copy-on-write (valid only on btrfs)
 
@@ -175,7 +175,7 @@ backing_file     File name of a base image
 backing_fmt      Image format of the base image
 encryption       Encrypt the image
 cluster_size     qcow2 cluster size
-preallocation    Preallocation mode (allowed values: off, metadata)
+preallocation    Preallocation mode (allowed values: off, metadata, falloc, full)
 lazy_refcounts   Postpone refcount updates
 
 Testing: create -o help
@@ -253,7 +253,7 @@ backing_file     File name of a base image
 backing_fmt      Image format of the base image
 encryption       Encrypt the image
 cluster_size     qcow2 cluster size
-preallocation    Preallocation mode (allowed values: off, metadata)
+preallocation    Preallocation mode (allowed values: off, metadata, falloc, full)
 lazy_refcounts   Postpone refcount updates
 nocow            Turn off copy-on-write (valid only on btrfs)
 
@@ -265,7 +265,7 @@ backing_file     File name of a base image
 backing_fmt      Image format of the base image
 encryption       Encrypt the image
 cluster_size     qcow2 cluster size
-preallocation    Preallocation mode (allowed values: off, metadata)
+preallocation    Preallocation mode (allowed values: off, metadata, falloc, full)
 lazy_refcounts   Postpone refcount updates
 nocow            Turn off copy-on-write (valid only on btrfs)
 
@@ -277,7 +277,7 @@ backing_file     File name of a base image
 backing_fmt      Image format of the base image
 encryption       Encrypt the image
 cluster_size     qcow2 cluster size
-preallocation    Preallocation mode (allowed values: off, metadata)
+preallocation    Preallocation mode (allowed values: off, metadata, falloc, full)
 lazy_refcounts   Postpone refcount updates
 nocow            Turn off copy-on-write (valid only on btrfs)
 
@@ -289,7 +289,7 @@ backing_file     File name of a base image
 backing_fmt      Image format of the base image
 encryption       Encrypt the image
 cluster_size     qcow2 cluster size
-preallocation    Preallocation mode (allowed values: off, metadata)
+preallocation    Preallocation mode (allowed values: off, metadata, falloc, full)
 lazy_refcounts   Postpone refcount updates
 nocow            Turn off copy-on-write (valid only on btrfs)
 
@@ -301,7 +301,7 @@ backing_file     File name of a base image
 backing_fmt      Image format of the base image
 encryption       Encrypt the image
 cluster_size     qcow2 cluster size
-preallocation    Preallocation mode (allowed values: off, metadata)
+preallocation    Preallocation mode (allowed values: off, metadata, falloc, full)
 lazy_refcounts   Postpone refcount updates
 nocow            Turn off copy-on-write (valid only on btrfs)
 
@@ -313,7 +313,7 @@ backing_file     File name of a base image
 backing_fmt      Image format of the base image
 encryption       Encrypt the image
 cluster_size     qcow2 cluster size
-preallocation    Preallocation mode (allowed values: off, metadata)
+preallocation    Preallocation mode (allowed values: off, metadata, falloc, full)
 lazy_refcounts   Postpone refcount updates
 nocow            Turn off copy-on-write (valid only on btrfs)
 
@@ -325,7 +325,7 @@ backing_file     File name of a base image
 backing_fmt      Image format of the base image
 encryption       Encrypt the image
 cluster_size     qcow2 cluster size
-preallocation    Preallocation mode (allowed values: off, metadata)
+preallocation    Preallocation mode (allowed values: off, metadata, falloc, full)
 lazy_refcounts   Postpone refcount updates
 nocow            Turn off copy-on-write (valid only on btrfs)
 
@@ -337,7 +337,7 @@ backing_file     File name of a base image
 backing_fmt      Image format of the base image
 encryption       Encrypt the image
 cluster_size     qcow2 cluster size
-preallocation    Preallocation mode (allowed values: off, metadata)
+preallocation    Preallocation mode (allowed values: off, metadata, falloc, full)
 lazy_refcounts   Postpone refcount updates
 nocow            Turn off copy-on-write (valid only on btrfs)
 
@@ -364,7 +364,7 @@ backing_file     File name of a base image
 backing_fmt      Image format of the base image
 encryption       Encrypt the image
 cluster_size     qcow2 cluster size
-preallocation    Preallocation mode (allowed values: off, metadata)
+preallocation    Preallocation mode (allowed values: off, metadata, falloc, full)
 lazy_refcounts   Postpone refcount updates
 
 Testing: convert -o help
@@ -431,7 +431,7 @@ backing_file     File name of a base image
 backing_fmt      Image format of the base image
 encryption       Encrypt the image
 cluster_size     qcow2 cluster size
-preallocation    Preallocation mode (allowed values: off, metadata)
+preallocation    Preallocation mode (allowed values: off, metadata, falloc, full)
 lazy_refcounts   Postpone refcount updates
 nocow            Turn off copy-on-write (valid only on btrfs)
 
@@ -443,7 +443,7 @@ backing_file     File name of a base image
 backing_fmt      Image format of the base image
 encryption       Encrypt the image
 cluster_size     qcow2 cluster size
-preallocation    Preallocation mode (allowed values: off, metadata)
+preallocation    Preallocation mode (allowed values: off, metadata, falloc, full)
 lazy_refcounts   Postpone refcount updates
 nocow            Turn off copy-on-write (valid only on btrfs)
 
@@ -455,7 +455,7 @@ backing_file     File name of a base image
 backing_fmt      Image format of the base image
 encryption       Encrypt the image
 cluster_size     qcow2 cluster size
-preallocation    Preallocation mode (allowed values: off, metadata)
+preallocation    Preallocation mode (allowed values: off, metadata, falloc, full)
 lazy_refcounts   Postpone refcount updates
 nocow            Turn off copy-on-write (valid only on btrfs)
 
@@ -467,7 +467,7 @@ backing_file     File name of a base image
 backing_fmt      Image format of the base image
 encryption       Encrypt the image
 cluster_size     qcow2 cluster size
-preallocation    Preallocation mode (allowed values: off, metadata)
+preallocation    Preallocation mode (allowed values: off, metadata, falloc, full)
 lazy_refcounts   Postpone refcount updates
 nocow            Turn off copy-on-write (valid only on btrfs)
 
@@ -479,7 +479,7 @@ backing_file     File name of a base image
 backing_fmt      Image format of the base image
 encryption       Encrypt the image
 cluster_size     qcow2 cluster size
-preallocation    Preallocation mode (allowed values: off, metadata)
+preallocation    Preallocation mode (allowed values: off, metadata, falloc, full)
 lazy_refcounts   Postpone refcount updates
 nocow            Turn off copy-on-write (valid only on btrfs)
 
@@ -491,7 +491,7 @@ backing_file     File name of a base image
 backing_fmt      Image format of the base image
 encryption       Encrypt the image
 cluster_size     qcow2 cluster size
-preallocation    Preallocation mode (allowed values: off, metadata)
+preallocation    Preallocation mode (allowed values: off, metadata, falloc, full)
 lazy_refcounts   Postpone refcount updates
 nocow            Turn off copy-on-write (valid only on btrfs)
 
@@ -503,7 +503,7 @@ backing_file     File name of a base image
 backing_fmt      Image format of the base image
 encryption       Encrypt the image
 cluster_size     qcow2 cluster size
-preallocation    Preallocation mode (allowed values: off, metadata)
+preallocation    Preallocation mode (allowed values: off, metadata, falloc, full)
 lazy_refcounts   Postpone refcount updates
 nocow            Turn off copy-on-write (valid only on btrfs)
 
@@ -515,7 +515,7 @@ backing_file     File name of a base image
 backing_fmt      Image format of the base image
 encryption       Encrypt the image
 cluster_size     qcow2 cluster size
-preallocation    Preallocation mode (allowed values: off, metadata)
+preallocation    Preallocation mode (allowed values: off, metadata, falloc, full)
 lazy_refcounts   Postpone refcount updates
 nocow            Turn off copy-on-write (valid only on btrfs)
 
@@ -544,7 +544,7 @@ backing_file     File name of a base image
 backing_fmt      Image format of the base image
 encryption       Encrypt the image
 cluster_size     qcow2 cluster size
-preallocation    Preallocation mode (allowed values: off, metadata)
+preallocation    Preallocation mode (allowed values: off, metadata, falloc, full)
 lazy_refcounts   Postpone refcount updates
 
 Testing: convert -o help
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v12 6/6] qcow2: Add falloc and full preallocation option
  2014-07-11  6:10 ` [Qemu-devel] [PATCH v12 6/6] qcow2: " Hu Tao
@ 2014-07-11 21:07   ` Max Reitz
  2014-08-22 11:00   ` Kevin Wolf
  1 sibling, 0 replies; 35+ messages in thread
From: Max Reitz @ 2014-07-11 21:07 UTC (permalink / raw)
  To: Hu Tao, qemu-devel; +Cc: Kevin Wolf, Yasunori Goto, Stefan Hajnoczi

On 11.07.2014 08:10, Hu Tao wrote:
> This adds preallocation=falloc and preallocation=full mode to qcow2
> image creation.
>
> preallocation=full allocates disk space by writing zeros to disk to
> ensure disk space in any cases.
>
> preallocation=falloc likes preallocation=full, but allocates disk space
> by posix_fallocate().
>
> Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
> ---
>   block/qcow2.c              | 31 ++++++++++++++++++++++++--
>   tests/qemu-iotests/082.out | 54 +++++++++++++++++++++++-----------------------
>   2 files changed, 56 insertions(+), 29 deletions(-)

Reviewed-by: Max Reitz <mreitz@redhat.com>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v12 0/6] qcow2, raw: add preallocation=full and preallocation=falloc
  2014-07-11  6:09 [Qemu-devel] [PATCH v12 0/6] qcow2, raw: add preallocation=full and preallocation=falloc Hu Tao
                   ` (5 preceding siblings ...)
  2014-07-11  6:10 ` [Qemu-devel] [PATCH v12 6/6] qcow2: " Hu Tao
@ 2014-07-28  8:48 ` Hu Tao
  2014-08-22 10:54   ` Kevin Wolf
  2014-08-22 12:25   ` Richard W.M. Jones
  6 siblings, 2 replies; 35+ messages in thread
From: Hu Tao @ 2014-07-28  8:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Yasunori Goto, Stefan Hajnoczi, Max Reitz

ping...

All the 6 patches have reviewed-by now.

On Fri, Jul 11, 2014 at 02:09:57PM +0800, Hu Tao wrote:
> This series adds two preallocation mode to qcow2 and raw:
> 
> Option preallocation=full preallocates disk space for image by writing
> zeros to disk, this ensures disk space in any cases.
> 
> Option preallocation=falloc preallocates disk space by calling
> posix_fallocate(). This is faster than preallocation=full.
> 
> This series depends on patches 1-3 of Max's series 'qemu-img: Implement
> commit like QMP'. Specifically, patch 6 'qcow2: Add falloc and full
> preallocation option' uses minimal_blob_size() introduced by Max's patch
> 'qcow2: Optimize bdrv_make_empty()'.
> 
> The series is also at https://github.com/taohu/qemu/commits/preallocation-v12
> for you to check out, including depended patches from Max.
> 
> Eric, I'm afraid now we missed qemu 2.1, so patch 1 is still sent
> with this series.
> 
> changes to v11:
> 
>  - fix test case 049 (patch 4)
>  - unsigned nl2e -> uint64_t nl2e (patch 6)
>  - use >> instead of / (patch 6)
> 
> 
> Hu Tao (6):
>   block: round up file size to nearest sector
>   raw, qcow2: don't convert file size to sector size
>   rename parse_enum_option to qapi_enum_parse and make it public
>   qapi: introduce PreallocMode and a new PreallocMode full.
>   raw-posix: Add falloc and full preallocation option
>   qcow2: Add falloc and full preallocation option
> 
>  block/qcow2.c              | 56 +++++++++++++++++++++-------
>  block/raw-posix.c          | 92 +++++++++++++++++++++++++++++++++++-----------
>  block/raw-win32.c          |  6 +--
>  blockdev.c                 | 30 +++------------
>  include/qapi/util.h        | 17 +++++++++
>  qapi/Makefile.objs         |  1 +
>  qapi/block-core.json       | 17 +++++++++
>  qapi/qapi-util.c           | 32 ++++++++++++++++
>  tests/qemu-iotests/049.out |  2 +-
>  tests/qemu-iotests/082.out | 54 +++++++++++++--------------
>  tests/qemu-iotests/096     | 64 ++++++++++++++++++++++++++++++++
>  tests/qemu-iotests/096.out | 14 +++++++
>  tests/qemu-iotests/group   |  1 +
>  13 files changed, 296 insertions(+), 90 deletions(-)
>  create mode 100644 include/qapi/util.h
>  create mode 100644 qapi/qapi-util.c
>  create mode 100755 tests/qemu-iotests/096
>  create mode 100644 tests/qemu-iotests/096.out
> 
> -- 
> 1.9.3
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v12 0/6] qcow2, raw: add preallocation=full and preallocation=falloc
  2014-07-28  8:48 ` [Qemu-devel] [PATCH v12 0/6] qcow2, raw: add preallocation=full and preallocation=falloc Hu Tao
@ 2014-08-22 10:54   ` Kevin Wolf
  2014-08-25  1:35     ` Hu Tao
  2014-08-22 12:25   ` Richard W.M. Jones
  1 sibling, 1 reply; 35+ messages in thread
From: Kevin Wolf @ 2014-08-22 10:54 UTC (permalink / raw)
  To: Hu Tao; +Cc: Yasunori Goto, qemu-devel, Stefan Hajnoczi, Max Reitz

Am 28.07.2014 um 10:48 hat Hu Tao geschrieben:
> ping...
> 
> All the 6 patches have reviewed-by now.

Looks mostly good to me, I have only a few minor comments that wouldn't
block inclusion but could be addressed in follow-up patches.

However, you have a dependency on a patch series from Max (you use
minimal_blob_size()), which hasn't been reviewed and merged yet, so your
series is blocked on that.

If you want to get your series merged quicker, you could replace this
with a rough estimate that excludes the clusters used by refcount table
and blocks. If full preallocation isn't really full, but only
preallocation of 99.9%, that's probably good enough in practice.

Kevin

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v12 1/6] block: round up file size to nearest sector
  2014-07-11  6:09 ` [Qemu-devel] [PATCH v12 1/6] block: round up file size to nearest sector Hu Tao
@ 2014-08-22 10:55   ` Kevin Wolf
  2014-08-25  1:11     ` Hu Tao
  0 siblings, 1 reply; 35+ messages in thread
From: Kevin Wolf @ 2014-08-22 10:55 UTC (permalink / raw)
  To: Hu Tao; +Cc: Yasunori Goto, qemu-devel, Stefan Hajnoczi, Max Reitz

Am 11.07.2014 um 08:09 hat Hu Tao geschrieben:
> Reviewed-by: Max Reitz <mreitz@redhat.com>
> Reviewed-by: Eric Blake <eblake@redhat.com>
> Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>

If we make this change, shouldn't we do it consistently across all image
formats? With this patch we have some formats that round up, and many
others that round down.

Kevin

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v12 4/6] qapi: introduce PreallocMode and a new PreallocMode full.
  2014-07-11  6:10 ` [Qemu-devel] [PATCH v12 4/6] qapi: introduce PreallocMode and a new PreallocMode full Hu Tao
@ 2014-08-22 10:57   ` Kevin Wolf
  2014-08-25  1:12     ` Hu Tao
  0 siblings, 1 reply; 35+ messages in thread
From: Kevin Wolf @ 2014-08-22 10:57 UTC (permalink / raw)
  To: Hu Tao; +Cc: Yasunori Goto, qemu-devel, Stefan Hajnoczi, Max Reitz

Am 11.07.2014 um 08:10 hat Hu Tao geschrieben:
> This patch prepares for the subsequent patches.
> 
> Reviewed-by: Fam Zheng <famz@redhat.com>
> Reviewed-by: Eric Blake <eblake@redhat.com>
> Reviewed-by: Max Reitz <mreitz@redhat.com>
> Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>

Strictly speaking, qcow2 should error out if it is given full or falloc
after this patch. Two patches later it will actually interpret those
values correctly, so it's not that big of a problem, but it would
definitely be nicer.

Kevin

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v12 5/6] raw-posix: Add falloc and full preallocation option
  2014-07-11  6:10 ` [Qemu-devel] [PATCH v12 5/6] raw-posix: Add falloc and full preallocation option Hu Tao
@ 2014-08-22 10:58   ` Kevin Wolf
  2014-08-25  1:18     ` Hu Tao
  0 siblings, 1 reply; 35+ messages in thread
From: Kevin Wolf @ 2014-08-22 10:58 UTC (permalink / raw)
  To: Hu Tao; +Cc: Yasunori Goto, qemu-devel, Stefan Hajnoczi, Max Reitz

Am 11.07.2014 um 08:10 hat Hu Tao geschrieben:
> This patch adds a new option preallocation for raw format, and implements
> falloc and full preallocation.
> 
> Reviewed-by: Max Reitz <mreitz@redhat.com>
> Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>

raw-posix needs to error out if called with preallocation=metadata, it
doesn't implement this option.

Kevin

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v12 6/6] qcow2: Add falloc and full preallocation option
  2014-07-11  6:10 ` [Qemu-devel] [PATCH v12 6/6] qcow2: " Hu Tao
  2014-07-11 21:07   ` Max Reitz
@ 2014-08-22 11:00   ` Kevin Wolf
  2014-08-25  1:36     ` Hu Tao
  1 sibling, 1 reply; 35+ messages in thread
From: Kevin Wolf @ 2014-08-22 11:00 UTC (permalink / raw)
  To: Hu Tao; +Cc: Yasunori Goto, qemu-devel, Stefan Hajnoczi, Max Reitz

Am 11.07.2014 um 08:10 hat Hu Tao geschrieben:
> This adds preallocation=falloc and preallocation=full mode to qcow2
> image creation.
> 
> preallocation=full allocates disk space by writing zeros to disk to
> ensure disk space in any cases.
> 
> preallocation=falloc likes preallocation=full, but allocates disk space
> by posix_fallocate().
> 
> Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
> ---
>  block/qcow2.c              | 31 ++++++++++++++++++++++++--
>  tests/qemu-iotests/082.out | 54 +++++++++++++++++++++++-----------------------
>  2 files changed, 56 insertions(+), 29 deletions(-)
> 
> diff --git a/block/qcow2.c b/block/qcow2.c
> index cfba93b..f48e915 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -1593,6 +1593,9 @@ static int preallocate(BlockDriverState *bs)
>      return 0;
>  }
>  
> +static uint64_t minimal_blob_size(uint64_t ts, int cb, int spcb,
> +                                  uint64_t overhead);
> +
>  static int qcow2_create2(const char *filename, int64_t total_size,
>                           const char *backing_file, const char *backing_format,
>                           int flags, size_t cluster_size, PreallocMode prealloc,
> @@ -1628,6 +1631,29 @@ static int qcow2_create2(const char *filename, int64_t total_size,
>      Error *local_err = NULL;
>      int ret;
>  
> +    if (prealloc == PREALLOC_MODE_FULL || prealloc == PREALLOC_MODE_FALLOC) {
> +        int64_t meta_size = 0;
> +        uint64_t nl2e;
> +
> +        total_size = align_offset(total_size, cluster_size);

I don't think it's a good idea to let the virtual disk size depend on
whether preallocation is enabled or not. You should always get the same
rounding (which is rounding up to the next sector boundary).

Do you need full clusters for your calculations below or what is this
good for? If so, please use a local variable and leave the value used
for the bdrv_truncate() call unmodified.

> +        /* total size of L2 tables */
> +        nl2e = total_size >> cluster_bits;
> +        nl2e = align_offset(nl2e, cluster_size / sizeof(uint64_t));
> +        uint64_t l2_clusters = nl2e * sizeof(uint64_t) >> cluster_bits;
> +
> +        meta_size =
> +            (1 +
> +             minimal_blob_size(total_size >> BDRV_SECTOR_BITS,
> +                               cluster_bits, cluster_bits - BDRV_SECTOR_BITS,
> +                               1 + l2_clusters +
> +                               (total_size >> cluster_bits)) +
> +             l2_clusters) << cluster_bits;
> +
> +        qemu_opt_set_number(opts, BLOCK_OPT_SIZE, total_size + meta_size);
> +        qemu_opt_set(opts, BLOCK_OPT_PREALLOC, PreallocMode_lookup[prealloc]);
> +    }
> +
>      ret = bdrv_create_file(filename, opts, &local_err);
>      if (ret < 0) {
>          error_propagate(errp, local_err);

Kevin

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v12 0/6] qcow2, raw: add preallocation=full and preallocation=falloc
  2014-07-28  8:48 ` [Qemu-devel] [PATCH v12 0/6] qcow2, raw: add preallocation=full and preallocation=falloc Hu Tao
  2014-08-22 10:54   ` Kevin Wolf
@ 2014-08-22 12:25   ` Richard W.M. Jones
  2014-08-22 12:36     ` Daniel P. Berrange
  2014-08-22 13:13     ` Kevin Wolf
  1 sibling, 2 replies; 35+ messages in thread
From: Richard W.M. Jones @ 2014-08-22 12:25 UTC (permalink / raw)
  To: Hu Tao; +Cc: Kevin Wolf, Yasunori Goto, qemu-devel, Stefan Hajnoczi, Max Reitz


On Mon, Jul 28, 2014 at 04:48:46PM +0800, Hu Tao wrote:
> ping...
> 
> All the 6 patches have reviewed-by now.
> 
> On Fri, Jul 11, 2014 at 02:09:57PM +0800, Hu Tao wrote:
> > This series adds two preallocation mode to qcow2 and raw:
> > 
> > Option preallocation=full preallocates disk space for image by writing
> > zeros to disk, this ensures disk space in any cases.
> > 
> > Option preallocation=falloc preallocates disk space by calling
> > posix_fallocate(). This is faster than preallocation=full.

Sorry if this was discussed before, but why would anyone use
preallocation=full if preallocation=falloc was possible?

Shouldn't preallocation=full simply use posix_fallocate if it's
available, and fall back to writing zeroes if not?

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v12 0/6] qcow2, raw: add preallocation=full and preallocation=falloc
  2014-08-22 12:25   ` Richard W.M. Jones
@ 2014-08-22 12:36     ` Daniel P. Berrange
  2014-08-22 13:13     ` Kevin Wolf
  1 sibling, 0 replies; 35+ messages in thread
From: Daniel P. Berrange @ 2014-08-22 12:36 UTC (permalink / raw)
  To: Richard W.M. Jones
  Cc: Kevin Wolf, Hu Tao, qemu-devel, Max Reitz, Stefan Hajnoczi,
	Yasunori Goto

On Fri, Aug 22, 2014 at 01:25:56PM +0100, Richard W.M. Jones wrote:
> 
> On Mon, Jul 28, 2014 at 04:48:46PM +0800, Hu Tao wrote:
> > ping...
> > 
> > All the 6 patches have reviewed-by now.
> > 
> > On Fri, Jul 11, 2014 at 02:09:57PM +0800, Hu Tao wrote:
> > > This series adds two preallocation mode to qcow2 and raw:
> > > 
> > > Option preallocation=full preallocates disk space for image by writing
> > > zeros to disk, this ensures disk space in any cases.
> > > 
> > > Option preallocation=falloc preallocates disk space by calling
> > > posix_fallocate(). This is faster than preallocation=full.
> 
> Sorry if this was discussed before, but why would anyone use
> preallocation=full if preallocation=falloc was possible?
> 
> Shouldn't preallocation=full simply use posix_fallocate if it's
> available, and fall back to writing zeroes if not?

posix_fallocate() will itself fallback to writing zeros to
disk if the filesystem can't do the fast allocation method.
So I can't see any reason to have separate options for this.

Just have a preallocation=full option which - at compile
time - chooses between posix_fallocate and write(). This
is what we've done in libvirt for many years now and no
one has ever asked for more

http://libvirt.org/git/?p=libvirt.git;a=blob;f=src/util/virfile.c;h=b6f5e3f2cbbcc6768d764009b238b2349bee3fee;hb=HEAD#l1037

We actually try to use mmap() before going to write() since
that is slightly faster.

http://log.amitshah.net/2009/03/comparison-of-file-systems-and-speeding-up-applications/

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v12 0/6] qcow2, raw: add preallocation=full and preallocation=falloc
  2014-08-22 12:25   ` Richard W.M. Jones
  2014-08-22 12:36     ` Daniel P. Berrange
@ 2014-08-22 13:13     ` Kevin Wolf
  2014-08-22 13:26       ` Richard W.M. Jones
  2014-08-22 14:20       ` Daniel P. Berrange
  1 sibling, 2 replies; 35+ messages in thread
From: Kevin Wolf @ 2014-08-22 13:13 UTC (permalink / raw)
  To: Richard W.M. Jones
  Cc: Hu Tao, Max Reitz, qemu-devel, Stefan Hajnoczi, Yasunori Goto

Am 22.08.2014 um 14:25 hat Richard W.M. Jones geschrieben:
> 
> On Mon, Jul 28, 2014 at 04:48:46PM +0800, Hu Tao wrote:
> > ping...
> > 
> > All the 6 patches have reviewed-by now.
> > 
> > On Fri, Jul 11, 2014 at 02:09:57PM +0800, Hu Tao wrote:
> > > This series adds two preallocation mode to qcow2 and raw:
> > > 
> > > Option preallocation=full preallocates disk space for image by writing
> > > zeros to disk, this ensures disk space in any cases.
> > > 
> > > Option preallocation=falloc preallocates disk space by calling
> > > posix_fallocate(). This is faster than preallocation=full.
> 
> Sorry if this was discussed before, but why would anyone use
> preallocation=full if preallocation=falloc was possible?
> 
> Shouldn't preallocation=full simply use posix_fallocate if it's
> available, and fall back to writing zeroes if not?

posix_fallocate() is basically metadata preallocation on the file
system level. If any lower levels involve allocations as well, does
posix_fallocate() allocate them there?

Kevin

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v12 0/6] qcow2, raw: add preallocation=full and preallocation=falloc
  2014-08-22 13:13     ` Kevin Wolf
@ 2014-08-22 13:26       ` Richard W.M. Jones
  2014-08-22 14:20       ` Daniel P. Berrange
  1 sibling, 0 replies; 35+ messages in thread
From: Richard W.M. Jones @ 2014-08-22 13:26 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Hu Tao, Max Reitz, qemu-devel, Stefan Hajnoczi, Yasunori Goto

On Fri, Aug 22, 2014 at 03:13:31PM +0200, Kevin Wolf wrote:
> Am 22.08.2014 um 14:25 hat Richard W.M. Jones geschrieben:
> > 
> > On Mon, Jul 28, 2014 at 04:48:46PM +0800, Hu Tao wrote:
> > > ping...
> > > 
> > > All the 6 patches have reviewed-by now.
> > > 
> > > On Fri, Jul 11, 2014 at 02:09:57PM +0800, Hu Tao wrote:
> > > > This series adds two preallocation mode to qcow2 and raw:
> > > > 
> > > > Option preallocation=full preallocates disk space for image by writing
> > > > zeros to disk, this ensures disk space in any cases.
> > > > 
> > > > Option preallocation=falloc preallocates disk space by calling
> > > > posix_fallocate(). This is faster than preallocation=full.
> > 
> > Sorry if this was discussed before, but why would anyone use
> > preallocation=full if preallocation=falloc was possible?
> > 
> > Shouldn't preallocation=full simply use posix_fallocate if it's
> > available, and fall back to writing zeroes if not?
> 
> posix_fallocate() is basically metadata preallocation on the file
> system level. If any lower levels involve allocations as well, does
> posix_fallocate() allocate them there?

It's a good question.

>From observation of ext4 using libguestfs, the answer is
(surprisingly to me) no.

Writing zeroes won't necessarily allocate either -- eg.  if
detect_zeroes is enabled, or lots of other exotic storage schemes that
look into what you're writing.

I think my point still stands however.  There's no point giving
callers an option.  QEmu should do the best possible job.  It is the
only agent in a position to know how best to fully allocate the
backing file.

Rich.

$ guestfish -N fs:ext4:1G -m /dev/sda1

Welcome to guestfish, the guest filesystem shell for
editing virtual machine filesystems and disk images.

Type: 'help' for help on commands
      'man' to read the manual
      'quit' to quit the shell

><fs> fallocate64 /test 100M
><fs> ll /
total 102424
drwxr-xr-x  3 root root      4096 Aug 22 13:22 .
drwxr-xr-x 19 root root      4096 Aug 22 13:21 ..
drwx------  2 root root     16384 Aug 22 13:21 lost+found
-rw-r--r--  1 root root 104857600 Aug 22 13:22 test
><fs> exit

$ du -sh test1.img 
33M  test1.img



-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines.  Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v12 0/6] qcow2, raw: add preallocation=full and preallocation=falloc
  2014-08-22 13:13     ` Kevin Wolf
  2014-08-22 13:26       ` Richard W.M. Jones
@ 2014-08-22 14:20       ` Daniel P. Berrange
  2014-08-22 15:22         ` Kevin Wolf
  1 sibling, 1 reply; 35+ messages in thread
From: Daniel P. Berrange @ 2014-08-22 14:20 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Hu Tao, qemu-devel, Richard W.M. Jones, Stefan Hajnoczi,
	Yasunori Goto, Max Reitz

On Fri, Aug 22, 2014 at 03:13:31PM +0200, Kevin Wolf wrote:
> Am 22.08.2014 um 14:25 hat Richard W.M. Jones geschrieben:
> > 
> > On Mon, Jul 28, 2014 at 04:48:46PM +0800, Hu Tao wrote:
> > > ping...
> > > 
> > > All the 6 patches have reviewed-by now.
> > > 
> > > On Fri, Jul 11, 2014 at 02:09:57PM +0800, Hu Tao wrote:
> > > > This series adds two preallocation mode to qcow2 and raw:
> > > > 
> > > > Option preallocation=full preallocates disk space for image by writing
> > > > zeros to disk, this ensures disk space in any cases.
> > > > 
> > > > Option preallocation=falloc preallocates disk space by calling
> > > > posix_fallocate(). This is faster than preallocation=full.
> > 
> > Sorry if this was discussed before, but why would anyone use
> > preallocation=full if preallocation=falloc was possible?
> > 
> > Shouldn't preallocation=full simply use posix_fallocate if it's
> > available, and fall back to writing zeroes if not?
> 
> posix_fallocate() is basically metadata preallocation on the file
> system level. If any lower levels involve allocations as well, does
> posix_fallocate() allocate them there?

Well the man page says

  "After a successful call to posix_fallocate(), subsequent writes to
   bytes in the specified range are guaranteed not to fail because of
   lack of disk space."

Which seems like it is what users would want when they ask for
preallocate=full. So I'm not seeing the benefit of instead being
able to ask to write zeros would bring.


Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v12 0/6] qcow2, raw: add preallocation=full and preallocation=falloc
  2014-08-22 14:20       ` Daniel P. Berrange
@ 2014-08-22 15:22         ` Kevin Wolf
  2014-08-22 15:34           ` Richard W.M. Jones
  0 siblings, 1 reply; 35+ messages in thread
From: Kevin Wolf @ 2014-08-22 15:22 UTC (permalink / raw)
  To: Daniel P. Berrange
  Cc: Hu Tao, qemu-devel, Richard W.M. Jones, Stefan Hajnoczi,
	Yasunori Goto, Max Reitz

Am 22.08.2014 um 16:20 hat Daniel P. Berrange geschrieben:
> On Fri, Aug 22, 2014 at 03:13:31PM +0200, Kevin Wolf wrote:
> > Am 22.08.2014 um 14:25 hat Richard W.M. Jones geschrieben:
> > > 
> > > On Mon, Jul 28, 2014 at 04:48:46PM +0800, Hu Tao wrote:
> > > > ping...
> > > > 
> > > > All the 6 patches have reviewed-by now.
> > > > 
> > > > On Fri, Jul 11, 2014 at 02:09:57PM +0800, Hu Tao wrote:
> > > > > This series adds two preallocation mode to qcow2 and raw:
> > > > > 
> > > > > Option preallocation=full preallocates disk space for image by writing
> > > > > zeros to disk, this ensures disk space in any cases.
> > > > > 
> > > > > Option preallocation=falloc preallocates disk space by calling
> > > > > posix_fallocate(). This is faster than preallocation=full.
> > > 
> > > Sorry if this was discussed before, but why would anyone use
> > > preallocation=full if preallocation=falloc was possible?
> > > 
> > > Shouldn't preallocation=full simply use posix_fallocate if it's
> > > available, and fall back to writing zeroes if not?
> > 
> > posix_fallocate() is basically metadata preallocation on the file
> > system level. If any lower levels involve allocations as well, does
> > posix_fallocate() allocate them there?
> 
> Well the man page says
> 
>   "After a successful call to posix_fallocate(), subsequent writes to
>    bytes in the specified range are guaranteed not to fail because of
>    lack of disk space."
> 
> Which seems like it is what users would want when they ask for
> preallocate=full. So I'm not seeing the benefit of instead being
> able to ask to write zeros would bring.

This is what the spec says. There are two problems with it:

1. The actual implementation doesn't do what the spec says. As Rich's
   test shows (and I expected), the preallocation stays on the
   filesystem level and isn't propagated to lower levels.

2. It's also not what most users want. They don't use preallocation
   because they are afraid of running out of disk space, but because
   they want performance. What they are looking for is "subsequent
   writes to bytes in the specified range are guaranteed to require no
   metadata updates". But this is neither what the spec promises you nor
   (afaik) what actual implementations do.

It's still useful because it happens to reduce the overhead in most
implementations and it's a relatively quick operation, but the best way
I know of to actually _fully_ preallocate is still writing zeros. Which
of the two the user wants, is a decision that qemu can't make for them.

Kevin

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v12 0/6] qcow2, raw: add preallocation=full and preallocation=falloc
  2014-08-22 15:22         ` Kevin Wolf
@ 2014-08-22 15:34           ` Richard W.M. Jones
  2014-08-22 15:39             ` Richard W.M. Jones
  2014-08-22 15:53             ` Kevin Wolf
  0 siblings, 2 replies; 35+ messages in thread
From: Richard W.M. Jones @ 2014-08-22 15:34 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Hu Tao, qemu-devel, Max Reitz, Stefan Hajnoczi, Yasunori Goto

On Fri, Aug 22, 2014 at 05:22:33PM +0200, Kevin Wolf wrote:
> It's still useful because it happens to reduce the overhead in most
> implementations and it's a relatively quick operation, but the best way
> I know of to actually _fully_ preallocate is still writing zeros. Which
> of the two the user wants, is a decision that qemu can't make for them.

This is a difficult situation.  Possibly the choice is between

 - efficiently make the file fully allocated, that works in the vast
   majority of cases, but don't go crazy (ie. fallocate)

 - really really try as hard as possible to make sure that future
   allocations will never fail (ie. write random non-zero data to the
   file)

Note that neither of these is the preallocation=... option as
specified in this patch.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v12 0/6] qcow2, raw: add preallocation=full and preallocation=falloc
  2014-08-22 15:34           ` Richard W.M. Jones
@ 2014-08-22 15:39             ` Richard W.M. Jones
  2014-08-22 15:53             ` Kevin Wolf
  1 sibling, 0 replies; 35+ messages in thread
From: Richard W.M. Jones @ 2014-08-22 15:39 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Yasunori Goto, Hu Tao, qemu-devel, Stefan Hajnoczi, Max Reitz

On Fri, Aug 22, 2014 at 04:34:40PM +0100, Richard W.M. Jones wrote:
>  - really really try as hard as possible to make sure that future
>    allocations will never fail (ie. write random non-zero data to the
>    file)

s/will never fail/are not needed and will never fail/

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://libguestfs.org

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v12 0/6] qcow2, raw: add preallocation=full and preallocation=falloc
  2014-08-22 15:34           ` Richard W.M. Jones
  2014-08-22 15:39             ` Richard W.M. Jones
@ 2014-08-22 15:53             ` Kevin Wolf
  2014-08-22 16:00               ` Richard W.M. Jones
  1 sibling, 1 reply; 35+ messages in thread
From: Kevin Wolf @ 2014-08-22 15:53 UTC (permalink / raw)
  To: Richard W.M. Jones
  Cc: Hu Tao, qemu-devel, Max Reitz, Stefan Hajnoczi, Yasunori Goto

Am 22.08.2014 um 17:34 hat Richard W.M. Jones geschrieben:
> On Fri, Aug 22, 2014 at 05:22:33PM +0200, Kevin Wolf wrote:
> > It's still useful because it happens to reduce the overhead in most
> > implementations and it's a relatively quick operation, but the best way
> > I know of to actually _fully_ preallocate is still writing zeros. Which
> > of the two the user wants, is a decision that qemu can't make for them.
> 
> This is a difficult situation.  Possibly the choice is between
> 
>  - efficiently make the file fully allocated, that works in the vast
>    majority of cases, but don't go crazy (ie. fallocate)
> 
>  - really really try as hard as possible to make sure that future
>    allocations will never fail (ie. write random non-zero data to the
>    file)
> 
> Note that neither of these is the preallocation=... option as
> specified in this patch.

Isn't the first one exactly preallocation=falloc and the second is
preallocation=full, except that we're not writing non-zero blocks? (And
probably shouldn't, because that would change the content.)

Kevin

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v12 0/6] qcow2, raw: add preallocation=full and preallocation=falloc
  2014-08-22 15:53             ` Kevin Wolf
@ 2014-08-22 16:00               ` Richard W.M. Jones
  2014-08-25  5:18                 ` Hu Tao
  2014-08-26  5:27                 ` Hu Tao
  0 siblings, 2 replies; 35+ messages in thread
From: Richard W.M. Jones @ 2014-08-22 16:00 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Hu Tao, qemu-devel, Max Reitz, Stefan Hajnoczi, Yasunori Goto

On Fri, Aug 22, 2014 at 05:53:22PM +0200, Kevin Wolf wrote:
> Am 22.08.2014 um 17:34 hat Richard W.M. Jones geschrieben:
> > On Fri, Aug 22, 2014 at 05:22:33PM +0200, Kevin Wolf wrote:
> > > It's still useful because it happens to reduce the overhead in most
> > > implementations and it's a relatively quick operation, but the best way
> > > I know of to actually _fully_ preallocate is still writing zeros. Which
> > > of the two the user wants, is a decision that qemu can't make for them.
> > 
> > This is a difficult situation.  Possibly the choice is between
> > 
> >  - efficiently make the file fully allocated, that works in the vast
> >    majority of cases, but don't go crazy (ie. fallocate)
> > 
> >  - really really try as hard as possible to make sure that future
> >    allocations will never fail (ie. write random non-zero data to the
> >    file)
> > 
> > Note that neither of these is the preallocation=... option as
> > specified in this patch.
> 
> Isn't the first one exactly preallocation=falloc and the second is
> preallocation=full, except that we're not writing non-zero blocks? (And
> probably shouldn't, because that would change the content.)

Well no for a few reasons:

What is proposed to be called 'preallocation=falloc' should fall back
to other methods (eg. writing random, writing zeroes).  It should
also be called something more useful like 'preallocation=best'.

What is proposed to be called 'preallocation=full' should not write
just zeroes.  It needs to write random data since otherwise lower
layers could discard those writes and that would mean metadata
allocations could still take time (or fail).  It could also be called
something more useful, say, 'preallocation=tryveryhard'.

TBH I think this whole thing is overkill and we should just have a
preallocation option that works like in libvirt.  Anything else is
silly [see above] or pushes the problem to upper layers that are in no
position to make that decision.

Remember that the upper layer is probably not even running on the same
machine.  It has no knowledge of the backing LUN.  It doesn't know
about the hypervisor kernel (ie. if fallocate will fail).

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines.  Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v12 1/6] block: round up file size to nearest sector
  2014-08-22 10:55   ` Kevin Wolf
@ 2014-08-25  1:11     ` Hu Tao
  0 siblings, 0 replies; 35+ messages in thread
From: Hu Tao @ 2014-08-25  1:11 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Yasunori Goto, qemu-devel, Stefan Hajnoczi, Max Reitz

On Fri, Aug 22, 2014 at 12:55:51PM +0200, Kevin Wolf wrote:
> Am 11.07.2014 um 08:09 hat Hu Tao geschrieben:
> > Reviewed-by: Max Reitz <mreitz@redhat.com>
> > Reviewed-by: Eric Blake <eblake@redhat.com>
> > Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
> 
> If we make this change, shouldn't we do it consistently across all image
> formats? With this patch we have some formats that round up, and many
> others that round down.

We should.

> 
> Kevin

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v12 4/6] qapi: introduce PreallocMode and a new PreallocMode full.
  2014-08-22 10:57   ` Kevin Wolf
@ 2014-08-25  1:12     ` Hu Tao
  0 siblings, 0 replies; 35+ messages in thread
From: Hu Tao @ 2014-08-25  1:12 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Yasunori Goto, qemu-devel, Stefan Hajnoczi, Max Reitz

On Fri, Aug 22, 2014 at 12:57:15PM +0200, Kevin Wolf wrote:
> Am 11.07.2014 um 08:10 hat Hu Tao geschrieben:
> > This patch prepares for the subsequent patches.
> > 
> > Reviewed-by: Fam Zheng <famz@redhat.com>
> > Reviewed-by: Eric Blake <eblake@redhat.com>
> > Reviewed-by: Max Reitz <mreitz@redhat.com>
> > Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
> 
> Strictly speaking, qcow2 should error out if it is given full or falloc
> after this patch. Two patches later it will actually interpret those
> values correctly, so it's not that big of a problem, but it would
> definitely be nicer.

Okay.

> 
> Kevin

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v12 5/6] raw-posix: Add falloc and full preallocation option
  2014-08-22 10:58   ` Kevin Wolf
@ 2014-08-25  1:18     ` Hu Tao
  0 siblings, 0 replies; 35+ messages in thread
From: Hu Tao @ 2014-08-25  1:18 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Yasunori Goto, qemu-devel, Stefan Hajnoczi, Max Reitz

On Fri, Aug 22, 2014 at 12:58:23PM +0200, Kevin Wolf wrote:
> Am 11.07.2014 um 08:10 hat Hu Tao geschrieben:
> > This patch adds a new option preallocation for raw format, and implements
> > falloc and full preallocation.
> > 
> > Reviewed-by: Max Reitz <mreitz@redhat.com>
> > Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
> 
> raw-posix needs to error out if called with preallocation=metadata, it
> doesn't implement this option.

Yes.

> 
> Kevin

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v12 0/6] qcow2, raw: add preallocation=full and preallocation=falloc
  2014-08-22 10:54   ` Kevin Wolf
@ 2014-08-25  1:35     ` Hu Tao
  2014-08-26 10:44       ` Stefan Hajnoczi
  0 siblings, 1 reply; 35+ messages in thread
From: Hu Tao @ 2014-08-25  1:35 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Yasunori Goto, qemu-devel, Stefan Hajnoczi, Max Reitz

On Fri, Aug 22, 2014 at 12:54:29PM +0200, Kevin Wolf wrote:
> Am 28.07.2014 um 10:48 hat Hu Tao geschrieben:
> > ping...
> > 
> > All the 6 patches have reviewed-by now.
> 
> Looks mostly good to me, I have only a few minor comments that wouldn't
> block inclusion but could be addressed in follow-up patches.
> 
> However, you have a dependency on a patch series from Max (you use
> minimal_blob_size()), which hasn't been reviewed and merged yet, so your
> series is blocked on that.
> 
> If you want to get your series merged quicker, you could replace this
> with a rough estimate that excludes the clusters used by refcount table
> and blocks. If full preallocation isn't really full, but only
> preallocation of 99.9%, that's probably good enough in practice.

How about my calculation in v10?
https://lists.nongnu.org/archive/html/qemu-devel/2014-06/msg02844.html

It doesn't depend on minimal_blob_size(). In most cases the calculated
size is the same as the size before this patch. I have no test results
on hand, but if you need I can do it.

Regards,
Hu

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v12 6/6] qcow2: Add falloc and full preallocation option
  2014-08-22 11:00   ` Kevin Wolf
@ 2014-08-25  1:36     ` Hu Tao
  0 siblings, 0 replies; 35+ messages in thread
From: Hu Tao @ 2014-08-25  1:36 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Yasunori Goto, qemu-devel, Stefan Hajnoczi, Max Reitz

On Fri, Aug 22, 2014 at 01:00:53PM +0200, Kevin Wolf wrote:
> Am 11.07.2014 um 08:10 hat Hu Tao geschrieben:
> > This adds preallocation=falloc and preallocation=full mode to qcow2
> > image creation.
> > 
> > preallocation=full allocates disk space by writing zeros to disk to
> > ensure disk space in any cases.
> > 
> > preallocation=falloc likes preallocation=full, but allocates disk space
> > by posix_fallocate().
> > 
> > Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
> > ---
> >  block/qcow2.c              | 31 ++++++++++++++++++++++++--
> >  tests/qemu-iotests/082.out | 54 +++++++++++++++++++++++-----------------------
> >  2 files changed, 56 insertions(+), 29 deletions(-)
> > 
> > diff --git a/block/qcow2.c b/block/qcow2.c
> > index cfba93b..f48e915 100644
> > --- a/block/qcow2.c
> > +++ b/block/qcow2.c
> > @@ -1593,6 +1593,9 @@ static int preallocate(BlockDriverState *bs)
> >      return 0;
> >  }
> >  
> > +static uint64_t minimal_blob_size(uint64_t ts, int cb, int spcb,
> > +                                  uint64_t overhead);
> > +
> >  static int qcow2_create2(const char *filename, int64_t total_size,
> >                           const char *backing_file, const char *backing_format,
> >                           int flags, size_t cluster_size, PreallocMode prealloc,
> > @@ -1628,6 +1631,29 @@ static int qcow2_create2(const char *filename, int64_t total_size,
> >      Error *local_err = NULL;
> >      int ret;
> >  
> > +    if (prealloc == PREALLOC_MODE_FULL || prealloc == PREALLOC_MODE_FALLOC) {
> > +        int64_t meta_size = 0;
> > +        uint64_t nl2e;
> > +
> > +        total_size = align_offset(total_size, cluster_size);
> 
> I don't think it's a good idea to let the virtual disk size depend on
> whether preallocation is enabled or not. You should always get the same
> rounding (which is rounding up to the next sector boundary).
> 
> Do you need full clusters for your calculations below or what is this
> good for? If so, please use a local variable and leave the value used
> for the bdrv_truncate() call unmodified.

Yes, it is for the calculation. I'll use a local variable for it.

Regards,
Hu

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v12 0/6] qcow2, raw: add preallocation=full and preallocation=falloc
  2014-08-22 16:00               ` Richard W.M. Jones
@ 2014-08-25  5:18                 ` Hu Tao
  2014-08-25 10:31                   ` Richard W.M. Jones
  2014-08-25 13:44                   ` Richard W.M. Jones
  2014-08-26  5:27                 ` Hu Tao
  1 sibling, 2 replies; 35+ messages in thread
From: Hu Tao @ 2014-08-25  5:18 UTC (permalink / raw)
  To: Richard W.M. Jones
  Cc: Kevin Wolf, qemu-devel, Max Reitz, Stefan Hajnoczi, Yasunori Goto

On Fri, Aug 22, 2014 at 05:00:08PM +0100, Richard W.M. Jones wrote:
> On Fri, Aug 22, 2014 at 05:53:22PM +0200, Kevin Wolf wrote:
> > Am 22.08.2014 um 17:34 hat Richard W.M. Jones geschrieben:
> > > On Fri, Aug 22, 2014 at 05:22:33PM +0200, Kevin Wolf wrote:
> > > > It's still useful because it happens to reduce the overhead in most
> > > > implementations and it's a relatively quick operation, but the best way
> > > > I know of to actually _fully_ preallocate is still writing zeros. Which
> > > > of the two the user wants, is a decision that qemu can't make for them.
> > > 
> > > This is a difficult situation.  Possibly the choice is between
> > > 
> > >  - efficiently make the file fully allocated, that works in the vast
> > >    majority of cases, but don't go crazy (ie. fallocate)
> > > 
> > >  - really really try as hard as possible to make sure that future
> > >    allocations will never fail (ie. write random non-zero data to the
> > >    file)
> > > 
> > > Note that neither of these is the preallocation=... option as
> > > specified in this patch.
> > 
> > Isn't the first one exactly preallocation=falloc and the second is
> > preallocation=full, except that we're not writing non-zero blocks? (And
> > probably shouldn't, because that would change the content.)
> 
> Well no for a few reasons:
> 
> What is proposed to be called 'preallocation=falloc' should fall back
> to other methods (eg. writing random, writing zeroes).  It should
> also be called something more useful like 'preallocation=best'.

What if user cares about time(writing zeroes or non-zeroes is
time-consuming) and wants falloc only sometimes? I think this is the
main difference between preallocation=falloc and preallocation=full.

> 
> What is proposed to be called 'preallocation=full' should not write
> just zeroes.  It needs to write random data since otherwise lower
> layers could discard those writes and that would mean metadata
> allocations could still take time (or fail).  It could also be called
> something more useful, say, 'preallocation=tryveryhard'.

I agree with you on this one, but does it need to be random? does ~0 work?

Regards,
Hu

> 
> TBH I think this whole thing is overkill and we should just have a
> preallocation option that works like in libvirt.  Anything else is
> silly [see above] or pushes the problem to upper layers that are in no
> position to make that decision.
> 
> Remember that the upper layer is probably not even running on the same
> machine.  It has no knowledge of the backing LUN.  It doesn't know
> about the hypervisor kernel (ie. if fallocate will fail).
> 
> Rich.
> 
> -- 
> Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
> Read my programming and virtualization blog: http://rwmj.wordpress.com
> virt-p2v converts physical machines to virtual machines.  Boot with a
> live CD or over the network (PXE) and turn machines into KVM guests.
> http://libguestfs.org/virt-v2v

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v12 0/6] qcow2, raw: add preallocation=full and preallocation=falloc
  2014-08-25  5:18                 ` Hu Tao
@ 2014-08-25 10:31                   ` Richard W.M. Jones
  2014-08-25 13:44                   ` Richard W.M. Jones
  1 sibling, 0 replies; 35+ messages in thread
From: Richard W.M. Jones @ 2014-08-25 10:31 UTC (permalink / raw)
  To: Hu Tao; +Cc: Kevin Wolf, qemu-devel, Max Reitz, Stefan Hajnoczi, Yasunori Goto

On Mon, Aug 25, 2014 at 01:18:30PM +0800, Hu Tao wrote:
> On Fri, Aug 22, 2014 at 05:00:08PM +0100, Richard W.M. Jones wrote:
> > What is proposed to be called 'preallocation=falloc' should fall back
> > to other methods (eg. writing random, writing zeroes).  It should
> > also be called something more useful like 'preallocation=best'.
> 
> What if user cares about time(writing zeroes or non-zeroes is
> time-consuming) and wants falloc only sometimes? I think this is the
> main difference between preallocation=falloc and preallocation=full.

So fallocate fails, and then what?

The user wants something which just works, not something which will
fail inexplicably.  As a rule of thumb, end users and higher layers of
the virt stack have absolutely no idea about the details of what
happens in qemu, and requiring to have this knowledge is impossible.

If it takes too long, that's a performance problem to look into --
fixed by using a filesystem that uses extents.

> > What is proposed to be called 'preallocation=full' should not write
> > just zeroes.  It needs to write random data since otherwise lower
> > layers could discard those writes and that would mean metadata
> > allocations could still take time (or fail).  It could also be called
> > something more useful, say, 'preallocation=tryveryhard'.
> 
> I agree with you on this one, but does it need to be random? does ~0 work?

As I said in the other part of this email, I don't actually think
we should implement such an option.  A simple preallocation option
that just works most of the time is fine.

However since you asked, ~0 will not work.  For a long time SANs have
been able to detect the same data in two blocks and combine them, a
process called 'deduplication'.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://libguestfs.org

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v12 0/6] qcow2, raw: add preallocation=full and preallocation=falloc
  2014-08-25  5:18                 ` Hu Tao
  2014-08-25 10:31                   ` Richard W.M. Jones
@ 2014-08-25 13:44                   ` Richard W.M. Jones
  1 sibling, 0 replies; 35+ messages in thread
From: Richard W.M. Jones @ 2014-08-25 13:44 UTC (permalink / raw)
  To: Hu Tao; +Cc: Kevin Wolf, qemu-devel, Max Reitz, Stefan Hajnoczi, Yasunori Goto

On Mon, Aug 25, 2014 at 01:18:30PM +0800, Hu Tao wrote:
> What if user cares about time(writing zeroes or non-zeroes is
> time-consuming) and wants falloc only sometimes? I think this is the
> main difference between preallocation=falloc and preallocation=full.

Also posix_fallocate in glibc falls back to writing zeroes when the
kernel/VFS doesn't support a true fallocate.  So I'm afraid your patch
5/6 has a slow path, at least on common Linux distros.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v12 0/6] qcow2, raw: add preallocation=full and preallocation=falloc
  2014-08-22 16:00               ` Richard W.M. Jones
  2014-08-25  5:18                 ` Hu Tao
@ 2014-08-26  5:27                 ` Hu Tao
  1 sibling, 0 replies; 35+ messages in thread
From: Hu Tao @ 2014-08-26  5:27 UTC (permalink / raw)
  To: Richard W.M. Jones
  Cc: Kevin Wolf, qemu-devel, Max Reitz, Stefan Hajnoczi, Yasunori Goto

On Fri, Aug 22, 2014 at 05:00:08PM +0100, Richard W.M. Jones wrote:
> On Fri, Aug 22, 2014 at 05:53:22PM +0200, Kevin Wolf wrote:
> > Am 22.08.2014 um 17:34 hat Richard W.M. Jones geschrieben:
> > > On Fri, Aug 22, 2014 at 05:22:33PM +0200, Kevin Wolf wrote:
> > > > It's still useful because it happens to reduce the overhead in most
> > > > implementations and it's a relatively quick operation, but the best way
> > > > I know of to actually _fully_ preallocate is still writing zeros. Which
> > > > of the two the user wants, is a decision that qemu can't make for them.
> > > 
> > > This is a difficult situation.  Possibly the choice is between
> > > 
> > >  - efficiently make the file fully allocated, that works in the vast
> > >    majority of cases, but don't go crazy (ie. fallocate)
> > > 
> > >  - really really try as hard as possible to make sure that future
> > >    allocations will never fail (ie. write random non-zero data to the
> > >    file)
> > > 
> > > Note that neither of these is the preallocation=... option as
> > > specified in this patch.
> > 
> > Isn't the first one exactly preallocation=falloc and the second is
> > preallocation=full, except that we're not writing non-zero blocks? (And
> > probably shouldn't, because that would change the content.)
> 
> Well no for a few reasons:
> 
> What is proposed to be called 'preallocation=falloc' should fall back
> to other methods (eg. writing random, writing zeroes).  It should
> also be called something more useful like 'preallocation=best'.

I think you suggested this one. Kevin, how do you think this
implementation?

> 
> What is proposed to be called 'preallocation=full' should not write
> just zeroes.  It needs to write random data since otherwise lower
> layers could discard those writes and that would mean metadata
> allocations could still take time (or fail).  It could also be called
> something more useful, say, 'preallocation=tryveryhard'.
> 
> TBH I think this whole thing is overkill and we should just have a
> preallocation option that works like in libvirt.  Anything else is
> silly [see above] or pushes the problem to upper layers that are in no
> position to make that decision.
> 
> Remember that the upper layer is probably not even running on the same
> machine.  It has no knowledge of the backing LUN.  It doesn't know
> about the hypervisor kernel (ie. if fallocate will fail).
> 
> Rich.
> 
> -- 
> Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
> Read my programming and virtualization blog: http://rwmj.wordpress.com
> virt-p2v converts physical machines to virtual machines.  Boot with a
> live CD or over the network (PXE) and turn machines into KVM guests.
> http://libguestfs.org/virt-v2v

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v12 0/6] qcow2, raw: add preallocation=full and preallocation=falloc
  2014-08-25  1:35     ` Hu Tao
@ 2014-08-26 10:44       ` Stefan Hajnoczi
  2014-08-28  5:04         ` Hu Tao
  0 siblings, 1 reply; 35+ messages in thread
From: Stefan Hajnoczi @ 2014-08-26 10:44 UTC (permalink / raw)
  To: Hu Tao; +Cc: Kevin Wolf, Yasunori Goto, qemu-devel, Max Reitz

[-- Attachment #1: Type: text/plain, Size: 1274 bytes --]

On Mon, Aug 25, 2014 at 09:35:15AM +0800, Hu Tao wrote:
> On Fri, Aug 22, 2014 at 12:54:29PM +0200, Kevin Wolf wrote:
> > Am 28.07.2014 um 10:48 hat Hu Tao geschrieben:
> > > ping...
> > > 
> > > All the 6 patches have reviewed-by now.
> > 
> > Looks mostly good to me, I have only a few minor comments that wouldn't
> > block inclusion but could be addressed in follow-up patches.
> > 
> > However, you have a dependency on a patch series from Max (you use
> > minimal_blob_size()), which hasn't been reviewed and merged yet, so your
> > series is blocked on that.
> > 
> > If you want to get your series merged quicker, you could replace this
> > with a rough estimate that excludes the clusters used by refcount table
> > and blocks. If full preallocation isn't really full, but only
> > preallocation of 99.9%, that's probably good enough in practice.
> 
> How about my calculation in v10?
> https://lists.nongnu.org/archive/html/qemu-devel/2014-06/msg02844.html
> 
> It doesn't depend on minimal_blob_size(). In most cases the calculated
> size is the same as the size before this patch. I have no test results
> on hand, but if you need I can do it.

Kevin is on vacation this week.

I think the previous calculation could work.

Stefan

[-- Attachment #2: Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v12 0/6] qcow2, raw: add preallocation=full and preallocation=falloc
  2014-08-26 10:44       ` Stefan Hajnoczi
@ 2014-08-28  5:04         ` Hu Tao
  0 siblings, 0 replies; 35+ messages in thread
From: Hu Tao @ 2014-08-28  5:04 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Kevin Wolf, Yasunori Goto, qemu-devel, Max Reitz

On Tue, Aug 26, 2014 at 11:44:26AM +0100, Stefan Hajnoczi wrote:
> On Mon, Aug 25, 2014 at 09:35:15AM +0800, Hu Tao wrote:
> > On Fri, Aug 22, 2014 at 12:54:29PM +0200, Kevin Wolf wrote:
> > > Am 28.07.2014 um 10:48 hat Hu Tao geschrieben:
> > > > ping...
> > > > 
> > > > All the 6 patches have reviewed-by now.
> > > 
> > > Looks mostly good to me, I have only a few minor comments that wouldn't
> > > block inclusion but could be addressed in follow-up patches.
> > > 
> > > However, you have a dependency on a patch series from Max (you use
> > > minimal_blob_size()), which hasn't been reviewed and merged yet, so your
> > > series is blocked on that.
> > > 
> > > If you want to get your series merged quicker, you could replace this
> > > with a rough estimate that excludes the clusters used by refcount table
> > > and blocks. If full preallocation isn't really full, but only
> > > preallocation of 99.9%, that's probably good enough in practice.
> > 
> > How about my calculation in v10?
> > https://lists.nongnu.org/archive/html/qemu-devel/2014-06/msg02844.html
> > 
> > It doesn't depend on minimal_blob_size(). In most cases the calculated
> > size is the same as the size before this patch. I have no test results
> > on hand, but if you need I can do it.
> 
> Kevin is on vacation this week.
> 
> I think the previous calculation could work.
> 
> Stefan

Okay, I'll send v13 patchset implementing preallocation as Rich
suggested for you to review.

Regards,
Hu

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2014-08-28  5:06 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-07-11  6:09 [Qemu-devel] [PATCH v12 0/6] qcow2, raw: add preallocation=full and preallocation=falloc Hu Tao
2014-07-11  6:09 ` [Qemu-devel] [PATCH v12 1/6] block: round up file size to nearest sector Hu Tao
2014-08-22 10:55   ` Kevin Wolf
2014-08-25  1:11     ` Hu Tao
2014-07-11  6:09 ` [Qemu-devel] [PATCH v12 2/6] raw, qcow2: don't convert file size to sector size Hu Tao
2014-07-11  6:10 ` [Qemu-devel] [PATCH v12 3/6] rename parse_enum_option to qapi_enum_parse and make it public Hu Tao
2014-07-11  6:10 ` [Qemu-devel] [PATCH v12 4/6] qapi: introduce PreallocMode and a new PreallocMode full Hu Tao
2014-08-22 10:57   ` Kevin Wolf
2014-08-25  1:12     ` Hu Tao
2014-07-11  6:10 ` [Qemu-devel] [PATCH v12 5/6] raw-posix: Add falloc and full preallocation option Hu Tao
2014-08-22 10:58   ` Kevin Wolf
2014-08-25  1:18     ` Hu Tao
2014-07-11  6:10 ` [Qemu-devel] [PATCH v12 6/6] qcow2: " Hu Tao
2014-07-11 21:07   ` Max Reitz
2014-08-22 11:00   ` Kevin Wolf
2014-08-25  1:36     ` Hu Tao
2014-07-28  8:48 ` [Qemu-devel] [PATCH v12 0/6] qcow2, raw: add preallocation=full and preallocation=falloc Hu Tao
2014-08-22 10:54   ` Kevin Wolf
2014-08-25  1:35     ` Hu Tao
2014-08-26 10:44       ` Stefan Hajnoczi
2014-08-28  5:04         ` Hu Tao
2014-08-22 12:25   ` Richard W.M. Jones
2014-08-22 12:36     ` Daniel P. Berrange
2014-08-22 13:13     ` Kevin Wolf
2014-08-22 13:26       ` Richard W.M. Jones
2014-08-22 14:20       ` Daniel P. Berrange
2014-08-22 15:22         ` Kevin Wolf
2014-08-22 15:34           ` Richard W.M. Jones
2014-08-22 15:39             ` Richard W.M. Jones
2014-08-22 15:53             ` Kevin Wolf
2014-08-22 16:00               ` Richard W.M. Jones
2014-08-25  5:18                 ` Hu Tao
2014-08-25 10:31                   ` Richard W.M. Jones
2014-08-25 13:44                   ` Richard W.M. Jones
2014-08-26  5:27                 ` Hu Tao

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.