All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] Replace posix_fallocate() with falloate()
@ 2020-08-31 14:01 Nir Soffer
  2020-08-31 14:01 ` [PATCH 1/2] block: file-posix: Extract preallocate helpers Nir Soffer
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Nir Soffer @ 2020-08-31 14:01 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, qemu-block, Markus Armbruster, Max Reitz, Nir Soffer

Change preallocation=falloc to use fallocate() instead of
posix_fallocte(), improving performance when legacy filesystem that do
not support fallocate, and avoiding issues seen with OFD locks.

More work is needed to respect cache mode when using full preallocation
and maybe optimize buffer size.

Continuing the discussion at:
https://lists.nongnu.org/archive/html/qemu-block/2020-08/msg00947.html

Nir Soffer (2):
  block: file-posix: Extract preallocate helpers
  block: file-posix: Replace posix_fallocate with fallocate

 block/file-posix.c                     | 202 ++++++++++++++-----------
 docs/system/qemu-block-drivers.rst.inc |  11 +-
 docs/tools/qemu-img.rst                |  11 +-
 qapi/block-core.json                   |   4 +-
 4 files changed, 127 insertions(+), 101 deletions(-)

-- 
2.26.2



^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/2] block: file-posix: Extract preallocate helpers
  2020-08-31 14:01 [PATCH 0/2] Replace posix_fallocate() with falloate() Nir Soffer
@ 2020-08-31 14:01 ` Nir Soffer
  2020-09-01 10:24   ` Alberto Garcia
  2020-09-01 10:26   ` Alberto Garcia
  2020-08-31 14:01 ` [PATCH 2/2] block: file-posix: Replace posix_fallocate with fallocate Nir Soffer
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 11+ messages in thread
From: Nir Soffer @ 2020-08-31 14:01 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, qemu-block, Markus Armbruster, Max Reitz, Nir Soffer

handle_aiocb_truncate() was too big and complex, implementing 3
different preallocation modes. In a future patch I want to introduce a
fallback from "falloc" to "full"; it will be too messy and error prone
with the current code.

Extract a helper for each of the preallocation modes (falloc, full, off)
and leave only the common preparation and cleanup code in
handle_aiocb_truncate().

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
---
 block/file-posix.c | 206 ++++++++++++++++++++++++++-------------------
 1 file changed, 120 insertions(+), 86 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index 9a00d4190a..341ffb1cb4 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -1832,12 +1832,128 @@ static int allocate_first_block(int fd, size_t max_size)
     return ret;
 }
 
+static int preallocate_falloc(int fd, int64_t current_length, int64_t offset,
+                              Error **errp)
+{
+#ifdef CONFIG_POSIX_FALLOCATE
+    int result;
+
+    if (offset == current_length)
+        return 0;
+
+    /*
+     * Truncating before posix_fallocate() makes it about twice slower on
+     * file systems that do not support fallocate(), trying to check if a
+     * block is allocated before allocating it, so don't do that here.
+     */
+
+    result = -posix_fallocate(fd, current_length,
+                              offset - current_length);
+    if (result != 0) {
+        /* posix_fallocate() doesn't set errno. */
+        error_setg_errno(errp, -result,
+                         "Could not preallocate new data");
+        return result;
+    }
+
+    if (current_length == 0) {
+        /*
+         * posix_fallocate() uses fallocate() if the filesystem supports
+         * it, or fallback to manually writing zeroes. If fallocate()
+         * was used, unaligned reads from the fallocated area in
+         * raw_probe_alignment() will succeed, hence we need to allocate
+         * the first block.
+         *
+         * Optimize future alignment probing; ignore failures.
+         */
+        allocate_first_block(fd, offset);
+    }
+
+    return 0;
+#else
+    return -ENOTSUP;
+#endif
+}
+
+static int preallocate_full(int fd, int64_t current_length, int64_t offset,
+                            Error **errp)
+{
+    int64_t num = 0, left = offset - current_length;
+    off_t seek_result;
+    int result;
+    char *buf = NULL;
+
+    /*
+     * Knowing the final size from the beginning could allow the file
+     * system driver to do less allocations and possibly avoid
+     * fragmentation of the file.
+     */
+    if (ftruncate(fd, offset) != 0) {
+        result = -errno;
+        error_setg_errno(errp, -result, "Could not resize file");
+        goto out;
+    }
+
+    buf = g_malloc0(65536);
+
+    seek_result = lseek(fd, current_length, SEEK_SET);
+    if (seek_result < 0) {
+        result = -errno;
+        error_setg_errno(errp, -result,
+                         "Failed to seek to the old end of file");
+        goto out;
+    }
+
+    while (left > 0) {
+        num = MIN(left, 65536);
+        result = write(fd, buf, num);
+        if (result < 0) {
+            if (errno == EINTR) {
+                continue;
+            }
+            result = -errno;
+            error_setg_errno(errp, -result,
+                             "Could not write zeros for preallocation");
+            goto out;
+        }
+        left -= result;
+    }
+
+    result = fsync(fd);
+    if (result < 0) {
+        result = -errno;
+        error_setg_errno(errp, -result, "Could not flush file to disk");
+        goto out;
+    }
+
+out:
+    g_free(buf);
+
+    return result;
+}
+
+static int preallocate_off(int fd, int64_t current_length, int64_t offset,
+                           Error **errp)
+{
+    if (ftruncate(fd, offset) != 0) {
+        int result = -errno;
+        error_setg_errno(errp, -result, "Could not resize file");
+        return result;
+    }
+
+    if (current_length == 0 && offset > current_length) {
+        /* Optimize future alignment probing; ignore failures. */
+        allocate_first_block(fd, offset);
+    }
+
+    return 0;
+}
+
 static int handle_aiocb_truncate(void *opaque)
 {
     RawPosixAIOData *aiocb = opaque;
     int result = 0;
     int64_t current_length = 0;
-    char *buf = NULL;
     struct stat st;
     int fd = aiocb->aio_fildes;
     int64_t offset = aiocb->aio_offset;
@@ -1859,95 +1975,14 @@ static int handle_aiocb_truncate(void *opaque)
     switch (prealloc) {
 #ifdef CONFIG_POSIX_FALLOCATE
     case PREALLOC_MODE_FALLOC:
-        /*
-         * Truncating before posix_fallocate() makes it about twice slower on
-         * file systems that do not support fallocate(), trying to check if a
-         * block is allocated before allocating it, so don't do that here.
-         */
-        if (offset != current_length) {
-            result = -posix_fallocate(fd, current_length,
-                                      offset - current_length);
-            if (result != 0) {
-                /* posix_fallocate() doesn't set errno. */
-                error_setg_errno(errp, -result,
-                                 "Could not preallocate new data");
-            } else if (current_length == 0) {
-                /*
-                 * posix_fallocate() uses fallocate() if the filesystem
-                 * supports it, or fallback to manually writing zeroes. If
-                 * fallocate() was used, unaligned reads from the fallocated
-                 * area in raw_probe_alignment() will succeed, hence we need to
-                 * allocate the first block.
-                 *
-                 * Optimize future alignment probing; ignore failures.
-                 */
-                allocate_first_block(fd, offset);
-            }
-        } else {
-            result = 0;
-        }
+        result = preallocate_falloc(fd, current_length, offset, errp);
         goto out;
 #endif
     case PREALLOC_MODE_FULL:
-    {
-        int64_t num = 0, left = offset - current_length;
-        off_t seek_result;
-
-        /*
-         * Knowing the final size from the beginning could allow the file
-         * system driver to do less allocations and possibly avoid
-         * fragmentation of the file.
-         */
-        if (ftruncate(fd, offset) != 0) {
-            result = -errno;
-            error_setg_errno(errp, -result, "Could not resize file");
-            goto out;
-        }
-
-        buf = g_malloc0(65536);
-
-        seek_result = lseek(fd, current_length, SEEK_SET);
-        if (seek_result < 0) {
-            result = -errno;
-            error_setg_errno(errp, -result,
-                             "Failed to seek to the old end of file");
-            goto out;
-        }
-
-        while (left > 0) {
-            num = MIN(left, 65536);
-            result = write(fd, buf, num);
-            if (result < 0) {
-                if (errno == EINTR) {
-                    continue;
-                }
-                result = -errno;
-                error_setg_errno(errp, -result,
-                                 "Could not write zeros for preallocation");
-                goto out;
-            }
-            left -= result;
-        }
-        if (result >= 0) {
-            result = fsync(fd);
-            if (result < 0) {
-                result = -errno;
-                error_setg_errno(errp, -result,
-                                 "Could not flush file to disk");
-                goto out;
-            }
-        }
+        result = preallocate_full(fd, current_length, offset, errp);
         goto out;
-    }
     case PREALLOC_MODE_OFF:
-        if (ftruncate(fd, offset) != 0) {
-            result = -errno;
-            error_setg_errno(errp, -result, "Could not resize file");
-        } else if (current_length == 0 && offset > current_length) {
-            /* Optimize future alignment probing; ignore failures. */
-            allocate_first_block(fd, offset);
-        }
-        return result;
+        return preallocate_off(fd, current_length, offset, errp);
     default:
         result = -ENOTSUP;
         error_setg(errp, "Unsupported preallocation mode: %s",
@@ -1963,7 +1998,6 @@ out:
         }
     }
 
-    g_free(buf);
     return result;
 }
 
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/2] block: file-posix: Replace posix_fallocate with fallocate
  2020-08-31 14:01 [PATCH 0/2] Replace posix_fallocate() with falloate() Nir Soffer
  2020-08-31 14:01 ` [PATCH 1/2] block: file-posix: Extract preallocate helpers Nir Soffer
@ 2020-08-31 14:01 ` Nir Soffer
  2020-09-01 15:51   ` Alberto Garcia
  2020-09-14 17:32   ` Daniel P. Berrangé
  2020-08-31 15:55 ` [PATCH 0/2] Replace posix_fallocate() with falloate() no-reply
  2020-09-14 17:19 ` Nir Soffer
  3 siblings, 2 replies; 11+ messages in thread
From: Nir Soffer @ 2020-08-31 14:01 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, qemu-block, Markus Armbruster, Max Reitz, Nir Soffer

If fallocate() is not supported, posix_fallocate() falls back to
inefficient allocation, writing one byte for every 4k bytes[1]. This is
very slow compared with writing zeros. In oVirt we measured ~400%
improvement in allocation time when replacing posix_fallocate() with
manually writing zeroes[2].

We also know that posix_fallocated() does not work well when using OFD
locks[3]. We don't know the reason yet for this issue yet.

Change preallocate_falloc() to use fallocate() instead of
posix_falloate(), and fall back to full preallocation if not supported.

Here are quick test results with this change.

Before (qemu-img-5.1.0-2.fc32.x86_64):

$ time qemu-img create -f raw -o preallocation=falloc /tmp/nfs3/test.raw 6g
Formatting '/tmp/nfs3/test.raw', fmt=raw size=6442450944 preallocation=falloc

real 0m42.100s
user 0m0.602s
sys 0m4.137s

NFS stats:
calls      retrans    authrefrsh    write
1571583    0          1572205       1571321

After:

$ time ./qemu-img create -f raw -o preallocation=falloc /tmp/nfs3/test.raw 6g
Formatting '/tmp/nfs3/test.raw', fmt=raw size=6442450944 preallocation=falloc

real 0m15.551s
user 0m0.070s
sys 0m2.623s

NFS stats:
calls      retrans    authrefrsh    write
24620      0          24624         24567  

[1] https://code.woboq.org/userspace/glibc/sysdeps/posix/posix_fallocate.c.html#96
[2] https://bugzilla.redhat.com/1850267#c25
[3] https://bugzilla.redhat.com/1851097

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
---
 block/file-posix.c                     | 32 +++++++++-----------------
 docs/system/qemu-block-drivers.rst.inc | 11 +++++----
 docs/tools/qemu-img.rst                | 11 +++++----
 qapi/block-core.json                   |  4 ++--
 4 files changed, 25 insertions(+), 33 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index 341ffb1cb4..eac3c0b412 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -1835,36 +1835,24 @@ static int allocate_first_block(int fd, size_t max_size)
 static int preallocate_falloc(int fd, int64_t current_length, int64_t offset,
                               Error **errp)
 {
-#ifdef CONFIG_POSIX_FALLOCATE
+#ifdef CONFIG_FALLOCATE
     int result;
 
     if (offset == current_length)
         return 0;
 
-    /*
-     * Truncating before posix_fallocate() makes it about twice slower on
-     * file systems that do not support fallocate(), trying to check if a
-     * block is allocated before allocating it, so don't do that here.
-     */
-
-    result = -posix_fallocate(fd, current_length,
-                              offset - current_length);
+    result = do_fallocate(fd, 0, current_length, offset - current_length);
     if (result != 0) {
-        /* posix_fallocate() doesn't set errno. */
-        error_setg_errno(errp, -result,
-                         "Could not preallocate new data");
+        error_setg_errno(errp, -result, "Could not preallocate new data");
         return result;
     }
 
     if (current_length == 0) {
         /*
-         * posix_fallocate() uses fallocate() if the filesystem supports
-         * it, or fallback to manually writing zeroes. If fallocate()
-         * was used, unaligned reads from the fallocated area in
-         * raw_probe_alignment() will succeed, hence we need to allocate
-         * the first block.
+         * Unaligned reads from the fallocated area in raw_probe_alignment()
+         * will succeed, hence we need to allocate the first block.
          *
-         * Optimize future alignment probing; ignore failures.
+         * Optimizes future alignment probing; ignore failures.
          */
         allocate_first_block(fd, offset);
     }
@@ -1973,10 +1961,12 @@ static int handle_aiocb_truncate(void *opaque)
     }
 
     switch (prealloc) {
-#ifdef CONFIG_POSIX_FALLOCATE
+#ifdef CONFIG_FALLOCATE
     case PREALLOC_MODE_FALLOC:
         result = preallocate_falloc(fd, current_length, offset, errp);
-        goto out;
+        if (result != -ENOTSUP)
+            goto out;
+        /* If fallocate() is not supported, fallback to full preallocation. */
 #endif
     case PREALLOC_MODE_FULL:
         result = preallocate_full(fd, current_length, offset, errp);
@@ -3080,7 +3070,7 @@ static QemuOptsList raw_create_opts = {
             .name = BLOCK_OPT_PREALLOC,
             .type = QEMU_OPT_STRING,
             .help = "Preallocation mode (allowed values: off"
-#ifdef CONFIG_POSIX_FALLOCATE
+#ifdef CONFIG_FALLOCATE
                     ", falloc"
 #endif
                     ", full)"
diff --git a/docs/system/qemu-block-drivers.rst.inc b/docs/system/qemu-block-drivers.rst.inc
index b052a6d14e..8e4acf397e 100644
--- a/docs/system/qemu-block-drivers.rst.inc
+++ b/docs/system/qemu-block-drivers.rst.inc
@@ -25,11 +25,12 @@ This section describes each format and the options that are supported for it.
   .. program:: raw
   .. option:: preallocation
 
-    Preallocation mode (allowed values: ``off``, ``falloc``,
-    ``full``). ``falloc`` mode preallocates space for image by
-    calling ``posix_fallocate()``. ``full`` mode preallocates space
-    for image by writing data to underlying storage. This data may or
-    may not be zero, depending on the storage location.
+    Preallocation mode (allowed values: ``off``, ``falloc``, ``full``).
+    ``falloc`` mode preallocates space for image by calling
+    ``fallocate()``, and falling back to ``full` mode if not supported.
+    ``full`` mode preallocates space for image by writing data to
+    underlying storage. This data may or may not be zero, depending on
+    the storage location.
 
 .. program:: image-formats
 .. option:: qcow2
diff --git a/docs/tools/qemu-img.rst b/docs/tools/qemu-img.rst
index c35bd64822..a2089bd1b7 100644
--- a/docs/tools/qemu-img.rst
+++ b/docs/tools/qemu-img.rst
@@ -750,11 +750,12 @@ Supported image file formats:
   Supported options:
 
   ``preallocation``
-    Preallocation mode (allowed values: ``off``, ``falloc``,
-    ``full``).  ``falloc`` mode preallocates space for image by
-    calling ``posix_fallocate()``.  ``full`` mode preallocates space
-    for image by writing data to underlying storage.  This data may or
-    may not be zero, depending on the storage location.
+    Preallocation mode (allowed values: ``off``, ``falloc``, ``full``).
+    ``falloc`` mode preallocates space for image by calling
+    ``fallocate()``, and falling back to ``full` mode if not supported.
+    ``full`` mode preallocates space for image by writing data to
+    underlying storage. This data may or may not be zero, depending on
+    the storage location.
 
 ``qcow2``
 
diff --git a/qapi/block-core.json b/qapi/block-core.json
index db08c58d78..681d79ec63 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -5021,8 +5021,8 @@
 #
 # @off: no preallocation
 # @metadata: preallocate only for metadata
-# @falloc: like @full preallocation but allocate disk space by
-#          posix_fallocate() rather than writing data.
+# @falloc: try to allocate disk space by fallocate(), and fallback to
+#          @full preallocation if not supported.
 # @full: preallocate all data by writing it to the device to ensure
 #        disk space is really available. This data may or may not be
 #        zero, depending on the image format and storage.
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/2] Replace posix_fallocate() with falloate()
  2020-08-31 14:01 [PATCH 0/2] Replace posix_fallocate() with falloate() Nir Soffer
  2020-08-31 14:01 ` [PATCH 1/2] block: file-posix: Extract preallocate helpers Nir Soffer
  2020-08-31 14:01 ` [PATCH 2/2] block: file-posix: Replace posix_fallocate with fallocate Nir Soffer
@ 2020-08-31 15:55 ` no-reply
  2020-09-14 17:19 ` Nir Soffer
  3 siblings, 0 replies; 11+ messages in thread
From: no-reply @ 2020-08-31 15:55 UTC (permalink / raw)
  To: nirsof; +Cc: kwolf, qemu-block, armbru, qemu-devel, mreitz, nsoffer

Patchew URL: https://patchew.org/QEMU/20200831140127.657134-1-nsoffer@redhat.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20200831140127.657134-1-nsoffer@redhat.com
Subject: [PATCH 0/2] Replace posix_fallocate() with falloate()

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

From https://github.com/patchew-project/qemu
 * [new tag]         patchew/20200831140127.657134-1-nsoffer@redhat.com -> patchew/20200831140127.657134-1-nsoffer@redhat.com
Switched to a new branch 'test'
70e35ed block: file-posix: Replace posix_fallocate with fallocate
35d89d1 block: file-posix: Extract preallocate helpers

=== OUTPUT BEGIN ===
1/2 Checking commit 35d89d1300e4 (block: file-posix: Extract preallocate helpers)
ERROR: braces {} are necessary for all arms of this statement
#33: FILE: block/file-posix.c:1841:
+    if (offset == current_length)
[...]

total: 1 errors, 0 warnings, 234 lines checked

Patch 1/2 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

2/2 Checking commit 70e35eda5bc9 (block: file-posix: Replace posix_fallocate with fallocate)
ERROR: braces {} are necessary for all arms of this statement
#110: FILE: block/file-posix.c:1967:
+        if (result != -ENOTSUP)
[...]

total: 1 errors, 0 warnings, 108 lines checked

Patch 2/2 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/20200831140127.657134-1-nsoffer@redhat.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] block: file-posix: Extract preallocate helpers
  2020-08-31 14:01 ` [PATCH 1/2] block: file-posix: Extract preallocate helpers Nir Soffer
@ 2020-09-01 10:24   ` Alberto Garcia
  2020-09-01 10:26   ` Alberto Garcia
  1 sibling, 0 replies; 11+ messages in thread
From: Alberto Garcia @ 2020-09-01 10:24 UTC (permalink / raw)
  To: Nir Soffer, qemu-devel
  Cc: Kevin Wolf, Markus Armbruster, qemu-block, Max Reitz

On Mon 31 Aug 2020 04:01:26 PM CEST, Nir Soffer wrote:
> handle_aiocb_truncate() was too big and complex, implementing 3
> different preallocation modes. In a future patch I want to introduce a
> fallback from "falloc" to "full"; it will be too messy and error prone
> with the current code.
>
> Extract a helper for each of the preallocation modes (falloc, full, off)
> and leave only the common preparation and cleanup code in
> handle_aiocb_truncate().
>
> Signed-off-by: Nir Soffer <nsoffer@redhat.com>

Reviewed-by: Alberto Garcia <berto@igalia.com>

Berto


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] block: file-posix: Extract preallocate helpers
  2020-08-31 14:01 ` [PATCH 1/2] block: file-posix: Extract preallocate helpers Nir Soffer
  2020-09-01 10:24   ` Alberto Garcia
@ 2020-09-01 10:26   ` Alberto Garcia
  2020-09-01 10:47     ` Nir Soffer
  1 sibling, 1 reply; 11+ messages in thread
From: Alberto Garcia @ 2020-09-01 10:26 UTC (permalink / raw)
  To: Nir Soffer, qemu-devel
  Cc: Kevin Wolf, Markus Armbruster, qemu-block, Max Reitz

On Mon 31 Aug 2020 04:01:26 PM CEST, Nir Soffer wrote:
> +static int preallocate_falloc(int fd, int64_t current_length, int64_t offset,
> +                              Error **errp)
> +{
> +#ifdef CONFIG_POSIX_FALLOCATE
> +    int result;
> +
> +    if (offset == current_length)
> +        return 0;

You can also take the chance to add the missing braces here (there's a
similar warning for the other patch).

Berto


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] block: file-posix: Extract preallocate helpers
  2020-09-01 10:26   ` Alberto Garcia
@ 2020-09-01 10:47     ` Nir Soffer
  0 siblings, 0 replies; 11+ messages in thread
From: Nir Soffer @ 2020-09-01 10:47 UTC (permalink / raw)
  To: Alberto Garcia
  Cc: Kevin Wolf, Nir Soffer, qemu-block, Markus Armbruster,
	QEMU Developers, Max Reitz

On Tue, Sep 1, 2020 at 1:27 PM Alberto Garcia <berto@igalia.com> wrote:
>
> On Mon 31 Aug 2020 04:01:26 PM CEST, Nir Soffer wrote:
> > +static int preallocate_falloc(int fd, int64_t current_length, int64_t offset,
> > +                              Error **errp)
> > +{
> > +#ifdef CONFIG_POSIX_FALLOCATE
> > +    int result;
> > +
> > +    if (offset == current_length)
> > +        return 0;
>
> You can also take the chance to add the missing braces here (there's a
> similar warning for the other patch).

Sure, I'll change it in the next version.

I forgot to run checkpatch.pl, and it also seems extra work when using
git publish.



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] block: file-posix: Replace posix_fallocate with fallocate
  2020-08-31 14:01 ` [PATCH 2/2] block: file-posix: Replace posix_fallocate with fallocate Nir Soffer
@ 2020-09-01 15:51   ` Alberto Garcia
  2020-09-14 17:32   ` Daniel P. Berrangé
  1 sibling, 0 replies; 11+ messages in thread
From: Alberto Garcia @ 2020-09-01 15:51 UTC (permalink / raw)
  To: Nir Soffer, qemu-devel
  Cc: Kevin Wolf, Markus Armbruster, qemu-block, Max Reitz

On Mon 31 Aug 2020 04:01:27 PM CEST, Nir Soffer wrote:
> If fallocate() is not supported, posix_fallocate() falls back to
> inefficient allocation, writing one byte for every 4k bytes[1]. This is
> very slow compared with writing zeros. In oVirt we measured ~400%
> improvement in allocation time when replacing posix_fallocate() with
> manually writing zeroes[2].
>
> We also know that posix_fallocated() does not work well when using OFD
> locks[3]. We don't know the reason yet for this issue yet.
>
> Change preallocate_falloc() to use fallocate() instead of
> posix_falloate(), and fall back to full preallocation if not
> supported.


>      case PREALLOC_MODE_FALLOC:
>          result = preallocate_falloc(fd, current_length, offset, errp);
> -        goto out;
> +        if (result != -ENOTSUP)
> +            goto out;
> +        /* If fallocate() is not supported, fallback to full preallocation. */

With the missing braces in this if statement,

Reviewed-by: Alberto Garcia <berto@igalia.com>

Berto


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/2] Replace posix_fallocate() with falloate()
  2020-08-31 14:01 [PATCH 0/2] Replace posix_fallocate() with falloate() Nir Soffer
                   ` (2 preceding siblings ...)
  2020-08-31 15:55 ` [PATCH 0/2] Replace posix_fallocate() with falloate() no-reply
@ 2020-09-14 17:19 ` Nir Soffer
  3 siblings, 0 replies; 11+ messages in thread
From: Nir Soffer @ 2020-09-14 17:19 UTC (permalink / raw)
  To: Nir Soffer
  Cc: Kevin Wolf, qemu-block, Markus Armbruster, QEMU Developers, Max Reitz

On Mon, Aug 31, 2020 at 5:01 PM Nir Soffer <nirsof@gmail.com> wrote:
>
> Change preallocation=falloc to use fallocate() instead of
> posix_fallocte(), improving performance when legacy filesystem that do
> not support fallocate, and avoiding issues seen with OFD locks.
>
> More work is needed to respect cache mode when using full preallocation
> and maybe optimize buffer size.
>
> Continuing the discussion at:
> https://lists.nongnu.org/archive/html/qemu-block/2020-08/msg00947.html
>
> Nir Soffer (2):
>   block: file-posix: Extract preallocate helpers
>   block: file-posix: Replace posix_fallocate with fallocate
>
>  block/file-posix.c                     | 202 ++++++++++++++-----------
>  docs/system/qemu-block-drivers.rst.inc |  11 +-
>  docs/tools/qemu-img.rst                |  11 +-
>  qapi/block-core.json                   |   4 +-
>  4 files changed, 127 insertions(+), 101 deletions(-)

Ping



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] block: file-posix: Replace posix_fallocate with fallocate
  2020-08-31 14:01 ` [PATCH 2/2] block: file-posix: Replace posix_fallocate with fallocate Nir Soffer
  2020-09-01 15:51   ` Alberto Garcia
@ 2020-09-14 17:32   ` Daniel P. Berrangé
  2020-09-15  8:55     ` Nir Soffer
  1 sibling, 1 reply; 11+ messages in thread
From: Daniel P. Berrangé @ 2020-09-14 17:32 UTC (permalink / raw)
  To: Nir Soffer
  Cc: Kevin Wolf, qemu-block, Markus Armbruster, qemu-devel, Max Reitz,
	Nir Soffer

On Mon, Aug 31, 2020 at 05:01:27PM +0300, Nir Soffer wrote:
> If fallocate() is not supported, posix_fallocate() falls back to
> inefficient allocation, writing one byte for every 4k bytes[1]. This is
> very slow compared with writing zeros. In oVirt we measured ~400%
> improvement in allocation time when replacing posix_fallocate() with
> manually writing zeroes[2].
> 
> We also know that posix_fallocated() does not work well when using OFD
> locks[3]. We don't know the reason yet for this issue yet.
> 
> Change preallocate_falloc() to use fallocate() instead of
> posix_falloate(), and fall back to full preallocation if not supported.
> 
> Here are quick test results with this change.
> 
> Before (qemu-img-5.1.0-2.fc32.x86_64):
> 
> $ time qemu-img create -f raw -o preallocation=falloc /tmp/nfs3/test.raw 6g
> Formatting '/tmp/nfs3/test.raw', fmt=raw size=6442450944 preallocation=falloc
> 
> real 0m42.100s
> user 0m0.602s
> sys 0m4.137s
> 
> NFS stats:
> calls      retrans    authrefrsh    write
> 1571583    0          1572205       1571321
> 
> After:
> 
> $ time ./qemu-img create -f raw -o preallocation=falloc /tmp/nfs3/test.raw 6g
> Formatting '/tmp/nfs3/test.raw', fmt=raw size=6442450944 preallocation=falloc
> 
> real 0m15.551s
> user 0m0.070s
> sys 0m2.623s
> 
> NFS stats:
> calls      retrans    authrefrsh    write
> 24620      0          24624         24567  
> 
> [1] https://code.woboq.org/userspace/glibc/sysdeps/posix/posix_fallocate.c.html#96
> [2] https://bugzilla.redhat.com/1850267#c25
> [3] https://bugzilla.redhat.com/1851097

This bug appears to be private to RH employees only, so rather than link
to it, please summarize any important facts in it for benefit of nonn-RH
QEMU contributors.

> 
> Signed-off-by: Nir Soffer <nsoffer@redhat.com>
> ---
>  block/file-posix.c                     | 32 +++++++++-----------------
>  docs/system/qemu-block-drivers.rst.inc | 11 +++++----
>  docs/tools/qemu-img.rst                | 11 +++++----
>  qapi/block-core.json                   |  4 ++--
>  4 files changed, 25 insertions(+), 33 deletions(-)

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] block: file-posix: Replace posix_fallocate with fallocate
  2020-09-14 17:32   ` Daniel P. Berrangé
@ 2020-09-15  8:55     ` Nir Soffer
  0 siblings, 0 replies; 11+ messages in thread
From: Nir Soffer @ 2020-09-15  8:55 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Kevin Wolf, Nir Soffer, qemu-block, Markus Armbruster,
	QEMU Developers, Max Reitz

On Mon, Sep 14, 2020 at 8:32 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
>
> On Mon, Aug 31, 2020 at 05:01:27PM +0300, Nir Soffer wrote:
> > If fallocate() is not supported, posix_fallocate() falls back to
> > inefficient allocation, writing one byte for every 4k bytes[1]. This is
> > very slow compared with writing zeros. In oVirt we measured ~400%
> > improvement in allocation time when replacing posix_fallocate() with
> > manually writing zeroes[2].
> >
> > We also know that posix_fallocated() does not work well when using OFD
> > locks[3]. We don't know the reason yet for this issue yet.
> >
> > Change preallocate_falloc() to use fallocate() instead of
> > posix_falloate(), and fall back to full preallocation if not supported.
> >
> > Here are quick test results with this change.
> >
> > Before (qemu-img-5.1.0-2.fc32.x86_64):
> >
> > $ time qemu-img create -f raw -o preallocation=falloc /tmp/nfs3/test.raw 6g
> > Formatting '/tmp/nfs3/test.raw', fmt=raw size=6442450944 preallocation=falloc
> >
> > real 0m42.100s
> > user 0m0.602s
> > sys 0m4.137s
> >
> > NFS stats:
> > calls      retrans    authrefrsh    write
> > 1571583    0          1572205       1571321
> >
> > After:
> >
> > $ time ./qemu-img create -f raw -o preallocation=falloc /tmp/nfs3/test.raw 6g
> > Formatting '/tmp/nfs3/test.raw', fmt=raw size=6442450944 preallocation=falloc
> >
> > real 0m15.551s
> > user 0m0.070s
> > sys 0m2.623s
> >
> > NFS stats:
> > calls      retrans    authrefrsh    write
> > 24620      0          24624         24567
> >
> > [1] https://code.woboq.org/userspace/glibc/sysdeps/posix/posix_fallocate.c.html#96
> > [2] https://bugzilla.redhat.com/1850267#c25
> > [3] https://bugzilla.redhat.com/1851097
>
> This bug appears to be private to RH employees only, so rather than link
> to it, please summarize any important facts in it for benefit of nonn-RH
> QEMU contributors.

Thanks, I missed that detail when linking to the bug. The bug is public now.

> > Signed-off-by: Nir Soffer <nsoffer@redhat.com>
> > ---
> >  block/file-posix.c                     | 32 +++++++++-----------------
> >  docs/system/qemu-block-drivers.rst.inc | 11 +++++----
> >  docs/tools/qemu-img.rst                | 11 +++++----
> >  qapi/block-core.json                   |  4 ++--
> >  4 files changed, 25 insertions(+), 33 deletions(-)
>
> Regards,
> Daniel
> --
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
>



^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2020-09-15  8:57 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-31 14:01 [PATCH 0/2] Replace posix_fallocate() with falloate() Nir Soffer
2020-08-31 14:01 ` [PATCH 1/2] block: file-posix: Extract preallocate helpers Nir Soffer
2020-09-01 10:24   ` Alberto Garcia
2020-09-01 10:26   ` Alberto Garcia
2020-09-01 10:47     ` Nir Soffer
2020-08-31 14:01 ` [PATCH 2/2] block: file-posix: Replace posix_fallocate with fallocate Nir Soffer
2020-09-01 15:51   ` Alberto Garcia
2020-09-14 17:32   ` Daniel P. Berrangé
2020-09-15  8:55     ` Nir Soffer
2020-08-31 15:55 ` [PATCH 0/2] Replace posix_fallocate() with falloate() no-reply
2020-09-14 17:19 ` Nir Soffer

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.