qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 00/12] preallocate filter
@ 2020-08-17  9:15 Vladimir Sementsov-Ogievskiy
  2020-08-17  9:15 ` [PATCH v3 01/12] block: simplify comment to BDRV_REQ_SERIALISING Vladimir Sementsov-Ogievskiy
                   ` (14 more replies)
  0 siblings, 15 replies; 31+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-08-17  9:15 UTC (permalink / raw)
  To: qemu-block
  Cc: fam, kwolf, vsementsov, armbru, qemu-devel, mreitz, stefanha, den

Hi all!

Here is a filter, which does preallocation on write.

In Virtuozzo we have to deal with some custom distributed storage
solution, where allocation is relatively expensive operation. We have to
workaround it in Qemu, so here is a new filter.

Patches 1-10 introduces the new filter and suggested to be merged
to master.

Patches 11,12 are here just to show how we are going to use the feature.
I don't know can they be somehow generally useful, ideas are welcome.

Difference from v1:
1-6 are new and substitutes bdrv_co_range_try_lock mechanism used in v1
07: add note to docs/system/qemu-block-drivers.rst.inc
    add open options
    rebase on new BDRV_REQ_NO_WAIT flag
    drop bs->file check in _co_flush()
    add active logic, to make it possible to insert filter dynamically
08,09: new
10: - use new iotests.verify_o_direct()
10: - add qemu-img check call
11,12: not for commit

Vladimir Sementsov-Ogievskiy (12):
  block: simplify comment to BDRV_REQ_SERIALISING
  block/io.c: drop assertion on double waiting for request serialisation
  block/io: split out bdrv_find_conflicting_request
  block/io: bdrv_wait_serialising_requests_locked: drop extra bs arg
  block: bdrv_mark_request_serialising: split non-waiting function
  block: introduce BDRV_REQ_NO_WAIT flag
  block: introduce preallocate filter
  iotests.py: add verify_o_direct helper
  iotests.py: add filter_img_check
  iotests: add 298 to test new preallocate filter driver
  block: add bdrv_is_file_on_fuse helper
  block/qcow2: automatically insert preallocate filter when on FUSE

 docs/system/qemu-block-drivers.rst.inc |  26 +++
 qapi/block-core.json                   |  20 +-
 include/block/block.h                  |  22 +-
 include/block/block_int.h              |   3 +-
 block/file-posix.c                     |  23 +-
 block/io.c                             | 131 ++++++-----
 block/preallocate.c                    | 291 +++++++++++++++++++++++++
 block/qcow2.c                          |  38 ++++
 block/Makefile.objs                    |   1 +
 tests/qemu-iotests/298                 |  50 +++++
 tests/qemu-iotests/298.out             |   6 +
 tests/qemu-iotests/group               |   1 +
 tests/qemu-iotests/iotests.py          |  10 +
 13 files changed, 554 insertions(+), 68 deletions(-)
 create mode 100644 block/preallocate.c
 create mode 100644 tests/qemu-iotests/298
 create mode 100644 tests/qemu-iotests/298.out

-- 
2.18.0



^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v3 01/12] block: simplify comment to BDRV_REQ_SERIALISING
  2020-08-17  9:15 [PATCH v3 00/12] preallocate filter Vladimir Sementsov-Ogievskiy
@ 2020-08-17  9:15 ` Vladimir Sementsov-Ogievskiy
  2020-08-19 14:59   ` Stefan Hajnoczi
  2020-08-17  9:15 ` [PATCH v3 02/12] block/io.c: drop assertion on double waiting for request serialisation Vladimir Sementsov-Ogievskiy
                   ` (13 subsequent siblings)
  14 siblings, 1 reply; 31+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-08-17  9:15 UTC (permalink / raw)
  To: qemu-block
  Cc: fam, kwolf, vsementsov, armbru, qemu-devel, mreitz, stefanha, den

1. BDRV_REQ_NO_SERIALISING doesn't exist already, don't mention it.

2. We are going to add one more user of BDRV_REQ_SERIALISING, so
   comment about backup becomes a bit confusing here. The use case in
   backup is documented in block/backup.c, so let's just drop
   duplication here.

3. The fact that BDRV_REQ_SERIALISING is only for write requests is
   omitted. Add a note.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 include/block/block.h | 11 +----------
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index 6e36154061..b8f4e86e8d 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -53,16 +53,7 @@ typedef enum {
      * content. */
     BDRV_REQ_WRITE_UNCHANGED    = 0x40,
 
-    /*
-     * BDRV_REQ_SERIALISING forces request serialisation for writes.
-     * It is used to ensure that writes to the backing file of a backup process
-     * target cannot race with a read of the backup target that defers to the
-     * backing file.
-     *
-     * Note, that BDRV_REQ_SERIALISING is _not_ opposite in meaning to
-     * BDRV_REQ_NO_SERIALISING. A more descriptive name for the latter might be
-     * _DO_NOT_WAIT_FOR_SERIALISING, except that is too long.
-     */
+    /* Forces request serialisation. Use only with write requests. */
     BDRV_REQ_SERIALISING        = 0x80,
 
     /* Execute the request only if the operation can be offloaded or otherwise
-- 
2.18.0



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v3 02/12] block/io.c: drop assertion on double waiting for request serialisation
  2020-08-17  9:15 [PATCH v3 00/12] preallocate filter Vladimir Sementsov-Ogievskiy
  2020-08-17  9:15 ` [PATCH v3 01/12] block: simplify comment to BDRV_REQ_SERIALISING Vladimir Sementsov-Ogievskiy
@ 2020-08-17  9:15 ` Vladimir Sementsov-Ogievskiy
  2020-08-19 15:28   ` Stefan Hajnoczi
  2020-08-17  9:15 ` [PATCH v3 03/12] block/io: split out bdrv_find_conflicting_request Vladimir Sementsov-Ogievskiy
                   ` (12 subsequent siblings)
  14 siblings, 1 reply; 31+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-08-17  9:15 UTC (permalink / raw)
  To: qemu-block
  Cc: fam, kwolf, vsementsov, armbru, qemu-devel, mreitz, stefanha, den

The comments states, that on misaligned request we should have already
been waiting. But for bdrv_padding_rmw_read, we called
bdrv_mark_request_serialising with align = request_alignment, and now
we serialise with align = cluster_size. So we may have to wait again
with larger alignment.

Note, that the only user of BDRV_REQ_SERIALISING is backup which issues
cluster-aligned requests, so seems the assertion should not fire for
now. But it's wrong anyway.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/io.c | 11 +----------
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/block/io.c b/block/io.c
index ad3a51ed53..b18680a842 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1881,7 +1881,6 @@ bdrv_co_write_req_prepare(BdrvChild *child, int64_t offset, uint64_t bytes,
                           BdrvTrackedRequest *req, int flags)
 {
     BlockDriverState *bs = child->bs;
-    bool waited;
     int64_t end_sector = DIV_ROUND_UP(offset + bytes, BDRV_SECTOR_SIZE);
 
     if (bs->read_only) {
@@ -1893,15 +1892,7 @@ bdrv_co_write_req_prepare(BdrvChild *child, int64_t offset, uint64_t bytes,
     assert(!(flags & ~BDRV_REQ_MASK));
 
     if (flags & BDRV_REQ_SERIALISING) {
-        waited = bdrv_mark_request_serialising(req, bdrv_get_cluster_size(bs));
-        /*
-         * For a misaligned request we should have already waited earlier,
-         * because we come after bdrv_padding_rmw_read which must be called
-         * with the request already marked as serialising.
-         */
-        assert(!waited ||
-               (req->offset == req->overlap_offset &&
-                req->bytes == req->overlap_bytes));
+        bdrv_mark_request_serialising(req, bdrv_get_cluster_size(bs));
     } else {
         bdrv_wait_serialising_requests(req);
     }
-- 
2.18.0



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v3 03/12] block/io: split out bdrv_find_conflicting_request
  2020-08-17  9:15 [PATCH v3 00/12] preallocate filter Vladimir Sementsov-Ogievskiy
  2020-08-17  9:15 ` [PATCH v3 01/12] block: simplify comment to BDRV_REQ_SERIALISING Vladimir Sementsov-Ogievskiy
  2020-08-17  9:15 ` [PATCH v3 02/12] block/io.c: drop assertion on double waiting for request serialisation Vladimir Sementsov-Ogievskiy
@ 2020-08-17  9:15 ` Vladimir Sementsov-Ogievskiy
  2020-08-19 14:59   ` Stefan Hajnoczi
  2020-08-17  9:15 ` [PATCH v3 04/12] block/io: bdrv_wait_serialising_requests_locked: drop extra bs arg Vladimir Sementsov-Ogievskiy
                   ` (11 subsequent siblings)
  14 siblings, 1 reply; 31+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-08-17  9:15 UTC (permalink / raw)
  To: qemu-block
  Cc: fam, kwolf, vsementsov, armbru, qemu-devel, mreitz, stefanha, den

To be reused in separate.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/io.c | 71 +++++++++++++++++++++++++++++++-----------------------
 1 file changed, 41 insertions(+), 30 deletions(-)

diff --git a/block/io.c b/block/io.c
index b18680a842..5b96715058 100644
--- a/block/io.c
+++ b/block/io.c
@@ -727,43 +727,54 @@ static bool tracked_request_overlaps(BdrvTrackedRequest *req,
     return true;
 }
 
+/* Called with self->bs->reqs_lock held */
+static BdrvTrackedRequest *
+bdrv_find_conflicting_request(BdrvTrackedRequest *self)
+{
+    BdrvTrackedRequest *req;
+
+    QLIST_FOREACH(req, &self->bs->tracked_requests, list) {
+        if (req == self || (!req->serialising && !self->serialising)) {
+            continue;
+        }
+        if (tracked_request_overlaps(req, self->overlap_offset,
+                                     self->overlap_bytes))
+        {
+            /*
+             * Hitting this means there was a reentrant request, for
+             * example, a block driver issuing nested requests.  This must
+             * never happen since it means deadlock.
+             */
+            assert(qemu_coroutine_self() != req->co);
+
+            /*
+             * If the request is already (indirectly) waiting for us, or
+             * will wait for us as soon as it wakes up, then just go on
+             * (instead of producing a deadlock in the former case).
+             */
+            if (!req->waiting_for) {
+                return req;
+            }
+        }
+    }
+
+    return NULL;
+}
+
 static bool coroutine_fn
 bdrv_wait_serialising_requests_locked(BlockDriverState *bs,
                                       BdrvTrackedRequest *self)
 {
     BdrvTrackedRequest *req;
-    bool retry;
     bool waited = false;
 
-    do {
-        retry = false;
-        QLIST_FOREACH(req, &bs->tracked_requests, list) {
-            if (req == self || (!req->serialising && !self->serialising)) {
-                continue;
-            }
-            if (tracked_request_overlaps(req, self->overlap_offset,
-                                         self->overlap_bytes))
-            {
-                /* Hitting this means there was a reentrant request, for
-                 * example, a block driver issuing nested requests.  This must
-                 * never happen since it means deadlock.
-                 */
-                assert(qemu_coroutine_self() != req->co);
-
-                /* If the request is already (indirectly) waiting for us, or
-                 * will wait for us as soon as it wakes up, then just go on
-                 * (instead of producing a deadlock in the former case). */
-                if (!req->waiting_for) {
-                    self->waiting_for = req;
-                    qemu_co_queue_wait(&req->wait_queue, &bs->reqs_lock);
-                    self->waiting_for = NULL;
-                    retry = true;
-                    waited = true;
-                    break;
-                }
-            }
-        }
-    } while (retry);
+    while ((req = bdrv_find_conflicting_request(self))) {
+        self->waiting_for = req;
+        qemu_co_queue_wait(&req->wait_queue, &bs->reqs_lock);
+        self->waiting_for = NULL;
+        waited = true;
+    }
+
     return waited;
 }
 
-- 
2.18.0



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v3 04/12] block/io: bdrv_wait_serialising_requests_locked: drop extra bs arg
  2020-08-17  9:15 [PATCH v3 00/12] preallocate filter Vladimir Sementsov-Ogievskiy
                   ` (2 preceding siblings ...)
  2020-08-17  9:15 ` [PATCH v3 03/12] block/io: split out bdrv_find_conflicting_request Vladimir Sementsov-Ogievskiy
@ 2020-08-17  9:15 ` Vladimir Sementsov-Ogievskiy
  2020-08-19 14:59   ` Stefan Hajnoczi
  2020-08-17  9:15 ` [PATCH v3 05/12] block: bdrv_mark_request_serialising: split non-waiting function Vladimir Sementsov-Ogievskiy
                   ` (10 subsequent siblings)
  14 siblings, 1 reply; 31+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-08-17  9:15 UTC (permalink / raw)
  To: qemu-block
  Cc: fam, kwolf, vsementsov, armbru, qemu-devel, mreitz, stefanha, den

bs is linked in req, so no needs to pass it separately. Most of
tracked-requests API doesn't have bs argument. Actually, after this
patch only tracked_request_begin has it, but it's for purpose.

While being here, also add a comment about what "_locked" is.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/io.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/block/io.c b/block/io.c
index 5b96715058..36bbe4b9b1 100644
--- a/block/io.c
+++ b/block/io.c
@@ -761,16 +761,16 @@ bdrv_find_conflicting_request(BdrvTrackedRequest *self)
     return NULL;
 }
 
+/* Called with self->bs->reqs_lock held */
 static bool coroutine_fn
-bdrv_wait_serialising_requests_locked(BlockDriverState *bs,
-                                      BdrvTrackedRequest *self)
+bdrv_wait_serialising_requests_locked(BdrvTrackedRequest *self)
 {
     BdrvTrackedRequest *req;
     bool waited = false;
 
     while ((req = bdrv_find_conflicting_request(self))) {
         self->waiting_for = req;
-        qemu_co_queue_wait(&req->wait_queue, &bs->reqs_lock);
+        qemu_co_queue_wait(&req->wait_queue, &self->bs->reqs_lock);
         self->waiting_for = NULL;
         waited = true;
     }
@@ -794,7 +794,7 @@ bool bdrv_mark_request_serialising(BdrvTrackedRequest *req, uint64_t align)
 
     req->overlap_offset = MIN(req->overlap_offset, overlap_offset);
     req->overlap_bytes = MAX(req->overlap_bytes, overlap_bytes);
-    waited = bdrv_wait_serialising_requests_locked(bs, req);
+    waited = bdrv_wait_serialising_requests_locked(req);
     qemu_co_mutex_unlock(&bs->reqs_lock);
     return waited;
 }
@@ -876,7 +876,7 @@ static bool coroutine_fn bdrv_wait_serialising_requests(BdrvTrackedRequest *self
     }
 
     qemu_co_mutex_lock(&bs->reqs_lock);
-    waited = bdrv_wait_serialising_requests_locked(bs, self);
+    waited = bdrv_wait_serialising_requests_locked(self);
     qemu_co_mutex_unlock(&bs->reqs_lock);
 
     return waited;
-- 
2.18.0



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v3 05/12] block: bdrv_mark_request_serialising: split non-waiting function
  2020-08-17  9:15 [PATCH v3 00/12] preallocate filter Vladimir Sementsov-Ogievskiy
                   ` (3 preceding siblings ...)
  2020-08-17  9:15 ` [PATCH v3 04/12] block/io: bdrv_wait_serialising_requests_locked: drop extra bs arg Vladimir Sementsov-Ogievskiy
@ 2020-08-17  9:15 ` Vladimir Sementsov-Ogievskiy
  2020-08-19 14:40   ` Stefan Hajnoczi
  2020-08-17  9:15 ` [PATCH v3 06/12] block: introduce BDRV_REQ_NO_WAIT flag Vladimir Sementsov-Ogievskiy
                   ` (9 subsequent siblings)
  14 siblings, 1 reply; 31+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-08-17  9:15 UTC (permalink / raw)
  To: qemu-block
  Cc: fam, kwolf, vsementsov, armbru, qemu-devel, mreitz, stefanha, den

We'll need a separate function, which will only "mark" request
serialising with specified align but not wait for conflicting
requests. So, it will be like old bdrv_mark_request_serialising(),
before merging bdrv_wait_serialising_requests_locked() into it.

To reduce the possible mess, let's do the following:

Public function that does both marking and waiting will be called
bdrv_make_request_serialising, and private function which will only
"mark" will be called tracked_request_set_serialising().

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 include/block/block_int.h |  3 ++-
 block/file-posix.c        |  2 +-
 block/io.c                | 34 ++++++++++++++++++++++------------
 3 files changed, 25 insertions(+), 14 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index 38dec0275b..4d56a1b141 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -1039,7 +1039,8 @@ extern unsigned int bdrv_drain_all_count;
 void bdrv_apply_subtree_drain(BdrvChild *child, BlockDriverState *new_parent);
 void bdrv_unapply_subtree_drain(BdrvChild *child, BlockDriverState *old_parent);
 
-bool coroutine_fn bdrv_mark_request_serialising(BdrvTrackedRequest *req, uint64_t align);
+bool coroutine_fn bdrv_make_request_serialising(BdrvTrackedRequest *req,
+                                                uint64_t align);
 BdrvTrackedRequest *coroutine_fn bdrv_co_get_self_request(BlockDriverState *bs);
 
 int get_tmp_filename(char *filename, int size);
diff --git a/block/file-posix.c b/block/file-posix.c
index 9a00d4190a..560d1c0a94 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -2957,7 +2957,7 @@ raw_do_pwrite_zeroes(BlockDriverState *bs, int64_t offset, int bytes,
         req->bytes = end - req->offset;
         req->overlap_bytes = req->bytes;
 
-        bdrv_mark_request_serialising(req, bs->bl.request_alignment);
+        bdrv_make_request_serialising(req, bs->bl.request_alignment);
     }
 #endif
 
diff --git a/block/io.c b/block/io.c
index 36bbe4b9b1..96b1b9cf5f 100644
--- a/block/io.c
+++ b/block/io.c
@@ -778,15 +778,14 @@ bdrv_wait_serialising_requests_locked(BdrvTrackedRequest *self)
     return waited;
 }
 
-bool bdrv_mark_request_serialising(BdrvTrackedRequest *req, uint64_t align)
+/* Called with req->bs->reqs_lock held */
+static void tracked_request_set_serialising(BdrvTrackedRequest *req,
+                                            uint64_t align)
 {
-    BlockDriverState *bs = req->bs;
     int64_t overlap_offset = req->offset & ~(align - 1);
     uint64_t overlap_bytes = ROUND_UP(req->offset + req->bytes, align)
                                - overlap_offset;
-    bool waited;
 
-    qemu_co_mutex_lock(&bs->reqs_lock);
     if (!req->serialising) {
         atomic_inc(&req->bs->serialising_in_flight);
         req->serialising = true;
@@ -794,9 +793,6 @@ bool bdrv_mark_request_serialising(BdrvTrackedRequest *req, uint64_t align)
 
     req->overlap_offset = MIN(req->overlap_offset, overlap_offset);
     req->overlap_bytes = MAX(req->overlap_bytes, overlap_bytes);
-    waited = bdrv_wait_serialising_requests_locked(req);
-    qemu_co_mutex_unlock(&bs->reqs_lock);
-    return waited;
 }
 
 /**
@@ -882,6 +878,20 @@ static bool coroutine_fn bdrv_wait_serialising_requests(BdrvTrackedRequest *self
     return waited;
 }
 
+bool bdrv_make_request_serialising(BdrvTrackedRequest *req, uint64_t align)
+{
+    bool waited;
+
+    qemu_co_mutex_lock(&req->bs->reqs_lock);
+
+    tracked_request_set_serialising(req, align);
+    waited = bdrv_wait_serialising_requests_locked(req);
+
+    qemu_co_mutex_unlock(&req->bs->reqs_lock);
+
+    return waited;
+}
+
 static int bdrv_check_byte_request(BlockDriverState *bs, int64_t offset,
                                    size_t size)
 {
@@ -1492,7 +1502,7 @@ static int coroutine_fn bdrv_aligned_preadv(BdrvChild *child,
          * with each other for the same cluster.  For example, in copy-on-read
          * it ensures that the CoR read and write operations are atomic and
          * guest writes cannot interleave between them. */
-        bdrv_mark_request_serialising(req, bdrv_get_cluster_size(bs));
+        bdrv_make_request_serialising(req, bdrv_get_cluster_size(bs));
     } else {
         bdrv_wait_serialising_requests(req);
     }
@@ -1903,7 +1913,7 @@ bdrv_co_write_req_prepare(BdrvChild *child, int64_t offset, uint64_t bytes,
     assert(!(flags & ~BDRV_REQ_MASK));
 
     if (flags & BDRV_REQ_SERIALISING) {
-        bdrv_mark_request_serialising(req, bdrv_get_cluster_size(bs));
+        bdrv_make_request_serialising(req, bdrv_get_cluster_size(bs));
     } else {
         bdrv_wait_serialising_requests(req);
     }
@@ -2069,7 +2079,7 @@ static int coroutine_fn bdrv_co_do_zero_pwritev(BdrvChild *child,
 
     padding = bdrv_init_padding(bs, offset, bytes, &pad);
     if (padding) {
-        bdrv_mark_request_serialising(req, align);
+        bdrv_make_request_serialising(req, align);
 
         bdrv_padding_rmw_read(child, req, &pad, true);
 
@@ -2183,7 +2193,7 @@ int coroutine_fn bdrv_co_pwritev_part(BdrvChild *child,
     }
 
     if (bdrv_pad_request(bs, &qiov, &qiov_offset, &offset, &bytes, &pad)) {
-        bdrv_mark_request_serialising(&req, align);
+        bdrv_make_request_serialising(&req, align);
         bdrv_padding_rmw_read(child, &req, &pad, false);
     }
 
@@ -3347,7 +3357,7 @@ int coroutine_fn bdrv_co_truncate(BdrvChild *child, int64_t offset, bool exact,
      * new area, we need to make sure that no write requests are made to it
      * concurrently or they might be overwritten by preallocation. */
     if (new_bytes) {
-        bdrv_mark_request_serialising(&req, 1);
+        bdrv_make_request_serialising(&req, 1);
     }
     if (bs->read_only) {
         error_setg(errp, "Image is read-only");
-- 
2.18.0



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v3 06/12] block: introduce BDRV_REQ_NO_WAIT flag
  2020-08-17  9:15 [PATCH v3 00/12] preallocate filter Vladimir Sementsov-Ogievskiy
                   ` (4 preceding siblings ...)
  2020-08-17  9:15 ` [PATCH v3 05/12] block: bdrv_mark_request_serialising: split non-waiting function Vladimir Sementsov-Ogievskiy
@ 2020-08-17  9:15 ` Vladimir Sementsov-Ogievskiy
  2020-08-19 14:42   ` Stefan Hajnoczi
  2020-08-17  9:15 ` [PATCH v3 07/12] block: introduce preallocate filter Vladimir Sementsov-Ogievskiy
                   ` (8 subsequent siblings)
  14 siblings, 1 reply; 31+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-08-17  9:15 UTC (permalink / raw)
  To: qemu-block
  Cc: fam, kwolf, vsementsov, armbru, qemu-devel, mreitz, stefanha, den

Add flag to make serialising request no wait: if there are conflicting
requests, just return error immediately. It's will be used in upcoming
preallocate filter.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 include/block/block.h |  9 ++++++++-
 block/io.c            | 13 ++++++++++++-
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index b8f4e86e8d..877fda06a4 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -67,8 +67,15 @@ typedef enum {
      * written to qiov parameter which may be NULL.
      */
     BDRV_REQ_PREFETCH  = 0x200,
+
+    /*
+     * If we need to wait for other requests, just fail immediately. Used
+     * only together with BDRV_REQ_SERIALISING.
+     */
+    BDRV_REQ_NO_WAIT = 0x400,
+
     /* Mask of valid flags */
-    BDRV_REQ_MASK               = 0x3ff,
+    BDRV_REQ_MASK               = 0x7ff,
 } BdrvRequestFlags;
 
 typedef struct BlockSizes {
diff --git a/block/io.c b/block/io.c
index 96b1b9cf5f..fc6d44d302 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1911,9 +1911,20 @@ bdrv_co_write_req_prepare(BdrvChild *child, int64_t offset, uint64_t bytes,
     assert(!(bs->open_flags & BDRV_O_INACTIVE));
     assert((bs->open_flags & BDRV_O_NO_IO) == 0);
     assert(!(flags & ~BDRV_REQ_MASK));
+    assert(!((flags & BDRV_REQ_NO_WAIT) && !(flags & BDRV_REQ_SERIALISING)));
 
     if (flags & BDRV_REQ_SERIALISING) {
-        bdrv_make_request_serialising(req, bdrv_get_cluster_size(bs));
+        qemu_co_mutex_lock(&bs->reqs_lock);
+
+        tracked_request_set_serialising(req, bdrv_get_cluster_size(bs));
+
+        if ((flags & BDRV_REQ_NO_WAIT) && bdrv_find_conflicting_request(req)) {
+            return -EBUSY;
+        }
+
+        bdrv_wait_serialising_requests_locked(req);
+
+        qemu_co_mutex_unlock(&bs->reqs_lock);
     } else {
         bdrv_wait_serialising_requests(req);
     }
-- 
2.18.0



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v3 07/12] block: introduce preallocate filter
  2020-08-17  9:15 [PATCH v3 00/12] preallocate filter Vladimir Sementsov-Ogievskiy
                   ` (5 preceding siblings ...)
  2020-08-17  9:15 ` [PATCH v3 06/12] block: introduce BDRV_REQ_NO_WAIT flag Vladimir Sementsov-Ogievskiy
@ 2020-08-17  9:15 ` Vladimir Sementsov-Ogievskiy
  2020-08-19 14:57   ` Stefan Hajnoczi
  2020-08-17  9:15 ` [PATCH v3 08/12] iotests.py: add verify_o_direct helper Vladimir Sementsov-Ogievskiy
                   ` (7 subsequent siblings)
  14 siblings, 1 reply; 31+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-08-17  9:15 UTC (permalink / raw)
  To: qemu-block
  Cc: fam, kwolf, vsementsov, armbru, qemu-devel, mreitz, stefanha, den

It's intended to be inserted between format and protocol nodes to
preallocate additional space (expanding protocol file) on writes
crossing EOF. It improves performance for file-systems with slow
allocation.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 docs/system/qemu-block-drivers.rst.inc |  26 +++
 qapi/block-core.json                   |  20 +-
 block/preallocate.c                    | 291 +++++++++++++++++++++++++
 block/Makefile.objs                    |   1 +
 4 files changed, 337 insertions(+), 1 deletion(-)
 create mode 100644 block/preallocate.c

diff --git a/docs/system/qemu-block-drivers.rst.inc b/docs/system/qemu-block-drivers.rst.inc
index b052a6d14e..5bfa4f4116 100644
--- a/docs/system/qemu-block-drivers.rst.inc
+++ b/docs/system/qemu-block-drivers.rst.inc
@@ -952,3 +952,29 @@ on host and see if there are locks held by the QEMU process on the image file.
 More than one byte could be locked by the QEMU instance, each byte of which
 reflects a particular permission that is acquired or protected by the running
 block driver.
+
+Filter drivers
+~~~~~~~~~~~~~~
+
+Qemu supports several filter drivers, which doesn't store any data, but do some
+additional tasks, hooking io requests.
+
+.. program:: filter-drivers
+.. option:: preallocate
+
+  Preallocate filter driver is intended to be inserted between format
+  and protocol nodes and does preallocation of some additional space
+  (expanding the protocol file) on write. It may be used for
+  file-systems with slow allocation.
+
+  Supported options:
+
+  .. program:: preallocate
+  .. option:: prealloc-align
+
+    On preallocation, align file length to this number, default 1M.
+
+  .. program:: preallocate
+  .. option:: prealloc-size
+
+    How much to preallocate, default 128M.
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 197bdc1c36..b40448063b 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2805,7 +2805,7 @@
             'cloop', 'compress', 'copy-on-read', 'dmg', 'file', 'ftp', 'ftps',
             'gluster', 'host_cdrom', 'host_device', 'http', 'https', 'iscsi',
             'luks', 'nbd', 'nfs', 'null-aio', 'null-co', 'nvme', 'parallels',
-            'qcow', 'qcow2', 'qed', 'quorum', 'raw', 'rbd',
+            'preallocate', 'qcow', 'qcow2', 'qed', 'quorum', 'raw', 'rbd',
             { 'name': 'replication', 'if': 'defined(CONFIG_REPLICATION)' },
             'sheepdog',
             'ssh', 'throttle', 'vdi', 'vhdx', 'vmdk', 'vpc', 'vvfat' ] }
@@ -3074,6 +3074,23 @@
   'data': { 'aes': 'QCryptoBlockOptionsQCow',
             'luks': 'QCryptoBlockOptionsLUKS'} }
 
+##
+# @BlockdevOptionsPreallocate:
+#
+# Filter driver intended to be inserted between format and protocol node
+# and do preallocation in protocol node on write.
+#
+# @prealloc-align: on preallocation, align file length to this number,
+#                 default 1048576 (1M)
+#
+# @prealloc-size: how much to preallocate, default 134217728 (128M)
+#
+# Since: 5.2
+##
+{ 'struct': 'BlockdevOptionsPreallocate',
+  'base': 'BlockdevOptionsGenericFormat',
+  'data': { '*prealloc-align': 'int', '*prealloc-size': 'int' } }
+
 ##
 # @BlockdevOptionsQcow2:
 #
@@ -3979,6 +3996,7 @@
       'null-co':    'BlockdevOptionsNull',
       'nvme':       'BlockdevOptionsNVMe',
       'parallels':  'BlockdevOptionsGenericFormat',
+      'preallocate':'BlockdevOptionsPreallocate',
       'qcow2':      'BlockdevOptionsQcow2',
       'qcow':       'BlockdevOptionsQcow',
       'qed':        'BlockdevOptionsGenericCOWFormat',
diff --git a/block/preallocate.c b/block/preallocate.c
new file mode 100644
index 0000000000..6dbad8a0a2
--- /dev/null
+++ b/block/preallocate.c
@@ -0,0 +1,291 @@
+/*
+ * preallocate filter driver
+ *
+ * The driver performs preallocate operation: it is injected above
+ * some node, and before each write over EOF it does additional preallocating
+ * write-zeroes request.
+ *
+ * Copyright (c) 2020 Virtuozzo International GmbH.
+ *
+ * Author:
+ *  Sementsov-Ogievskiy Vladimir <vsementsov@virtuozzo.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+
+#include "qapi/error.h"
+#include "qemu/module.h"
+#include "qemu/option.h"
+#include "qemu/units.h"
+#include "block/block_int.h"
+
+
+typedef struct BDRVPreallocateState {
+    int64_t prealloc_size;
+    int64_t prealloc_align;
+
+    /*
+     * Filter is started as not-active, so it doesn't do any preallocations nor
+     * requires BLK_PERM_RESIZE on its child. This is needed to create filter
+     * above another node-child and than do bdrv_replace_node to insert the
+     * filter.
+     *
+     * Filter becomes active the first time it detect that its parents has
+     * BLK_PERM_RESIZE on it.
+     *
+     * Filter becomes active forever: it doesn't lose active status if parents
+     * lose BLK_PERM_RESIZE, otherwise we'll not be able to shrink the file on
+     * filter close.
+     */
+    bool active;
+
+    /*
+     * Track real data end, to crop preallocation on close  data_end may be
+     * negative, which means that actual status is unknown (nothing cropped in
+     * this case)
+     */
+    int64_t data_end;
+} BDRVPreallocateState;
+
+#define PREALLOCATE_OPT_PREALLOC_ALIGN "prealloc-align"
+#define PREALLOCATE_OPT_PREALLOC_SIZE "prealloc-size"
+static QemuOptsList runtime_opts = {
+    .name = "preallocate",
+    .head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head),
+    .desc = {
+        {
+            .name = PREALLOCATE_OPT_PREALLOC_ALIGN,
+            .type = QEMU_OPT_SIZE,
+            .help = "on preallocation, align file length to this number, "
+                "default 1M",
+        },
+        {
+            .name = PREALLOCATE_OPT_PREALLOC_SIZE,
+            .type = QEMU_OPT_SIZE,
+            .help = "how much to preallocate, default 128M",
+        },
+        { /* end of list */ }
+    },
+};
+
+static int preallocate_open(BlockDriverState *bs, QDict *options, int flags,
+                            Error **errp)
+{
+    QemuOpts *opts;
+    BDRVPreallocateState *s = bs->opaque;
+
+    /*
+     * Parameters are hardcoded now. May need to add corresponding options in
+     * future.
+     */
+    opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
+    qemu_opts_absorb_qdict(opts, options, &error_abort);
+    s->prealloc_align =
+        qemu_opt_get_size(opts, PREALLOCATE_OPT_PREALLOC_ALIGN, 1 * MiB);
+    s->prealloc_size =
+        qemu_opt_get_size(opts, PREALLOCATE_OPT_PREALLOC_SIZE, 128 * MiB);
+    qemu_opts_del(opts);
+
+    bs->file = bdrv_open_child(NULL, options, "file", bs, &child_of_bds,
+                               BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY,
+                               false, errp);
+    if (!bs->file) {
+        return -EINVAL;
+    }
+
+    s->data_end = bdrv_getlength(bs->file->bs);
+    if (s->data_end < 0) {
+        return s->data_end;
+    }
+
+    bs->supported_write_flags = BDRV_REQ_WRITE_UNCHANGED |
+        (BDRV_REQ_FUA & bs->file->bs->supported_write_flags);
+
+    bs->supported_zero_flags = BDRV_REQ_WRITE_UNCHANGED |
+        ((BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_FALLBACK) &
+            bs->file->bs->supported_zero_flags);
+
+    return 0;
+}
+
+static void preallocate_close(BlockDriverState *bs)
+{
+    BDRVPreallocateState *s = bs->opaque;
+
+    if (s->active && s->data_end >= 0 &&
+        bdrv_getlength(bs->file->bs) > s->data_end)
+    {
+        bdrv_truncate(bs->file, s->data_end, true, PREALLOC_MODE_OFF, 0, NULL);
+    }
+}
+
+static void preallocate_child_perm(BlockDriverState *bs, BdrvChild *c,
+                                   BdrvChildRole role,
+                                   BlockReopenQueue *reopen_queue,
+                                   uint64_t perm, uint64_t shared,
+                                   uint64_t *nperm, uint64_t *nshared)
+{
+    BDRVPreallocateState *s = bs->opaque;
+
+    bdrv_default_perms(bs, c, role, reopen_queue, perm, shared, nperm, nshared);
+
+    s->active = s->active || (perm & BLK_PERM_RESIZE);
+
+    if (s->active) {
+        /* Force RESIZE permission, to be able to crop file on close() */
+        *nperm |= BLK_PERM_RESIZE;
+    }
+}
+
+static coroutine_fn int preallocate_co_preadv_part(
+        BlockDriverState *bs, uint64_t offset, uint64_t bytes,
+        QEMUIOVector *qiov, size_t qiov_offset, int flags)
+{
+    return bdrv_co_preadv_part(bs->file, offset, bytes, qiov, qiov_offset,
+                               flags);
+}
+
+static int coroutine_fn preallocate_co_pdiscard(BlockDriverState *bs,
+                                               int64_t offset, int bytes)
+{
+    return bdrv_co_pdiscard(bs->file, offset, bytes);
+}
+
+static bool coroutine_fn do_preallocate(BlockDriverState *bs, int64_t offset,
+                                       int64_t bytes, bool write_zero)
+{
+    BDRVPreallocateState *s = bs->opaque;
+    int64_t len, start, end;
+
+    if (!s->active) {
+        return false;
+    }
+
+    if (s->data_end >= 0) {
+        s->data_end = MAX(s->data_end,
+                          QEMU_ALIGN_UP(offset + bytes, BDRV_SECTOR_SIZE));
+    }
+
+    len = bdrv_getlength(bs->file->bs);
+    if (len < 0) {
+        return false;
+    }
+
+    if (s->data_end < 0) {
+        s->data_end = MAX(len,
+                          QEMU_ALIGN_UP(offset + bytes, BDRV_SECTOR_SIZE));
+    }
+
+    if (offset + bytes <= len) {
+        return false;
+    }
+
+    start = write_zero ? MIN(offset, len) : len;
+    end = QEMU_ALIGN_UP(offset + bytes + s->prealloc_size, s->prealloc_align);
+
+    return !bdrv_co_pwrite_zeroes(bs->file, start, end - start,
+            BDRV_REQ_NO_FALLBACK | BDRV_REQ_SERIALISING | BDRV_REQ_NO_WAIT);
+}
+
+static int coroutine_fn preallocate_co_pwrite_zeroes(BlockDriverState *bs,
+        int64_t offset, int bytes, BdrvRequestFlags flags)
+{
+    if (do_preallocate(bs, offset, bytes, true)) {
+        return 0;
+    }
+
+    return bdrv_co_pwrite_zeroes(bs->file, offset, bytes, flags);
+}
+
+static coroutine_fn int preallocate_co_pwritev_part(BlockDriverState *bs,
+                                                    uint64_t offset,
+                                                    uint64_t bytes,
+                                                    QEMUIOVector *qiov,
+                                                    size_t qiov_offset,
+                                                    int flags)
+{
+    do_preallocate(bs, offset, bytes, false);
+
+    return bdrv_co_pwritev_part(bs->file, offset, bytes, qiov, qiov_offset,
+                                flags);
+}
+
+static int coroutine_fn
+preallocate_co_truncate(BlockDriverState *bs, int64_t offset,
+                        bool exact, PreallocMode prealloc,
+                        BdrvRequestFlags flags, Error **errp)
+{
+    BDRVPreallocateState *s = bs->opaque;
+    int ret = bdrv_co_truncate(bs->file, offset, exact, prealloc, flags, errp);
+
+    /* s->data_end may become negative here, which means unknown data end */
+    s->data_end = bdrv_getlength(bs->file->bs);
+
+    return ret;
+}
+
+static int coroutine_fn preallocate_co_flush(BlockDriverState *bs)
+{
+    return bdrv_co_flush(bs->file->bs);
+}
+
+static int64_t preallocate_getlength(BlockDriverState *bs)
+{
+    /*
+     * We probably can return s->data_end here, but seems safer to return real
+     * file length, not trying to hide the preallocation.
+     *
+     * Still, don't miss the chance to restore s->data_end if it is broken.
+     */
+    BDRVPreallocateState *s = bs->opaque;
+    int64_t ret = bdrv_getlength(bs->file->bs);
+
+    if (s->data_end < 0) {
+        s->data_end = ret;
+    }
+
+    return ret;
+}
+
+BlockDriver bdrv_preallocate_filter = {
+    .format_name = "preallocate",
+    .instance_size = sizeof(BDRVPreallocateState),
+
+    .bdrv_getlength = preallocate_getlength,
+    .bdrv_open = preallocate_open,
+    .bdrv_close = preallocate_close,
+
+    .bdrv_co_preadv_part = preallocate_co_preadv_part,
+    .bdrv_co_pwritev_part = preallocate_co_pwritev_part,
+    .bdrv_co_pwrite_zeroes = preallocate_co_pwrite_zeroes,
+    .bdrv_co_pdiscard = preallocate_co_pdiscard,
+    .bdrv_co_flush = preallocate_co_flush,
+    .bdrv_co_truncate = preallocate_co_truncate,
+
+    .bdrv_co_block_status = bdrv_co_block_status_from_file,
+
+    .bdrv_child_perm = preallocate_child_perm,
+
+    .has_variable_length = true,
+    .is_filter = true,
+};
+
+static void bdrv_preallocate_init(void)
+{
+    bdrv_register(&bdrv_preallocate_filter);
+}
+
+block_init(bdrv_preallocate_init);
diff --git a/block/Makefile.objs b/block/Makefile.objs
index 19c6f371c9..f8e6f16522 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -44,6 +44,7 @@ block-obj-y += crypto.o
 block-obj-y += aio_task.o
 block-obj-y += backup-top.o
 block-obj-y += filter-compress.o
+block-obj-y += preallocate.o
 common-obj-y += monitor/
 block-obj-y += monitor/
 
-- 
2.18.0



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v3 08/12] iotests.py: add verify_o_direct helper
  2020-08-17  9:15 [PATCH v3 00/12] preallocate filter Vladimir Sementsov-Ogievskiy
                   ` (6 preceding siblings ...)
  2020-08-17  9:15 ` [PATCH v3 07/12] block: introduce preallocate filter Vladimir Sementsov-Ogievskiy
@ 2020-08-17  9:15 ` Vladimir Sementsov-Ogievskiy
  2020-08-19 14:59   ` Stefan Hajnoczi
  2020-08-17  9:15 ` [PATCH v3 09/12] iotests.py: add filter_img_check Vladimir Sementsov-Ogievskiy
                   ` (6 subsequent siblings)
  14 siblings, 1 reply; 31+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-08-17  9:15 UTC (permalink / raw)
  To: qemu-block
  Cc: fam, kwolf, vsementsov, armbru, qemu-devel, mreitz, stefanha, den

Add python notrun-helper similar to _check_o_direct for bash tests.
To be used in the following commit.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 tests/qemu-iotests/iotests.py | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 717b5b652c..369e9918b4 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -1083,6 +1083,12 @@ def _verify_aio_mode(supported_aio_modes: Sequence[str] = ()) -> None:
     if supported_aio_modes and (aiomode not in supported_aio_modes):
         notrun('not suitable for this aio mode: %s' % aiomode)
 
+def verify_o_direct() -> None:
+    with FilePath('test_o_direct') as f:
+        qemu_img_create('-f', 'raw', f, '1M')
+        if 'O_DIRECT' in qemu_io('-f', 'raw', '-t', 'none', '-c', 'quit', f):
+            notrun(f'file system at {test_dir} does not support O_DIRECT')
+
 def supports_quorum():
     return 'quorum' in qemu_img_pipe('--help')
 
-- 
2.18.0



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v3 09/12] iotests.py: add filter_img_check
  2020-08-17  9:15 [PATCH v3 00/12] preallocate filter Vladimir Sementsov-Ogievskiy
                   ` (7 preceding siblings ...)
  2020-08-17  9:15 ` [PATCH v3 08/12] iotests.py: add verify_o_direct helper Vladimir Sementsov-Ogievskiy
@ 2020-08-17  9:15 ` Vladimir Sementsov-Ogievskiy
  2020-08-19 14:59   ` Stefan Hajnoczi
  2020-08-17  9:15 ` [PATCH v3 10/12] iotests: add 298 to test new preallocate filter driver Vladimir Sementsov-Ogievskiy
                   ` (5 subsequent siblings)
  14 siblings, 1 reply; 31+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-08-17  9:15 UTC (permalink / raw)
  To: qemu-block
  Cc: fam, kwolf, vsementsov, armbru, qemu-devel, mreitz, stefanha, den

Add analog of bash _filter_qemu_img_check to python framework.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 tests/qemu-iotests/iotests.py | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 369e9918b4..ef3da4ee61 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -402,6 +402,10 @@ def filter_img_info(output, filename):
         lines.append(line)
     return '\n'.join(lines)
 
+def filter_img_check(msg):
+    msg = re.sub(r'.*allocated.*fragmented.*compressed clusters', '', msg)
+    return re.sub(r'Image end offset: [0-9]+', '', msg).strip()
+
 def filter_imgfmt(msg):
     return msg.replace(imgfmt, 'IMGFMT')
 
-- 
2.18.0



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v3 10/12] iotests: add 298 to test new preallocate filter driver
  2020-08-17  9:15 [PATCH v3 00/12] preallocate filter Vladimir Sementsov-Ogievskiy
                   ` (8 preceding siblings ...)
  2020-08-17  9:15 ` [PATCH v3 09/12] iotests.py: add filter_img_check Vladimir Sementsov-Ogievskiy
@ 2020-08-17  9:15 ` Vladimir Sementsov-Ogievskiy
  2020-08-19 14:59   ` Stefan Hajnoczi
  2020-08-17  9:15 ` [PATCH v3 11/12] block: add bdrv_is_file_on_fuse helper Vladimir Sementsov-Ogievskiy
                   ` (4 subsequent siblings)
  14 siblings, 1 reply; 31+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-08-17  9:15 UTC (permalink / raw)
  To: qemu-block
  Cc: fam, kwolf, vsementsov, armbru, qemu-devel, mreitz, stefanha, den

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 tests/qemu-iotests/298     | 50 ++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/298.out |  6 +++++
 tests/qemu-iotests/group   |  1 +
 3 files changed, 57 insertions(+)
 create mode 100644 tests/qemu-iotests/298
 create mode 100644 tests/qemu-iotests/298.out

diff --git a/tests/qemu-iotests/298 b/tests/qemu-iotests/298
new file mode 100644
index 0000000000..4f2087352a
--- /dev/null
+++ b/tests/qemu-iotests/298
@@ -0,0 +1,50 @@
+#!/usr/bin/env python3
+#
+# Test for preallocate filter
+#
+# Copyright (c) 2020 Virtuozzo International GmbH.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+import os
+import iotests
+from iotests import log
+
+iotests.script_initialize(supported_fmts=['qcow2'])
+iotests.verify_o_direct()
+
+size = 10 * 1024 * 1024
+disk = iotests.file_path('disk')
+
+iotests.qemu_img_create('-f', iotests.imgfmt, disk, str(size))
+
+opts = f'driver={iotests.imgfmt},' \
+    f'file.driver=preallocate,file.file.filename={disk}'
+p = iotests.QemuIoInteractive('--image-opts', '-t', 'none', opts)
+
+log(p.cmd('write 0 1M'), filters=[iotests.filter_qemu_io])
+p.cmd('flush')
+
+if os.path.getsize(disk) > 100 * 1024 * 1024:
+    log('file in progress is big, preallocation works')
+
+p.close()
+
+if os.path.getsize(disk) < 10 * 1024 * 1024:
+    log('file is small after close')
+
+# Check that there are no leaks.
+log(iotests.qemu_img_pipe('check', '-f', 'qcow2', disk),
+    filters=[iotests.filter_img_check])
diff --git a/tests/qemu-iotests/298.out b/tests/qemu-iotests/298.out
new file mode 100644
index 0000000000..baf8f8425c
--- /dev/null
+++ b/tests/qemu-iotests/298.out
@@ -0,0 +1,6 @@
+wrote 1048576/1048576 bytes at offset 0
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+file in progress is big, preallocation works
+file is small after close
+No errors were found on the image.
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index 025ed5238d..ac4772b43f 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -306,6 +306,7 @@
 295 rw
 296 rw
 297 meta
+298 auto quick
 299 auto quick
 301 backing quick
 302 quick
-- 
2.18.0



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v3 11/12] block: add bdrv_is_file_on_fuse helper
  2020-08-17  9:15 [PATCH v3 00/12] preallocate filter Vladimir Sementsov-Ogievskiy
                   ` (9 preceding siblings ...)
  2020-08-17  9:15 ` [PATCH v3 10/12] iotests: add 298 to test new preallocate filter driver Vladimir Sementsov-Ogievskiy
@ 2020-08-17  9:15 ` Vladimir Sementsov-Ogievskiy
  2020-08-17  9:15 ` [PATCH v3 12/12] block/qcow2: automatically insert preallocate filter when on FUSE Vladimir Sementsov-Ogievskiy
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 31+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-08-17  9:15 UTC (permalink / raw)
  To: qemu-block
  Cc: fam, kwolf, vsementsov, armbru, qemu-devel, mreitz, stefanha, den

Add a function to check, is it a file-posix node on top of file in
FUSE file system.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 include/block/block.h |  2 ++
 block/file-posix.c    | 21 +++++++++++++++++++++
 2 files changed, 23 insertions(+)

diff --git a/include/block/block.h b/include/block/block.h
index 877fda06a4..51e957f6fb 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -787,4 +787,6 @@ int coroutine_fn bdrv_co_copy_range(BdrvChild *src, uint64_t src_offset,
                                     BdrvChild *dst, uint64_t dst_offset,
                                     uint64_t bytes, BdrvRequestFlags read_flags,
                                     BdrvRequestFlags write_flags);
+
+bool bdrv_is_file_on_fuse(BlockDriverState *bs);
 #endif
diff --git a/block/file-posix.c b/block/file-posix.c
index 560d1c0a94..4100b8dc89 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -324,6 +324,27 @@ static bool dio_byte_aligned(int fd)
     return false;
 }
 
+static bool is_fuse(int fd)
+{
+#ifdef __linux__
+    struct statfs buf;
+    int ret;
+
+    ret = fstatfs(fd, &buf);
+    if (ret == 0 && buf.f_type == FUSE_SUPER_MAGIC) {
+        return true;
+    }
+#endif
+    return false;
+}
+
+bool bdrv_is_file_on_fuse(BlockDriverState *bs)
+{
+    BDRVRawState *s = bs->opaque;
+
+    return !strcmp(bs->drv->format_name, "file") && is_fuse(s->fd);
+}
+
 /* Check if read is allowed with given memory buffer and length.
  *
  * This function is used to check O_DIRECT memory buffer and request alignment.
-- 
2.18.0



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v3 12/12] block/qcow2: automatically insert preallocate filter when on FUSE
  2020-08-17  9:15 [PATCH v3 00/12] preallocate filter Vladimir Sementsov-Ogievskiy
                   ` (10 preceding siblings ...)
  2020-08-17  9:15 ` [PATCH v3 11/12] block: add bdrv_is_file_on_fuse helper Vladimir Sementsov-Ogievskiy
@ 2020-08-17  9:15 ` Vladimir Sementsov-Ogievskiy
  2020-08-19 15:15   ` Stefan Hajnoczi
  2020-08-17  9:45 ` [PATCH v3 00/12] preallocate filter no-reply
                   ` (2 subsequent siblings)
  14 siblings, 1 reply; 31+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-08-17  9:15 UTC (permalink / raw)
  To: qemu-block
  Cc: fam, kwolf, vsementsov, armbru, qemu-devel, mreitz, stefanha, den

vstorage has slow allocation, so this patch detect vstorage
(I hope, we don't use other FUSE filesystems) and inserts preallocate
filter between qcow2 node and its file child.

The following test executes more than 10 times faster
(43.2s -> 3.9s for me) with this patch. (/newssd3 is mount point of
vstorage, both cs and mds are on same ssd local ssd disk)

    IMG=/newssd3/z
    FILE_OPTS=file.filename=$IMG
    COUNT=15000
    CHUNK=64K
    CLUSTER=1M
    rm -f $IMG
    ./qemu-img create -f qcow2 -o cluster_size=$CLUSTER $IMG 1G
    ./qemu-img bench -c $COUNT -d 1 -s $CHUNK -w -t none -f qcow2 $IMG

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/qcow2.c | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/block/qcow2.c b/block/qcow2.c
index 6ad6bdc166..f56507158e 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1868,6 +1868,40 @@ static void coroutine_fn qcow2_open_entry(void *opaque)
     qemu_co_mutex_unlock(&s->lock);
 }
 
+static int qcow2_vz_insert_prealloc_filter(BlockDriverState *bs, Error **errp)
+{
+    QDict *options;
+    BlockDriverState *filter_bs;
+    Error *local_err = NULL;
+    int flags;
+
+    if (!bdrv_is_file_on_fuse(bs->file->bs)) {
+        /* Nothing to do */
+        return 0;
+    }
+
+    /* Assume it's a vstorage */
+    options = qdict_new();
+    qdict_put_str(options, "driver", "preallocate");
+    qdict_put_str(options, "file", bs->file->bs->node_name);
+    flags = bdrv_is_read_only(bs->file->bs) ? 0 : BDRV_O_RDWR;
+    filter_bs = bdrv_open(NULL, NULL, options, flags, errp);
+    if (!filter_bs) {
+        return -EINVAL;
+    }
+
+    bdrv_replace_node(bs->file->bs, filter_bs, &local_err);
+
+    /*
+     * On failure we want to remove filter_bs, on success it's referenced now by
+     * qcow2 node.
+     */
+    bdrv_unref(filter_bs);
+
+    error_propagate(errp, local_err);
+    return local_err ? -EINVAL : 0;
+}
+
 static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
                       Error **errp)
 {
@@ -1886,6 +1920,10 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
         return -EINVAL;
     }
 
+    if (qcow2_vz_insert_prealloc_filter(bs, errp) < 0) {
+        return -EINVAL;
+    }
+
     /* Initialise locks */
     qemu_co_mutex_init(&s->lock);
 
-- 
2.18.0



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH v3 00/12] preallocate filter
  2020-08-17  9:15 [PATCH v3 00/12] preallocate filter Vladimir Sementsov-Ogievskiy
                   ` (11 preceding siblings ...)
  2020-08-17  9:15 ` [PATCH v3 12/12] block/qcow2: automatically insert preallocate filter when on FUSE Vladimir Sementsov-Ogievskiy
@ 2020-08-17  9:45 ` no-reply
  2020-08-17 10:44   ` Vladimir Sementsov-Ogievskiy
  2020-08-17  9:48 ` no-reply
  2020-08-19 15:29 ` Stefan Hajnoczi
  14 siblings, 1 reply; 31+ messages in thread
From: no-reply @ 2020-08-17  9:45 UTC (permalink / raw)
  To: vsementsov
  Cc: fam, kwolf, vsementsov, qemu-block, armbru, qemu-devel, stefanha,
	den, mreitz

Patchew URL: https://patchew.org/QEMU/20200817091553.283155-1-vsementsov@virtuozzo.com/



Hi,

This series failed the docker-mingw@fedora build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#! /bin/bash
export ARCH=x86_64
make docker-image-fedora V=1 NETWORK=1
time make docker-test-mingw@fedora J=14 NETWORK=1
=== TEST SCRIPT END ===

  LINK    qemu-edid.exe
/usr/lib/gcc/x86_64-w64-mingw32/9.2.1/../../../../x86_64-w64-mingw32/bin/ld: block/qcow2.o: in function `qcow2_vz_insert_prealloc_filter':
/tmp/qemu-test/src/block/qcow2.c:1878: undefined reference to `bdrv_is_file_on_fuse'
collect2: error: ld returned 1 exit status
make: *** [/tmp/qemu-test/src/rules.mak:124: qemu-io.exe] Error 1
make: *** Waiting for unfinished jobs....
  GEN     x86_64-softmmu/hmp-commands.h
  GEN     x86_64-softmmu/hmp-commands-info.h
---
  CC      aarch64-softmmu/hw/arm/xlnx-zynqmp.o
/usr/lib/gcc/x86_64-w64-mingw32/9.2.1/../../../../x86_64-w64-mingw32/bin/ld: ../block/qcow2.o: in function `qcow2_vz_insert_prealloc_filter':
/tmp/qemu-test/src/block/qcow2.c:1878: undefined reference to `bdrv_is_file_on_fuse'
collect2: error: ld returned 1 exit status
make[1]: *** [Makefile:219: qemu-system-x86_64w.exe] Error 1
make: *** [Makefile:527: x86_64-softmmu/all] Error 2
  CC      aarch64-softmmu/hw/arm/xlnx-zcu102.o
  CC      aarch64-softmmu/hw/arm/xlnx-versal.o
  CC      aarch64-softmmu/hw/arm/xlnx-versal-virt.o
---
  LINK    aarch64-softmmu/qemu-system-aarch64w.exe
/usr/lib/gcc/x86_64-w64-mingw32/9.2.1/../../../../x86_64-w64-mingw32/bin/ld: ../block/qcow2.o: in function `qcow2_vz_insert_prealloc_filter':
/tmp/qemu-test/src/block/qcow2.c:1878: undefined reference to `bdrv_is_file_on_fuse'
collect2: error: ld returned 1 exit status
make[1]: *** [Makefile:219: qemu-system-aarch64w.exe] Error 1
make: *** [Makefile:527: aarch64-softmmu/all] Error 2
Traceback (most recent call last):
  File "./tests/docker/docker.py", line 709, in <module>
    sys.exit(main())
---
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=46c939bfc42848c6bf98acc32e769fb0', '-u', '1003', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew2/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-v9t81w9y/src/docker-src.2020-08-17-05.40.50.16028:/var/tmp/qemu:z,ro', 'qemu/fedora', '/var/tmp/qemu/run', 'test-mingw']' returned non-zero exit status 2.
filter=--filter=label=com.qemu.instance.uuid=46c939bfc42848c6bf98acc32e769fb0
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-v9t81w9y/src'
make: *** [docker-run-test-mingw@fedora] Error 2

real    4m43.217s
user    0m8.945s


The full log is available at
http://patchew.org/logs/20200817091553.283155-1-vsementsov@virtuozzo.com/testing.docker-mingw@fedora/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v3 00/12] preallocate filter
  2020-08-17  9:15 [PATCH v3 00/12] preallocate filter Vladimir Sementsov-Ogievskiy
                   ` (12 preceding siblings ...)
  2020-08-17  9:45 ` [PATCH v3 00/12] preallocate filter no-reply
@ 2020-08-17  9:48 ` no-reply
  2020-08-17 10:46   ` Vladimir Sementsov-Ogievskiy
  2020-08-19 15:29 ` Stefan Hajnoczi
  14 siblings, 1 reply; 31+ messages in thread
From: no-reply @ 2020-08-17  9:48 UTC (permalink / raw)
  To: vsementsov
  Cc: fam, kwolf, vsementsov, qemu-block, armbru, qemu-devel, stefanha,
	den, mreitz

Patchew URL: https://patchew.org/QEMU/20200817091553.283155-1-vsementsov@virtuozzo.com/



Hi,

This series failed the docker-quick@centos7 build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-centos7 V=1 NETWORK=1
time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
=== TEST SCRIPT END ===

  CC      block/nvme.o
  CC      block/nbd.o
/tmp/qemu-test/src/block/file-posix.c: In function 'is_fuse':
/tmp/qemu-test/src/block/file-posix.c:334:35: error: 'FUSE_SUPER_MAGIC' undeclared (first use in this function)
     if (ret == 0 && buf.f_type == FUSE_SUPER_MAGIC) {
                                   ^
/tmp/qemu-test/src/block/file-posix.c:334:35: note: each undeclared identifier is reported only once for each function it appears in
  CC      block/sheepdog.o
make: *** [block/file-posix.o] Error 1
make: *** Waiting for unfinished jobs....
Traceback (most recent call last):
  File "./tests/docker/docker.py", line 709, in <module>
---
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=ee7ebe61e5fd4a018bbb2a16ec4a5a54', '-u', '1003', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew2/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-o73ap74n/src/docker-src.2020-08-17-05.46.38.25129:/var/tmp/qemu:z,ro', 'qemu/centos7', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit status 2.
filter=--filter=label=com.qemu.instance.uuid=ee7ebe61e5fd4a018bbb2a16ec4a5a54
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-o73ap74n/src'
make: *** [docker-run-test-quick@centos7] Error 2

real    1m49.146s
user    0m8.506s


The full log is available at
http://patchew.org/logs/20200817091553.283155-1-vsementsov@virtuozzo.com/testing.docker-quick@centos7/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v3 00/12] preallocate filter
  2020-08-17  9:45 ` [PATCH v3 00/12] preallocate filter no-reply
@ 2020-08-17 10:44   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 31+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-08-17 10:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: fam, kwolf, qemu-block, armbru, mreitz, stefanha, den

Aha, if we want commit 11, we'll need also a stub function for win32.

17.08.2020 12:45, no-reply@patchew.org wrote:
> Patchew URL: https://patchew.org/QEMU/20200817091553.283155-1-vsementsov@virtuozzo.com/
> 
> 
> 
> Hi,
> 
> This series failed the docker-mingw@fedora build test. Please find the testing commands and
> their output below. If you have Docker installed, you can probably reproduce it
> locally.
> 
> === TEST SCRIPT BEGIN ===
> #! /bin/bash
> export ARCH=x86_64
> make docker-image-fedora V=1 NETWORK=1
> time make docker-test-mingw@fedora J=14 NETWORK=1
> === TEST SCRIPT END ===
> 
>    LINK    qemu-edid.exe
> /usr/lib/gcc/x86_64-w64-mingw32/9.2.1/../../../../x86_64-w64-mingw32/bin/ld: block/qcow2.o: in function `qcow2_vz_insert_prealloc_filter':
> /tmp/qemu-test/src/block/qcow2.c:1878: undefined reference to `bdrv_is_file_on_fuse'
> collect2: error: ld returned 1 exit status
> make: *** [/tmp/qemu-test/src/rules.mak:124: qemu-io.exe] Error 1
> make: *** Waiting for unfinished jobs....
>    GEN     x86_64-softmmu/hmp-commands.h
>    GEN     x86_64-softmmu/hmp-commands-info.h
> ---
>    CC      aarch64-softmmu/hw/arm/xlnx-zynqmp.o
> /usr/lib/gcc/x86_64-w64-mingw32/9.2.1/../../../../x86_64-w64-mingw32/bin/ld: ../block/qcow2.o: in function `qcow2_vz_insert_prealloc_filter':
> /tmp/qemu-test/src/block/qcow2.c:1878: undefined reference to `bdrv_is_file_on_fuse'
> collect2: error: ld returned 1 exit status
> make[1]: *** [Makefile:219: qemu-system-x86_64w.exe] Error 1
> make: *** [Makefile:527: x86_64-softmmu/all] Error 2
>    CC      aarch64-softmmu/hw/arm/xlnx-zcu102.o
>    CC      aarch64-softmmu/hw/arm/xlnx-versal.o
>    CC      aarch64-softmmu/hw/arm/xlnx-versal-virt.o
> ---
>    LINK    aarch64-softmmu/qemu-system-aarch64w.exe
> /usr/lib/gcc/x86_64-w64-mingw32/9.2.1/../../../../x86_64-w64-mingw32/bin/ld: ../block/qcow2.o: in function `qcow2_vz_insert_prealloc_filter':
> /tmp/qemu-test/src/block/qcow2.c:1878: undefined reference to `bdrv_is_file_on_fuse'
> collect2: error: ld returned 1 exit status
> make[1]: *** [Makefile:219: qemu-system-aarch64w.exe] Error 1
> make: *** [Makefile:527: aarch64-softmmu/all] Error 2
> Traceback (most recent call last):
>    File "./tests/docker/docker.py", line 709, in <module>
>      sys.exit(main())
> ---
>      raise CalledProcessError(retcode, cmd)
> subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=46c939bfc42848c6bf98acc32e769fb0', '-u', '1003', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew2/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-v9t81w9y/src/docker-src.2020-08-17-05.40.50.16028:/var/tmp/qemu:z,ro', 'qemu/fedora', '/var/tmp/qemu/run', 'test-mingw']' returned non-zero exit status 2.
> filter=--filter=label=com.qemu.instance.uuid=46c939bfc42848c6bf98acc32e769fb0
> make[1]: *** [docker-run] Error 1
> make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-v9t81w9y/src'
> make: *** [docker-run-test-mingw@fedora] Error 2
> 
> real    4m43.217s
> user    0m8.945s
> 
> 
> The full log is available at
> http://patchew.org/logs/20200817091553.283155-1-vsementsov@virtuozzo.com/testing.docker-mingw@fedora/?type=message.
> ---
> Email generated automatically by Patchew [https://patchew.org/].
> Please send your feedback to patchew-devel@redhat.com
> 


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v3 00/12] preallocate filter
  2020-08-17  9:48 ` no-reply
@ 2020-08-17 10:46   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 31+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-08-17 10:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: fam, kwolf, qemu-block, armbru, mreitz, stefanha, den

Hmm strange. Probably need to check #ifdef FUSE_SUPER_MAGIC

But again, it's a problem of patch 12, which is just an rfc not to be merged.

17.08.2020 12:48, no-reply@patchew.org wrote:
> Patchew URL: https://patchew.org/QEMU/20200817091553.283155-1-vsementsov@virtuozzo.com/
> 
> 
> 
> Hi,
> 
> This series failed the docker-quick@centos7 build test. Please find the testing commands and
> their output below. If you have Docker installed, you can probably reproduce it
> locally.
> 
> === TEST SCRIPT BEGIN ===
> #!/bin/bash
> make docker-image-centos7 V=1 NETWORK=1
> time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
> === TEST SCRIPT END ===
> 
>    CC      block/nvme.o
>    CC      block/nbd.o
> /tmp/qemu-test/src/block/file-posix.c: In function 'is_fuse':
> /tmp/qemu-test/src/block/file-posix.c:334:35: error: 'FUSE_SUPER_MAGIC' undeclared (first use in this function)
>       if (ret == 0 && buf.f_type == FUSE_SUPER_MAGIC) {
>                                     ^
> /tmp/qemu-test/src/block/file-posix.c:334:35: note: each undeclared identifier is reported only once for each function it appears in
>    CC      block/sheepdog.o
> make: *** [block/file-posix.o] Error 1
> make: *** Waiting for unfinished jobs....
> Traceback (most recent call last):
>    File "./tests/docker/docker.py", line 709, in <module>
> ---
>      raise CalledProcessError(retcode, cmd)
> subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=ee7ebe61e5fd4a018bbb2a16ec4a5a54', '-u', '1003', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew2/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-o73ap74n/src/docker-src.2020-08-17-05.46.38.25129:/var/tmp/qemu:z,ro', 'qemu/centos7', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit status 2.
> filter=--filter=label=com.qemu.instance.uuid=ee7ebe61e5fd4a018bbb2a16ec4a5a54
> make[1]: *** [docker-run] Error 1
> make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-o73ap74n/src'
> make: *** [docker-run-test-quick@centos7] Error 2
> 
> real    1m49.146s
> user    0m8.506s
> 
> 
> The full log is available at
> http://patchew.org/logs/20200817091553.283155-1-vsementsov@virtuozzo.com/testing.docker-quick@centos7/?type=message.
> ---
> Email generated automatically by Patchew [https://patchew.org/].
> Please send your feedback to patchew-devel@redhat.com
> 


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v3 05/12] block: bdrv_mark_request_serialising: split non-waiting function
  2020-08-17  9:15 ` [PATCH v3 05/12] block: bdrv_mark_request_serialising: split non-waiting function Vladimir Sementsov-Ogievskiy
@ 2020-08-19 14:40   ` Stefan Hajnoczi
  0 siblings, 0 replies; 31+ messages in thread
From: Stefan Hajnoczi @ 2020-08-19 14:40 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: fam, kwolf, qemu-block, armbru, qemu-devel, mreitz, den

[-- Attachment #1: Type: text/plain, Size: 494 bytes --]

On Mon, Aug 17, 2020 at 12:15:46PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> @@ -882,6 +878,20 @@ static bool coroutine_fn bdrv_wait_serialising_requests(BdrvTrackedRequest *self
>      return waited;
>  }
>  
> +bool bdrv_make_request_serialising(BdrvTrackedRequest *req, uint64_t align)
> +{
> +    bool waited;
> +
> +    qemu_co_mutex_lock(&req->bs->reqs_lock);

qemu_co_mutex_lock() is a coroutine_fn so
bdrv_make_request_serialising() needs to be marked coroutine_fn too.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v3 06/12] block: introduce BDRV_REQ_NO_WAIT flag
  2020-08-17  9:15 ` [PATCH v3 06/12] block: introduce BDRV_REQ_NO_WAIT flag Vladimir Sementsov-Ogievskiy
@ 2020-08-19 14:42   ` Stefan Hajnoczi
  0 siblings, 0 replies; 31+ messages in thread
From: Stefan Hajnoczi @ 2020-08-19 14:42 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: fam, kwolf, qemu-block, armbru, qemu-devel, mreitz, den

[-- Attachment #1: Type: text/plain, Size: 934 bytes --]

On Mon, Aug 17, 2020 at 12:15:47PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> diff --git a/block/io.c b/block/io.c
> index 96b1b9cf5f..fc6d44d302 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -1911,9 +1911,20 @@ bdrv_co_write_req_prepare(BdrvChild *child, int64_t offset, uint64_t bytes,
>      assert(!(bs->open_flags & BDRV_O_INACTIVE));
>      assert((bs->open_flags & BDRV_O_NO_IO) == 0);
>      assert(!(flags & ~BDRV_REQ_MASK));
> +    assert(!((flags & BDRV_REQ_NO_WAIT) && !(flags & BDRV_REQ_SERIALISING)));
>  
>      if (flags & BDRV_REQ_SERIALISING) {
> -        bdrv_make_request_serialising(req, bdrv_get_cluster_size(bs));
> +        qemu_co_mutex_lock(&bs->reqs_lock);
> +
> +        tracked_request_set_serialising(req, bdrv_get_cluster_size(bs));
> +
> +        if ((flags & BDRV_REQ_NO_WAIT) && bdrv_find_conflicting_request(req)) {
> +            return -EBUSY;

bs->reqs_lock is leaked.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v3 07/12] block: introduce preallocate filter
  2020-08-17  9:15 ` [PATCH v3 07/12] block: introduce preallocate filter Vladimir Sementsov-Ogievskiy
@ 2020-08-19 14:57   ` Stefan Hajnoczi
  0 siblings, 0 replies; 31+ messages in thread
From: Stefan Hajnoczi @ 2020-08-19 14:57 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: fam, kwolf, qemu-block, armbru, qemu-devel, mreitz, den

[-- Attachment #1: Type: text/plain, Size: 1255 bytes --]

On Mon, Aug 17, 2020 at 12:15:48PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> diff --git a/docs/system/qemu-block-drivers.rst.inc b/docs/system/qemu-block-drivers.rst.inc
> index b052a6d14e..5bfa4f4116 100644
> --- a/docs/system/qemu-block-drivers.rst.inc
> +++ b/docs/system/qemu-block-drivers.rst.inc
> @@ -952,3 +952,29 @@ on host and see if there are locks held by the QEMU process on the image file.
>  More than one byte could be locked by the QEMU instance, each byte of which
>  reflects a particular permission that is acquired or protected by the running
>  block driver.
> +
> +Filter drivers
> +~~~~~~~~~~~~~~
> +
> +Qemu supports several filter drivers, which doesn't store any data, but do some

s/Qemu/QEMU/

s/doesn't/don't/

> +typedef struct BDRVPreallocateState {
> +    int64_t prealloc_size;
> +    int64_t prealloc_align;
> +
> +    /*
> +     * Filter is started as not-active, so it doesn't do any preallocations nor
> +     * requires BLK_PERM_RESIZE on its child. This is needed to create filter
> +     * above another node-child and than do bdrv_replace_node to insert the

s/than/then/

> +     * filter.
> +     *
> +     * Filter becomes active the first time it detect that its parents has

s/detect/detects/
s/has/have/

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v3 01/12] block: simplify comment to BDRV_REQ_SERIALISING
  2020-08-17  9:15 ` [PATCH v3 01/12] block: simplify comment to BDRV_REQ_SERIALISING Vladimir Sementsov-Ogievskiy
@ 2020-08-19 14:59   ` Stefan Hajnoczi
  0 siblings, 0 replies; 31+ messages in thread
From: Stefan Hajnoczi @ 2020-08-19 14:59 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: fam, kwolf, qemu-block, armbru, qemu-devel, mreitz, den

[-- Attachment #1: Type: text/plain, Size: 720 bytes --]

On Mon, Aug 17, 2020 at 12:15:42PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> 1. BDRV_REQ_NO_SERIALISING doesn't exist already, don't mention it.
> 
> 2. We are going to add one more user of BDRV_REQ_SERIALISING, so
>    comment about backup becomes a bit confusing here. The use case in
>    backup is documented in block/backup.c, so let's just drop
>    duplication here.
> 
> 3. The fact that BDRV_REQ_SERIALISING is only for write requests is
>    omitted. Add a note.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  include/block/block.h | 11 +----------
>  1 file changed, 1 insertion(+), 10 deletions(-)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v3 03/12] block/io: split out bdrv_find_conflicting_request
  2020-08-17  9:15 ` [PATCH v3 03/12] block/io: split out bdrv_find_conflicting_request Vladimir Sementsov-Ogievskiy
@ 2020-08-19 14:59   ` Stefan Hajnoczi
  0 siblings, 0 replies; 31+ messages in thread
From: Stefan Hajnoczi @ 2020-08-19 14:59 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: fam, kwolf, qemu-block, armbru, qemu-devel, mreitz, den

[-- Attachment #1: Type: text/plain, Size: 376 bytes --]

On Mon, Aug 17, 2020 at 12:15:44PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> To be reused in separate.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  block/io.c | 71 +++++++++++++++++++++++++++++++-----------------------
>  1 file changed, 41 insertions(+), 30 deletions(-)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v3 04/12] block/io: bdrv_wait_serialising_requests_locked: drop extra bs arg
  2020-08-17  9:15 ` [PATCH v3 04/12] block/io: bdrv_wait_serialising_requests_locked: drop extra bs arg Vladimir Sementsov-Ogievskiy
@ 2020-08-19 14:59   ` Stefan Hajnoczi
  0 siblings, 0 replies; 31+ messages in thread
From: Stefan Hajnoczi @ 2020-08-19 14:59 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: fam, kwolf, qemu-block, armbru, qemu-devel, mreitz, den

[-- Attachment #1: Type: text/plain, Size: 574 bytes --]

On Mon, Aug 17, 2020 at 12:15:45PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> bs is linked in req, so no needs to pass it separately. Most of
> tracked-requests API doesn't have bs argument. Actually, after this
> patch only tracked_request_begin has it, but it's for purpose.
> 
> While being here, also add a comment about what "_locked" is.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  block/io.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v3 08/12] iotests.py: add verify_o_direct helper
  2020-08-17  9:15 ` [PATCH v3 08/12] iotests.py: add verify_o_direct helper Vladimir Sementsov-Ogievskiy
@ 2020-08-19 14:59   ` Stefan Hajnoczi
  0 siblings, 0 replies; 31+ messages in thread
From: Stefan Hajnoczi @ 2020-08-19 14:59 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: fam, kwolf, qemu-block, armbru, qemu-devel, mreitz, den

[-- Attachment #1: Type: text/plain, Size: 409 bytes --]

On Mon, Aug 17, 2020 at 12:15:49PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> Add python notrun-helper similar to _check_o_direct for bash tests.
> To be used in the following commit.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  tests/qemu-iotests/iotests.py | 6 ++++++
>  1 file changed, 6 insertions(+)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v3 09/12] iotests.py: add filter_img_check
  2020-08-17  9:15 ` [PATCH v3 09/12] iotests.py: add filter_img_check Vladimir Sementsov-Ogievskiy
@ 2020-08-19 14:59   ` Stefan Hajnoczi
  0 siblings, 0 replies; 31+ messages in thread
From: Stefan Hajnoczi @ 2020-08-19 14:59 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: fam, kwolf, qemu-block, armbru, qemu-devel, mreitz, den

[-- Attachment #1: Type: text/plain, Size: 363 bytes --]

On Mon, Aug 17, 2020 at 12:15:50PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> Add analog of bash _filter_qemu_img_check to python framework.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  tests/qemu-iotests/iotests.py | 4 ++++
>  1 file changed, 4 insertions(+)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v3 10/12] iotests: add 298 to test new preallocate filter driver
  2020-08-17  9:15 ` [PATCH v3 10/12] iotests: add 298 to test new preallocate filter driver Vladimir Sementsov-Ogievskiy
@ 2020-08-19 14:59   ` Stefan Hajnoczi
  0 siblings, 0 replies; 31+ messages in thread
From: Stefan Hajnoczi @ 2020-08-19 14:59 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: fam, kwolf, qemu-block, armbru, qemu-devel, mreitz, den

[-- Attachment #1: Type: text/plain, Size: 492 bytes --]

On Mon, Aug 17, 2020 at 12:15:51PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  tests/qemu-iotests/298     | 50 ++++++++++++++++++++++++++++++++++++++
>  tests/qemu-iotests/298.out |  6 +++++
>  tests/qemu-iotests/group   |  1 +
>  3 files changed, 57 insertions(+)
>  create mode 100644 tests/qemu-iotests/298
>  create mode 100644 tests/qemu-iotests/298.out

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v3 12/12] block/qcow2: automatically insert preallocate filter when on FUSE
  2020-08-17  9:15 ` [PATCH v3 12/12] block/qcow2: automatically insert preallocate filter when on FUSE Vladimir Sementsov-Ogievskiy
@ 2020-08-19 15:15   ` Stefan Hajnoczi
  2020-08-20  9:46     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 31+ messages in thread
From: Stefan Hajnoczi @ 2020-08-19 15:15 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: fam, kwolf, qemu-block, armbru, qemu-devel, mreitz, den

[-- Attachment #1: Type: text/plain, Size: 966 bytes --]

On Mon, Aug 17, 2020 at 12:15:53PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> vstorage has slow allocation, so this patch detect vstorage
> (I hope, we don't use other FUSE filesystems) and inserts preallocate
> filter between qcow2 node and its file child.
> 
> The following test executes more than 10 times faster
> (43.2s -> 3.9s for me) with this patch. (/newssd3 is mount point of
> vstorage, both cs and mds are on same ssd local ssd disk)
> 
>     IMG=/newssd3/z
>     FILE_OPTS=file.filename=$IMG
>     COUNT=15000
>     CHUNK=64K
>     CLUSTER=1M
>     rm -f $IMG
>     ./qemu-img create -f qcow2 -o cluster_size=$CLUSTER $IMG 1G
>     ./qemu-img bench -c $COUNT -d 1 -s $CHUNK -w -t none -f qcow2 $IMG

Kevin's input is needed here. I think the philosophy is that nodes are
not automatically inserted. The user should define the graph explicitly.
The management tool would be responsible for configuring a preallocate
filter node.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v3 02/12] block/io.c: drop assertion on double waiting for request serialisation
  2020-08-17  9:15 ` [PATCH v3 02/12] block/io.c: drop assertion on double waiting for request serialisation Vladimir Sementsov-Ogievskiy
@ 2020-08-19 15:28   ` Stefan Hajnoczi
  2020-08-19 15:48     ` Paolo Bonzini
  0 siblings, 1 reply; 31+ messages in thread
From: Stefan Hajnoczi @ 2020-08-19 15:28 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: fam, kwolf, qemu-block, armbru, qemu-devel, mreitz, Paolo Bonzini, den

[-- Attachment #1: Type: text/plain, Size: 2092 bytes --]

On Mon, Aug 17, 2020 at 12:15:43PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> The comments states, that on misaligned request we should have already
> been waiting. But for bdrv_padding_rmw_read, we called
> bdrv_mark_request_serialising with align = request_alignment, and now
> we serialise with align = cluster_size. So we may have to wait again
> with larger alignment.
> 
> Note, that the only user of BDRV_REQ_SERIALISING is backup which issues
> cluster-aligned requests, so seems the assertion should not fire for
> now. But it's wrong anyway.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  block/io.c | 11 +----------
>  1 file changed, 1 insertion(+), 10 deletions(-)

This code was added by Paolo, CCing him.

> diff --git a/block/io.c b/block/io.c
> index ad3a51ed53..b18680a842 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -1881,7 +1881,6 @@ bdrv_co_write_req_prepare(BdrvChild *child, int64_t offset, uint64_t bytes,
>                            BdrvTrackedRequest *req, int flags)
>  {
>      BlockDriverState *bs = child->bs;
> -    bool waited;
>      int64_t end_sector = DIV_ROUND_UP(offset + bytes, BDRV_SECTOR_SIZE);
>  
>      if (bs->read_only) {
> @@ -1893,15 +1892,7 @@ bdrv_co_write_req_prepare(BdrvChild *child, int64_t offset, uint64_t bytes,
>      assert(!(flags & ~BDRV_REQ_MASK));
>  
>      if (flags & BDRV_REQ_SERIALISING) {
> -        waited = bdrv_mark_request_serialising(req, bdrv_get_cluster_size(bs));
> -        /*
> -         * For a misaligned request we should have already waited earlier,
> -         * because we come after bdrv_padding_rmw_read which must be called
> -         * with the request already marked as serialising.
> -         */
> -        assert(!waited ||
> -               (req->offset == req->overlap_offset &&
> -                req->bytes == req->overlap_bytes));
> +        bdrv_mark_request_serialising(req, bdrv_get_cluster_size(bs));
>      } else {
>          bdrv_wait_serialising_requests(req);
>      }
> -- 
> 2.18.0
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v3 00/12] preallocate filter
  2020-08-17  9:15 [PATCH v3 00/12] preallocate filter Vladimir Sementsov-Ogievskiy
                   ` (13 preceding siblings ...)
  2020-08-17  9:48 ` no-reply
@ 2020-08-19 15:29 ` Stefan Hajnoczi
  14 siblings, 0 replies; 31+ messages in thread
From: Stefan Hajnoczi @ 2020-08-19 15:29 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: fam, kwolf, qemu-block, armbru, qemu-devel, mreitz, den

[-- Attachment #1: Type: text/plain, Size: 199 bytes --]

On Mon, Aug 17, 2020 at 12:15:41PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> Here is a filter, which does preallocation on write.

Looks quite close to being merged. I have left comments.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v3 02/12] block/io.c: drop assertion on double waiting for request serialisation
  2020-08-19 15:28   ` Stefan Hajnoczi
@ 2020-08-19 15:48     ` Paolo Bonzini
  0 siblings, 0 replies; 31+ messages in thread
From: Paolo Bonzini @ 2020-08-19 15:48 UTC (permalink / raw)
  To: Stefan Hajnoczi, Vladimir Sementsov-Ogievskiy
  Cc: fam, kwolf, qemu-block, armbru, qemu-devel, mreitz, den


[-- Attachment #1.1: Type: text/plain, Size: 2361 bytes --]

On 19/08/20 17:28, Stefan Hajnoczi wrote:
> On Mon, Aug 17, 2020 at 12:15:43PM +0300, Vladimir Sementsov-Ogievskiy wrote:
>> The comments states, that on misaligned request we should have already
>> been waiting. But for bdrv_padding_rmw_read, we called
>> bdrv_mark_request_serialising with align = request_alignment, and now
>> we serialise with align = cluster_size. So we may have to wait again
>> with larger alignment.
>>
>> Note, that the only user of BDRV_REQ_SERIALISING is backup which issues
>> cluster-aligned requests, so seems the assertion should not fire for
>> now. But it's wrong anyway.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>  block/io.c | 11 +----------
>>  1 file changed, 1 insertion(+), 10 deletions(-)
> 
> This code was added by Paolo, CCing him.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>

bdrv_mark_request_serialising already calls
bdrv_wait_serialising_requests_locked so it is okay.

Thanks,

Paolo

>> diff --git a/block/io.c b/block/io.c
>> index ad3a51ed53..b18680a842 100644
>> --- a/block/io.c
>> +++ b/block/io.c
>> @@ -1881,7 +1881,6 @@ bdrv_co_write_req_prepare(BdrvChild *child, int64_t offset, uint64_t bytes,
>>                            BdrvTrackedRequest *req, int flags)
>>  {
>>      BlockDriverState *bs = child->bs;
>> -    bool waited;
>>      int64_t end_sector = DIV_ROUND_UP(offset + bytes, BDRV_SECTOR_SIZE);
>>  
>>      if (bs->read_only) {
>> @@ -1893,15 +1892,7 @@ bdrv_co_write_req_prepare(BdrvChild *child, int64_t offset, uint64_t bytes,
>>      assert(!(flags & ~BDRV_REQ_MASK));
>>  
>>      if (flags & BDRV_REQ_SERIALISING) {
>> -        waited = bdrv_mark_request_serialising(req, bdrv_get_cluster_size(bs));
>> -        /*
>> -         * For a misaligned request we should have already waited earlier,
>> -         * because we come after bdrv_padding_rmw_read which must be called
>> -         * with the request already marked as serialising.
>> -         */
>> -        assert(!waited ||
>> -               (req->offset == req->overlap_offset &&
>> -                req->bytes == req->overlap_bytes));
>> +        bdrv_mark_request_serialising(req, bdrv_get_cluster_size(bs));
>>      } else {
>>          bdrv_wait_serialising_requests(req);
>>      }
>> -- 
>> 2.18.0
>>



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v3 12/12] block/qcow2: automatically insert preallocate filter when on FUSE
  2020-08-19 15:15   ` Stefan Hajnoczi
@ 2020-08-20  9:46     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 31+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-08-20  9:46 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: fam, kwolf, qemu-block, armbru, qemu-devel, mreitz, den

19.08.2020 18:15, Stefan Hajnoczi wrote:
> On Mon, Aug 17, 2020 at 12:15:53PM +0300, Vladimir Sementsov-Ogievskiy wrote:
>> vstorage has slow allocation, so this patch detect vstorage
>> (I hope, we don't use other FUSE filesystems) and inserts preallocate
>> filter between qcow2 node and its file child.
>>
>> The following test executes more than 10 times faster
>> (43.2s -> 3.9s for me) with this patch. (/newssd3 is mount point of
>> vstorage, both cs and mds are on same ssd local ssd disk)
>>
>>      IMG=/newssd3/z
>>      FILE_OPTS=file.filename=$IMG
>>      COUNT=15000
>>      CHUNK=64K
>>      CLUSTER=1M
>>      rm -f $IMG
>>      ./qemu-img create -f qcow2 -o cluster_size=$CLUSTER $IMG 1G
>>      ./qemu-img bench -c $COUNT -d 1 -s $CHUNK -w -t none -f qcow2 $IMG
> 
> Kevin's input is needed here. I think the philosophy is that nodes are
> not automatically inserted. The user should define the graph explicitly.
> The management tool would be responsible for configuring a preallocate
> filter node.
> 

This patch originally is not intended to be merged upstream, only for downstream.
I post it just to possibly get some ideas, could it be somehow useful for others.
(I'm not sure, that all FUSE filesystems needs this filter. But vstorage needs)

Hmm, about automatically inserted nodes: why not? block jobs insert their filters
automatically.. Anyway, for our downstream process the simplest thing is to
insert it automatically in Qemu.



-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2020-08-20  9:47 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-17  9:15 [PATCH v3 00/12] preallocate filter Vladimir Sementsov-Ogievskiy
2020-08-17  9:15 ` [PATCH v3 01/12] block: simplify comment to BDRV_REQ_SERIALISING Vladimir Sementsov-Ogievskiy
2020-08-19 14:59   ` Stefan Hajnoczi
2020-08-17  9:15 ` [PATCH v3 02/12] block/io.c: drop assertion on double waiting for request serialisation Vladimir Sementsov-Ogievskiy
2020-08-19 15:28   ` Stefan Hajnoczi
2020-08-19 15:48     ` Paolo Bonzini
2020-08-17  9:15 ` [PATCH v3 03/12] block/io: split out bdrv_find_conflicting_request Vladimir Sementsov-Ogievskiy
2020-08-19 14:59   ` Stefan Hajnoczi
2020-08-17  9:15 ` [PATCH v3 04/12] block/io: bdrv_wait_serialising_requests_locked: drop extra bs arg Vladimir Sementsov-Ogievskiy
2020-08-19 14:59   ` Stefan Hajnoczi
2020-08-17  9:15 ` [PATCH v3 05/12] block: bdrv_mark_request_serialising: split non-waiting function Vladimir Sementsov-Ogievskiy
2020-08-19 14:40   ` Stefan Hajnoczi
2020-08-17  9:15 ` [PATCH v3 06/12] block: introduce BDRV_REQ_NO_WAIT flag Vladimir Sementsov-Ogievskiy
2020-08-19 14:42   ` Stefan Hajnoczi
2020-08-17  9:15 ` [PATCH v3 07/12] block: introduce preallocate filter Vladimir Sementsov-Ogievskiy
2020-08-19 14:57   ` Stefan Hajnoczi
2020-08-17  9:15 ` [PATCH v3 08/12] iotests.py: add verify_o_direct helper Vladimir Sementsov-Ogievskiy
2020-08-19 14:59   ` Stefan Hajnoczi
2020-08-17  9:15 ` [PATCH v3 09/12] iotests.py: add filter_img_check Vladimir Sementsov-Ogievskiy
2020-08-19 14:59   ` Stefan Hajnoczi
2020-08-17  9:15 ` [PATCH v3 10/12] iotests: add 298 to test new preallocate filter driver Vladimir Sementsov-Ogievskiy
2020-08-19 14:59   ` Stefan Hajnoczi
2020-08-17  9:15 ` [PATCH v3 11/12] block: add bdrv_is_file_on_fuse helper Vladimir Sementsov-Ogievskiy
2020-08-17  9:15 ` [PATCH v3 12/12] block/qcow2: automatically insert preallocate filter when on FUSE Vladimir Sementsov-Ogievskiy
2020-08-19 15:15   ` Stefan Hajnoczi
2020-08-20  9:46     ` Vladimir Sementsov-Ogievskiy
2020-08-17  9:45 ` [PATCH v3 00/12] preallocate filter no-reply
2020-08-17 10:44   ` Vladimir Sementsov-Ogievskiy
2020-08-17  9:48 ` no-reply
2020-08-17 10:46   ` Vladimir Sementsov-Ogievskiy
2020-08-19 15:29 ` Stefan Hajnoczi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).