All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v2 0/8] qcow2: encryption threads
@ 2018-12-11 16:43 Vladimir Sementsov-Ogievskiy
  2018-12-11 16:43 ` [Qemu-devel] [PATCH v2 1/8] qcow2.h: add missing include Vladimir Sementsov-Ogievskiy
                   ` (9 more replies)
  0 siblings, 10 replies; 17+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2018-12-11 16:43 UTC (permalink / raw)
  To: qemu-devel, qemu-block; +Cc: mreitz, kwolf, berrange, berto, vsementsov, den

Hi all!

The series brings threads to qcow2 encryption/decryption path,
like it is already done for compression.

v2: - multiple cipher inside QCryptoBlock instead of multiple
      blocks inside qcow2, as suggested by Daniel, and it is
      done in separate series
    - use threaded encryption in do_perform_cow_encrypt() too
    - some renaming and refactoring and simplifications
(Sorry for not being very careful about change list, but v1 isn't
 actually reviewed, as approach with multiple blocks was early
 rejected by Denial)

The series now based on two queued for 4.0 series, which, in
turn, may be applied in any order: "crypto threads" (Daniel's
tree), and  "qcow2 decompress in threads", which is now in
Kevin's block-next.

Based-on: <20181207161351.4380-1-vsementsov@virtuozzo.com>
([PATCH v3 0/5] crypto threads)
Based-on: git://repo.or.cz/qemu/kevin.git block-next
(decompress in threads inside)

Final performance gain is illustrated by the following test:
 (note, that in v2 I've dropped iter-time=10, pointed by Daniel)

]# cat test.sh 
#!/bin/bash

size=1G
src=/ssd/src.raw
dst=/ssd/dst.enc.qcow2

# create source for tests
./qemu-img create -f raw "$src" $size > /dev/null
./qemu-io -f raw -c "write -P 0xa 0 $size" "$src" > /dev/null

for w in "" "-W"; do
    echo -e "Test with additional paramter for qemu-img: '$w'"

    # create target
    ./qemu-img create -f qcow2 --object secret,id=sec0,data=test -o encrypt.format=luks,encrypt.key-secret=sec0 "$dst" $size > /dev/null

    time ./qemu-img convert $w -f raw --object secret,id=sec0,data=test --target-image-opts -n "$src" "driver=qcow2,file.filename=$dst,encrypt.key-secret=sec0"
    echo
done


before crypto threads series:
Test with additional paramter for qemu-img: ''

real    0m14.224s
user    0m13.559s
sys     0m0.860s

Test with additional paramter for qemu-img: '-W'

real    0m14.002s
user    0m13.562s
sys     0m1.187s


after crypto threads series:
Test with additional paramter for qemu-img: ''

real    0m14.307s
user    0m13.646s
sys     0m0.859s

Test with additional paramter for qemu-img: '-W'

real    0m14.452s
user    0m13.699s
sys     0m1.112s


and after these series:
Test with additional paramter for qemu-img: ''

real    0m14.367s
user    0m13.722s
sys     0m0.829s

Test with additional paramter for qemu-img: '-W'

real    0m5.641s
user    0m15.692s
sys     0m1.207s

Vladimir Sementsov-Ogievskiy (8):
  qcow2.h: add missing include
  qcow2: add separate file for threaded data processing functions
  qcow2-threads: use thread_pool_submit_co
  qcow2-threads: split out generic path
  qcow2: qcow2_co_preadv: improve locking
  qcow2: qcow2_co_preadv: skip using hd_qiov when possible
  qcow2: bdrv_co_pwritev: move encryption code out of the lock
  qcow2: do encryption in threads

 block/qcow2.h         |  20 ++-
 block/qcow2-cluster.c |   7 +-
 block/qcow2-threads.c | 260 +++++++++++++++++++++++++++++++++++++++
 block/qcow2.c         | 276 ++++++++----------------------------------
 block/Makefile.objs   |   2 +-
 5 files changed, 335 insertions(+), 230 deletions(-)
 create mode 100644 block/qcow2-threads.c

-- 
2.18.0

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH v2 1/8] qcow2.h: add missing include
  2018-12-11 16:43 [Qemu-devel] [PATCH v2 0/8] qcow2: encryption threads Vladimir Sementsov-Ogievskiy
@ 2018-12-11 16:43 ` Vladimir Sementsov-Ogievskiy
  2018-12-13 10:18   ` Alberto Garcia
  2018-12-11 16:43 ` [Qemu-devel] [PATCH v2 2/8] qcow2: add separate file for threaded data processing functions Vladimir Sementsov-Ogievskiy
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 17+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2018-12-11 16:43 UTC (permalink / raw)
  To: qemu-devel, qemu-block; +Cc: mreitz, kwolf, berrange, berto, vsementsov, den

qcow2.h depends on block_int.h. Compilation isn't broken currently only
due to block_int.h always included before qcow2.h. Though, it seems
better to directly include block_int.h in qcow2.h.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/qcow2.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/block/qcow2.h b/block/qcow2.h
index a98d24500b..5095f893a0 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -28,6 +28,7 @@
 #include "crypto/block.h"
 #include "qemu/coroutine.h"
 #include "qemu/units.h"
+#include "block/block_int.h"
 
 //#define DEBUG_ALLOC
 //#define DEBUG_ALLOC2
-- 
2.18.0

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH v2 2/8] qcow2: add separate file for threaded data processing functions
  2018-12-11 16:43 [Qemu-devel] [PATCH v2 0/8] qcow2: encryption threads Vladimir Sementsov-Ogievskiy
  2018-12-11 16:43 ` [Qemu-devel] [PATCH v2 1/8] qcow2.h: add missing include Vladimir Sementsov-Ogievskiy
@ 2018-12-11 16:43 ` Vladimir Sementsov-Ogievskiy
  2019-01-03 15:11   ` Alberto Garcia
  2018-12-11 16:43 ` [Qemu-devel] [PATCH v2 3/8] qcow2-threads: use thread_pool_submit_co Vladimir Sementsov-Ogievskiy
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 17+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2018-12-11 16:43 UTC (permalink / raw)
  To: qemu-devel, qemu-block; +Cc: mreitz, kwolf, berrange, berto, vsementsov, den

Move compression-on-threads to separate file. Encryption will be in it
too.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/qcow2.h         |   7 ++
 block/qcow2-threads.c | 197 ++++++++++++++++++++++++++++++++++++++++++
 block/qcow2.c         | 169 ------------------------------------
 block/Makefile.objs   |   2 +-
 4 files changed, 205 insertions(+), 170 deletions(-)
 create mode 100644 block/qcow2-threads.c

diff --git a/block/qcow2.h b/block/qcow2.h
index 5095f893a0..be84d7c96a 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -695,4 +695,11 @@ void qcow2_remove_persistent_dirty_bitmap(BlockDriverState *bs,
                                           const char *name,
                                           Error **errp);
 
+ssize_t coroutine_fn
+qcow2_co_compress(BlockDriverState *bs, void *dest, size_t dest_size,
+                  const void *src, size_t src_size);
+ssize_t coroutine_fn
+qcow2_co_decompress(BlockDriverState *bs, void *dest, size_t dest_size,
+                    const void *src, size_t src_size);
+
 #endif
diff --git a/block/qcow2-threads.c b/block/qcow2-threads.c
new file mode 100644
index 0000000000..b75fad6e07
--- /dev/null
+++ b/block/qcow2-threads.c
@@ -0,0 +1,197 @@
+/*
+ * Threaded data processing for Qcow2: compression, encryption
+ *
+ * Copyright (c) 2004-2006 Fabrice Bellard
+ * Copyright (c) 2018 Virtuozzo International GmbH. All rights reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "qemu/osdep.h"
+
+#define ZLIB_CONST
+#include <zlib.h>
+
+#include "qcow2.h"
+#include "block/thread-pool.h"
+
+#define MAX_COMPRESS_THREADS 4
+
+typedef ssize_t (*Qcow2CompressFunc)(void *dest, size_t dest_size,
+                                     const void *src, size_t src_size);
+typedef struct Qcow2CompressData {
+    void *dest;
+    size_t dest_size;
+    const void *src;
+    size_t src_size;
+    ssize_t ret;
+
+    Qcow2CompressFunc func;
+} Qcow2CompressData;
+
+/*
+ * qcow2_compress()
+ *
+ * @dest - destination buffer, @dest_size bytes
+ * @src - source buffer, @src_size bytes
+ *
+ * Returns: compressed size on success
+ *          -1 destination buffer is not enough to store compressed data
+ *          -2 on any other error
+ */
+static ssize_t qcow2_compress(void *dest, size_t dest_size,
+                              const void *src, size_t src_size)
+{
+    ssize_t ret;
+    z_stream strm;
+
+    /* best compression, small window, no zlib header */
+    memset(&strm, 0, sizeof(strm));
+    ret = deflateInit2(&strm, Z_DEFAULT_COMPRESSION, Z_DEFLATED,
+                       -12, 9, Z_DEFAULT_STRATEGY);
+    if (ret != Z_OK) {
+        return -2;
+    }
+
+    /* strm.next_in is not const in old zlib versions, such as those used on
+     * OpenBSD/NetBSD, so cast the const away */
+    strm.avail_in = src_size;
+    strm.next_in = (void *) src;
+    strm.avail_out = dest_size;
+    strm.next_out = dest;
+
+    ret = deflate(&strm, Z_FINISH);
+    if (ret == Z_STREAM_END) {
+        ret = dest_size - strm.avail_out;
+    } else {
+        ret = (ret == Z_OK ? -1 : -2);
+    }
+
+    deflateEnd(&strm);
+
+    return ret;
+}
+
+/*
+ * qcow2_decompress()
+ *
+ * Decompress some data (not more than @src_size bytes) to produce exactly
+ * @dest_size bytes.
+ *
+ * @dest - destination buffer, @dest_size bytes
+ * @src - source buffer, @src_size bytes
+ *
+ * Returns: 0 on success
+ *          -1 on fail
+ */
+static ssize_t qcow2_decompress(void *dest, size_t dest_size,
+                                const void *src, size_t src_size)
+{
+    int ret = 0;
+    z_stream strm;
+
+    memset(&strm, 0, sizeof(strm));
+    strm.avail_in = src_size;
+    strm.next_in = (void *) src;
+    strm.avail_out = dest_size;
+    strm.next_out = dest;
+
+    ret = inflateInit2(&strm, -12);
+    if (ret != Z_OK) {
+        return -1;
+    }
+
+    ret = inflate(&strm, Z_FINISH);
+    if ((ret != Z_STREAM_END && ret != Z_BUF_ERROR) || strm.avail_out != 0) {
+        /* We approve Z_BUF_ERROR because we need @dest buffer to be filled, but
+         * @src buffer may be processed partly (because in qcow2 we know size of
+         * compressed data with precision of one sector) */
+        ret = -1;
+    }
+
+    inflateEnd(&strm);
+
+    return ret;
+}
+
+static int qcow2_compress_pool_func(void *opaque)
+{
+    Qcow2CompressData *data = opaque;
+
+    data->ret = data->func(data->dest, data->dest_size,
+                           data->src, data->src_size);
+
+    return 0;
+}
+
+static void qcow2_compress_complete(void *opaque, int ret)
+{
+    qemu_coroutine_enter(opaque);
+}
+
+static ssize_t coroutine_fn
+qcow2_co_do_compress(BlockDriverState *bs, void *dest, size_t dest_size,
+                     const void *src, size_t src_size, Qcow2CompressFunc func)
+{
+    BDRVQcow2State *s = bs->opaque;
+    BlockAIOCB *acb;
+    ThreadPool *pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
+    Qcow2CompressData arg = {
+        .dest = dest,
+        .dest_size = dest_size,
+        .src = src,
+        .src_size = src_size,
+        .func = func,
+    };
+
+    while (s->nb_compress_threads >= MAX_COMPRESS_THREADS) {
+        qemu_co_queue_wait(&s->compress_wait_queue, NULL);
+    }
+
+    s->nb_compress_threads++;
+    acb = thread_pool_submit_aio(pool, qcow2_compress_pool_func, &arg,
+                                 qcow2_compress_complete,
+                                 qemu_coroutine_self());
+
+    if (!acb) {
+        s->nb_compress_threads--;
+        return -EINVAL;
+    }
+    qemu_coroutine_yield();
+    s->nb_compress_threads--;
+    qemu_co_queue_next(&s->compress_wait_queue);
+
+    return arg.ret;
+}
+
+ssize_t coroutine_fn
+qcow2_co_compress(BlockDriverState *bs, void *dest, size_t dest_size,
+                  const void *src, size_t src_size)
+{
+    return qcow2_co_do_compress(bs, dest, dest_size, src, src_size,
+                                qcow2_compress);
+}
+
+ssize_t coroutine_fn
+qcow2_co_decompress(BlockDriverState *bs, void *dest, size_t dest_size,
+                    const void *src, size_t src_size)
+{
+    return qcow2_co_do_compress(bs, dest, dest_size, src, src_size,
+                                qcow2_decompress);
+}
diff --git a/block/qcow2.c b/block/qcow2.c
index 4897abae5e..e61dc54fd0 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -24,9 +24,6 @@
 
 #include "qemu/osdep.h"
 
-#define ZLIB_CONST
-#include <zlib.h>
-
 #include "block/block_int.h"
 #include "block/qdict.h"
 #include "sysemu/block-backend.h"
@@ -44,7 +41,6 @@
 #include "qapi/qobject-input-visitor.h"
 #include "qapi/qapi-visit-block-core.h"
 #include "crypto.h"
-#include "block/thread-pool.h"
 
 /*
   Differences with QCOW:
@@ -3720,171 +3716,6 @@ fail:
     return ret;
 }
 
-/*
- * qcow2_compress()
- *
- * @dest - destination buffer, @dest_size bytes
- * @src - source buffer, @src_size bytes
- *
- * Returns: compressed size on success
- *          -1 destination buffer is not enough to store compressed data
- *          -2 on any other error
- */
-static ssize_t qcow2_compress(void *dest, size_t dest_size,
-                              const void *src, size_t src_size)
-{
-    ssize_t ret;
-    z_stream strm;
-
-    /* best compression, small window, no zlib header */
-    memset(&strm, 0, sizeof(strm));
-    ret = deflateInit2(&strm, Z_DEFAULT_COMPRESSION, Z_DEFLATED,
-                       -12, 9, Z_DEFAULT_STRATEGY);
-    if (ret != Z_OK) {
-        return -2;
-    }
-
-    /* strm.next_in is not const in old zlib versions, such as those used on
-     * OpenBSD/NetBSD, so cast the const away */
-    strm.avail_in = src_size;
-    strm.next_in = (void *) src;
-    strm.avail_out = dest_size;
-    strm.next_out = dest;
-
-    ret = deflate(&strm, Z_FINISH);
-    if (ret == Z_STREAM_END) {
-        ret = dest_size - strm.avail_out;
-    } else {
-        ret = (ret == Z_OK ? -1 : -2);
-    }
-
-    deflateEnd(&strm);
-
-    return ret;
-}
-
-/*
- * qcow2_decompress()
- *
- * Decompress some data (not more than @src_size bytes) to produce exactly
- * @dest_size bytes.
- *
- * @dest - destination buffer, @dest_size bytes
- * @src - source buffer, @src_size bytes
- *
- * Returns: 0 on success
- *          -1 on fail
- */
-static ssize_t qcow2_decompress(void *dest, size_t dest_size,
-                                const void *src, size_t src_size)
-{
-    int ret = 0;
-    z_stream strm;
-
-    memset(&strm, 0, sizeof(strm));
-    strm.avail_in = src_size;
-    strm.next_in = (void *) src;
-    strm.avail_out = dest_size;
-    strm.next_out = dest;
-
-    ret = inflateInit2(&strm, -12);
-    if (ret != Z_OK) {
-        return -1;
-    }
-
-    ret = inflate(&strm, Z_FINISH);
-    if ((ret != Z_STREAM_END && ret != Z_BUF_ERROR) || strm.avail_out != 0) {
-        /* We approve Z_BUF_ERROR because we need @dest buffer to be filled, but
-         * @src buffer may be processed partly (because in qcow2 we know size of
-         * compressed data with precision of one sector) */
-        ret = -1;
-    }
-
-    inflateEnd(&strm);
-
-    return ret;
-}
-
-#define MAX_COMPRESS_THREADS 4
-
-typedef ssize_t (*Qcow2CompressFunc)(void *dest, size_t dest_size,
-                                     const void *src, size_t src_size);
-typedef struct Qcow2CompressData {
-    void *dest;
-    size_t dest_size;
-    const void *src;
-    size_t src_size;
-    ssize_t ret;
-
-    Qcow2CompressFunc func;
-} Qcow2CompressData;
-
-static int qcow2_compress_pool_func(void *opaque)
-{
-    Qcow2CompressData *data = opaque;
-
-    data->ret = data->func(data->dest, data->dest_size,
-                           data->src, data->src_size);
-
-    return 0;
-}
-
-static void qcow2_compress_complete(void *opaque, int ret)
-{
-    qemu_coroutine_enter(opaque);
-}
-
-static ssize_t coroutine_fn
-qcow2_co_do_compress(BlockDriverState *bs, void *dest, size_t dest_size,
-                     const void *src, size_t src_size, Qcow2CompressFunc func)
-{
-    BDRVQcow2State *s = bs->opaque;
-    BlockAIOCB *acb;
-    ThreadPool *pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
-    Qcow2CompressData arg = {
-        .dest = dest,
-        .dest_size = dest_size,
-        .src = src,
-        .src_size = src_size,
-        .func = func,
-    };
-
-    while (s->nb_compress_threads >= MAX_COMPRESS_THREADS) {
-        qemu_co_queue_wait(&s->compress_wait_queue, NULL);
-    }
-
-    s->nb_compress_threads++;
-    acb = thread_pool_submit_aio(pool, qcow2_compress_pool_func, &arg,
-                                 qcow2_compress_complete,
-                                 qemu_coroutine_self());
-
-    if (!acb) {
-        s->nb_compress_threads--;
-        return -EINVAL;
-    }
-    qemu_coroutine_yield();
-    s->nb_compress_threads--;
-    qemu_co_queue_next(&s->compress_wait_queue);
-
-    return arg.ret;
-}
-
-static ssize_t coroutine_fn
-qcow2_co_compress(BlockDriverState *bs, void *dest, size_t dest_size,
-                  const void *src, size_t src_size)
-{
-    return qcow2_co_do_compress(bs, dest, dest_size, src, src_size,
-                                qcow2_compress);
-}
-
-static ssize_t coroutine_fn
-qcow2_co_decompress(BlockDriverState *bs, void *dest, size_t dest_size,
-                    const void *src, size_t src_size)
-{
-    return qcow2_co_do_compress(bs, dest, dest_size, src, src_size,
-                                qcow2_decompress);
-}
-
 /* XXX: put compressed sectors first, then all the cluster aligned
    tables to avoid losing bytes in alignment */
 static coroutine_fn int
diff --git a/block/Makefile.objs b/block/Makefile.objs
index 7a81892a52..ae11605c9f 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -6,7 +6,7 @@ block-obj-$(CONFIG_BOCHS) += bochs.o
 block-obj-$(CONFIG_VVFAT) += vvfat.o
 block-obj-$(CONFIG_DMG) += dmg.o
 
-block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o qcow2-cache.o qcow2-bitmap.o
+block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o qcow2-cache.o qcow2-bitmap.o qcow2-threads.o
 block-obj-$(CONFIG_QED) += qed.o qed-l2-cache.o qed-table.o qed-cluster.o
 block-obj-$(CONFIG_QED) += qed-check.o
 block-obj-y += vhdx.o vhdx-endian.o vhdx-log.o
-- 
2.18.0

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH v2 3/8] qcow2-threads: use thread_pool_submit_co
  2018-12-11 16:43 [Qemu-devel] [PATCH v2 0/8] qcow2: encryption threads Vladimir Sementsov-Ogievskiy
  2018-12-11 16:43 ` [Qemu-devel] [PATCH v2 1/8] qcow2.h: add missing include Vladimir Sementsov-Ogievskiy
  2018-12-11 16:43 ` [Qemu-devel] [PATCH v2 2/8] qcow2: add separate file for threaded data processing functions Vladimir Sementsov-Ogievskiy
@ 2018-12-11 16:43 ` Vladimir Sementsov-Ogievskiy
  2019-01-03 15:28   ` Alberto Garcia
  2018-12-11 16:43 ` [Qemu-devel] [PATCH v2 4/8] qcow2-threads: split out generic path Vladimir Sementsov-Ogievskiy
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 17+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2018-12-11 16:43 UTC (permalink / raw)
  To: qemu-devel, qemu-block; +Cc: mreitz, kwolf, berrange, berto, vsementsov, den

Use thread_pool_submit_co, instead of reinventing it here. Note, that
thread_pool_submit_aio() never returns an error, so checking it was an
extra thing.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/qcow2-threads.c | 17 ++---------------
 1 file changed, 2 insertions(+), 15 deletions(-)

diff --git a/block/qcow2-threads.c b/block/qcow2-threads.c
index b75fad6e07..8c10191f2f 100644
--- a/block/qcow2-threads.c
+++ b/block/qcow2-threads.c
@@ -140,17 +140,11 @@ static int qcow2_compress_pool_func(void *opaque)
     return 0;
 }
 
-static void qcow2_compress_complete(void *opaque, int ret)
-{
-    qemu_coroutine_enter(opaque);
-}
-
 static ssize_t coroutine_fn
 qcow2_co_do_compress(BlockDriverState *bs, void *dest, size_t dest_size,
                      const void *src, size_t src_size, Qcow2CompressFunc func)
 {
     BDRVQcow2State *s = bs->opaque;
-    BlockAIOCB *acb;
     ThreadPool *pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
     Qcow2CompressData arg = {
         .dest = dest,
@@ -165,16 +159,9 @@ qcow2_co_do_compress(BlockDriverState *bs, void *dest, size_t dest_size,
     }
 
     s->nb_compress_threads++;
-    acb = thread_pool_submit_aio(pool, qcow2_compress_pool_func, &arg,
-                                 qcow2_compress_complete,
-                                 qemu_coroutine_self());
-
-    if (!acb) {
-        s->nb_compress_threads--;
-        return -EINVAL;
-    }
-    qemu_coroutine_yield();
+    thread_pool_submit_co(pool, qcow2_compress_pool_func, &arg);
     s->nb_compress_threads--;
+
     qemu_co_queue_next(&s->compress_wait_queue);
 
     return arg.ret;
-- 
2.18.0

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH v2 4/8] qcow2-threads: split out generic path
  2018-12-11 16:43 [Qemu-devel] [PATCH v2 0/8] qcow2: encryption threads Vladimir Sementsov-Ogievskiy
                   ` (2 preceding siblings ...)
  2018-12-11 16:43 ` [Qemu-devel] [PATCH v2 3/8] qcow2-threads: use thread_pool_submit_co Vladimir Sementsov-Ogievskiy
@ 2018-12-11 16:43 ` Vladimir Sementsov-Ogievskiy
  2018-12-13 23:28   ` Paolo Bonzini
  2018-12-11 16:43 ` [Qemu-devel] [PATCH v2 5/8] qcow2: qcow2_co_preadv: improve locking Vladimir Sementsov-Ogievskiy
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 17+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2018-12-11 16:43 UTC (permalink / raw)
  To: qemu-devel, qemu-block; +Cc: mreitz, kwolf, berrange, berto, vsementsov, den

Move generic part out of qcow2_co_do_compress, to reuse it for
encryption and rename things that would be shared with encryption path.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/qcow2.h         |  4 ++--
 block/qcow2-threads.c | 39 +++++++++++++++++++++++++++------------
 block/qcow2.c         |  2 +-
 3 files changed, 30 insertions(+), 15 deletions(-)

diff --git a/block/qcow2.h b/block/qcow2.h
index be84d7c96a..9c2f6749ba 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -336,8 +336,8 @@ typedef struct BDRVQcow2State {
     char *image_backing_file;
     char *image_backing_format;
 
-    CoQueue compress_wait_queue;
-    int nb_compress_threads;
+    CoQueue thread_task_queue;
+    int nb_threads;
 } BDRVQcow2State;
 
 typedef struct Qcow2COWRegion {
diff --git a/block/qcow2-threads.c b/block/qcow2-threads.c
index 8c10191f2f..84b3ede4f1 100644
--- a/block/qcow2-threads.c
+++ b/block/qcow2-threads.c
@@ -31,7 +31,32 @@
 #include "qcow2.h"
 #include "block/thread-pool.h"
 
-#define MAX_COMPRESS_THREADS 4
+#define QCOW2_MAX_THREADS 4
+
+static int coroutine_fn
+qcow2_co_process(BlockDriverState *bs, ThreadPoolFunc *func, void *arg)
+{
+    int ret;
+    BDRVQcow2State *s = bs->opaque;
+    ThreadPool *pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
+
+    while (s->nb_threads >= QCOW2_MAX_THREADS) {
+        qemu_co_queue_wait(&s->thread_task_queue, NULL);
+    }
+
+    s->nb_threads++;
+    ret = thread_pool_submit_co(pool, func, arg);
+    s->nb_threads--;
+
+    qemu_co_queue_next(&s->thread_task_queue);
+
+    return ret;
+}
+
+
+/*
+ * Compression
+ */
 
 typedef ssize_t (*Qcow2CompressFunc)(void *dest, size_t dest_size,
                                      const void *src, size_t src_size);
@@ -144,8 +169,6 @@ static ssize_t coroutine_fn
 qcow2_co_do_compress(BlockDriverState *bs, void *dest, size_t dest_size,
                      const void *src, size_t src_size, Qcow2CompressFunc func)
 {
-    BDRVQcow2State *s = bs->opaque;
-    ThreadPool *pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
     Qcow2CompressData arg = {
         .dest = dest,
         .dest_size = dest_size,
@@ -154,15 +177,7 @@ qcow2_co_do_compress(BlockDriverState *bs, void *dest, size_t dest_size,
         .func = func,
     };
 
-    while (s->nb_compress_threads >= MAX_COMPRESS_THREADS) {
-        qemu_co_queue_wait(&s->compress_wait_queue, NULL);
-    }
-
-    s->nb_compress_threads++;
-    thread_pool_submit_co(pool, qcow2_compress_pool_func, &arg);
-    s->nb_compress_threads--;
-
-    qemu_co_queue_next(&s->compress_wait_queue);
+    qcow2_co_process(bs, qcow2_compress_pool_func, &arg);
 
     return arg.ret;
 }
diff --git a/block/qcow2.c b/block/qcow2.c
index e61dc54fd0..c4b716d4f6 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1600,7 +1600,7 @@ static int coroutine_fn qcow2_do_open(BlockDriverState *bs, QDict *options,
     }
 #endif
 
-    qemu_co_queue_init(&s->compress_wait_queue);
+    qemu_co_queue_init(&s->thread_task_queue);
 
     return ret;
 
-- 
2.18.0

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH v2 5/8] qcow2: qcow2_co_preadv: improve locking
  2018-12-11 16:43 [Qemu-devel] [PATCH v2 0/8] qcow2: encryption threads Vladimir Sementsov-Ogievskiy
                   ` (3 preceding siblings ...)
  2018-12-11 16:43 ` [Qemu-devel] [PATCH v2 4/8] qcow2-threads: split out generic path Vladimir Sementsov-Ogievskiy
@ 2018-12-11 16:43 ` Vladimir Sementsov-Ogievskiy
  2018-12-11 16:43 ` [Qemu-devel] [PATCH v2 6/8] qcow2: qcow2_co_preadv: skip using hd_qiov when possible Vladimir Sementsov-Ogievskiy
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 17+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2018-12-11 16:43 UTC (permalink / raw)
  To: qemu-devel, qemu-block; +Cc: mreitz, kwolf, berrange, berto, vsementsov, den

Background: decryption will be done in threads, to take benefit of it,
we should move it out of the lock first.

But let's go further: it turns out, that for locking around switch
cases we have only two variants: when we just do memset(0) not
releasing the lock (it is useless) and when we actually can handle the
whole case out of the lock. So, refactor the whole thing to reduce
locked code region and make it clean.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/qcow2.c | 48 ++++++++++++++++++++++++------------------------
 1 file changed, 24 insertions(+), 24 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index c4b716d4f6..2a49314e42 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1884,6 +1884,7 @@ static coroutine_fn int qcow2_co_preadv(BlockDriverState *bs, uint64_t offset,
 
         ret = qcow2_get_cluster_offset(bs, offset, &cur_bytes, &cluster_offset);
         if (ret < 0) {
+            qemu_co_mutex_unlock(&s->lock);
             goto fail;
         }
 
@@ -1892,39 +1893,38 @@ static coroutine_fn int qcow2_co_preadv(BlockDriverState *bs, uint64_t offset,
         qemu_iovec_reset(&hd_qiov);
         qemu_iovec_concat(&hd_qiov, qiov, bytes_done, cur_bytes);
 
+        if (ret == QCOW2_CLUSTER_ZERO_PLAIN ||
+            ret == QCOW2_CLUSTER_ZERO_ALLOC ||
+            (ret == QCOW2_CLUSTER_UNALLOCATED && !bs->backing))
+        {
+            /* No sense in releasing the lock */
+
+            qemu_iovec_memset(&hd_qiov, 0, 0, cur_bytes);
+
+            bytes -= cur_bytes;
+            offset += cur_bytes;
+            bytes_done += cur_bytes;
+            continue;
+        }
+
+        qemu_co_mutex_unlock(&s->lock);
+
         switch (ret) {
         case QCOW2_CLUSTER_UNALLOCATED:
-
-            if (bs->backing) {
-                BLKDBG_EVENT(bs->file, BLKDBG_READ_BACKING_AIO);
-                qemu_co_mutex_unlock(&s->lock);
-                ret = bdrv_co_preadv(bs->backing, offset, cur_bytes,
-                                     &hd_qiov, 0);
-                qemu_co_mutex_lock(&s->lock);
-                if (ret < 0) {
-                    goto fail;
-                }
-            } else {
-                /* Note: in this case, no need to wait */
-                qemu_iovec_memset(&hd_qiov, 0, 0, cur_bytes);
+            BLKDBG_EVENT(bs->file, BLKDBG_READ_BACKING_AIO);
+            ret = bdrv_co_preadv(bs->backing, offset, cur_bytes, &hd_qiov, 0);
+            if (ret < 0) {
+                goto fail;
             }
             break;
 
-        case QCOW2_CLUSTER_ZERO_PLAIN:
-        case QCOW2_CLUSTER_ZERO_ALLOC:
-            qemu_iovec_memset(&hd_qiov, 0, 0, cur_bytes);
-            break;
-
         case QCOW2_CLUSTER_COMPRESSED:
-            qemu_co_mutex_unlock(&s->lock);
             ret = qcow2_co_preadv_compressed(bs, cluster_offset,
                                              offset, cur_bytes,
                                              &hd_qiov);
-            qemu_co_mutex_lock(&s->lock);
             if (ret < 0) {
                 goto fail;
             }
-
             break;
 
         case QCOW2_CLUSTER_NORMAL:
@@ -1957,11 +1957,9 @@ static coroutine_fn int qcow2_co_preadv(BlockDriverState *bs, uint64_t offset,
             }
 
             BLKDBG_EVENT(bs->file, BLKDBG_READ_AIO);
-            qemu_co_mutex_unlock(&s->lock);
             ret = bdrv_co_preadv(bs->file,
                                  cluster_offset + offset_in_cluster,
                                  cur_bytes, &hd_qiov, 0);
-            qemu_co_mutex_lock(&s->lock);
             if (ret < 0) {
                 goto fail;
             }
@@ -1992,12 +1990,14 @@ static coroutine_fn int qcow2_co_preadv(BlockDriverState *bs, uint64_t offset,
         bytes -= cur_bytes;
         offset += cur_bytes;
         bytes_done += cur_bytes;
+
+        qemu_co_mutex_lock(&s->lock);
     }
     ret = 0;
 
-fail:
     qemu_co_mutex_unlock(&s->lock);
 
+fail:
     qemu_iovec_destroy(&hd_qiov);
     qemu_vfree(cluster_data);
 
-- 
2.18.0

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH v2 6/8] qcow2: qcow2_co_preadv: skip using hd_qiov when possible
  2018-12-11 16:43 [Qemu-devel] [PATCH v2 0/8] qcow2: encryption threads Vladimir Sementsov-Ogievskiy
                   ` (4 preceding siblings ...)
  2018-12-11 16:43 ` [Qemu-devel] [PATCH v2 5/8] qcow2: qcow2_co_preadv: improve locking Vladimir Sementsov-Ogievskiy
@ 2018-12-11 16:43 ` Vladimir Sementsov-Ogievskiy
  2018-12-11 16:43 ` [Qemu-devel] [PATCH v2 7/8] qcow2: bdrv_co_pwritev: move encryption code out of the lock Vladimir Sementsov-Ogievskiy
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 17+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2018-12-11 16:43 UTC (permalink / raw)
  To: qemu-devel, qemu-block; +Cc: mreitz, kwolf, berrange, berto, vsementsov, den

qemu_iovec_memset has @offset parameter, so using hd_qiov for it is not
needed.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/qcow2.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 2a49314e42..4e217ee918 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1888,18 +1888,13 @@ static coroutine_fn int qcow2_co_preadv(BlockDriverState *bs, uint64_t offset,
             goto fail;
         }
 
-        offset_in_cluster = offset_into_cluster(s, offset);
-
-        qemu_iovec_reset(&hd_qiov);
-        qemu_iovec_concat(&hd_qiov, qiov, bytes_done, cur_bytes);
-
         if (ret == QCOW2_CLUSTER_ZERO_PLAIN ||
             ret == QCOW2_CLUSTER_ZERO_ALLOC ||
             (ret == QCOW2_CLUSTER_UNALLOCATED && !bs->backing))
         {
             /* No sense in releasing the lock */
 
-            qemu_iovec_memset(&hd_qiov, 0, 0, cur_bytes);
+            qemu_iovec_memset(qiov, bytes_done, 0, cur_bytes);
 
             bytes -= cur_bytes;
             offset += cur_bytes;
@@ -1907,6 +1902,11 @@ static coroutine_fn int qcow2_co_preadv(BlockDriverState *bs, uint64_t offset,
             continue;
         }
 
+        offset_in_cluster = offset_into_cluster(s, offset);
+
+        qemu_iovec_reset(&hd_qiov);
+        qemu_iovec_concat(&hd_qiov, qiov, bytes_done, cur_bytes);
+
         qemu_co_mutex_unlock(&s->lock);
 
         switch (ret) {
-- 
2.18.0

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH v2 7/8] qcow2: bdrv_co_pwritev: move encryption code out of the lock
  2018-12-11 16:43 [Qemu-devel] [PATCH v2 0/8] qcow2: encryption threads Vladimir Sementsov-Ogievskiy
                   ` (5 preceding siblings ...)
  2018-12-11 16:43 ` [Qemu-devel] [PATCH v2 6/8] qcow2: qcow2_co_preadv: skip using hd_qiov when possible Vladimir Sementsov-Ogievskiy
@ 2018-12-11 16:43 ` Vladimir Sementsov-Ogievskiy
  2018-12-11 16:43 ` [Qemu-devel] [PATCH v2 8/8] qcow2: do encryption in threads Vladimir Sementsov-Ogievskiy
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 17+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2018-12-11 16:43 UTC (permalink / raw)
  To: qemu-devel, qemu-block; +Cc: mreitz, kwolf, berrange, berto, vsementsov, den

Encryption will be done in threads, to take benefit of it, we should
move it out of the lock first.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/qcow2.c | 35 +++++++++++++++++++++--------------
 1 file changed, 21 insertions(+), 14 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 4e217ee918..941ccfa51a 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2078,11 +2078,20 @@ static coroutine_fn int qcow2_co_pwritev(BlockDriverState *bs, uint64_t offset,
         ret = qcow2_alloc_cluster_offset(bs, offset, &cur_bytes,
                                          &cluster_offset, &l2meta);
         if (ret < 0) {
-            goto fail;
+            goto out_locked;
         }
 
         assert((cluster_offset & 511) == 0);
 
+        ret = qcow2_pre_write_overlap_check(bs, 0,
+                                            cluster_offset + offset_in_cluster,
+                                            cur_bytes);
+        if (ret < 0) {
+            goto out_locked;
+        }
+
+        qemu_co_mutex_unlock(&s->lock);
+
         qemu_iovec_reset(&hd_qiov);
         qemu_iovec_concat(&hd_qiov, qiov, bytes_done, cur_bytes);
 
@@ -2094,7 +2103,7 @@ static coroutine_fn int qcow2_co_pwritev(BlockDriverState *bs, uint64_t offset,
                                                    * s->cluster_size);
                 if (cluster_data == NULL) {
                     ret = -ENOMEM;
-                    goto fail;
+                    goto out_unlocked;
                 }
             }
 
@@ -2109,40 +2118,34 @@ static coroutine_fn int qcow2_co_pwritev(BlockDriverState *bs, uint64_t offset,
                                       cluster_data,
                                       cur_bytes, NULL) < 0) {
                 ret = -EIO;
-                goto fail;
+                goto out_unlocked;
             }
 
             qemu_iovec_reset(&hd_qiov);
             qemu_iovec_add(&hd_qiov, cluster_data, cur_bytes);
         }
 
-        ret = qcow2_pre_write_overlap_check(bs, 0,
-                cluster_offset + offset_in_cluster, cur_bytes);
-        if (ret < 0) {
-            goto fail;
-        }
-
         /* If we need to do COW, check if it's possible to merge the
          * writing of the guest data together with that of the COW regions.
          * If it's not possible (or not necessary) then write the
          * guest data now. */
         if (!merge_cow(offset, cur_bytes, &hd_qiov, l2meta)) {
-            qemu_co_mutex_unlock(&s->lock);
             BLKDBG_EVENT(bs->file, BLKDBG_WRITE_AIO);
             trace_qcow2_writev_data(qemu_coroutine_self(),
                                     cluster_offset + offset_in_cluster);
             ret = bdrv_co_pwritev(bs->file,
                                   cluster_offset + offset_in_cluster,
                                   cur_bytes, &hd_qiov, 0);
-            qemu_co_mutex_lock(&s->lock);
             if (ret < 0) {
-                goto fail;
+                goto out_unlocked;
             }
         }
 
+        qemu_co_mutex_lock(&s->lock);
+
         ret = qcow2_handle_l2meta(bs, &l2meta, true);
         if (ret) {
-            goto fail;
+            goto out_locked;
         }
 
         bytes -= cur_bytes;
@@ -2151,8 +2154,12 @@ static coroutine_fn int qcow2_co_pwritev(BlockDriverState *bs, uint64_t offset,
         trace_qcow2_writev_done_part(qemu_coroutine_self(), cur_bytes);
     }
     ret = 0;
+    goto out_locked;
 
-fail:
+out_unlocked:
+    qemu_co_mutex_lock(&s->lock);
+
+out_locked:
     qcow2_handle_l2meta(bs, &l2meta, false);
 
     qemu_co_mutex_unlock(&s->lock);
-- 
2.18.0

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH v2 8/8] qcow2: do encryption in threads
  2018-12-11 16:43 [Qemu-devel] [PATCH v2 0/8] qcow2: encryption threads Vladimir Sementsov-Ogievskiy
                   ` (6 preceding siblings ...)
  2018-12-11 16:43 ` [Qemu-devel] [PATCH v2 7/8] qcow2: bdrv_co_pwritev: move encryption code out of the lock Vladimir Sementsov-Ogievskiy
@ 2018-12-11 16:43 ` Vladimir Sementsov-Ogievskiy
  2018-12-11 16:50 ` [Qemu-devel] [PATCH v2 0/8] qcow2: encryption threads Daniel P. Berrangé
  2018-12-17 13:53 ` Vladimir Sementsov-Ogievskiy
  9 siblings, 0 replies; 17+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2018-12-11 16:43 UTC (permalink / raw)
  To: qemu-devel, qemu-block; +Cc: mreitz, kwolf, berrange, berto, vsementsov, den

Do encryption/decryption in threads, like it is already done for
compression. This improves asynchronous encrypted io.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/qcow2.h         |  8 ++++++
 block/qcow2-cluster.c |  7 ++---
 block/qcow2-threads.c | 65 +++++++++++++++++++++++++++++++++++++++++--
 block/qcow2.c         | 22 +++++----------
 4 files changed, 81 insertions(+), 21 deletions(-)

diff --git a/block/qcow2.h b/block/qcow2.h
index 9c2f6749ba..7e75f20373 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -257,6 +257,8 @@ typedef struct Qcow2BitmapHeaderExt {
     uint64_t bitmap_directory_offset;
 } QEMU_PACKED Qcow2BitmapHeaderExt;
 
+#define QCOW2_MAX_THREADS 4
+
 typedef struct BDRVQcow2State {
     int cluster_bits;
     int cluster_size;
@@ -701,5 +703,11 @@ qcow2_co_compress(BlockDriverState *bs, void *dest, size_t dest_size,
 ssize_t coroutine_fn
 qcow2_co_decompress(BlockDriverState *bs, void *dest, size_t dest_size,
                     const void *src, size_t src_size);
+int coroutine_fn
+qcow2_co_encrypt(BlockDriverState *bs, uint64_t file_cluster_offset,
+                 uint64_t offset, void *buf, size_t len);
+int coroutine_fn
+qcow2_co_decrypt(BlockDriverState *bs, uint64_t file_cluster_offset,
+                 uint64_t offset, void *buf, size_t len);
 
 #endif
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index e2737429f5..283080a9fd 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -468,13 +468,12 @@ static bool coroutine_fn do_perform_cow_encrypt(BlockDriverState *bs,
 {
     if (bytes && bs->encrypted) {
         BDRVQcow2State *s = bs->opaque;
-        int64_t offset = (s->crypt_physical_offset ?
-                          (cluster_offset + offset_in_cluster) :
-                          (src_cluster_offset + offset_in_cluster));
         assert((offset_in_cluster & ~BDRV_SECTOR_MASK) == 0);
         assert((bytes & ~BDRV_SECTOR_MASK) == 0);
         assert(s->crypto);
-        if (qcrypto_block_encrypt(s->crypto, offset, buffer, bytes, NULL) < 0) {
+        if (qcow2_co_encrypt(bs, cluster_offset,
+                             src_cluster_offset + offset_in_cluster,
+                             buffer, bytes) < 0) {
             return false;
         }
     }
diff --git a/block/qcow2-threads.c b/block/qcow2-threads.c
index 84b3ede4f1..19da46dce1 100644
--- a/block/qcow2-threads.c
+++ b/block/qcow2-threads.c
@@ -30,8 +30,7 @@
 
 #include "qcow2.h"
 #include "block/thread-pool.h"
-
-#define QCOW2_MAX_THREADS 4
+#include "crypto.h"
 
 static int coroutine_fn
 qcow2_co_process(BlockDriverState *bs, ThreadPoolFunc *func, void *arg)
@@ -197,3 +196,65 @@ qcow2_co_decompress(BlockDriverState *bs, void *dest, size_t dest_size,
     return qcow2_co_do_compress(bs, dest, dest_size, src, src_size,
                                 qcow2_decompress);
 }
+
+
+/*
+ * Cryptography
+ */
+
+/*
+ * Qcow2EncDecFunc: common prototype of qcrypto_block_encrypt() and
+ * qcrypto_block_decrypt() functions.
+ */
+typedef int (*Qcow2EncDecFunc)(QCryptoBlock *block, uint64_t offset,
+                               uint8_t *buf, size_t len, Error **errp);
+
+typedef struct Qcow2EncDecData {
+    QCryptoBlock *block;
+    uint64_t offset;
+    uint8_t *buf;
+    size_t len;
+
+    Qcow2EncDecFunc func;
+} Qcow2EncDecData;
+
+static int qcow2_encdec_pool_func(void *opaque)
+{
+    Qcow2EncDecData *data = opaque;
+
+    return data->func(data->block, data->offset, data->buf, data->len, NULL);
+}
+
+static int coroutine_fn
+qcow2_co_encdec(BlockDriverState *bs, uint64_t file_cluster_offset,
+                  uint64_t offset, void *buf, size_t len, Qcow2EncDecFunc func)
+{
+    BDRVQcow2State *s = bs->opaque;
+    Qcow2EncDecData arg = {
+        .block = s->crypto,
+        .offset = s->crypt_physical_offset ?
+                      file_cluster_offset + offset_into_cluster(s, offset) :
+                      offset,
+        .buf = buf,
+        .len = len,
+        .func = func,
+    };
+
+    return qcow2_co_process(bs, qcow2_encdec_pool_func, &arg);
+}
+
+int coroutine_fn
+qcow2_co_encrypt(BlockDriverState *bs, uint64_t file_cluster_offset,
+                 uint64_t offset, void *buf, size_t len)
+{
+    return qcow2_co_encdec(bs, file_cluster_offset, offset, buf, len,
+                             qcrypto_block_encrypt);
+}
+
+int coroutine_fn
+qcow2_co_decrypt(BlockDriverState *bs, uint64_t file_cluster_offset,
+                 uint64_t offset, void *buf, size_t len)
+{
+    return qcow2_co_encdec(bs, file_cluster_offset, offset, buf, len,
+                             qcrypto_block_decrypt);
+}
diff --git a/block/qcow2.c b/block/qcow2.c
index 941ccfa51a..4a30c126a2 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -297,7 +297,7 @@ static int qcow2_read_extensions(BlockDriverState *bs, uint64_t start_offset,
             }
             s->crypto = qcrypto_block_open(s->crypto_opts, "encrypt.",
                                            qcow2_crypto_hdr_read_func,
-                                           bs, cflags, 1, errp);
+                                           bs, cflags, QCOW2_MAX_THREADS, errp);
             if (!s->crypto) {
                 return -EINVAL;
             }
@@ -1447,7 +1447,8 @@ static int coroutine_fn qcow2_do_open(BlockDriverState *bs, QDict *options,
                 cflags |= QCRYPTO_BLOCK_OPEN_NO_IO;
             }
             s->crypto = qcrypto_block_open(s->crypto_opts, "encrypt.",
-                                           NULL, NULL, cflags, 1, errp);
+                                           NULL, NULL, cflags,
+                                           QCOW2_MAX_THREADS, errp);
             if (!s->crypto) {
                 ret = -EINVAL;
                 goto fail;
@@ -1967,13 +1968,8 @@ static coroutine_fn int qcow2_co_preadv(BlockDriverState *bs, uint64_t offset,
                 assert(s->crypto);
                 assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
                 assert((cur_bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
-                if (qcrypto_block_decrypt(s->crypto,
-                                          (s->crypt_physical_offset ?
-                                           cluster_offset + offset_in_cluster :
-                                           offset),
-                                          cluster_data,
-                                          cur_bytes,
-                                          NULL) < 0) {
+                if (qcow2_co_decrypt(bs, cluster_offset, offset,
+                                     cluster_data, cur_bytes) < 0) {
                     ret = -EIO;
                     goto fail;
                 }
@@ -2111,12 +2107,8 @@ static coroutine_fn int qcow2_co_pwritev(BlockDriverState *bs, uint64_t offset,
                    QCOW_MAX_CRYPT_CLUSTERS * s->cluster_size);
             qemu_iovec_to_buf(&hd_qiov, 0, cluster_data, hd_qiov.size);
 
-            if (qcrypto_block_encrypt(s->crypto,
-                                      (s->crypt_physical_offset ?
-                                       cluster_offset + offset_in_cluster :
-                                       offset),
-                                      cluster_data,
-                                      cur_bytes, NULL) < 0) {
+            if (qcow2_co_encrypt(bs, cluster_offset, offset,
+                                 cluster_data, cur_bytes) < 0) {
                 ret = -EIO;
                 goto out_unlocked;
             }
-- 
2.18.0

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH v2 0/8] qcow2: encryption threads
  2018-12-11 16:43 [Qemu-devel] [PATCH v2 0/8] qcow2: encryption threads Vladimir Sementsov-Ogievskiy
                   ` (7 preceding siblings ...)
  2018-12-11 16:43 ` [Qemu-devel] [PATCH v2 8/8] qcow2: do encryption in threads Vladimir Sementsov-Ogievskiy
@ 2018-12-11 16:50 ` Daniel P. Berrangé
  2018-12-17 13:53 ` Vladimir Sementsov-Ogievskiy
  9 siblings, 0 replies; 17+ messages in thread
From: Daniel P. Berrangé @ 2018-12-11 16:50 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-devel, qemu-block, mreitz, kwolf, berto, den

On Tue, Dec 11, 2018 at 07:43:09PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> Hi all!
> 
> The series brings threads to qcow2 encryption/decryption path,
> like it is already done for compression.
> 
> v2: - multiple cipher inside QCryptoBlock instead of multiple
>       blocks inside qcow2, as suggested by Daniel, and it is
>       done in separate series
>     - use threaded encryption in do_perform_cow_encrypt() too
>     - some renaming and refactoring and simplifications
> (Sorry for not being very careful about change list, but v1 isn't
>  actually reviewed, as approach with multiple blocks was early
>  rejected by Denial)
> 
> The series now based on two queued for 4.0 series, which, in
> turn, may be applied in any order: "crypto threads" (Daniel's
> tree), and  "qcow2 decompress in threads", which is now in
> Kevin's block-next.

FYI I've pushed my crypto queue here for benefit of anyone
wishing to test this series

  https://github.com/berrange/qemu/tree/qcrypto-next


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH v2 1/8] qcow2.h: add missing include
  2018-12-11 16:43 ` [Qemu-devel] [PATCH v2 1/8] qcow2.h: add missing include Vladimir Sementsov-Ogievskiy
@ 2018-12-13 10:18   ` Alberto Garcia
  0 siblings, 0 replies; 17+ messages in thread
From: Alberto Garcia @ 2018-12-13 10:18 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-devel, qemu-block
  Cc: mreitz, kwolf, berrange, den

On Tue 11 Dec 2018 05:43:10 PM CET, Vladimir Sementsov-Ogievskiy wrote:
> qcow2.h depends on block_int.h. Compilation isn't broken currently only
> due to block_int.h always included before qcow2.h. Though, it seems
> better to directly include block_int.h in qcow2.h.

If you include block_int.h in qcow2.h then I guess there's no need to do
it again in qcow2.c

Berto

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH v2 4/8] qcow2-threads: split out generic path
  2018-12-11 16:43 ` [Qemu-devel] [PATCH v2 4/8] qcow2-threads: split out generic path Vladimir Sementsov-Ogievskiy
@ 2018-12-13 23:28   ` Paolo Bonzini
  2018-12-14  8:51     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 17+ messages in thread
From: Paolo Bonzini @ 2018-12-13 23:28 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-devel, qemu-block
  Cc: kwolf, berto, mreitz, den

On 11/12/18 17:43, Vladimir Sementsov-Ogievskiy wrote:
> +    ThreadPool *pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
> +
> +    while (s->nb_threads >= QCOW2_MAX_THREADS) {
> +        qemu_co_queue_wait(&s->thread_task_queue, NULL);
> +    }
> +
> +    s->nb_threads++;
> +    ret = thread_pool_submit_co(pool, func, arg);
> +    s->nb_threads--;
> +
> +    qemu_co_queue_next(&s->thread_task_queue);

What lock is protecting this manipulation?  I'd rather avoid adding more
users of the AioContext lock, especially manipulation of CoQueues.

One possibility is to do the s->lock unlock/lock here, and then also
rely on that in patch 8.

Paolo

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH v2 4/8] qcow2-threads: split out generic path
  2018-12-13 23:28   ` Paolo Bonzini
@ 2018-12-14  8:51     ` Vladimir Sementsov-Ogievskiy
  2019-01-04 14:59       ` Alberto Garcia
  0 siblings, 1 reply; 17+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2018-12-14  8:51 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel, qemu-block; +Cc: kwolf, berto, mreitz, Denis Lunev

14.12.2018 2:28, Paolo Bonzini wrote:
> On 11/12/18 17:43, Vladimir Sementsov-Ogievskiy wrote:
>> +    ThreadPool *pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
>> +
>> +    while (s->nb_threads >= QCOW2_MAX_THREADS) {
>> +        qemu_co_queue_wait(&s->thread_task_queue, NULL);
>> +    }
>> +
>> +    s->nb_threads++;
>> +    ret = thread_pool_submit_co(pool, func, arg);
>> +    s->nb_threads--;
>> +
>> +    qemu_co_queue_next(&s->thread_task_queue);
> 
> What lock is protecting this manipulation?  I'd rather avoid adding more
> users of the AioContext lock, especially manipulation of CoQueues.
> 
> One possibility is to do the s->lock unlock/lock here, and then also
> rely on that in patch 8.
> 
> Paolo
> 

Hm, can you please give more details? It's refactoring commit, so, do you mean
that there is an existing bug? We are in coroutine, so, it's not safe to call
qemu_co_queue_next?

I feel, I don't understand the future without AioContext lock, and how it influence
contexts/threads/coroutines.. So, from this context, s->thread_task_queue may be
accessed from other thread, when we are here, and calling qemu_co_queue_next from
our aio context? But from where? /Sorry for may be stupid questions/

-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH v2 0/8] qcow2: encryption threads
  2018-12-11 16:43 [Qemu-devel] [PATCH v2 0/8] qcow2: encryption threads Vladimir Sementsov-Ogievskiy
                   ` (8 preceding siblings ...)
  2018-12-11 16:50 ` [Qemu-devel] [PATCH v2 0/8] qcow2: encryption threads Daniel P. Berrangé
@ 2018-12-17 13:53 ` Vladimir Sementsov-Ogievskiy
  9 siblings, 0 replies; 17+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2018-12-17 13:53 UTC (permalink / raw)
  To: qemu-devel, qemu-block; +Cc: mreitz, kwolf, berrange, berto, Denis Lunev

11.12.2018 19:43, Vladimir Sementsov-Ogievskiy wrote:
> Hi all!
> 
> The series brings threads to qcow2 encryption/decryption path,
> like it is already done for compression.
> 
> v2: - multiple cipher inside QCryptoBlock instead of multiple
>        blocks inside qcow2, as suggested by Daniel, and it is
>        done in separate series
>      - use threaded encryption in do_perform_cow_encrypt() too
>      - some renaming and refactoring and simplifications
> (Sorry for not being very careful about change list, but v1 isn't
>   actually reviewed, as approach with multiple blocks was early
>   rejected by Denial)
> 
> The series now based on two queued for 4.0 series, which, in
> turn, may be applied in any order: "crypto threads" (Daniel's
> tree), and  "qcow2 decompress in threads", which is now in
> Kevin's block-next.
> 
> Based-on: <20181207161351.4380-1-vsementsov@virtuozzo.com>
> ([PATCH v3 0/5] crypto threads)
> Based-on: git://repo.or.cz/qemu/kevin.git block-next
> (decompress in threads inside)

Both merged, so, now based on master.

-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/8] qcow2: add separate file for threaded data processing functions
  2018-12-11 16:43 ` [Qemu-devel] [PATCH v2 2/8] qcow2: add separate file for threaded data processing functions Vladimir Sementsov-Ogievskiy
@ 2019-01-03 15:11   ` Alberto Garcia
  0 siblings, 0 replies; 17+ messages in thread
From: Alberto Garcia @ 2019-01-03 15:11 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-devel, qemu-block
  Cc: mreitz, kwolf, berrange, den

On Tue 11 Dec 2018 05:43:11 PM CET, Vladimir Sementsov-Ogievskiy wrote:
> Move compression-on-threads to separate file. Encryption will be in it
> too.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

Reviewed-by: Alberto Garcia <berto@igalia.com>

Berto

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH v2 3/8] qcow2-threads: use thread_pool_submit_co
  2018-12-11 16:43 ` [Qemu-devel] [PATCH v2 3/8] qcow2-threads: use thread_pool_submit_co Vladimir Sementsov-Ogievskiy
@ 2019-01-03 15:28   ` Alberto Garcia
  0 siblings, 0 replies; 17+ messages in thread
From: Alberto Garcia @ 2019-01-03 15:28 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-devel, qemu-block
  Cc: mreitz, kwolf, berrange, den

On Tue 11 Dec 2018 05:43:12 PM CET, Vladimir Sementsov-Ogievskiy wrote:
> Use thread_pool_submit_co, instead of reinventing it here. Note, that
> thread_pool_submit_aio() never returns an error, so checking it was an
> extra thing.

"never returns NULL" is perhaps a bit more explicit.

> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

Reviewed-by: Alberto Garcia <berto@igalia.com>

Berto

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH v2 4/8] qcow2-threads: split out generic path
  2018-12-14  8:51     ` Vladimir Sementsov-Ogievskiy
@ 2019-01-04 14:59       ` Alberto Garcia
  0 siblings, 0 replies; 17+ messages in thread
From: Alberto Garcia @ 2019-01-04 14:59 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, Paolo Bonzini, qemu-devel, qemu-block
  Cc: kwolf, mreitz, Denis Lunev

On Fri 14 Dec 2018 09:51:12 AM CET, Vladimir Sementsov-Ogievskiy wrote:
> 14.12.2018 2:28, Paolo Bonzini wrote:
>> On 11/12/18 17:43, Vladimir Sementsov-Ogievskiy wrote:
>>> +    ThreadPool *pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
>>> +
>>> +    while (s->nb_threads >= QCOW2_MAX_THREADS) {
>>> +        qemu_co_queue_wait(&s->thread_task_queue, NULL);
>>> +    }
>>> +
>>> +    s->nb_threads++;
>>> +    ret = thread_pool_submit_co(pool, func, arg);
>>> +    s->nb_threads--;
>>> +
>>> +    qemu_co_queue_next(&s->thread_task_queue);
>> 
>> What lock is protecting this manipulation?  I'd rather avoid adding more
>> users of the AioContext lock, especially manipulation of CoQueues.
>> 
>> One possibility is to do the s->lock unlock/lock here, and then also
>> rely on that in patch 8.
>
> Hm, can you please give more details? It's refactoring commit, so, do
> you mean that there is an existing bug?

I think there's an existing problem but not a currently reproducible bug
as such: accesses to the qcow2 BlockDriverState are protected by the
same AioContext lock, but as soon as you want to implement multiqueue
you'll have multiple iothreads (and contexts) accessing the same BDS.

For that to work you need that the in-memory qcow2 metadata is properly
locked, and that's what s->lock is for. However this CoQueue here is not
protected by any lock (s->lock is taken _after_ the data is compressed
in qcow2_co_pwritev_compressed()).

And I think that's it in a nutshell, Paolo correct me if I'm wrong.

Berto

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2019-01-04 14:59 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-12-11 16:43 [Qemu-devel] [PATCH v2 0/8] qcow2: encryption threads Vladimir Sementsov-Ogievskiy
2018-12-11 16:43 ` [Qemu-devel] [PATCH v2 1/8] qcow2.h: add missing include Vladimir Sementsov-Ogievskiy
2018-12-13 10:18   ` Alberto Garcia
2018-12-11 16:43 ` [Qemu-devel] [PATCH v2 2/8] qcow2: add separate file for threaded data processing functions Vladimir Sementsov-Ogievskiy
2019-01-03 15:11   ` Alberto Garcia
2018-12-11 16:43 ` [Qemu-devel] [PATCH v2 3/8] qcow2-threads: use thread_pool_submit_co Vladimir Sementsov-Ogievskiy
2019-01-03 15:28   ` Alberto Garcia
2018-12-11 16:43 ` [Qemu-devel] [PATCH v2 4/8] qcow2-threads: split out generic path Vladimir Sementsov-Ogievskiy
2018-12-13 23:28   ` Paolo Bonzini
2018-12-14  8:51     ` Vladimir Sementsov-Ogievskiy
2019-01-04 14:59       ` Alberto Garcia
2018-12-11 16:43 ` [Qemu-devel] [PATCH v2 5/8] qcow2: qcow2_co_preadv: improve locking Vladimir Sementsov-Ogievskiy
2018-12-11 16:43 ` [Qemu-devel] [PATCH v2 6/8] qcow2: qcow2_co_preadv: skip using hd_qiov when possible Vladimir Sementsov-Ogievskiy
2018-12-11 16:43 ` [Qemu-devel] [PATCH v2 7/8] qcow2: bdrv_co_pwritev: move encryption code out of the lock Vladimir Sementsov-Ogievskiy
2018-12-11 16:43 ` [Qemu-devel] [PATCH v2 8/8] qcow2: do encryption in threads Vladimir Sementsov-Ogievskiy
2018-12-11 16:50 ` [Qemu-devel] [PATCH v2 0/8] qcow2: encryption threads Daniel P. Berrangé
2018-12-17 13:53 ` Vladimir Sementsov-Ogievskiy

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.