All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 00/11] qcow2: encryption threads
@ 2018-11-23 16:55 Vladimir Sementsov-Ogievskiy
  2018-11-23 16:55 ` [Qemu-devel] [PATCH 01/11] qcow2.h: add missing include Vladimir Sementsov-Ogievskiy
                   ` (11 more replies)
  0 siblings, 12 replies; 14+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2018-11-23 16:55 UTC (permalink / raw)
  To: qemu-devel, qemu-block; +Cc: mreitz, kwolf, berrange, berto, vsementsov, den

Hi all!

The series brings threads to qcow2 encryption/decryption path,
like it is already done for compression.

Based-on: Kevin's block-next branch [d3db1496c5]

Performance gain is illustrated by the following test:

]# cat test.sh 
#!/bin/bash

size=1G
src=/ssd/src.raw
dst=/ssd/dst.enc.qcow2

echo create source for tests
./qemu-img create -f raw "$src" $size
./qemu-io -f raw -c "write -P 0xa 0 $size" "$src"

for w in "" "-W"; do
    echo -e "\n\nTest with additional paramter for qemu-img: '$w'\n"

    echo create target...
    ./qemu-img create -f qcow2 --object secret,id=sec0,data=test -o encrypt.format=luks,encrypt.key-secret=sec0,encrypt.iter-time=10 "$dst" $size
    echo

    echo test...
    time ./qemu-img convert $w -f raw --object secret,id=sec0,data=test --target-image-opts -n "$src" "driver=qcow2,file.filename=$dst,encrypt.key-secret=sec0"
done



before patches:

]# ./test.sh 
create source for tests
Formatting '/ssd/src.raw', fmt=raw size=1073741824
wrote 1073741824/1073741824 bytes at offset 0
1 GiB, 1 ops; 0:00:03.02 (338.734 MiB/sec and 0.3308 ops/sec)


Test with additional paramter for qemu-img: ''

create target...
Formatting '/ssd/dst.enc.qcow2', fmt=qcow2 size=1073741824 encrypt.format=luks encrypt.key-secret=sec0 encrypt.iter-time=10 cluster_size=65536 lazy_refcounts=off refcount_bits=16

test...

real    0m12.014s
user    0m11.299s
sys     0m0.928s


Test with additional paramter for qemu-img: '-W'

create target...
Formatting '/ssd/dst.enc.qcow2', fmt=qcow2 size=1073741824 encrypt.format=luks encrypt.key-secret=sec0 encrypt.iter-time=10 cluster_size=65536 lazy_refcounts=off refcount_bits=16

test...

real    0m11.639s
user    0m11.324s
sys     0m1.149s



after patches:

]# ./test.sh 
create source for tests
Formatting '/ssd/src.raw', fmt=raw size=1073741824
wrote 1073741824/1073741824 bytes at offset 0
1 GiB, 1 ops; 0:00:02.63 (388.900 MiB/sec and 0.3798 ops/sec)


Test with additional paramter for qemu-img: ''

create target...
Formatting '/ssd/dst.enc.qcow2', fmt=qcow2 size=1073741824 encrypt.format=luks encrypt.key-secret=sec0 encrypt.iter-time=10 cluster_size=65536 lazy_refcounts=off refcount_bits=16

test...

real    0m12.113s
user    0m11.433s
sys     0m0.878s


Test with additional paramter for qemu-img: '-W'

create target...
Formatting '/ssd/dst.enc.qcow2', fmt=qcow2 size=1073741824 encrypt.format=luks encrypt.key-secret=sec0 encrypt.iter-time=10 cluster_size=65536 lazy_refcounts=off refcount_bits=16

test...

real    0m3.436s
user    0m13.429s
sys     0m1.183s


Vladimir Sementsov-Ogievskiy (11):
  qcow2.h: add missing include
  qcow2: add separate file for threaded data processing functions
  qcow2-threads: use thread_pool_submit_co
  qcow2: split out data processing threads state from BDRVQcow2State
  qcow2-threads: split out generic path
  qcow2-threads: add per-thread data
  qcow2-threads: add encryption
  qcow2: bdrv_co_preadv: improve locking
  qcow2: qcow2_co_preadv: skip using hd_qiov when possible
  qcow2: bdrv_co_pwritev: move encryption code out of the lock
  qcow2: do encryption in threads

 block/qcow2.h         |  34 ++++-
 block/qcow2-threads.c | 281 ++++++++++++++++++++++++++++++++++++
 block/qcow2.c         | 328 +++++++++++++-----------------------------
 block/Makefile.objs   |   2 +-
 4 files changed, 412 insertions(+), 233 deletions(-)
 create mode 100644 block/qcow2-threads.c

-- 
2.18.0

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH 01/11] qcow2.h: add missing include
  2018-11-23 16:55 [Qemu-devel] [PATCH 00/11] qcow2: encryption threads Vladimir Sementsov-Ogievskiy
@ 2018-11-23 16:55 ` Vladimir Sementsov-Ogievskiy
  2018-11-23 16:55 ` [Qemu-devel] [PATCH 02/11] qcow2: add separate file for threaded data processing functions Vladimir Sementsov-Ogievskiy
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2018-11-23 16:55 UTC (permalink / raw)
  To: qemu-devel, qemu-block; +Cc: mreitz, kwolf, berrange, berto, vsementsov, den

qcow2.h depends on block_int.h. Compilation isn't broken currently only
due to block_int.h always included before qcow2.h. Though, it seems
better to directly include block_int.h in qcow2.h.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/qcow2.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/block/qcow2.h b/block/qcow2.h
index a98d24500b..5095f893a0 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -28,6 +28,7 @@
 #include "crypto/block.h"
 #include "qemu/coroutine.h"
 #include "qemu/units.h"
+#include "block/block_int.h"
 
 //#define DEBUG_ALLOC
 //#define DEBUG_ALLOC2
-- 
2.18.0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH 02/11] qcow2: add separate file for threaded data processing functions
  2018-11-23 16:55 [Qemu-devel] [PATCH 00/11] qcow2: encryption threads Vladimir Sementsov-Ogievskiy
  2018-11-23 16:55 ` [Qemu-devel] [PATCH 01/11] qcow2.h: add missing include Vladimir Sementsov-Ogievskiy
@ 2018-11-23 16:55 ` Vladimir Sementsov-Ogievskiy
  2018-11-23 16:55 ` [Qemu-devel] [PATCH 03/11] qcow2-threads: use thread_pool_submit_co Vladimir Sementsov-Ogievskiy
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2018-11-23 16:55 UTC (permalink / raw)
  To: qemu-devel, qemu-block; +Cc: mreitz, kwolf, berrange, berto, vsementsov, den

Move compression-on-threads to separate file. Encryption will be in it
too.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/qcow2.h         |   7 ++
 block/qcow2-threads.c | 197 ++++++++++++++++++++++++++++++++++++++++++
 block/qcow2.c         | 169 ------------------------------------
 block/Makefile.objs   |   2 +-
 4 files changed, 205 insertions(+), 170 deletions(-)
 create mode 100644 block/qcow2-threads.c

diff --git a/block/qcow2.h b/block/qcow2.h
index 5095f893a0..be84d7c96a 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -695,4 +695,11 @@ void qcow2_remove_persistent_dirty_bitmap(BlockDriverState *bs,
                                           const char *name,
                                           Error **errp);
 
+ssize_t coroutine_fn
+qcow2_co_compress(BlockDriverState *bs, void *dest, size_t dest_size,
+                  const void *src, size_t src_size);
+ssize_t coroutine_fn
+qcow2_co_decompress(BlockDriverState *bs, void *dest, size_t dest_size,
+                    const void *src, size_t src_size);
+
 #endif
diff --git a/block/qcow2-threads.c b/block/qcow2-threads.c
new file mode 100644
index 0000000000..b75fad6e07
--- /dev/null
+++ b/block/qcow2-threads.c
@@ -0,0 +1,197 @@
+/*
+ * Threaded data processing for Qcow2: compression, encryption
+ *
+ * Copyright (c) 2004-2006 Fabrice Bellard
+ * Copyright (c) 2018 Virtuozzo International GmbH. All rights reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "qemu/osdep.h"
+
+#define ZLIB_CONST
+#include <zlib.h>
+
+#include "qcow2.h"
+#include "block/thread-pool.h"
+
+#define MAX_COMPRESS_THREADS 4
+
+typedef ssize_t (*Qcow2CompressFunc)(void *dest, size_t dest_size,
+                                     const void *src, size_t src_size);
+typedef struct Qcow2CompressData {
+    void *dest;
+    size_t dest_size;
+    const void *src;
+    size_t src_size;
+    ssize_t ret;
+
+    Qcow2CompressFunc func;
+} Qcow2CompressData;
+
+/*
+ * qcow2_compress()
+ *
+ * @dest - destination buffer, @dest_size bytes
+ * @src - source buffer, @src_size bytes
+ *
+ * Returns: compressed size on success
+ *          -1 destination buffer is not enough to store compressed data
+ *          -2 on any other error
+ */
+static ssize_t qcow2_compress(void *dest, size_t dest_size,
+                              const void *src, size_t src_size)
+{
+    ssize_t ret;
+    z_stream strm;
+
+    /* best compression, small window, no zlib header */
+    memset(&strm, 0, sizeof(strm));
+    ret = deflateInit2(&strm, Z_DEFAULT_COMPRESSION, Z_DEFLATED,
+                       -12, 9, Z_DEFAULT_STRATEGY);
+    if (ret != Z_OK) {
+        return -2;
+    }
+
+    /* strm.next_in is not const in old zlib versions, such as those used on
+     * OpenBSD/NetBSD, so cast the const away */
+    strm.avail_in = src_size;
+    strm.next_in = (void *) src;
+    strm.avail_out = dest_size;
+    strm.next_out = dest;
+
+    ret = deflate(&strm, Z_FINISH);
+    if (ret == Z_STREAM_END) {
+        ret = dest_size - strm.avail_out;
+    } else {
+        ret = (ret == Z_OK ? -1 : -2);
+    }
+
+    deflateEnd(&strm);
+
+    return ret;
+}
+
+/*
+ * qcow2_decompress()
+ *
+ * Decompress some data (not more than @src_size bytes) to produce exactly
+ * @dest_size bytes.
+ *
+ * @dest - destination buffer, @dest_size bytes
+ * @src - source buffer, @src_size bytes
+ *
+ * Returns: 0 on success
+ *          -1 on fail
+ */
+static ssize_t qcow2_decompress(void *dest, size_t dest_size,
+                                const void *src, size_t src_size)
+{
+    int ret = 0;
+    z_stream strm;
+
+    memset(&strm, 0, sizeof(strm));
+    strm.avail_in = src_size;
+    strm.next_in = (void *) src;
+    strm.avail_out = dest_size;
+    strm.next_out = dest;
+
+    ret = inflateInit2(&strm, -12);
+    if (ret != Z_OK) {
+        return -1;
+    }
+
+    ret = inflate(&strm, Z_FINISH);
+    if ((ret != Z_STREAM_END && ret != Z_BUF_ERROR) || strm.avail_out != 0) {
+        /* We approve Z_BUF_ERROR because we need @dest buffer to be filled, but
+         * @src buffer may be processed partly (because in qcow2 we know size of
+         * compressed data with precision of one sector) */
+        ret = -1;
+    }
+
+    inflateEnd(&strm);
+
+    return ret;
+}
+
+static int qcow2_compress_pool_func(void *opaque)
+{
+    Qcow2CompressData *data = opaque;
+
+    data->ret = data->func(data->dest, data->dest_size,
+                           data->src, data->src_size);
+
+    return 0;
+}
+
+static void qcow2_compress_complete(void *opaque, int ret)
+{
+    qemu_coroutine_enter(opaque);
+}
+
+static ssize_t coroutine_fn
+qcow2_co_do_compress(BlockDriverState *bs, void *dest, size_t dest_size,
+                     const void *src, size_t src_size, Qcow2CompressFunc func)
+{
+    BDRVQcow2State *s = bs->opaque;
+    BlockAIOCB *acb;
+    ThreadPool *pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
+    Qcow2CompressData arg = {
+        .dest = dest,
+        .dest_size = dest_size,
+        .src = src,
+        .src_size = src_size,
+        .func = func,
+    };
+
+    while (s->nb_compress_threads >= MAX_COMPRESS_THREADS) {
+        qemu_co_queue_wait(&s->compress_wait_queue, NULL);
+    }
+
+    s->nb_compress_threads++;
+    acb = thread_pool_submit_aio(pool, qcow2_compress_pool_func, &arg,
+                                 qcow2_compress_complete,
+                                 qemu_coroutine_self());
+
+    if (!acb) {
+        s->nb_compress_threads--;
+        return -EINVAL;
+    }
+    qemu_coroutine_yield();
+    s->nb_compress_threads--;
+    qemu_co_queue_next(&s->compress_wait_queue);
+
+    return arg.ret;
+}
+
+ssize_t coroutine_fn
+qcow2_co_compress(BlockDriverState *bs, void *dest, size_t dest_size,
+                  const void *src, size_t src_size)
+{
+    return qcow2_co_do_compress(bs, dest, dest_size, src, src_size,
+                                qcow2_compress);
+}
+
+ssize_t coroutine_fn
+qcow2_co_decompress(BlockDriverState *bs, void *dest, size_t dest_size,
+                    const void *src, size_t src_size)
+{
+    return qcow2_co_do_compress(bs, dest, dest_size, src, src_size,
+                                qcow2_decompress);
+}
diff --git a/block/qcow2.c b/block/qcow2.c
index 0b5ad13006..3fcb80d747 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -24,9 +24,6 @@
 
 #include "qemu/osdep.h"
 
-#define ZLIB_CONST
-#include <zlib.h>
-
 #include "block/block_int.h"
 #include "block/qdict.h"
 #include "sysemu/block-backend.h"
@@ -44,7 +41,6 @@
 #include "qapi/qobject-input-visitor.h"
 #include "qapi/qapi-visit-block-core.h"
 #include "crypto.h"
-#include "block/thread-pool.h"
 
 /*
   Differences with QCOW:
@@ -3720,171 +3716,6 @@ fail:
     return ret;
 }
 
-/*
- * qcow2_compress()
- *
- * @dest - destination buffer, @dest_size bytes
- * @src - source buffer, @src_size bytes
- *
- * Returns: compressed size on success
- *          -1 destination buffer is not enough to store compressed data
- *          -2 on any other error
- */
-static ssize_t qcow2_compress(void *dest, size_t dest_size,
-                              const void *src, size_t src_size)
-{
-    ssize_t ret;
-    z_stream strm;
-
-    /* best compression, small window, no zlib header */
-    memset(&strm, 0, sizeof(strm));
-    ret = deflateInit2(&strm, Z_DEFAULT_COMPRESSION, Z_DEFLATED,
-                       -12, 9, Z_DEFAULT_STRATEGY);
-    if (ret != Z_OK) {
-        return -2;
-    }
-
-    /* strm.next_in is not const in old zlib versions, such as those used on
-     * OpenBSD/NetBSD, so cast the const away */
-    strm.avail_in = src_size;
-    strm.next_in = (void *) src;
-    strm.avail_out = dest_size;
-    strm.next_out = dest;
-
-    ret = deflate(&strm, Z_FINISH);
-    if (ret == Z_STREAM_END) {
-        ret = dest_size - strm.avail_out;
-    } else {
-        ret = (ret == Z_OK ? -1 : -2);
-    }
-
-    deflateEnd(&strm);
-
-    return ret;
-}
-
-/*
- * qcow2_decompress()
- *
- * Decompress some data (not more than @src_size bytes) to produce exactly
- * @dest_size bytes.
- *
- * @dest - destination buffer, @dest_size bytes
- * @src - source buffer, @src_size bytes
- *
- * Returns: 0 on success
- *          -1 on fail
- */
-static ssize_t qcow2_decompress(void *dest, size_t dest_size,
-                                const void *src, size_t src_size)
-{
-    int ret = 0;
-    z_stream strm;
-
-    memset(&strm, 0, sizeof(strm));
-    strm.avail_in = src_size;
-    strm.next_in = (void *) src;
-    strm.avail_out = dest_size;
-    strm.next_out = dest;
-
-    ret = inflateInit2(&strm, -12);
-    if (ret != Z_OK) {
-        return -1;
-    }
-
-    ret = inflate(&strm, Z_FINISH);
-    if ((ret != Z_STREAM_END && ret != Z_BUF_ERROR) || strm.avail_out != 0) {
-        /* We approve Z_BUF_ERROR because we need @dest buffer to be filled, but
-         * @src buffer may be processed partly (because in qcow2 we know size of
-         * compressed data with precision of one sector) */
-        ret = -1;
-    }
-
-    inflateEnd(&strm);
-
-    return ret;
-}
-
-#define MAX_COMPRESS_THREADS 4
-
-typedef ssize_t (*Qcow2CompressFunc)(void *dest, size_t dest_size,
-                                     const void *src, size_t src_size);
-typedef struct Qcow2CompressData {
-    void *dest;
-    size_t dest_size;
-    const void *src;
-    size_t src_size;
-    ssize_t ret;
-
-    Qcow2CompressFunc func;
-} Qcow2CompressData;
-
-static int qcow2_compress_pool_func(void *opaque)
-{
-    Qcow2CompressData *data = opaque;
-
-    data->ret = data->func(data->dest, data->dest_size,
-                           data->src, data->src_size);
-
-    return 0;
-}
-
-static void qcow2_compress_complete(void *opaque, int ret)
-{
-    qemu_coroutine_enter(opaque);
-}
-
-static ssize_t coroutine_fn
-qcow2_co_do_compress(BlockDriverState *bs, void *dest, size_t dest_size,
-                     const void *src, size_t src_size, Qcow2CompressFunc func)
-{
-    BDRVQcow2State *s = bs->opaque;
-    BlockAIOCB *acb;
-    ThreadPool *pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
-    Qcow2CompressData arg = {
-        .dest = dest,
-        .dest_size = dest_size,
-        .src = src,
-        .src_size = src_size,
-        .func = func,
-    };
-
-    while (s->nb_compress_threads >= MAX_COMPRESS_THREADS) {
-        qemu_co_queue_wait(&s->compress_wait_queue, NULL);
-    }
-
-    s->nb_compress_threads++;
-    acb = thread_pool_submit_aio(pool, qcow2_compress_pool_func, &arg,
-                                 qcow2_compress_complete,
-                                 qemu_coroutine_self());
-
-    if (!acb) {
-        s->nb_compress_threads--;
-        return -EINVAL;
-    }
-    qemu_coroutine_yield();
-    s->nb_compress_threads--;
-    qemu_co_queue_next(&s->compress_wait_queue);
-
-    return arg.ret;
-}
-
-static ssize_t coroutine_fn
-qcow2_co_compress(BlockDriverState *bs, void *dest, size_t dest_size,
-                  const void *src, size_t src_size)
-{
-    return qcow2_co_do_compress(bs, dest, dest_size, src, src_size,
-                                qcow2_compress);
-}
-
-static ssize_t coroutine_fn
-qcow2_co_decompress(BlockDriverState *bs, void *dest, size_t dest_size,
-                    const void *src, size_t src_size)
-{
-    return qcow2_co_do_compress(bs, dest, dest_size, src, src_size,
-                                qcow2_decompress);
-}
-
 /* XXX: put compressed sectors first, then all the cluster aligned
    tables to avoid losing bytes in alignment */
 static coroutine_fn int
diff --git a/block/Makefile.objs b/block/Makefile.objs
index 7a81892a52..ae11605c9f 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -6,7 +6,7 @@ block-obj-$(CONFIG_BOCHS) += bochs.o
 block-obj-$(CONFIG_VVFAT) += vvfat.o
 block-obj-$(CONFIG_DMG) += dmg.o
 
-block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o qcow2-cache.o qcow2-bitmap.o
+block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o qcow2-cache.o qcow2-bitmap.o qcow2-threads.o
 block-obj-$(CONFIG_QED) += qed.o qed-l2-cache.o qed-table.o qed-cluster.o
 block-obj-$(CONFIG_QED) += qed-check.o
 block-obj-y += vhdx.o vhdx-endian.o vhdx-log.o
-- 
2.18.0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH 03/11] qcow2-threads: use thread_pool_submit_co
  2018-11-23 16:55 [Qemu-devel] [PATCH 00/11] qcow2: encryption threads Vladimir Sementsov-Ogievskiy
  2018-11-23 16:55 ` [Qemu-devel] [PATCH 01/11] qcow2.h: add missing include Vladimir Sementsov-Ogievskiy
  2018-11-23 16:55 ` [Qemu-devel] [PATCH 02/11] qcow2: add separate file for threaded data processing functions Vladimir Sementsov-Ogievskiy
@ 2018-11-23 16:55 ` Vladimir Sementsov-Ogievskiy
  2018-11-23 16:55 ` [Qemu-devel] [PATCH 04/11] qcow2: split out data processing threads state from BDRVQcow2State Vladimir Sementsov-Ogievskiy
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2018-11-23 16:55 UTC (permalink / raw)
  To: qemu-devel, qemu-block; +Cc: mreitz, kwolf, berrange, berto, vsementsov, den

Use thread_pool_submit_co, instead of reinventing it here. Note, that
thread_pool_submit_aio() never returns an error, so checking it was an
extra thing.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/qcow2-threads.c | 17 ++---------------
 1 file changed, 2 insertions(+), 15 deletions(-)

diff --git a/block/qcow2-threads.c b/block/qcow2-threads.c
index b75fad6e07..8c10191f2f 100644
--- a/block/qcow2-threads.c
+++ b/block/qcow2-threads.c
@@ -140,17 +140,11 @@ static int qcow2_compress_pool_func(void *opaque)
     return 0;
 }
 
-static void qcow2_compress_complete(void *opaque, int ret)
-{
-    qemu_coroutine_enter(opaque);
-}
-
 static ssize_t coroutine_fn
 qcow2_co_do_compress(BlockDriverState *bs, void *dest, size_t dest_size,
                      const void *src, size_t src_size, Qcow2CompressFunc func)
 {
     BDRVQcow2State *s = bs->opaque;
-    BlockAIOCB *acb;
     ThreadPool *pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
     Qcow2CompressData arg = {
         .dest = dest,
@@ -165,16 +159,9 @@ qcow2_co_do_compress(BlockDriverState *bs, void *dest, size_t dest_size,
     }
 
     s->nb_compress_threads++;
-    acb = thread_pool_submit_aio(pool, qcow2_compress_pool_func, &arg,
-                                 qcow2_compress_complete,
-                                 qemu_coroutine_self());
-
-    if (!acb) {
-        s->nb_compress_threads--;
-        return -EINVAL;
-    }
-    qemu_coroutine_yield();
+    thread_pool_submit_co(pool, qcow2_compress_pool_func, &arg);
     s->nb_compress_threads--;
+
     qemu_co_queue_next(&s->compress_wait_queue);
 
     return arg.ret;
-- 
2.18.0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH 04/11] qcow2: split out data processing threads state from BDRVQcow2State
  2018-11-23 16:55 [Qemu-devel] [PATCH 00/11] qcow2: encryption threads Vladimir Sementsov-Ogievskiy
                   ` (2 preceding siblings ...)
  2018-11-23 16:55 ` [Qemu-devel] [PATCH 03/11] qcow2-threads: use thread_pool_submit_co Vladimir Sementsov-Ogievskiy
@ 2018-11-23 16:55 ` Vladimir Sementsov-Ogievskiy
  2018-11-23 16:55 ` [Qemu-devel] [PATCH 05/11] qcow2-threads: split out generic path Vladimir Sementsov-Ogievskiy
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2018-11-23 16:55 UTC (permalink / raw)
  To: qemu-devel, qemu-block; +Cc: mreitz, kwolf, berrange, berto, vsementsov, den

The state will grow in further commits, so it's good to give it its own
structure.

Make naming generic, as threads will be used for encryption too.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/qcow2.h         |  8 ++++++--
 block/qcow2-threads.c | 11 ++++++-----
 block/qcow2.c         |  2 +-
 3 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/block/qcow2.h b/block/qcow2.h
index be84d7c96a..abe51a385c 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -257,6 +257,11 @@ typedef struct Qcow2BitmapHeaderExt {
     uint64_t bitmap_directory_offset;
 } QEMU_PACKED Qcow2BitmapHeaderExt;
 
+typedef struct Qcow2ThreadsState {
+    CoQueue task_queue;
+    int count;
+} Qcow2ThreadsState;
+
 typedef struct BDRVQcow2State {
     int cluster_bits;
     int cluster_size;
@@ -336,8 +341,7 @@ typedef struct BDRVQcow2State {
     char *image_backing_file;
     char *image_backing_format;
 
-    CoQueue compress_wait_queue;
-    int nb_compress_threads;
+    Qcow2ThreadsState threads;
 } BDRVQcow2State;
 
 typedef struct Qcow2COWRegion {
diff --git a/block/qcow2-threads.c b/block/qcow2-threads.c
index 8c10191f2f..028cf3175e 100644
--- a/block/qcow2-threads.c
+++ b/block/qcow2-threads.c
@@ -145,6 +145,7 @@ qcow2_co_do_compress(BlockDriverState *bs, void *dest, size_t dest_size,
                      const void *src, size_t src_size, Qcow2CompressFunc func)
 {
     BDRVQcow2State *s = bs->opaque;
+    Qcow2ThreadsState *thr = &s->threads;
     ThreadPool *pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
     Qcow2CompressData arg = {
         .dest = dest,
@@ -154,15 +155,15 @@ qcow2_co_do_compress(BlockDriverState *bs, void *dest, size_t dest_size,
         .func = func,
     };
 
-    while (s->nb_compress_threads >= MAX_COMPRESS_THREADS) {
-        qemu_co_queue_wait(&s->compress_wait_queue, NULL);
+    while (thr->count >= MAX_COMPRESS_THREADS) {
+        qemu_co_queue_wait(&thr->task_queue, NULL);
     }
 
-    s->nb_compress_threads++;
+    thr->count++;
     thread_pool_submit_co(pool, qcow2_compress_pool_func, &arg);
-    s->nb_compress_threads--;
+    thr->count--;
 
-    qemu_co_queue_next(&s->compress_wait_queue);
+    qemu_co_queue_next(&thr->task_queue);
 
     return arg.ret;
 }
diff --git a/block/qcow2.c b/block/qcow2.c
index 3fcb80d747..295ae926ee 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1600,7 +1600,7 @@ static int coroutine_fn qcow2_do_open(BlockDriverState *bs, QDict *options,
     }
 #endif
 
-    qemu_co_queue_init(&s->compress_wait_queue);
+    qemu_co_queue_init(&s->threads.task_queue);
 
     return ret;
 
-- 
2.18.0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH 05/11] qcow2-threads: split out generic path
  2018-11-23 16:55 [Qemu-devel] [PATCH 00/11] qcow2: encryption threads Vladimir Sementsov-Ogievskiy
                   ` (3 preceding siblings ...)
  2018-11-23 16:55 ` [Qemu-devel] [PATCH 04/11] qcow2: split out data processing threads state from BDRVQcow2State Vladimir Sementsov-Ogievskiy
@ 2018-11-23 16:55 ` Vladimir Sementsov-Ogievskiy
  2018-11-23 16:55 ` [Qemu-devel] [PATCH 06/11] qcow2-threads: add per-thread data Vladimir Sementsov-Ogievskiy
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2018-11-23 16:55 UTC (permalink / raw)
  To: qemu-devel, qemu-block; +Cc: mreitz, kwolf, berrange, berto, vsementsov, den

Move generic part out of qcow2_co_do_compress, to reuse it for
encryption and rename things that would be shared with encryption path.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/qcow2-threads.c | 41 ++++++++++++++++++++++++++++-------------
 1 file changed, 28 insertions(+), 13 deletions(-)

diff --git a/block/qcow2-threads.c b/block/qcow2-threads.c
index 028cf3175e..a09f07e2f9 100644
--- a/block/qcow2-threads.c
+++ b/block/qcow2-threads.c
@@ -31,7 +31,33 @@
 #include "qcow2.h"
 #include "block/thread-pool.h"
 
-#define MAX_COMPRESS_THREADS 4
+#define QCOW2_MAX_THREADS 4
+
+static int coroutine_fn
+qcow2_co_process(BlockDriverState *bs, ThreadPoolFunc *func, void *arg)
+{
+    int ret;
+    BDRVQcow2State *s = bs->opaque;
+    Qcow2ThreadsState *thr = &s->threads;
+    ThreadPool *pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
+
+    while (thr->count >= QCOW2_MAX_THREADS) {
+        qemu_co_queue_wait(&thr->task_queue, NULL);
+    }
+
+    thr->count++;
+    ret = thread_pool_submit_co(pool, func, arg);
+    thr->count--;
+
+    qemu_co_queue_next(&thr->task_queue);
+
+    return ret;
+}
+
+
+/*
+ * Compression
+ */
 
 typedef ssize_t (*Qcow2CompressFunc)(void *dest, size_t dest_size,
                                      const void *src, size_t src_size);
@@ -144,9 +170,6 @@ static ssize_t coroutine_fn
 qcow2_co_do_compress(BlockDriverState *bs, void *dest, size_t dest_size,
                      const void *src, size_t src_size, Qcow2CompressFunc func)
 {
-    BDRVQcow2State *s = bs->opaque;
-    Qcow2ThreadsState *thr = &s->threads;
-    ThreadPool *pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
     Qcow2CompressData arg = {
         .dest = dest,
         .dest_size = dest_size,
@@ -155,15 +178,7 @@ qcow2_co_do_compress(BlockDriverState *bs, void *dest, size_t dest_size,
         .func = func,
     };
 
-    while (thr->count >= MAX_COMPRESS_THREADS) {
-        qemu_co_queue_wait(&thr->task_queue, NULL);
-    }
-
-    thr->count++;
-    thread_pool_submit_co(pool, qcow2_compress_pool_func, &arg);
-    thr->count--;
-
-    qemu_co_queue_next(&thr->task_queue);
+    qcow2_co_process(bs, qcow2_compress_pool_func, &arg);
 
     return arg.ret;
 }
-- 
2.18.0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH 06/11] qcow2-threads: add per-thread data
  2018-11-23 16:55 [Qemu-devel] [PATCH 00/11] qcow2: encryption threads Vladimir Sementsov-Ogievskiy
                   ` (4 preceding siblings ...)
  2018-11-23 16:55 ` [Qemu-devel] [PATCH 05/11] qcow2-threads: split out generic path Vladimir Sementsov-Ogievskiy
@ 2018-11-23 16:55 ` Vladimir Sementsov-Ogievskiy
  2018-11-23 16:55 ` [Qemu-devel] [PATCH 07/11] qcow2-threads: add encryption Vladimir Sementsov-Ogievskiy
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2018-11-23 16:55 UTC (permalink / raw)
  To: qemu-devel, qemu-block; +Cc: mreitz, kwolf, berrange, berto, vsementsov, den

We'll need per-thread data (crypto blocks) for encryption threads.
Prepare threads infrastructure for it.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/qcow2.h         |  6 ++++++
 block/qcow2-threads.c | 25 ++++++++++++++++++++++---
 2 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/block/qcow2.h b/block/qcow2.h
index abe51a385c..7bef0393ce 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -257,9 +257,15 @@ typedef struct Qcow2BitmapHeaderExt {
     uint64_t bitmap_directory_offset;
 } QEMU_PACKED Qcow2BitmapHeaderExt;
 
+#define QCOW2_MAX_THREADS 4
+typedef struct Qcow2PerThreadData {
+    bool in_use;
+} Qcow2PerThreadData;
+
 typedef struct Qcow2ThreadsState {
     CoQueue task_queue;
     int count;
+    Qcow2PerThreadData per_thread[QCOW2_MAX_THREADS];
 } Qcow2ThreadsState;
 
 typedef struct BDRVQcow2State {
diff --git a/block/qcow2-threads.c b/block/qcow2-threads.c
index a09f07e2f9..3ed990ef2f 100644
--- a/block/qcow2-threads.c
+++ b/block/qcow2-threads.c
@@ -31,24 +31,42 @@
 #include "qcow2.h"
 #include "block/thread-pool.h"
 
-#define QCOW2_MAX_THREADS 4
+typedef struct Qcow2ProcessData {
+    Qcow2PerThreadData *self;
+    void *arg;
+} Qcow2ProcessData;
 
 static int coroutine_fn
 qcow2_co_process(BlockDriverState *bs, ThreadPoolFunc *func, void *arg)
 {
     int ret;
+    int ind;
     BDRVQcow2State *s = bs->opaque;
     Qcow2ThreadsState *thr = &s->threads;
     ThreadPool *pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
+    Qcow2ProcessData data = {
+        .arg = arg
+    };
 
     while (thr->count >= QCOW2_MAX_THREADS) {
         qemu_co_queue_wait(&thr->task_queue, NULL);
     }
 
+    for (ind = 0; ind < QCOW2_MAX_THREADS; ind++) {
+        if (!thr->per_thread[ind].in_use) {
+            data.self = &thr->per_thread[ind];
+            thr->per_thread[ind].in_use = true;
+            break;
+        }
+    }
+    assert(ind < QCOW2_MAX_THREADS);
+
     thr->count++;
-    ret = thread_pool_submit_co(pool, func, arg);
+    ret = thread_pool_submit_co(pool, func, &data);
     thr->count--;
 
+    thr->per_thread[ind].in_use = false;
+
     qemu_co_queue_next(&thr->task_queue);
 
     return ret;
@@ -158,7 +176,8 @@ static ssize_t qcow2_decompress(void *dest, size_t dest_size,
 
 static int qcow2_compress_pool_func(void *opaque)
 {
-    Qcow2CompressData *data = opaque;
+    Qcow2ProcessData *pdata = opaque;
+    Qcow2CompressData *data = pdata->arg;
 
     data->ret = data->func(data->dest, data->dest_size,
                            data->src, data->src_size);
-- 
2.18.0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH 07/11] qcow2-threads: add encryption
  2018-11-23 16:55 [Qemu-devel] [PATCH 00/11] qcow2: encryption threads Vladimir Sementsov-Ogievskiy
                   ` (5 preceding siblings ...)
  2018-11-23 16:55 ` [Qemu-devel] [PATCH 06/11] qcow2-threads: add per-thread data Vladimir Sementsov-Ogievskiy
@ 2018-11-23 16:55 ` Vladimir Sementsov-Ogievskiy
  2018-11-27 16:21   ` Daniel P. Berrangé
  2018-11-23 16:55 ` [Qemu-devel] [PATCH 08/11] qcow2: bdrv_co_preadv: improve locking Vladimir Sementsov-Ogievskiy
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 14+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2018-11-23 16:55 UTC (permalink / raw)
  To: qemu-devel, qemu-block; +Cc: mreitz, kwolf, berrange, berto, vsementsov, den

Add thread-based encrypt/decrypt. QCrypto don't support parallel
operations with one block, so we need QCryptoBlock for each thread.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/qcow2.h         | 12 +++++++++
 block/qcow2-threads.c | 62 +++++++++++++++++++++++++++++++++++++++++++
 block/qcow2.c         | 57 ++++++++++++++++++++++++++++++++-------
 3 files changed, 122 insertions(+), 9 deletions(-)

diff --git a/block/qcow2.h b/block/qcow2.h
index 7bef0393ce..351ad8d3e7 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -260,6 +260,12 @@ typedef struct Qcow2BitmapHeaderExt {
 #define QCOW2_MAX_THREADS 4
 typedef struct Qcow2PerThreadData {
     bool in_use;
+
+    /* QCryptoBlock doesn't support parallel operations in threads, so we can't
+     * use BDRVQcow2State.crypto and instead we need separate crypto block for
+     * each thread.
+     */
+    QCryptoBlock *crypto;
 } Qcow2PerThreadData;
 
 typedef struct Qcow2ThreadsState {
@@ -711,5 +717,11 @@ qcow2_co_compress(BlockDriverState *bs, void *dest, size_t dest_size,
 ssize_t coroutine_fn
 qcow2_co_decompress(BlockDriverState *bs, void *dest, size_t dest_size,
                     const void *src, size_t src_size);
+int coroutine_fn
+qcow2_co_encrypt(BlockDriverState *bs, uint64_t file_cluster_offset,
+                 uint64_t offset, void *buf, size_t len);
+int coroutine_fn
+qcow2_co_decrypt(BlockDriverState *bs, uint64_t file_cluster_offset,
+                 uint64_t offset, void *buf, size_t len);
 
 #endif
diff --git a/block/qcow2-threads.c b/block/qcow2-threads.c
index 3ed990ef2f..0a75c1aead 100644
--- a/block/qcow2-threads.c
+++ b/block/qcow2-threads.c
@@ -30,6 +30,7 @@
 
 #include "qcow2.h"
 #include "block/thread-pool.h"
+#include "crypto.h"
 
 typedef struct Qcow2ProcessData {
     Qcow2PerThreadData *self;
@@ -217,3 +218,64 @@ qcow2_co_decompress(BlockDriverState *bs, void *dest, size_t dest_size,
     return qcow2_co_do_compress(bs, dest, dest_size, src, src_size,
                                 qcow2_decompress);
 }
+
+
+/*
+ * Encryption
+ */
+
+typedef int (*Qcow2EncryptFunc)(QCryptoBlock *block, uint64_t offset,
+                                uint8_t *buf, size_t len, Error **errp);
+/*
+ * encrypt functions are qcrypto_block_encrypt() and qcrypto_block_decrypt()
+ */
+
+typedef struct Qcow2EncryptData {
+    uint64_t offset;
+    uint8_t *buf;
+    size_t len;
+
+    Qcow2EncryptFunc func;
+} Qcow2EncryptData;
+
+static int qcow2_encrypt_pool_func(void *opaque)
+{
+    Qcow2ProcessData *pdata = opaque;
+    Qcow2EncryptData *data = pdata->arg;
+
+    return data->func(pdata->self->crypto,
+                      data->offset, data->buf, data->len, NULL);
+}
+
+static int coroutine_fn
+qcow2_co_do_crypt(BlockDriverState *bs, uint64_t file_cluster_offset,
+                  uint64_t offset, void *buf, size_t len, Qcow2EncryptFunc func)
+{
+    BDRVQcow2State *s = bs->opaque;
+    Qcow2EncryptData arg = {
+        .offset = s->crypt_physical_offset ?
+                      file_cluster_offset + offset_into_cluster(s, offset) :
+                      offset,
+        .buf = buf,
+        .len = len,
+        .func = func,
+    };
+
+    return qcow2_co_process(bs, qcow2_encrypt_pool_func, &arg);
+}
+
+int coroutine_fn
+qcow2_co_encrypt(BlockDriverState *bs, uint64_t file_cluster_offset,
+                 uint64_t offset, void *buf, size_t len)
+{
+    return qcow2_co_do_crypt(bs, file_cluster_offset, offset, buf, len,
+                             qcrypto_block_encrypt);
+}
+
+int coroutine_fn
+qcow2_co_decrypt(BlockDriverState *bs, uint64_t file_cluster_offset,
+                 uint64_t offset, void *buf, size_t len)
+{
+    return qcow2_co_do_crypt(bs, file_cluster_offset, offset, buf, len,
+                             qcrypto_block_decrypt);
+}
diff --git a/block/qcow2.c b/block/qcow2.c
index 295ae926ee..1e28f17373 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -170,6 +170,47 @@ static ssize_t qcow2_crypto_hdr_write_func(QCryptoBlock *block, size_t offset,
     return ret;
 }
 
+static void qcow2_crypto_blocks_free(BDRVQcow2State *s)
+{
+    int i;
+
+    qcrypto_block_free(s->crypto);
+    s->crypto = NULL;
+
+    for (i = 0; i < QCOW2_MAX_THREADS; i++) {
+        qcrypto_block_free(s->threads.per_thread[i].crypto);
+        s->threads.per_thread[i].crypto = NULL;
+    }
+}
+
+static int qcow2_crypto_blocks_open(BDRVQcow2State *s,
+                                    const char *optprefix,
+                                    QCryptoBlockReadFunc readfunc,
+                                    void *opaque,
+                                    unsigned int flags,
+                                    Error **errp)
+{
+    int i;
+
+    s->crypto = qcrypto_block_open(s->crypto_opts, optprefix,
+                                   readfunc, opaque, flags, errp);
+    if (!s->crypto) {
+        qcrypto_block_free(s->crypto);
+        return -EINVAL;
+    }
+
+    for (i = 0; i < QCOW2_MAX_THREADS; i++) {
+        s->threads.per_thread[i].crypto =
+                qcrypto_block_open(s->crypto_opts, optprefix,
+                                   readfunc, opaque, flags, errp);
+        if (!s->threads.per_thread[i].crypto) {
+            qcow2_crypto_blocks_free(s);
+            return -EINVAL;
+        }
+    }
+
+    return 0;
+}
 
 /* 
  * read qcow2 extension and fill bs
@@ -295,11 +336,11 @@ static int qcow2_read_extensions(BlockDriverState *bs, uint64_t start_offset,
             if (flags & BDRV_O_NO_IO) {
                 cflags |= QCRYPTO_BLOCK_OPEN_NO_IO;
             }
-            s->crypto = qcrypto_block_open(s->crypto_opts, "encrypt.",
+            ret = qcow2_crypto_blocks_open(s, "encrypt.",
                                            qcow2_crypto_hdr_read_func,
                                            bs, cflags, errp);
-            if (!s->crypto) {
-                return -EINVAL;
+            if (ret < 0) {
+                return ret;
             }
         }   break;
 
@@ -1446,10 +1487,9 @@ static int coroutine_fn qcow2_do_open(BlockDriverState *bs, QDict *options,
             if (flags & BDRV_O_NO_IO) {
                 cflags |= QCRYPTO_BLOCK_OPEN_NO_IO;
             }
-            s->crypto = qcrypto_block_open(s->crypto_opts, "encrypt.",
+            ret = qcow2_crypto_blocks_open(s, "encrypt.",
                                            NULL, NULL, cflags, errp);
-            if (!s->crypto) {
-                ret = -EINVAL;
+            if (ret < 0) {
                 goto fail;
             }
         } else if (!(flags & BDRV_O_NO_IO)) {
@@ -1619,7 +1659,7 @@ static int coroutine_fn qcow2_do_open(BlockDriverState *bs, QDict *options,
     if (s->refcount_block_cache) {
         qcow2_cache_destroy(s->refcount_block_cache);
     }
-    qcrypto_block_free(s->crypto);
+    qcow2_crypto_blocks_free(s);
     qapi_free_QCryptoBlockOpenOptions(s->crypto_opts);
     return ret;
 }
@@ -2214,8 +2254,7 @@ static void qcow2_close(BlockDriverState *bs)
     qcow2_cache_destroy(s->l2_table_cache);
     qcow2_cache_destroy(s->refcount_block_cache);
 
-    qcrypto_block_free(s->crypto);
-    s->crypto = NULL;
+    qcow2_crypto_blocks_free(s);
 
     g_free(s->unknown_header_fields);
     cleanup_unknown_header_ext(bs);
-- 
2.18.0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH 08/11] qcow2: bdrv_co_preadv: improve locking
  2018-11-23 16:55 [Qemu-devel] [PATCH 00/11] qcow2: encryption threads Vladimir Sementsov-Ogievskiy
                   ` (6 preceding siblings ...)
  2018-11-23 16:55 ` [Qemu-devel] [PATCH 07/11] qcow2-threads: add encryption Vladimir Sementsov-Ogievskiy
@ 2018-11-23 16:55 ` Vladimir Sementsov-Ogievskiy
  2018-11-23 16:55 ` [Qemu-devel] [PATCH 09/11] qcow2: qcow2_co_preadv: skip using hd_qiov when possible Vladimir Sementsov-Ogievskiy
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2018-11-23 16:55 UTC (permalink / raw)
  To: qemu-devel, qemu-block; +Cc: mreitz, kwolf, berrange, berto, vsementsov, den

Background: decryption will be done in threads, to take benefit of it,
we should move it out of the lock first.

But let's go further: it turns out, that for locking around switch
cases we have only two variants: when we just do memset(0) not
releasing the lock (it is useless) and when we actually can handle the
whole case out of the lock. So, refactor the whole thing to reduce
locked code region and make it clean.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/qcow2.c | 48 ++++++++++++++++++++++++------------------------
 1 file changed, 24 insertions(+), 24 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 1e28f17373..5467089cfe 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1924,6 +1924,7 @@ static coroutine_fn int qcow2_co_preadv(BlockDriverState *bs, uint64_t offset,
 
         ret = qcow2_get_cluster_offset(bs, offset, &cur_bytes, &cluster_offset);
         if (ret < 0) {
+            qemu_co_mutex_unlock(&s->lock);
             goto fail;
         }
 
@@ -1932,39 +1933,38 @@ static coroutine_fn int qcow2_co_preadv(BlockDriverState *bs, uint64_t offset,
         qemu_iovec_reset(&hd_qiov);
         qemu_iovec_concat(&hd_qiov, qiov, bytes_done, cur_bytes);
 
+        if (ret == QCOW2_CLUSTER_ZERO_PLAIN ||
+            ret == QCOW2_CLUSTER_ZERO_ALLOC ||
+            (ret == QCOW2_CLUSTER_UNALLOCATED && !bs->backing))
+        {
+            /* No sense in releasing the lock */
+
+            qemu_iovec_memset(&hd_qiov, 0, 0, cur_bytes);
+
+            bytes -= cur_bytes;
+            offset += cur_bytes;
+            bytes_done += cur_bytes;
+            continue;
+        }
+
+        qemu_co_mutex_unlock(&s->lock);
+
         switch (ret) {
         case QCOW2_CLUSTER_UNALLOCATED:
-
-            if (bs->backing) {
-                BLKDBG_EVENT(bs->file, BLKDBG_READ_BACKING_AIO);
-                qemu_co_mutex_unlock(&s->lock);
-                ret = bdrv_co_preadv(bs->backing, offset, cur_bytes,
-                                     &hd_qiov, 0);
-                qemu_co_mutex_lock(&s->lock);
-                if (ret < 0) {
-                    goto fail;
-                }
-            } else {
-                /* Note: in this case, no need to wait */
-                qemu_iovec_memset(&hd_qiov, 0, 0, cur_bytes);
+            BLKDBG_EVENT(bs->file, BLKDBG_READ_BACKING_AIO);
+            ret = bdrv_co_preadv(bs->backing, offset, cur_bytes, &hd_qiov, 0);
+            if (ret < 0) {
+                goto fail;
             }
             break;
 
-        case QCOW2_CLUSTER_ZERO_PLAIN:
-        case QCOW2_CLUSTER_ZERO_ALLOC:
-            qemu_iovec_memset(&hd_qiov, 0, 0, cur_bytes);
-            break;
-
         case QCOW2_CLUSTER_COMPRESSED:
-            qemu_co_mutex_unlock(&s->lock);
             ret = qcow2_co_preadv_compressed(bs, cluster_offset,
                                              offset, cur_bytes,
                                              &hd_qiov);
-            qemu_co_mutex_lock(&s->lock);
             if (ret < 0) {
                 goto fail;
             }
-
             break;
 
         case QCOW2_CLUSTER_NORMAL:
@@ -1997,11 +1997,9 @@ static coroutine_fn int qcow2_co_preadv(BlockDriverState *bs, uint64_t offset,
             }
 
             BLKDBG_EVENT(bs->file, BLKDBG_READ_AIO);
-            qemu_co_mutex_unlock(&s->lock);
             ret = bdrv_co_preadv(bs->file,
                                  cluster_offset + offset_in_cluster,
                                  cur_bytes, &hd_qiov, 0);
-            qemu_co_mutex_lock(&s->lock);
             if (ret < 0) {
                 goto fail;
             }
@@ -2032,12 +2030,14 @@ static coroutine_fn int qcow2_co_preadv(BlockDriverState *bs, uint64_t offset,
         bytes -= cur_bytes;
         offset += cur_bytes;
         bytes_done += cur_bytes;
+
+        qemu_co_mutex_lock(&s->lock);
     }
     ret = 0;
 
-fail:
     qemu_co_mutex_unlock(&s->lock);
 
+fail:
     qemu_iovec_destroy(&hd_qiov);
     qemu_vfree(cluster_data);
 
-- 
2.18.0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH 09/11] qcow2: qcow2_co_preadv: skip using hd_qiov when possible
  2018-11-23 16:55 [Qemu-devel] [PATCH 00/11] qcow2: encryption threads Vladimir Sementsov-Ogievskiy
                   ` (7 preceding siblings ...)
  2018-11-23 16:55 ` [Qemu-devel] [PATCH 08/11] qcow2: bdrv_co_preadv: improve locking Vladimir Sementsov-Ogievskiy
@ 2018-11-23 16:55 ` Vladimir Sementsov-Ogievskiy
  2018-11-23 16:55 ` [Qemu-devel] [PATCH 10/11] qcow2: bdrv_co_pwritev: move encryption code out of the lock Vladimir Sementsov-Ogievskiy
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2018-11-23 16:55 UTC (permalink / raw)
  To: qemu-devel, qemu-block; +Cc: mreitz, kwolf, berrange, berto, vsementsov, den

qemu_iovec_memset has @offset parameter, so using hd_qiov for it is not
needed.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/qcow2.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 5467089cfe..bbd6df3614 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1928,18 +1928,13 @@ static coroutine_fn int qcow2_co_preadv(BlockDriverState *bs, uint64_t offset,
             goto fail;
         }
 
-        offset_in_cluster = offset_into_cluster(s, offset);
-
-        qemu_iovec_reset(&hd_qiov);
-        qemu_iovec_concat(&hd_qiov, qiov, bytes_done, cur_bytes);
-
         if (ret == QCOW2_CLUSTER_ZERO_PLAIN ||
             ret == QCOW2_CLUSTER_ZERO_ALLOC ||
             (ret == QCOW2_CLUSTER_UNALLOCATED && !bs->backing))
         {
             /* No sense in releasing the lock */
 
-            qemu_iovec_memset(&hd_qiov, 0, 0, cur_bytes);
+            qemu_iovec_memset(qiov, bytes_done, 0, cur_bytes);
 
             bytes -= cur_bytes;
             offset += cur_bytes;
@@ -1947,6 +1942,11 @@ static coroutine_fn int qcow2_co_preadv(BlockDriverState *bs, uint64_t offset,
             continue;
         }
 
+        offset_in_cluster = offset_into_cluster(s, offset);
+
+        qemu_iovec_reset(&hd_qiov);
+        qemu_iovec_concat(&hd_qiov, qiov, bytes_done, cur_bytes);
+
         qemu_co_mutex_unlock(&s->lock);
 
         switch (ret) {
-- 
2.18.0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH 10/11] qcow2: bdrv_co_pwritev: move encryption code out of the lock
  2018-11-23 16:55 [Qemu-devel] [PATCH 00/11] qcow2: encryption threads Vladimir Sementsov-Ogievskiy
                   ` (8 preceding siblings ...)
  2018-11-23 16:55 ` [Qemu-devel] [PATCH 09/11] qcow2: qcow2_co_preadv: skip using hd_qiov when possible Vladimir Sementsov-Ogievskiy
@ 2018-11-23 16:55 ` Vladimir Sementsov-Ogievskiy
  2018-11-23 16:55 ` [Qemu-devel] [PATCH 11/11] qcow2: do encryption in threads Vladimir Sementsov-Ogievskiy
  2018-11-27 16:23 ` [Qemu-devel] [PATCH 00/11] qcow2: encryption threads Daniel P. Berrangé
  11 siblings, 0 replies; 14+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2018-11-23 16:55 UTC (permalink / raw)
  To: qemu-devel, qemu-block; +Cc: mreitz, kwolf, berrange, berto, vsementsov, den

Encryption will be done in threads, to take benefit of it, we should
move it out of the lock first.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/qcow2.c | 35 +++++++++++++++++++++--------------
 1 file changed, 21 insertions(+), 14 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index bbd6df3614..3fa7e3ea27 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2118,11 +2118,20 @@ static coroutine_fn int qcow2_co_pwritev(BlockDriverState *bs, uint64_t offset,
         ret = qcow2_alloc_cluster_offset(bs, offset, &cur_bytes,
                                          &cluster_offset, &l2meta);
         if (ret < 0) {
-            goto fail;
+            goto out_locked;
         }
 
         assert((cluster_offset & 511) == 0);
 
+        ret = qcow2_pre_write_overlap_check(bs, 0,
+                                            cluster_offset + offset_in_cluster,
+                                            cur_bytes);
+        if (ret < 0) {
+            goto out_locked;
+        }
+
+        qemu_co_mutex_unlock(&s->lock);
+
         qemu_iovec_reset(&hd_qiov);
         qemu_iovec_concat(&hd_qiov, qiov, bytes_done, cur_bytes);
 
@@ -2134,7 +2143,7 @@ static coroutine_fn int qcow2_co_pwritev(BlockDriverState *bs, uint64_t offset,
                                                    * s->cluster_size);
                 if (cluster_data == NULL) {
                     ret = -ENOMEM;
-                    goto fail;
+                    goto out_unlocked;
                 }
             }
 
@@ -2149,40 +2158,34 @@ static coroutine_fn int qcow2_co_pwritev(BlockDriverState *bs, uint64_t offset,
                                       cluster_data,
                                       cur_bytes, NULL) < 0) {
                 ret = -EIO;
-                goto fail;
+                goto out_unlocked;
             }
 
             qemu_iovec_reset(&hd_qiov);
             qemu_iovec_add(&hd_qiov, cluster_data, cur_bytes);
         }
 
-        ret = qcow2_pre_write_overlap_check(bs, 0,
-                cluster_offset + offset_in_cluster, cur_bytes);
-        if (ret < 0) {
-            goto fail;
-        }
-
         /* If we need to do COW, check if it's possible to merge the
          * writing of the guest data together with that of the COW regions.
          * If it's not possible (or not necessary) then write the
          * guest data now. */
         if (!merge_cow(offset, cur_bytes, &hd_qiov, l2meta)) {
-            qemu_co_mutex_unlock(&s->lock);
             BLKDBG_EVENT(bs->file, BLKDBG_WRITE_AIO);
             trace_qcow2_writev_data(qemu_coroutine_self(),
                                     cluster_offset + offset_in_cluster);
             ret = bdrv_co_pwritev(bs->file,
                                   cluster_offset + offset_in_cluster,
                                   cur_bytes, &hd_qiov, 0);
-            qemu_co_mutex_lock(&s->lock);
             if (ret < 0) {
-                goto fail;
+                goto out_unlocked;
             }
         }
 
+        qemu_co_mutex_lock(&s->lock);
+
         ret = qcow2_handle_l2meta(bs, &l2meta, true);
         if (ret) {
-            goto fail;
+            goto out_locked;
         }
 
         bytes -= cur_bytes;
@@ -2191,8 +2194,12 @@ static coroutine_fn int qcow2_co_pwritev(BlockDriverState *bs, uint64_t offset,
         trace_qcow2_writev_done_part(qemu_coroutine_self(), cur_bytes);
     }
     ret = 0;
+    goto out_locked;
 
-fail:
+out_unlocked:
+    qemu_co_mutex_lock(&s->lock);
+
+out_locked:
     qcow2_handle_l2meta(bs, &l2meta, false);
 
     qemu_co_mutex_unlock(&s->lock);
-- 
2.18.0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH 11/11] qcow2: do encryption in threads
  2018-11-23 16:55 [Qemu-devel] [PATCH 00/11] qcow2: encryption threads Vladimir Sementsov-Ogievskiy
                   ` (9 preceding siblings ...)
  2018-11-23 16:55 ` [Qemu-devel] [PATCH 10/11] qcow2: bdrv_co_pwritev: move encryption code out of the lock Vladimir Sementsov-Ogievskiy
@ 2018-11-23 16:55 ` Vladimir Sementsov-Ogievskiy
  2018-11-27 16:23 ` [Qemu-devel] [PATCH 00/11] qcow2: encryption threads Daniel P. Berrangé
  11 siblings, 0 replies; 14+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2018-11-23 16:55 UTC (permalink / raw)
  To: qemu-devel, qemu-block; +Cc: mreitz, kwolf, berrange, berto, vsementsov, den

Do encryption/decryption in threads, like it is already done for
compression. This improves asynchronous encrypted io.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/qcow2.c | 17 ++++-------------
 1 file changed, 4 insertions(+), 13 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 3fa7e3ea27..a7d77a2178 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2007,13 +2007,8 @@ static coroutine_fn int qcow2_co_preadv(BlockDriverState *bs, uint64_t offset,
                 assert(s->crypto);
                 assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
                 assert((cur_bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
-                if (qcrypto_block_decrypt(s->crypto,
-                                          (s->crypt_physical_offset ?
-                                           cluster_offset + offset_in_cluster :
-                                           offset),
-                                          cluster_data,
-                                          cur_bytes,
-                                          NULL) < 0) {
+                if (qcow2_co_decrypt(bs, cluster_offset, offset,
+                                     cluster_data, cur_bytes) < 0) {
                     ret = -EIO;
                     goto fail;
                 }
@@ -2151,12 +2146,8 @@ static coroutine_fn int qcow2_co_pwritev(BlockDriverState *bs, uint64_t offset,
                    QCOW_MAX_CRYPT_CLUSTERS * s->cluster_size);
             qemu_iovec_to_buf(&hd_qiov, 0, cluster_data, hd_qiov.size);
 
-            if (qcrypto_block_encrypt(s->crypto,
-                                      (s->crypt_physical_offset ?
-                                       cluster_offset + offset_in_cluster :
-                                       offset),
-                                      cluster_data,
-                                      cur_bytes, NULL) < 0) {
+            if (qcow2_co_encrypt(bs, cluster_offset, offset,
+                                 cluster_data, cur_bytes) < 0) {
                 ret = -EIO;
                 goto out_unlocked;
             }
-- 
2.18.0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH 07/11] qcow2-threads: add encryption
  2018-11-23 16:55 ` [Qemu-devel] [PATCH 07/11] qcow2-threads: add encryption Vladimir Sementsov-Ogievskiy
@ 2018-11-27 16:21   ` Daniel P. Berrangé
  0 siblings, 0 replies; 14+ messages in thread
From: Daniel P. Berrangé @ 2018-11-27 16:21 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-devel, qemu-block, mreitz, kwolf, berto, den

On Fri, Nov 23, 2018 at 07:55:07PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> Add thread-based encrypt/decrypt. QCrypto don't support parallel
> operations with one block, so we need QCryptoBlock for each thread.



> +static int qcow2_crypto_blocks_open(BDRVQcow2State *s,
> +                                    const char *optprefix,
> +                                    QCryptoBlockReadFunc readfunc,
> +                                    void *opaque,
> +                                    unsigned int flags,
> +                                    Error **errp)
> +{
> +    int i;
> +
> +    s->crypto = qcrypto_block_open(s->crypto_opts, optprefix,
> +                                   readfunc, opaque, flags, errp);
> +    if (!s->crypto) {
> +        qcrypto_block_free(s->crypto);
> +        return -EINVAL;
> +    }
> +
> +    for (i = 0; i < QCOW2_MAX_THREADS; i++) {
> +        s->threads.per_thread[i].crypto =
> +                qcrypto_block_open(s->crypto_opts, optprefix,
> +                                   readfunc, opaque, flags, errp);

We really don't want to be doing this.  LUKS has an intentional time
penalty for opening devices. Each time you open a disk, we expect to
burn 1-2 seconds in CPU time. So this is multiplying that burn by
QCOW2_MAX_THREADS.

What we need todo is modify QCryptoBlock so that it can (optionally)
create many QCryptoCipher instances, allowing each thread to have its
own instance. We'll also need locking around the iv generator calls.

> +        if (!s->threads.per_thread[i].crypto) {
> +            qcow2_crypto_blocks_free(s);
> +            return -EINVAL;
> +        }
> +    }
> +
> +    return 0;
> +}

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH 00/11] qcow2: encryption threads
  2018-11-23 16:55 [Qemu-devel] [PATCH 00/11] qcow2: encryption threads Vladimir Sementsov-Ogievskiy
                   ` (10 preceding siblings ...)
  2018-11-23 16:55 ` [Qemu-devel] [PATCH 11/11] qcow2: do encryption in threads Vladimir Sementsov-Ogievskiy
@ 2018-11-27 16:23 ` Daniel P. Berrangé
  11 siblings, 0 replies; 14+ messages in thread
From: Daniel P. Berrangé @ 2018-11-27 16:23 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-devel, qemu-block, mreitz, kwolf, berto, den

On Fri, Nov 23, 2018 at 07:55:00PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> Hi all!
> 
> The series brings threads to qcow2 encryption/decryption path,
> like it is already done for compression.
> 
> Based-on: Kevin's block-next branch [d3db1496c5]
> 
> Performance gain is illustrated by the following test:
> 
> ]# cat test.sh 
> #!/bin/bash
> 
> size=1G
> src=/ssd/src.raw
> dst=/ssd/dst.enc.qcow2
> 
> echo create source for tests
> ./qemu-img create -f raw "$src" $size
> ./qemu-io -f raw -c "write -P 0xa 0 $size" "$src"
> 
> for w in "" "-W"; do
>     echo -e "\n\nTest with additional paramter for qemu-img: '$w'\n"
> 
>     echo create target...
>     ./qemu-img create -f qcow2 --object secret,id=sec0,data=test -o encrypt.format=luks,encrypt.key-secret=sec0,encrypt.iter-time=10 "$dst" $size
>     echo
> 
>     echo test...
>     time ./qemu-img convert $w -f raw --object secret,id=sec0,data=test --target-image-opts -n "$src" "driver=qcow2,file.filename=$dst,encrypt.key-secret=sec0"
> done

Note that using  iter-time=10 is removing the time penalty for opening
the luks device. This is why you didn't see the significant negative
performance impact of creating many  QCryptoBlock instances, one for
each thread.


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2018-11-27 16:23 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-23 16:55 [Qemu-devel] [PATCH 00/11] qcow2: encryption threads Vladimir Sementsov-Ogievskiy
2018-11-23 16:55 ` [Qemu-devel] [PATCH 01/11] qcow2.h: add missing include Vladimir Sementsov-Ogievskiy
2018-11-23 16:55 ` [Qemu-devel] [PATCH 02/11] qcow2: add separate file for threaded data processing functions Vladimir Sementsov-Ogievskiy
2018-11-23 16:55 ` [Qemu-devel] [PATCH 03/11] qcow2-threads: use thread_pool_submit_co Vladimir Sementsov-Ogievskiy
2018-11-23 16:55 ` [Qemu-devel] [PATCH 04/11] qcow2: split out data processing threads state from BDRVQcow2State Vladimir Sementsov-Ogievskiy
2018-11-23 16:55 ` [Qemu-devel] [PATCH 05/11] qcow2-threads: split out generic path Vladimir Sementsov-Ogievskiy
2018-11-23 16:55 ` [Qemu-devel] [PATCH 06/11] qcow2-threads: add per-thread data Vladimir Sementsov-Ogievskiy
2018-11-23 16:55 ` [Qemu-devel] [PATCH 07/11] qcow2-threads: add encryption Vladimir Sementsov-Ogievskiy
2018-11-27 16:21   ` Daniel P. Berrangé
2018-11-23 16:55 ` [Qemu-devel] [PATCH 08/11] qcow2: bdrv_co_preadv: improve locking Vladimir Sementsov-Ogievskiy
2018-11-23 16:55 ` [Qemu-devel] [PATCH 09/11] qcow2: qcow2_co_preadv: skip using hd_qiov when possible Vladimir Sementsov-Ogievskiy
2018-11-23 16:55 ` [Qemu-devel] [PATCH 10/11] qcow2: bdrv_co_pwritev: move encryption code out of the lock Vladimir Sementsov-Ogievskiy
2018-11-23 16:55 ` [Qemu-devel] [PATCH 11/11] qcow2: do encryption in threads Vladimir Sementsov-Ogievskiy
2018-11-27 16:23 ` [Qemu-devel] [PATCH 00/11] qcow2: encryption threads Daniel P. Berrangé

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.