* [Qemu-devel] [PULL 0/5] Block layer patches
@ 2018-07-03 14:59 Kevin Wolf
  2018-07-03 14:59 ` [Qemu-devel] [PULL 1/5] qemu-img: allow compressed not-in-order writes Kevin Wolf
                   ` (5 more replies)
  0 siblings, 6 replies; 9+ messages in thread
From: Kevin Wolf @ 2018-07-03 14:59 UTC
  To: qemu-block; +Cc: kwolf, peter.maydell, qemu-devel

The following changes since commit a395717cbd26e7593d3c3fe81faca121ec6d13e8:

  Merge remote-tracking branch 'remotes/cody/tags/block-pull-request' into staging (2018-07-03 11:49:51 +0100)

are available in the git repository at:

  git://repo.or.cz/qemu/kevin.git tags/for-upstream

for you to fetch changes up to 59738025a1674bb7e07713c3c93ff4fb9c5079f5:

  block: Add blklogwrites (2018-07-03 16:09:48 +0200)

----------------------------------------------------------------
Block layer patches:

- qcow2: Use worker threads for compression to improve performance of
  'qemu-img convert -W' and compressed backup jobs
- blklogwrites: New filter driver to log write requests to an image in
  the dm-log-writes format

----------------------------------------------------------------
Aapo Vienamo (1):
      block: Add blklogwrites

Ari Sundholm (1):
      block: Move two block permission constants to the relevant enum

Vladimir Sementsov-Ogievskiy (3):
      qemu-img: allow compressed not-in-order writes
      qcow2: refactor data compression
      qcow2: add compress threads

 qapi/block-core.json  |  33 ++++-
 block/qcow2.h         |   3 +
 include/block/block.h |   7 +
 block.c               |   6 -
 block/blklogwrites.c  | 392 ++++++++++++++++++++++++++++++++++++++++++++++++++
 block/qcow2.c         | 136 ++++++++++++++----
 qemu-img.c            |   5 -
 MAINTAINERS           |   6 +
 block/Makefile.objs   |   1 +
 9 files changed, 545 insertions(+), 44 deletions(-)
 create mode 100644 block/blklogwrites.c

* [Qemu-devel] [PULL 1/5] qemu-img: allow compressed not-in-order writes
  2018-07-03 14:59 [Qemu-devel] [PULL 0/5] Block layer patches Kevin Wolf
@ 2018-07-03 14:59 ` Kevin Wolf
  2018-07-03 14:59 ` [Qemu-devel] [PULL 2/5] qcow2: refactor data compression Kevin Wolf
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Kevin Wolf @ 2018-07-03 14:59 UTC
  To: qemu-block; +Cc: kwolf, peter.maydell, qemu-devel

From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

There is no reason to forbid them, and they are needed to improve
performance with compress threads in later patches.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 qemu-img.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index e1a506f7f6..7651d8172c 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -2141,11 +2141,6 @@ static int img_convert(int argc, char **argv)
         goto fail_getopt;
     }
 
-    if (!s.wr_in_order && s.compressed) {
-        error_report("Out of order write and compress are mutually exclusive");
-        goto fail_getopt;
-    }
-
     if (tgt_image_opts && !skip_create) {
         error_report("--target-image-opts requires use of -n flag");
         goto fail_getopt;
-- 
2.13.6

* [Qemu-devel] [PULL 2/5] qcow2: refactor data compression
  2018-07-03 14:59 [Qemu-devel] [PULL 0/5] Block layer patches Kevin Wolf
  2018-07-03 14:59 ` [Qemu-devel] [PULL 1/5] qemu-img: allow compressed not-in-order writes Kevin Wolf
@ 2018-07-03 14:59 ` Kevin Wolf
  2018-07-03 14:59 ` [Qemu-devel] [PULL 3/5] qcow2: add compress threads Kevin Wolf
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Kevin Wolf @ 2018-07-03 14:59 UTC
  To: qemu-block; +Cc: kwolf, peter.maydell, qemu-devel

From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

Move compression into a separate function so that it can be
parallelized later.
 - Use the .avail_out field instead of .next_out to calculate the size
   of the compressed data. This looks more natural and allows dest to
   remain a void pointer.
 - Set avail_out to at least one byte less than the input size, so that
   inefficient compression is detected earlier.
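
The two points above can be illustrated without zlib. The following is a
minimal sketch, not the code from this patch: it swaps deflate for a toy
run-length encoder, and the names out_cursor, put_byte and toy_compress
are hypothetical. It only shows how an output budget of size - 1 makes
incompressible input fail early, and how the result size falls out of
avail_out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h> /* ssize_t */

/* Hypothetical stand-in for the output half of zlib's z_stream. */
struct out_cursor {
    uint8_t *next_out;  /* where the next output byte goes */
    size_t avail_out;   /* bytes still free in the output buffer */
};

/* Emit one byte; returns 0 on success, -1 if the budget is used up. */
static int put_byte(struct out_cursor *c, uint8_t b)
{
    if (c->avail_out == 0) {
        return -1;
    }
    *c->next_out++ = b;
    c->avail_out--;
    return 0;
}

/*
 * Toy run-length encoder (NOT deflate): writes (count, value) pairs.
 * dest must hold at least size - 1 bytes.
 * Returns the encoded size, or -1 if the output would not be smaller
 * than the input.
 */
static ssize_t toy_compress(void *dest, const void *src, size_t size)
{
    const uint8_t *in = src;
    struct out_cursor c;
    size_t i = 0;

    assert(size > 0);
    c.next_out = dest;
    c.avail_out = size - 1;  /* one byte less than the input */

    while (i < size) {
        uint8_t value = in[i];
        size_t run = 1;

        while (i + run < size && in[i + run] == value && run < 255) {
            run++;
        }
        if (put_byte(&c, (uint8_t)run) < 0 || put_byte(&c, value) < 0) {
            return -1;  /* budget exhausted: not worth compressing */
        }
        i += run;
    }

    /* Same arithmetic as the patch: budget minus what is left over. */
    return (ssize_t)(size - 1 - c.avail_out);
}
```

Because the size is derived from a counter rather than from pointer
subtraction, dest never has to be cast to a byte pointer by the caller.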

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/qcow2.c | 76 ++++++++++++++++++++++++++++++++++++++---------------------
 1 file changed, 49 insertions(+), 27 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 2f9e58e0c4..67f4fb7c71 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -23,11 +23,14 @@
  */
 
 #include "qemu/osdep.h"
+
+#define ZLIB_CONST
+#include <zlib.h>
+
 #include "block/block_int.h"
 #include "block/qdict.h"
 #include "sysemu/block-backend.h"
 #include "qemu/module.h"
-#include <zlib.h>
 #include "qcow2.h"
 #include "qemu/error-report.h"
 #include "qapi/error.h"
@@ -3650,6 +3653,46 @@ fail:
     return ret;
 }
 
+/*
+ * qcow2_compress()
+ *
+ * @dest - destination buffer, at least of @size-1 bytes
+ * @src - source buffer, @size bytes
+ *
+ * Returns: compressed size on success
+ *          -1 if compression is inefficient
+ *          -2 on any other error
+ */
+static ssize_t qcow2_compress(void *dest, const void *src, size_t size)
+{
+    ssize_t ret;
+    z_stream strm;
+
+    /* best compression, small window, no zlib header */
+    memset(&strm, 0, sizeof(strm));
+    ret = deflateInit2(&strm, Z_DEFAULT_COMPRESSION, Z_DEFLATED,
+                       -12, 9, Z_DEFAULT_STRATEGY);
+    if (ret != 0) {
+        return -2;
+    }
+
+    strm.avail_in = size;
+    strm.next_in = src;
+    strm.avail_out = size - 1;
+    strm.next_out = dest;
+
+    ret = deflate(&strm, Z_FINISH);
+    if (ret == Z_STREAM_END) {
+        ret = size - 1 - strm.avail_out;
+    } else {
+        ret = (ret == Z_OK ? -1 : -2);
+    }
+
+    deflateEnd(&strm);
+
+    return ret;
+}
+
 /* XXX: put compressed sectors first, then all the cluster aligned
    tables to avoid losing bytes in alignment */
 static coroutine_fn int
@@ -3659,8 +3702,8 @@ qcow2_co_pwritev_compressed(BlockDriverState *bs, uint64_t offset,
     BDRVQcow2State *s = bs->opaque;
     QEMUIOVector hd_qiov;
     struct iovec iov;
-    z_stream strm;
-    int ret, out_len;
+    int ret;
+    size_t out_len;
     uint8_t *buf, *out_buf;
     int64_t cluster_offset;
 
@@ -3694,32 +3737,11 @@ qcow2_co_pwritev_compressed(BlockDriverState *bs, uint64_t offset,
 
     out_buf = g_malloc(s->cluster_size);
 
-    /* best compression, small window, no zlib header */
-    memset(&strm, 0, sizeof(strm));
-    ret = deflateInit2(&strm, Z_DEFAULT_COMPRESSION,
-                       Z_DEFLATED, -12,
-                       9, Z_DEFAULT_STRATEGY);
-    if (ret != 0) {
+    out_len = qcow2_compress(out_buf, buf, s->cluster_size);
+    if (out_len == -2) {
         ret = -EINVAL;
         goto fail;
-    }
-
-    strm.avail_in = s->cluster_size;
-    strm.next_in = (uint8_t *)buf;
-    strm.avail_out = s->cluster_size;
-    strm.next_out = out_buf;
-
-    ret = deflate(&strm, Z_FINISH);
-    if (ret != Z_STREAM_END && ret != Z_OK) {
-        deflateEnd(&strm);
-        ret = -EINVAL;
-        goto fail;
-    }
-    out_len = strm.next_out - out_buf;
-
-    deflateEnd(&strm);
-
-    if (ret != Z_STREAM_END || out_len >= s->cluster_size) {
+    } else if (out_len == -1) {
         /* could not compress: write normal cluster */
         ret = qcow2_co_pwritev(bs, offset, bytes, qiov, 0);
         if (ret < 0) {
-- 
2.13.6

* [Qemu-devel] [PULL 3/5] qcow2: add compress threads
  2018-07-03 14:59 [Qemu-devel] [PULL 0/5] Block layer patches Kevin Wolf
  2018-07-03 14:59 ` [Qemu-devel] [PULL 1/5] qemu-img: allow compressed not-in-order writes Kevin Wolf
  2018-07-03 14:59 ` [Qemu-devel] [PULL 2/5] qcow2: refactor data compression Kevin Wolf
@ 2018-07-03 14:59 ` Kevin Wolf
  2018-07-03 14:59 ` [Qemu-devel] [PULL 4/5] block: Move two block permission constants to the relevant enum Kevin Wolf
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Kevin Wolf @ 2018-07-03 14:59 UTC
  To: qemu-block; +Cc: kwolf, peter.maydell, qemu-devel

From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

Do data compression in separate threads. This significantly improves
performance for qemu-img convert with the -W (allow async writes) and
-c (compressed) options.
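
The patch caps the number of in-flight compression jobs with a counter
and a wait queue. The following is a simplified, single-threaded model
of that bookkeeping, assuming nothing about QEMU's coroutine or thread
pool APIs; struct limiter, limiter_enter and limiter_leave are
hypothetical names. Unlike the patch, where a woken coroutine rechecks
the limit itself, this model hands the freed slot directly to one
waiter.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_WORKERS 4  /* mirrors MAX_COMPRESS_THREADS in the patch */

struct limiter {
    int in_flight;  /* jobs currently handed to workers */
    int waiting;    /* callers parked until a slot frees up */
};

/*
 * Try to start a job. Returns true if a slot was taken, false if the
 * caller has been parked and must wait for limiter_leave().
 */
static bool limiter_enter(struct limiter *l)
{
    if (l->in_flight >= MAX_WORKERS) {
        l->waiting++;
        return false;
    }
    l->in_flight++;
    return true;
}

/*
 * Finish a job. Returns true if one parked caller was woken and given
 * the freed slot, false if nobody was waiting.
 */
static bool limiter_leave(struct limiter *l)
{
    assert(l->in_flight > 0);
    l->in_flight--;
    if (l->waiting > 0) {
        l->waiting--;
        l->in_flight++;
        return true;
    }
    return false;
}
```

The counter never exceeds MAX_WORKERS, so a burst of out-of-order
compressed writes cannot occupy more than four pool threads at once.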

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/qcow2.h |  3 +++
 block/qcow2.c | 62 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 64 insertions(+), 1 deletion(-)

diff --git a/block/qcow2.h b/block/qcow2.h
index 1c9c0d3631..d6aca687d6 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -326,6 +326,9 @@ typedef struct BDRVQcow2State {
      * override) */
     char *image_backing_file;
     char *image_backing_format;
+
+    CoQueue compress_wait_queue;
+    int nb_compress_threads;
 } BDRVQcow2State;
 
 typedef struct Qcow2COWRegion {
diff --git a/block/qcow2.c b/block/qcow2.c
index 67f4fb7c71..327685a2f9 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -44,6 +44,7 @@
 #include "qapi/qobject-input-visitor.h"
 #include "qapi/qapi-visit-block-core.h"
 #include "crypto.h"
+#include "block/thread-pool.h"
 
 /*
   Differences with QCOW:
@@ -1544,6 +1545,9 @@ static int coroutine_fn qcow2_do_open(BlockDriverState *bs, QDict *options,
         qcow2_check_refcounts(bs, &result, 0);
     }
 #endif
+
+    qemu_co_queue_init(&s->compress_wait_queue);
+
     return ret;
 
  fail:
@@ -3693,6 +3697,62 @@ static ssize_t qcow2_compress(void *dest, const void *src, size_t size)
     return ret;
 }
 
+#define MAX_COMPRESS_THREADS 4
+
+typedef struct Qcow2CompressData {
+    void *dest;
+    const void *src;
+    size_t size;
+    ssize_t ret;
+} Qcow2CompressData;
+
+static int qcow2_compress_pool_func(void *opaque)
+{
+    Qcow2CompressData *data = opaque;
+
+    data->ret = qcow2_compress(data->dest, data->src, data->size);
+
+    return 0;
+}
+
+static void qcow2_compress_complete(void *opaque, int ret)
+{
+    qemu_coroutine_enter(opaque);
+}
+
+/* See qcow2_compress definition for parameters description */
+static ssize_t qcow2_co_compress(BlockDriverState *bs,
+                                 void *dest, const void *src, size_t size)
+{
+    BDRVQcow2State *s = bs->opaque;
+    BlockAIOCB *acb;
+    ThreadPool *pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
+    Qcow2CompressData arg = {
+        .dest = dest,
+        .src = src,
+        .size = size,
+    };
+
+    while (s->nb_compress_threads >= MAX_COMPRESS_THREADS) {
+        qemu_co_queue_wait(&s->compress_wait_queue, NULL);
+    }
+
+    s->nb_compress_threads++;
+    acb = thread_pool_submit_aio(pool, qcow2_compress_pool_func, &arg,
+                                 qcow2_compress_complete,
+                                 qemu_coroutine_self());
+
+    if (!acb) {
+        s->nb_compress_threads--;
+        return -EINVAL;
+    }
+    qemu_coroutine_yield();
+    s->nb_compress_threads--;
+    qemu_co_queue_next(&s->compress_wait_queue);
+
+    return arg.ret;
+}
+
 /* XXX: put compressed sectors first, then all the cluster aligned
    tables to avoid losing bytes in alignment */
 static coroutine_fn int
@@ -3737,7 +3797,7 @@ qcow2_co_pwritev_compressed(BlockDriverState *bs, uint64_t offset,
 
     out_buf = g_malloc(s->cluster_size);
 
-    out_len = qcow2_compress(out_buf, buf, s->cluster_size);
+    out_len = qcow2_co_compress(bs, out_buf, buf, s->cluster_size);
     if (out_len == -2) {
         ret = -EINVAL;
         goto fail;
-- 
2.13.6

* [Qemu-devel] [PULL 4/5] block: Move two block permission constants to the relevant enum
  2018-07-03 14:59 [Qemu-devel] [PULL 0/5] Block layer patches Kevin Wolf
                   ` (2 preceding siblings ...)
  2018-07-03 14:59 ` [Qemu-devel] [PULL 3/5] qcow2: add compress threads Kevin Wolf
@ 2018-07-03 14:59 ` Kevin Wolf
  2018-07-03 14:59 ` [Qemu-devel] [PULL 5/5] block: Add blklogwrites Kevin Wolf
  2018-07-04 18:30 ` [Qemu-devel] [PULL 0/5] Block layer patches Peter Maydell
  5 siblings, 0 replies; 9+ messages in thread
From: Kevin Wolf @ 2018-07-03 14:59 UTC
  To: qemu-block; +Cc: kwolf, peter.maydell, qemu-devel

From: Ari Sundholm <ari@tuxera.com>

This allows using the two constants outside of block.c, which will
happen in a subsequent patch.
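
As a rough illustration of what the two constants evaluate to, here is
a standalone sketch. The values of the first four permission bits are
an assumption (only BLK_PERM_GRAPH_MOD and BLK_PERM_ALL are visible in
the hunk below), and passthrough_perms is a hypothetical helper that
mirrors the masking a filter driver would apply.

```c
#include <assert.h>
#include <stdint.h>

enum {
    /* Assumed values: these four bits are not shown in the hunk. */
    BLK_PERM_CONSISTENT_READ    = 0x01,
    BLK_PERM_WRITE              = 0x02,
    BLK_PERM_WRITE_UNCHANGED    = 0x04,
    BLK_PERM_RESIZE             = 0x08,
    /* Shown in the hunk. */
    BLK_PERM_GRAPH_MOD          = 0x10,

    BLK_PERM_ALL                = 0x1f,

    DEFAULT_PERM_PASSTHROUGH    = BLK_PERM_CONSISTENT_READ
                                 | BLK_PERM_WRITE
                                 | BLK_PERM_WRITE_UNCHANGED
                                 | BLK_PERM_RESIZE,

    DEFAULT_PERM_UNCHANGED      = BLK_PERM_ALL & ~DEFAULT_PERM_PASSTHROUGH,
};

/*
 * Hypothetical helper: forward only the pass-through bits of the
 * requested permissions, and always share the remaining bits.
 */
static void passthrough_perms(uint64_t perm, uint64_t shared,
                              uint64_t *nperm, uint64_t *nshared)
{
    *nperm = perm & DEFAULT_PERM_PASSTHROUGH;
    *nshared = (shared & DEFAULT_PERM_PASSTHROUGH) | DEFAULT_PERM_UNCHANGED;
}
```

Under these assumed bit values, DEFAULT_PERM_PASSTHROUGH is 0x0f and
DEFAULT_PERM_UNCHANGED is 0x10. As enum members in a header, both are
visible to every file that includes it, which a #define local to
block.c is not.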

Signed-off-by: Ari Sundholm <ari@tuxera.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 include/block/block.h | 7 +++++++
 block.c               | 6 ------
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index e5c7759a0c..bc76b1e59f 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -225,6 +225,13 @@ enum {
     BLK_PERM_GRAPH_MOD          = 0x10,
 
     BLK_PERM_ALL                = 0x1f,
+
+    DEFAULT_PERM_PASSTHROUGH    = BLK_PERM_CONSISTENT_READ
+                                 | BLK_PERM_WRITE
+                                 | BLK_PERM_WRITE_UNCHANGED
+                                 | BLK_PERM_RESIZE,
+
+    DEFAULT_PERM_UNCHANGED      = BLK_PERM_ALL & ~DEFAULT_PERM_PASSTHROUGH,
 };
 
 char *bdrv_perm_names(uint64_t perm);
diff --git a/block.c b/block.c
index 70a46fdd84..961ec97d26 100644
--- a/block.c
+++ b/block.c
@@ -1948,12 +1948,6 @@ int bdrv_child_try_set_perm(BdrvChild *c, uint64_t perm, uint64_t shared,
     return 0;
 }
 
-#define DEFAULT_PERM_PASSTHROUGH (BLK_PERM_CONSISTENT_READ \
-                                 | BLK_PERM_WRITE \
-                                 | BLK_PERM_WRITE_UNCHANGED \
-                                 | BLK_PERM_RESIZE)
-#define DEFAULT_PERM_UNCHANGED (BLK_PERM_ALL & ~DEFAULT_PERM_PASSTHROUGH)
-
 void bdrv_filter_default_perms(BlockDriverState *bs, BdrvChild *c,
                                const BdrvChildRole *role,
                                BlockReopenQueue *reopen_queue,
-- 
2.13.6

* [Qemu-devel] [PULL 5/5] block: Add blklogwrites
  2018-07-03 14:59 [Qemu-devel] [PULL 0/5] Block layer patches Kevin Wolf
                   ` (3 preceding siblings ...)
  2018-07-03 14:59 ` [Qemu-devel] [PULL 4/5] block: Move two block permission constants to the relevant enum Kevin Wolf
@ 2018-07-03 14:59 ` Kevin Wolf
  2018-07-03 15:07   ` Kevin Wolf
  2018-07-04 18:30 ` [Qemu-devel] [PULL 0/5] Block layer patches Peter Maydell
  5 siblings, 1 reply; 9+ messages in thread
From: Kevin Wolf @ 2018-07-03 14:59 UTC
  To: qemu-block; +Cc: kwolf, peter.maydell, qemu-devel

From: Aapo Vienamo <aapo@tuxera.com>

Implements a block device write logging system, similar to Linux kernel
device mapper dm-log-writes. The write operations that are performed
on a block device are logged to a file or another block device. The
write log format is identical to the dm-log-writes format. Currently,
log markers are not supported.
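
For readers who want to inspect a log produced by this driver, the
following sketch decodes the super block at the start of the log. The
field layout and the magic and version constants are taken from the
patch below; parse_log_super, load_le64, load_le32 and struct log_super
are hypothetical names, and this is not part of the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define WRITE_LOG_VERSION 1ULL
#define WRITE_LOG_MAGIC 0x6a736677736872ULL

/* Decoded, host-endian view of the packed on-disk super block. */
struct log_super {
    uint64_t magic;
    uint64_t version;
    uint64_t nr_entries;
    uint32_t sectorsize;
};

/* Read a little-endian 64-bit value byte by byte. */
static uint64_t load_le64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* Read a little-endian 32-bit value byte by byte. */
static uint32_t load_le32(const uint8_t *p)
{
    uint32_t v = 0;
    for (int i = 3; i >= 0; i--) {
        v = (v << 8) | p[i];
    }
    return v;
}

/*
 * Decode the 28-byte super block (3 x u64 + 1 x u32, packed).
 * Returns false if the buffer is too short or the magic or version
 * does not match.
 */
static bool parse_log_super(const uint8_t *buf, size_t len,
                            struct log_super *out)
{
    if (len < 28) {
        return false;
    }
    out->magic = load_le64(buf);
    out->version = load_le64(buf + 8);
    out->nr_entries = load_le64(buf + 16);
    out->sectorsize = load_le32(buf + 24);
    return out->magic == WRITE_LOG_MAGIC &&
           out->version == WRITE_LOG_VERSION;
}
```

Reading the fields byte by byte keeps the decoder independent of host
endianness and struct padding.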

This functionality can be used for crash consistency and fs consistency
testing. By implementing it in qemu, tests utilizing write logs can be
used to test non-Linux drivers and older kernels.

The driver accepts an optional parameter to set the sector size used
for logging. The driver then requires all requests to be aligned to
this sector size, and it expresses the offsets and sizes of writes in
the log metadata in units of this value (the log format has a
granularity of one sector for offsets and sizes). This allows accurate
logging of writes to guest block devices that have unusual sector
sizes.
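
The sector arithmetic can be sketched in isolation. The code below is
an illustrative model with hypothetical names (log_cursor,
log_cursor_init, to_sectors, log_cursor_reserve); it follows the layout
used by the patch, where sector 0 of the log holds the super block and
each entry header is padded to one full sector ahead of its data.

```c
#include <assert.h>
#include <stdint.h>

struct log_cursor {
    uint32_t sectorsize;      /* log sector size in bytes, power of 2 */
    uint32_t sectorbits;      /* log2(sectorsize) */
    uint64_t cur_log_sector;  /* next free sector in the log */
};

/* log2 of a power of two, computed by shifting. */
static uint32_t log2_pow2(uint32_t v)
{
    uint32_t bits = 0;

    assert(v != 0 && (v & (v - 1)) == 0);
    while (v >>= 1) {
        bits++;
    }
    return bits;
}

static void log_cursor_init(struct log_cursor *c, uint32_t sectorsize)
{
    c->sectorsize = sectorsize;
    c->sectorbits = log2_pow2(sectorsize);
    c->cur_log_sector = 1;  /* sector 0 is reserved for the super block */
}

/* Convert an aligned byte offset or length into log sectors. */
static uint64_t to_sectors(const struct log_cursor *c, uint64_t bytes)
{
    assert((bytes & (c->sectorsize - 1)) == 0);
    return bytes >> c->sectorbits;
}

/*
 * Reserve room for one entry: a header padded to one sector, followed
 * by data_bytes of payload rounded up to whole sectors.
 * Returns the byte offset in the log where the entry starts.
 */
static uint64_t log_cursor_reserve(struct log_cursor *c, uint64_t data_bytes)
{
    uint64_t offset = c->cur_log_sector << c->sectorbits;
    uint64_t total = c->sectorsize + data_bytes;

    c->cur_log_sector += (total + c->sectorsize - 1) >> c->sectorbits;
    return offset;
}
```

With a 4096-byte log sector size, an 8192-byte write at guest offset
8192 is recorded as sector 2 with 2 sectors of data, and it occupies
three sectors of the log: one for the header and two for the payload.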

The implementation is based on the blkverify and blkdebug block
drivers.

Signed-off-by: Aapo Vienamo <aapo@tuxera.com>
Signed-off-by: Ari Sundholm <ari@tuxera.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 qapi/block-core.json |  33 ++++-
 block/blklogwrites.c | 392 +++++++++++++++++++++++++++++++++++++++++++++++++++
 MAINTAINERS          |   6 +
 block/Makefile.objs  |   1 +
 4 files changed, 426 insertions(+), 6 deletions(-)
 create mode 100644 block/blklogwrites.c

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 90e554ed0f..a9eab8cab8 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2533,16 +2533,17 @@
 # @throttle: Since 2.11
 # @nvme: Since 2.12
 # @copy-on-read: Since 3.0
+# @blklogwrites: Since 3.0
 #
 # Since: 2.9
 ##
 { 'enum': 'BlockdevDriver',
-  'data': [ 'blkdebug', 'blkverify', 'bochs', 'cloop', 'copy-on-read',
-            'dmg', 'file', 'ftp', 'ftps', 'gluster', 'host_cdrom',
-            'host_device', 'http', 'https', 'iscsi', 'luks', 'nbd', 'nfs',
-            'null-aio', 'null-co', 'nvme', 'parallels', 'qcow', 'qcow2', 'qed',
-            'quorum', 'raw', 'rbd', 'replication', 'sheepdog', 'ssh',
-            'throttle', 'vdi', 'vhdx', 'vmdk', 'vpc', 'vvfat', 'vxhs' ] }
+  'data': [ 'blkdebug', 'blklogwrites', 'blkverify', 'bochs', 'cloop',
+            'copy-on-read', 'dmg', 'file', 'ftp', 'ftps', 'gluster',
+            'host_cdrom', 'host_device', 'http', 'https', 'iscsi', 'luks',
+            'nbd', 'nfs', 'null-aio', 'null-co', 'nvme', 'parallels', 'qcow',
+            'qcow2', 'qed', 'quorum', 'raw', 'rbd', 'replication', 'sheepdog',
+            'ssh', 'throttle', 'vdi', 'vhdx', 'vmdk', 'vpc', 'vvfat', 'vxhs' ] }
 
 ##
 # @BlockdevOptionsFile:
@@ -3045,6 +3046,25 @@
             '*set-state': ['BlkdebugSetStateOptions'] } }
 
 ##
+# @BlockdevOptionsBlklogwrites:
+#
+# Driver specific block device options for blklogwrites.
+#
+# @file:            block device
+#
+# @log:             block device used to log writes to @file
+#
+# @log-sector-size: sector size used in logging writes to @file, determines
+#                   granularity of offsets and sizes of writes (default: 512)
+#
+# Since: 3.0
+##
+{ 'struct': 'BlockdevOptionsBlklogwrites',
+  'data': { 'file': 'BlockdevRef',
+            'log': 'BlockdevRef',
+            '*log-sector-size': 'uint32' } }
+
+##
 # @BlockdevOptionsBlkverify:
 #
 # Driver specific block device options for blkverify.
@@ -3563,6 +3583,7 @@
   'discriminator': 'driver',
   'data': {
       'blkdebug':   'BlockdevOptionsBlkdebug',
+      'blklogwrites':'BlockdevOptionsBlklogwrites',
       'blkverify':  'BlockdevOptionsBlkverify',
       'bochs':      'BlockdevOptionsGenericFormat',
       'cloop':      'BlockdevOptionsGenericFormat',
diff --git a/block/blklogwrites.c b/block/blklogwrites.c
new file mode 100644
index 0000000000..0748b56d72
--- /dev/null
+++ b/block/blklogwrites.c
@@ -0,0 +1,392 @@
+/*
+ * Write logging blk driver based on blkverify and blkdebug.
+ *
+ * Copyright (c) 2017 Tuomas Tynkkynen <tuomas@tuxera.com>
+ * Copyright (c) 2018 Aapo Vienamo <aapo@tuxera.com>
+ * Copyright (c) 2018 Ari Sundholm <ari@tuxera.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "qemu/sockets.h" /* for EINPROGRESS on Windows */
+#include "block/block_int.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qmp/qstring.h"
+#include "qemu/cutils.h"
+#include "qemu/option.h"
+
+/* Disk format stuff - taken from Linux drivers/md/dm-log-writes.c */
+
+#define LOG_FLUSH_FLAG   (1 << 0)
+#define LOG_FUA_FLAG     (1 << 1)
+#define LOG_DISCARD_FLAG (1 << 2)
+#define LOG_MARK_FLAG    (1 << 3)
+
+#define WRITE_LOG_VERSION 1ULL
+#define WRITE_LOG_MAGIC 0x6a736677736872ULL
+
+/* All fields are little-endian. */
+struct log_write_super {
+    uint64_t magic;
+    uint64_t version;
+    uint64_t nr_entries;
+    uint32_t sectorsize;
+} QEMU_PACKED;
+
+struct log_write_entry {
+    uint64_t sector;
+    uint64_t nr_sectors;
+    uint64_t flags;
+    uint64_t data_len;
+} QEMU_PACKED;
+
+/* End of disk format structures. */
+
+typedef struct {
+    BdrvChild *log_file;
+    uint32_t sectorsize;
+    uint32_t sectorbits;
+    uint64_t cur_log_sector;
+    uint64_t nr_entries;
+} BDRVBlkLogWritesState;
+
+static inline uint32_t blk_log_writes_log2(uint32_t value)
+{
+    assert(value > 0);
+    return 31 - clz32(value);
+}
+
+static int blk_log_writes_open(BlockDriverState *bs, QDict *options, int flags,
+                               Error **errp)
+{
+    BDRVBlkLogWritesState *s = bs->opaque;
+    Error *local_err = NULL;
+    int ret;
+    int64_t log_sector_size = BDRV_SECTOR_SIZE;
+
+    /* Open the file */
+    bs->file = bdrv_open_child(NULL, options, "file", bs, &child_file, false,
+                               &local_err);
+    if (local_err) {
+        ret = -EINVAL;
+        error_propagate(errp, local_err);
+        goto fail;
+    }
+
+    if (qdict_haskey(options, "log-sector-size")) {
+        log_sector_size = qdict_get_int(options, "log-sector-size");
+        qdict_del(options, "log-sector-size");
+    }
+
+    if (log_sector_size < 0 || log_sector_size >= (1ull << 32) ||
+        !is_power_of_2(log_sector_size))
+    {
+        ret = -EINVAL;
+        error_setg(errp, "Invalid log sector size %"PRId64, log_sector_size);
+        goto fail;
+    }
+
+    s->sectorsize = log_sector_size;
+    s->sectorbits = blk_log_writes_log2(log_sector_size);
+    s->cur_log_sector = 1;
+    s->nr_entries = 0;
+
+    /* Open the log file */
+    s->log_file = bdrv_open_child(NULL, options, "log", bs, &child_file, false,
+                                  &local_err);
+    if (local_err) {
+        ret = -EINVAL;
+        error_propagate(errp, local_err);
+        goto fail;
+    }
+
+    ret = 0;
+fail:
+    if (ret < 0) {
+        bdrv_unref_child(bs, bs->file);
+        bs->file = NULL;
+    }
+    return ret;
+}
+
+static void blk_log_writes_close(BlockDriverState *bs)
+{
+    BDRVBlkLogWritesState *s = bs->opaque;
+
+    bdrv_unref_child(bs, s->log_file);
+    s->log_file = NULL;
+}
+
+static int64_t blk_log_writes_getlength(BlockDriverState *bs)
+{
+    return bdrv_getlength(bs->file->bs);
+}
+
+static void blk_log_writes_refresh_filename(BlockDriverState *bs,
+                                            QDict *options)
+{
+    BDRVBlkLogWritesState *s = bs->opaque;
+
+    /* bs->file->bs has already been refreshed */
+    bdrv_refresh_filename(s->log_file->bs);
+
+    if (bs->file->bs->full_open_options
+        && s->log_file->bs->full_open_options)
+    {
+        QDict *opts = qdict_new();
+        qdict_put_str(opts, "driver", "blklogwrites");
+
+        qobject_ref(bs->file->bs->full_open_options);
+        qdict_put_obj(opts, "file", QOBJECT(bs->file->bs->full_open_options));
+        qobject_ref(s->log_file->bs->full_open_options);
+        qdict_put_obj(opts, "log",
+                      QOBJECT(s->log_file->bs->full_open_options));
+
+        bs->full_open_options = opts;
+    }
+}
+
+static void blk_log_writes_child_perm(BlockDriverState *bs, BdrvChild *c,
+                                      const BdrvChildRole *role,
+                                      BlockReopenQueue *ro_q,
+                                      uint64_t perm, uint64_t shrd,
+                                      uint64_t *nperm, uint64_t *nshrd)
+{
+    if (!c) {
+        *nperm = perm & DEFAULT_PERM_PASSTHROUGH;
+        *nshrd = (shrd & DEFAULT_PERM_PASSTHROUGH) | DEFAULT_PERM_UNCHANGED;
+        return;
+    }
+
+    if (!strcmp(c->name, "log")) {
+        bdrv_format_default_perms(bs, c, role, ro_q, perm, shrd, nperm, nshrd);
+    } else {
+        bdrv_filter_default_perms(bs, c, role, ro_q, perm, shrd, nperm, nshrd);
+    }
+}
+
+static void blk_log_writes_refresh_limits(BlockDriverState *bs, Error **errp)
+{
+    BDRVBlkLogWritesState *s = bs->opaque;
+    bs->bl.request_alignment = s->sectorsize;
+}
+
+static int coroutine_fn
+blk_log_writes_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
+                         QEMUIOVector *qiov, int flags)
+{
+    return bdrv_co_preadv(bs->file, offset, bytes, qiov, flags);
+}
+
+typedef struct BlkLogWritesFileReq {
+    BlockDriverState *bs;
+    uint64_t offset;
+    uint64_t bytes;
+    int file_flags;
+    QEMUIOVector *qiov;
+    int (*func)(struct BlkLogWritesFileReq *r);
+    int file_ret;
+} BlkLogWritesFileReq;
+
+typedef struct {
+    BlockDriverState *bs;
+    QEMUIOVector *qiov;
+    struct log_write_entry entry;
+    uint64_t zero_size;
+    int log_ret;
+} BlkLogWritesLogReq;
+
+static void coroutine_fn blk_log_writes_co_do_log(BlkLogWritesLogReq *lr)
+{
+    BDRVBlkLogWritesState *s = lr->bs->opaque;
+    uint64_t cur_log_offset = s->cur_log_sector << s->sectorbits;
+
+    s->nr_entries++;
+    s->cur_log_sector +=
+            ROUND_UP(lr->qiov->size, s->sectorsize) >> s->sectorbits;
+
+    lr->log_ret = bdrv_co_pwritev(s->log_file, cur_log_offset, lr->qiov->size,
+                                  lr->qiov, 0);
+
+    /* Logging for the "write zeroes" operation */
+    if (lr->log_ret == 0 && lr->zero_size) {
+        cur_log_offset = s->cur_log_sector << s->sectorbits;
+        s->cur_log_sector +=
+                ROUND_UP(lr->zero_size, s->sectorsize) >> s->sectorbits;
+
+        lr->log_ret = bdrv_co_pwrite_zeroes(s->log_file, cur_log_offset,
+                                            lr->zero_size, 0);
+    }
+
+    /* Update super block on flush */
+    if (lr->log_ret == 0 && lr->entry.flags & LOG_FLUSH_FLAG) {
+        struct log_write_super super = {
+            .magic      = cpu_to_le64(WRITE_LOG_MAGIC),
+            .version    = cpu_to_le64(WRITE_LOG_VERSION),
+            .nr_entries = cpu_to_le64(s->nr_entries),
+            .sectorsize = cpu_to_le32(s->sectorsize),
+        };
+        void *zeroes = g_malloc0(s->sectorsize - sizeof(super));
+        QEMUIOVector qiov;
+
+        qemu_iovec_init(&qiov, 2);
+        qemu_iovec_add(&qiov, &super, sizeof(super));
+        qemu_iovec_add(&qiov, zeroes, s->sectorsize - sizeof(super));
+
+        lr->log_ret =
+            bdrv_co_pwritev(s->log_file, 0, s->sectorsize, &qiov, 0);
+        if (lr->log_ret == 0) {
+            lr->log_ret = bdrv_co_flush(s->log_file->bs);
+        }
+        qemu_iovec_destroy(&qiov);
+        g_free(zeroes);
+    }
+}
+
+static void coroutine_fn blk_log_writes_co_do_file(BlkLogWritesFileReq *fr)
+{
+    fr->file_ret = fr->func(fr);
+}
+
+static int coroutine_fn
+blk_log_writes_co_log(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
+                      QEMUIOVector *qiov, int flags,
+                      int (*file_func)(BlkLogWritesFileReq *r),
+                      uint64_t entry_flags, bool is_zero_write)
+{
+    QEMUIOVector log_qiov;
+    size_t niov = qiov ? qiov->niov : 0;
+    BDRVBlkLogWritesState *s = bs->opaque;
+    BlkLogWritesFileReq fr = {
+        .bs         = bs,
+        .offset     = offset,
+        .bytes      = bytes,
+        .file_flags = flags,
+        .qiov       = qiov,
+        .func       = file_func,
+    };
+    BlkLogWritesLogReq lr = {
+        .bs             = bs,
+        .qiov           = &log_qiov,
+        .entry = {
+            .sector     = cpu_to_le64(offset >> s->sectorbits),
+            .nr_sectors = cpu_to_le64(bytes >> s->sectorbits),
+            .flags      = cpu_to_le64(entry_flags),
+            .data_len   = 0,
+        },
+        .zero_size = is_zero_write ? bytes : 0,
+    };
+    void *zeroes = g_malloc0(s->sectorsize - sizeof(lr.entry));
+
+    assert((1 << s->sectorbits) == s->sectorsize);
+    assert(bs->bl.request_alignment == s->sectorsize);
+    assert(QEMU_IS_ALIGNED(offset, bs->bl.request_alignment));
+    assert(QEMU_IS_ALIGNED(bytes, bs->bl.request_alignment));
+
+    qemu_iovec_init(&log_qiov, niov + 2);
+    qemu_iovec_add(&log_qiov, &lr.entry, sizeof(lr.entry));
+    qemu_iovec_add(&log_qiov, zeroes, s->sectorsize - sizeof(lr.entry));
+    if (qiov) {
+        qemu_iovec_concat(&log_qiov, qiov, 0, qiov->size);
+    }
+
+    blk_log_writes_co_do_file(&fr);
+    blk_log_writes_co_do_log(&lr);
+
+    qemu_iovec_destroy(&log_qiov);
+    g_free(zeroes);
+
+    if (lr.log_ret < 0) {
+        return lr.log_ret;
+    }
+
+    return fr.file_ret;
+}
+
+static int coroutine_fn
+blk_log_writes_co_do_file_pwritev(BlkLogWritesFileReq *fr)
+{
+    return bdrv_co_pwritev(fr->bs->file, fr->offset, fr->bytes,
+                           fr->qiov, fr->file_flags);
+}
+
+static int coroutine_fn
+blk_log_writes_co_do_file_pwrite_zeroes(BlkLogWritesFileReq *fr)
+{
+    return bdrv_co_pwrite_zeroes(fr->bs->file, fr->offset, fr->bytes,
+                                 fr->file_flags);
+}
+
+static int coroutine_fn blk_log_writes_co_do_file_flush(BlkLogWritesFileReq *fr)
+{
+    return bdrv_co_flush(fr->bs->file->bs);
+}
+
+static int coroutine_fn
+blk_log_writes_co_do_file_pdiscard(BlkLogWritesFileReq *fr)
+{
+    return bdrv_co_pdiscard(fr->bs->file->bs, fr->offset, fr->bytes);
+}
+
+static int coroutine_fn
+blk_log_writes_co_pwritev(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
+                          QEMUIOVector *qiov, int flags)
+{
+    return blk_log_writes_co_log(bs, offset, bytes, qiov, flags,
+                                 blk_log_writes_co_do_file_pwritev, 0, false);
+}
+
+static int coroutine_fn
+blk_log_writes_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset, int bytes,
+                                BdrvRequestFlags flags)
+{
+    return blk_log_writes_co_log(bs, offset, bytes, NULL, flags,
+                                 blk_log_writes_co_do_file_pwrite_zeroes, 0,
+                                 true);
+}
+
+static int coroutine_fn blk_log_writes_co_flush_to_disk(BlockDriverState *bs)
+{
+    return blk_log_writes_co_log(bs, 0, 0, NULL, 0,
+                                 blk_log_writes_co_do_file_flush,
+                                 LOG_FLUSH_FLAG, false);
+}
+
+static int coroutine_fn
+blk_log_writes_co_pdiscard(BlockDriverState *bs, int64_t offset, int count)
+{
+    return blk_log_writes_co_log(bs, offset, count, NULL, 0,
+                                 blk_log_writes_co_do_file_pdiscard,
+                                 LOG_DISCARD_FLAG, false);
+}
+
+static BlockDriver bdrv_blk_log_writes = {
+    .format_name            = "blklogwrites",
+    .instance_size          = sizeof(BDRVBlkLogWritesState),
+
+    .bdrv_open              = blk_log_writes_open,
+    .bdrv_close             = blk_log_writes_close,
+    .bdrv_getlength         = blk_log_writes_getlength,
+    .bdrv_refresh_filename  = blk_log_writes_refresh_filename,
+    .bdrv_child_perm        = blk_log_writes_child_perm,
+    .bdrv_refresh_limits    = blk_log_writes_refresh_limits,
+
+    .bdrv_co_preadv         = blk_log_writes_co_preadv,
+    .bdrv_co_pwritev        = blk_log_writes_co_pwritev,
+    .bdrv_co_pwrite_zeroes  = blk_log_writes_co_pwrite_zeroes,
+    .bdrv_co_flush_to_disk  = blk_log_writes_co_flush_to_disk,
+    .bdrv_co_pdiscard       = blk_log_writes_co_pdiscard,
+    .bdrv_co_block_status   = bdrv_co_block_status_from_file,
+
+    .is_filter              = true,
+};
+
+static void bdrv_blk_log_writes_init(void)
+{
+    bdrv_register(&bdrv_blk_log_writes);
+}
+
+block_init(bdrv_blk_log_writes_init);
diff --git a/MAINTAINERS b/MAINTAINERS
index 42a1892d6a..5af89e7af3 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2051,6 +2051,12 @@ S: Supported
 F: block/quorum.c
 L: qemu-block@nongnu.org
 
+blklogwrites
+M: Ari Sundholm <ari@tuxera.com>
+L: qemu-block@nongnu.org
+S: Supported
+F: block/blklogwrites.c
+
 blkverify
 M: Stefan Hajnoczi <stefanha@redhat.com>
 L: qemu-block@nongnu.org
diff --git a/block/Makefile.objs b/block/Makefile.objs
index 899bfb5e2c..c8337bf186 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -5,6 +5,7 @@ block-obj-y += qed-check.o
 block-obj-y += vhdx.o vhdx-endian.o vhdx-log.o
 block-obj-y += quorum.o
 block-obj-y += parallels.o blkdebug.o blkverify.o blkreplay.o
+block-obj-y += blklogwrites.o
 block-obj-y += block-backend.o snapshot.o qapi.o
 block-obj-$(CONFIG_WIN32) += file-win32.o win32-aio.o
 block-obj-$(CONFIG_POSIX) += file-posix.o
-- 
2.13.6


* Re: [Qemu-devel] [PULL 5/5] block: Add blklogwrites
  2018-07-03 14:59 ` [Qemu-devel] [PULL 5/5] block: Add blklogwrites Kevin Wolf
@ 2018-07-03 15:07   ` Kevin Wolf
  0 siblings, 0 replies; 9+ messages in thread
From: Kevin Wolf @ 2018-07-03 15:07 UTC (permalink / raw)
  To: qemu-block; +Cc: peter.maydell, qemu-devel

On 03.07.2018 at 16:59, Kevin Wolf wrote:
> From: Aapo Vienamo <aapo@tuxera.com>
> 
> Implements a block device write logging system, similar to Linux kernel
> device mapper dm-log-writes. The write operations that are performed
> on a block device are logged to a file or another block device. The
> write log format is identical to the dm-log-writes format. Currently,
> log markers are not supported.
> 
> This functionality can be used for crash consistency and fs consistency
> testing. By implementing it in qemu, tests utilizing write logs can
> be used to test non-Linux drivers and older kernels.
> 
> The driver accepts an optional parameter to set the sector size used
> for logging. This makes the driver require all requests to be aligned
> to this sector size and also causes offsets and sizes of writes in the
> log metadata to be expressed in terms of this value (the log format has
> a granularity of one sector for offsets and sizes). This allows
> accurate logging of writes to guest block devices that have unusual
> sector sizes.
> 
> The implementation is based on the blkverify and blkdebug block
> drivers.
> 
> Signed-off-by: Aapo Vienamo <aapo@tuxera.com>
> Signed-off-by: Ari Sundholm <ari@tuxera.com>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>

Note that I saw Ari's v7 right after sending the pull request. It
contains only a few simple bugfixes compared to the version I had
queued, so I quickly squashed the following in and updated the tag.

Kevin


diff --git a/block/blklogwrites.c b/block/blklogwrites.c
index 0748b56d72..47093fadd6 100644
--- a/block/blklogwrites.c
+++ b/block/blklogwrites.c
@@ -53,6 +53,19 @@ typedef struct {
     uint64_t nr_entries;
 } BDRVBlkLogWritesState;
 
+static QemuOptsList runtime_opts = {
+    .name = "blklogwrites",
+    .head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head),
+    .desc = {
+        {
+            .name = "log-sector-size",
+            .type = QEMU_OPT_SIZE,
+            .help = "Log sector size",
+        },
+        { /* end of list */ }
+    },
+};
+
 static inline uint32_t blk_log_writes_log2(uint32_t value)
 {
     assert(value > 0);
@@ -63,9 +76,18 @@ static int blk_log_writes_open(BlockDriverState *bs, QDict *options, int flags,
                                Error **errp)
 {
     BDRVBlkLogWritesState *s = bs->opaque;
+    QemuOpts *opts;
     Error *local_err = NULL;
     int ret;
-    int64_t log_sector_size = BDRV_SECTOR_SIZE;
+    int64_t log_sector_size;
+
+    opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
+    qemu_opts_absorb_qdict(opts, options, &local_err);
+    if (local_err) {
+        ret = -EINVAL;
+        error_propagate(errp, local_err);
+        goto fail;
+    }
 
     /* Open the file */
     bs->file = bdrv_open_child(NULL, options, "file", bs, &child_file, false,
@@ -76,12 +98,10 @@ static int blk_log_writes_open(BlockDriverState *bs, QDict *options, int flags,
         goto fail;
     }
 
-    if (qdict_haskey(options, "log-sector-size")) {
-        log_sector_size = qdict_get_int(options, "log-sector-size");
-        qdict_del(options, "log-sector-size");
-    }
+    log_sector_size = qemu_opt_get_size(opts, "log-sector-size",
+                                        BDRV_SECTOR_SIZE);
 
-    if (log_sector_size < 0 || log_sector_size >= (1ull << 32) ||
+    if (log_sector_size < 0 || log_sector_size > (1ull << 23) ||
         !is_power_of_2(log_sector_size))
     {
         ret = -EINVAL;
@@ -109,6 +129,7 @@ fail:
         bdrv_unref_child(bs, bs->file);
         bs->file = NULL;
     }
+    qemu_opts_del(opts);
     return ret;
 }
 
@@ -144,6 +165,7 @@ static void blk_log_writes_refresh_filename(BlockDriverState *bs,
         qobject_ref(s->log_file->bs->full_open_options);
         qdict_put_obj(opts, "log",
                       QOBJECT(s->log_file->bs->full_open_options));
+        qdict_put_int(opts, "log-sector-size", s->sectorsize);
 
         bs->full_open_options = opts;
     }
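
For readers following the new bounds check, here is a minimal standalone
sketch of the same validation (a power of two no larger than 1 << 23 bytes),
together with one way an aligned byte offset could be expressed in log
sectors. It is not the driver's code: every sketch_* name is hypothetical,
and the alignment helper is only an assumption about how such a conversion
might be written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if value is a nonzero power of two. */
static bool sketch_is_power_of_2(uint64_t value)
{
    return value != 0 && (value & (value - 1)) == 0;
}

/*
 * Mirrors the bounds in the diff above: reject negative sizes, sizes
 * above 1 << 23 bytes, and anything that is not a power of two.
 */
static bool sketch_sector_size_valid(int64_t size)
{
    return size > 0 && size <= (INT64_C(1) << 23) &&
           sketch_is_power_of_2((uint64_t)size);
}

/* Integer base-2 logarithm of a nonzero value. */
static uint32_t sketch_log2(uint32_t value)
{
    uint32_t bits = 0;

    assert(value > 0);
    while (value >>= 1) {
        bits++;
    }
    return bits;
}

/*
 * Assumed conversion: express a byte offset in units of the log sector
 * size. sector_size must already have passed sketch_sector_size_valid().
 * Returns false if the offset is not aligned to the sector size.
 */
static bool sketch_bytes_to_sectors(uint64_t bytes, uint32_t sector_size,
                                    uint64_t *sectors)
{
    if (bytes & (sector_size - 1)) {
        return false;
    }
    *sectors = bytes >> sketch_log2(sector_size);
    return true;
}
```

With a 4096-byte log sector, for instance, byte offset 8192 maps to sector 2,
while an offset of 513 with a 512-byte sector is rejected as unaligned.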


* Re: [Qemu-devel] [PULL 0/5] Block layer patches
  2018-07-03 14:59 [Qemu-devel] [PULL 0/5] Block layer patches Kevin Wolf
                   ` (4 preceding siblings ...)
  2018-07-03 14:59 ` [Qemu-devel] [PULL 5/5] block: Add blklogwrites Kevin Wolf
@ 2018-07-04 18:30 ` Peter Maydell
  2018-07-05  8:28   ` Kevin Wolf
  5 siblings, 1 reply; 9+ messages in thread
From: Peter Maydell @ 2018-07-04 18:30 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Qemu-block, QEMU Developers

On 3 July 2018 at 15:59, Kevin Wolf <kwolf@redhat.com> wrote:
> The following changes since commit a395717cbd26e7593d3c3fe81faca121ec6d13e8:
>
>   Merge remote-tracking branch 'remotes/cody/tags/block-pull-request' into staging (2018-07-03 11:49:51 +0100)
>
> are available in the git repository at:
>
>   git://repo.or.cz/qemu/kevin.git tags/for-upstream
>
> for you to fetch changes up to 59738025a1674bb7e07713c3c93ff4fb9c5079f5:
>
>   block: Add blklogwrites (2018-07-03 16:09:48 +0200)
>
> ----------------------------------------------------------------
> Block layer patches:
>
> - qcow2: Use worker threads for compression to improve performance of
>   'qemu-img convert -W' and compressed backup jobs
> - blklogwrites: New filter driver to log write requests to an image in
>   the dm-log-writes format
>
> ----------------------------------------------------------------

Hi; this gives a warning on OpenBSD and NetBSD:

/home/qemu/block/qcow2.c: In function 'qcow2_compress':
/home/qemu/block/qcow2.c:3684:18: warning: assignment discards 'const'
qualifier from pointer target type
     strm.next_in = src;
                  ^

thanks
-- PMM


* Re: [Qemu-devel] [PULL 0/5] Block layer patches
  2018-07-04 18:30 ` [Qemu-devel] [PULL 0/5] Block layer patches Peter Maydell
@ 2018-07-05  8:28   ` Kevin Wolf
  0 siblings, 0 replies; 9+ messages in thread
From: Kevin Wolf @ 2018-07-05  8:28 UTC (permalink / raw)
  To: vsementsov; +Cc: Qemu-block, QEMU Developers, Peter Maydell

On 04.07.2018 at 20:30, Peter Maydell wrote:
> On 3 July 2018 at 15:59, Kevin Wolf <kwolf@redhat.com> wrote:
> > The following changes since commit a395717cbd26e7593d3c3fe81faca121ec6d13e8:
> >
> >   Merge remote-tracking branch 'remotes/cody/tags/block-pull-request' into staging (2018-07-03 11:49:51 +0100)
> >
> > are available in the git repository at:
> >
> >   git://repo.or.cz/qemu/kevin.git tags/for-upstream
> >
> > for you to fetch changes up to 59738025a1674bb7e07713c3c93ff4fb9c5079f5:
> >
> >   block: Add blklogwrites (2018-07-03 16:09:48 +0200)
> >
> > ----------------------------------------------------------------
> > Block layer patches:
> >
> > - qcow2: Use worker threads for compression to improve performance of
> >   'qemu-img convert -W' and compressed backup jobs
> > - blklogwrites: New filter driver to log write requests to an image in
> >   the dm-log-writes format
> >
> > ----------------------------------------------------------------
> 
> Hi; this gives a warning on OpenBSD and NetBSD:
> 
> /home/qemu/block/qcow2.c: In function 'qcow2_compress':
> /home/qemu/block/qcow2.c:3684:18: warning: assignment discards 'const'
> qualifier from pointer target type
>      strm.next_in = src;

Hm, looks like they use a really old version of the zlib header, which
doesn't care about const correctness. I'll add a cast to work around it.
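
As a rough illustration of that kind of workaround (not the actual change
to qcow2.c), the sketch below uses a hypothetical struct standing in for the
stream type of an old zlib header whose input pointer is not
const-qualified. The names old_stream_sketch and sketch_set_input are made
up for this example.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for a stream struct from an old header in which
 * next_in is declared without const.
 */
typedef struct {
    unsigned char *next_in;
    unsigned int avail_in;
} old_stream_sketch;

static void sketch_set_input(old_stream_sketch *strm, const void *src,
                             size_t src_size)
{
    /*
     * The explicit cast drops the const qualifier, so the assignment
     * does not trigger the "discards 'const' qualifier" warning. The
     * buffer is still only meant to be read through this pointer.
     */
    strm->next_in = (void *) src;
    strm->avail_in = (unsigned int) src_size;
}
```

Going through void * keeps the assignment valid whether the field is
declared const or not, at the cost of hiding the qualifier from the compiler.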

Kevin


end of thread, other threads:[~2018-07-05  8:28 UTC | newest]

Thread overview: 9+ messages
2018-07-03 14:59 [Qemu-devel] [PULL 0/5] Block layer patches Kevin Wolf
2018-07-03 14:59 ` [Qemu-devel] [PULL 1/5] qemu-img: allow compressed not-in-order writes Kevin Wolf
2018-07-03 14:59 ` [Qemu-devel] [PULL 2/5] qcow2: refactor data compression Kevin Wolf
2018-07-03 14:59 ` [Qemu-devel] [PULL 3/5] qcow2: add compress threads Kevin Wolf
2018-07-03 14:59 ` [Qemu-devel] [PULL 4/5] block: Move two block permission constants to the relevant enum Kevin Wolf
2018-07-03 14:59 ` [Qemu-devel] [PULL 5/5] block: Add blklogwrites Kevin Wolf
2018-07-03 15:07   ` Kevin Wolf
2018-07-04 18:30 ` [Qemu-devel] [PULL 0/5] Block layer patches Peter Maydell
2018-07-05  8:28   ` Kevin Wolf
