All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v2 0/9] block: Add COR filter driver
@ 2018-04-21 13:29 Max Reitz
  2018-04-21 13:29 ` [Qemu-devel] [PATCH v2 1/9] " Max Reitz
                   ` (10 more replies)
  0 siblings, 11 replies; 24+ messages in thread
From: Max Reitz @ 2018-04-21 13:29 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, Max Reitz, Stefan Hajnoczi, Alberto Garcia, Kevin Wolf

This series adds a copy-on-read block filter driver which works by
simply setting the BDRV_REQ_COPY_ON_READ flag on write requests (and
otherwise just passing everything through).

Cover letter of v1:
http://lists.nongnu.org/archive/html/qemu-block/2018-04/msg00285.html


v2:
- Renamed the driver "copy-on-read" (from just "cor") [Stefan]
- Commit message typo fix in patch 2 [Stefan, Berto]


git-backport-diff against v2:

Key:
[----] : patches are identical
[####] : number of functional differences between upstream/downstream patch
[down] : patch is downstream-only
The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively

001/9:[0020] [FC] 'block: Add COR filter driver'
002/9:[----] [--] 'block: BLK_PERM_WRITE includes ..._UNCHANGED'
003/9:[----] [--] 'block: Add BDRV_REQ_WRITE_UNCHANGED flag'
004/9:[----] [--] 'block: Set BDRV_REQ_WRITE_UNCHANGED for COR writes'
005/9:[----] [--] 'block/quorum: Support BDRV_REQ_WRITE_UNCHANGED'
006/9:[----] [--] 'block: Support BDRV_REQ_WRITE_UNCHANGED in filters'
007/9:[----] [--] 'iotests: Clean up wrap image in 197'
008/9:[0010] [FC] 'iotests: Copy 197 for COR filter driver'
009/9:[0004] [FC] 'iotests: Add test for COR across nodes'


Max Reitz (9):
  block: Add COR filter driver
  block: BLK_PERM_WRITE includes ..._UNCHANGED
  block: Add BDRV_REQ_WRITE_UNCHANGED flag
  block: Set BDRV_REQ_WRITE_UNCHANGED for COR writes
  block/quorum: Support BDRV_REQ_WRITE_UNCHANGED
  block: Support BDRV_REQ_WRITE_UNCHANGED in filters
  iotests: Clean up wrap image in 197
  iotests: Copy 197 for COR filter driver
  iotests: Add test for COR across nodes

 block/Makefile.objs        |   2 +-
 qapi/block-core.json       |   5 +-
 include/block/block.h      |   9 ++-
 block/blkdebug.c           |   9 +--
 block/blkreplay.c          |   3 +
 block/blkverify.c          |   3 +
 block/copy-on-read.c       | 173 +++++++++++++++++++++++++++++++++++++++++++++
 block/io.c                 |  12 +++-
 block/mirror.c             |   2 +
 block/quorum.c             |  19 +++--
 block/raw-format.c         |   9 +--
 block/throttle.c           |   6 +-
 tests/qemu-iotests/197     |   1 +
 tests/qemu-iotests/215     | 120 +++++++++++++++++++++++++++++++
 tests/qemu-iotests/215.out |  26 +++++++
 tests/qemu-iotests/216     | 117 ++++++++++++++++++++++++++++++
 tests/qemu-iotests/216.out |  28 ++++++++
 tests/qemu-iotests/group   |   2 +
 18 files changed, 524 insertions(+), 22 deletions(-)
 create mode 100644 block/copy-on-read.c
 create mode 100755 tests/qemu-iotests/215
 create mode 100644 tests/qemu-iotests/215.out
 create mode 100755 tests/qemu-iotests/216
 create mode 100644 tests/qemu-iotests/216.out

-- 
2.14.3

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [Qemu-devel] [PATCH v2 1/9] block: Add COR filter driver
  2018-04-21 13:29 [Qemu-devel] [PATCH v2 0/9] block: Add COR filter driver Max Reitz
@ 2018-04-21 13:29 ` Max Reitz
  2018-04-24 15:08   ` Alberto Garcia
  2018-04-21 13:29 ` [Qemu-devel] [PATCH v2 2/9] block: BLK_PERM_WRITE includes ..._UNCHANGED Max Reitz
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 24+ messages in thread
From: Max Reitz @ 2018-04-21 13:29 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, Max Reitz, Stefan Hajnoczi, Alberto Garcia, Kevin Wolf

This adds a simple copy-on-read filter driver.  It relies on the already
existing COR functionality in the central block layer code, which may be
moved here once we no longer need it there.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/Makefile.objs  |   2 +-
 qapi/block-core.json |   5 +-
 block/copy-on-read.c | 171 +++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 176 insertions(+), 2 deletions(-)
 create mode 100644 block/copy-on-read.c

diff --git a/block/Makefile.objs b/block/Makefile.objs
index d644bac60a..899bfb5e2c 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -26,7 +26,7 @@ block-obj-y += accounting.o dirty-bitmap.o
 block-obj-y += write-threshold.o
 block-obj-y += backup.o
 block-obj-$(CONFIG_REPLICATION) += replication.o
-block-obj-y += throttle.o
+block-obj-y += throttle.o copy-on-read.o
 
 block-obj-y += crypto.o
 
diff --git a/qapi/block-core.json b/qapi/block-core.json
index c50517bff3..32ca94e294 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2506,11 +2506,12 @@
 # @vxhs: Since 2.10
 # @throttle: Since 2.11
 # @nvme: Since 2.12
+# @copy-on-read: Since 2.13
 #
 # Since: 2.9
 ##
 { 'enum': 'BlockdevDriver',
-  'data': [ 'blkdebug', 'blkverify', 'bochs', 'cloop',
+  'data': [ 'blkdebug', 'blkverify', 'bochs', 'cloop', 'copy-on-read',
             'dmg', 'file', 'ftp', 'ftps', 'gluster', 'host_cdrom',
             'host_device', 'http', 'https', 'iscsi', 'luks', 'nbd', 'nfs',
             'null-aio', 'null-co', 'nvme', 'parallels', 'qcow', 'qcow2', 'qed',
@@ -3522,6 +3523,7 @@
       'blkverify':  'BlockdevOptionsBlkverify',
       'bochs':      'BlockdevOptionsGenericFormat',
       'cloop':      'BlockdevOptionsGenericFormat',
+      'copy-on-read':'BlockdevOptionsGenericFormat',
       'dmg':        'BlockdevOptionsGenericFormat',
       'file':       'BlockdevOptionsFile',
       'ftp':        'BlockdevOptionsCurlFtp',
@@ -4049,6 +4051,7 @@
       'blkverify':      'BlockdevCreateNotSupported',
       'bochs':          'BlockdevCreateNotSupported',
       'cloop':          'BlockdevCreateNotSupported',
+      'copy-on-read':   'BlockdevCreateNotSupported',
       'dmg':            'BlockdevCreateNotSupported',
       'file':           'BlockdevCreateOptionsFile',
       'ftp':            'BlockdevCreateNotSupported',
diff --git a/block/copy-on-read.c b/block/copy-on-read.c
new file mode 100644
index 0000000000..823ec751c4
--- /dev/null
+++ b/block/copy-on-read.c
@@ -0,0 +1,171 @@
+/*
+ * Copy-on-read filter block driver
+ *
+ * Copyright (c) 2018 Red Hat, Inc.
+ *
+ * Author:
+ *   Max Reitz <mreitz@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 or
+ * (at your option) version 3 of the License.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "block/block_int.h"
+
+
+static int cor_open(BlockDriverState *bs, QDict *options, int flags,
+                    Error **errp)
+{
+    bs->file = bdrv_open_child(NULL, options, "file", bs, &child_file, false,
+                               errp);
+    if (!bs->file) {
+        return -EINVAL;
+    }
+
+    bs->supported_write_flags = BDRV_REQ_FUA &
+                                    bs->file->bs->supported_write_flags;
+
+    bs->supported_zero_flags = (BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP) &
+                                    bs->file->bs->supported_zero_flags;
+
+    return 0;
+}
+
+
+static void cor_close(BlockDriverState *bs)
+{
+}
+
+
+#define PERM_PASSTHROUGH (BLK_PERM_CONSISTENT_READ \
+                          | BLK_PERM_WRITE \
+                          | BLK_PERM_RESIZE)
+#define PERM_UNCHANGED (BLK_PERM_ALL & ~PERM_PASSTHROUGH)
+
+static void cor_child_perm(BlockDriverState *bs, BdrvChild *c,
+                           const BdrvChildRole *role,
+                           BlockReopenQueue *reopen_queue,
+                           uint64_t perm, uint64_t shared,
+                           uint64_t *nperm, uint64_t *nshared)
+{
+    if (c == NULL) {
+        *nperm = (perm & PERM_PASSTHROUGH) | BLK_PERM_WRITE_UNCHANGED;
+        *nshared = (shared & PERM_PASSTHROUGH) | PERM_UNCHANGED;
+        return;
+    }
+
+    *nperm = (perm & PERM_PASSTHROUGH) |
+             (c->perm & PERM_UNCHANGED);
+    *nshared = (shared & PERM_PASSTHROUGH) |
+               (c->shared_perm & PERM_UNCHANGED);
+}
+
+
+static int64_t cor_getlength(BlockDriverState *bs)
+{
+    return bdrv_getlength(bs->file->bs);
+}
+
+
+static int cor_truncate(BlockDriverState *bs, int64_t offset,
+                        PreallocMode prealloc, Error **errp)
+{
+    return bdrv_truncate(bs->file, offset, prealloc, errp);
+}
+
+
+static int coroutine_fn cor_co_preadv(BlockDriverState *bs,
+                                      uint64_t offset, uint64_t bytes,
+                                      QEMUIOVector *qiov, int flags)
+{
+    return bdrv_co_preadv(bs->file, offset, bytes, qiov,
+                          flags | BDRV_REQ_COPY_ON_READ);
+}
+
+
+static int coroutine_fn cor_co_pwritev(BlockDriverState *bs,
+                                       uint64_t offset, uint64_t bytes,
+                                       QEMUIOVector *qiov, int flags)
+{
+
+    return bdrv_co_pwritev(bs->file, offset, bytes, qiov, flags);
+}
+
+
+static int coroutine_fn cor_co_pwrite_zeroes(BlockDriverState *bs,
+                                             int64_t offset, int bytes,
+                                             BdrvRequestFlags flags)
+{
+    return bdrv_co_pwrite_zeroes(bs->file, offset, bytes, flags);
+}
+
+
+static int coroutine_fn cor_co_pdiscard(BlockDriverState *bs,
+                                        int64_t offset, int bytes)
+{
+    return bdrv_co_pdiscard(bs->file->bs, offset, bytes);
+}
+
+
+static void cor_eject(BlockDriverState *bs, bool eject_flag)
+{
+    bdrv_eject(bs->file->bs, eject_flag);
+}
+
+
+static void cor_lock_medium(BlockDriverState *bs, bool locked)
+{
+    bdrv_lock_medium(bs->file->bs, locked);
+}
+
+
+static bool cor_recurse_is_first_non_filter(BlockDriverState *bs,
+                                            BlockDriverState *candidate)
+{
+    return bdrv_recurse_is_first_non_filter(bs->file->bs, candidate);
+}
+
+
+BlockDriver bdrv_copy_on_read = {
+    .format_name                        = "copy-on-read",
+
+    .bdrv_open                          = cor_open,
+    .bdrv_close                         = cor_close,
+    .bdrv_child_perm                    = cor_child_perm,
+
+    .bdrv_getlength                     = cor_getlength,
+    .bdrv_truncate                      = cor_truncate,
+
+    .bdrv_co_preadv                     = cor_co_preadv,
+    .bdrv_co_pwritev                    = cor_co_pwritev,
+    .bdrv_co_pwrite_zeroes              = cor_co_pwrite_zeroes,
+    .bdrv_co_pdiscard                   = cor_co_pdiscard,
+
+    .bdrv_eject                         = cor_eject,
+    .bdrv_lock_medium                   = cor_lock_medium,
+
+    .bdrv_co_block_status               = bdrv_co_block_status_from_file,
+
+    .bdrv_recurse_is_first_non_filter   = cor_recurse_is_first_non_filter,
+
+    .has_variable_length                = true,
+    .is_filter                          = true,
+};
+
+static void bdrv_copy_on_read_init(void)
+{
+    bdrv_register(&bdrv_copy_on_read);
+}
+
+block_init(bdrv_copy_on_read_init);
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [Qemu-devel] [PATCH v2 2/9] block: BLK_PERM_WRITE includes ..._UNCHANGED
  2018-04-21 13:29 [Qemu-devel] [PATCH v2 0/9] block: Add COR filter driver Max Reitz
  2018-04-21 13:29 ` [Qemu-devel] [PATCH v2 1/9] " Max Reitz
@ 2018-04-21 13:29 ` Max Reitz
  2018-04-21 13:29 ` [Qemu-devel] [PATCH v2 3/9] block: Add BDRV_REQ_WRITE_UNCHANGED flag Max Reitz
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 24+ messages in thread
From: Max Reitz @ 2018-04-21 13:29 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, Max Reitz, Stefan Hajnoczi, Alberto Garcia, Kevin Wolf

Currently we never actually check whether the WRITE_UNCHANGED
permission has been taken for unchanging writes.  But the one check that
is commented out checks both WRITE and WRITE_UNCHANGED; and considering
that WRITE_UNCHANGED is already documented as being weaker than WRITE,
we should probably explicitly document WRITE to include WRITE_UNCHANGED.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
---
 include/block/block.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/block/block.h b/include/block/block.h
index cdec3639a3..397b5e8d44 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -205,6 +205,9 @@ enum {
      * This permission (which is weaker than BLK_PERM_WRITE) is both enough and
      * required for writes to the block node when the caller promises that
      * the visible disk content doesn't change.
+     *
+     * As the BLK_PERM_WRITE permission is strictly stronger, either is
+     * sufficient to perform an unchanging write.
      */
     BLK_PERM_WRITE_UNCHANGED    = 0x04,
 
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [Qemu-devel] [PATCH v2 3/9] block: Add BDRV_REQ_WRITE_UNCHANGED flag
  2018-04-21 13:29 [Qemu-devel] [PATCH v2 0/9] block: Add COR filter driver Max Reitz
  2018-04-21 13:29 ` [Qemu-devel] [PATCH v2 1/9] " Max Reitz
  2018-04-21 13:29 ` [Qemu-devel] [PATCH v2 2/9] block: BLK_PERM_WRITE includes ..._UNCHANGED Max Reitz
@ 2018-04-21 13:29 ` Max Reitz
  2018-04-25 14:33   ` Eric Blake
  2018-04-21 13:29 ` [Qemu-devel] [PATCH v2 4/9] block: Set BDRV_REQ_WRITE_UNCHANGED for COR writes Max Reitz
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 24+ messages in thread
From: Max Reitz @ 2018-04-21 13:29 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, Max Reitz, Stefan Hajnoczi, Alberto Garcia, Kevin Wolf

This flag signifies that a write request will not change the visible
disk content.  With this flag set, it is sufficient to have the
BLK_PERM_WRITE_UNCHANGED permission instead of BLK_PERM_WRITE.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
---
 include/block/block.h | 6 +++++-
 block/io.c            | 6 +++++-
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index 397b5e8d44..3894edda9d 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -54,8 +54,12 @@ typedef enum {
     BDRV_REQ_FUA                = 0x10,
     BDRV_REQ_WRITE_COMPRESSED   = 0x20,
 
+    /* Signifies that this write request will not change the visible disk
+     * content. */
+    BDRV_REQ_WRITE_UNCHANGED    = 0x40,
+
     /* Mask of valid flags */
-    BDRV_REQ_MASK               = 0x3f,
+    BDRV_REQ_MASK               = 0x7f,
 } BdrvRequestFlags;
 
 typedef struct BlockSizes {
diff --git a/block/io.c b/block/io.c
index bd9a19a9c4..134b2a498f 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1501,7 +1501,11 @@ static int coroutine_fn bdrv_aligned_pwritev(BdrvChild *child,
     assert(!waited || !req->serialising);
     assert(req->overlap_offset <= offset);
     assert(offset + bytes <= req->overlap_offset + req->overlap_bytes);
-    assert(child->perm & BLK_PERM_WRITE);
+    if (flags & BDRV_REQ_WRITE_UNCHANGED) {
+        assert(child->perm & (BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE));
+    } else {
+        assert(child->perm & BLK_PERM_WRITE);
+    }
     assert(end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE);
 
     ret = notifier_with_return_list_notify(&bs->before_write_notifiers, req);
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [Qemu-devel] [PATCH v2 4/9] block: Set BDRV_REQ_WRITE_UNCHANGED for COR writes
  2018-04-21 13:29 [Qemu-devel] [PATCH v2 0/9] block: Add COR filter driver Max Reitz
                   ` (2 preceding siblings ...)
  2018-04-21 13:29 ` [Qemu-devel] [PATCH v2 3/9] block: Add BDRV_REQ_WRITE_UNCHANGED flag Max Reitz
@ 2018-04-21 13:29 ` Max Reitz
  2018-04-21 13:29 ` [Qemu-devel] [PATCH v2 5/9] block/quorum: Support BDRV_REQ_WRITE_UNCHANGED Max Reitz
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 24+ messages in thread
From: Max Reitz @ 2018-04-21 13:29 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, Max Reitz, Stefan Hajnoczi, Alberto Garcia, Kevin Wolf

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
---
 block/io.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/block/io.c b/block/io.c
index 134b2a498f..fada4efbf3 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1115,13 +1115,15 @@ static int coroutine_fn bdrv_co_do_copy_on_readv(BdrvChild *child,
                 /* FIXME: Should we (perhaps conditionally) be setting
                  * BDRV_REQ_MAY_UNMAP, if it will allow for a sparser copy
                  * that still correctly reads as zero? */
-                ret = bdrv_co_do_pwrite_zeroes(bs, cluster_offset, pnum, 0);
+                ret = bdrv_co_do_pwrite_zeroes(bs, cluster_offset, pnum,
+                                               BDRV_REQ_WRITE_UNCHANGED);
             } else {
                 /* This does not change the data on the disk, it is not
                  * necessary to flush even in cache=writethrough mode.
                  */
                 ret = bdrv_driver_pwritev(bs, cluster_offset, pnum,
-                                          &local_qiov, 0);
+                                          &local_qiov,
+                                          BDRV_REQ_WRITE_UNCHANGED);
             }
 
             if (ret < 0) {
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [Qemu-devel] [PATCH v2 5/9] block/quorum: Support BDRV_REQ_WRITE_UNCHANGED
  2018-04-21 13:29 [Qemu-devel] [PATCH v2 0/9] block: Add COR filter driver Max Reitz
                   ` (3 preceding siblings ...)
  2018-04-21 13:29 ` [Qemu-devel] [PATCH v2 4/9] block: Set BDRV_REQ_WRITE_UNCHANGED for COR writes Max Reitz
@ 2018-04-21 13:29 ` Max Reitz
  2018-04-21 13:29 ` [Qemu-devel] [PATCH v2 6/9] block: Support BDRV_REQ_WRITE_UNCHANGED in filters Max Reitz
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 24+ messages in thread
From: Max Reitz @ 2018-04-21 13:29 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, Max Reitz, Stefan Hajnoczi, Alberto Garcia, Kevin Wolf

We just need to forward it to quorum's children (except in case of a
rewrite because of corruption), but for that we first have to support
flags in child requests at all.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
---
 block/quorum.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/block/quorum.c b/block/quorum.c
index cfe484a945..8cd689b2c1 100644
--- a/block/quorum.c
+++ b/block/quorum.c
@@ -115,6 +115,7 @@ struct QuorumAIOCB {
     /* Request metadata */
     uint64_t offset;
     uint64_t bytes;
+    int flags;
 
     QEMUIOVector *qiov;         /* calling IOV */
 
@@ -157,7 +158,8 @@ static bool quorum_64bits_compare(QuorumVoteValue *a, QuorumVoteValue *b)
 static QuorumAIOCB *quorum_aio_get(BlockDriverState *bs,
                                    QEMUIOVector *qiov,
                                    uint64_t offset,
-                                   uint64_t bytes)
+                                   uint64_t bytes,
+                                   int flags)
 {
     BDRVQuorumState *s = bs->opaque;
     QuorumAIOCB *acb = g_new(QuorumAIOCB, 1);
@@ -168,6 +170,7 @@ static QuorumAIOCB *quorum_aio_get(BlockDriverState *bs,
         .bs                 = bs,
         .offset             = offset,
         .bytes              = bytes,
+        .flags              = flags,
         .qiov               = qiov,
         .votes.compare      = quorum_sha256_compare,
         .votes.vote_list    = QLIST_HEAD_INITIALIZER(acb.votes.vote_list),
@@ -271,9 +274,11 @@ static void quorum_rewrite_entry(void *opaque)
     BDRVQuorumState *s = acb->bs->opaque;
 
     /* Ignore any errors, it's just a correction attempt for already
-     * corrupted data. */
+     * corrupted data.
+     * Mask out BDRV_REQ_WRITE_UNCHANGED because this overwrites the
+     * area with different data from the other children. */
     bdrv_co_pwritev(s->children[co->idx], acb->offset, acb->bytes,
-                    acb->qiov, 0);
+                    acb->qiov, acb->flags & ~BDRV_REQ_WRITE_UNCHANGED);
 
     /* Wake up the caller after the last rewrite */
     acb->rewrite_count--;
@@ -673,7 +678,7 @@ static int quorum_co_preadv(BlockDriverState *bs, uint64_t offset,
                             uint64_t bytes, QEMUIOVector *qiov, int flags)
 {
     BDRVQuorumState *s = bs->opaque;
-    QuorumAIOCB *acb = quorum_aio_get(bs, qiov, offset, bytes);
+    QuorumAIOCB *acb = quorum_aio_get(bs, qiov, offset, bytes, flags);
     int ret;
 
     acb->is_read = true;
@@ -699,7 +704,7 @@ static void write_quorum_entry(void *opaque)
 
     sacb->bs = s->children[i]->bs;
     sacb->ret = bdrv_co_pwritev(s->children[i], acb->offset, acb->bytes,
-                                acb->qiov, 0);
+                                acb->qiov, acb->flags);
     if (sacb->ret == 0) {
         acb->success_count++;
     } else {
@@ -719,7 +724,7 @@ static int quorum_co_pwritev(BlockDriverState *bs, uint64_t offset,
                              uint64_t bytes, QEMUIOVector *qiov, int flags)
 {
     BDRVQuorumState *s = bs->opaque;
-    QuorumAIOCB *acb = quorum_aio_get(bs, qiov, offset, bytes);
+    QuorumAIOCB *acb = quorum_aio_get(bs, qiov, offset, bytes, flags);
     int i, ret;
 
     for (i = 0; i < s->num_children; i++) {
@@ -961,6 +966,8 @@ static int quorum_open(BlockDriverState *bs, QDict *options, int flags,
     }
     s->next_child_index = s->num_children;
 
+    bs->supported_write_flags = BDRV_REQ_WRITE_UNCHANGED;
+
     g_free(opened);
     goto exit;
 
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [Qemu-devel] [PATCH v2 6/9] block: Support BDRV_REQ_WRITE_UNCHANGED in filters
  2018-04-21 13:29 [Qemu-devel] [PATCH v2 0/9] block: Add COR filter driver Max Reitz
                   ` (4 preceding siblings ...)
  2018-04-21 13:29 ` [Qemu-devel] [PATCH v2 5/9] block/quorum: Support BDRV_REQ_WRITE_UNCHANGED Max Reitz
@ 2018-04-21 13:29 ` Max Reitz
  2018-04-21 13:29 ` [Qemu-devel] [PATCH v2 7/9] iotests: Clean up wrap image in 197 Max Reitz
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 24+ messages in thread
From: Max Reitz @ 2018-04-21 13:29 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, Max Reitz, Stefan Hajnoczi, Alberto Garcia, Kevin Wolf

Update the rest of the filter drivers to support
BDRV_REQ_WRITE_UNCHANGED.  They already forward write request flags to
their children, so we just have to announce support for it.

This patch does not cover the replication driver because that currently
does not support flags at all, and because it just grabs the WRITE
permission for its children when it can, so we should be fine just
submitting the incoming WRITE_UNCHANGED requests as normal writes.

It also does not cover format drivers for similar reasons.  They all use
bdrv_format_default_perms() as their .bdrv_child_perm() implementation
so they just always grab the WRITE permission for their file children
whenever possible.  In addition, it often would be difficult to
ascertain whether incoming unchanging writes end up as unchanging writes
in their files.  So we just leave them as normal potentially changing
writes.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
---
 block/blkdebug.c     |  9 +++++----
 block/blkreplay.c    |  3 +++
 block/blkverify.c    |  3 +++
 block/copy-on-read.c | 10 ++++++----
 block/mirror.c       |  2 ++
 block/raw-format.c   |  9 +++++----
 block/throttle.c     |  6 ++++--
 7 files changed, 28 insertions(+), 14 deletions(-)

diff --git a/block/blkdebug.c b/block/blkdebug.c
index 589712475a..762ec2527c 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -398,10 +398,11 @@ static int blkdebug_open(BlockDriverState *bs, QDict *options, int flags,
         goto out;
     }
 
-    bs->supported_write_flags = BDRV_REQ_FUA &
-        bs->file->bs->supported_write_flags;
-    bs->supported_zero_flags = (BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP) &
-        bs->file->bs->supported_zero_flags;
+    bs->supported_write_flags = BDRV_REQ_WRITE_UNCHANGED |
+        (BDRV_REQ_FUA & bs->file->bs->supported_write_flags);
+    bs->supported_zero_flags = BDRV_REQ_WRITE_UNCHANGED |
+        ((BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP) &
+            bs->file->bs->supported_zero_flags);
     ret = -EINVAL;
 
     /* Set alignment overrides */
diff --git a/block/blkreplay.c b/block/blkreplay.c
index fe5a9b4a98..b016dbeee7 100755
--- a/block/blkreplay.c
+++ b/block/blkreplay.c
@@ -35,6 +35,9 @@ static int blkreplay_open(BlockDriverState *bs, QDict *options, int flags,
         goto fail;
     }
 
+    bs->supported_write_flags = BDRV_REQ_WRITE_UNCHANGED;
+    bs->supported_zero_flags = BDRV_REQ_WRITE_UNCHANGED;
+
     ret = 0;
 fail:
     return ret;
diff --git a/block/blkverify.c b/block/blkverify.c
index 331365be33..1646404b46 100644
--- a/block/blkverify.c
+++ b/block/blkverify.c
@@ -141,6 +141,9 @@ static int blkverify_open(BlockDriverState *bs, QDict *options, int flags,
         goto fail;
     }
 
+    bs->supported_write_flags = BDRV_REQ_WRITE_UNCHANGED;
+    bs->supported_zero_flags = BDRV_REQ_WRITE_UNCHANGED;
+
     ret = 0;
 fail:
     qemu_opts_del(opts);
diff --git a/block/copy-on-read.c b/block/copy-on-read.c
index 823ec751c4..6a97208888 100644
--- a/block/copy-on-read.c
+++ b/block/copy-on-read.c
@@ -33,11 +33,13 @@ static int cor_open(BlockDriverState *bs, QDict *options, int flags,
         return -EINVAL;
     }
 
-    bs->supported_write_flags = BDRV_REQ_FUA &
-                                    bs->file->bs->supported_write_flags;
+    bs->supported_write_flags = BDRV_REQ_WRITE_UNCHANGED |
+                                (BDRV_REQ_FUA &
+                                    bs->file->bs->supported_write_flags);
 
-    bs->supported_zero_flags = (BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP) &
-                                    bs->file->bs->supported_zero_flags;
+    bs->supported_zero_flags = BDRV_REQ_WRITE_UNCHANGED |
+                               ((BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP) &
+                                    bs->file->bs->supported_zero_flags);
 
     return 0;
 }
diff --git a/block/mirror.c b/block/mirror.c
index 820f512c7b..6fdd36d27f 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -1148,6 +1148,8 @@ static void mirror_start_job(const char *job_id, BlockDriverState *bs,
         mirror_top_bs->implicit = true;
     }
     mirror_top_bs->total_sectors = bs->total_sectors;
+    mirror_top_bs->supported_write_flags = BDRV_REQ_WRITE_UNCHANGED;
+    mirror_top_bs->supported_zero_flags = BDRV_REQ_WRITE_UNCHANGED;
     bdrv_set_aio_context(mirror_top_bs, bdrv_get_aio_context(bs));
 
     /* bdrv_append takes ownership of the mirror_top_bs reference, need to keep
diff --git a/block/raw-format.c b/block/raw-format.c
index a378547c99..fe33693a2d 100644
--- a/block/raw-format.c
+++ b/block/raw-format.c
@@ -415,10 +415,11 @@ static int raw_open(BlockDriverState *bs, QDict *options, int flags,
     }
 
     bs->sg = bs->file->bs->sg;
-    bs->supported_write_flags = BDRV_REQ_FUA &
-        bs->file->bs->supported_write_flags;
-    bs->supported_zero_flags = (BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP) &
-        bs->file->bs->supported_zero_flags;
+    bs->supported_write_flags = BDRV_REQ_WRITE_UNCHANGED |
+        (BDRV_REQ_FUA & bs->file->bs->supported_write_flags);
+    bs->supported_zero_flags = BDRV_REQ_WRITE_UNCHANGED |
+        ((BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP) &
+            bs->file->bs->supported_zero_flags);
 
     if (bs->probed && !bdrv_is_read_only(bs)) {
         fprintf(stderr,
diff --git a/block/throttle.c b/block/throttle.c
index 95ed06acd8..e298827f95 100644
--- a/block/throttle.c
+++ b/block/throttle.c
@@ -81,8 +81,10 @@ static int throttle_open(BlockDriverState *bs, QDict *options,
     if (!bs->file) {
         return -EINVAL;
     }
-    bs->supported_write_flags = bs->file->bs->supported_write_flags;
-    bs->supported_zero_flags = bs->file->bs->supported_zero_flags;
+    bs->supported_write_flags = bs->file->bs->supported_write_flags |
+                                BDRV_REQ_WRITE_UNCHANGED;
+    bs->supported_zero_flags = bs->file->bs->supported_zero_flags |
+                               BDRV_REQ_WRITE_UNCHANGED;
 
     return throttle_configure_tgm(bs, tgm, options, errp);
 }
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [Qemu-devel] [PATCH v2 7/9] iotests: Clean up wrap image in 197
  2018-04-21 13:29 [Qemu-devel] [PATCH v2 0/9] block: Add COR filter driver Max Reitz
                   ` (5 preceding siblings ...)
  2018-04-21 13:29 ` [Qemu-devel] [PATCH v2 6/9] block: Support BDRV_REQ_WRITE_UNCHANGED in filters Max Reitz
@ 2018-04-21 13:29 ` Max Reitz
  2018-04-21 13:29 ` [Qemu-devel] [PATCH v2 8/9] iotests: Copy 197 for COR filter driver Max Reitz
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 24+ messages in thread
From: Max Reitz @ 2018-04-21 13:29 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, Max Reitz, Stefan Hajnoczi, Alberto Garcia, Kevin Wolf

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
---
 tests/qemu-iotests/197 | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tests/qemu-iotests/197 b/tests/qemu-iotests/197
index 5e869fe2b7..3ae4975eec 100755
--- a/tests/qemu-iotests/197
+++ b/tests/qemu-iotests/197
@@ -44,6 +44,7 @@ esac
 _cleanup()
 {
     _cleanup_test_img
+    rm -f "$TEST_WRAP"
     rm -f "$BLKDBG_CONF"
 }
 trap "_cleanup; exit \$status" 0 1 2 3 15
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [Qemu-devel] [PATCH v2 8/9] iotests: Copy 197 for COR filter driver
  2018-04-21 13:29 [Qemu-devel] [PATCH v2 0/9] block: Add COR filter driver Max Reitz
                   ` (6 preceding siblings ...)
  2018-04-21 13:29 ` [Qemu-devel] [PATCH v2 7/9] iotests: Clean up wrap image in 197 Max Reitz
@ 2018-04-21 13:29 ` Max Reitz
  2018-04-24 15:17   ` Alberto Garcia
  2018-04-21 13:29 ` [Qemu-devel] [PATCH v2 9/9] iotests: Add test for COR across nodes Max Reitz
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 24+ messages in thread
From: Max Reitz @ 2018-04-21 13:29 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, Max Reitz, Stefan Hajnoczi, Alberto Garcia, Kevin Wolf

iotest 197 tests copy-on-read using the (now old) copy-on-read flag.
Copy it to 215 and modify it to use the COR filter driver instead.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 tests/qemu-iotests/215     | 120 +++++++++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/215.out |  26 ++++++++++
 tests/qemu-iotests/group   |   1 +
 3 files changed, 147 insertions(+)
 create mode 100755 tests/qemu-iotests/215
 create mode 100644 tests/qemu-iotests/215.out

diff --git a/tests/qemu-iotests/215 b/tests/qemu-iotests/215
new file mode 100755
index 0000000000..2e616ed659
--- /dev/null
+++ b/tests/qemu-iotests/215
@@ -0,0 +1,120 @@
+#!/bin/bash
+#
+# Test case for copy-on-read into qcow2, using the COR filter driver
+#
+# Copyright (C) 2018 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+seq="$(basename $0)"
+echo "QA output created by $seq"
+
+here="$PWD"
+status=1 # failure is the default!
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+
+TEST_WRAP="$TEST_DIR/t.wrap.qcow2"
+BLKDBG_CONF="$TEST_DIR/blkdebug.conf"
+
+# Sanity check: our use of blkdebug fails if $TEST_DIR contains spaces
+# or other problems
+case "$TEST_DIR" in
+    *[^-_a-zA-Z0-9/]*)
+        _notrun "Suspicious TEST_DIR='$TEST_DIR', cowardly refusing to run" ;;
+esac
+
+_cleanup()
+{
+    _cleanup_test_img
+    rm -f "$TEST_WRAP"
+    rm -f "$BLKDBG_CONF"
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# Test is supported for any backing file; but we force qcow2 for our wrapper.
+_supported_fmt generic
+_supported_proto generic
+_supported_os Linux
+# LUKS support may be possible, but it complicates things.
+_unsupported_fmt luks
+
+echo
+echo '=== Copy-on-read ==='
+echo
+
+# Prep the images
+# VPC rounds image sizes to a specific geometry, force a specific size.
+if [ "$IMGFMT" = "vpc" ]; then
+    IMGOPTS=$(_optstr_add "$IMGOPTS" "force_size")
+fi
+_make_test_img 4G
+$QEMU_IO -c "write -P 55 3G 1k" "$TEST_IMG" | _filter_qemu_io
+IMGPROTO=file IMGFMT=qcow2 IMGOPTS= TEST_IMG_FILE="$TEST_WRAP" \
+    _make_test_img -F "$IMGFMT" -b "$TEST_IMG" | _filter_img_create
+$QEMU_IO -f qcow2 -c "write -z -u 1M 64k" "$TEST_WRAP" | _filter_qemu_io
+
+# Ensure that a read of two clusters, but where one is already allocated,
+# does not re-write the allocated cluster
+cat > "$BLKDBG_CONF" <<EOF
+[inject-error]
+event = "cor_write"
+sector = "2048"
+EOF
+$QEMU_IO -c "open \
+ -o driver=copy-on-read,file.driver=blkdebug,file.config=$BLKDBG_CONF,file.image.driver=qcow2 $TEST_WRAP" \
+ -c "read -P 0 1M 128k" | _filter_qemu_io
+
+# Read the areas we want copied. A zero-length read should still be a
+# no-op.  The next read is under 2G, but aligned so that rounding to
+# clusters copies more than 2G of zeroes. The final read will pick up
+# the non-zero data in the same cluster.  Since a 2G read may exhaust
+# memory on some machines (particularly 32-bit), we skip the test if
+# that fails due to memory pressure.
+$QEMU_IO \
+    -c "open -o driver=copy-on-read,file.driver=qcow2 $TEST_WRAP" \
+    -c "read 0 0" \
+    | _filter_qemu_io
+output=$($QEMU_IO \
+         -c "open -o driver=copy-on-read,file.driver=qcow2 $TEST_WRAP" \
+         -c "read -P 0 1k $((2*1024*1024*1024 - 512))" \
+         2>&1 | _filter_qemu_io)
+case $output in
+    *allocate*)
+        _notrun "Insufficent memory to run test" ;;
+    *) printf '%s\n' "$output" ;;
+esac
+$QEMU_IO \
+    -c "open -o driver=copy-on-read,file.driver=qcow2 $TEST_WRAP" \
+    -c "read -P 0 $((3*1024*1024*1024 + 1024)) 1k" \
+    | _filter_qemu_io
+
+# Copy-on-read is incompatible with read-only
+$QEMU_IO \
+    -c "open -r -o driver=copy-on-read,file.driver=qcow2 $TEST_WRAP" \
+    2>&1 | _filter_testdir
+
+# Break the backing chain, and show that images are identical, and that
+# we properly copied over explicit zeros.
+$QEMU_IMG rebase -u -b "" -f qcow2 "$TEST_WRAP"
+$QEMU_IO -f qcow2 -c map "$TEST_WRAP"
+_check_test_img
+$QEMU_IMG compare -f $IMGFMT -F qcow2 "$TEST_IMG" "$TEST_WRAP"
+
+# success, all done
+echo '*** done'
+status=0
diff --git a/tests/qemu-iotests/215.out b/tests/qemu-iotests/215.out
new file mode 100644
index 0000000000..70b0f5fb19
--- /dev/null
+++ b/tests/qemu-iotests/215.out
@@ -0,0 +1,26 @@
+QA output created by 215
+
+=== Copy-on-read ===
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=4294967296
+wrote 1024/1024 bytes at offset 3221225472
+1 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+Formatting 'TEST_DIR/t.wrap.IMGFMT', fmt=IMGFMT size=4294967296 backing_file=TEST_DIR/t.IMGFMT backing_fmt=IMGFMT
+wrote 65536/65536 bytes at offset 1048576
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 131072/131072 bytes at offset 1048576
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 0/0 bytes at offset 0
+0 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 2147483136/2147483136 bytes at offset 1024
+2 GiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 1024/1024 bytes at offset 3221226496
+1 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+can't open device TEST_DIR/t.wrap.qcow2: Block node is read-only
+2 GiB (0x80010000) bytes     allocated at offset 0 bytes (0x0)
+1023.938 MiB (0x3fff0000) bytes not allocated at offset 2 GiB (0x80010000)
+64 KiB (0x10000) bytes     allocated at offset 3 GiB (0xc0000000)
+1023.938 MiB (0x3fff0000) bytes not allocated at offset 3 GiB (0xc0010000)
+No errors were found on the image.
+Images are identical.
+*** done
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index 3e2dcdfa33..f1179ffde7 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -213,3 +213,4 @@
 212 rw auto quick
 213 rw auto quick
 214 rw auto
+215 rw auto quick
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [Qemu-devel] [PATCH v2 9/9] iotests: Add test for COR across nodes
  2018-04-21 13:29 [Qemu-devel] [PATCH v2 0/9] block: Add COR filter driver Max Reitz
                   ` (7 preceding siblings ...)
  2018-04-21 13:29 ` [Qemu-devel] [PATCH v2 8/9] iotests: Copy 197 for COR filter driver Max Reitz
@ 2018-04-21 13:29 ` Max Reitz
  2018-04-24 16:50   ` Kevin Wolf
  2018-04-24 16:51 ` [Qemu-devel] [PATCH v2 0/9] block: Add COR filter driver Kevin Wolf
  2018-04-25 12:18 ` Max Reitz
  10 siblings, 1 reply; 24+ messages in thread
From: Max Reitz @ 2018-04-21 13:29 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, Max Reitz, Stefan Hajnoczi, Alberto Garcia, Kevin Wolf

COR across nodes (that is, you have some filter node between the
actually COR target and the node that performs the COR) cannot reliably
work together with the permission system when there is no explicit COR
node that can request the WRITE_UNCHANGED permission for its child.
This is because COR (currently) sneaks its requests by the usual
permission checks, so it can work without a WRITE* permission; but if
there is a filter node in between, that will re-issue the request, which
then passes through the usual check -- and if nobody has requested a
WRITE_UNCHANGED permission, that check will fail.

There is no real direct fix apart from hoping that there is someone who
has requested that permission; in case of just the qemu-io HMP command
(and no guest device), however, that is not the case.  The real real fix
is to implement the copy-on-read flag through an implicitly added COR
node.  Such a node can request the necessary permissions as shown in
this test.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 tests/qemu-iotests/216     | 117 +++++++++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/216.out |  28 +++++++++++
 tests/qemu-iotests/group   |   1 +
 3 files changed, 146 insertions(+)
 create mode 100755 tests/qemu-iotests/216
 create mode 100644 tests/qemu-iotests/216.out

diff --git a/tests/qemu-iotests/216 b/tests/qemu-iotests/216
new file mode 100755
index 0000000000..b362cf93a8
--- /dev/null
+++ b/tests/qemu-iotests/216
@@ -0,0 +1,117 @@
+#!/usr/bin/env python
+#
+# Copy-on-read tests using a COR filter node
+#
+# Copyright (C) 2018 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+# Creator/Owner: Max Reitz <mreitz@redhat.com>
+#
+# Non-shared storage migration test using NBD server and drive-mirror
+
+import iotests
+from iotests import log, qemu_img_pipe, qemu_io, filter_qemu_io
+
+# Need backing file support
+iotests.verify_image_format(supported_fmts=['qcow2', 'qcow', 'qed', 'vmdk'])
+iotests.verify_platform(['linux'])
+
+log('')
+log('=== Copy-on-read across nodes ===')
+log('')
+
+# The old copy-on-read mechanism without a filter node cannot request
+# WRITE_UNCHANGED permissions for its child.  Therefore it just tries
+# to sneak its write by the usual permission system and holds its
+# fingers crossed.  However, that sneaking does not work so well when
+# there is a filter node in the way: That will receive the write
+# request and re-issue a new one to its child, which this time is a
+# proper write request that will make the permission system cough --
+# unless there is someone at the top (like a guest device) that has
+# requested write permissions.
+#
+# A COR filter node, however, can request the proper permissions for
+# its child and therefore is not hit by this issue.
+
+with iotests.FilePath('base.img') as base_img_path, \
+     iotests.FilePath('top.img') as top_img_path, \
+     iotests.VM() as vm:
+
+    log('--- Setting up images ---')
+    log('')
+
+    qemu_img_pipe('create', '-f', iotests.imgfmt, base_img_path, '64M')
+
+    log(filter_qemu_io(qemu_io(base_img_path, '-c', 'write -P 1 0M 1M')))
+
+    qemu_img_pipe('create', '-f', iotests.imgfmt, '-b', base_img_path,
+                  top_img_path)
+
+    log(filter_qemu_io(qemu_io(top_img_path,  '-c', 'write -P 2 1M 1M')))
+
+    log('')
+    log('--- Doing COR ---')
+    log('')
+
+    # Compare with e.g. the following:
+    #   vm.add_drive_raw('if=none,node-name=node0,copy-on-read=on,driver=raw,' \
+    #                    'file.driver=%s,file.file.filename=%s' %
+    #                       (iotests.imgfmt, top_img_path))
+    # (Remove the blockdev-add instead.)
+    # ((Not tested here because it hits an assertion in the permission
+    #   system.))
+
+    vm.launch()
+
+    log(vm.qmp('blockdev-add',
+                    node_name='node0',
+                    driver='copy-on-read',
+                    file={
+                        'driver': 'raw',
+                        'file': {
+                            'driver': 'copy-on-read',
+                            'file': {
+                                'driver': 'raw',
+                                'file': {
+                                    'driver': iotests.imgfmt,
+                                    'file': {
+                                        'driver': 'file',
+                                        'filename': top_img_path
+                                    },
+                                    'backing': {
+                                        'driver': iotests.imgfmt,
+                                        'file': {
+                                            'driver': 'file',
+                                            'filename': base_img_path
+                                        }
+                                    }
+                                }
+                            }
+                        }
+                    }))
+
+    # Trigger COR
+    log(vm.qmp('human-monitor-command',
+               command_line='qemu-io node0 "read 0 64M"'))
+
+    vm.shutdown()
+
+    log('')
+    log('--- Checking COR result ---')
+    log('')
+
+    log(filter_qemu_io(qemu_io(base_img_path, '-c', 'discard 0 64M')))
+    log(filter_qemu_io(qemu_io(top_img_path,  '-c', 'read -P 1 0M 1M')))
+    log(filter_qemu_io(qemu_io(top_img_path,  '-c', 'read -P 2 1M 1M')))
diff --git a/tests/qemu-iotests/216.out b/tests/qemu-iotests/216.out
new file mode 100644
index 0000000000..d3fc590d29
--- /dev/null
+++ b/tests/qemu-iotests/216.out
@@ -0,0 +1,28 @@
+
+=== Copy-on-read across nodes ===
+
+--- Setting up images ---
+
+wrote 1048576/1048576 bytes at offset 0
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+wrote 1048576/1048576 bytes at offset 1048576
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+
+--- Doing COR ---
+
+{u'return': {}}
+{u'return': u''}
+
+--- Checking COR result ---
+
+discard 67108864/67108864 bytes at offset 0
+64 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+read 1048576/1048576 bytes at offset 0
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+read 1048576/1048576 bytes at offset 1048576
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index f1179ffde7..2edc377370 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -214,3 +214,4 @@
 213 rw auto quick
 214 rw auto
 215 rw auto quick
+216 rw auto quick
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH v2 1/9] block: Add COR filter driver
  2018-04-21 13:29 ` [Qemu-devel] [PATCH v2 1/9] " Max Reitz
@ 2018-04-24 15:08   ` Alberto Garcia
  2018-04-25 11:18     ` Max Reitz
  0 siblings, 1 reply; 24+ messages in thread
From: Alberto Garcia @ 2018-04-24 15:08 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: qemu-devel, Stefan Hajnoczi, Kevin Wolf

On Sat 21 Apr 2018 03:29:21 PM CEST, Max Reitz wrote:
> This adds a simple copy-on-read filter driver.  It relies on the already
> existing COR functionality in the central block layer code, which may be
> moved here once we no longer need it there.
>
> Signed-off-by: Max Reitz <mreitz@redhat.com>


> +#define PERM_PASSTHROUGH (BLK_PERM_CONSISTENT_READ \
> +                          | BLK_PERM_WRITE \
> +                          | BLK_PERM_RESIZE)
> +#define PERM_UNCHANGED (BLK_PERM_ALL & ~PERM_PASSTHROUGH)
> +
> +static void cor_child_perm(BlockDriverState *bs, BdrvChild *c,
> +                           const BdrvChildRole *role,
> +                           BlockReopenQueue *reopen_queue,
> +                           uint64_t perm, uint64_t shared,
> +                           uint64_t *nperm, uint64_t *nshared)
> +{
> +    if (c == NULL) {
> +        *nperm = (perm & PERM_PASSTHROUGH) | BLK_PERM_WRITE_UNCHANGED;
> +        *nshared = (shared & PERM_PASSTHROUGH) | PERM_UNCHANGED;
> +        return;
> +    }
> +
> +    *nperm = (perm & PERM_PASSTHROUGH) |
> +             (c->perm & PERM_UNCHANGED);

I admit I'm not completely familiar with this, but don't you need to add
BLK_PERM_WRITE_UNCHANGED to *nperm ?

> +    *nshared = (shared & PERM_PASSTHROUGH) |
> +               (c->shared_perm & PERM_UNCHANGED);
> +}

Berto

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH v2 8/9] iotests: Copy 197 for COR filter driver
  2018-04-21 13:29 ` [Qemu-devel] [PATCH v2 8/9] iotests: Copy 197 for COR filter driver Max Reitz
@ 2018-04-24 15:17   ` Alberto Garcia
  0 siblings, 0 replies; 24+ messages in thread
From: Alberto Garcia @ 2018-04-24 15:17 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: qemu-devel, Stefan Hajnoczi, Kevin Wolf

On Sat 21 Apr 2018 03:29:28 PM CEST, Max Reitz wrote:
> iotest 197 tests copy-on-read using the (now old) copy-on-read flag.
> Copy it to 215 and modify it to use the COR filter driver instead.
>
> Signed-off-by: Max Reitz <mreitz@redhat.com>

Reviewed-by: Alberto Garcia <berto@igalia.com>

Berto

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH v2 9/9] iotests: Add test for COR across nodes
  2018-04-21 13:29 ` [Qemu-devel] [PATCH v2 9/9] iotests: Add test for COR across nodes Max Reitz
@ 2018-04-24 16:50   ` Kevin Wolf
  0 siblings, 0 replies; 24+ messages in thread
From: Kevin Wolf @ 2018-04-24 16:50 UTC (permalink / raw)
  To: Max Reitz; +Cc: qemu-block, qemu-devel, Stefan Hajnoczi, Alberto Garcia

Am 21.04.2018 um 15:29 hat Max Reitz geschrieben:
> COR across nodes (that is, you have some filter node between the
> actually COR target and the node that performs the COR) cannot reliably
> work together with the permission system when there is no explicit COR
> node that can request the WRITE_UNCHANGED permission for its child.
> This is because COR (currently) sneaks its requests by the usual
> permission checks, so it can work without a WRITE* permission; but if
> there is a filter node in between, that will re-issue the request, which
> then passes through the usual check -- and if nobody has requested a
> WRITE_UNCHANGED permission, that check will fail.
> 
> There is no real direct fix apart from hoping that there is someone who
> has requested that permission; in case of just the qemu-io HMP command
> (and no guest device), however, that is not the case.  The real real fix
> is to implement the copy-on-read flag through an implicitly added COR
> node.  Such a node can request the necessary permissions as shown in
> this test.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>  tests/qemu-iotests/216     | 117 +++++++++++++++++++++++++++++++++++++++++++++
>  tests/qemu-iotests/216.out |  28 +++++++++++
>  tests/qemu-iotests/group   |   1 +
>  3 files changed, 146 insertions(+)
>  create mode 100755 tests/qemu-iotests/216
>  create mode 100644 tests/qemu-iotests/216.out
> 
> diff --git a/tests/qemu-iotests/216 b/tests/qemu-iotests/216
> new file mode 100755
> index 0000000000..b362cf93a8
> --- /dev/null
> +++ b/tests/qemu-iotests/216
> @@ -0,0 +1,117 @@
> +#!/usr/bin/env python
> +#
> +# Copy-on-read tests using a COR filter node
> +#
> +# Copyright (C) 2018 Red Hat, Inc.
> +#
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 2 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> +#
> +# Creator/Owner: Max Reitz <mreitz@redhat.com>
> +#
> +# Non-shared storage migration test using NBD server and drive-mirror
> +
> +import iotests
> +from iotests import log, qemu_img_pipe, qemu_io, filter_qemu_io
> +
> +# Need backing file support
> +iotests.verify_image_format(supported_fmts=['qcow2', 'qcow', 'qed', 'vmdk'])
> +iotests.verify_platform(['linux'])
> +
> +log('')
> +log('=== Copy-on-read across nodes ===')
> +log('')
> +
> +# The old copy-on-read mechanism without a filter node cannot request
> +# WRITE_UNCHANGED permissions for its child.  Therefore it just tries
> +# to sneak its write by the usual permission system and holds its
> +# fingers crossed.  However, that sneaking does not work so well when
> +# there is a filter node in the way: That will receive the write
> +# request and re-issue a new one to its child, which this time is a
> +# proper write request that will make the permission system cough --
> +# unless there is someone at the top (like a guest device) that has
> +# requested write permissions.
> +#
> +# A COR filter node, however, can request the proper permissions for
> +# its child and therefore is not hit by this issue.
> +
> +with iotests.FilePath('base.img') as base_img_path, \
> +     iotests.FilePath('top.img') as top_img_path, \
> +     iotests.VM() as vm:
> +
> +    log('--- Setting up images ---')
> +    log('')
> +
> +    qemu_img_pipe('create', '-f', iotests.imgfmt, base_img_path, '64M')
> +
> +    log(filter_qemu_io(qemu_io(base_img_path, '-c', 'write -P 1 0M 1M')))
> +
> +    qemu_img_pipe('create', '-f', iotests.imgfmt, '-b', base_img_path,
> +                  top_img_path)
> +
> +    log(filter_qemu_io(qemu_io(top_img_path,  '-c', 'write -P 2 1M 1M')))
> +
> +    log('')
> +    log('--- Doing COR ---')
> +    log('')
> +
> +    # Compare with e.g. the following:
> +    #   vm.add_drive_raw('if=none,node-name=node0,copy-on-read=on,driver=raw,' \
> +    #                    'file.driver=%s,file.file.filename=%s' %
> +    #                       (iotests.imgfmt, top_img_path))
> +    # (Remove the blockdev-add instead.)
> +    # ((Not tested here because it hits an assertion in the permission
> +    #   system.))

That's... kind of unfortunate.

This probably means that we can't continue with converting stuff like
throttling to filter nodes internally before this is fixed (i.e. copy on
read must be converted first). Let's hope it's the only one and we don't
end up with mutual dependencies...

Kevin

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH v2 0/9] block: Add COR filter driver
  2018-04-21 13:29 [Qemu-devel] [PATCH v2 0/9] block: Add COR filter driver Max Reitz
                   ` (8 preceding siblings ...)
  2018-04-21 13:29 ` [Qemu-devel] [PATCH v2 9/9] iotests: Add test for COR across nodes Max Reitz
@ 2018-04-24 16:51 ` Kevin Wolf
  2018-04-25 12:18 ` Max Reitz
  10 siblings, 0 replies; 24+ messages in thread
From: Kevin Wolf @ 2018-04-24 16:51 UTC (permalink / raw)
  To: Max Reitz; +Cc: qemu-block, qemu-devel, Stefan Hajnoczi, Alberto Garcia

Am 21.04.2018 um 15:29 hat Max Reitz geschrieben:
> This series adds a copy-on-read block filter driver which works by
> simply setting the BDRV_REQ_COPY_ON_READ flag on write requests (and
> otherwise just passing everything through).

Reviewed-by: Kevin Wolf <kwolf@redhat.com>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH v2 1/9] block: Add COR filter driver
  2018-04-24 15:08   ` Alberto Garcia
@ 2018-04-25 11:18     ` Max Reitz
  2018-04-25 11:35       ` Alberto Garcia
  0 siblings, 1 reply; 24+ messages in thread
From: Max Reitz @ 2018-04-25 11:18 UTC (permalink / raw)
  To: Alberto Garcia, qemu-block; +Cc: qemu-devel, Stefan Hajnoczi, Kevin Wolf

[-- Attachment #1: Type: text/plain, Size: 1716 bytes --]

On 2018-04-24 17:08, Alberto Garcia wrote:
> On Sat 21 Apr 2018 03:29:21 PM CEST, Max Reitz wrote:
>> This adds a simple copy-on-read filter driver.  It relies on the already
>> existing COR functionality in the central block layer code, which may be
>> moved here once we no longer need it there.
>>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
> 
> 
>> +#define PERM_PASSTHROUGH (BLK_PERM_CONSISTENT_READ \
>> +                          | BLK_PERM_WRITE \
>> +                          | BLK_PERM_RESIZE)
>> +#define PERM_UNCHANGED (BLK_PERM_ALL & ~PERM_PASSTHROUGH)
>> +
>> +static void cor_child_perm(BlockDriverState *bs, BdrvChild *c,
>> +                           const BdrvChildRole *role,
>> +                           BlockReopenQueue *reopen_queue,
>> +                           uint64_t perm, uint64_t shared,
>> +                           uint64_t *nperm, uint64_t *nshared)
>> +{
>> +    if (c == NULL) {
>> +        *nperm = (perm & PERM_PASSTHROUGH) | BLK_PERM_WRITE_UNCHANGED;
>> +        *nshared = (shared & PERM_PASSTHROUGH) | PERM_UNCHANGED;
>> +        return;
>> +    }
>> +
>> +    *nperm = (perm & PERM_PASSTHROUGH) |
>> +             (c->perm & PERM_UNCHANGED);
> 
> I admit I'm not completely familiar with this, but don't you need to add
> BLK_PERM_WRITE_UNCHANGED to *nperm ?

As long as it's requested in when the child is attached (which it is in
the "c == NULL" case), it should be part of c->perm then.

(And since PERM_PASSTHROUGH does not contain WRITE_UNCHANGED, it is part
of PERM_UNCHANGED.)

Max

>> +    *nshared = (shared & PERM_PASSTHROUGH) |
>> +               (c->shared_perm & PERM_UNCHANGED);
>> +}
> 
> Berto
> 



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH v2 1/9] block: Add COR filter driver
  2018-04-25 11:18     ` Max Reitz
@ 2018-04-25 11:35       ` Alberto Garcia
  0 siblings, 0 replies; 24+ messages in thread
From: Alberto Garcia @ 2018-04-25 11:35 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: qemu-devel, Stefan Hajnoczi, Kevin Wolf

On Wed 25 Apr 2018 01:18:03 PM CEST, Max Reitz wrote:

>>> +#define PERM_PASSTHROUGH (BLK_PERM_CONSISTENT_READ \
>>> +                          | BLK_PERM_WRITE \
>>> +                          | BLK_PERM_RESIZE)
>>> +#define PERM_UNCHANGED (BLK_PERM_ALL & ~PERM_PASSTHROUGH)
>>> +
>>> +static void cor_child_perm(BlockDriverState *bs, BdrvChild *c,
>>> +                           const BdrvChildRole *role,
>>> +                           BlockReopenQueue *reopen_queue,
>>> +                           uint64_t perm, uint64_t shared,
>>> +                           uint64_t *nperm, uint64_t *nshared)
>>> +{
>>> +    if (c == NULL) {
>>> +        *nperm = (perm & PERM_PASSTHROUGH) | BLK_PERM_WRITE_UNCHANGED;
>>> +        *nshared = (shared & PERM_PASSTHROUGH) | PERM_UNCHANGED;
>>> +        return;
>>> +    }
>>> +
>>> +    *nperm = (perm & PERM_PASSTHROUGH) |
>>> +             (c->perm & PERM_UNCHANGED);
>> 
>> I admit I'm not completely familiar with this, but don't you need to
>> add BLK_PERM_WRITE_UNCHANGED to *nperm ?
>
> As long as it's requested in when the child is attached (which it is
> in the "c == NULL" case), it should be part of c->perm then.
>
> (And since PERM_PASSTHROUGH does not contain WRITE_UNCHANGED, it is
> part of PERM_UNCHANGED.)

I see, thanks.

Reviewed-by: Alberto Garcia <berto@igalia.com>

Berto

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH v2 0/9] block: Add COR filter driver
  2018-04-21 13:29 [Qemu-devel] [PATCH v2 0/9] block: Add COR filter driver Max Reitz
                   ` (9 preceding siblings ...)
  2018-04-24 16:51 ` [Qemu-devel] [PATCH v2 0/9] block: Add COR filter driver Kevin Wolf
@ 2018-04-25 12:18 ` Max Reitz
  2018-04-25 14:36   ` Eric Blake
  10 siblings, 1 reply; 24+ messages in thread
From: Max Reitz @ 2018-04-25 12:18 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, Stefan Hajnoczi, Alberto Garcia, Kevin Wolf

[-- Attachment #1: Type: text/plain, Size: 1405 bytes --]

On 2018-04-21 15:29, Max Reitz wrote:
> This series adds a copy-on-read block filter driver which works by
> simply setting the BDRV_REQ_COPY_ON_READ flag on write requests (and
> otherwise just passing everything through).
> 
> Cover letter of v1:
> http://lists.nongnu.org/archive/html/qemu-block/2018-04/msg00285.html
> 
> 
> v2:
> - Renamed the driver "copy-on-read" (from just "cor") [Stefan]
> - Commit message typo fix in patch 2 [Stefan, Berto]
> 
> 
> git-backport-diff against v2:
> 
> Key:
> [----] : patches are identical
> [####] : number of functional differences between upstream/downstream patch
> [down] : patch is downstream-only
> The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively
> 
> 001/9:[0020] [FC] 'block: Add COR filter driver'
> 002/9:[----] [--] 'block: BLK_PERM_WRITE includes ..._UNCHANGED'
> 003/9:[----] [--] 'block: Add BDRV_REQ_WRITE_UNCHANGED flag'
> 004/9:[----] [--] 'block: Set BDRV_REQ_WRITE_UNCHANGED for COR writes'
> 005/9:[----] [--] 'block/quorum: Support BDRV_REQ_WRITE_UNCHANGED'
> 006/9:[----] [--] 'block: Support BDRV_REQ_WRITE_UNCHANGED in filters'
> 007/9:[----] [--] 'iotests: Clean up wrap image in 197'
> 008/9:[0010] [FC] 'iotests: Copy 197 for COR filter driver'
> 009/9:[0004] [FC] 'iotests: Add test for COR across nodes'

Thanks for the reviews, applied to my block branch.

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH v2 3/9] block: Add BDRV_REQ_WRITE_UNCHANGED flag
  2018-04-21 13:29 ` [Qemu-devel] [PATCH v2 3/9] block: Add BDRV_REQ_WRITE_UNCHANGED flag Max Reitz
@ 2018-04-25 14:33   ` Eric Blake
  2018-04-25 15:08     ` Max Reitz
  0 siblings, 1 reply; 24+ messages in thread
From: Eric Blake @ 2018-04-25 14:33 UTC (permalink / raw)
  To: Max Reitz, qemu-block
  Cc: Kevin Wolf, Alberto Garcia, qemu-devel, Stefan Hajnoczi

[-- Attachment #1: Type: text/plain, Size: 1210 bytes --]

On 04/21/2018 08:29 AM, Max Reitz wrote:
> This flag signifies that a write request will not change the visible
> disk content.  With this flag set, it is sufficient to have the
> BLK_PERM_WRITE_UNCHANGED permission instead of BLK_PERM_WRITE.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
> Reviewed-by: Alberto Garcia <berto@igalia.com>
> ---
>  include/block/block.h | 6 +++++-
>  block/io.c            | 6 +++++-
>  2 files changed, 10 insertions(+), 2 deletions(-)

Since patch 5 adds an instance of a driver setting supported_write_flags
= BDRV_REQ_WRITE_UNCHANGED, I think this patch should tweak the
documentation of supported_write_flags (and supported_zero_flags?) in
block_int.h to mention that drivers can opt-in to supporting this
particular flag (rather than it being handled completely at the block
layer).

Also, that does raise the question of whether you have more work to
support write-zero requests with WRITE_UNCHANGED (which indeed sounds
like something plausible to support).

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 619 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH v2 0/9] block: Add COR filter driver
  2018-04-25 12:18 ` Max Reitz
@ 2018-04-25 14:36   ` Eric Blake
  0 siblings, 0 replies; 24+ messages in thread
From: Eric Blake @ 2018-04-25 14:36 UTC (permalink / raw)
  To: Max Reitz, qemu-block
  Cc: Kevin Wolf, Alberto Garcia, qemu-devel, Stefan Hajnoczi

[-- Attachment #1: Type: text/plain, Size: 734 bytes --]

On 04/25/2018 07:18 AM, Max Reitz wrote:
> On 2018-04-21 15:29, Max Reitz wrote:
>> This series adds a copy-on-read block filter driver which works by
>> simply setting the BDRV_REQ_COPY_ON_READ flag on write requests (and
>> otherwise just passing everything through).
>>

> 
> Thanks for the reviews, applied to my block branch.

Sorry for a late review - I've pointed out a missing documentation
change to supported_write_flags and supported_zero_flags; it can be
fixed either as a followup, or squashed in if you rebase your staging on
block branch before turning out a pull request.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 619 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH v2 3/9] block: Add BDRV_REQ_WRITE_UNCHANGED flag
  2018-04-25 14:33   ` Eric Blake
@ 2018-04-25 15:08     ` Max Reitz
  2018-04-26  2:12       ` Eric Blake
  0 siblings, 1 reply; 24+ messages in thread
From: Max Reitz @ 2018-04-25 15:08 UTC (permalink / raw)
  To: Eric Blake, qemu-block
  Cc: Kevin Wolf, Alberto Garcia, qemu-devel, Stefan Hajnoczi

[-- Attachment #1: Type: text/plain, Size: 2634 bytes --]

On 2018-04-25 16:33, Eric Blake wrote:
> On 04/21/2018 08:29 AM, Max Reitz wrote:
>> This flag signifies that a write request will not change the visible
>> disk content.  With this flag set, it is sufficient to have the
>> BLK_PERM_WRITE_UNCHANGED permission instead of BLK_PERM_WRITE.
>>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
>> Reviewed-by: Alberto Garcia <berto@igalia.com>
>> ---
>>  include/block/block.h | 6 +++++-
>>  block/io.c            | 6 +++++-
>>  2 files changed, 10 insertions(+), 2 deletions(-)
> 
> Since patch 5 adds an instance of a driver setting supported_write_flags
> = BDRV_REQ_WRITE_UNCHANGED, I think this patch should tweak the
> documentation of supported_write_flags (and supported_zero_flags?) in
> block_int.h to mention that drivers can opt-in to supporting this
> particular flag (rather than it being handled completely at the block
> layer).

Ah, right, I just didn't know we had a list of supported flags there.

And the flag isn't handled by the block layer if drivers don't support
it -- it just is cleared then and thus treated like a normal write.

Maybe I should make a note that drivers should support the flag when
they support writes to their children, but don't explicitly request a
WRITE permission (so they rely on their parent to request the proper
permission, which might be just WRITE_UNCHANGED and not the full WRITE).

I'll send a patch.

> Also, that does raise the question of whether you have more work to
> support write-zero requests with WRITE_UNCHANGED (which indeed sounds
> like something plausible to support).

I'm afraid I don't quite understand the question.
BDRV_REQ_WRITE_UNCHANGED support is usually rather simple because as
said above it is only needed by drivers that rely on their parent to
request the permissions, i.e. filter drivers.  Since filter drivers just
forward the requests, all they have to do is retain the
BDRV_REQ_WRITE_UNCHANGED flag, be it a zero write or a normal write.

It would be more complicated for format drivers, because they would have
to verify that the incoming unchanged write actually ends up as an
unchanged write in their file.  But we have already recognized that that
would be too much to ask and that format drivers may want to generally
just write anything to their child if it's writable (even regardless of
whether the grandparent issues writes to the format driver node), so
they always grab a WRITE permission on their file if possible.
Therefore, they do not have to support this request flag.

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH v2 3/9] block: Add BDRV_REQ_WRITE_UNCHANGED flag
  2018-04-25 15:08     ` Max Reitz
@ 2018-04-26  2:12       ` Eric Blake
  2018-04-26  8:58         ` Kevin Wolf
  2018-04-28 11:19         ` Max Reitz
  0 siblings, 2 replies; 24+ messages in thread
From: Eric Blake @ 2018-04-26  2:12 UTC (permalink / raw)
  To: Max Reitz, qemu-block
  Cc: Kevin Wolf, Alberto Garcia, qemu-devel, Stefan Hajnoczi

[-- Attachment #1: Type: text/plain, Size: 1975 bytes --]

On 04/25/2018 10:08 AM, Max Reitz wrote:

> 
>> Also, that does raise the question of whether you have more work to
>> support write-zero requests with WRITE_UNCHANGED (which indeed sounds
>> like something plausible to support).
> 
> I'm afraid I don't quite understand the question.
> BDRV_REQ_WRITE_UNCHANGED support is usually rather simple because as
> said above it is only needed by drivers that rely on their parent to
> request the permissions, i.e. filter drivers.  Since filter drivers just
> forward the requests, all they have to do is retain the
> BDRV_REQ_WRITE_UNCHANGED flag, be it a zero write or a normal write.

I'm trying to figure out if BDRV_REQ_WRITE_UNCHANGED makes sense for
bdrv_co_pwrite_zeroes as well as bdrv_co_pwrite. I think the answer is
yes (if we know the guest already reads zeroes, but need to write to the
protocol layer anyways because of a commit operation, then mixing both
BDRV_REQ_WRITE_UNCHANGED and BDRV_REQ_ZERO_WRITE to the block layer
makes sense, and supported_zero_flags should indeed pass
BDRV_REQ_WRITE_UNCHANGED on to a driver.

> 
> It would be more complicated for format drivers, because they would have
> to verify that the incoming unchanged write actually ends up as an
> unchanged write in their file.  But we have already recognized that that
> would be too much to ask and that format drivers may want to generally
> just write anything to their child if it's writable (even regardless of
> whether the grandparent issues writes to the format driver node), so
> they always grab a WRITE permission on their file if possible.
> Therefore, they do not have to support this request flag.

So qcow2 doesn't have to support the flag, but file-posix.c might want
to.  Or are you saying that only filter drivers need to advertise
support for the flag?

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 619 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH v2 3/9] block: Add BDRV_REQ_WRITE_UNCHANGED flag
  2018-04-26  2:12       ` Eric Blake
@ 2018-04-26  8:58         ` Kevin Wolf
  2018-04-28 11:19         ` Max Reitz
  1 sibling, 0 replies; 24+ messages in thread
From: Kevin Wolf @ 2018-04-26  8:58 UTC (permalink / raw)
  To: Eric Blake
  Cc: Max Reitz, qemu-block, Alberto Garcia, qemu-devel, Stefan Hajnoczi

[-- Attachment #1: Type: text/plain, Size: 3578 bytes --]

Am 26.04.2018 um 04:12 hat Eric Blake geschrieben:
> On 04/25/2018 10:08 AM, Max Reitz wrote:
> 
> > 
> >> Also, that does raise the question of whether you have more work to
> >> support write-zero requests with WRITE_UNCHANGED (which indeed sounds
> >> like something plausible to support).
> > 
> > I'm afraid I don't quite understand the question.
> > BDRV_REQ_WRITE_UNCHANGED support is usually rather simple because as
> > said above it is only needed by drivers that rely on their parent to
> > request the permissions, i.e. filter drivers.  Since filter drivers just
> > forward the requests, all they have to do is retain the
> > BDRV_REQ_WRITE_UNCHANGED flag, be it a zero write or a normal write.
> 
> I'm trying to figure out if BDRV_REQ_WRITE_UNCHANGED makes sense for
> bdrv_co_pwrite_zeroes as well as bdrv_co_pwrite. I think the answer is
> yes (if we know the guest already reads zeroes, but need to write to the
> protocol layer anyways because of a commit operation, then mixing both
> BDRV_REQ_WRITE_UNCHANGED and BDRV_REQ_ZERO_WRITE to the block layer
> makes sense, and supported_zero_flags should indeed pass
> BDRV_REQ_WRITE_UNCHANGED on to a driver.

Yes, that makes sense to me. (Except that commit isn't WRITE_UNCHANGED,
but streaming is.)

> > It would be more complicated for format drivers, because they would have
> > to verify that the incoming unchanged write actually ends up as an
> > unchanged write in their file.  But we have already recognized that that
> > would be too much to ask and that format drivers may want to generally
> > just write anything to their child if it's writable (even regardless of
> > whether the grandparent issues writes to the format driver node), so
> > they always grab a WRITE permission on their file if possible.
> > Therefore, they do not have to support this request flag.
> 
> So qcow2 doesn't have to support the flag, but file-posix.c might want
> to.  Or are you saying that only filter drivers need to advertise
> support for the flag?

Essentially a caller of bdrv_co_pwritev() must pass the flag if it wants
to write to a child for which it doesn't have BLK_PERM_WRITE.

This is potentially true for filter drivers if their parent does the
same (because we don't let filter drivers take BLK_PERM_WRITE
unconditionally, but only if one of their parents requests it on the
filter node).

In order to be able to rightfully set BDRV_REQ_WRITE_UNCHANGED for its
own requests, however, the child node must make sure that the parent
request it is processing already had the flag, i.e. it must support
BDRV_REQ_WRITE_UNCHANGED.

In contrast, qcow2 and other format drivers never need to send
BDRV_REQ_WRITE_UNCHANGED requests to their children (because they always
take BLK_PERM_WRITE for writable images anyway), so they don't have a
use for the flag and therefore don't need to advertise support for it.

This means that after putting a filter driver between a node and its
read-only (or write-unchanged-only, to be precise) parent, that node may
still be shared with users that don't tolerate a writer, but putting a
format driver above it will fail while that other user is still
attached. This is the expected behaviour.

(And if you think about it, there is no reason for format drivers to
ever send a write unchanged request: Something like image streaming is
about the backing chain and doesn't exist for the file child. So if the
content really doesn't change, the format driver just shouldn't send the
request at all.)

Kevin

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH v2 3/9] block: Add BDRV_REQ_WRITE_UNCHANGED flag
  2018-04-26  2:12       ` Eric Blake
  2018-04-26  8:58         ` Kevin Wolf
@ 2018-04-28 11:19         ` Max Reitz
  2018-04-30  8:41           ` Kevin Wolf
  1 sibling, 1 reply; 24+ messages in thread
From: Max Reitz @ 2018-04-28 11:19 UTC (permalink / raw)
  To: Eric Blake, qemu-block
  Cc: Kevin Wolf, Alberto Garcia, qemu-devel, Stefan Hajnoczi

[-- Attachment #1: Type: text/plain, Size: 3629 bytes --]

On 2018-04-26 04:12, Eric Blake wrote:
> On 04/25/2018 10:08 AM, Max Reitz wrote:
> 
>>
>>> Also, that does raise the question of whether you have more work to
>>> support write-zero requests with WRITE_UNCHANGED (which indeed sounds
>>> like something plausible to support).
>>
>> I'm afraid I don't quite understand the question.
>> BDRV_REQ_WRITE_UNCHANGED support is usually rather simple because as
>> said above it is only needed by drivers that rely on their parent to
>> request the permissions, i.e. filter drivers.  Since filter drivers just
>> forward the requests, all they have to do is retain the
>> BDRV_REQ_WRITE_UNCHANGED flag, be it a zero write or a normal write.
> 
> I'm trying to figure out if BDRV_REQ_WRITE_UNCHANGED makes sense for
> bdrv_co_pwrite_zeroes as well as bdrv_co_pwrite. I think the answer is
> yes (if we know the guest already reads zeroes, but need to write to the
> protocol layer anyways because of a commit operation, then mixing both
> BDRV_REQ_WRITE_UNCHANGED and BDRV_REQ_ZERO_WRITE to the block layer
> makes sense, and supported_zero_flags should indeed pass
> BDRV_REQ_WRITE_UNCHANGED on to a driver.

Well, why not.

>> It would be more complicated for format drivers, because they would have
>> to verify that the incoming unchanged write actually ends up as an
>> unchanged write in their file.  But we have already recognized that that
>> would be too much to ask and that format drivers may want to generally
>> just write anything to their child if it's writable (even regardless of
>> whether the grandparent issues writes to the format driver node), so
>> they always grab a WRITE permission on their file if possible.
>> Therefore, they do not have to support this request flag.
> 
> So qcow2 doesn't have to support the flag, but file-posix.c might want
> to.  Or are you saying that only filter drivers need to advertise
> support for the flag?

It might make sense for file-posix, but when you think further, it
wouldn't do anything in practice.

First, if a protocol driver receives WRITE_UNCHANGED, there are two
things it can do: If it knows that it only has very plain storage, it
can just ignore the request because it won't do anything.

If a protocol driver supports backing files (gluster does, I think), it
cannot ignore them because just like qcow2 it needs to now make sure
that the data ends up in the top layer.  So if the protocol driver can
draw any benefits from advertising to the protocol that this is an
unchanging write (or that it just wants to bring data from a backing
file up to the top level, which it can infer from the WRITE_UNCHANGED
flag), then it would make sense for it to support the flag.

(Good point to add to the documentation.)

But from qemu's perspective, this is not required.  The protocol driver
will just be less efficient if it submits normal writes (and it can do
that, because qemu will refuse to grant WRITE_UNCHANGED on read-only
nodes, so the protocol driver must have write access).

OTOH, filter drivers (usually) *must* support this flag because they are
caught up in qemu's permission system.  They do not submit the writes to
some protocol, but to the block layer, and the block layer wants you to
have the proper permissions for that -- and that's not a simple RW/RO
decision, but we have read/write-unchanged in between.  So if a filter
driver that only forwards the permissions from its parent may only do
the very same requests the parent does -- and if the parent does a
WRITE_UNCHANGED request, the filter driver must do the same.

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH v2 3/9] block: Add BDRV_REQ_WRITE_UNCHANGED flag
  2018-04-28 11:19         ` Max Reitz
@ 2018-04-30  8:41           ` Kevin Wolf
  0 siblings, 0 replies; 24+ messages in thread
From: Kevin Wolf @ 2018-04-30  8:41 UTC (permalink / raw)
  To: Max Reitz
  Cc: Eric Blake, qemu-block, Alberto Garcia, qemu-devel, Stefan Hajnoczi

[-- Attachment #1: Type: text/plain, Size: 2772 bytes --]

Am 28.04.2018 um 13:19 hat Max Reitz geschrieben:
> On 2018-04-26 04:12, Eric Blake wrote:
> > On 04/25/2018 10:08 AM, Max Reitz wrote:
> > 
> >>
> >>> Also, that does raise the question of whether you have more work to
> >>> support write-zero requests with WRITE_UNCHANGED (which indeed sounds
> >>> like something plausible to support).
> >>
> >> I'm afraid I don't quite understand the question.
> >> BDRV_REQ_WRITE_UNCHANGED support is usually rather simple because as
> >> said above it is only needed by drivers that rely on their parent to
> >> request the permissions, i.e. filter drivers.  Since filter drivers just
> >> forward the requests, all they have to do is retain the
> >> BDRV_REQ_WRITE_UNCHANGED flag, be it a zero write or a normal write.
> > 
> > I'm trying to figure out if BDRV_REQ_WRITE_UNCHANGED makes sense for
> > bdrv_co_pwrite_zeroes as well as bdrv_co_pwrite. I think the answer is
> > yes (if we know the guest already reads zeroes, but need to write to the
> > protocol layer anyways because of a commit operation, then mixing both
> > BDRV_REQ_WRITE_UNCHANGED and BDRV_REQ_ZERO_WRITE to the block layer
> > makes sense, and supported_zero_flags should indeed pass
> > BDRV_REQ_WRITE_UNCHANGED on to a driver.
> 
> Well, why not.
> 
> >> It would be more complicated for format drivers, because they would have
> >> to verify that the incoming unchanged write actually ends up as an
> >> unchanged write in their file.  But we have already recognized that that
> >> would be too much to ask and that format drivers may want to generally
> >> just write anything to their child if it's writable (even regardless of
> >> whether the grandparent issues writes to the format driver node), so
> >> they always grab a WRITE permission on their file if possible.
> >> Therefore, they do not have to support this request flag.
> > 
> > So qcow2 doesn't have to support the flag, but file-posix.c might want
> > to.  Or are you saying that only filter drivers need to advertise
> > support for the flag?
> 
> It might make sense for file-posix, but when you think further, it
> wouldn't do anything in practice.
> 
> First, if a protocol driver receives WRITE_UNCHANGED, there are two
> things it can do: If it knows that it only has very plain storage, it
> can just ignore the request because it won't do anything.

I think even a WRITE_UNCHANGED request is supposed to allocate a
currently unmappen block, so "very plain storage" probably has to mean
at least "supports neither backing files nor thin provisioning". Or you
would have to check the current allocation status first...

But I don't see a reason anyway why a WRITE_UNCHANGED request should end
up in the protocol layer.

Kevin

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2018-04-30  8:41 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-21 13:29 [Qemu-devel] [PATCH v2 0/9] block: Add COR filter driver Max Reitz
2018-04-21 13:29 ` [Qemu-devel] [PATCH v2 1/9] " Max Reitz
2018-04-24 15:08   ` Alberto Garcia
2018-04-25 11:18     ` Max Reitz
2018-04-25 11:35       ` Alberto Garcia
2018-04-21 13:29 ` [Qemu-devel] [PATCH v2 2/9] block: BLK_PERM_WRITE includes ..._UNCHANGED Max Reitz
2018-04-21 13:29 ` [Qemu-devel] [PATCH v2 3/9] block: Add BDRV_REQ_WRITE_UNCHANGED flag Max Reitz
2018-04-25 14:33   ` Eric Blake
2018-04-25 15:08     ` Max Reitz
2018-04-26  2:12       ` Eric Blake
2018-04-26  8:58         ` Kevin Wolf
2018-04-28 11:19         ` Max Reitz
2018-04-30  8:41           ` Kevin Wolf
2018-04-21 13:29 ` [Qemu-devel] [PATCH v2 4/9] block: Set BDRV_REQ_WRITE_UNCHANGED for COR writes Max Reitz
2018-04-21 13:29 ` [Qemu-devel] [PATCH v2 5/9] block/quorum: Support BDRV_REQ_WRITE_UNCHANGED Max Reitz
2018-04-21 13:29 ` [Qemu-devel] [PATCH v2 6/9] block: Support BDRV_REQ_WRITE_UNCHANGED in filters Max Reitz
2018-04-21 13:29 ` [Qemu-devel] [PATCH v2 7/9] iotests: Clean up wrap image in 197 Max Reitz
2018-04-21 13:29 ` [Qemu-devel] [PATCH v2 8/9] iotests: Copy 197 for COR filter driver Max Reitz
2018-04-24 15:17   ` Alberto Garcia
2018-04-21 13:29 ` [Qemu-devel] [PATCH v2 9/9] iotests: Add test for COR across nodes Max Reitz
2018-04-24 16:50   ` Kevin Wolf
2018-04-24 16:51 ` [Qemu-devel] [PATCH v2 0/9] block: Add COR filter driver Kevin Wolf
2018-04-25 12:18 ` Max Reitz
2018-04-25 14:36   ` Eric Blake

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.