All of lore.kernel.org
 help / color / mirror / Atom feed
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
To: qemu-block@nongnu.org
Cc: qemu-devel@nongnu.org, mreitz@redhat.com, kwolf@redhat.com,
	vsementsov@virtuozzo.com, den@openvz.org
Subject: [PATCH v6 09/12] qcow2: introduce host-range-refs
Date: Thu, 22 Apr 2021 19:30:43 +0300	[thread overview]
Message-ID: <20210422163046.442932-10-vsementsov@virtuozzo.com> (raw)
In-Reply-To: <20210422163046.442932-1-vsementsov@virtuozzo.com>

We have a bug in qcow2: assume we've started data write into host
cluster A. s->lock is unlocked. During the write the refcount of
cluster A may become zero, cluster may be reallocated for other needs,
and our in-flight write become a use-after-free. More details will be
in the further commit which actually fixes the bug.

For now, let's prepare infrastructure for the following fix. We are
going to track these in-flight data writes and other operations. So, we
create a hash map

  cluster_index -> HostCluster

And for each HostCluster we calculate number of in-flight operations on
it (which does qcow2_host_range_ref() of course).

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/qcow2.h                 |  12 ++++
 block/qcow2-host-range-refs.c | 127 ++++++++++++++++++++++++++++++++++
 block/qcow2.c                 |   3 +
 block/meson.build             |   1 +
 4 files changed, 143 insertions(+)
 create mode 100644 block/qcow2-host-range-refs.c

diff --git a/block/qcow2.h b/block/qcow2.h
index 511db948ec..d6de9543c4 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -420,6 +420,9 @@ typedef struct BDRVQcow2State {
      * is to convert the image with the desired compression type set.
      */
     Qcow2CompressionType compression_type;
+
+    /* For qcow2-host-range-refs.c */
+    GHashTable *host_range_refs;
 } BDRVQcow2State;
 
 typedef struct Qcow2COWRegion {
@@ -899,6 +902,15 @@ int qcow2_detect_metadata_preallocation(BlockDriverState *bs);
 void qcow2_cache_host_discard(BlockDriverState *bs,
                               uint64_t offset, uint64_t length);
 
+void qcow2_init_host_range_refs(BDRVQcow2State *s);
+void qcow2_release_host_range_refs(BDRVQcow2State *s);
+void qcow2_host_range_ref(BlockDriverState *bs, int64_t offset,
+                               int64_t length);
+void qcow2_host_range_unref(BlockDriverState *bs, int64_t offset,
+                               int64_t length);
+uint64_t qcow2_get_host_range_refcnt(BlockDriverState *bs,
+                                     int64_t cluster_index);
+
 /* qcow2-cluster.c functions */
 int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
                         bool exact_size);
diff --git a/block/qcow2-host-range-refs.c b/block/qcow2-host-range-refs.c
new file mode 100644
index 0000000000..54f0be27a4
--- /dev/null
+++ b/block/qcow2-host-range-refs.c
@@ -0,0 +1,127 @@
+/*
+ * Block driver for the QCOW version 2 format
+ *
+ * Copyright (c) 2021 Virtuozzo International GmbH.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "qemu/osdep.h"
+#include "qcow2.h"
+
+typedef struct HostCluster {
+    uint64_t host_range_refcnt;
+
+    /* For convenience, keep cluster_index here */
+    int64_t cluster_index;
+} HostCluster;
+
+void qcow2_init_host_range_refs(BDRVQcow2State *s)
+{
+    s->host_range_refs =
+        g_hash_table_new_full(g_int64_hash, g_int64_equal, g_free, g_free);
+}
+
+void qcow2_release_host_range_refs(BDRVQcow2State *s)
+{
+    assert(g_hash_table_size(s->host_range_refs) == 0);
+    g_hash_table_unref(s->host_range_refs);
+}
+
+static HostCluster *find_host_cluster(BDRVQcow2State *s, int64_t cluster_index)
+{
+    HostCluster *cl;
+
+    if (!s->host_range_refs) {
+        return NULL;
+    }
+
+    cl = g_hash_table_lookup(s->host_range_refs, &cluster_index);
+
+    if (cl) {
+        assert(cl->host_range_refcnt > 0);
+    }
+
+    return cl;
+}
+
+uint64_t qcow2_get_host_range_refcnt(BlockDriverState *bs,
+                                     int64_t cluster_index)
+{
+    BDRVQcow2State *s = bs->opaque;
+    HostCluster *cl = find_host_cluster(s, cluster_index);
+
+    if (!cl) {
+        return 0;
+    }
+
+    return cl->host_range_refcnt;
+}
+
+/* Inrease host_range_refcnt of clusters intersecting with range */
+void coroutine_fn
+qcow2_host_range_ref(BlockDriverState *bs, int64_t offset, int64_t length)
+{
+    BDRVQcow2State *s = bs->opaque;
+    int64_t start, last, cluster_index;
+
+    start = start_of_cluster(s, offset) >> s->cluster_bits;
+    last = start_of_cluster(s, offset + length - 1) >> s->cluster_bits;
+    for (cluster_index = start; cluster_index <= last; cluster_index++) {
+        HostCluster *cl = find_host_cluster(s, cluster_index);
+
+        if (!cl) {
+            cl = g_new(HostCluster, 1);
+            *cl = (HostCluster) {
+                .cluster_index = cluster_index,
+                .host_range_refcnt = 1,
+            };
+            g_hash_table_insert(s->host_range_refs,
+                                g_memdup(&cluster_index,
+                                         sizeof(cluster_index)), cl);
+        } else {
+            cl->host_range_refcnt++;
+        }
+        continue;
+    }
+}
+
+/* Decrease host_range_refcnt of clusters intersecting with range */
+void coroutine_fn
+qcow2_host_range_unref(BlockDriverState *bs, int64_t offset, int64_t length)
+{
+    BDRVQcow2State *s = bs->opaque;
+    int64_t start, last, cluster_index;
+
+    start = start_of_cluster(s, offset) >> s->cluster_bits;
+    last = start_of_cluster(s, offset + length - 1) >> s->cluster_bits;
+    for (cluster_index = start; cluster_index <= last; cluster_index++) {
+        HostCluster *cl = find_host_cluster(s, cluster_index);
+
+        assert(cl);
+        assert(cl->host_range_refcnt >= 1);
+
+        if (cl->host_range_refcnt > 1) {
+            cl->host_range_refcnt--;
+            continue;
+        }
+
+        g_hash_table_remove(s->host_range_refs, &cluster_index);
+    }
+}
diff --git a/block/qcow2.c b/block/qcow2.c
index be62585e03..aa298c9e42 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1834,6 +1834,7 @@ static int coroutine_fn qcow2_do_open(BlockDriverState *bs, QDict *options,
 #endif
 
     qemu_co_queue_init(&s->thread_task_queue);
+    qcow2_init_host_range_refs(s);
 
     return ret;
 
@@ -2714,6 +2715,8 @@ static void qcow2_close(BlockDriverState *bs)
     g_free(s->image_backing_file);
     g_free(s->image_backing_format);
 
+    qcow2_release_host_range_refs(s);
+
     if (has_data_file(bs)) {
         bdrv_unref_child(bs, s->data_file);
         s->data_file = NULL;
diff --git a/block/meson.build b/block/meson.build
index d21990ec95..a9bf6fde0c 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -25,6 +25,7 @@ block_ss.add(files(
   'qcow2-bitmap.c',
   'qcow2-cache.c',
   'qcow2-cluster.c',
+  'qcow2-host-range-refs.c',
   'qcow2-refcount.c',
   'qcow2-snapshot.c',
   'qcow2-threads.c',
-- 
2.29.2



  parent reply	other threads:[~2021-04-22 16:41 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-22 16:30 [PATCH v6 00/12] qcow2: fix parallel rewrite and discard (lockless) Vladimir Sementsov-Ogievskiy
2021-04-22 16:30 ` [PATCH v6 01/12] iotests: add qcow2-discard-during-rewrite Vladimir Sementsov-Ogievskiy
2021-04-22 16:30 ` [PATCH v6 02/12] qcow2: fix cache discarding in update_refcount() Vladimir Sementsov-Ogievskiy
2021-04-22 16:30 ` [PATCH v6 03/12] block/qcow2-cluster: assert no data_file on compressed write path Vladimir Sementsov-Ogievskiy
2021-04-22 16:30 ` [PATCH v6 04/12] block/qcow2-refcount: rename and publish update_refcount_discard() Vladimir Sementsov-Ogievskiy
2021-04-22 16:30 ` [PATCH v6 05/12] block/qcow2: introduce qcow2_parse_compressed_cluster_descriptor() Vladimir Sementsov-Ogievskiy
2021-04-22 16:30 ` [PATCH v6 06/12] block/qcow2: refactor qcow2_co_preadv_task() to have one return Vladimir Sementsov-Ogievskiy
2021-04-22 16:30 ` [PATCH v6 07/12] block/qcow2: qcow2_co_pwrite_zeroes: use QEMU_LOCK_GUARD Vladimir Sementsov-Ogievskiy
2021-04-22 16:30 ` [PATCH v6 08/12] qcow2: introduce is_cluster_free() helper Vladimir Sementsov-Ogievskiy
2021-04-22 16:30 ` Vladimir Sementsov-Ogievskiy [this message]
2021-04-22 16:30 ` [PATCH v6 10/12] qcow2: introduce qcow2_host_cluster_postponed_discard() Vladimir Sementsov-Ogievskiy
2021-04-22 16:30 ` [PATCH v6 11/12] qcow2: protect data writing by host range reference Vladimir Sementsov-Ogievskiy
2021-04-22 16:30 ` [PATCH v6 12/12] qcow2: protect data reading " Vladimir Sementsov-Ogievskiy
2021-04-26 12:15 ` [PATCH v6 00/12] qcow2: fix parallel rewrite and discard (lockless) Vladimir Sementsov-Ogievskiy
2021-05-10 13:07 ` Vladimir Sementsov-Ogievskiy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210422163046.442932-10-vsementsov@virtuozzo.com \
    --to=vsementsov@virtuozzo.com \
    --cc=den@openvz.org \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.