All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/3] qcow2: Fix removal of list members from BDRVQcow2State.cluster_allocs
@ 2020-09-03 16:37 Alberto Garcia
  2020-09-03 16:37 ` [PATCH v2 1/3] " Alberto Garcia
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Alberto Garcia @ 2020-09-03 16:37 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Alberto Garcia, qemu-block, Max Reitz

Hi,

here are the changes from v1:

- Split changes into three different patches.
- Rewrite the documentation of qcow2_alloc_cluster_offset() [Max]
- Use peek_file_be in the test case to read the offset of the refcount
  table [Max].
- Extend the list of unsupported options for the test case [Max].

Berto

Alberto Garcia (3):
  qcow2: Fix removal of list members from BDRVQcow2State.cluster_allocs
  qcow2: Don't check nb_clusters when removing l2meta from the list
  qcow2: Rewrite the documentation of qcow2_alloc_cluster_offset()

 block/qcow2-cluster.c      | 27 +++++++-------
 block/qcow2.c              |  4 +--
 tests/qemu-iotests/305     | 74 ++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/305.out | 16 +++++++++
 tests/qemu-iotests/group   |  1 +
 5 files changed, 106 insertions(+), 16 deletions(-)
 create mode 100755 tests/qemu-iotests/305
 create mode 100644 tests/qemu-iotests/305.out

-- 
2.20.1



^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2 1/3] qcow2: Fix removal of list members from BDRVQcow2State.cluster_allocs
  2020-09-03 16:37 [PATCH v2 0/3] qcow2: Fix removal of list members from BDRVQcow2State.cluster_allocs Alberto Garcia
@ 2020-09-03 16:37 ` Alberto Garcia
  2020-09-03 16:37 ` [PATCH v2 2/3] qcow2: Don't check nb_clusters when removing l2meta from the list Alberto Garcia
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: Alberto Garcia @ 2020-09-03 16:37 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Alberto Garcia, qemu-block, Max Reitz

When a write request needs to allocate new clusters (or change the L2
bitmap of existing ones) a QCowL2Meta structure is created so the L2
metadata can be later updated and any copy-on-write can be performed
if necessary.

A write request can span a region consisting of an arbitrary
combination of previously unallocated and allocated clusters, and if
the unallocated ones can be put contiguous to the existing ones then
QEMU will do so in order to minimize the number of write operations.

In practice this means that a write request has not just one but a
number of QCowL2Meta structures. All of them are added to the
cluster_allocs list that is stored in BDRVQcow2State and is used to
detect overlapping requests. After the write request finishes all its
associated QCowL2Meta are removed from that list. calculate_l2_meta()
takes care of creating and putting those structures in the list, and
qcow2_handle_l2meta() takes care of removing them.

The problem is that the error path in handle_alloc() also tries to
remove an item in that list, a remnant from the time when this was
handled there (that code would not even be correct anymore because
it only removes one struct and not all the ones from the same write
request).

This can trigger a double removal of the same item from the list,
causing a crash. This is not easy to reproduce in practice because
it requires that do_alloc_cluster_offset() fails after a successful
previous allocation during the same write request, but it can be
reproduced with the included test case.

Signed-off-by: Alberto Garcia <berto@igalia.com>
---
 block/qcow2-cluster.c      |  3 --
 tests/qemu-iotests/305     | 74 ++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/305.out | 16 +++++++++
 tests/qemu-iotests/group   |  1 +
 4 files changed, 91 insertions(+), 3 deletions(-)
 create mode 100755 tests/qemu-iotests/305
 create mode 100644 tests/qemu-iotests/305.out

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 996b3314f4..25e38daa78 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -1709,9 +1709,6 @@ static int handle_alloc(BlockDriverState *bs, uint64_t guest_offset,
 
 out:
     qcow2_cache_put(s->l2_table_cache, (void **) &l2_slice);
-    if (ret < 0 && *m && (*m)->nb_clusters > 0) {
-        QLIST_REMOVE(*m, next_in_flight);
-    }
     return ret;
 }
 
diff --git a/tests/qemu-iotests/305 b/tests/qemu-iotests/305
new file mode 100755
index 0000000000..768818af4a
--- /dev/null
+++ b/tests/qemu-iotests/305
@@ -0,0 +1,74 @@
+#!/usr/bin/env bash
+#
+# Test the handling of errors in write requests with multiple allocations
+#
+# Copyright (C) 2020 Igalia, S.L.
+# Author: Alberto Garcia <berto@igalia.com>
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+# creator
+owner=berto@igalia.com
+
+seq=`basename $0`
+echo "QA output created by $seq"
+
+status=1    # failure is the default!
+
+_cleanup()
+{
+    _cleanup_test_img
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+
+_supported_fmt qcow2
+_supported_proto file
+_supported_os Linux
+_unsupported_imgopts cluster_size refcount_bits extended_l2 compat=0.10 data_file
+
+echo '### Create the image'
+_make_test_img -o refcount_bits=64,cluster_size=1k 1M
+
+# The reference counts of the clusters for the first 123k of this
+# write request are stored in the first refcount block. The last
+# cluster (guest offset 123k) is referenced in the second refcount
+# block.
+echo '### Fill the first refcount block and one data cluster from the second'
+$QEMU_IO -c 'write 0 124k' "$TEST_IMG" | _filter_qemu_io
+
+echo '### Discard two of the last data clusters, leave one in the middle'
+$QEMU_IO -c 'discard 121k 1k' "$TEST_IMG" | _filter_qemu_io
+$QEMU_IO -c 'discard 123k 1k' "$TEST_IMG" | _filter_qemu_io
+
+echo '### Corrupt the offset of the second refcount block'
+refcount_table_offset=$(peek_file_be "$TEST_IMG" 48 8)
+poke_file "$TEST_IMG" $(($refcount_table_offset+14)) "\x06"
+
+# This tries to allocate the two clusters discarded earlier (guest
+# offsets 121k and 123k). Their reference counts are in the first and
+# second refcount blocks respectively, but only the first one can be
+# allocated correctly because the second entry of the refcount table
+# is corrupted.
+echo '### Try to allocate the discarded clusters again'
+$QEMU_IO -c 'write 121k 3k' "$TEST_IMG" | _filter_qemu_io
+
+# success, all done
+echo "*** done"
+rm -f $seq.full
+status=0
diff --git a/tests/qemu-iotests/305.out b/tests/qemu-iotests/305.out
new file mode 100644
index 0000000000..538019e726
--- /dev/null
+++ b/tests/qemu-iotests/305.out
@@ -0,0 +1,16 @@
+QA output created by 305
+### Create the image
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+### Fill the first refcount block and one data cluster from the second
+wrote 126976/126976 bytes at offset 0
+124 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+### Discard two of the last data clusters, leave one in the middle
+discard 1024/1024 bytes at offset 123904
+1 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+discard 1024/1024 bytes at offset 125952
+1 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+### Corrupt the offset of the second refcount block
+### Try to allocate the discarded clusters again
+qcow2: Marking image as corrupt: Refblock offset 0x20600 unaligned (reftable index: 0x1); further corruption events will be suppressed
+write failed: Input/output error
+*** done
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index 5cad015231..ff59cfd2d4 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -313,3 +313,4 @@
 302 quick
 303 rw quick
 304 rw quick
+305 rw quick
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2 2/3] qcow2: Don't check nb_clusters when removing l2meta from the list
  2020-09-03 16:37 [PATCH v2 0/3] qcow2: Fix removal of list members from BDRVQcow2State.cluster_allocs Alberto Garcia
  2020-09-03 16:37 ` [PATCH v2 1/3] " Alberto Garcia
@ 2020-09-03 16:37 ` Alberto Garcia
  2020-09-03 16:37 ` [PATCH v2 3/3] qcow2: Rewrite the documentation of qcow2_alloc_cluster_offset() Alberto Garcia
  2020-09-04  9:30 ` [PATCH v2 0/3] qcow2: Fix removal of list members from BDRVQcow2State.cluster_allocs Max Reitz
  3 siblings, 0 replies; 8+ messages in thread
From: Alberto Garcia @ 2020-09-03 16:37 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Alberto Garcia, qemu-block, Max Reitz

In the past, when a new cluster was allocated the l2meta structure was
a variable in the stack so it was necessary to have a way to tell
whether it had been initialized and contained valid data or not. The
nb_clusters field was used for this purpose. Since commit f50f88b9fe
this is no longer the case, l2meta (nowadays a pointer to a list) is
only allocated when needed and nb_clusters is guaranteed to be > 0 so
this check is unnecessary.

Signed-off-by: Alberto Garcia <berto@igalia.com>
---
 block/qcow2.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index da56b1a4df..54a7d2f475 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2112,9 +2112,7 @@ static coroutine_fn int qcow2_handle_l2meta(BlockDriverState *bs,
         }
 
         /* Take the request off the list of running requests */
-        if (l2meta->nb_clusters != 0) {
-            QLIST_REMOVE(l2meta, next_in_flight);
-        }
+        QLIST_REMOVE(l2meta, next_in_flight);
 
         qemu_co_queue_restart_all(&l2meta->dependent_requests);
 
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2 3/3] qcow2: Rewrite the documentation of qcow2_alloc_cluster_offset()
  2020-09-03 16:37 [PATCH v2 0/3] qcow2: Fix removal of list members from BDRVQcow2State.cluster_allocs Alberto Garcia
  2020-09-03 16:37 ` [PATCH v2 1/3] " Alberto Garcia
  2020-09-03 16:37 ` [PATCH v2 2/3] qcow2: Don't check nb_clusters when removing l2meta from the list Alberto Garcia
@ 2020-09-03 16:37 ` Alberto Garcia
  2020-09-04  9:28   ` Max Reitz
  2020-09-04  9:30 ` [PATCH v2 0/3] qcow2: Fix removal of list members from BDRVQcow2State.cluster_allocs Max Reitz
  3 siblings, 1 reply; 8+ messages in thread
From: Alberto Garcia @ 2020-09-03 16:37 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Alberto Garcia, qemu-block, Max Reitz

The current text corresponds to an earlier, simpler version of this
function and it does not explain how it works now.

Signed-off-by: Alberto Garcia <berto@igalia.com>
---
 block/qcow2-cluster.c | 24 ++++++++++++++----------
 1 file changed, 14 insertions(+), 10 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 25e38daa78..f1ce6afcf5 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -1713,18 +1713,22 @@ out:
 }
 
 /*
- * alloc_cluster_offset
+ * For a given area on the virtual disk defined by @offset and @bytes,
+ * find the corresponding area on the qcow2 image, allocating new
+ * clusters (or subclusters) if necessary. The result can span a
+ * combination of allocated and previously unallocated clusters.
  *
- * For a given offset on the virtual disk, find the cluster offset in qcow2
- * file. If the offset is not found, allocate a new cluster.
+ * On return, @host_offset is set to the beginning of the requested
+ * area. This area is guaranteed to be contiguous on the qcow2 file
+ * but it can be smaller than initially requested. In this case @bytes
+ * is updated with the actual size.
  *
- * If the cluster was already allocated, m->nb_clusters is set to 0 and
- * other fields in m are meaningless.
- *
- * If the cluster is newly allocated, m->nb_clusters is set to the number of
- * contiguous clusters that have been allocated. In this case, the other
- * fields of m are valid and contain information about the first allocated
- * cluster.
+ * If any clusters or subclusters were allocated then @m contains a
+ * list with the information of all the affected regions. Note that
+ * this can happen regardless of whether this function succeeds or
+ * not. The caller is responsible for updating the L2 metadata of the
+ * allocated clusters (on success) or freeing them (on failure), and
+ * for clearing the contents of @m afterwards in both cases.
  *
  * If the request conflicts with another write request in flight, the coroutine
  * is queued and will be reentered when the dependency has completed.
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 3/3] qcow2: Rewrite the documentation of qcow2_alloc_cluster_offset()
  2020-09-03 16:37 ` [PATCH v2 3/3] qcow2: Rewrite the documentation of qcow2_alloc_cluster_offset() Alberto Garcia
@ 2020-09-04  9:28   ` Max Reitz
  2020-09-04  9:36     ` Alberto Garcia
  0 siblings, 1 reply; 8+ messages in thread
From: Max Reitz @ 2020-09-04  9:28 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel; +Cc: Kevin Wolf, qemu-block


[-- Attachment #1.1: Type: text/plain, Size: 2178 bytes --]

On 03.09.20 18:37, Alberto Garcia wrote:
> The current text corresponds to an earlier, simpler version of this
> function and it does not explain how it works now.
> 
> Signed-off-by: Alberto Garcia <berto@igalia.com>
> ---
>  block/qcow2-cluster.c | 24 ++++++++++++++----------
>  1 file changed, 14 insertions(+), 10 deletions(-)
> 
> diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
> index 25e38daa78..f1ce6afcf5 100644
> --- a/block/qcow2-cluster.c
> +++ b/block/qcow2-cluster.c
> @@ -1713,18 +1713,22 @@ out:
>  }
>  
>  /*
> - * alloc_cluster_offset
> + * For a given area on the virtual disk defined by @offset and @bytes,
> + * find the corresponding area on the qcow2 image, allocating new
> + * clusters (or subclusters) if necessary. The result can span a
> + * combination of allocated and previously unallocated clusters.
>   *
> - * For a given offset on the virtual disk, find the cluster offset in qcow2
> - * file. If the offset is not found, allocate a new cluster.
> + * On return, @host_offset is set to the beginning of the requested
> + * area. This area is guaranteed to be contiguous on the qcow2 file
> + * but it can be smaller than initially requested. In this case @bytes
> + * is updated with the actual size.
>   *
> - * If the cluster was already allocated, m->nb_clusters is set to 0 and
> - * other fields in m are meaningless.
> - *
> - * If the cluster is newly allocated, m->nb_clusters is set to the number of
> - * contiguous clusters that have been allocated. In this case, the other
> - * fields of m are valid and contain information about the first allocated
> - * cluster.
> + * If any clusters or subclusters were allocated then @m contains a
> + * list with the information of all the affected regions. Note that
> + * this can happen regardless of whether this function succeeds or
> + * not. The caller is responsible for updating the L2 metadata of the
> + * allocated clusters (on success) or freeing them (on failure), and
> + * for clearing the contents of @m afterwards in both cases.

It’s too bad that preallocate_co() doesn’t do that then, isn’t it...

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 0/3] qcow2: Fix removal of list members from BDRVQcow2State.cluster_allocs
  2020-09-03 16:37 [PATCH v2 0/3] qcow2: Fix removal of list members from BDRVQcow2State.cluster_allocs Alberto Garcia
                   ` (2 preceding siblings ...)
  2020-09-03 16:37 ` [PATCH v2 3/3] qcow2: Rewrite the documentation of qcow2_alloc_cluster_offset() Alberto Garcia
@ 2020-09-04  9:30 ` Max Reitz
  3 siblings, 0 replies; 8+ messages in thread
From: Max Reitz @ 2020-09-04  9:30 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel; +Cc: Kevin Wolf, qemu-block


[-- Attachment #1.1: Type: text/plain, Size: 484 bytes --]

On 03.09.20 18:37, Alberto Garcia wrote:
> Hi,
> 
> here are the changes from v1:
> 
> - Split changes into three different patches.
> - Rewrite the documentation of qcow2_alloc_cluster_offset() [Max]
> - Use peek_file_be in the test case to read the offset of the refcount
>   table [Max].
> - Extend the list of unsupported options for the test case [Max].
> 
> Berto

Thanks, applied to my block branch:

https://git.xanclic.moe/XanClic/qemu/commits/branch/block


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 3/3] qcow2: Rewrite the documentation of qcow2_alloc_cluster_offset()
  2020-09-04  9:28   ` Max Reitz
@ 2020-09-04  9:36     ` Alberto Garcia
  2020-09-04  9:43       ` Max Reitz
  0 siblings, 1 reply; 8+ messages in thread
From: Alberto Garcia @ 2020-09-04  9:36 UTC (permalink / raw)
  To: Max Reitz, qemu-devel; +Cc: Kevin Wolf, qemu-block

On Fri 04 Sep 2020 11:28:18 AM CEST, Max Reitz wrote:
>> + * If any clusters or subclusters were allocated then @m contains a
>> + * list with the information of all the affected regions. Note that
>> + * this can happen regardless of whether this function succeeds or
>> + * not. The caller is responsible for updating the L2 metadata of the
>> + * allocated clusters (on success) or freeing them (on failure), and
>> + * for clearing the contents of @m afterwards in both cases.
>
> It’s too bad that preallocate_co() doesn’t do that then, isn’t it...

I was planning to have a closer look at that function next week.

Berto


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 3/3] qcow2: Rewrite the documentation of qcow2_alloc_cluster_offset()
  2020-09-04  9:36     ` Alberto Garcia
@ 2020-09-04  9:43       ` Max Reitz
  0 siblings, 0 replies; 8+ messages in thread
From: Max Reitz @ 2020-09-04  9:43 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel; +Cc: Kevin Wolf, qemu-block


[-- Attachment #1.1: Type: text/plain, Size: 715 bytes --]

On 04.09.20 11:36, Alberto Garcia wrote:
> On Fri 04 Sep 2020 11:28:18 AM CEST, Max Reitz wrote:
>>> + * If any clusters or subclusters were allocated then @m contains a
>>> + * list with the information of all the affected regions. Note that
>>> + * this can happen regardless of whether this function succeeds or
>>> + * not. The caller is responsible for updating the L2 metadata of the
>>> + * allocated clusters (on success) or freeing them (on failure), and
>>> + * for clearing the contents of @m afterwards in both cases.
>>
>> It’s too bad that preallocate_co() doesn’t do that then, isn’t it...
> 
> I was planning to have a closer look at that function next week.

Great! :)

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2020-09-04  9:45 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-03 16:37 [PATCH v2 0/3] qcow2: Fix removal of list members from BDRVQcow2State.cluster_allocs Alberto Garcia
2020-09-03 16:37 ` [PATCH v2 1/3] " Alberto Garcia
2020-09-03 16:37 ` [PATCH v2 2/3] qcow2: Don't check nb_clusters when removing l2meta from the list Alberto Garcia
2020-09-03 16:37 ` [PATCH v2 3/3] qcow2: Rewrite the documentation of qcow2_alloc_cluster_offset() Alberto Garcia
2020-09-04  9:28   ` Max Reitz
2020-09-04  9:36     ` Alberto Garcia
2020-09-04  9:43       ` Max Reitz
2020-09-04  9:30 ` [PATCH v2 0/3] qcow2: Fix removal of list members from BDRVQcow2State.cluster_allocs Max Reitz

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.