All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 0/3] qcow2: Prevent corruption-related crashes
@ 2014-08-07 20:47 Max Reitz
  2014-08-07 20:47 ` [Qemu-devel] [PATCH 1/3] qcow2: Catch !*host_offset for data allocation Max Reitz
                   ` (4 more replies)
  0 siblings, 5 replies; 9+ messages in thread
From: Max Reitz @ 2014-08-07 20:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi, Max Reitz

The first two patches in this series address
https://bugs.launchpad.net/qemu/+bug/1349972.

For the third patch I found it hard to write an appropriate test case
(it would have to make qemu-img check repair some leaks but induce the
corruption prevention at the same time). One can use the test image from
the bug report above, set the refcount block offset to 0 and that works.
However, the patch is simple enough that no test should be necessary.


Max Reitz (3):
  qcow2: Catch !*host_offset for data allocation
  iotests: Add test for image header overlap
  block: Catch !bs->drv in bdrv_check()

 block.c                    |  3 +++
 block/qcow2-cluster.c      | 11 +++++++++++
 tests/qemu-iotests/060     |  9 +++++++++
 tests/qemu-iotests/060.out |  8 ++++++++
 4 files changed, 31 insertions(+)

-- 
2.0.3

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Qemu-devel] [PATCH 1/3] qcow2: Catch !*host_offset for data allocation
  2014-08-07 20:47 [Qemu-devel] [PATCH 0/3] qcow2: Prevent corruption-related crashes Max Reitz
@ 2014-08-07 20:47 ` Max Reitz
  2014-08-07 20:47 ` [Qemu-devel] [PATCH 2/3] iotests: Add test for image header overlap Max Reitz
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: Max Reitz @ 2014-08-07 20:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi, Max Reitz

qcow2_alloc_cluster_offset() uses host_offset == 0 as "no preferred
offset" for the (data) cluster range to be allocated. However, this
offset is actually valid and may be allocated on images with a corrupted
refcount table or first refcount block.

In this case, the corruption prevention should normally catch that
write anyway (because it would overwrite the image header). But since 0
is a special value here, the function assumes that nothing has been
allocated at all which it asserts against.

Because this condition is not qemu's fault but rather that of a broken
image, it shouldn't throw an assertion but rather mark the image corrupt
and show an appropriate message, which this patch does by calling the
corruption check earlier than it would be called normally (before the
assertion).

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/qcow2-cluster.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 4208dc0..c6af456 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -1106,6 +1106,17 @@ static int handle_alloc(BlockDriverState *bs, uint64_t guest_offset,
         return 0;
     }
 
+    /* !*host_offset would overwrite the image header and is reserved for "no
+     * host offset preferred". If 0 was a valid host offset, it'd trigger the
+     * following overlap check; do that now to avoid having an invalid value in
+     * *host_offset. */
+    if (!alloc_cluster_offset) {
+        ret = qcow2_pre_write_overlap_check(bs, 0, alloc_cluster_offset,
+                                            nb_clusters * s->cluster_size);
+        assert(ret < 0);
+        goto fail;
+    }
+
     /*
      * Save info needed for meta data update.
      *
-- 
2.0.3

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [Qemu-devel] [PATCH 2/3] iotests: Add test for image header overlap
  2014-08-07 20:47 [Qemu-devel] [PATCH 0/3] qcow2: Prevent corruption-related crashes Max Reitz
  2014-08-07 20:47 ` [Qemu-devel] [PATCH 1/3] qcow2: Catch !*host_offset for data allocation Max Reitz
@ 2014-08-07 20:47 ` Max Reitz
  2014-08-07 20:47 ` [Qemu-devel] [PATCH 3/3] block: Catch !bs->drv in bdrv_check() Max Reitz
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: Max Reitz @ 2014-08-07 20:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi, Max Reitz

Add a test for an image with an unallocated image header; instead of an
assertion, this should result in the image being marked corrupt.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 tests/qemu-iotests/060     | 9 +++++++++
 tests/qemu-iotests/060.out | 8 ++++++++
 2 files changed, 17 insertions(+)

diff --git a/tests/qemu-iotests/060 b/tests/qemu-iotests/060
index 3cffc12..830386f 100755
--- a/tests/qemu-iotests/060
+++ b/tests/qemu-iotests/060
@@ -164,6 +164,15 @@ wait_break 0
 write 64k 64k
 resume 0" | $QEMU_IO | _filter_qemu_io
 
+echo
+echo "=== Testing unallocated image header ==="
+echo
+_make_test_img 64M
+# Create L1/L2
+$QEMU_IO -c "$OPEN_RW" -c "write 0 64k" | _filter_qemu_io
+poke_file "$TEST_IMG" "$rb_offset" "\x00\x00"
+$QEMU_IO -c "$OPEN_RW" -c "write 64k 64k" | _filter_qemu_io
+
 # success, all done
 echo "*** done"
 rm -f $seq.full
diff --git a/tests/qemu-iotests/060.out b/tests/qemu-iotests/060.out
index a517948..c27c952 100644
--- a/tests/qemu-iotests/060.out
+++ b/tests/qemu-iotests/060.out
@@ -93,4 +93,12 @@ blkdebug: Suspended request '0'
 write failed: Input/output error
 blkdebug: Resuming request '0'
 aio_write failed: No medium found
+
+=== Testing unallocated image header ===
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 
+wrote 65536/65536 bytes at offset 0
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+qcow2: Preventing invalid write on metadata (overlaps with qcow2_header); image marked as corrupt.
+write failed: Input/output error
 *** done
-- 
2.0.3

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [Qemu-devel] [PATCH 3/3] block: Catch !bs->drv in bdrv_check()
  2014-08-07 20:47 [Qemu-devel] [PATCH 0/3] qcow2: Prevent corruption-related crashes Max Reitz
  2014-08-07 20:47 ` [Qemu-devel] [PATCH 1/3] qcow2: Catch !*host_offset for data allocation Max Reitz
  2014-08-07 20:47 ` [Qemu-devel] [PATCH 2/3] iotests: Add test for image header overlap Max Reitz
@ 2014-08-07 20:47 ` Max Reitz
  2014-08-08  9:15   ` Kevin Wolf
  2014-08-07 20:59 ` [Qemu-devel] [PATCH 0/3] qcow2: Prevent corruption-related crashes Eric Blake
  2014-08-08  9:11 ` Kevin Wolf
  4 siblings, 1 reply; 9+ messages in thread
From: Max Reitz @ 2014-08-07 20:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi, Max Reitz

qemu-img check calls bdrv_check() twice if the first run repaired some
inconsistencies. If the first run however again triggered corruption
prevention (on qcow2) due to very bad inconsistencies, bs->drv may be
NULL afterwards. Thus, bdrv_check() should check whether bs->drv is set.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/block.c b/block.c
index 8cf519b..2b470d8 100644
--- a/block.c
+++ b/block.c
@@ -2203,6 +2203,9 @@ bool bdrv_dev_is_medium_locked(BlockDriverState *bs)
  */
 int bdrv_check(BlockDriverState *bs, BdrvCheckResult *res, BdrvCheckMode fix)
 {
+    if (bs->drv == NULL) {
+        return -ENOMEDIUM;
+    }
     if (bs->drv->bdrv_check == NULL) {
         return -ENOTSUP;
     }
-- 
2.0.3

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] [PATCH 0/3] qcow2: Prevent corruption-related crashes
  2014-08-07 20:47 [Qemu-devel] [PATCH 0/3] qcow2: Prevent corruption-related crashes Max Reitz
                   ` (2 preceding siblings ...)
  2014-08-07 20:47 ` [Qemu-devel] [PATCH 3/3] block: Catch !bs->drv in bdrv_check() Max Reitz
@ 2014-08-07 20:59 ` Eric Blake
  2014-08-08  9:11 ` Kevin Wolf
  4 siblings, 0 replies; 9+ messages in thread
From: Eric Blake @ 2014-08-07 20:59 UTC (permalink / raw)
  To: Max Reitz, qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi

[-- Attachment #1: Type: text/plain, Size: 1077 bytes --]

On 08/07/2014 02:47 PM, Max Reitz wrote:
> The first two patches in this series address
> https://bugs.launchpad.net/qemu/+bug/1349972.
> 
> For the third patch I found it hard to write an appropriate test case
> (it would have to make qemu-img check repair some leaks but induce the
> corruption prevention at the same time). One can use the test image from
> the bug report above, set the refcount block offset to 0 and that works.
> However, the patch is simple enough that no test should be necessary.

Series: Reviewed-by: Eric Blake <eblake@redhat.com>

> 
> 
> Max Reitz (3):
>   qcow2: Catch !*host_offset for data allocation
>   iotests: Add test for image header overlap
>   block: Catch !bs->drv in bdrv_check()
> 
>  block.c                    |  3 +++
>  block/qcow2-cluster.c      | 11 +++++++++++
>  tests/qemu-iotests/060     |  9 +++++++++
>  tests/qemu-iotests/060.out |  8 ++++++++
>  4 files changed, 31 insertions(+)
> 

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 539 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] [PATCH 0/3] qcow2: Prevent corruption-related crashes
  2014-08-07 20:47 [Qemu-devel] [PATCH 0/3] qcow2: Prevent corruption-related crashes Max Reitz
                   ` (3 preceding siblings ...)
  2014-08-07 20:59 ` [Qemu-devel] [PATCH 0/3] qcow2: Prevent corruption-related crashes Eric Blake
@ 2014-08-08  9:11 ` Kevin Wolf
  4 siblings, 0 replies; 9+ messages in thread
From: Kevin Wolf @ 2014-08-08  9:11 UTC (permalink / raw)
  To: Max Reitz; +Cc: qemu-devel, Stefan Hajnoczi

Am 07.08.2014 um 22:47 hat Max Reitz geschrieben:
> The first two patches in this series address
> https://bugs.launchpad.net/qemu/+bug/1349972.
> 
> For the third patch I found it hard to write an appropriate test case
> (it would have to make qemu-img check repair some leaks but induce the
> corruption prevention at the same time). One can use the test image from
> the bug report above, set the refcount block offset to 0 and that works.
> However, the patch is simple enough that no test should be necessary.

Thanks, applied all to the block branch.

Kevin

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] [PATCH 3/3] block: Catch !bs->drv in bdrv_check()
  2014-08-07 20:47 ` [Qemu-devel] [PATCH 3/3] block: Catch !bs->drv in bdrv_check() Max Reitz
@ 2014-08-08  9:15   ` Kevin Wolf
  2014-08-08 21:11     ` Max Reitz
  0 siblings, 1 reply; 9+ messages in thread
From: Kevin Wolf @ 2014-08-08  9:15 UTC (permalink / raw)
  To: Max Reitz; +Cc: qemu-devel, Stefan Hajnoczi

Am 07.08.2014 um 22:47 hat Max Reitz geschrieben:
> qemu-img check calls bdrv_check() twice if the first run repaired some
> inconsistencies. If the first run however again triggered corruption
> prevention (on qcow2) due to very bad inconsistencies, bs->drv may be
> NULL afterwards. Thus, bdrv_check() should check whether bs->drv is set.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>

I suppose there was a real case of this happening? I think bdrv_check()
triggering corruption prevention is a rather bad sign. The most
important point for image repair should be that it doesn't make the
situation any worse. Smells like a follow-up patch to the qcow2 code.

Kevin

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] [PATCH 3/3] block: Catch !bs->drv in bdrv_check()
  2014-08-08  9:15   ` Kevin Wolf
@ 2014-08-08 21:11     ` Max Reitz
  2014-08-08 22:53       ` Max Reitz
  0 siblings, 1 reply; 9+ messages in thread
From: Max Reitz @ 2014-08-08 21:11 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-devel, Stefan Hajnoczi

On 08.08.2014 11:15, Kevin Wolf wrote:
> Am 07.08.2014 um 22:47 hat Max Reitz geschrieben:
>> qemu-img check calls bdrv_check() twice if the first run repaired some
>> inconsistencies. If the first run however again triggered corruption
>> prevention (on qcow2) due to very bad inconsistencies, bs->drv may be
>> NULL afterwards. Thus, bdrv_check() should check whether bs->drv is set.
>>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
> I suppose there was a real case of this happening? I think bdrv_check()
> triggering corruption prevention is a rather bad sign. The most
> important point for image repair should be that it doesn't make the
> situation any worse. Smells like a follow-up patch to the qcow2 code.

Yes, as I wrote in the cover letter, using the image provided in 
https://bugs.launchpad.net/qemu/+bug/1353456 and setting the refblock 
offset to 0 (the reftable entry) results in a segmentation fault.

A simple way to trigger corruption during bdrv_check() is creating an 
image, setting the first (and only) reftable entry to 0 and running 
qemu-img check -r all. bdrv_check() will try to allocate a refblock, but 
since the first clusters are unallocated, it will allocate them there 
which would obviously overwrite the image header and/or L1 table and/or 
reftable.

The only way I can imagine to fix this is to completely disregard the 
on-disk refcount information during bdrv_check() and instead only use 
the calculated refcounts. This would require own allocation functions 
which may probably be rather simple, but in any case we'd need to write 
them.

I think I should have some time, so I'll have a look into it.

Max

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] [PATCH 3/3] block: Catch !bs->drv in bdrv_check()
  2014-08-08 21:11     ` Max Reitz
@ 2014-08-08 22:53       ` Max Reitz
  0 siblings, 0 replies; 9+ messages in thread
From: Max Reitz @ 2014-08-08 22:53 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-devel, Stefan Hajnoczi

On 08.08.2014 23:11, Max Reitz wrote:
> On 08.08.2014 11:15, Kevin Wolf wrote:
>> Am 07.08.2014 um 22:47 hat Max Reitz geschrieben:
>>> qemu-img check calls bdrv_check() twice if the first run repaired some
>>> inconsistencies. If the first run however again triggered corruption
>>> prevention (on qcow2) due to very bad inconsistencies, bs->drv may be
>>> NULL afterwards. Thus, bdrv_check() should check whether bs->drv is 
>>> set.
>>>
>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>> I suppose there was a real case of this happening? I think bdrv_check()
>> triggering corruption prevention is a rather bad sign. The most
>> important point for image repair should be that it doesn't make the
>> situation any worse. Smells like a follow-up patch to the qcow2 code.
>
> Yes, as I wrote in the cover letter, using the image provided in 
> https://bugs.launchpad.net/qemu/+bug/1353456 and setting the refblock 
> offset to 0 (the reftable entry) results in a segmentation fault.
>
> A simple way to trigger corruption during bdrv_check() is creating an 
> image, setting the first (and only) reftable entry to 0 and running 
> qemu-img check -r all. bdrv_check() will try to allocate a refblock, 
> but since the first clusters are unallocated, it will allocate them 
> there which would obviously overwrite the image header and/or L1 table 
> and/or reftable.
>
> The only way I can imagine to fix this is to completely disregard the 
> on-disk refcount information during bdrv_check() and instead only use 
> the calculated refcounts. This would require own allocation functions 
> which may probably be rather simple, but in any case we'd need to 
> write them.
>
> I think I should have some time, so I'll have a look into it.

Okay, after thinking about the situation (which involved looking through 
the other bug reports by Maria), I think there is only one way to truly 
do the repair operation correctly. The general problem is that a damaged 
refcount structure may lead to a new reftable or new refblocks being 
allocated during the repair process. However, since the refcounts are 
not accurate, these new clusters may collide with existing allocations. 
We could fix this by replicating all the refcount operations for 
in-memory refcounts (which qcow2_check_refcounts() creates), but I think 
this to be a rather bad idea.

Instead, I'd rather create completely new refcount structures in 
qcow2_check_refcounts() when so much as a single referenced cluster with 
refcount=0 is encountered. If there is any cluster which is indeed 
referenced but for which the refcount structures say it's free, any new 
allocation may break things. Since changing refcounts may result in new 
cluster allocations, we should not update the existing refcount 
structures at all.

Alternatively, we can rewrite the refcount update functions to take an 
in-memory refcount table to know which clusters to avoid, but 
considering that those functions are complicated enough already, I'd 
rather refrain from that.

Max

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2014-08-08 22:53 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-08-07 20:47 [Qemu-devel] [PATCH 0/3] qcow2: Prevent corruption-related crashes Max Reitz
2014-08-07 20:47 ` [Qemu-devel] [PATCH 1/3] qcow2: Catch !*host_offset for data allocation Max Reitz
2014-08-07 20:47 ` [Qemu-devel] [PATCH 2/3] iotests: Add test for image header overlap Max Reitz
2014-08-07 20:47 ` [Qemu-devel] [PATCH 3/3] block: Catch !bs->drv in bdrv_check() Max Reitz
2014-08-08  9:15   ` Kevin Wolf
2014-08-08 21:11     ` Max Reitz
2014-08-08 22:53       ` Max Reitz
2014-08-07 20:59 ` [Qemu-devel] [PATCH 0/3] qcow2: Prevent corruption-related crashes Eric Blake
2014-08-08  9:11 ` Kevin Wolf

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.