All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v2 0/5] qcow2: Check L1/L2/reftable entries for alignment
@ 2014-09-05 14:07 Max Reitz
  2014-09-05 14:07 ` [Qemu-devel] [PATCH v2 1/5] qapi/block: Add "fatal" to BLOCK_IMAGE_CORRUPTED Max Reitz
                   ` (5 more replies)
  0 siblings, 6 replies; 23+ messages in thread
From: Max Reitz @ 2014-09-05 14:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi, Max Reitz

The image fuzzer from Maria exposed a lot of assertions which might fail
in qemu when fed with a broken qcow2 image. Some of them are related to
qemu trusting the offsets given in the L1, L2 and refcount tables to
always be properly aligned on cluster boundaries (e.g.
https://bugs.launchpad.net/qemu/+bug/1354529).

This series fixes this by verifying (hopefully) all data read from L1,
L2 and refcount tables accordingly; if the offsets are not aligned on
cluster boundaries, an error message is emitted and the image is marked
corrupt unless it has been opened read-only.

v2:
- Added patch 1 which adds a new field to the BLOCK_IMAGE_CORRUPTED. I
  know this might be incompatible to existing users of that field, so I
  did an investigative search (I googled) but did not find any. So I'm
  confident this won't break anything.
- Patch 2:
  - Suppress corruption events and messages after the first one;
    however, a fatal corruption should still be signaled even after a
    non-fatal one has already occured [Kevin]
  - Also, use the new "fatal" field
- Patch 3:
  - Fix test 060 right in this patch (squashed the old patch 3 into
    patch 2, the result is now this patch 3) [Kevin]
  - Set the "fatal" field to true
- Patch 4:
  - Emit more debugging information (like table indices etc.) [Kevin]
  - Generally set the "fatal" field to true
  - In case a cluster with an unaligned offset should be freed, call the
    corruption signalling function just like everywhere else; but set
    "fatal" to false [Kevin+Eric]
- Patch 5: Added [Kevin]


git-backport-diff against v1 (although not very useful):

Key:
[----] : patches are identical
[####] : number of functional differences between upstream/downstream patch
[down] : patch is downstream-only
The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively

001/5:[down] 'qapi/block: Add "fatal" to BLOCK_IMAGE_CORRUPTED'
002/5:[0045] [FC] 'qcow2: Add qcow2_signal_corruption()'
003/5:[0015] [FC] 'qcow2: Use qcow2_signal_corruption() for overlaps'
004/5:[0068] [FC] 'qcow2: Check L1/L2/reftable entries for alignment'
005/5:[down] 'iotests: Add more tests for qcow2 corruption'


Max Reitz (5):
  qapi/block: Add "fatal" to BLOCK_IMAGE_CORRUPTED
  qcow2: Add qcow2_signal_corruption()
  qcow2: Use qcow2_signal_corruption() for overlaps
  qcow2: Check L1/L2/reftable entries for alignment
  iotests: Add more tests for qcow2 corruption

 block/qcow2-cluster.c      | 43 ++++++++++++++++++++++++++---
 block/qcow2-refcount.c     | 67 +++++++++++++++++++++++++++++++---------------
 block/qcow2.c              | 48 +++++++++++++++++++++++++++++++++
 block/qcow2.h              |  5 ++++
 qapi/block-core.json       |  9 +++++--
 tests/qemu-iotests/060     | 56 ++++++++++++++++++++++++++++++++++++--
 tests/qemu-iotests/060.out | 61 +++++++++++++++++++++++++++++++++++++----
 7 files changed, 255 insertions(+), 34 deletions(-)

-- 
2.1.0

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [Qemu-devel] [PATCH v2 1/5] qapi/block: Add "fatal" to BLOCK_IMAGE_CORRUPTED
  2014-09-05 14:07 [Qemu-devel] [PATCH v2 0/5] qcow2: Check L1/L2/reftable entries for alignment Max Reitz
@ 2014-09-05 14:07 ` Max Reitz
  2014-09-05 14:29   ` Eric Blake
                     ` (2 more replies)
  2014-09-05 14:07 ` [Qemu-devel] [PATCH v2 2/5] qcow2: Add qcow2_signal_corruption() Max Reitz
                   ` (4 subsequent siblings)
  5 siblings, 3 replies; 23+ messages in thread
From: Max Reitz @ 2014-09-05 14:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi, Max Reitz

Not every BLOCK_IMAGE_CORRUPTED event must be fatal; for example, when
reading from an image, they should generally not be. Nonetheless, even
an image only read from may of course be corrupted and this can be
detected during normal operation. In this case, a non-fatal event should
be emitted, but the image should not be marked corrupt (in accordance to
"fatal" set to false).

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/qcow2-refcount.c | 1 +
 qapi/block-core.json   | 9 +++++++--
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 43665b8..0bd75d2 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -1853,6 +1853,7 @@ int qcow2_pre_write_overlap_check(BlockDriverState *bs, int ign, int64_t offset,
                                               offset,
                                               true,
                                               size,
+                                              true,
                                               &error_abort);
         g_free(message);
 
diff --git a/qapi/block-core.json b/qapi/block-core.json
index a685d02..d23bcc2 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1554,7 +1554,7 @@
 ##
 # @BLOCK_IMAGE_CORRUPTED
 #
-# Emitted when a disk image is being marked corrupt
+# Emitted when a corruption has been detected in a disk image
 #
 # @device: device name
 #
@@ -1568,13 +1568,18 @@
 # @size: #optional, if the corruption resulted from an image access, this is
 #        the access size
 #
+# fatal: if set, the image is marked corrupt and therefore unusable after this
+#        event and must be repaired (Since 2.2; before, every
+#        BLOCK_IMAGE_CORRUPTED event was fatal)
+#
 # Since: 1.7
 ##
 { 'event': 'BLOCK_IMAGE_CORRUPTED',
   'data': { 'device' : 'str',
             'msg'    : 'str',
             '*offset': 'int',
-            '*size'  : 'int' } }
+            '*size'  : 'int',
+            'fatal'  : 'bool' } }
 
 ##
 # @BLOCK_IO_ERROR
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Qemu-devel] [PATCH v2 2/5] qcow2: Add qcow2_signal_corruption()
  2014-09-05 14:07 [Qemu-devel] [PATCH v2 0/5] qcow2: Check L1/L2/reftable entries for alignment Max Reitz
  2014-09-05 14:07 ` [Qemu-devel] [PATCH v2 1/5] qapi/block: Add "fatal" to BLOCK_IMAGE_CORRUPTED Max Reitz
@ 2014-09-05 14:07 ` Max Reitz
  2014-09-05 14:43   ` Eric Blake
  2014-09-08 14:15   ` Benoît Canet
  2014-09-05 14:07 ` [Qemu-devel] [PATCH v2 3/5] qcow2: Use qcow2_signal_corruption() for overlaps Max Reitz
                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 23+ messages in thread
From: Max Reitz @ 2014-09-05 14:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi, Max Reitz

Add a helper function for easily marking an image corrupt (on fatal
corruptions) while outputting an informative message to stderr and via
QAPI.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/qcow2.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
 block/qcow2.h |  5 +++++
 2 files changed, 53 insertions(+)

diff --git a/block/qcow2.c b/block/qcow2.c
index f9e045f..fc4217b 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -30,6 +30,8 @@
 #include "qemu/error-report.h"
 #include "qapi/qmp/qerror.h"
 #include "qapi/qmp/qbool.h"
+#include "qapi/qmp/types.h"
+#include "qapi-event.h"
 #include "trace.h"
 #include "qemu/option_int.h"
 
@@ -2478,6 +2480,52 @@ static int qcow2_amend_options(BlockDriverState *bs, QemuOpts *opts)
     return 0;
 }
 
+/*
+ * If offset or size are negative, respectively, they will not be included in
+ * the BLOCK_IMAGE_CORRUPTED event emitted.
+ * fatal will be ignored for read-only BDS; corruptions found there will always
+ * be considered non-fatal.
+ */
+void qcow2_signal_corruption(BlockDriverState *bs, bool fatal, int64_t offset,
+                             int64_t size, const char *message_format, ...)
+{
+    BDRVQcowState *s = bs->opaque;
+    char *message;
+    va_list ap;
+
+    fatal = fatal && !bs->read_only;
+
+    if (s->signaled_corruption &&
+        (!fatal || (s->incompatible_features & QCOW2_INCOMPAT_CORRUPT)))
+    {
+        return;
+    }
+
+    va_start(ap, message_format);
+    message = g_strdup_vprintf(message_format, ap);
+    va_end(ap);
+
+    if (fatal) {
+        fprintf(stderr, "qcow2: Marking image as corrupt: %s; further "
+                "corruption events will be suppressed\n", message);
+    } else {
+        fprintf(stderr, "qcow2: Image is corrupt: %s; further non-fatal "
+                "corruption events will be suppressed\n", message);
+    }
+
+    qapi_event_send_block_image_corrupted(bdrv_get_device_name(bs), message,
+                                          offset >= 0, offset, size >= 0, size,
+                                          fatal, &error_abort);
+    g_free(message);
+
+    if (fatal) {
+        qcow2_mark_corrupt(bs);
+        bs->drv = NULL; /* make BDS unusable */
+    }
+
+    s->signaled_corruption = true;
+}
+
 static QemuOptsList qcow2_create_opts = {
     .name = "qcow2-create-opts",
     .head = QTAILQ_HEAD_INITIALIZER(qcow2_create_opts.head),
diff --git a/block/qcow2.h b/block/qcow2.h
index 6aeb7ea..7b7b6a6 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -261,6 +261,7 @@ typedef struct BDRVQcowState {
     bool discard_passthrough[QCOW2_DISCARD_MAX];
 
     int overlap_check; /* bitmask of Qcow2MetadataOverlap values */
+    bool signaled_corruption;
 
     uint64_t incompatible_features;
     uint64_t compatible_features;
@@ -477,6 +478,10 @@ int qcow2_mark_corrupt(BlockDriverState *bs);
 int qcow2_mark_consistent(BlockDriverState *bs);
 int qcow2_update_header(BlockDriverState *bs);
 
+void qcow2_signal_corruption(BlockDriverState *bs, bool fatal, int64_t offset,
+                             int64_t size, const char *message_format, ...)
+                             GCC_FMT_ATTR(5, 6);
+
 /* qcow2-refcount.c functions */
 int qcow2_refcount_init(BlockDriverState *bs);
 void qcow2_refcount_close(BlockDriverState *bs);
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Qemu-devel] [PATCH v2 3/5] qcow2: Use qcow2_signal_corruption() for overlaps
  2014-09-05 14:07 [Qemu-devel] [PATCH v2 0/5] qcow2: Check L1/L2/reftable entries for alignment Max Reitz
  2014-09-05 14:07 ` [Qemu-devel] [PATCH v2 1/5] qapi/block: Add "fatal" to BLOCK_IMAGE_CORRUPTED Max Reitz
  2014-09-05 14:07 ` [Qemu-devel] [PATCH v2 2/5] qcow2: Add qcow2_signal_corruption() Max Reitz
@ 2014-09-05 14:07 ` Max Reitz
  2014-09-05 14:52   ` Eric Blake
  2014-09-08 14:21   ` Benoît Canet
  2014-09-05 14:07 ` [Qemu-devel] [PATCH v2 4/5] qcow2: Check L1/L2/reftable entries for alignment Max Reitz
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 23+ messages in thread
From: Max Reitz @ 2014-09-05 14:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi, Max Reitz

Use the new function in case of a failed overlap check.

This changes output in case of corruption, so adapt iotest 060's
reference output accordingly.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/qcow2-refcount.c     | 24 +++---------------------
 tests/qemu-iotests/060.out | 10 +++++-----
 2 files changed, 8 insertions(+), 26 deletions(-)

diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 0bd75d2..b9d421e 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -26,8 +26,6 @@
 #include "block/block_int.h"
 #include "block/qcow2.h"
 #include "qemu/range.h"
-#include "qapi/qmp/types.h"
-#include "qapi-event.h"
 
 static int64_t alloc_clusters_noref(BlockDriverState *bs, uint64_t size);
 static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
@@ -1838,27 +1836,11 @@ int qcow2_pre_write_overlap_check(BlockDriverState *bs, int ign, int64_t offset,
         return ret;
     } else if (ret > 0) {
         int metadata_ol_bitnr = ffs(ret) - 1;
-        char *message;
-
         assert(metadata_ol_bitnr < QCOW2_OL_MAX_BITNR);
 
-        fprintf(stderr, "qcow2: Preventing invalid write on metadata (overlaps "
-                "with %s); image marked as corrupt.\n",
-                metadata_ol_names[metadata_ol_bitnr]);
-        message = g_strdup_printf("Prevented %s overwrite",
-                metadata_ol_names[metadata_ol_bitnr]);
-        qapi_event_send_block_image_corrupted(bdrv_get_device_name(bs),
-                                              message,
-                                              true,
-                                              offset,
-                                              true,
-                                              size,
-                                              true,
-                                              &error_abort);
-        g_free(message);
-
-        qcow2_mark_corrupt(bs);
-        bs->drv = NULL; /* make BDS unusable */
+        qcow2_signal_corruption(bs, true, offset, size, "Preventing invalid "
+                                "write on metadata (overlaps with %s)",
+                                metadata_ol_names[metadata_ol_bitnr]);
         return -EIO;
     }
 
diff --git a/tests/qemu-iotests/060.out b/tests/qemu-iotests/060.out
index c27c952..30806da 100644
--- a/tests/qemu-iotests/060.out
+++ b/tests/qemu-iotests/060.out
@@ -8,7 +8,7 @@ ERROR cluster 3 refcount=1 reference=3
 1 errors were found on the image.
 Data may be corrupted, or further writes to the image may corrupt it.
 incompatible_features     0x0
-qcow2: Preventing invalid write on metadata (overlaps with active L1 table); image marked as corrupt.
+qcow2: Marking image as corrupt: Preventing invalid write on metadata (overlaps with active L1 table); further corruption events will be suppressed
 write failed: Input/output error
 incompatible_features     0x2
 qemu-io: can't open device TEST_DIR/t.IMGFMT: IMGFMT: Image is corrupt; cannot be opened read/write
@@ -24,7 +24,7 @@ ERROR cluster 2 refcount=1 reference=2
 2 errors were found on the image.
 Data may be corrupted, or further writes to the image may corrupt it.
 incompatible_features     0x0
-qcow2: Preventing invalid write on metadata (overlaps with refcount block); image marked as corrupt.
+qcow2: Marking image as corrupt: Preventing invalid write on metadata (overlaps with refcount block); further corruption events will be suppressed
 write failed: Input/output error
 incompatible_features     0x2
 Repairing refcount block 0 refcount=2
@@ -56,7 +56,7 @@ Data may be corrupted, or further writes to the image may corrupt it.
 1 leaked clusters were found on the image.
 This means waste of disk space, but no harm to data.
 incompatible_features     0x0
-qcow2: Preventing invalid write on metadata (overlaps with inactive L2 table); image marked as corrupt.
+qcow2: Marking image as corrupt: Preventing invalid write on metadata (overlaps with inactive L2 table); further corruption events will be suppressed
 write failed: Input/output error
 incompatible_features     0x2
 Repairing cluster 4 refcount=1 reference=2
@@ -88,7 +88,7 @@ wrote 65536/65536 bytes at offset 536870912
 64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 discard 65536/65536 bytes at offset 0
 64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-qcow2: Preventing invalid write on metadata (overlaps with active L2 table); image marked as corrupt.
+qcow2: Marking image as corrupt: Preventing invalid write on metadata (overlaps with active L2 table); further corruption events will be suppressed
 blkdebug: Suspended request '0'
 write failed: Input/output error
 blkdebug: Resuming request '0'
@@ -99,6 +99,6 @@ aio_write failed: No medium found
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 
 wrote 65536/65536 bytes at offset 0
 64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-qcow2: Preventing invalid write on metadata (overlaps with qcow2_header); image marked as corrupt.
+qcow2: Marking image as corrupt: Preventing invalid write on metadata (overlaps with qcow2_header); further corruption events will be suppressed
 write failed: Input/output error
 *** done
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Qemu-devel] [PATCH v2 4/5] qcow2: Check L1/L2/reftable entries for alignment
  2014-09-05 14:07 [Qemu-devel] [PATCH v2 0/5] qcow2: Check L1/L2/reftable entries for alignment Max Reitz
                   ` (2 preceding siblings ...)
  2014-09-05 14:07 ` [Qemu-devel] [PATCH v2 3/5] qcow2: Use qcow2_signal_corruption() for overlaps Max Reitz
@ 2014-09-05 14:07 ` Max Reitz
  2014-09-05 15:03   ` Eric Blake
  2014-09-08 14:40   ` Benoît Canet
  2014-09-05 14:07 ` [Qemu-devel] [PATCH v2 5/5] iotests: Add more tests for qcow2 corruption Max Reitz
  2014-09-16 13:48 ` [Qemu-devel] [PATCH v2 0/5] qcow2: Check L1/L2/reftable entries for alignment Stefan Hajnoczi
  5 siblings, 2 replies; 23+ messages in thread
From: Max Reitz @ 2014-09-05 14:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi, Max Reitz

Offsets taken from the L1, L2 and refcount tables are generally assumed
to be correctly aligned. However, this cannot be guaranteed if the image
has been written to by something different than qemu, thus check all
offsets taken from these tables for correct cluster alignment.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/qcow2-cluster.c  | 43 ++++++++++++++++++++++++++++++++++++++++---
 block/qcow2-refcount.c | 44 ++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 82 insertions(+), 5 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 735f687..f7dd8c0 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -486,6 +486,13 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
         goto out;
     }
 
+    if (offset_into_cluster(s, l2_offset)) {
+        qcow2_signal_corruption(bs, true, -1, -1, "L2 table offset %#" PRIx64
+                                " unaligned (L1 index: %#" PRIx64 ")",
+                                l2_offset, l1_index);
+        return -EIO;
+    }
+
     /* load the l2 table in memory */
 
     ret = l2_load(bs, l2_offset, &l2_table);
@@ -508,8 +515,11 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
         break;
     case QCOW2_CLUSTER_ZERO:
         if (s->qcow_version < 3) {
-            qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
-            return -EIO;
+            qcow2_signal_corruption(bs, true, -1, -1, "Zero cluster entry found"
+                                    " in pre-v3 image (L2 offset: %#" PRIx64
+                                    ", L2 index: %#x)", l2_offset, l2_index);
+            ret = -EIO;
+            goto fail;
         }
         c = count_contiguous_clusters(nb_clusters, s->cluster_size,
                 &l2_table[l2_index], QCOW_OFLAG_ZERO);
@@ -525,6 +535,14 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
         c = count_contiguous_clusters(nb_clusters, s->cluster_size,
                 &l2_table[l2_index], QCOW_OFLAG_ZERO);
         *cluster_offset &= L2E_OFFSET_MASK;
+        if (offset_into_cluster(s, *cluster_offset)) {
+            qcow2_signal_corruption(bs, true, -1, -1, "Data cluster offset %#"
+                                    PRIx64 " unaligned (L2 offset: %#" PRIx64
+                                    ", L2 index: %#x)", *cluster_offset,
+                                    l2_offset, l2_index);
+            ret = -EIO;
+            goto fail;
+        }
         break;
     default:
         abort();
@@ -541,6 +559,10 @@ out:
     *num = nb_available - index_in_cluster;
 
     return ret;
+
+fail:
+    qcow2_cache_put(bs, s->l2_table_cache, (void **)&l2_table);
+    return ret;
 }
 
 /*
@@ -576,6 +598,12 @@ static int get_cluster_table(BlockDriverState *bs, uint64_t offset,
 
     assert(l1_index < s->l1_size);
     l2_offset = s->l1_table[l1_index] & L1E_OFFSET_MASK;
+    if (offset_into_cluster(s, l2_offset)) {
+        qcow2_signal_corruption(bs, true, -1, -1, "L2 table offset %#" PRIx64
+                                " unaligned (L1 index: %#" PRIx64 ")",
+                                l2_offset, l1_index);
+        return -EIO;
+    }
 
     /* seek the l2 table of the given l2 offset */
 
@@ -948,6 +976,15 @@ static int handle_copied(BlockDriverState *bs, uint64_t guest_offset,
         bool offset_matches =
             (cluster_offset & L2E_OFFSET_MASK) == *host_offset;
 
+        if (offset_into_cluster(s, cluster_offset & L2E_OFFSET_MASK)) {
+            qcow2_signal_corruption(bs, true, -1, -1, "Data cluster offset "
+                                    "%#llx unaligned (guest offset: %#" PRIx64
+                                    ")", cluster_offset & L2E_OFFSET_MASK,
+                                    guest_offset);
+            ret = -EIO;
+            goto out;
+        }
+
         if (*host_offset != 0 && !offset_matches) {
             *bytes = 0;
             ret = 0;
@@ -979,7 +1016,7 @@ out:
 
     /* Only return a host offset if we actually made progress. Otherwise we
      * would make requirements for handle_alloc() that it can't fulfill */
-    if (ret) {
+    if (ret > 0) {
         *host_offset = (cluster_offset & L2E_OFFSET_MASK)
                      + offset_into_cluster(s, guest_offset);
     }
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index b9d421e..2bcaaf9 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -108,6 +108,13 @@ static int get_refcount(BlockDriverState *bs, int64_t cluster_index)
     if (!refcount_block_offset)
         return 0;
 
+    if (offset_into_cluster(s, refcount_block_offset)) {
+        qcow2_signal_corruption(bs, true, -1, -1, "Refblock offset %#" PRIx64
+                                " unaligned (reftable index: %#" PRIx64 ")",
+                                refcount_block_offset, refcount_table_index);
+        return -EIO;
+    }
+
     ret = qcow2_cache_get(bs, s->refcount_block_cache, refcount_block_offset,
         (void**) &refcount_block);
     if (ret < 0) {
@@ -181,6 +188,14 @@ static int alloc_refcount_block(BlockDriverState *bs,
 
         /* If it's already there, we're done */
         if (refcount_block_offset) {
+            if (offset_into_cluster(s, refcount_block_offset)) {
+                qcow2_signal_corruption(bs, true, -1, -1, "Refblock offset %#"
+                                        PRIx64 " unaligned (reftable index: "
+                                        "%#x)", refcount_block_offset,
+                                        refcount_table_index);
+                return -EIO;
+            }
+
              return load_refcount_block(bs, refcount_block_offset,
                  (void**) refcount_block);
         }
@@ -836,8 +851,14 @@ void qcow2_free_any_clusters(BlockDriverState *bs, uint64_t l2_entry,
     case QCOW2_CLUSTER_NORMAL:
     case QCOW2_CLUSTER_ZERO:
         if (l2_entry & L2E_OFFSET_MASK) {
-            qcow2_free_clusters(bs, l2_entry & L2E_OFFSET_MASK,
-                                nb_clusters << s->cluster_bits, type);
+            if (offset_into_cluster(s, l2_entry & L2E_OFFSET_MASK)) {
+                qcow2_signal_corruption(bs, false, -1, -1,
+                                        "Cannot free unaligned cluster %#llx",
+                                        l2_entry & L2E_OFFSET_MASK);
+            } else {
+                qcow2_free_clusters(bs, l2_entry & L2E_OFFSET_MASK,
+                                    nb_clusters << s->cluster_bits, type);
+            }
         }
         break;
     case QCOW2_CLUSTER_UNALLOCATED:
@@ -901,6 +922,14 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
             old_l2_offset = l2_offset;
             l2_offset &= L1E_OFFSET_MASK;
 
+            if (offset_into_cluster(s, l2_offset)) {
+                qcow2_signal_corruption(bs, true, -1, -1, "L2 table offset %#"
+                                        PRIx64 " unaligned (L1 index: %#x)",
+                                        l2_offset, i);
+                ret = -EIO;
+                goto fail;
+            }
+
             ret = qcow2_cache_get(bs, s->l2_table_cache, l2_offset,
                 (void**) &l2_table);
             if (ret < 0) {
@@ -933,6 +962,17 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
 
                     case QCOW2_CLUSTER_NORMAL:
                     case QCOW2_CLUSTER_ZERO:
+                        if (offset_into_cluster(s, offset & L2E_OFFSET_MASK)) {
+                            qcow2_signal_corruption(bs, true, -1, -1, "Data "
+                                                    "cluster offset %#llx "
+                                                    "unaligned (L2 offset: %#"
+                                                    PRIx64 ", L2 index: %#x)",
+                                                    offset & L2E_OFFSET_MASK,
+                                                    l2_offset, j);
+                            ret = -EIO;
+                            goto fail;
+                        }
+
                         cluster_index = (offset & L2E_OFFSET_MASK) >> s->cluster_bits;
                         if (!cluster_index) {
                             /* unallocated */
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Qemu-devel] [PATCH v2 5/5] iotests: Add more tests for qcow2 corruption
  2014-09-05 14:07 [Qemu-devel] [PATCH v2 0/5] qcow2: Check L1/L2/reftable entries for alignment Max Reitz
                   ` (3 preceding siblings ...)
  2014-09-05 14:07 ` [Qemu-devel] [PATCH v2 4/5] qcow2: Check L1/L2/reftable entries for alignment Max Reitz
@ 2014-09-05 14:07 ` Max Reitz
  2014-09-05 15:09   ` Eric Blake
  2014-09-16 13:48 ` [Qemu-devel] [PATCH v2 0/5] qcow2: Check L1/L2/reftable entries for alignment Stefan Hajnoczi
  5 siblings, 1 reply; 23+ messages in thread
From: Max Reitz @ 2014-09-05 14:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi, Max Reitz

Add tests for unaligned L1/L2/reftable entries and non-fatal corruption
reports.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 tests/qemu-iotests/060     | 56 ++++++++++++++++++++++++++++++++++++++++++++--
 tests/qemu-iotests/060.out | 51 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 105 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/060 b/tests/qemu-iotests/060
index 830386f..2355567 100755
--- a/tests/qemu-iotests/060
+++ b/tests/qemu-iotests/060
@@ -169,9 +169,61 @@ echo "=== Testing unallocated image header ==="
 echo
 _make_test_img 64M
 # Create L1/L2
-$QEMU_IO -c "$OPEN_RW" -c "write 0 64k" | _filter_qemu_io
+$QEMU_IO -c "write 0 64k" "$TEST_IMG" | _filter_qemu_io
 poke_file "$TEST_IMG" "$rb_offset" "\x00\x00"
-$QEMU_IO -c "$OPEN_RW" -c "write 64k 64k" | _filter_qemu_io
+$QEMU_IO -c "write 64k 64k" "$TEST_IMG" | _filter_qemu_io
+
+echo
+echo "=== Testing unaligned L1 entry ==="
+echo
+_make_test_img 64M
+$QEMU_IO -c "write 0 64k" "$TEST_IMG" | _filter_qemu_io
+# This will be masked with ~(512 - 1) = ~0x1ff, so whether the lower 9 bits are
+# aligned or not does not matter
+poke_file "$TEST_IMG" "$l1_offset" "\x80\x00\x00\x00\x00\x04\x2a\x00"
+$QEMU_IO -c "read 0 64k" "$TEST_IMG" | _filter_qemu_io
+
+echo
+echo "=== Testing unaligned L2 entry ==="
+echo
+_make_test_img 64M
+$QEMU_IO -c "write 0 64k" "$TEST_IMG" | _filter_qemu_io
+poke_file "$TEST_IMG" "$l2_offset" "\x80\x00\x00\x00\x00\x05\x2a\x00"
+$QEMU_IO -c "read 0 64k" "$TEST_IMG" | _filter_qemu_io
+
+echo
+echo "=== Testing unaligned reftable entry ==="
+echo
+_make_test_img 64M
+poke_file "$TEST_IMG" "$rt_offset" "\x00\x00\x00\x00\x00\x02\x2a\x00"
+$QEMU_IO -c "write 0 64k" "$TEST_IMG" | _filter_qemu_io
+
+echo
+echo "=== Testing non-fatal corruption on freeing ==="
+echo
+_make_test_img 64M
+$QEMU_IO -c "write 0 64k" "$TEST_IMG" | _filter_qemu_io
+poke_file "$TEST_IMG" "$l2_offset" "\x80\x00\x00\x00\x00\x05\x2a\x00"
+$QEMU_IO -c "discard 0 64k" "$TEST_IMG" | _filter_qemu_io
+
+echo
+echo "=== Testing read-only corruption report ==="
+echo
+_make_test_img 64M
+$QEMU_IO -c "write 0 64k" "$TEST_IMG" | _filter_qemu_io
+poke_file "$TEST_IMG" "$l2_offset" "\x80\x00\x00\x00\x00\x05\x2a\x00"
+# Should only emit a single error message
+$QEMU_IO -c "$OPEN_RO" -c "read 0 64k" -c "read 0 64k" | _filter_qemu_io
+
+echo
+echo "=== Testing non-fatal and then fatal corruption report ==="
+echo
+_make_test_img 64M
+$QEMU_IO -c "write 0 128k" "$TEST_IMG" | _filter_qemu_io
+poke_file "$TEST_IMG" "$l2_offset"        "\x80\x00\x00\x00\x00\x05\x2a\x00"
+poke_file "$TEST_IMG" "$(($l2_offset+8))" "\x80\x00\x00\x00\x00\x06\x2a\x00"
+# Should emit two error messages
+$QEMU_IO -c "discard 0 64k" -c "read 64k 64k" "$TEST_IMG" | _filter_qemu_io
 
 # success, all done
 echo "*** done"
diff --git a/tests/qemu-iotests/060.out b/tests/qemu-iotests/060.out
index 30806da..4f0c6d0 100644
--- a/tests/qemu-iotests/060.out
+++ b/tests/qemu-iotests/060.out
@@ -101,4 +101,55 @@ wrote 65536/65536 bytes at offset 0
 64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 qcow2: Marking image as corrupt: Preventing invalid write on metadata (overlaps with qcow2_header); further corruption events will be suppressed
 write failed: Input/output error
+
+=== Testing unaligned L1 entry ===
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 
+wrote 65536/65536 bytes at offset 0
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+qcow2: Marking image as corrupt: L2 table offset 0x42a00 unaligned (L1 index: 0); further corruption events will be suppressed
+read failed: Input/output error
+
+=== Testing unaligned L2 entry ===
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 
+wrote 65536/65536 bytes at offset 0
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+qcow2: Marking image as corrupt: Data cluster offset 0x52a00 unaligned (L2 offset: 0x40000, L2 index: 0); further corruption events will be suppressed
+read failed: Input/output error
+
+=== Testing unaligned reftable entry ===
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 
+qcow2: Marking image as corrupt: Refblock offset 0x22a00 unaligned (reftable index: 0); further corruption events will be suppressed
+write failed: Input/output error
+
+=== Testing non-fatal corruption on freeing ===
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 
+wrote 65536/65536 bytes at offset 0
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+qcow2: Image is corrupt: Cannot free unaligned cluster 0x52a00; further non-fatal corruption events will be suppressed
+discard 65536/65536 bytes at offset 0
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+=== Testing read-only corruption report ===
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 
+wrote 65536/65536 bytes at offset 0
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+qcow2: Image is corrupt: Data cluster offset 0x52a00 unaligned (L2 offset: 0x40000, L2 index: 0); further non-fatal corruption events will be suppressed
+read failed: Input/output error
+read failed: Input/output error
+
+=== Testing non-fatal and then fatal corruption report ===
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 
+wrote 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+qcow2: Image is corrupt: Cannot free unaligned cluster 0x52a00; further non-fatal corruption events will be suppressed
+qcow2: Marking image as corrupt: Data cluster offset 0x62a00 unaligned (L2 offset: 0x40000, L2 index: 0x1); further corruption events will be suppressed
+discard 65536/65536 bytes at offset 0
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read failed: Input/output error
 *** done
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] [PATCH v2 1/5] qapi/block: Add "fatal" to BLOCK_IMAGE_CORRUPTED
  2014-09-05 14:07 ` [Qemu-devel] [PATCH v2 1/5] qapi/block: Add "fatal" to BLOCK_IMAGE_CORRUPTED Max Reitz
@ 2014-09-05 14:29   ` Eric Blake
  2014-09-05 14:40   ` Eric Blake
  2014-09-08 14:01   ` Benoît Canet
  2 siblings, 0 replies; 23+ messages in thread
From: Eric Blake @ 2014-09-05 14:29 UTC (permalink / raw)
  To: Max Reitz, qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi

[-- Attachment #1: Type: text/plain, Size: 957 bytes --]

On 09/05/2014 08:07 AM, Max Reitz wrote:
> Not every BLOCK_IMAGE_CORRUPTED event must be fatal; for example, when
> reading from an image, they should generally not be. Nonetheless, even
> an image only read from may of course be corrupted and this can be
> detected during normal operation. In this case, a non-fatal event should
> be emitted, but the image should not be marked corrupt (in accordance to
> "fatal" set to false).
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>  block/qcow2-refcount.c | 1 +
>  qapi/block-core.json   | 9 +++++++--
>  2 files changed, 8 insertions(+), 2 deletions(-)

As this is purely additive, it should not break any former clients.
Older clients not knowing to look for the fatal flag will treat
everything as fatal, but that's okay.

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 539 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] [PATCH v2 1/5] qapi/block: Add "fatal" to BLOCK_IMAGE_CORRUPTED
  2014-09-05 14:07 ` [Qemu-devel] [PATCH v2 1/5] qapi/block: Add "fatal" to BLOCK_IMAGE_CORRUPTED Max Reitz
  2014-09-05 14:29   ` Eric Blake
@ 2014-09-05 14:40   ` Eric Blake
  2014-09-05 14:47     ` Max Reitz
  2014-09-08 14:01   ` Benoît Canet
  2 siblings, 1 reply; 23+ messages in thread
From: Eric Blake @ 2014-09-05 14:40 UTC (permalink / raw)
  To: Max Reitz, qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi

[-- Attachment #1: Type: text/plain, Size: 899 bytes --]

On 09/05/2014 08:07 AM, Max Reitz wrote:
> Not every BLOCK_IMAGE_CORRUPTED event must be fatal; for example, when
> reading from an image, they should generally not be. Nonetheless, even
> an image only read from may of course be corrupted and this can be
> detected during normal operation. In this case, a non-fatal event should
> be emitted, but the image should not be marked corrupt (in accordance to
> "fatal" set to false).
> 

Question - what happens if management misses the signal?  For example,
if libvirt opens qemu on a read-only image, then goes away, then
corruption is detected, then libvirt reconnects.  Does query-block need
to also be updated to report whether a read-only BDS is currently
detected as fatal, but that an event has already been delivered?

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 539 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/5] qcow2: Add qcow2_signal_corruption()
  2014-09-05 14:07 ` [Qemu-devel] [PATCH v2 2/5] qcow2: Add qcow2_signal_corruption() Max Reitz
@ 2014-09-05 14:43   ` Eric Blake
  2014-09-08 14:15   ` Benoît Canet
  1 sibling, 0 replies; 23+ messages in thread
From: Eric Blake @ 2014-09-05 14:43 UTC (permalink / raw)
  To: Max Reitz, qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi

[-- Attachment #1: Type: text/plain, Size: 935 bytes --]

On 09/05/2014 08:07 AM, Max Reitz wrote:
> Add a helper function for easily marking an image corrupt (on fatal
> corruptions) while outputting an informative message to stderr and via
> QAPI.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>  block/qcow2.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
>  block/qcow2.h |  5 +++++
>  2 files changed, 53 insertions(+)
> 

> +void qcow2_signal_corruption(BlockDriverState *bs, bool fatal, int64_t offset,
> +                             int64_t size, const char *message_format, ...)
> +{
> +    BDRVQcowState *s = bs->opaque;
> +    char *message;
> +    va_list ap;
> +
> +    fatal = fatal && !bs->read_only;

I probably would have written: fatal &= !bs->read_only
but that's cosmetic.

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 539 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] [PATCH v2 1/5] qapi/block: Add "fatal" to BLOCK_IMAGE_CORRUPTED
  2014-09-05 14:40   ` Eric Blake
@ 2014-09-05 14:47     ` Max Reitz
  2014-09-05 14:51       ` Eric Blake
  0 siblings, 1 reply; 23+ messages in thread
From: Max Reitz @ 2014-09-05 14:47 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi

On 05.09.2014 16:40, Eric Blake wrote:
> On 09/05/2014 08:07 AM, Max Reitz wrote:
>> Not every BLOCK_IMAGE_CORRUPTED event must be fatal; for example, when
>> reading from an image, they should generally not be. Nonetheless, even
>> an image only read from may of course be corrupted and this can be
>> detected during normal operation. In this case, a non-fatal event should
>> be emitted, but the image should not be marked corrupt (in accordance to
>> "fatal" set to false).
>>
> Question - what happens if management misses the signal?  For example,
> if libvirt opens qemu on a read-only image, then goes away, then
> corruption is detected, then libvirt reconnects.  Does query-block need
> to also be updated to report whether a read-only BDS is currently
> detected as fatal, but that an event has already been delivered?

Well, the obvious problem with that is that corruption currently is a 
strongly format-specific topic, and only reported for qcow2. So, to do 
this, we'd have to move the corruption signalling code into the common 
block layer functions and proceed from there. This actually probably 
isn't too bad of an idea, anyway. But then we'll need a global 
bdrv_mark_corrupt() function (which we probably don't want in case we 
get more flags in the future, so we'll rather want bdrv_set_flag() or 
something) and I don't really want to make these changes now... We could 
postpone this, though. Making this change later shouldn't break anything.

The other solution would be simply to suppress the stderr message but to 
always deliver the QMP event.

Max

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] [PATCH v2 1/5] qapi/block: Add "fatal" to BLOCK_IMAGE_CORRUPTED
  2014-09-05 14:47     ` Max Reitz
@ 2014-09-05 14:51       ` Eric Blake
  2014-09-05 14:53         ` Max Reitz
  0 siblings, 1 reply; 23+ messages in thread
From: Eric Blake @ 2014-09-05 14:51 UTC (permalink / raw)
  To: Max Reitz, qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi

[-- Attachment #1: Type: text/plain, Size: 2143 bytes --]

On 09/05/2014 08:47 AM, Max Reitz wrote:
> On 05.09.2014 16:40, Eric Blake wrote:
>> On 09/05/2014 08:07 AM, Max Reitz wrote:
>>> Not every BLOCK_IMAGE_CORRUPTED event must be fatal; for example, when
>>> reading from an image, they should generally not be. Nonetheless, even
>>> an image only read from may of course be corrupted and this can be
>>> detected during normal operation. In this case, a non-fatal event should
>>> be emitted, but the image should not be marked corrupt (in accordance to
>>> "fatal" set to false).
>>>
>> Question - what happens if management misses the signal?  For example,
>> if libvirt opens qemu on a read-only image, then goes away, then
>> corruption is detected, then libvirt reconnects.  Does query-block need
>> to also be updated to report whether a read-only BDS is currently
>> detected as fatal, but that an event has already been delivered?
> 
> Well, the obvious problem with that is that corruption currently is a
> strongly format-specific topic, and only reported for qcow2. So, to do
> this, we'd have to move the corruption signalling code into the common
> block layer functions and proceed from there. This actually probably
> isn't too bad of an idea, anyway. But then we'll need a global
> bdrv_mark_corrupt() function (which we probably don't want in case we
> get more flags in the future, so we'll rather want bdrv_set_flag() or
> something) and I don't really want to make these changes now... We could
> postpone this, though. Making this change later shouldn't break anything.
> 
> The other solution would be simply to suppress the stderr message but to
> always deliver the QMP event.

That feels like it would flood the channel, if the image is being
repeatedly read.  The approach of warning exactly once is nice because
it prevents flooding.  We already have a way to report per-format
details, is it sufficient to modify ImageInfoSpecificQCow2 to add a bool
field that reports true if the image is currently known to be corrupted?

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 539 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] [PATCH v2 3/5] qcow2: Use qcow2_signal_corruption() for overlaps
  2014-09-05 14:07 ` [Qemu-devel] [PATCH v2 3/5] qcow2: Use qcow2_signal_corruption() for overlaps Max Reitz
@ 2014-09-05 14:52   ` Eric Blake
  2014-09-08 14:21   ` Benoît Canet
  1 sibling, 0 replies; 23+ messages in thread
From: Eric Blake @ 2014-09-05 14:52 UTC (permalink / raw)
  To: Max Reitz, qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi

[-- Attachment #1: Type: text/plain, Size: 589 bytes --]

On 09/05/2014 08:07 AM, Max Reitz wrote:
> Use the new function in case of a failed overlap check.
> 
> This changes output in case of corruption, so adapt iotest 060's
> reference output accordingly.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>  block/qcow2-refcount.c     | 24 +++---------------------
>  tests/qemu-iotests/060.out | 10 +++++-----
>  2 files changed, 8 insertions(+), 26 deletions(-)
> 

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 539 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] [PATCH v2 1/5] qapi/block: Add "fatal" to BLOCK_IMAGE_CORRUPTED
  2014-09-05 14:51       ` Eric Blake
@ 2014-09-05 14:53         ` Max Reitz
  0 siblings, 0 replies; 23+ messages in thread
From: Max Reitz @ 2014-09-05 14:53 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi

On 05.09.2014 16:51, Eric Blake wrote:
> On 09/05/2014 08:47 AM, Max Reitz wrote:
>> On 05.09.2014 16:40, Eric Blake wrote:
>>> On 09/05/2014 08:07 AM, Max Reitz wrote:
>>>> Not every BLOCK_IMAGE_CORRUPTED event must be fatal; for example, when
>>>> reading from an image, they should generally not be. Nonetheless, even
>>>> an image only read from may of course be corrupted and this can be
>>>> detected during normal operation. In this case, a non-fatal event should
>>>> be emitted, but the image should not be marked corrupt (in accordance to
>>>> "fatal" set to false).
>>>>
>>> Question - what happens if management misses the signal?  For example,
>>> if libvirt opens qemu on a read-only image, then goes away, then
>>> corruption is detected, then libvirt reconnects.  Does query-block need
>>> to also be updated to report whether a read-only BDS is currently
>>> detected as fatal, but that an event has already been delivered?
>> Well, the obvious problem with that is that corruption currently is a
>> strongly format-specific topic, and only reported for qcow2. So, to do
>> this, we'd have to move the corruption signalling code into the common
>> block layer functions and proceed from there. This actually probably
>> isn't too bad of an idea, anyway. But then we'll need a global
>> bdrv_mark_corrupt() function (which we probably don't want in case we
>> get more flags in the future, so we'll rather want bdrv_set_flag() or
>> something) and I don't really want to make these changes now... We could
>> postpone this, though. Making this change later shouldn't break anything.
>>
>> The other solution would be simply to suppress the stderr message but to
>> always deliver the QMP event.
> That feels like it would flood the channel, if the image is being
> repeatedly read.  The approach of warning exactly once is nice because
> it prevents flooding.  We already have a way to report per-format
> details, is it sufficient to modify ImageInfoSpecificQCow2 to add a bool
> field that reports true if the image is currently known to be corrupted?

Actually, I'm asking myself right now why we don't already have such a 
field. :-)

I'll add it in an independent patch.

Max

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] [PATCH v2 4/5] qcow2: Check L1/L2/reftable entries for alignment
  2014-09-05 14:07 ` [Qemu-devel] [PATCH v2 4/5] qcow2: Check L1/L2/reftable entries for alignment Max Reitz
@ 2014-09-05 15:03   ` Eric Blake
  2014-09-08 14:40   ` Benoît Canet
  1 sibling, 0 replies; 23+ messages in thread
From: Eric Blake @ 2014-09-05 15:03 UTC (permalink / raw)
  To: Max Reitz, qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi

[-- Attachment #1: Type: text/plain, Size: 759 bytes --]

On 09/05/2014 08:07 AM, Max Reitz wrote:
> Offsets taken from the L1, L2 and refcount tables are generally assumed
> to be correctly aligned. However, this cannot be guaranteed if the image
> has been written to by something different than qemu, thus check all
> offsets taken from these tables for correct cluster alignment.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>  block/qcow2-cluster.c  | 43 ++++++++++++++++++++++++++++++++++++++++---
>  block/qcow2-refcount.c | 44 ++++++++++++++++++++++++++++++++++++++++++--
>  2 files changed, 82 insertions(+), 5 deletions(-)
> 

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 539 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] [PATCH v2 5/5] iotests: Add more tests for qcow2 corruption
  2014-09-05 14:07 ` [Qemu-devel] [PATCH v2 5/5] iotests: Add more tests for qcow2 corruption Max Reitz
@ 2014-09-05 15:09   ` Eric Blake
  0 siblings, 0 replies; 23+ messages in thread
From: Eric Blake @ 2014-09-05 15:09 UTC (permalink / raw)
  To: Max Reitz, qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi

[-- Attachment #1: Type: text/plain, Size: 566 bytes --]

On 09/05/2014 08:07 AM, Max Reitz wrote:
> Add tests for unaligned L1/L2/reftable entries and non-fatal corruption
> reports.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>  tests/qemu-iotests/060     | 56 ++++++++++++++++++++++++++++++++++++++++++++--
>  tests/qemu-iotests/060.out | 51 +++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 105 insertions(+), 2 deletions(-)
> 

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 539 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] [PATCH v2 1/5] qapi/block: Add "fatal" to BLOCK_IMAGE_CORRUPTED
  2014-09-05 14:07 ` [Qemu-devel] [PATCH v2 1/5] qapi/block: Add "fatal" to BLOCK_IMAGE_CORRUPTED Max Reitz
  2014-09-05 14:29   ` Eric Blake
  2014-09-05 14:40   ` Eric Blake
@ 2014-09-08 14:01   ` Benoît Canet
  2014-09-08 17:40     ` Max Reitz
  2 siblings, 1 reply; 23+ messages in thread
From: Benoît Canet @ 2014-09-08 14:01 UTC (permalink / raw)
  To: Max Reitz; +Cc: Kevin Wolf, qemu-devel, Stefan Hajnoczi

The Friday 05 Sep 2014 à 16:07:15 (+0200), Max Reitz wrote :
> Not every BLOCK_IMAGE_CORRUPTED event must be fatal; for example, when
> reading from an image, they should generally not be. Nonetheless, even
> an image only read from may of course be corrupted and this can be
> detected during normal operation. In this case, a non-fatal event should
> be emitted, but the image should not be marked corrupt (in accordance to
> "fatal" set to false).
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>  block/qcow2-refcount.c | 1 +
>  qapi/block-core.json   | 9 +++++++--
>  2 files changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
> index 43665b8..0bd75d2 100644
> --- a/block/qcow2-refcount.c
> +++ b/block/qcow2-refcount.c
> @@ -1853,6 +1853,7 @@ int qcow2_pre_write_overlap_check(BlockDriverState *bs, int ign, int64_t offset,
>                                                offset,
>                                                true,
>                                                size,

> +                                              true,

What is this line ?

>                                                &error_abort);
>          g_free(message);
>  
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index a685d02..d23bcc2 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -1554,7 +1554,7 @@
>  ##
>  # @BLOCK_IMAGE_CORRUPTED
>  #
> -# Emitted when a disk image is being marked corrupt
> +# Emitted when a corruption has been detected in a disk image
>  #
>  # @device: device name
>  #
> @@ -1568,13 +1568,18 @@
>  # @size: #optional, if the corruption resulted from an image access, this is
>  #        the access size
>  #
> +# fatal: if set, the image is marked corrupt and therefore unusable after this
> +#        event and must be repaired (Since 2.2; before, every
> +#        BLOCK_IMAGE_CORRUPTED event was fatal)
> +#
>  # Since: 1.7
>  ##
>  { 'event': 'BLOCK_IMAGE_CORRUPTED',
>    'data': { 'device' : 'str',
>              'msg'    : 'str',
>              '*offset': 'int',
> -            '*size'  : 'int' } }
> +            '*size'  : 'int',
> +            'fatal'  : 'bool' } }
>  
>  ##
>  # @BLOCK_IO_ERROR
> -- 
> 2.1.0
> 
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/5] qcow2: Add qcow2_signal_corruption()
  2014-09-05 14:07 ` [Qemu-devel] [PATCH v2 2/5] qcow2: Add qcow2_signal_corruption() Max Reitz
  2014-09-05 14:43   ` Eric Blake
@ 2014-09-08 14:15   ` Benoît Canet
  1 sibling, 0 replies; 23+ messages in thread
From: Benoît Canet @ 2014-09-08 14:15 UTC (permalink / raw)
  To: Max Reitz; +Cc: Kevin Wolf, qemu-devel, Stefan Hajnoczi

The Friday 05 Sep 2014 à 16:07:16 (+0200), Max Reitz wrote :
> Add a helper function for easily marking an image corrupt (on fatal
> corruptions) while outputting an informative message to stderr and via
> QAPI.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>  block/qcow2.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
>  block/qcow2.h |  5 +++++
>  2 files changed, 53 insertions(+)
> 
> diff --git a/block/qcow2.c b/block/qcow2.c
> index f9e045f..fc4217b 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -30,6 +30,8 @@
>  #include "qemu/error-report.h"
>  #include "qapi/qmp/qerror.h"
>  #include "qapi/qmp/qbool.h"
> +#include "qapi/qmp/types.h"
> +#include "qapi-event.h"
>  #include "trace.h"
>  #include "qemu/option_int.h"
>  
> @@ -2478,6 +2480,52 @@ static int qcow2_amend_options(BlockDriverState *bs, QemuOpts *opts)
>      return 0;
>  }
>  
> +/*
> + * If offset or size are negative, respectively, they will not be included in
> + * the BLOCK_IMAGE_CORRUPTED event emitted.
> + * fatal will be ignored for read-only BDS; corruptions found there will always
> + * be considered non-fatal.
> + */
> +void qcow2_signal_corruption(BlockDriverState *bs, bool fatal, int64_t offset,
> +                             int64_t size, const char *message_format, ...)
> +{
> +    BDRVQcowState *s = bs->opaque;
> +    char *message;
> +    va_list ap;
> +
> +    fatal = fatal && !bs->read_only;
> +
> +    if (s->signaled_corruption &&
> +        (!fatal || (s->incompatible_features & QCOW2_INCOMPAT_CORRUPT)))
> +    {
> +        return;
> +    }
> +
> +    va_start(ap, message_format);
> +    message = g_strdup_vprintf(message_format, ap);
> +    va_end(ap);
> +
> +    if (fatal) {
> +        fprintf(stderr, "qcow2: Marking image as corrupt: %s; further "
> +                "corruption events will be suppressed\n", message);
> +    } else {
> +        fprintf(stderr, "qcow2: Image is corrupt: %s; further non-fatal "
> +                "corruption events will be suppressed\n", message);
> +    }
> +
> +    qapi_event_send_block_image_corrupted(bdrv_get_device_name(bs), message,
> +                                          offset >= 0, offset, size >= 0, size,
> +                                          fatal, &error_abort);
> +    g_free(message);
> +
> +    if (fatal) {
> +        qcow2_mark_corrupt(bs);
> +        bs->drv = NULL; /* make BDS unusable */
> +    }
> +
> +    s->signaled_corruption = true;
> +}
> +
>  static QemuOptsList qcow2_create_opts = {
>      .name = "qcow2-create-opts",
>      .head = QTAILQ_HEAD_INITIALIZER(qcow2_create_opts.head),
> diff --git a/block/qcow2.h b/block/qcow2.h
> index 6aeb7ea..7b7b6a6 100644
> --- a/block/qcow2.h
> +++ b/block/qcow2.h
> @@ -261,6 +261,7 @@ typedef struct BDRVQcowState {
>      bool discard_passthrough[QCOW2_DISCARD_MAX];
>  
>      int overlap_check; /* bitmask of Qcow2MetadataOverlap values */
> +    bool signaled_corruption;
>  
>      uint64_t incompatible_features;
>      uint64_t compatible_features;
> @@ -477,6 +478,10 @@ int qcow2_mark_corrupt(BlockDriverState *bs);
>  int qcow2_mark_consistent(BlockDriverState *bs);
>  int qcow2_update_header(BlockDriverState *bs);
>  
> +void qcow2_signal_corruption(BlockDriverState *bs, bool fatal, int64_t offset,
> +                             int64_t size, const char *message_format, ...)
> +                             GCC_FMT_ATTR(5, 6);
> +
>  /* qcow2-refcount.c functions */
>  int qcow2_refcount_init(BlockDriverState *bs);
>  void qcow2_refcount_close(BlockDriverState *bs);
> -- 
> 2.1.0
> 
> 
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] [PATCH v2 3/5] qcow2: Use qcow2_signal_corruption() for overlaps
  2014-09-05 14:07 ` [Qemu-devel] [PATCH v2 3/5] qcow2: Use qcow2_signal_corruption() for overlaps Max Reitz
  2014-09-05 14:52   ` Eric Blake
@ 2014-09-08 14:21   ` Benoît Canet
  1 sibling, 0 replies; 23+ messages in thread
From: Benoît Canet @ 2014-09-08 14:21 UTC (permalink / raw)
  To: Max Reitz; +Cc: Kevin Wolf, qemu-devel, Stefan Hajnoczi

The Friday 05 Sep 2014 à 16:07:17 (+0200), Max Reitz wrote :
> Use the new function in case of a failed overlap check.
> 
> This changes output in case of corruption, so adapt iotest 060's
> reference output accordingly.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>  block/qcow2-refcount.c     | 24 +++---------------------
>  tests/qemu-iotests/060.out | 10 +++++-----
>  2 files changed, 8 insertions(+), 26 deletions(-)
> 
> diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
> index 0bd75d2..b9d421e 100644
> --- a/block/qcow2-refcount.c
> +++ b/block/qcow2-refcount.c
> @@ -26,8 +26,6 @@
>  #include "block/block_int.h"
>  #include "block/qcow2.h"
>  #include "qemu/range.h"
> -#include "qapi/qmp/types.h"
> -#include "qapi-event.h"
>  
>  static int64_t alloc_clusters_noref(BlockDriverState *bs, uint64_t size);
>  static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
> @@ -1838,27 +1836,11 @@ int qcow2_pre_write_overlap_check(BlockDriverState *bs, int ign, int64_t offset,
>          return ret;
>      } else if (ret > 0) {
>          int metadata_ol_bitnr = ffs(ret) - 1;
> -        char *message;
> -
>          assert(metadata_ol_bitnr < QCOW2_OL_MAX_BITNR);
>  
> -        fprintf(stderr, "qcow2: Preventing invalid write on metadata (overlaps "
> -                "with %s); image marked as corrupt.\n",
> -                metadata_ol_names[metadata_ol_bitnr]);
> -        message = g_strdup_printf("Prevented %s overwrite",
> -                metadata_ol_names[metadata_ol_bitnr]);
> -        qapi_event_send_block_image_corrupted(bdrv_get_device_name(bs),
> -                                              message,
> -                                              true,
> -                                              offset,
> -                                              true,
> -                                              size,
> -                                              true,
> -                                              &error_abort);
> -        g_free(message);
> -
> -        qcow2_mark_corrupt(bs);
> -        bs->drv = NULL; /* make BDS unusable */
> +        qcow2_signal_corruption(bs, true, offset, size, "Preventing invalid "
> +                                "write on metadata (overlaps with %s)",
> +                                metadata_ol_names[metadata_ol_bitnr]);
>          return -EIO;
>      }
>  
> diff --git a/tests/qemu-iotests/060.out b/tests/qemu-iotests/060.out
> index c27c952..30806da 100644
> --- a/tests/qemu-iotests/060.out
> +++ b/tests/qemu-iotests/060.out
> @@ -8,7 +8,7 @@ ERROR cluster 3 refcount=1 reference=3
>  1 errors were found on the image.
>  Data may be corrupted, or further writes to the image may corrupt it.
>  incompatible_features     0x0
> -qcow2: Preventing invalid write on metadata (overlaps with active L1 table); image marked as corrupt.
> +qcow2: Marking image as corrupt: Preventing invalid write on metadata (overlaps with active L1 table); further corruption events will be suppressed
>  write failed: Input/output error
>  incompatible_features     0x2
>  qemu-io: can't open device TEST_DIR/t.IMGFMT: IMGFMT: Image is corrupt; cannot be opened read/write
> @@ -24,7 +24,7 @@ ERROR cluster 2 refcount=1 reference=2
>  2 errors were found on the image.
>  Data may be corrupted, or further writes to the image may corrupt it.
>  incompatible_features     0x0
> -qcow2: Preventing invalid write on metadata (overlaps with refcount block); image marked as corrupt.
> +qcow2: Marking image as corrupt: Preventing invalid write on metadata (overlaps with refcount block); further corruption events will be suppressed
>  write failed: Input/output error
>  incompatible_features     0x2
>  Repairing refcount block 0 refcount=2
> @@ -56,7 +56,7 @@ Data may be corrupted, or further writes to the image may corrupt it.
>  1 leaked clusters were found on the image.
>  This means waste of disk space, but no harm to data.
>  incompatible_features     0x0
> -qcow2: Preventing invalid write on metadata (overlaps with inactive L2 table); image marked as corrupt.
> +qcow2: Marking image as corrupt: Preventing invalid write on metadata (overlaps with inactive L2 table); further corruption events will be suppressed
>  write failed: Input/output error
>  incompatible_features     0x2
>  Repairing cluster 4 refcount=1 reference=2
> @@ -88,7 +88,7 @@ wrote 65536/65536 bytes at offset 536870912
>  64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>  discard 65536/65536 bytes at offset 0
>  64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> -qcow2: Preventing invalid write on metadata (overlaps with active L2 table); image marked as corrupt.
> +qcow2: Marking image as corrupt: Preventing invalid write on metadata (overlaps with active L2 table); further corruption events will be suppressed
>  blkdebug: Suspended request '0'
>  write failed: Input/output error
>  blkdebug: Resuming request '0'
> @@ -99,6 +99,6 @@ aio_write failed: No medium found
>  Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 
>  wrote 65536/65536 bytes at offset 0
>  64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> -qcow2: Preventing invalid write on metadata (overlaps with qcow2_header); image marked as corrupt.
> +qcow2: Marking image as corrupt: Preventing invalid write on metadata (overlaps with qcow2_header); further corruption events will be suppressed
>  write failed: Input/output error
>  *** done
> -- 
> 2.1.0
> 
> 
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] [PATCH v2 4/5] qcow2: Check L1/L2/reftable entries for alignment
  2014-09-05 14:07 ` [Qemu-devel] [PATCH v2 4/5] qcow2: Check L1/L2/reftable entries for alignment Max Reitz
  2014-09-05 15:03   ` Eric Blake
@ 2014-09-08 14:40   ` Benoît Canet
  2014-09-08 17:47     ` Max Reitz
  1 sibling, 1 reply; 23+ messages in thread
From: Benoît Canet @ 2014-09-08 14:40 UTC (permalink / raw)
  To: Max Reitz; +Cc: Kevin Wolf, qemu-devel, Stefan Hajnoczi

The Friday 05 Sep 2014 à 16:07:18 (+0200), Max Reitz wrote :
> Offsets taken from the L1, L2 and refcount tables are generally assumed
> to be correctly aligned. However, this cannot be guaranteed if the image
> has been written to by something different than qemu, thus check all
> offsets taken from these tables for correct cluster alignment.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>  block/qcow2-cluster.c  | 43 ++++++++++++++++++++++++++++++++++++++++---
>  block/qcow2-refcount.c | 44 ++++++++++++++++++++++++++++++++++++++++++--
>  2 files changed, 82 insertions(+), 5 deletions(-)
> 
> diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
> index 735f687..f7dd8c0 100644
> --- a/block/qcow2-cluster.c
> +++ b/block/qcow2-cluster.c
> @@ -486,6 +486,13 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
>          goto out;
>      }
>  
> +    if (offset_into_cluster(s, l2_offset)) {
> +        qcow2_signal_corruption(bs, true, -1, -1, "L2 table offset %#" PRIx64
> +                                " unaligned (L1 index: %#" PRIx64 ")",
> +                                l2_offset, l1_index);
> +        return -EIO;

This function mix return ret and goto out and there is more of the second.
Can we do ret = -EIO and goto out for consistency ?
bs->drv == NULL after qcow2_signal_corruption so we are not afraid of out
sides effects.

> +    }
> +
>      /* load the l2 table in memory */
>  
>      ret = l2_load(bs, l2_offset, &l2_table);
> @@ -508,8 +515,11 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
>          break;
>      case QCOW2_CLUSTER_ZERO:
>          if (s->qcow_version < 3) {
> -            qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
> -            return -EIO;
> +            qcow2_signal_corruption(bs, true, -1, -1, "Zero cluster entry found"
> +                                    " in pre-v3 image (L2 offset: %#" PRIx64
> +                                    ", L2 index: %#x)", l2_offset, l2_index);
> +            ret = -EIO;
> +            goto fail;
>          }
>          c = count_contiguous_clusters(nb_clusters, s->cluster_size,
>                  &l2_table[l2_index], QCOW_OFLAG_ZERO);
> @@ -525,6 +535,14 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
>          c = count_contiguous_clusters(nb_clusters, s->cluster_size,
>                  &l2_table[l2_index], QCOW_OFLAG_ZERO);
>          *cluster_offset &= L2E_OFFSET_MASK;
> +        if (offset_into_cluster(s, *cluster_offset)) {
> +            qcow2_signal_corruption(bs, true, -1, -1, "Data cluster offset %#"
> +                                    PRIx64 " unaligned (L2 offset: %#" PRIx64
> +                                    ", L2 index: %#x)", *cluster_offset,
> +                                    l2_offset, l2_index);
> +            ret = -EIO;
> +            goto fail;
> +        }
>          break;
>      default:
>          abort();
> @@ -541,6 +559,10 @@ out:
>      *num = nb_available - index_in_cluster;
>  
>      return ret;
> +
> +fail:
> +    qcow2_cache_put(bs, s->l2_table_cache, (void **)&l2_table);
> +    return ret;
>  }
>  
>  /*
> @@ -576,6 +598,12 @@ static int get_cluster_table(BlockDriverState *bs, uint64_t offset,
>  
>      assert(l1_index < s->l1_size);
>      l2_offset = s->l1_table[l1_index] & L1E_OFFSET_MASK;
> +    if (offset_into_cluster(s, l2_offset)) {
> +        qcow2_signal_corruption(bs, true, -1, -1, "L2 table offset %#" PRIx64
> +                                " unaligned (L1 index: %#" PRIx64 ")",
> +                                l2_offset, l1_index);
> +        return -EIO;
> +    }
>  
>      /* seek the l2 table of the given l2 offset */
>  
> @@ -948,6 +976,15 @@ static int handle_copied(BlockDriverState *bs, uint64_t guest_offset,
>          bool offset_matches =
>              (cluster_offset & L2E_OFFSET_MASK) == *host_offset;
>  
> +        if (offset_into_cluster(s, cluster_offset & L2E_OFFSET_MASK)) {
> +            qcow2_signal_corruption(bs, true, -1, -1, "Data cluster offset "
> +                                    "%#llx unaligned (guest offset: %#" PRIx64
> +                                    ")", cluster_offset & L2E_OFFSET_MASK,
> +                                    guest_offset);
> +            ret = -EIO;
> +            goto out;
> +        }
> +
>          if (*host_offset != 0 && !offset_matches) {
>              *bytes = 0;
>              ret = 0;
> @@ -979,7 +1016,7 @@ out:
>  
>      /* Only return a host offset if we actually made progress. Otherwise we
>       * would make requirements for handle_alloc() that it can't fulfill */
> -    if (ret) {
> +    if (ret > 0) {
>          *host_offset = (cluster_offset & L2E_OFFSET_MASK)
>                       + offset_into_cluster(s, guest_offset);
>      }
> diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
> index b9d421e..2bcaaf9 100644
> --- a/block/qcow2-refcount.c
> +++ b/block/qcow2-refcount.c
> @@ -108,6 +108,13 @@ static int get_refcount(BlockDriverState *bs, int64_t cluster_index)
>      if (!refcount_block_offset)
>          return 0;
>  
> +    if (offset_into_cluster(s, refcount_block_offset)) {
> +        qcow2_signal_corruption(bs, true, -1, -1, "Refblock offset %#" PRIx64
> +                                " unaligned (reftable index: %#" PRIx64 ")",
> +                                refcount_block_offset, refcount_table_index);
> +        return -EIO;
> +    }
> +
>      ret = qcow2_cache_get(bs, s->refcount_block_cache, refcount_block_offset,
>          (void**) &refcount_block);
>      if (ret < 0) {
> @@ -181,6 +188,14 @@ static int alloc_refcount_block(BlockDriverState *bs,
>  
>          /* If it's already there, we're done */
>          if (refcount_block_offset) {
> +            if (offset_into_cluster(s, refcount_block_offset)) {
> +                qcow2_signal_corruption(bs, true, -1, -1, "Refblock offset %#"
> +                                        PRIx64 " unaligned (reftable index: "
> +                                        "%#x)", refcount_block_offset,
> +                                        refcount_table_index);
> +                return -EIO;
> +            }
> +
>               return load_refcount_block(bs, refcount_block_offset,
>                   (void**) refcount_block);
>          }
> @@ -836,8 +851,14 @@ void qcow2_free_any_clusters(BlockDriverState *bs, uint64_t l2_entry,
>      case QCOW2_CLUSTER_NORMAL:
>      case QCOW2_CLUSTER_ZERO:
>          if (l2_entry & L2E_OFFSET_MASK) {
> -            qcow2_free_clusters(bs, l2_entry & L2E_OFFSET_MASK,
> -                                nb_clusters << s->cluster_bits, type);
> +            if (offset_into_cluster(s, l2_entry & L2E_OFFSET_MASK)) {
> +                qcow2_signal_corruption(bs, false, -1, -1,
> +                                        "Cannot free unaligned cluster %#llx",
> +                                        l2_entry & L2E_OFFSET_MASK);
> +            } else {
> +                qcow2_free_clusters(bs, l2_entry & L2E_OFFSET_MASK,
> +                                    nb_clusters << s->cluster_bits, type);
> +            }
>          }
>          break;
>      case QCOW2_CLUSTER_UNALLOCATED:
> @@ -901,6 +922,14 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
>              old_l2_offset = l2_offset;
>              l2_offset &= L1E_OFFSET_MASK;
>  
> +            if (offset_into_cluster(s, l2_offset)) {
> +                qcow2_signal_corruption(bs, true, -1, -1, "L2 table offset %#"
> +                                        PRIx64 " unaligned (L1 index: %#x)",
> +                                        l2_offset, i);
> +                ret = -EIO;
> +                goto fail;
> +            }
> +
>              ret = qcow2_cache_get(bs, s->l2_table_cache, l2_offset,
>                  (void**) &l2_table);
>              if (ret < 0) {
> @@ -933,6 +962,17 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
>  
>                      case QCOW2_CLUSTER_NORMAL:
>                      case QCOW2_CLUSTER_ZERO:
> +                        if (offset_into_cluster(s, offset & L2E_OFFSET_MASK)) {
> +                            qcow2_signal_corruption(bs, true, -1, -1, "Data "
> +                                                    "cluster offset %#llx "
> +                                                    "unaligned (L2 offset: %#"
> +                                                    PRIx64 ", L2 index: %#x)",
> +                                                    offset & L2E_OFFSET_MASK,
> +                                                    l2_offset, j);
> +                            ret = -EIO;
> +                            goto fail;
> +                        }
> +
>                          cluster_index = (offset & L2E_OFFSET_MASK) >> s->cluster_bits;
>                          if (!cluster_index) {
>                              /* unallocated */
> -- 
> 2.1.0
> 
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] [PATCH v2 1/5] qapi/block: Add "fatal" to BLOCK_IMAGE_CORRUPTED
  2014-09-08 14:01   ` Benoît Canet
@ 2014-09-08 17:40     ` Max Reitz
  0 siblings, 0 replies; 23+ messages in thread
From: Max Reitz @ 2014-09-08 17:40 UTC (permalink / raw)
  To: Benoît Canet; +Cc: Kevin Wolf, qemu-devel, Stefan Hajnoczi

On 08.09.2014 16:01, Benoît Canet wrote:
> The Friday 05 Sep 2014 à 16:07:15 (+0200), Max Reitz wrote :
>> Not every BLOCK_IMAGE_CORRUPTED event must be fatal; for example, when
>> reading from an image, they should generally not be. Nonetheless, even
>> an image only read from may of course be corrupted and this can be
>> detected during normal operation. In this case, a non-fatal event should
>> be emitted, but the image should not be marked corrupt (in accordance to
>> "fatal" set to false).
>>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>> ---
>>   block/qcow2-refcount.c | 1 +
>>   qapi/block-core.json   | 9 +++++++--
>>   2 files changed, 8 insertions(+), 2 deletions(-)
>>
>> diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
>> index 43665b8..0bd75d2 100644
>> --- a/block/qcow2-refcount.c
>> +++ b/block/qcow2-refcount.c
>> @@ -1853,6 +1853,7 @@ int qcow2_pre_write_overlap_check(BlockDriverState *bs, int ign, int64_t offset,
>>                                                 offset,
>>                                                 true,
>>                                                 size,
>> +                                              true,
> What is this line ?

It's the new "fatal" field in BLOCK_IMAGE_CORRUPTED (missing context: 
qapi_event_send_block_image_corrupted(bdrv_get_device_name(bs), message, 
true, offset, true, size, +true, &error_abort)).

Max

>>                                                 &error_abort);
>>           g_free(message);
>>   
>> diff --git a/qapi/block-core.json b/qapi/block-core.json
>> index a685d02..d23bcc2 100644
>> --- a/qapi/block-core.json
>> +++ b/qapi/block-core.json
>> @@ -1554,7 +1554,7 @@
>>   ##
>>   # @BLOCK_IMAGE_CORRUPTED
>>   #
>> -# Emitted when a disk image is being marked corrupt
>> +# Emitted when a corruption has been detected in a disk image
>>   #
>>   # @device: device name
>>   #
>> @@ -1568,13 +1568,18 @@
>>   # @size: #optional, if the corruption resulted from an image access, this is
>>   #        the access size
>>   #
>> +# fatal: if set, the image is marked corrupt and therefore unusable after this
>> +#        event and must be repaired (Since 2.2; before, every
>> +#        BLOCK_IMAGE_CORRUPTED event was fatal)
>> +#
>>   # Since: 1.7
>>   ##
>>   { 'event': 'BLOCK_IMAGE_CORRUPTED',
>>     'data': { 'device' : 'str',
>>               'msg'    : 'str',
>>               '*offset': 'int',
>> -            '*size'  : 'int' } }
>> +            '*size'  : 'int',
>> +            'fatal'  : 'bool' } }
>>   
>>   ##
>>   # @BLOCK_IO_ERROR
>> -- 
>> 2.1.0
>>
>>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] [PATCH v2 4/5] qcow2: Check L1/L2/reftable entries for alignment
  2014-09-08 14:40   ` Benoît Canet
@ 2014-09-08 17:47     ` Max Reitz
  2014-09-08 18:03       ` Benoît Canet
  0 siblings, 1 reply; 23+ messages in thread
From: Max Reitz @ 2014-09-08 17:47 UTC (permalink / raw)
  To: Benoît Canet; +Cc: Kevin Wolf, qemu-devel, Stefan Hajnoczi

On 08.09.2014 16:40, Benoît Canet wrote:
> The Friday 05 Sep 2014 à 16:07:18 (+0200), Max Reitz wrote :
>> Offsets taken from the L1, L2 and refcount tables are generally assumed
>> to be correctly aligned. However, this cannot be guaranteed if the image
>> has been written to by something different than qemu, thus check all
>> offsets taken from these tables for correct cluster alignment.
>>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>> ---
>>   block/qcow2-cluster.c  | 43 ++++++++++++++++++++++++++++++++++++++++---
>>   block/qcow2-refcount.c | 44 ++++++++++++++++++++++++++++++++++++++++++--
>>   2 files changed, 82 insertions(+), 5 deletions(-)
>>
>> diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
>> index 735f687..f7dd8c0 100644
>> --- a/block/qcow2-cluster.c
>> +++ b/block/qcow2-cluster.c
>> @@ -486,6 +486,13 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
>>           goto out;
>>       }
>>   
>> +    if (offset_into_cluster(s, l2_offset)) {
>> +        qcow2_signal_corruption(bs, true, -1, -1, "L2 table offset %#" PRIx64
>> +                                " unaligned (L1 index: %#" PRIx64 ")",
>> +                                l2_offset, l1_index);
>> +        return -EIO;
> This function mix return ret and goto out and there is more of the second.
> Can we do ret = -EIO and goto out for consistency ?
> bs->drv == NULL after qcow2_signal_corruption so we are not afraid of out
> sides effects.

The "out" label here is for success; that's why I introduced the "fail" 
label in this series. I could make qcow2_cache_put() in the fail path 
optional and then use goto fail, though. But this would only increase 
the code size with no real benefit apparent to me (no code 
deduplication; and as far as I remember, we have many functions with 
fail labels which however use a plain "return" before cleaning up is 
needed).

(before this patch, there were two places using "goto out" in this 
function, both of which were "successes" (cluster found to be 
unallocated)); and two places using "return -errno", both of which were 
failures (the first one due to l2_load() failing and the second one due 
to a zero cluster found in a pre-v3 image))

Max

>> +    }
>> +
>>       /* load the l2 table in memory */
>>   
>>       ret = l2_load(bs, l2_offset, &l2_table);
>> @@ -508,8 +515,11 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
>>           break;
>>       case QCOW2_CLUSTER_ZERO:
>>           if (s->qcow_version < 3) {
>> -            qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
>> -            return -EIO;
>> +            qcow2_signal_corruption(bs, true, -1, -1, "Zero cluster entry found"
>> +                                    " in pre-v3 image (L2 offset: %#" PRIx64
>> +                                    ", L2 index: %#x)", l2_offset, l2_index);
>> +            ret = -EIO;
>> +            goto fail;
>>           }
>>           c = count_contiguous_clusters(nb_clusters, s->cluster_size,
>>                   &l2_table[l2_index], QCOW_OFLAG_ZERO);
>> @@ -525,6 +535,14 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
>>           c = count_contiguous_clusters(nb_clusters, s->cluster_size,
>>                   &l2_table[l2_index], QCOW_OFLAG_ZERO);
>>           *cluster_offset &= L2E_OFFSET_MASK;
>> +        if (offset_into_cluster(s, *cluster_offset)) {
>> +            qcow2_signal_corruption(bs, true, -1, -1, "Data cluster offset %#"
>> +                                    PRIx64 " unaligned (L2 offset: %#" PRIx64
>> +                                    ", L2 index: %#x)", *cluster_offset,
>> +                                    l2_offset, l2_index);
>> +            ret = -EIO;
>> +            goto fail;
>> +        }
>>           break;
>>       default:
>>           abort();
>> @@ -541,6 +559,10 @@ out:
>>       *num = nb_available - index_in_cluster;
>>   
>>       return ret;
>> +
>> +fail:
>> +    qcow2_cache_put(bs, s->l2_table_cache, (void **)&l2_table);
>> +    return ret;
>>   }
>>   
>>   /*
>> @@ -576,6 +598,12 @@ static int get_cluster_table(BlockDriverState *bs, uint64_t offset,
>>   
>>       assert(l1_index < s->l1_size);
>>       l2_offset = s->l1_table[l1_index] & L1E_OFFSET_MASK;
>> +    if (offset_into_cluster(s, l2_offset)) {
>> +        qcow2_signal_corruption(bs, true, -1, -1, "L2 table offset %#" PRIx64
>> +                                " unaligned (L1 index: %#" PRIx64 ")",
>> +                                l2_offset, l1_index);
>> +        return -EIO;
>> +    }
>>   
>>       /* seek the l2 table of the given l2 offset */
>>   
>> @@ -948,6 +976,15 @@ static int handle_copied(BlockDriverState *bs, uint64_t guest_offset,
>>           bool offset_matches =
>>               (cluster_offset & L2E_OFFSET_MASK) == *host_offset;
>>   
>> +        if (offset_into_cluster(s, cluster_offset & L2E_OFFSET_MASK)) {
>> +            qcow2_signal_corruption(bs, true, -1, -1, "Data cluster offset "
>> +                                    "%#llx unaligned (guest offset: %#" PRIx64
>> +                                    ")", cluster_offset & L2E_OFFSET_MASK,
>> +                                    guest_offset);
>> +            ret = -EIO;
>> +            goto out;
>> +        }
>> +
>>           if (*host_offset != 0 && !offset_matches) {
>>               *bytes = 0;
>>               ret = 0;
>> @@ -979,7 +1016,7 @@ out:
>>   
>>       /* Only return a host offset if we actually made progress. Otherwise we
>>        * would make requirements for handle_alloc() that it can't fulfill */
>> -    if (ret) {
>> +    if (ret > 0) {
>>           *host_offset = (cluster_offset & L2E_OFFSET_MASK)
>>                        + offset_into_cluster(s, guest_offset);
>>       }
>> diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
>> index b9d421e..2bcaaf9 100644
>> --- a/block/qcow2-refcount.c
>> +++ b/block/qcow2-refcount.c
>> @@ -108,6 +108,13 @@ static int get_refcount(BlockDriverState *bs, int64_t cluster_index)
>>       if (!refcount_block_offset)
>>           return 0;
>>   
>> +    if (offset_into_cluster(s, refcount_block_offset)) {
>> +        qcow2_signal_corruption(bs, true, -1, -1, "Refblock offset %#" PRIx64
>> +                                " unaligned (reftable index: %#" PRIx64 ")",
>> +                                refcount_block_offset, refcount_table_index);
>> +        return -EIO;
>> +    }
>> +
>>       ret = qcow2_cache_get(bs, s->refcount_block_cache, refcount_block_offset,
>>           (void**) &refcount_block);
>>       if (ret < 0) {
>> @@ -181,6 +188,14 @@ static int alloc_refcount_block(BlockDriverState *bs,
>>   
>>           /* If it's already there, we're done */
>>           if (refcount_block_offset) {
>> +            if (offset_into_cluster(s, refcount_block_offset)) {
>> +                qcow2_signal_corruption(bs, true, -1, -1, "Refblock offset %#"
>> +                                        PRIx64 " unaligned (reftable index: "
>> +                                        "%#x)", refcount_block_offset,
>> +                                        refcount_table_index);
>> +                return -EIO;
>> +            }
>> +
>>                return load_refcount_block(bs, refcount_block_offset,
>>                    (void**) refcount_block);
>>           }
>> @@ -836,8 +851,14 @@ void qcow2_free_any_clusters(BlockDriverState *bs, uint64_t l2_entry,
>>       case QCOW2_CLUSTER_NORMAL:
>>       case QCOW2_CLUSTER_ZERO:
>>           if (l2_entry & L2E_OFFSET_MASK) {
>> -            qcow2_free_clusters(bs, l2_entry & L2E_OFFSET_MASK,
>> -                                nb_clusters << s->cluster_bits, type);
>> +            if (offset_into_cluster(s, l2_entry & L2E_OFFSET_MASK)) {
>> +                qcow2_signal_corruption(bs, false, -1, -1,
>> +                                        "Cannot free unaligned cluster %#llx",
>> +                                        l2_entry & L2E_OFFSET_MASK);
>> +            } else {
>> +                qcow2_free_clusters(bs, l2_entry & L2E_OFFSET_MASK,
>> +                                    nb_clusters << s->cluster_bits, type);
>> +            }
>>           }
>>           break;
>>       case QCOW2_CLUSTER_UNALLOCATED:
>> @@ -901,6 +922,14 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
>>               old_l2_offset = l2_offset;
>>               l2_offset &= L1E_OFFSET_MASK;
>>   
>> +            if (offset_into_cluster(s, l2_offset)) {
>> +                qcow2_signal_corruption(bs, true, -1, -1, "L2 table offset %#"
>> +                                        PRIx64 " unaligned (L1 index: %#x)",
>> +                                        l2_offset, i);
>> +                ret = -EIO;
>> +                goto fail;
>> +            }
>> +
>>               ret = qcow2_cache_get(bs, s->l2_table_cache, l2_offset,
>>                   (void**) &l2_table);
>>               if (ret < 0) {
>> @@ -933,6 +962,17 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
>>   
>>                       case QCOW2_CLUSTER_NORMAL:
>>                       case QCOW2_CLUSTER_ZERO:
>> +                        if (offset_into_cluster(s, offset & L2E_OFFSET_MASK)) {
>> +                            qcow2_signal_corruption(bs, true, -1, -1, "Data "
>> +                                                    "cluster offset %#llx "
>> +                                                    "unaligned (L2 offset: %#"
>> +                                                    PRIx64 ", L2 index: %#x)",
>> +                                                    offset & L2E_OFFSET_MASK,
>> +                                                    l2_offset, j);
>> +                            ret = -EIO;
>> +                            goto fail;
>> +                        }
>> +
>>                           cluster_index = (offset & L2E_OFFSET_MASK) >> s->cluster_bits;
>>                           if (!cluster_index) {
>>                               /* unallocated */
>> -- 
>> 2.1.0
>>
>>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] [PATCH v2 4/5] qcow2: Check L1/L2/reftable entries for alignment
  2014-09-08 17:47     ` Max Reitz
@ 2014-09-08 18:03       ` Benoît Canet
  0 siblings, 0 replies; 23+ messages in thread
From: Benoît Canet @ 2014-09-08 18:03 UTC (permalink / raw)
  To: Max Reitz; +Cc: Benoît Canet, Kevin Wolf, qemu-devel, Stefan Hajnoczi

The Monday 08 Sep 2014 à 19:47:31 (+0200), Max Reitz wrote :
> On 08.09.2014 16:40, Benoît Canet wrote:
> >The Friday 05 Sep 2014 à 16:07:18 (+0200), Max Reitz wrote :
> >>Offsets taken from the L1, L2 and refcount tables are generally assumed
> >>to be correctly aligned. However, this cannot be guaranteed if the image
> >>has been written to by something different than qemu, thus check all
> >>offsets taken from these tables for correct cluster alignment.
> >>
> >>Signed-off-by: Max Reitz <mreitz@redhat.com>
> >>---
> >>  block/qcow2-cluster.c  | 43 ++++++++++++++++++++++++++++++++++++++++---
> >>  block/qcow2-refcount.c | 44 ++++++++++++++++++++++++++++++++++++++++++--
> >>  2 files changed, 82 insertions(+), 5 deletions(-)
> >>
> >>diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
> >>index 735f687..f7dd8c0 100644
> >>--- a/block/qcow2-cluster.c
> >>+++ b/block/qcow2-cluster.c
> >>@@ -486,6 +486,13 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
> >>          goto out;
> >>      }
> >>+    if (offset_into_cluster(s, l2_offset)) {
> >>+        qcow2_signal_corruption(bs, true, -1, -1, "L2 table offset %#" PRIx64
> >>+                                " unaligned (L1 index: %#" PRIx64 ")",
> >>+                                l2_offset, l1_index);
> >>+        return -EIO;
> >This function mix return ret and goto out and there is more of the second.
> >Can we do ret = -EIO and goto out for consistency ?
> >bs->drv == NULL after qcow2_signal_corruption so we are not afraid of out
> >sides effects.
> 
> The "out" label here is for success; that's why I introduced the "fail"
> label in this series. I could make qcow2_cache_put() in the fail path
> optional and then use goto fail, though. But this would only increase the
> code size with no real benefit apparent to me (no code deduplication; and as
> far as I remember, we have many functions with fail labels which however use
> a plain "return" before cleaning up is needed).
> 
> (before this patch, there were two places using "goto out" in this function,
> both of which were "successes" (cluster found to be unallocated)); and two
> places using "return -errno", both of which were failures (the first one due
> to l2_load() failing and the second one due to a zero cluster found in a
> pre-v3 image))

Thanks for the explanation this make me think I should question and improve
the quality of my reviews.

Best regards

Benoît

> 
> Max
> 
> >>+    }
> >>+
> >>      /* load the l2 table in memory */
> >>      ret = l2_load(bs, l2_offset, &l2_table);
> >>@@ -508,8 +515,11 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
> >>          break;
> >>      case QCOW2_CLUSTER_ZERO:
> >>          if (s->qcow_version < 3) {
> >>-            qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
> >>-            return -EIO;
> >>+            qcow2_signal_corruption(bs, true, -1, -1, "Zero cluster entry found"
> >>+                                    " in pre-v3 image (L2 offset: %#" PRIx64
> >>+                                    ", L2 index: %#x)", l2_offset, l2_index);
> >>+            ret = -EIO;
> >>+            goto fail;
> >>          }
> >>          c = count_contiguous_clusters(nb_clusters, s->cluster_size,
> >>                  &l2_table[l2_index], QCOW_OFLAG_ZERO);
> >>@@ -525,6 +535,14 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
> >>          c = count_contiguous_clusters(nb_clusters, s->cluster_size,
> >>                  &l2_table[l2_index], QCOW_OFLAG_ZERO);
> >>          *cluster_offset &= L2E_OFFSET_MASK;
> >>+        if (offset_into_cluster(s, *cluster_offset)) {
> >>+            qcow2_signal_corruption(bs, true, -1, -1, "Data cluster offset %#"
> >>+                                    PRIx64 " unaligned (L2 offset: %#" PRIx64
> >>+                                    ", L2 index: %#x)", *cluster_offset,
> >>+                                    l2_offset, l2_index);
> >>+            ret = -EIO;
> >>+            goto fail;
> >>+        }
> >>          break;
> >>      default:
> >>          abort();
> >>@@ -541,6 +559,10 @@ out:
> >>      *num = nb_available - index_in_cluster;
> >>      return ret;
> >>+
> >>+fail:
> >>+    qcow2_cache_put(bs, s->l2_table_cache, (void **)&l2_table);
> >>+    return ret;
> >>  }
> >>  /*
> >>@@ -576,6 +598,12 @@ static int get_cluster_table(BlockDriverState *bs, uint64_t offset,
> >>      assert(l1_index < s->l1_size);
> >>      l2_offset = s->l1_table[l1_index] & L1E_OFFSET_MASK;
> >>+    if (offset_into_cluster(s, l2_offset)) {
> >>+        qcow2_signal_corruption(bs, true, -1, -1, "L2 table offset %#" PRIx64
> >>+                                " unaligned (L1 index: %#" PRIx64 ")",
> >>+                                l2_offset, l1_index);
> >>+        return -EIO;
> >>+    }
> >>      /* seek the l2 table of the given l2 offset */
> >>@@ -948,6 +976,15 @@ static int handle_copied(BlockDriverState *bs, uint64_t guest_offset,
> >>          bool offset_matches =
> >>              (cluster_offset & L2E_OFFSET_MASK) == *host_offset;
> >>+        if (offset_into_cluster(s, cluster_offset & L2E_OFFSET_MASK)) {
> >>+            qcow2_signal_corruption(bs, true, -1, -1, "Data cluster offset "
> >>+                                    "%#llx unaligned (guest offset: %#" PRIx64
> >>+                                    ")", cluster_offset & L2E_OFFSET_MASK,
> >>+                                    guest_offset);
> >>+            ret = -EIO;
> >>+            goto out;
> >>+        }
> >>+
> >>          if (*host_offset != 0 && !offset_matches) {
> >>              *bytes = 0;
> >>              ret = 0;
> >>@@ -979,7 +1016,7 @@ out:
> >>      /* Only return a host offset if we actually made progress. Otherwise we
> >>       * would make requirements for handle_alloc() that it can't fulfill */
> >>-    if (ret) {
> >>+    if (ret > 0) {
> >>          *host_offset = (cluster_offset & L2E_OFFSET_MASK)
> >>                       + offset_into_cluster(s, guest_offset);
> >>      }
> >>diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
> >>index b9d421e..2bcaaf9 100644
> >>--- a/block/qcow2-refcount.c
> >>+++ b/block/qcow2-refcount.c
> >>@@ -108,6 +108,13 @@ static int get_refcount(BlockDriverState *bs, int64_t cluster_index)
> >>      if (!refcount_block_offset)
> >>          return 0;
> >>+    if (offset_into_cluster(s, refcount_block_offset)) {
> >>+        qcow2_signal_corruption(bs, true, -1, -1, "Refblock offset %#" PRIx64
> >>+                                " unaligned (reftable index: %#" PRIx64 ")",
> >>+                                refcount_block_offset, refcount_table_index);
> >>+        return -EIO;
> >>+    }
> >>+
> >>      ret = qcow2_cache_get(bs, s->refcount_block_cache, refcount_block_offset,
> >>          (void**) &refcount_block);
> >>      if (ret < 0) {
> >>@@ -181,6 +188,14 @@ static int alloc_refcount_block(BlockDriverState *bs,
> >>          /* If it's already there, we're done */
> >>          if (refcount_block_offset) {
> >>+            if (offset_into_cluster(s, refcount_block_offset)) {
> >>+                qcow2_signal_corruption(bs, true, -1, -1, "Refblock offset %#"
> >>+                                        PRIx64 " unaligned (reftable index: "
> >>+                                        "%#x)", refcount_block_offset,
> >>+                                        refcount_table_index);
> >>+                return -EIO;
> >>+            }
> >>+
> >>               return load_refcount_block(bs, refcount_block_offset,
> >>                   (void**) refcount_block);
> >>          }
> >>@@ -836,8 +851,14 @@ void qcow2_free_any_clusters(BlockDriverState *bs, uint64_t l2_entry,
> >>      case QCOW2_CLUSTER_NORMAL:
> >>      case QCOW2_CLUSTER_ZERO:
> >>          if (l2_entry & L2E_OFFSET_MASK) {
> >>-            qcow2_free_clusters(bs, l2_entry & L2E_OFFSET_MASK,
> >>-                                nb_clusters << s->cluster_bits, type);
> >>+            if (offset_into_cluster(s, l2_entry & L2E_OFFSET_MASK)) {
> >>+                qcow2_signal_corruption(bs, false, -1, -1,
> >>+                                        "Cannot free unaligned cluster %#llx",
> >>+                                        l2_entry & L2E_OFFSET_MASK);
> >>+            } else {
> >>+                qcow2_free_clusters(bs, l2_entry & L2E_OFFSET_MASK,
> >>+                                    nb_clusters << s->cluster_bits, type);
> >>+            }
> >>          }
> >>          break;
> >>      case QCOW2_CLUSTER_UNALLOCATED:
> >>@@ -901,6 +922,14 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
> >>              old_l2_offset = l2_offset;
> >>              l2_offset &= L1E_OFFSET_MASK;
> >>+            if (offset_into_cluster(s, l2_offset)) {
> >>+                qcow2_signal_corruption(bs, true, -1, -1, "L2 table offset %#"
> >>+                                        PRIx64 " unaligned (L1 index: %#x)",
> >>+                                        l2_offset, i);
> >>+                ret = -EIO;
> >>+                goto fail;
> >>+            }
> >>+
> >>              ret = qcow2_cache_get(bs, s->l2_table_cache, l2_offset,
> >>                  (void**) &l2_table);
> >>              if (ret < 0) {
> >>@@ -933,6 +962,17 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
> >>                      case QCOW2_CLUSTER_NORMAL:
> >>                      case QCOW2_CLUSTER_ZERO:
> >>+                        if (offset_into_cluster(s, offset & L2E_OFFSET_MASK)) {
> >>+                            qcow2_signal_corruption(bs, true, -1, -1, "Data "
> >>+                                                    "cluster offset %#llx "
> >>+                                                    "unaligned (L2 offset: %#"
> >>+                                                    PRIx64 ", L2 index: %#x)",
> >>+                                                    offset & L2E_OFFSET_MASK,
> >>+                                                    l2_offset, j);
> >>+                            ret = -EIO;
> >>+                            goto fail;
> >>+                        }
> >>+
> >>                          cluster_index = (offset & L2E_OFFSET_MASK) >> s->cluster_bits;
> >>                          if (!cluster_index) {
> >>                              /* unallocated */
> >>-- 
> >>2.1.0
> >>
> >>
> 
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] [PATCH v2 0/5] qcow2: Check L1/L2/reftable entries for alignment
  2014-09-05 14:07 [Qemu-devel] [PATCH v2 0/5] qcow2: Check L1/L2/reftable entries for alignment Max Reitz
                   ` (4 preceding siblings ...)
  2014-09-05 14:07 ` [Qemu-devel] [PATCH v2 5/5] iotests: Add more tests for qcow2 corruption Max Reitz
@ 2014-09-16 13:48 ` Stefan Hajnoczi
  5 siblings, 0 replies; 23+ messages in thread
From: Stefan Hajnoczi @ 2014-09-16 13:48 UTC (permalink / raw)
  To: Max Reitz; +Cc: Kevin Wolf, qemu-devel, Stefan Hajnoczi

[-- Attachment #1: Type: text/plain, Size: 3285 bytes --]

On Fri, Sep 05, 2014 at 04:07:14PM +0200, Max Reitz wrote:
> The image fuzzer from Maria exposed a lot of assertions which might fail
> in qemu when fed with a broken qcow2 image. Some of them are related to
> qemu trusting the offsets given in the L1, L2 and refcount tables to
> always be properly aligned on cluster boundaries (e.g.
> https://bugs.launchpad.net/qemu/+bug/1354529).
> 
> This series fixes this by verifying (hopefully) all data read from L1,
> L2 and refcount tables accordingly; if the offsets are not aligned on
> cluster boundaries, an error message is emitted and the image is marked
> corrupt unless it has been opened read-only.
> 
> v2:
> - Added patch 1 which adds a new field to the BLOCK_IMAGE_CORRUPTED. I
>   know this might be incompatible to existing users of that field, so I
>   did an investigative search (I googled) but did not find any. So I'm
>   confident this won't break anything.
> - Patch 2:
>   - Suppress corruption events and messages after the first one;
>     however, a fatal corruption should still be signaled even after a
>     non-fatal one has already occured [Kevin]
>   - Also, use the new "fatal" field
> - Patch 3:
>   - Fix test 060 right in this patch (squashed the old patch 3 into
>     patch 2, the result is now this patch 3) [Kevin]
>   - Set the "fatal" field to true
> - Patch 4:
>   - Emit more debugging information (like table indices etc.) [Kevin]
>   - Generally set the "fatal" field to true
>   - In case a cluster with an unaligned offset should be freed, call the
>     corruption signalling function just like everywhere else; but set
>     "fatal" to false [Kevin+Eric]
> - Patch 5: Added [Kevin]
> 
> 
> git-backport-diff against v1 (although not very useful):
> 
> Key:
> [----] : patches are identical
> [####] : number of functional differences between upstream/downstream patch
> [down] : patch is downstream-only
> The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively
> 
> 001/5:[down] 'qapi/block: Add "fatal" to BLOCK_IMAGE_CORRUPTED'
> 002/5:[0045] [FC] 'qcow2: Add qcow2_signal_corruption()'
> 003/5:[0015] [FC] 'qcow2: Use qcow2_signal_corruption() for overlaps'
> 004/5:[0068] [FC] 'qcow2: Check L1/L2/reftable entries for alignment'
> 005/5:[down] 'iotests: Add more tests for qcow2 corruption'
> 
> 
> Max Reitz (5):
>   qapi/block: Add "fatal" to BLOCK_IMAGE_CORRUPTED
>   qcow2: Add qcow2_signal_corruption()
>   qcow2: Use qcow2_signal_corruption() for overlaps
>   qcow2: Check L1/L2/reftable entries for alignment
>   iotests: Add more tests for qcow2 corruption
> 
>  block/qcow2-cluster.c      | 43 ++++++++++++++++++++++++++---
>  block/qcow2-refcount.c     | 67 +++++++++++++++++++++++++++++++---------------
>  block/qcow2.c              | 48 +++++++++++++++++++++++++++++++++
>  block/qcow2.h              |  5 ++++
>  qapi/block-core.json       |  9 +++++--
>  tests/qemu-iotests/060     | 56 ++++++++++++++++++++++++++++++++++++--
>  tests/qemu-iotests/060.out | 61 +++++++++++++++++++++++++++++++++++++----
>  7 files changed, 255 insertions(+), 34 deletions(-)
> 
> -- 
> 2.1.0
> 
> 

Thanks, applied to my block tree:
https://github.com/stefanha/qemu/commits/block

Stefan

[-- Attachment #2: Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2014-09-16 13:48 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-09-05 14:07 [Qemu-devel] [PATCH v2 0/5] qcow2: Check L1/L2/reftable entries for alignment Max Reitz
2014-09-05 14:07 ` [Qemu-devel] [PATCH v2 1/5] qapi/block: Add "fatal" to BLOCK_IMAGE_CORRUPTED Max Reitz
2014-09-05 14:29   ` Eric Blake
2014-09-05 14:40   ` Eric Blake
2014-09-05 14:47     ` Max Reitz
2014-09-05 14:51       ` Eric Blake
2014-09-05 14:53         ` Max Reitz
2014-09-08 14:01   ` Benoît Canet
2014-09-08 17:40     ` Max Reitz
2014-09-05 14:07 ` [Qemu-devel] [PATCH v2 2/5] qcow2: Add qcow2_signal_corruption() Max Reitz
2014-09-05 14:43   ` Eric Blake
2014-09-08 14:15   ` Benoît Canet
2014-09-05 14:07 ` [Qemu-devel] [PATCH v2 3/5] qcow2: Use qcow2_signal_corruption() for overlaps Max Reitz
2014-09-05 14:52   ` Eric Blake
2014-09-08 14:21   ` Benoît Canet
2014-09-05 14:07 ` [Qemu-devel] [PATCH v2 4/5] qcow2: Check L1/L2/reftable entries for alignment Max Reitz
2014-09-05 15:03   ` Eric Blake
2014-09-08 14:40   ` Benoît Canet
2014-09-08 17:47     ` Max Reitz
2014-09-08 18:03       ` Benoît Canet
2014-09-05 14:07 ` [Qemu-devel] [PATCH v2 5/5] iotests: Add more tests for qcow2 corruption Max Reitz
2014-09-05 15:09   ` Eric Blake
2014-09-16 13:48 ` [Qemu-devel] [PATCH v2 0/5] qcow2: Check L1/L2/reftable entries for alignment Stefan Hajnoczi

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.