* [Qemu-devel] [PATCH v2 0/7] Refactor DMG driver to have chunk size independence
@ 2017-04-27  8:06 Ashijeet Acharya
  2017-04-27  8:06 ` [Qemu-devel] [PATCH v2 1/7] dmg: Introduce a new struct to cache random access points Ashijeet Acharya
                   ` (9 more replies)
  0 siblings, 10 replies; 18+ messages in thread
From: Ashijeet Acharya @ 2017-04-27  8:06 UTC (permalink / raw)
  To: stefanha
  Cc: kwolf, jsnow, mreitz, famz, peter, qemu-devel, qemu-block,
	Ashijeet Acharya

Previously posted series patches:
v1: http://lists.nongnu.org/archive/html/qemu-devel/2017-04/msg04641.html

This series provides chunk size independence for the DMG driver to prevent
denial-of-service in cases where untrusted files are accessed by the user.

This task is mentioned on the public block ToDo
Here -> http://wiki.qemu.org/ToDo/Block/DmgChunkSizeIndependence

Patch 1 introduces a new data structure to aid caching of random access points
within a compressed stream.

Patch 2 is an extension of patch 1 and introduces a new function to
initialize/update/reset our cached random access point.

Patch 3 limits the output buffer size to a max of 2MB to avoid QEMU
allocating huge amounts of memory.

Patch 4 is a simple preparatory patch to aid handling of various types of chunks.

Patches 5 & 6 help to handle various types of chunks.

Patch 7 simply refactors dmg_co_preadv() to read multiple sectors at once.

Patch 8 finally removes the error messages QEMU used to throw when an image
with chunk sizes above 64MB was accessed by the user.

->Testing procedure:
Convert a DMG file to raw format using the "qemu-img convert" tool present in
v2.9.0.
Next, convert the same image again after applying these patches.
Compare the two images using the "qemu-img compare" tool to check that they
are identical.

You can pick up any DMG image from the collection present
Here -> https://lists.gnu.org/archive/html/qemu-devel/2014-12/msg03606.html
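The procedure above amounts to a short shell session. The binary paths and
image names below are placeholders, assuming one qemu-img built from v2.9.0
and one built with this series applied:

```shell
# Convert the image with an unpatched v2.9.0 qemu-img (placeholder path)
./qemu-img-v2.9.0 convert -f dmg -O raw test.dmg test-before.raw

# Convert the same image again with the patched qemu-img (placeholder path)
./qemu-img-patched convert -f dmg -O raw test.dmg test-after.raw

# The two raw images should be reported as identical
./qemu-img-patched compare -f raw -F raw test-before.raw test-after.raw
```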

->Important note:
These patches assume that the terms "chunk" and "block" are synonyms of each
other when talking about bz2 compressed streams. Thus, according to the bz2
docs[1], the max uncompressed size of a chunk/block can reach 46MB, which is
less than the previously allowed size of 64MB, so we can continue decompressing
the whole chunk/block at once, just as we do now, instead of decompressing it
partially.

This limitation is forced by the fact that bz2 compressed streams do not allow
random access midway through a chunk/block, as the BZ2_bzDecompress() API in
bzlib searches for the magic key "BZh" before starting decompression.[2] This
magic key is present only at the start of every chunk/block, and since our
cached random access points need not necessarily point to the start of a
chunk/block, BZ2_bzDecompress() fails with the error value BZ_DATA_ERROR_MAGIC.[3]

[1] https://en.wikipedia.org/wiki/Bzip2#File_format
[2] https://blastedbio.blogspot.in/2011/11/random-access-to-bzip2.html
[3] http://linux.math.tifr.res.in/manuals/html/manual_3.html#SEC17
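The behaviour described above can be reproduced with Python's bz2 module (used
here purely as an illustration, since it wraps the same libbzip2): a
decompressor fed data that does not start at the "BZh" magic rejects the
stream, which is exactly why a cached access point inside a bz2 chunk/block
cannot be resumed.

```python
import bz2

data = bz2.compress(b"A" * 100000)
print(data[:3])  # b'BZh' -- every bz2 stream/block starts with this magic

# Decompressing from the start of the stream works fine
assert bz2.BZ2Decompressor().decompress(data) == b"A" * 100000

# Feeding the decompressor from the middle of the stream fails, because
# libbzip2 looks for the "BZh" magic before starting decompression
# (BZ_DATA_ERROR_MAGIC at the C level, surfaced as OSError here)
try:
    bz2.BZ2Decompressor().decompress(data[10:])
    print("unexpectedly succeeded")
except OSError:
    print("rejected: no BZh magic at this offset")
```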

Special thanks to Peter Wu for helping me understand and tackle the bz2
compressed chunks.

Changes in v2:
- limit the buffer size to 2MB after fixing the buffering problems (john/fam)

Ashijeet Acharya (7):
  dmg: Introduce a new struct to cache random access points
  dmg: New function to help us cache random access point
  dmg: Refactor and prepare dmg_read_chunk() to cache random access
    points
  dmg: Handle zlib compressed chunks
  dmg: Handle bz2 compressed/raw/zeroed chunks
  dmg: Refactor dmg_co_preadv() to start reading multiple sectors
  dmg: Limit the output buffer size to a max of 2MB

 block/dmg.c | 214 +++++++++++++++++++++++++++++++++++++++---------------------
 block/dmg.h |  10 +++
 2 files changed, 148 insertions(+), 76 deletions(-)

-- 
2.6.2

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Qemu-devel] [PATCH v2 1/7] dmg: Introduce a new struct to cache random access points
  2017-04-27  8:06 [Qemu-devel] [PATCH v2 0/7] Refactor DMG driver to have chunk size independence Ashijeet Acharya
@ 2017-04-27  8:06 ` Ashijeet Acharya
  2017-05-05 13:12   ` Stefan Hajnoczi
  2017-04-27  8:06 ` [Qemu-devel] [PATCH v2 2/7] dmg: New function to help us cache random access point Ashijeet Acharya
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 18+ messages in thread
From: Ashijeet Acharya @ 2017-04-27  8:06 UTC (permalink / raw)
  To: stefanha
  Cc: kwolf, jsnow, mreitz, famz, peter, qemu-devel, qemu-block,
	Ashijeet Acharya

We need to cache the random access point while performing partial
decompression so that we can resume decompression from that point
onwards in our next sequential read request. Introduce a new struct
DMGReadState which will help us do this.

Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
---
 block/dmg.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/block/dmg.h b/block/dmg.h
index b592d6f..ee67ae1 100644
--- a/block/dmg.h
+++ b/block/dmg.h
@@ -31,6 +31,15 @@
 #include "block/block_int.h"
 #include <zlib.h>
 
+/* used to cache current position in compressed input stream */
+typedef struct DMGReadState {
+    uint8_t *saved_next_in;
+    int64_t saved_avail_in;
+    int32_t saved_chunk_type;
+    int64_t sectors_read; /* possible sectors read in each cycle */
+    int32_t sector_offset_in_chunk;
+} DMGReadState;
+
 typedef struct BDRVDMGState {
     CoMutex lock;
     /* each chunk contains a certain number of sectors,
@@ -51,6 +60,7 @@ typedef struct BDRVDMGState {
     uint8_t *compressed_chunk;
     uint8_t *uncompressed_chunk;
     z_stream zstream;
+    DMGReadState *drs;
 } BDRVDMGState;
 
 extern int (*dmg_uncompress_bz2)(char *next_in, unsigned int avail_in,
-- 
2.6.2


* [Qemu-devel] [PATCH v2 2/7] dmg: New function to help us cache random access point
  2017-04-27  8:06 [Qemu-devel] [PATCH v2 0/7] Refactor DMG driver to have chunk size independence Ashijeet Acharya
  2017-04-27  8:06 ` [Qemu-devel] [PATCH v2 1/7] dmg: Introduce a new struct to cache random access points Ashijeet Acharya
@ 2017-04-27  8:06 ` Ashijeet Acharya
  2017-04-27  8:06 ` [Qemu-devel] [PATCH v2 3/7] dmg: Refactor and prepare dmg_read_chunk() to cache random access points Ashijeet Acharya
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 18+ messages in thread
From: Ashijeet Acharya @ 2017-04-27  8:06 UTC (permalink / raw)
  To: stefanha
  Cc: kwolf, jsnow, mreitz, famz, peter, qemu-devel, qemu-block,
	Ashijeet Acharya

Introduce a new cache_access_point() function which will help us first
cache a random access point inside a compressed stream and then keep
updating it according to our requirement at appropriate times.

Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
---
 block/dmg.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/block/dmg.c b/block/dmg.c
index a7d25fc..c6fe8b0 100644
--- a/block/dmg.c
+++ b/block/dmg.c
@@ -128,6 +128,18 @@ static void update_max_chunk_size(BDRVDMGState *s, uint32_t chunk,
     }
 }
 
+static void cache_access_point(DMGReadState *drs, uint8_t *next_in,
+                                int64_t avail_in, int32_t chunk,
+                                int64_t sectors_read, int32_t sector_offset)
+{
+    drs->saved_next_in = next_in;
+    drs->saved_avail_in = avail_in;
+    drs->saved_chunk_type = chunk;
+    drs->sectors_read = sectors_read;
+    drs->sector_offset_in_chunk = sector_offset;
+    return;
+}
+
 static int64_t dmg_find_koly_offset(BdrvChild *file, Error **errp)
 {
     BlockDriverState *file_bs = file->bs;
@@ -507,6 +519,10 @@ static int dmg_open(BlockDriverState *bs, QDict *options, int flags,
         goto fail;
     }
 
+    s->drs = g_malloc(sizeof(DMGReadState));
+    /* initialise our access point cache */
+    cache_access_point(s->drs, NULL, -1, -1, -1, -1);
+
     if (inflateInit(&s->zstream) != Z_OK) {
         ret = -EINVAL;
         goto fail;
@@ -523,6 +539,7 @@ fail:
     g_free(s->lengths);
     g_free(s->sectors);
     g_free(s->sectorcounts);
+    g_free(s->drs);
     qemu_vfree(s->compressed_chunk);
     qemu_vfree(s->uncompressed_chunk);
     return ret;
@@ -685,6 +702,7 @@ static void dmg_close(BlockDriverState *bs)
     g_free(s->lengths);
     g_free(s->sectors);
     g_free(s->sectorcounts);
+    g_free(s->drs);
     qemu_vfree(s->compressed_chunk);
     qemu_vfree(s->uncompressed_chunk);
 
-- 
2.6.2


* [Qemu-devel] [PATCH v2 3/7] dmg: Refactor and prepare dmg_read_chunk() to cache random access points
  2017-04-27  8:06 [Qemu-devel] [PATCH v2 0/7] Refactor DMG driver to have chunk size independence Ashijeet Acharya
  2017-04-27  8:06 ` [Qemu-devel] [PATCH v2 1/7] dmg: Introduce a new struct to cache random access points Ashijeet Acharya
  2017-04-27  8:06 ` [Qemu-devel] [PATCH v2 2/7] dmg: New function to help us cache random access point Ashijeet Acharya
@ 2017-04-27  8:06 ` Ashijeet Acharya
  2017-05-05 13:33   ` Stefan Hajnoczi
  2017-04-27  8:06 ` [Qemu-devel] [PATCH v2 4/7] dmg: Handle zlib compressed chunks Ashijeet Acharya
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 18+ messages in thread
From: Ashijeet Acharya @ 2017-04-27  8:06 UTC (permalink / raw)
  To: stefanha
  Cc: kwolf, jsnow, mreitz, famz, peter, qemu-devel, qemu-block,
	Ashijeet Acharya

Refactor dmg_read_chunk() to start making use of the new DMGReadState
structure and do chunk and sector related calculations based on it.
Add a new argument "DMGReadState *drs" to it.

Also, rename DMG_SECTORCOUNTS_MAX to DMG_SECTOR_MAX to avoid indentation
problems at some places inside dmg_read_chunk().

Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
---
 block/dmg.c | 159 +++++++++++++++++++++++++++++++++++-------------------------
 1 file changed, 94 insertions(+), 65 deletions(-)

diff --git a/block/dmg.c b/block/dmg.c
index c6fe8b0..32623e2 100644
--- a/block/dmg.c
+++ b/block/dmg.c
@@ -38,7 +38,7 @@ enum {
      * or truncating when converting to 32-bit types
      */
     DMG_LENGTHS_MAX = 64 * 1024 * 1024, /* 64 MB */
-    DMG_SECTORCOUNTS_MAX = DMG_LENGTHS_MAX / 512,
+    DMG_SECTOR_MAX = DMG_LENGTHS_MAX / 512,
 };
 
 static int dmg_probe(const uint8_t *buf, int buf_size, const char *filename)
@@ -260,10 +260,10 @@ static int dmg_read_mish_block(BDRVDMGState *s, DmgHeaderState *ds,
 
         /* all-zeroes sector (type 2) does not need to be "uncompressed" and can
          * therefore be unbounded. */
-        if (s->types[i] != 2 && s->sectorcounts[i] > DMG_SECTORCOUNTS_MAX) {
+        if (s->types[i] != 2 && s->sectorcounts[i] > DMG_SECTOR_MAX) {
             error_report("sector count %" PRIu64 " for chunk %" PRIu32
                          " is larger than max (%u)",
-                         s->sectorcounts[i], i, DMG_SECTORCOUNTS_MAX);
+                         s->sectorcounts[i], i, DMG_SECTOR_MAX);
             ret = -EINVAL;
             goto fail;
         }
@@ -578,78 +578,106 @@ static inline uint32_t search_chunk(BDRVDMGState *s, uint64_t sector_num)
     return s->n_chunks; /* error */
 }
 
-static inline int dmg_read_chunk(BlockDriverState *bs, uint64_t sector_num)
+static inline int dmg_read_chunk(BlockDriverState *bs, uint64_t sector_num,
+                                 DMGReadState *drs)
 {
     BDRVDMGState *s = bs->opaque;
 
+    int ret;
+    uint32_t sector_offset;
+    uint64_t sectors_read;
+    uint32_t chunk;
+
     if (!is_sector_in_chunk(s, s->current_chunk, sector_num)) {
-        int ret;
-        uint32_t chunk = search_chunk(s, sector_num);
+        chunk = search_chunk(s, sector_num);
+    } else {
+        chunk = drs->saved_chunk_type;
+    }
 
-        if (chunk >= s->n_chunks) {
+    if (chunk >= s->n_chunks) {
+            return -1;
+    }
+
+    /* reset our access point cache if we had a change in current chunk */
+    if (chunk != drs->saved_chunk_type) {
+        cache_access_point(drs, NULL, -1, -1, -1, -1);
+    }
+
+    sector_offset = sector_num - s->sectors[chunk];
+
+    if ((s->sectorcounts[chunk] - sector_offset) > DMG_SECTOR_MAX) {
+        sectors_read = DMG_SECTOR_MAX;
+    } else {
+        sectors_read = s->sectorcounts[chunk] - sector_offset;
+    }
+
+    /* truncate sectors read if it exceeds the 2MB buffer of qemu-img
+     * convert */
+    if ((sector_num % DMG_SECTOR_MAX) + sectors_read > DMG_SECTOR_MAX) {
+        sectors_read = DMG_SECTOR_MAX - (sector_num % DMG_SECTOR_MAX);
+    }
+
+    s->current_chunk = s->n_chunks;
+
+    switch (s->types[chunk]) { /* block entry type */
+    case 0x80000005: { /* zlib compressed */
+        /* we need to buffer, because only the chunk as whole can be
+         * inflated. */
+        ret = bdrv_pread(bs->file, s->offsets[chunk],
+                         s->compressed_chunk, s->lengths[chunk]);
+        if (ret != s->lengths[chunk]) {
             return -1;
         }
 
-        s->current_chunk = s->n_chunks;
-        switch (s->types[chunk]) { /* block entry type */
-        case 0x80000005: { /* zlib compressed */
-            /* we need to buffer, because only the chunk as whole can be
-             * inflated. */
-            ret = bdrv_pread(bs->file, s->offsets[chunk],
-                             s->compressed_chunk, s->lengths[chunk]);
-            if (ret != s->lengths[chunk]) {
-                return -1;
-            }
-
-            s->zstream.next_in = s->compressed_chunk;
-            s->zstream.avail_in = s->lengths[chunk];
-            s->zstream.next_out = s->uncompressed_chunk;
-            s->zstream.avail_out = 512 * s->sectorcounts[chunk];
-            ret = inflateReset(&s->zstream);
-            if (ret != Z_OK) {
-                return -1;
-            }
-            ret = inflate(&s->zstream, Z_FINISH);
-            if (ret != Z_STREAM_END ||
-                s->zstream.total_out != 512 * s->sectorcounts[chunk]) {
-                return -1;
-            }
-            break; }
-        case 0x80000006: /* bzip2 compressed */
-            if (!dmg_uncompress_bz2) {
-                break;
-            }
-            /* we need to buffer, because only the chunk as whole can be
-             * inflated. */
-            ret = bdrv_pread(bs->file, s->offsets[chunk],
-                             s->compressed_chunk, s->lengths[chunk]);
-            if (ret != s->lengths[chunk]) {
-                return -1;
-            }
-
-            ret = dmg_uncompress_bz2((char *)s->compressed_chunk,
-                                     (unsigned int) s->lengths[chunk],
-                                     (char *)s->uncompressed_chunk,
-                                     (unsigned int)
-                                         (512 * s->sectorcounts[chunk]));
-            if (ret < 0) {
-                return ret;
-            }
-            break;
-        case 1: /* copy */
-            ret = bdrv_pread(bs->file, s->offsets[chunk],
-                             s->uncompressed_chunk, s->lengths[chunk]);
-            if (ret != s->lengths[chunk]) {
-                return -1;
-            }
-            break;
-        case 2: /* zero */
-            /* see dmg_read, it is treated specially. No buffer needs to be
-             * pre-filled, the zeroes can be set directly. */
+        s->zstream.next_in = s->compressed_chunk;
+        s->zstream.avail_in = s->lengths[chunk];
+        s->zstream.next_out = s->uncompressed_chunk;
+        s->zstream.avail_out = 512 * s->sectorcounts[chunk];
+        ret = inflateReset(&s->zstream);
+        if (ret != Z_OK) {
+            return -1;
+        }
+        ret = inflate(&s->zstream, Z_FINISH);
+        if (ret != Z_STREAM_END ||
+            s->zstream.total_out != 512 * s->sectorcounts[chunk]) {
+            return -1;
+        }
+        break; }
+    case 0x80000006: /* bzip2 compressed */
+        if (!dmg_uncompress_bz2) {
             break;
         }
-        s->current_chunk = chunk;
+        /* we need to buffer, because only the chunk as whole can be
+         * inflated. */
+        ret = bdrv_pread(bs->file, s->offsets[chunk],
+                         s->compressed_chunk, s->lengths[chunk]);
+        if (ret != s->lengths[chunk]) {
+            return -1;
+        }
+
+        ret = dmg_uncompress_bz2((char *)s->compressed_chunk,
+                                 (unsigned int) s->lengths[chunk],
+                                 (char *)s->uncompressed_chunk,
+                                 (unsigned int)
+                                    (512 * s->sectorcounts[chunk]));
+        if (ret < 0) {
+            return ret;
+        }
+        break;
+    case 1: /* copy */
+        ret = bdrv_pread(bs->file, s->offsets[chunk],
+                         s->uncompressed_chunk, s->lengths[chunk]);
+        if (ret != s->lengths[chunk]) {
+            return -1;
+        }
+        break;
+    case 2: /* zero */
+        /* see dmg_read, it is treated specially. No buffer needs to be
+         * pre-filled, the zeroes can be set directly. */
+        break;
     }
+    s->current_chunk = chunk;
+
     return 0;
 }
 
@@ -661,6 +689,7 @@ dmg_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
     uint64_t sector_num = offset >> BDRV_SECTOR_BITS;
     int nb_sectors = bytes >> BDRV_SECTOR_BITS;
     int ret, i;
+    DMGReadState *drs = s->drs;
 
     assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
     assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
@@ -671,7 +700,7 @@ dmg_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
         uint32_t sector_offset_in_chunk;
         void *data;
 
-        if (dmg_read_chunk(bs, sector_num + i) != 0) {
+        if (dmg_read_chunk(bs, sector_num + i, drs) != 0) {
             ret = -EIO;
             goto fail;
         }
-- 
2.6.2


* [Qemu-devel] [PATCH v2 4/7] dmg: Handle zlib compressed chunks
  2017-04-27  8:06 [Qemu-devel] [PATCH v2 0/7] Refactor DMG driver to have chunk size independence Ashijeet Acharya
                   ` (2 preceding siblings ...)
  2017-04-27  8:06 ` [Qemu-devel] [PATCH v2 3/7] dmg: Refactor and prepare dmg_read_chunk() to cache random access points Ashijeet Acharya
@ 2017-04-27  8:06 ` Ashijeet Acharya
  2017-04-27  8:06 ` [Qemu-devel] [PATCH v2 5/7] dmg: Handle bz2 compressed/raw/zeroed chunks Ashijeet Acharya
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 18+ messages in thread
From: Ashijeet Acharya @ 2017-04-27  8:06 UTC (permalink / raw)
  To: stefanha
  Cc: kwolf, jsnow, mreitz, famz, peter, qemu-devel, qemu-block,
	Ashijeet Acharya

Set the output buffer size to be equal to the number of sectors stored in
@sectors_read. Start inflating to a max output buffer size of 2MB and cache
our access point to aid random access later, if required.

Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
---
 block/dmg.c | 48 ++++++++++++++++++++++++++++++++++--------------
 1 file changed, 34 insertions(+), 14 deletions(-)

diff --git a/block/dmg.c b/block/dmg.c
index 32623e2..073e864 100644
--- a/block/dmg.c
+++ b/block/dmg.c
@@ -621,27 +621,47 @@ static inline int dmg_read_chunk(BlockDriverState *bs, uint64_t sector_num,
 
     switch (s->types[chunk]) { /* block entry type */
     case 0x80000005: { /* zlib compressed */
-        /* we need to buffer, because only the chunk as whole can be
-         * inflated. */
-        ret = bdrv_pread(bs->file, s->offsets[chunk],
-                         s->compressed_chunk, s->lengths[chunk]);
-        if (ret != s->lengths[chunk]) {
-            return -1;
+        /* check for cached random access point */
+        if (drs->saved_next_in == NULL) {
+            /* we need to buffer, because only the chunk as whole can be
+             * inflated. */
+            ret = bdrv_pread(bs->file, s->offsets[chunk],
+                             s->compressed_chunk, s->lengths[chunk]);
+            if (ret != s->lengths[chunk]) {
+                return -1;
+            }
+
+            s->zstream.next_in = s->compressed_chunk;
+            s->zstream.avail_in = s->lengths[chunk];
+        } else {
+            s->zstream.next_in = drs->saved_next_in;
+            s->zstream.avail_in = drs->saved_avail_in;
         }
 
-        s->zstream.next_in = s->compressed_chunk;
-        s->zstream.avail_in = s->lengths[chunk];
         s->zstream.next_out = s->uncompressed_chunk;
-        s->zstream.avail_out = 512 * s->sectorcounts[chunk];
-        ret = inflateReset(&s->zstream);
-        if (ret != Z_OK) {
-            return -1;
+
+        s->zstream.avail_out = sectors_read * BDRV_SECTOR_SIZE;
+
+        if (drs->saved_next_in == NULL) {
+            ret = inflateReset(&s->zstream);
+            if (ret != Z_OK) {
+                return -1;
+            }
+        }
+        /* reset total_out for each successive call */
+        s->zstream.total_out = 0;
+        ret = inflate(&s->zstream, Z_SYNC_FLUSH);
+        if (ret == Z_OK &&
+            s->zstream.total_out == 512 * sectors_read) {
+            goto update;
         }
-        ret = inflate(&s->zstream, Z_FINISH);
         if (ret != Z_STREAM_END ||
-            s->zstream.total_out != 512 * s->sectorcounts[chunk]) {
+            s->zstream.total_out != 512 * sectors_read) {
             return -1;
         }
+update:
+        cache_access_point(drs, s->zstream.next_in, s->zstream.avail_in,
+                           chunk, sectors_read, sector_offset);
         break; }
     case 0x80000006: /* bzip2 compressed */
         if (!dmg_uncompress_bz2) {
-- 
2.6.2


* [Qemu-devel] [PATCH v2 5/7] dmg: Handle bz2 compressed/raw/zeroed chunks
  2017-04-27  8:06 [Qemu-devel] [PATCH v2 0/7] Refactor DMG driver to have chunk size independence Ashijeet Acharya
                   ` (3 preceding siblings ...)
  2017-04-27  8:06 ` [Qemu-devel] [PATCH v2 4/7] dmg: Handle zlib compressed chunks Ashijeet Acharya
@ 2017-04-27  8:06 ` Ashijeet Acharya
  2017-04-27  8:06 ` [Qemu-devel] [PATCH v2 6/7] dmg: Refactor dmg_co_preadv() to start reading multiple sectors Ashijeet Acharya
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 18+ messages in thread
From: Ashijeet Acharya @ 2017-04-27  8:06 UTC (permalink / raw)
  To: stefanha
  Cc: kwolf, jsnow, mreitz, famz, peter, qemu-devel, qemu-block,
	Ashijeet Acharya

We do not need to cache the access point for these chunk types, but we do need
to update our supporting variables, such as chunk, sectors_read, etc., to
maintain our code structure. Call cache_access_point() after reading chunks of
these types.

Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
---
 block/dmg.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/block/dmg.c b/block/dmg.c
index 073e864..f9045f9 100644
--- a/block/dmg.c
+++ b/block/dmg.c
@@ -680,20 +680,30 @@ update:
                                  (char *)s->uncompressed_chunk,
                                  (unsigned int)
                                     (512 * s->sectorcounts[chunk]));
+
         if (ret < 0) {
             return ret;
         }
+        cache_access_point(drs, NULL, -1, chunk, sectors_read,
+                           sector_offset);
         break;
     case 1: /* copy */
-        ret = bdrv_pread(bs->file, s->offsets[chunk],
-                         s->uncompressed_chunk, s->lengths[chunk]);
-        if (ret != s->lengths[chunk]) {
-            return -1;
+        if (drs->sectors_read == -1) {
+            ret = bdrv_pread(bs->file, s->offsets[chunk],
+                             s->uncompressed_chunk, s->lengths[chunk]);
+            if (ret != s->lengths[chunk]) {
+                return -1;
+            }
         }
+        cache_access_point(drs, NULL, -1, chunk, sectors_read,
+                           sector_offset);
         break;
     case 2: /* zero */
         /* see dmg_read, it is treated specially. No buffer needs to be
          * pre-filled, the zeroes can be set directly. */
+        cache_access_point(drs, NULL, -1, chunk, sectors_read,
+                           sector_offset);
+
         break;
     }
     s->current_chunk = chunk;
-- 
2.6.2


* [Qemu-devel] [PATCH v2 6/7] dmg: Refactor dmg_co_preadv() to start reading multiple sectors
  2017-04-27  8:06 [Qemu-devel] [PATCH v2 0/7] Refactor DMG driver to have chunk size independence Ashijeet Acharya
                   ` (4 preceding siblings ...)
  2017-04-27  8:06 ` [Qemu-devel] [PATCH v2 5/7] dmg: Handle bz2 compressed/raw/zeroed chunks Ashijeet Acharya
@ 2017-04-27  8:06 ` Ashijeet Acharya
  2017-04-27  8:06 ` [Qemu-devel] [PATCH v2 7/7] dmg: Limit the output buffer size to a max of 2MB Ashijeet Acharya
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 18+ messages in thread
From: Ashijeet Acharya @ 2017-04-27  8:06 UTC (permalink / raw)
  To: stefanha
  Cc: kwolf, jsnow, mreitz, famz, peter, qemu-devel, qemu-block,
	Ashijeet Acharya

At the moment, dmg_co_preadv() reads one sector at a time. Make it read
multiple sectors at a time, depending on the number of sectors stored in
"drs->sectors_read". This does not provide any significant optimization in
the DMG I/O path but is still a cleaner approach.

Adjust the 'data' variable depending on our cached access point
situation to align our read requests appropriately.

Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
---
 block/dmg.c | 23 +++++++++++++++--------
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/block/dmg.c b/block/dmg.c
index f9045f9..8b7460c 100644
--- a/block/dmg.c
+++ b/block/dmg.c
@@ -718,7 +718,7 @@ dmg_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
     BDRVDMGState *s = bs->opaque;
     uint64_t sector_num = offset >> BDRV_SECTOR_BITS;
     int nb_sectors = bytes >> BDRV_SECTOR_BITS;
-    int ret, i;
+    int ret, i = 0;
     DMGReadState *drs = s->drs;
 
     assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
@@ -726,8 +726,7 @@ dmg_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
 
     qemu_co_mutex_lock(&s->lock);
 
-    for (i = 0; i < nb_sectors; i++) {
-        uint32_t sector_offset_in_chunk;
+    while (i < nb_sectors) {
         void *data;
 
         if (dmg_read_chunk(bs, sector_num + i, drs) != 0) {
@@ -738,12 +737,20 @@ dmg_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
          * s->uncompressed_chunk may be too small to cover the large all-zeroes
          * section. dmg_read_chunk is called to find s->current_chunk */
         if (s->types[s->current_chunk] == 2) { /* all zeroes block entry */
-            qemu_iovec_memset(qiov, i * 512, 0, 512);
-            continue;
+            qemu_iovec_memset(qiov, i * 512, 0,
+                                    512 * drs->sectors_read);
+            goto increment;
+        }
+
+        if (drs->saved_next_in == NULL) {
+            data = s->uncompressed_chunk + drs->sector_offset_in_chunk * 512;
+        } else {
+            data = s->uncompressed_chunk;
         }
-        sector_offset_in_chunk = sector_num + i - s->sectors[s->current_chunk];
-        data = s->uncompressed_chunk + sector_offset_in_chunk * 512;
-        qemu_iovec_from_buf(qiov, i * 512, data, 512);
+        qemu_iovec_from_buf(qiov, i * 512, data, drs->sectors_read * 512);
+
+increment:
+        i += drs->sectors_read;
     }
 
     ret = 0;
-- 
2.6.2


* [Qemu-devel] [PATCH v2 7/7] dmg: Limit the output buffer size to a max of 2MB
  2017-04-27  8:06 [Qemu-devel] [PATCH v2 0/7] Refactor DMG driver to have chunk size independence Ashijeet Acharya
                   ` (5 preceding siblings ...)
  2017-04-27  8:06 ` [Qemu-devel] [PATCH v2 6/7] dmg: Refactor dmg_co_preadv() to start reading multiple sectors Ashijeet Acharya
@ 2017-04-27  8:06 ` Ashijeet Acharya
  2017-04-27 20:59 ` [Qemu-devel] [PATCH v2 0/7] Refactor DMG driver to have chunk size independence Ashijeet Acharya
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 18+ messages in thread
From: Ashijeet Acharya @ 2017-04-27  8:06 UTC (permalink / raw)
  To: stefanha
  Cc: kwolf, jsnow, mreitz, famz, peter, qemu-devel, qemu-block,
	Ashijeet Acharya

The size of the output buffer is limited to a maximum of 2MB so that
QEMU doesn't end up allocating huge amounts of memory while
decompressing compressed input streams.

2MB is an appropriate size because "qemu-img convert" has the same I/O
buffer size and the most important use case for DMG files is to be
compatible with qemu-img convert.

We have refactored the DMG driver to accept and process images irrespective of
their chunk sizes, since we now have a 2MB limit on our output buffer size.
Thus QEMU will not allocate huge amounts of memory no matter what the chunk
size is, and denial-of-service is prevented even when untrusted files are
accessed by the user. Remove the now-redundant error messages.

Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
---
 block/dmg.c | 24 +-----------------------
 1 file changed, 1 insertion(+), 23 deletions(-)

diff --git a/block/dmg.c b/block/dmg.c
index 8b7460c..ade2578 100644
--- a/block/dmg.c
+++ b/block/dmg.c
@@ -37,7 +37,7 @@ enum {
     /* Limit chunk sizes to prevent unreasonable amounts of memory being used
      * or truncating when converting to 32-bit types
      */
-    DMG_LENGTHS_MAX = 64 * 1024 * 1024, /* 64 MB */
+    DMG_LENGTHS_MAX = 2 * 1024 * 1024, /* 2 MB */
     DMG_SECTOR_MAX = DMG_LENGTHS_MAX / 512,
 };
 
@@ -209,7 +209,6 @@ static int dmg_read_mish_block(BDRVDMGState *s, DmgHeaderState *ds,
                                uint8_t *buffer, uint32_t count)
 {
     uint32_t type, i;
-    int ret;
     size_t new_size;
     uint32_t chunk_count;
     int64_t offset = 0;
@@ -258,16 +257,6 @@ static int dmg_read_mish_block(BDRVDMGState *s, DmgHeaderState *ds,
         /* sector count */
         s->sectorcounts[i] = buff_read_uint64(buffer, offset + 0x10);
 
-        /* all-zeroes sector (type 2) does not need to be "uncompressed" and can
-         * therefore be unbounded. */
-        if (s->types[i] != 2 && s->sectorcounts[i] > DMG_SECTOR_MAX) {
-            error_report("sector count %" PRIu64 " for chunk %" PRIu32
-                         " is larger than max (%u)",
-                         s->sectorcounts[i], i, DMG_SECTOR_MAX);
-            ret = -EINVAL;
-            goto fail;
-        }
-
         /* offset in (compressed) data fork */
         s->offsets[i] = buff_read_uint64(buffer, offset + 0x18);
         s->offsets[i] += in_offset;
@@ -275,23 +264,12 @@ static int dmg_read_mish_block(BDRVDMGState *s, DmgHeaderState *ds,
         /* length in (compressed) data fork */
         s->lengths[i] = buff_read_uint64(buffer, offset + 0x20);
 
-        if (s->lengths[i] > DMG_LENGTHS_MAX) {
-            error_report("length %" PRIu64 " for chunk %" PRIu32
-                         " is larger than max (%u)",
-                         s->lengths[i], i, DMG_LENGTHS_MAX);
-            ret = -EINVAL;
-            goto fail;
-        }
-
         update_max_chunk_size(s, i, &ds->max_compressed_size,
                               &ds->max_sectors_per_chunk);
         offset += 40;
     }
     s->n_chunks += chunk_count;
     return 0;
-
-fail:
-    return ret;
 }
 
 static int dmg_read_resource_fork(BlockDriverState *bs, DmgHeaderState *ds,
-- 
2.6.2


* Re: [Qemu-devel] [PATCH v2 0/7] Refactor DMG driver to have chunk size independence
  2017-04-27  8:06 [Qemu-devel] [PATCH v2 0/7] Refactor DMG driver to have chunk size independence Ashijeet Acharya
                   ` (6 preceding siblings ...)
  2017-04-27  8:06 ` [Qemu-devel] [PATCH v2 7/7] dmg: Limit the output buffer size to a max of 2MB Ashijeet Acharya
@ 2017-04-27 20:59 ` Ashijeet Acharya
  2017-05-05 13:06 ` Stefan Hajnoczi
  2017-05-05 13:59 ` Stefan Hajnoczi
  9 siblings, 0 replies; 18+ messages in thread
From: Ashijeet Acharya @ 2017-04-27 20:59 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, John Snow, Max Reitz, Fam Zheng, Peter Wu,
	QEMU Developers, qemu block, Ashijeet Acharya

On Thu, Apr 27, 2017 at 1:36 PM, Ashijeet Acharya
<ashijeetacharya@gmail.com> wrote:
> Previously posted series patches:
> v1: http://lists.nongnu.org/archive/html/qemu-devel/2017-04/msg04641.html
>
> This series helps to provide chunk size independence for DMG driver to prevent
> denial-of-service in cases where untrusted files are being accessed by the user.
>
> This task is mentioned on the public block ToDo
> Here -> http://wiki.qemu.org/ToDo/Block/DmgChunkSizeIndependence
>
> Patch 1 introduces a new data structure to aid caching of random access points
> within a compressed stream.
>
> Patch 2 is an extension of patch 1 and introduces a new function to
> initialize/update/reset our cached random access point.
>
> Patch 3 limits the output buffer size to a max of 2MB to avoid QEMU allocate
> huge amounts of memory.
>
> Patch 4 is a simple preparatory patch to aid handling of various types of chunks.
>
> Patches 5 & 6 help to handle various types of chunks.
>
> Patch 7 simply refactors dmg_co_preadv() to read multiple sectors at once.
>
> Patch 8 finally removes the error messages QEMU used to throw when an image with
> chunk sizes above 64MB were accessed by the user.

John, I have actually squashed patches 3 and 8 (as numbered in v1), and
that change is represented in patch 7 (as numbered in v2). The cover
letter here is quite misleading; I forgot to update it and simply
copy-pasted it carelessly.

Ashijeet

> Ashijeet Acharya (7):
>   dmg: Introduce a new struct to cache random access points
>   dmg: New function to help us cache random access point
>   dmg: Refactor and prepare dmg_read_chunk() to cache random access
>     points
>   dmg: Handle zlib compressed chunks
>   dmg: Handle bz2 compressed/raw/zeroed chunks
>   dmg: Refactor dmg_co_preadv() to start reading multiple sectors
>   dmg: Limit the output buffer size to a max of 2MB
>
>  block/dmg.c | 214 +++++++++++++++++++++++++++++++++++++++---------------------
>  block/dmg.h |  10 +++
>  2 files changed, 148 insertions(+), 76 deletions(-)
>
> --
> 2.6.2
>


* Re: [Qemu-devel] [PATCH v2 0/7] Refactor DMG driver to have chunk size independence
  2017-04-27  8:06 [Qemu-devel] [PATCH v2 0/7] Refactor DMG driver to have chunk size independence Ashijeet Acharya
                   ` (7 preceding siblings ...)
  2017-04-27 20:59 ` [Qemu-devel] [PATCH v2 0/7] Refactor DMG driver to have chunk size independence Ashijeet Acharya
@ 2017-05-05 13:06 ` Stefan Hajnoczi
  2017-05-05 13:59 ` Stefan Hajnoczi
  9 siblings, 0 replies; 18+ messages in thread
From: Stefan Hajnoczi @ 2017-05-05 13:06 UTC (permalink / raw)
  To: Ashijeet Acharya
  Cc: kwolf, jsnow, mreitz, famz, peter, qemu-devel, qemu-block


On Thu, Apr 27, 2017 at 01:36:30PM +0530, Ashijeet Acharya wrote:
> ->Testing procedure:
> Convert a DMG file to raw format using the "qemu-img convert" tool present in
> v2.9.0
> Next convert the same image again after applying these patches.
> Compare the two images using "qemu-img compare" tool to check if they are
> identical.
> 
> You can pickup any DMG image from the collection present
> Here -> https://lists.gnu.org/archive/html/qemu-devel/2014-12/msg03606.html

Please add a qemu-iotests test case:
http://wiki.qemu.org/Testing/QemuIoTests

Image files can be included in tests/qemu-iotests/sample_images/.  They
must be small.



* Re: [Qemu-devel] [PATCH v2 1/7] dmg: Introduce a new struct to cache random access points
  2017-04-27  8:06 ` [Qemu-devel] [PATCH v2 1/7] dmg: Introduce a new struct to cache random access points Ashijeet Acharya
@ 2017-05-05 13:12   ` Stefan Hajnoczi
  0 siblings, 0 replies; 18+ messages in thread
From: Stefan Hajnoczi @ 2017-05-05 13:12 UTC (permalink / raw)
  To: Ashijeet Acharya
  Cc: kwolf, jsnow, mreitz, famz, peter, qemu-devel, qemu-block


On Thu, Apr 27, 2017 at 01:36:31PM +0530, Ashijeet Acharya wrote:
> @@ -51,6 +60,7 @@ typedef struct BDRVDMGState {
>      uint8_t *compressed_chunk;
>      uint8_t *uncompressed_chunk;
>      z_stream zstream;
> +    DMGReadState *drs;

This doesn't need to be a pointer.  The object is allocated in
.bdrv_open() and freed in .bdrv_close().  It's simpler to drop the heap
allocation and simply inline the struct:

    DMGReadState drs;



* Re: [Qemu-devel] [PATCH v2 3/7] dmg: Refactor and prepare dmg_read_chunk() to cache random access points
  2017-04-27  8:06 ` [Qemu-devel] [PATCH v2 3/7] dmg: Refactor and prepare dmg_read_chunk() to cache random access points Ashijeet Acharya
@ 2017-05-05 13:33   ` Stefan Hajnoczi
  0 siblings, 0 replies; 18+ messages in thread
From: Stefan Hajnoczi @ 2017-05-05 13:33 UTC (permalink / raw)
  To: Ashijeet Acharya
  Cc: kwolf, jsnow, mreitz, famz, peter, qemu-devel, qemu-block


On Thu, Apr 27, 2017 at 01:36:33PM +0530, Ashijeet Acharya wrote:
> Refactor dmg_read_chunk() to start making use of the new DMGReadState
> structure and do chunk and sector related calculations based on it.
> Add a new argument "DMGReadState *drs" to it.
> 
> Also, rename DMG_SECTORCOUNTS_MAX to DMG_SECTOR_MAX to avoid indentation
> problems at some places inside dmg_read_chunk()
> 
> Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
> ---
>  block/dmg.c | 159 +++++++++++++++++++++++++++++++++++-------------------------
>  1 file changed, 94 insertions(+), 65 deletions(-)

This patch makes dmg_read_chunk() reread the chunk even if it has
already been read into s->uncompressed_chunk.  This is inefficient.  If
this is necessary for some reason then the commit description should
include a justification.

> diff --git a/block/dmg.c b/block/dmg.c
> index c6fe8b0..32623e2 100644
> --- a/block/dmg.c
> +++ b/block/dmg.c
> @@ -38,7 +38,7 @@ enum {
>       * or truncating when converting to 32-bit types
>       */
>      DMG_LENGTHS_MAX = 64 * 1024 * 1024, /* 64 MB */
> -    DMG_SECTORCOUNTS_MAX = DMG_LENGTHS_MAX / 512,
> +    DMG_SECTOR_MAX = DMG_LENGTHS_MAX / 512,
>  };
>  
>  static int dmg_probe(const uint8_t *buf, int buf_size, const char *filename)
> @@ -260,10 +260,10 @@ static int dmg_read_mish_block(BDRVDMGState *s, DmgHeaderState *ds,
>  
>          /* all-zeroes sector (type 2) does not need to be "uncompressed" and can
>           * therefore be unbounded. */
> -        if (s->types[i] != 2 && s->sectorcounts[i] > DMG_SECTORCOUNTS_MAX) {
> +        if (s->types[i] != 2 && s->sectorcounts[i] > DMG_SECTOR_MAX) {
>              error_report("sector count %" PRIu64 " for chunk %" PRIu32
>                           " is larger than max (%u)",
> -                         s->sectorcounts[i], i, DMG_SECTORCOUNTS_MAX);
> +                         s->sectorcounts[i], i, DMG_SECTOR_MAX);
>              ret = -EINVAL;
>              goto fail;
>          }
> @@ -578,78 +578,106 @@ static inline uint32_t search_chunk(BDRVDMGState *s, uint64_t sector_num)
>      return s->n_chunks; /* error */
>  }
>  
> -static inline int dmg_read_chunk(BlockDriverState *bs, uint64_t sector_num)
> +static inline int dmg_read_chunk(BlockDriverState *bs, uint64_t sector_num,
> +                                 DMGReadState *drs)
>  {
>      BDRVDMGState *s = bs->opaque;
>  
> +    int ret;
> +    uint32_t sector_offset;
> +    uint64_t sectors_read;
> +    uint32_t chunk;
> +
>      if (!is_sector_in_chunk(s, s->current_chunk, sector_num)) {
> -        int ret;
> -        uint32_t chunk = search_chunk(s, sector_num);
> +        chunk = search_chunk(s, sector_num);
> +    } else {
> +        chunk = drs->saved_chunk_type;

chunk is an index into s->sectors[] and s->sectorcounts[].  It is not a
chunk type, so drs->saved_chunk_type is an odd name for this new field.

But now this makes me wonder - how is s->current_chunk different from
drs->saved_chunk_type?

> +    }
>  
> -        if (chunk >= s->n_chunks) {
> +    if (chunk >= s->n_chunks) {
> +            return -1;
> +    }
> +
> +    /* reset our access point cache if we had a change in current chunk */
> +    if (chunk != drs->saved_chunk_type) {
> +        cache_access_point(drs, NULL, -1, -1, -1, -1);
> +    }
> +

The results of the computation from here...

> +    sector_offset = sector_num - s->sectors[chunk];
> +
> +    if ((s->sectorcounts[chunk] - sector_offset) > DMG_SECTOR_MAX) {
> +        sectors_read = DMG_SECTOR_MAX;
> +    } else {
> +        sectors_read = s->sectorcounts[chunk] - sector_offset;
> +    }
> +
> +    /* truncate sectors read if it exceeds the 2MB buffer of qemu-img
> +     * convert */
> +    if ((sector_num % DMG_SECTOR_MAX) + sectors_read > DMG_SECTOR_MAX) {
> +        sectors_read = DMG_SECTOR_MAX - (sector_num % DMG_SECTOR_MAX);
> +    }

...to here are never used.  Please remove the unused code and variables
from this patch.  It may be necessary to add them back in a later patch
(I haven't looked ahead), but each patch must be self-contained and make
sense without foreknowledge of future patches.

> +
> +    s->current_chunk = s->n_chunks;
> +
> +    switch (s->types[chunk]) { /* block entry type */
> +    case 0x80000005: { /* zlib compressed */
> +        /* we need to buffer, because only the chunk as whole can be
> +         * inflated. */
> +        ret = bdrv_pread(bs->file, s->offsets[chunk],
> +                         s->compressed_chunk, s->lengths[chunk]);
> +        if (ret != s->lengths[chunk]) {
>              return -1;
>          }
>  
> -        s->current_chunk = s->n_chunks;
> -        switch (s->types[chunk]) { /* block entry type */
> -        case 0x80000005: { /* zlib compressed */
> -            /* we need to buffer, because only the chunk as whole can be
> -             * inflated. */
> -            ret = bdrv_pread(bs->file, s->offsets[chunk],
> -                             s->compressed_chunk, s->lengths[chunk]);
> -            if (ret != s->lengths[chunk]) {
> -                return -1;
> -            }
> -
> -            s->zstream.next_in = s->compressed_chunk;
> -            s->zstream.avail_in = s->lengths[chunk];
> -            s->zstream.next_out = s->uncompressed_chunk;
> -            s->zstream.avail_out = 512 * s->sectorcounts[chunk];
> -            ret = inflateReset(&s->zstream);
> -            if (ret != Z_OK) {
> -                return -1;
> -            }
> -            ret = inflate(&s->zstream, Z_FINISH);
> -            if (ret != Z_STREAM_END ||
> -                s->zstream.total_out != 512 * s->sectorcounts[chunk]) {
> -                return -1;
> -            }
> -            break; }
> -        case 0x80000006: /* bzip2 compressed */
> -            if (!dmg_uncompress_bz2) {
> -                break;
> -            }
> -            /* we need to buffer, because only the chunk as whole can be
> -             * inflated. */
> -            ret = bdrv_pread(bs->file, s->offsets[chunk],
> -                             s->compressed_chunk, s->lengths[chunk]);
> -            if (ret != s->lengths[chunk]) {
> -                return -1;
> -            }
> -
> -            ret = dmg_uncompress_bz2((char *)s->compressed_chunk,
> -                                     (unsigned int) s->lengths[chunk],
> -                                     (char *)s->uncompressed_chunk,
> -                                     (unsigned int)
> -                                         (512 * s->sectorcounts[chunk]));
> -            if (ret < 0) {
> -                return ret;
> -            }
> -            break;
> -        case 1: /* copy */
> -            ret = bdrv_pread(bs->file, s->offsets[chunk],
> -                             s->uncompressed_chunk, s->lengths[chunk]);
> -            if (ret != s->lengths[chunk]) {
> -                return -1;
> -            }
> -            break;
> -        case 2: /* zero */
> -            /* see dmg_read, it is treated specially. No buffer needs to be
> -             * pre-filled, the zeroes can be set directly. */
> +        s->zstream.next_in = s->compressed_chunk;
> +        s->zstream.avail_in = s->lengths[chunk];
> +        s->zstream.next_out = s->uncompressed_chunk;
> +        s->zstream.avail_out = 512 * s->sectorcounts[chunk];
> +        ret = inflateReset(&s->zstream);
> +        if (ret != Z_OK) {
> +            return -1;
> +        }
> +        ret = inflate(&s->zstream, Z_FINISH);
> +        if (ret != Z_STREAM_END ||
> +            s->zstream.total_out != 512 * s->sectorcounts[chunk]) {
> +            return -1;
> +        }
> +        break; }
> +    case 0x80000006: /* bzip2 compressed */
> +        if (!dmg_uncompress_bz2) {
>              break;
>          }
> -        s->current_chunk = chunk;
> +        /* we need to buffer, because only the chunk as whole can be
> +         * inflated. */
> +        ret = bdrv_pread(bs->file, s->offsets[chunk],
> +                         s->compressed_chunk, s->lengths[chunk]);
> +        if (ret != s->lengths[chunk]) {
> +            return -1;
> +        }
> +
> +        ret = dmg_uncompress_bz2((char *)s->compressed_chunk,
> +                                 (unsigned int) s->lengths[chunk],
> +                                 (char *)s->uncompressed_chunk,
> +                                 (unsigned int)
> +                                    (512 * s->sectorcounts[chunk]));
> +        if (ret < 0) {
> +            return ret;
> +        }
> +        break;
> +    case 1: /* copy */
> +        ret = bdrv_pread(bs->file, s->offsets[chunk],
> +                         s->uncompressed_chunk, s->lengths[chunk]);
> +        if (ret != s->lengths[chunk]) {
> +            return -1;
> +        }
> +        break;
> +    case 2: /* zero */
> +        /* see dmg_read, it is treated specially. No buffer needs to be
> +         * pre-filled, the zeroes can be set directly. */
> +        break;
>      }
> +    s->current_chunk = chunk;
> +
>      return 0;
>  }
>  
> @@ -661,6 +689,7 @@ dmg_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
>      uint64_t sector_num = offset >> BDRV_SECTOR_BITS;
>      int nb_sectors = bytes >> BDRV_SECTOR_BITS;
>      int ret, i;
> +    DMGReadState *drs = s->drs;
>  
>      assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
>      assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
> @@ -671,7 +700,7 @@ dmg_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
>          uint32_t sector_offset_in_chunk;
>          void *data;
>  
> -        if (dmg_read_chunk(bs, sector_num + i) != 0) {
> +        if (dmg_read_chunk(bs, sector_num + i, drs) != 0) {
>              ret = -EIO;
>              goto fail;
>          }
> -- 
> 2.6.2
> 



* Re: [Qemu-devel] [PATCH v2 0/7] Refactor DMG driver to have chunk size independence
  2017-04-27  8:06 [Qemu-devel] [PATCH v2 0/7] Refactor DMG driver to have chunk size independence Ashijeet Acharya
                   ` (8 preceding siblings ...)
  2017-05-05 13:06 ` Stefan Hajnoczi
@ 2017-05-05 13:59 ` Stefan Hajnoczi
  2017-08-20 12:47   ` Ashijeet Acharya
  9 siblings, 1 reply; 18+ messages in thread
From: Stefan Hajnoczi @ 2017-05-05 13:59 UTC (permalink / raw)
  To: Ashijeet Acharya
  Cc: kwolf, jsnow, mreitz, famz, peter, qemu-devel, qemu-block


On Thu, Apr 27, 2017 at 01:36:30PM +0530, Ashijeet Acharya wrote:
> This series helps to provide chunk size independence for DMG driver to prevent
> denial-of-service in cases where untrusted files are being accessed by the user.

The core of the chunk size dependence problem are these lines:

    s->compressed_chunk = qemu_try_blockalign(bs->file->bs,
                                              ds.max_compressed_size + 1);
    s->uncompressed_chunk = qemu_try_blockalign(bs->file->bs,
                                                512 * ds.max_sectors_per_chunk);

The refactoring needs to eliminate these buffers because their size is
controlled by the untrusted input file.

After applying your patches these lines remain unchanged and we still
cannot use input files that have a 250 MB chunk size, for example.  So
I'm not sure how this series is supposed to work.

Here is the approach I would take:

In order to achieve this dmg_read_chunk() needs to be scrapped.  It is
designed to read a full chunk.  The new model does not read full chunks
anymore.

Uncompressed reads or zeroes should operate directly on qiov, not
s->uncompressed_chunk.  This code will be dropped:

    data = s->uncompressed_chunk + sector_offset_in_chunk * 512;
    qemu_iovec_from_buf(qiov, i * 512, data, 512);

Compressed reads still buffers.  I suggest the following buffers:

1. compressed_buf - compressed data is read into this buffer from file
2. uncompressed_buf - a place to discard decompressed data while
                      simulating a seek operation

Data is read from compressed chunks by reading a reasonable amount
(64k?) into compressed_buf.  If the user wishes to read at an offset
into this chunk then a loop decompresses data we are seeking over into
uncompressed_buf (and refills compressed_buf if it becomes empty) until
the desired offset is reached.  Then decompression can continue
directly into the user's qiov and uncompressed_buf isn't used to
decompress the data requested by the user.

Sequential compressed reads can be optimized by keeping the compression
state across read calls.  That means the zlib/bz2 state plus
compressed_buf and the current offset.  That way we don't need to
re-seek into the current compressed chunk to handle sequential reads.



* Re: [Qemu-devel] [PATCH v2 0/7] Refactor DMG driver to have chunk size independence
  2017-05-05 13:59 ` Stefan Hajnoczi
@ 2017-08-20 12:47   ` Ashijeet Acharya
  2017-08-29 15:25     ` Stefan Hajnoczi
  0 siblings, 1 reply; 18+ messages in thread
From: Ashijeet Acharya @ 2017-08-20 12:47 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, John Snow, Max Reitz, Fam Zheng, Peter Wu,
	QEMU Developers, qemu block

On Fri, May 5, 2017 at 7:29 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote:

> On Thu, Apr 27, 2017 at 01:36:30PM +0530, Ashijeet Acharya wrote:
> > This series helps to provide chunk size independence for DMG driver to
> prevent
> > denial-of-service in cases where untrusted files are being accessed by
> the user.
>
> The core of the chunk size dependence problem are these lines:
>
>     s->compressed_chunk = qemu_try_blockalign(bs->file->bs,
>                                               ds.max_compressed_size + 1);
>     s->uncompressed_chunk = qemu_try_blockalign(bs->file->bs,
>                                                 512 *
> ds.max_sectors_per_chunk);
>
> The refactoring needs to eliminate these buffers because their size is
> controlled by the untrusted input file.
>

Oh okay, I understand now. But wouldn't I still need to allocate some
memory for these buffers to be able to use them for the compressed-chunk
case you mentioned below? Instead of letting the DMG image control the
size of these buffers, maybe I can hard-code their size instead?


>
> After applying your patches these lines remain unchanged and we still
> cannot use input files that have a 250 MB chunk size, for example.  So
> I'm not sure how this series is supposed to work.
>
> Here is the approach I would take:
>
> In order to achieve this dmg_read_chunk() needs to be scrapped.  It is
> designed to read a full chunk.  The new model does not read full chunks
> anymore.
>
> Uncompressed reads or zeroes should operate directly on qiov, not
> s->uncompressed_chunk.  This code will be dropped:
>
>     data = s->uncompressed_chunk + sector_offset_in_chunk * 512;
>     qemu_iovec_from_buf(qiov, i * 512, data, 512);
>

I have never worked with qiov before; are there any places inside other
drivers I can refer to for an idea of how to use it directly (I am
searching by myself in the meantime...)? I got clearly what you are
trying to say, but I don't know how to implement it. Also, don't we
already do that for the zeroed chunks in DMG in dmg_co_preadv()?


>
> Compressed reads still buffers.  I suggest the following buffers:
>
> 1. compressed_buf - compressed data is read into this buffer from file
> 2. uncompressed_buf - a place to discard decompressed data while
>                       simulating a seek operation
>

Yes, these are the buffers whose size I can hard-code as discussed above?
You can suggest the preferred size to me.


> Data is read from compressed chunks by reading a reasonable amount
> (64k?) into compressed_buf.  If the user wishes to read at an offset
> into this chunk then a loop decompresses data we are seeking over into
> uncompressed_buf (and refills compressed_buf if it becomes empty) until
> the desired offset is reached.  Then decompression can continue
> directly into the user's qiov and uncompressed_buf isn't used to
> decompress the data requested by the user.
>

Yes, this series does exactly that, but it keeps using the "uncompressed"
buffer once we reach the desired offset. Once I understand how to use
qiov directly, we can do this. Also, Kevin did suggest to me (as I
vaguely remember) that in reality we never actually get a read request at
a particular offset, because the DMG driver is generally used with
"qemu-img convert", which means all read requests start from the top.


>
> Sequential compressed reads can be optimized by keeping the compression
> state across read calls.  That means the zlib/bz2 state plus
> compressed_buf and the current offset.  That way we don't need to
> re-seek into the current compressed chunk to handle sequential reads.
>

I guess that's what I implemented with this series, so now I can reuse
the "access point caching" part in the next series to implement this
optimization.

Thanks
Ashijeet


* Re: [Qemu-devel] [PATCH v2 0/7] Refactor DMG driver to have chunk size independence
  2017-08-20 12:47   ` Ashijeet Acharya
@ 2017-08-29 15:25     ` Stefan Hajnoczi
  2017-08-30 13:02       ` Ashijeet Acharya
  0 siblings, 1 reply; 18+ messages in thread
From: Stefan Hajnoczi @ 2017-08-29 15:25 UTC (permalink / raw)
  To: Ashijeet Acharya
  Cc: Kevin Wolf, John Snow, Max Reitz, Fam Zheng, Peter Wu,
	QEMU Developers, qemu block

On Sun, Aug 20, 2017 at 1:47 PM, Ashijeet Acharya
<ashijeetacharya@gmail.com> wrote:
> On Fri, May 5, 2017 at 7:29 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote:
>>
>> On Thu, Apr 27, 2017 at 01:36:30PM +0530, Ashijeet Acharya wrote:
>> > This series helps to provide chunk size independence for DMG driver to
>> > prevent
>> > denial-of-service in cases where untrusted files are being accessed by
>> > the user.
>>
>> The core of the chunk size dependence problem are these lines:
>>
>>     s->compressed_chunk = qemu_try_blockalign(bs->file->bs,
>>                                               ds.max_compressed_size + 1);
>>     s->uncompressed_chunk = qemu_try_blockalign(bs->file->bs,
>>                                                 512 *
>> ds.max_sectors_per_chunk);
>>
>> The refactoring needs to eliminate these buffers because their size is
>> controlled by the untrusted input file.
>
>
> Oh okay, I understand now. But wouldn't I still need to allocate some memory
> for these buffers to be able to use them for the compressed chunks case you
> mentioned below. Instead of letting the DMG images control the size of these
> buffers, maybe I can hard-code the size of these buffers instead?
>
>>
>>
>> After applying your patches these lines remain unchanged and we still
>> cannot use input files that have a 250 MB chunk size, for example.  So
>> I'm not sure how this series is supposed to work.
>>
>> Here is the approach I would take:
>>
>> In order to achieve this dmg_read_chunk() needs to be scrapped.  It is
>> designed to read a full chunk.  The new model does not read full chunks
>> anymore.
>>
>> Uncompressed reads or zeroes should operate directly on qiov, not
>> s->uncompressed_chunk.  This code will be dropped:
>>
>>     data = s->uncompressed_chunk + sector_offset_in_chunk * 512;
>>     qemu_iovec_from_buf(qiov, i * 512, data, 512);
>
>
> I have never worked with qiov before, are there any places where I can refer
> to inside other drivers to get the idea of how to use it directly (I am
> searching by myself in the meantime...)?

A QEMUIOVector is a utility type for struct iovec iov[] processing.
See util/iov.c.  This is called "vectored" or "scatter-gather" I/O.

Instead of transferring data to/from a single <buffer, length> tuple,
they take an array [<buffer, length>].  For example, the buffer "Hello
world" could be split into two elements:
[{"Hello ", strlen("Hello ")},
 {"world", strlen("world")}]

Vectored I/O is often used because it eliminates memory copies.  Say
you have a network packet header struct and also a data payload array.
Traditionally you would have to allocate a new buffer large enough for
both header and payload, copy the header and payload into the buffer,
and finally give this temporary buffer to the I/O function.  This is
inefficient.  With vectored I/O you can create a vector with two
elements, the header and the payload, and the I/O function can process
them without needing a temporary buffer copy.

> I got clearly what you are trying
> to say, but don't know how to implement it. I think, don't we already do
> that for the zeroed chunks in DMG in dmg_co_preadv()?

Yes, dmg_co_preadv() directly zeroes the qiov.  It doesn't use
s->compressed_chunk/s->uncompressed_chunk.

>
>>
>>
>> Compressed reads still buffers.  I suggest the following buffers:
>>
>> 1. compressed_buf - compressed data is read into this buffer from file
>> 2. uncompressed_buf - a place to discard decompressed data while
>>                       simulating a seek operation
>
>
> Yes, these are the buffers whose size I can hard-code as discussed above?
> You can suggest the preferred size to me.

Try starting with 256 KB for both buffers.

>> Data is read from compressed chunks by reading a reasonable amount
>> (64k?) into compressed_buf.  If the user wishes to read at an offset
>> into this chunk then a loop decompresses data we are seeking over into
>> uncompressed_buf (and refills compressed_buf if it becomes empty) until
>> the desired offset is reached.  Then decompression can continue
>> directly into the user's qiov and uncompressed_buf isn't used to
>> decompress the data requested by the user.
>
>
> Yes, this series does exactly that but keeps using the "uncompressed" buffer
> once we reach the desired offset. Once, I understand to use qiov directly,
> we can do this. Also, Kevin did suggest me (as I remember vaguely) that in
> reality we never actually get the read request at a particular offset
> because DMG driver is generally used with "qemu-img convert", which means
> all read requests are from the top.

For performance it's fine to assume a sequential access pattern.  The
block driver should still support random access I/O though.


* Re: [Qemu-devel] [PATCH v2 0/7] Refactor DMG driver to have chunk size independence
  2017-08-29 15:25     ` Stefan Hajnoczi
@ 2017-08-30 13:02       ` Ashijeet Acharya
  2017-09-05 10:58         ` Stefan Hajnoczi
  0 siblings, 1 reply; 18+ messages in thread
From: Ashijeet Acharya @ 2017-08-30 13:02 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, John Snow, Max Reitz, Fam Zheng, Peter Wu,
	QEMU Developers, qemu block

On Tue, Aug 29, 2017 at 8:55 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote:

> On Sun, Aug 20, 2017 at 1:47 PM, Ashijeet Acharya
> <ashijeetacharya@gmail.com> wrote:
> > On Fri, May 5, 2017 at 7:29 PM, Stefan Hajnoczi <stefanha@gmail.com>
> wrote:
> >>
> >> On Thu, Apr 27, 2017 at 01:36:30PM +0530, Ashijeet Acharya wrote:
> >> > This series helps to provide chunk size independence for DMG driver to
> >> > prevent
> >> > denial-of-service in cases where untrusted files are being accessed by
> >> > the user.
> >>
> >> The core of the chunk size dependence problem are these lines:
> >>
> >>     s->compressed_chunk = qemu_try_blockalign(bs->file->bs,
> >>                                               ds.max_compressed_size +
> 1);
> >>     s->uncompressed_chunk = qemu_try_blockalign(bs->file->bs,
> >>                                                 512 *
> >> ds.max_sectors_per_chunk);
> >>
> >> The refactoring needs to eliminate these buffers because their size is
> >> controlled by the untrusted input file.
> >
> >
> > Oh okay, I understand now. But wouldn't I still need to allocate some
> memory
> > for these buffers to be able to use them for the compressed chunks case
> you
> > mentioned below. Instead of letting the DMG images control the size of
> these
> > buffers, maybe I can hard-code the size of these buffers instead?
> >
> >>
> >>
> >> After applying your patches these lines remain unchanged and we still
> >> cannot use input files that have a 250 MB chunk size, for example.  So
> >> I'm not sure how this series is supposed to work.
> >>
> >> Here is the approach I would take:
> >>
> >> In order to achieve this dmg_read_chunk() needs to be scrapped.  It is
> >> designed to read a full chunk.  The new model does not read full chunks
> >> anymore.
> >>
> >> Uncompressed reads or zeroes should operate directly on qiov, not
> >> s->uncompressed_chunk.  This code will be dropped:
> >>
> >>     data = s->uncompressed_chunk + sector_offset_in_chunk * 512;
> >>     qemu_iovec_from_buf(qiov, i * 512, data, 512);
> >
> >
> > I have never worked with qiov before, are there any places where I can
> refer
> > to inside other drivers to get the idea of how to use it directly (I am
> > searching by myself in the meantime...)?
>
> A QEMUIOVector is a utility type for struct iovec iov[] processing.
> See util/iov.c.  This is called "vectored" or "scatter-gather" I/O.
>
> Instead of transferring data to/from a single <buffer, length> tuple,
> they take an array [<buffer, length>].  For example, the buffer "Hello
> world" could be split into two elements:
> [{"Hello ", strlen("Hello ")},
>  {"world", strlen("world")}]
>
> Vectored I/O is often used because it eliminates memory copies.  Say
> you have a network packet header struct and also a data payload array.
> Traditionally you would have to allocate a new buffer large enough for
> both header and payload, copy the header and payload into the buffer,
> and finally give this temporary buffer to the I/O function.  This is
> inefficient.  With vectored I/O you can create a vector with two
> elements, the header and the payload, and the I/O function can process
> them without needing a temporary buffer copy.
>

Thanks for the detailed explanation; I think I now understand the concept
and how to use qiov efficiently.
Correct me if I am wrong here. In order to use qiov directly for
uncompressed chunks:

1. Declare a new local_qiov inside dmg_co_preadv (let's say)
2. Initialize it with qemu_iovec_init()
3. Reset it with qemu_iovec_reset() (this is because we will perform this
action in a loop and thus need to reset it before every loop?)
4. Declare a buffer "uncompressed_buf" and allocate it with
qemu_try_blockalign()
5. Add this buffer to our local_qiov using qemu_iovec_add()
6. Read data from file directly into local_qiov using bdrv_co_preadv()
7. On success concatenate it with the qiov passed into the main
dmg_co_preadv() function.

I think this method only works for uncompressed chunks. For the compressed
ones, I believe we will still need to do it in the existing way, i.e. read
chunk from file -> decompress into output buffer -> use
qemu_iovec_from_buf() because we cannot read directly since data is in
compressed form. Makes sense?


> > I got clearly what you are trying
> > to say, but don't know how to implement it. I think, don't we already do
> > that for the zeroed chunks in DMG in dmg_co_preadv()?
>
> Yes, dmg_co_preadv() directly zeroes the qiov.  It doesn't use
> s->compressed_chunk/s->uncompressed_chunk.
>
> >
> >>
> >>
> >> Compressed reads still need buffers.  I suggest the following buffers:
> >>
> >> 1. compressed_buf - compressed data is read into this buffer from file
> >> 2. uncompressed_buf - a place to discard decompressed data while
> >>                       simulating a seek operation
> >
> >
> > Yes, these are the buffers whose size I can hard-code as discussed above?
> > You can suggest the preferred size to me.
>
> Try starting with 256 KB for both buffers.
>

Okay, I will do that. But I think we cannot use these buffer sizes for bz2
chunks (see below)


> >> Data is read from compressed chunks by reading a reasonable amount
> >> (64k?) into compressed_buf.  If the user wishes to read at an offset
> >> into this chunk then a loop decompresses data we are seeking over into
> >> uncompressed_buf (and refills compressed_buf if it becomes empty) until
> >> the desired offset is reached.  Then decompression can continue
> >> directly into the user's qiov and uncompressed_buf isn't used to
> >> decompress the data requested by the user.
> >
> >
> > Yes, this series does exactly that but keeps using the "uncompressed"
> buffer
> > once we reach the desired offset. Once, I understand to use qiov
> directly,
> > we can do this. Also, Kevin did suggest me (as I remember vaguely) that
> in
> > reality we never actually get the read request at a particular offset
> > because DMG driver is generally used with "qemu-img convert", which means
> > all read requests are from the top.
>
> For performance it's fine to assume a sequential access pattern.  The
> block driver should still support random access I/O though.
>

Yes, I agree. But I don't think we can add random access for the bz2 chunks
because they need to be decompressed as a whole and not in parts. I tried
to explain it in my cover letter as an important note (
https://lists.gnu.org/archive/html/qemu-devel/2017-04/msg05327.html). Due
to the limitation explained there, I guess we cannot use a loop to seek to
the desired offset inside bz2 chunks. Also, since they need to be
decompressed at once, we will ultimately need to allocate large buffers.

If you feel I am right so far, I suggest that we allocate small buffers for
the other cases as discussed above and separately allocate large buffers
sized to the bz2 chunk we are currently reading. This works because bz2
chunks normally expand to a maximum of about 46 MB, which is below our
current limit of 64 MB.
See this -> https://en.wikipedia.org/wiki/Bzip2#File_format

Thanks
Ashijeet

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [PATCH v2 0/7] Refactor DMG driver to have chunk size independence
  2017-08-30 13:02       ` Ashijeet Acharya
@ 2017-09-05 10:58         ` Stefan Hajnoczi
  2017-09-05 11:09           ` Ashijeet Acharya
  0 siblings, 1 reply; 18+ messages in thread
From: Stefan Hajnoczi @ 2017-09-05 10:58 UTC (permalink / raw)
  To: Ashijeet Acharya
  Cc: Kevin Wolf, John Snow, Max Reitz, Fam Zheng, Peter Wu,
	QEMU Developers, qemu block

On Wed, Aug 30, 2017 at 06:32:52PM +0530, Ashijeet Acharya wrote:
> On Tue, Aug 29, 2017 at 8:55 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote:
> 
> > On Sun, Aug 20, 2017 at 1:47 PM, Ashijeet Acharya
> > <ashijeetacharya@gmail.com> wrote:
> > > On Fri, May 5, 2017 at 7:29 PM, Stefan Hajnoczi <stefanha@gmail.com>
> > wrote:
> > >>
> > >> On Thu, Apr 27, 2017 at 01:36:30PM +0530, Ashijeet Acharya wrote:
> > >> > This series helps to provide chunk size independence for DMG driver to
> > >> > prevent
> > >> > denial-of-service in cases where untrusted files are being accessed by
> > >> > the user.
> > >>
> > >> The core of the chunk size dependence problem are these lines:
> > >>
> > >>     s->compressed_chunk = qemu_try_blockalign(bs->file->bs,
> > >>                                               ds.max_compressed_size +
> > 1);
> > >>     s->uncompressed_chunk = qemu_try_blockalign(bs->file->bs,
> > >>                                                 512 *
> > >> ds.max_sectors_per_chunk);
> > >>
> > >> The refactoring needs to eliminate these buffers because their size is
> > >> controlled by the untrusted input file.
> > >
> > >
> > > Oh okay, I understand now. But wouldn't I still need to allocate some
> > memory
> > > for these buffers to be able to use them for the compressed chunks case
> > you
> > > mentioned below. Instead of letting the DMG images control the size of
> > these
> > > buffers, maybe I can hard-code the size of these buffers instead?
> > >
> > >>
> > >>
> > >> After applying your patches these lines remain unchanged and we still
> > >> cannot use input files that have a 250 MB chunk size, for example.  So
> > >> I'm not sure how this series is supposed to work.
> > >>
> > >> Here is the approach I would take:
> > >>
> > >> In order to achieve this dmg_read_chunk() needs to be scrapped.  It is
> > >> designed to read a full chunk.  The new model does not read full chunks
> > >> anymore.
> > >>
> > >> Uncompressed reads or zeroes should operate directly on qiov, not
> > >> s->uncompressed_chunk.  This code will be dropped:
> > >>
> > >>     data = s->uncompressed_chunk + sector_offset_in_chunk * 512;
> > >>     qemu_iovec_from_buf(qiov, i * 512, data, 512);
> > >
> > >
> > > I have never worked with qiov before, are there any places where I can
> > refer
> > > to inside other drivers to get the idea of how to use it directly (I am
> > > searching by myself in the meantime...)?
> >
> > A QEMUIOVector is a utility type for struct iovec iov[] processing.
> > See util/iov.c.  This is called "vectored" or "scatter-gather" I/O.
> >
> > Instead of transferring data to/from a single <buffer, length> tuple,
> > they take an array [<buffer, length>].  For example, the buffer "Hello
> > world" could be split into two elements:
> > [{"Hello ", strlen("Hello ")},
> >  {"world", strlen("world")}]
> >
> > Vectored I/O is often used because it eliminates memory copies.  Say
> > you have a network packet header struct and also a data payload array.
> > Traditionally you would have to allocate a new buffer large enough for
> > both header and payload, copy the header and payload into the buffer,
> > and finally give this temporary buffer to the I/O function.  This is
> > inefficient.  With vectored I/O you can create a vector with two
> > elements, the header and the payload, and the I/O function can process
> > them without needing a temporary buffer copy.
> >
> 
> Thanks for the detailed explanation, I think I understood the concept now
> and how to use qiov efficiently.
> Correct me if I am wrong here. In order to use qiov directly for
> uncompressed chunks:
> 
> 1. Declare a new local_qiov inside dmg_co_preadv (let's say)

No, the operation should use qiov directly if the chunk is uncompressed.

A temporary buffer is only needed if the data needs to be uncompressed.

> 2. Initialize it with qemu_iovec_init()
> 3. Reset it with qemu_iovec_reset() (this is because we will perform this
> action in a loop and thus need to reset it before every loop?)
> 4. Declare a buffer "uncompressed_buf" and allocate it with
> qemu_try_blockalign()
> 5. Add this buffer to our local_qiov using qemu_iovec_add()
> 6. Read data from file directly into local_qiov using bdrv_co_preadv()
> 7. On success concatenate it with the qiov passed into the main
> dmg_co_preadv() function.
> 
> I think this method only works for uncompressed chunks. For the compressed
> ones, I believe we will still need to do it in the existing way, i.e. read
> chunk from file -> decompress into output buffer -> use
> qemu_iovec_from_buf() because we cannot read directly since data is in
> compressed form. Makes sense?
> 
> 
> > > I got clearly what you are trying
> > > to say, but don't know how to implement it. I think, don't we already do
> > > that for the zeroed chunks in DMG in dmg_co_preadv()?
> >
> > Yes, dmg_co_preadv() directly zeroes the qiov.  It doesn't use
> > s->compressed_chunk/s->uncompressed_chunk.
> >
> > >
> > >>
> > >>
> > > >> Compressed reads still need buffers.  I suggest the following buffers:
> > >>
> > >> 1. compressed_buf - compressed data is read into this buffer from file
> > >> 2. uncompressed_buf - a place to discard decompressed data while
> > >>                       simulating a seek operation
> > >
> > >
> > > Yes, these are the buffers whose size I can hard-code as discussed above?
> > > You can suggest the preferred size to me.
> >
> > Try starting with 256 KB for both buffers.
> >
> 
> Okay, I will do that. But I think we cannot use these buffer sizes for bz2
> chunks (see below)
> 
> 
> > >> Data is read from compressed chunks by reading a reasonable amount
> > >> (64k?) into compressed_buf.  If the user wishes to read at an offset
> > >> into this chunk then a loop decompresses data we are seeking over into
> > >> uncompressed_buf (and refills compressed_buf if it becomes empty) until
> > >> the desired offset is reached.  Then decompression can continue
> > >> directly into the user's qiov and uncompressed_buf isn't used to
> > >> decompress the data requested by the user.
> > >
> > >
> > > Yes, this series does exactly that but keeps using the "uncompressed"
> > buffer
> > > once we reach the desired offset. Once, I understand to use qiov
> > directly,
> > > we can do this. Also, Kevin did suggest me (as I remember vaguely) that
> > in
> > > reality we never actually get the read request at a particular offset
> > > because DMG driver is generally used with "qemu-img convert", which means
> > > all read requests are from the top.
> >
> > For performance it's fine to assume a sequential access pattern.  The
> > block driver should still support random access I/O though.
> >
> 
> Yes, I agree. But I don't think we can add random access for the bz2 chunks
> because they need to be decompressed as a whole and not in parts. I tried
> to explain it in my cover letter as an important note (
> https://lists.gnu.org/archive/html/qemu-devel/2017-04/msg05327.html).

Here is what you said:

  "bz2 compressed streams do not allow random access midway through a
  chunk/block as the BZ2_bzDecompress() API in bzlib seeks for the
  magic key "BZh" before starting decompression.[2] This magic key is
  present at the start of every chunk/block only and since our cached
  random access points need not necessarily point to the start of a
  chunk/block, BZ2_bzDecompress() fails with an error value
  BZ_DATA_ERROR_MAGIC"

The block driver can simulate random access.  Take a look at the
BZ2_bzDecompress() API docs:

  "You may provide and remove as little or as much data as you like on
  each call of BZ2_bzDecompress. In the limit, it is acceptable to
  supply and remove data one byte at a time, although this would be
  terribly inefficient. You should always ensure that at least one byte
  of output space is available at each call."

In other words, bz2 supports streaming.  Therefore you can use the
buffer sizes I suggested in a loop instead of reading the whole bz2
block upfront.

If you keep the bz_stream between dmg_uncompress_bz2_do() calls then
sequential reads can be optimized.  They do not need to reread from the
beginning of the bz2 block.  That's important because sequential I/O is
the main access pattern expected by this block driver.

Stefan

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [PATCH v2 0/7] Refactor DMG driver to have chunk size independence
  2017-09-05 10:58         ` Stefan Hajnoczi
@ 2017-09-05 11:09           ` Ashijeet Acharya
  0 siblings, 0 replies; 18+ messages in thread
From: Ashijeet Acharya @ 2017-09-05 11:09 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, John Snow, Max Reitz, Fam Zheng, Peter Wu,
	QEMU Developers, qemu block

On Tue, Sep 5, 2017 at 4:28 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote:

> On Wed, Aug 30, 2017 at 06:32:52PM +0530, Ashijeet Acharya wrote:
> > On Tue, Aug 29, 2017 at 8:55 PM, Stefan Hajnoczi <stefanha@gmail.com>
> wrote:
> >
> > > On Sun, Aug 20, 2017 at 1:47 PM, Ashijeet Acharya
> > > <ashijeetacharya@gmail.com> wrote:
> > > > On Fri, May 5, 2017 at 7:29 PM, Stefan Hajnoczi <stefanha@gmail.com>
> > > wrote:
> > > >>
> > > >> On Thu, Apr 27, 2017 at 01:36:30PM +0530, Ashijeet Acharya wrote:
> > > >> > This series helps to provide chunk size independence for DMG
> driver to
> > > >> > prevent
> > > >> > denial-of-service in cases where untrusted files are being
> accessed by
> > > >> > the user.
> > > >>
> > > >> The core of the chunk size dependence problem are these lines:
> > > >>
> > > >>     s->compressed_chunk = qemu_try_blockalign(bs->file->bs,
> > > >>
>  ds.max_compressed_size +
> > > 1);
> > > >>     s->uncompressed_chunk = qemu_try_blockalign(bs->file->bs,
> > > >>                                                 512 *
> > > >> ds.max_sectors_per_chunk);
> > > >>
> > > >> The refactoring needs to eliminate these buffers because their size
> is
> > > >> controlled by the untrusted input file.
> > > >
> > > >
> > > > Oh okay, I understand now. But wouldn't I still need to allocate some
> > > memory
> > > > for these buffers to be able to use them for the compressed chunks
> case
> > > you
> > > > mentioned below. Instead of letting the DMG images control the size
> of
> > > these
> > > > buffers, maybe I can hard-code the size of these buffers instead?
> > > >
> > > >>
> > > >>
> > > >> After applying your patches these lines remain unchanged and we
> still
> > > >> cannot use input files that have a 250 MB chunk size, for example.
> So
> > > >> I'm not sure how this series is supposed to work.
> > > >>
> > > >> Here is the approach I would take:
> > > >>
> > > >> In order to achieve this dmg_read_chunk() needs to be scrapped.  It
> is
> > > >> designed to read a full chunk.  The new model does not read full
> chunks
> > > >> anymore.
> > > >>
> > > >> Uncompressed reads or zeroes should operate directly on qiov, not
> > > >> s->uncompressed_chunk.  This code will be dropped:
> > > >>
> > > >>     data = s->uncompressed_chunk + sector_offset_in_chunk * 512;
> > > >>     qemu_iovec_from_buf(qiov, i * 512, data, 512);
> > > >
> > > >
> > > > I have never worked with qiov before, are there any places where I
> can
> > > refer
> > > > to inside other drivers to get the idea of how to use it directly (I
> am
> > > > searching by myself in the meantime...)?
> > >
> > > A QEMUIOVector is a utility type for struct iovec iov[] processing.
> > > See util/iov.c.  This is called "vectored" or "scatter-gather" I/O.
> > >
> > > Instead of transferring data to/from a single <buffer, length> tuple,
> > > they take an array [<buffer, length>].  For example, the buffer "Hello
> > > world" could be split into two elements:
> > > [{"Hello ", strlen("Hello ")},
> > >  {"world", strlen("world")}]
> > >
> > > Vectored I/O is often used because it eliminates memory copies.  Say
> > > you have a network packet header struct and also a data payload array.
> > > Traditionally you would have to allocate a new buffer large enough for
> > > both header and payload, copy the header and payload into the buffer,
> > > and finally give this temporary buffer to the I/O function.  This is
> > > inefficient.  With vectored I/O you can create a vector with two
> > > elements, the header and the payload, and the I/O function can process
> > > them without needing a temporary buffer copy.
> > >
> >
> > Thanks for the detailed explanation, I think I understood the concept now
> > and how to use qiov efficiently.
> > Correct me if I am wrong here. In order to use qiov directly for
> > uncompressed chunks:
> >
> > 1. Declare a new local_qiov inside dmg_co_preadv (let's say)
>
> No, the operation should use qiov directly if the chunk is uncompressed.
>
> A temporary buffer is only needed if the data needs to be uncompressed.
>

Yes, I had a chat with John and he cleared up most of my doubts about how
to approach this correctly without using a temporary buffer.


>
> > 2. Initialize it with qemu_iovec_init()
> > 3. Reset it with qemu_iovec_reset() (this is because we will perform this
> > action in a loop and thus need to reset it before every loop?)
> > 4. Declare a buffer "uncompressed_buf" and allocate it with
> > qemu_try_blockalign()
> > 5. Add this buffer to our local_qiov using qemu_iovec_add()
> > 6. Read data from file directly into local_qiov using bdrv_co_preadv()
> > 7. On success concatenate it with the qiov passed into the main
> > dmg_co_preadv() function.
> >
> > I think this method only works for uncompressed chunks. For the
> compressed
> > ones, I believe we will still need to do it in the existing way, i.e.
> read
> > chunk from file -> decompress into output buffer -> use
> > qemu_iovec_from_buf() because we cannot read directly since data is in
> > compressed form. Makes sense?
> >
> >
> > > > I got clearly what you are trying
> > > > to say, but don't know how to implement it. I think, don't we
> already do
> > > > that for the zeroed chunks in DMG in dmg_co_preadv()?
> > >
> > > Yes, dmg_co_preadv() directly zeroes the qiov.  It doesn't use
> > > s->compressed_chunk/s->uncompressed_chunk.
> > >
> > > >
> > > >>
> > > >>
> > > >> Compressed reads still need buffers.  I suggest the following buffers:
> > > >>
> > > >> 1. compressed_buf - compressed data is read into this buffer from
> file
> > > >> 2. uncompressed_buf - a place to discard decompressed data while
> > > >>                       simulating a seek operation
> > > >
> > > >
> > > > Yes, these are the buffers whose size I can hard-code as discussed
> above?
> > > > You can suggest the preferred size to me.
> > >
> > > Try starting with 256 KB for both buffers.
> > >
> >
> > Okay, I will do that. But I think we cannot use these buffer sizes for
> bz2
> > chunks (see below)
> >
> >
> > > >> Data is read from compressed chunks by reading a reasonable amount
> > > >> (64k?) into compressed_buf.  If the user wishes to read at an offset
> > > >> into this chunk then a loop decompresses data we are seeking over
> into
> > > >> uncompressed_buf (and refills compressed_buf if it becomes empty)
> until
> > > >> the desired offset is reached.  Then decompression can continue
> > > >> directly into the user's qiov and uncompressed_buf isn't used to
> > > >> decompress the data requested by the user.
> > > >
> > > >
> > > > Yes, this series does exactly that but keeps using the "uncompressed"
> > > buffer
> > > > once we reach the desired offset. Once, I understand to use qiov
> > > directly,
> > > > we can do this. Also, Kevin did suggest me (as I remember vaguely)
> that
> > > in
> > > > reality we never actually get the read request at a particular offset
> > > > because DMG driver is generally used with "qemu-img convert", which
> means
> > > > all read requests are from the top.
> > >
> > > For performance it's fine to assume a sequential access pattern.  The
> > > block driver should still support random access I/O though.
> > >
> >
> > Yes, I agree. But I don't think we can add random access for the bz2
> chunks
> > because they need to be decompressed as a whole and not in parts. I tried
> > to explain it in my cover letter as an important note (
> > https://lists.gnu.org/archive/html/qemu-devel/2017-04/msg05327.html).
>
> Here is what you said:
>
>   "bz2 compressed streams do not allow random access midway through a
>   chunk/block as the BZ2_bzDecompress() API in bzlib seeks for the
>   magic key "BZh" before starting decompression.[2] This magic key is
>   present at the start of every chunk/block only and since our cached
>   random access points need not necessarily point to the start of a
>   chunk/block, BZ2_bzDecompress() fails with an error value
>   BZ_DATA_ERROR_MAGIC"
>
> The block driver can simulate random access.  Take a look at the
> BZ2_bzDecompress() API docs:
>
>   "You may provide and remove as little or as much data as you like on
>   each call of BZ2_bzDecompress. In the limit, it is acceptable to
>   supply and remove data one byte at a time, although this would be
>   terribly inefficient. You should always ensure that at least one byte
>   of output space is available at each call."
>
> In other words, bz2 supports streaming.  Therefore you can use the
> buffer sizes I suggested in a loop instead of reading the whole bz2
> block upfront.
>
> If you keep the bz_stream between dmg_uncompress_bz2_do() calls then
> sequential reads can be optimized.  They do not need to reread from the
> beginning of the bz2 block.  That's important because sequential I/O is
> the main access pattern expected by this block driver.
>

Yes, I am trying to implement it exactly like that. It failed the last time
I tried it in v2, but maybe I did something wrong, because the docs say
otherwise. As for the optimization, I am caching the access points so that
the next call can resume reading from there, so that should work.

Ashijeet

>
> Stefan
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2017-09-05 11:10 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-27  8:06 [Qemu-devel] [PATCH v2 0/7] Refactor DMG driver to have chunk size independence Ashijeet Acharya
2017-04-27  8:06 ` [Qemu-devel] [PATCH v2 1/7] dmg: Introduce a new struct to cache random access points Ashijeet Acharya
2017-05-05 13:12   ` Stefan Hajnoczi
2017-04-27  8:06 ` [Qemu-devel] [PATCH v2 2/7] dmg: New function to help us cache random access point Ashijeet Acharya
2017-04-27  8:06 ` [Qemu-devel] [PATCH v2 3/7] dmg: Refactor and prepare dmg_read_chunk() to cache random access points Ashijeet Acharya
2017-05-05 13:33   ` Stefan Hajnoczi
2017-04-27  8:06 ` [Qemu-devel] [PATCH v2 4/7] dmg: Handle zlib compressed chunks Ashijeet Acharya
2017-04-27  8:06 ` [Qemu-devel] [PATCH v2 5/7] dmg: Handle bz2 compressed/raw/zeroed chunks Ashijeet Acharya
2017-04-27  8:06 ` [Qemu-devel] [PATCH v2 6/7] dmg: Refactor dmg_co_preadv() to start reading multiple sectors Ashijeet Acharya
2017-04-27  8:06 ` [Qemu-devel] [PATCH v2 7/7] dmg: Limit the output buffer size to a max of 2MB Ashijeet Acharya
2017-04-27 20:59 ` [Qemu-devel] [PATCH v2 0/7] Refactor DMG driver to have chunk size independence Ashijeet Acharya
2017-05-05 13:06 ` Stefan Hajnoczi
2017-05-05 13:59 ` Stefan Hajnoczi
2017-08-20 12:47   ` Ashijeet Acharya
2017-08-29 15:25     ` Stefan Hajnoczi
2017-08-30 13:02       ` Ashijeet Acharya
2017-09-05 10:58         ` Stefan Hajnoczi
2017-09-05 11:09           ` Ashijeet Acharya
