From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:37005) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Y7O6u-0003TP-Md for qemu-devel@nongnu.org; Sat, 03 Jan 2015 07:47:18 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Y7O6t-00007P-74 for qemu-devel@nongnu.org; Sat, 03 Jan 2015 07:47:16 -0500 Received: from mail.lekensteyn.nl ([2a02:2308::360:1:25]:38343) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Y7O6s-00007E-Qi for qemu-devel@nongnu.org; Sat, 03 Jan 2015 07:47:15 -0500 From: Peter Wu Date: Sat, 03 Jan 2015 13:47:11 +0100 Message-ID: <1605759.uV2RWABCx2@al> In-Reply-To: <54A73249.7090002@redhat.com> References: <1419692504-29373-1-git-send-email-peter@lekensteyn.nl> <1419692504-29373-9-git-send-email-peter@lekensteyn.nl> <54A73249.7090002@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: 7Bit Content-Type: text/plain; charset="us-ascii" Subject: Re: [Qemu-devel] [PATCH 08/10] block/dmg: fix sector data offset calculation List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: John Snow Cc: Kevin Wolf , qemu-devel@nongnu.org, Stefan Hajnoczi On Friday 02 January 2015 19:05:29 John Snow wrote: > On 12/27/2014 10:01 AM, Peter Wu wrote: > > This patch addresses two issues: > > > > - The data fork offset was not taken into account, resulting in failure > > to read an InstallESD.dmg file (5164763151 bytes) which had a > > non-zero DataForkOffset field. > > - The offset of the previous block ("partition") was unconditionally > > added to the current block because older files would start the input > > offset of a new block at zero. Newer files (including vlc-2.1.5.dmg, > > tuxpaint-0.9.15-macosx.dmg and OS X Yosemite [MAS].dmg) failed in > > reads because these files have a continuous input offset. > > > > What does "continuous input offset" mean? This change is not as clear to > me, see below. By "continuous" I mean that the new files have absolute offsets while the offsets in older files were relative to the previous block. > > Signed-off-by: Peter Wu > > --- > > block/dmg.c | 16 +++++++++++++++- > > 1 file changed, 15 insertions(+), 1 deletion(-) > > > > diff --git a/block/dmg.c b/block/dmg.c > > index 984997f..93b597f 100644 > > --- a/block/dmg.c > > +++ b/block/dmg.c > > @@ -179,6 +179,7 @@ static int64_t dmg_find_koly_offset(BlockDriverState *file_bs) > > typedef struct DmgHeaderState { > > /* used internally by dmg_read_mish_block to remember offsets of blocks > > * across calls */ > > + uint64_t data_fork_offset; > > uint64_t last_in_offset; > > uint64_t last_out_offset; > > /* exported for dmg_open */ > > @@ -194,6 +195,7 @@ static int dmg_read_mish_block(BDRVDMGState *s, DmgHeaderState *ds, > > size_t new_size; > > uint32_t chunk_count; > > int64_t offset = 0; > > + uint64_t in_offset = ds->data_fork_offset; > > > > type = buff_read_uint32(buffer, offset); > > /* skip data that is not a valid MISH block (invalid magic or too small) */ > > @@ -246,7 +248,12 @@ static int dmg_read_mish_block(BDRVDMGState *s, DmgHeaderState *ds, > > } > > > > s->offsets[i] = buff_read_uint64(buffer, offset); > > - s->offsets[i] += ds->last_in_offset; > > + /* If this offset is below the previous chunk end, then assume that all > > + * following offsets are after the previous chunks. */ > > + if (s->offsets[i] + in_offset < ds->last_in_offset) { > > + in_offset = ds->last_in_offset; > > + } > > + s->offsets[i] += in_offset; > > I take it that all of the offsets referenced in the mish structures are > relative to the start of the data fork block, which is why we are taking > a value from the koly block and applying it to mish block values. > > correct? Correct, the mish block describes the contents of the data fork. http://newosxbook.com/DMG.html says: typedef struct { // ... uint64_t CompressedOffset; // Start of chunk in data fork uint64_t CompressedLength; // Count of bytes of chunk, in data fork } __attribute__((__packed__)) BLKXChunkEntry; > > offset += 8; > > > > s->lengths[i] = buff_read_uint64(buffer, offset); > > @@ -400,6 +407,7 @@ static int dmg_open(BlockDriverState *bs, QDict *options, int flags, > > bs->read_only = 1; > > s->n_chunks = 0; > > s->offsets = s->lengths = s->sectors = s->sectorcounts = NULL; > > + ds.data_fork_offset = 0; > > ds.last_in_offset = 0; > > ds.last_out_offset = 0; > > ds.max_compressed_size = 1; > > @@ -412,6 +420,12 @@ static int dmg_open(BlockDriverState *bs, QDict *options, int flags, > > goto fail; > > } > > > > + /* offset of data fork (DataForkOffset) */ > > + ret = read_uint64(bs, offset + 0x18, &ds.data_fork_offset); > > + if (ret < 0) { > > + goto fail; > > + } > > + > > /* offset of resource fork (RsrcForkOffset) */ > > ret = read_uint64(bs, offset + 0x28, &rsrc_fork_offset); > > if (ret < 0) { > > > > A general question here: > > Are we ever reading the preamble of the mish block? I see we are reading > the 'n' items of 40-byte chunk data, but is there a reason we skip the > first 200 bytes of mish data, or have I misread the documents on DMG > that exist? We only use the Signature field to verify that we indeed have a BLKX entry (required since the XML fork may yield other kind of data which does not have this magic). The UDIFChecksum field is 136 bytes (confirmed by a internet search). Adding the other fields (version..reserved6 and NumberOfBlockChunks) results in 200 (+4 for the Signature). > It looks like there are some good fields here: SectorNumber, > SectorCount, DataOffset, and BlockDescriptors -- can these not be used > to provide a more explicit error-checking of offsets, allowing us to > make less assumptions about where these blocks begin and end? > > Is there some reason they are unreliable? As far as I know this is not checked because nobody added it. I am not aware of incorrect values inside this block. Let's see: - Version: could check this against '1' and fail if unknown? - SectorNumber: looks like this can be taken as the absolute offset, all entries seem to be relative to this one. - SectorCount: looks like the number of sectors which should match the entries (at least for the tuxpaint example dmg file). - DataOffset: 0 in the tuxpaint example. Perhaps it should be added to the data fork offset, but let's ignore it for now. - 0x208 for 3/4 mish blocks, 0 for the last mish block which does not seem to contain data. - NumberOfBlockChunks: 0xFFFFFFFF, 0, 1, 2 respectively for the mish blocks. No idea what this means. Probably the ordering, but we assume that the offsets are sorted AFAIK. I will try to make use of SectorNumber and SectorCount, the others will be ignored for now. -- Kind regards, Peter https://lekensteyn.nl