From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:60411)
    by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1Y7NHR-0001id-GX for qemu-devel@nongnu.org;
    Sat, 03 Jan 2015 06:54:06 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
    (envelope-from ) id 1Y7NHP-0006sY-Uk for qemu-devel@nongnu.org;
    Sat, 03 Jan 2015 06:54:05 -0500
Received: from mail.lekensteyn.nl ([2a02:2308::360:1:25]:55203)
    by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1Y7NHP-0006sM-Gw for qemu-devel@nongnu.org;
    Sat, 03 Jan 2015 06:54:03 -0500
From: Peter Wu
Date: Sat, 03 Jan 2015 12:54 +0100
Message-ID: <3699574.WfSB9l0yJF@al>
In-Reply-To: <54A73210.5010709@redhat.com>
References: <1419692504-29373-1-git-send-email-peter@lekensteyn.nl>
    <1419692504-29373-7-git-send-email-peter@lekensteyn.nl>
    <54A73210.5010709@redhat.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 7Bit
Content-Type: text/plain; charset="us-ascii"
Subject: Re: [Qemu-devel] [PATCH 06/10] block/dmg: process XML plists
To: John Snow
Cc: Kevin Wolf, qemu-devel@nongnu.org, Stefan Hajnoczi

On Friday 02 January 2015 19:04:32 John Snow wrote:
> On 12/27/2014 10:01 AM, Peter Wu wrote:
> > The format is simple enough to avoid using a full-blown XML parser.
> > The offsets are based on the description at
> > http://newosxbook.com/DMG.html
> >
> > Signed-off-by: Peter Wu
> > ---
> >  block/dmg.c | 69 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 69 insertions(+)
> >
> > diff --git a/block/dmg.c b/block/dmg.c
> > index 19e4fe2..c03ea01 100644
> > --- a/block/dmg.c
> > +++ b/block/dmg.c
> > @@ -26,6 +26,7 @@
> >  #include "qemu/bswap.h"
> >  #include "qemu/module.h"
> >  #include <zlib.h>
> > +#include <glib.h>
> >
> >  enum {
> >      /* Limit chunk sizes to prevent unreasonable amounts of memory being used
> > @@ -333,12 +334,66 @@ fail:
> >      return ret;
> >  }
> >
> > +static int dmg_read_plist_xml(BlockDriverState *bs, DmgHeaderState *ds,
> > +                              uint64_t info_begin, uint64_t info_length)
> > +{
> > +    BDRVDMGState *s = bs->opaque;
> > +    int ret;
> > +    uint8_t *buffer = NULL;
> > +    char *data_begin, *data_end;
> > +
> > +    /* Have at least some length to avoid NULL for g_malloc. Attempt to set a
> > +     * safe upper cap on the data length. A test sample had a XML length of
> > +     * about 1 MiB. */
> > +    if (info_length == 0 || info_length > 16 * 1024 * 1024) {
> > +        ret = -EINVAL;
> > +        goto fail;
> > +    }
> > +
> > +    buffer = g_malloc(info_length + 1);
> > +    buffer[info_length] = '\0';
> > +    ret = bdrv_pread(bs->file, info_begin, buffer, info_length);
> > +    if (ret != info_length) {
> > +        ret = -EINVAL;
> > +        goto fail;
> > +    }
> > +
> > +    /* look for <data>...</data>. The data is 284 (0x11c) bytes after base64
> > +     * decode. The actual data element has 431 (0x1af) bytes which includes tabs
> > +     * and line feeds. */
> > +    data_end = (char *)buffer;
> > +    while ((data_begin = strstr(data_end, "<data>")) != NULL) {
> > +        gsize out_len = 0;
> > +
> > +        data_begin += 6;
> > +        data_end = strstr(data_begin, "</data>");
> > +        /* malformed XML? */
> > +        if (data_end == NULL) {
> > +            ret = -EINVAL;
> > +            goto fail;
> > +        }
> > +        *data_end++ = '\0';
> > +        g_base64_decode_inplace(data_begin, &out_len);
> > +        ret = dmg_read_mish_block(s, ds, (uint8_t *)data_begin,
> > +                                  (uint32_t)out_len);
> > +        if (ret < 0) {
> > +            goto fail;
> > +        }
> > +    }
> > +    ret = 0;
> > +
> > +fail:
> > +    g_free(buffer);
> > +    return ret;
> > +}
> > +
>
> This starts to make me a little nervous, because we're ignoring so much
> of the XML document structure here and just effectively performing a
> regular search for "<data>(.*)</data>".
>
> Can we guarantee that the ONLY time the data element is used in this
> document is when it is being used in the exact context we are expecting
> here, where it contains the b64 mish data we expect it to?
>
> i.e. it is always in a path like this as detailed by
> http://newosxbook.com/DMG.html :
>
> plist/dict/key[text()='resource-fork']/following-sibling::dict/key[text()='blkx']/following-sibling::array/dict/key[text()='data']/following-sibling::data
>
> I notice that this document says other sections MAY be present, do any
> of them ever need to be parsed? Has anyone written about them before?
>
> Do we know if any use data sections?
>
> I suppose at the very least, sections of interest are always going to
> include the "mish" magic, so that should probably keep us from doing
> anything too stupid ...

I did not find DMG files with <data> elements at other locations. If that
did occur, at worst we would fail to parse a DMG file. I think that
introducing an XML parser here would introduce a risk for a minor benefit
(being prepared for future cases).

Since this is a property list, in theory people could include all kinds
of data for different keys (which would then be matched by the current
implementation). But how likely is this for a disk image?

FWIW, I looked into the dmg2img program and that also looks for the
strings "<data>" and "</data>". Nobody has raised a bug for that program
so far.

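To make the trade-off above concrete, here is a minimal standalone sketch of the tag search; it is not the QEMU code. It assumes a writable, NUL-terminated buffer, swaps g_base64_decode_inplace for a small hand-written decoder so that it builds without GLib, and turns the "mish" magic mentioned in the review into an explicit filter. The names b64_value, b64_decode_inplace and count_mish_payloads are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Map one base64 character to its 6-bit value, or -1 if it is not one. */
static int b64_value(unsigned char c)
{
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;
}

/* Decode base64 text in place. Whitespace, '=' padding and any other
 * non-alphabet characters are skipped. Returns the decoded byte count.
 * The write position always trails the read position, so this is safe. */
static size_t b64_decode_inplace(char *text)
{
    uint8_t *out = (uint8_t *)text;
    size_t out_len = 0;
    uint32_t acc = 0;
    int bits = 0;

    for (const char *p = text; *p != '\0'; p++) {
        int v = b64_value((unsigned char)*p);
        if (v < 0) {
            continue;
        }
        acc = (acc << 6) | (uint32_t)v;
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            out[out_len++] = (uint8_t)(acc >> bits);
        }
    }
    return out_len;
}

/* Walk the buffer, decode every <data>...</data> payload in place and count
 * those that start with the "mish" magic. Returns -1 if an element is not
 * terminated (assumption: treat that as malformed input). */
static int count_mish_payloads(char *xml)
{
    static const char open_tag[] = "<data>";
    static const char close_tag[] = "</data>";
    char *begin, *end = xml;
    int accepted = 0;

    assert(xml != NULL);
    while ((begin = strstr(end, open_tag)) != NULL) {
        begin += sizeof(open_tag) - 1;
        end = strstr(begin, close_tag);
        if (end == NULL) {
            return -1;
        }
        *end++ = '\0';  /* terminate the payload, resume the search after it */
        size_t len = b64_decode_inplace(begin);
        if (len >= 4 && memcmp(begin, "mish", 4) == 0) {
            accepted++;
        }
    }
    return accepted;
}
```

In this sketch a <data> element whose decoded payload lacks the magic is skipped rather than reported as an error, which is one possible way to tolerate unrelated keys in the property list.
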
Do you think that it is worth using an XML parser on potentially
insecure data? I suggest keeping it as is, and reconsidering the
approach if a problem is encountered.

Kind regards,
Peter

> >  static int dmg_open(BlockDriverState *bs, QDict *options, int flags,
> >                      Error **errp)
> >  {
> >      BDRVDMGState *s = bs->opaque;
> >      DmgHeaderState ds;
> >      uint64_t rsrc_fork_offset, rsrc_fork_length;
> > +    uint64_t plist_xml_offset, plist_xml_length;
> >      int64_t offset;
> >      int ret;
> >
> > @@ -366,12 +421,26 @@ static int dmg_open(BlockDriverState *bs, QDict *options, int flags,
> >      if (ret < 0) {
> >          goto fail;
> >      }
> > +    /* offset of property list (XMLOffset) */
> > +    ret = read_uint64(bs, offset + 0xd8, &plist_xml_offset);
> > +    if (ret < 0) {
> > +        goto fail;
> > +    }
> > +    ret = read_uint64(bs, offset + 0xe0, &plist_xml_length);
> > +    if (ret < 0) {
> > +        goto fail;
> > +    }
> >      if (rsrc_fork_offset != 0 && rsrc_fork_length != 0) {
> >          ret = dmg_read_resource_fork(bs, &ds,
> >                                       rsrc_fork_offset, rsrc_fork_length);
> >          if (ret < 0) {
> >              goto fail;
> >          }
> > +    } else if (plist_xml_offset != 0 && plist_xml_length != 0) {
> > +        ret = dmg_read_plist_xml(bs, &ds, plist_xml_offset, plist_xml_length);
> > +        if (ret < 0) {
> > +            goto fail;
> > +        }
> >      } else {
> >          ret = -EINVAL;
> >          goto fail;
> >
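
For reference, the header lookup in the dmg_open hunk can be sketched outside QEMU roughly as follows. This is a sketch under assumptions: the trailer is already in memory, its fields are stored big-endian as described at http://newosxbook.com/DMG.html, and be64_decode, plist_range and plist_range_from_trailer are made-up names rather than QEMU helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Field positions relative to the start of the trailer, as in the patch. */
enum {
    XML_OFFSET_FIELD = 0xd8,
    XML_LENGTH_FIELD = 0xe0
};

/* Hypothetical result type: where the XML property list lives in the image. */
struct plist_range {
    uint64_t offset;
    uint64_t length;
};

/* Decode a big-endian 64-bit value from eight raw bytes. */
static uint64_t be64_decode(const uint8_t *p)
{
    uint64_t v = 0;

    for (int i = 0; i < 8; i++) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* Fill *out from the trailer. Returns 0 on success and -1 if the trailer is
 * too short or either field is zero, mirroring the "!= 0" checks in the
 * quoted hunk. */
static int plist_range_from_trailer(const uint8_t *trailer, size_t trailer_len,
                                    struct plist_range *out)
{
    assert(trailer != NULL && out != NULL);
    if (trailer_len < (size_t)XML_LENGTH_FIELD + 8) {
        return -1;
    }
    out->offset = be64_decode(trailer + XML_OFFSET_FIELD);
    out->length = be64_decode(trailer + XML_LENGTH_FIELD);
    if (out->offset == 0 || out->length == 0) {
        return -1;
    }
    return 0;
}
```

A caller would still need to bound the length before allocating, as the 16 MiB cap in dmg_read_plist_xml does.
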