All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] add LZ4 kernel decompression support
@ 2013-09-23 14:51 Jan Beulich
  2013-09-23 14:54 ` [PATCH 1/2] xen: add LZ4 " Jan Beulich
                   ` (2 more replies)
  0 siblings, 3 replies; 29+ messages in thread
From: Jan Beulich @ 2013-09-23 14:51 UTC (permalink / raw)
  To: xen-devel; +Cc: yann.collet.73, Keir Fraser, Ian Jackson, Ian Campbell

Linux 3.11 added respective support, so I think we should follow
suit.

1: xen: add LZ4 decompression support
2: libxc: add LZ4 decompression support

Signed-off-by: Jan Beulich <jbeulich@suse.com>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 1/2] xen: add LZ4 decompression support
  2013-09-23 14:51 [PATCH 0/2] add LZ4 kernel decompression support Jan Beulich
@ 2013-09-23 14:54 ` Jan Beulich
  2013-09-23 15:00   ` Ian Campbell
  2013-09-23 14:56 ` [PATCH 2/2] libxc: " Jan Beulich
  2013-09-30 11:40 ` [PATCH v3 0/2] add LZ4 kernel " Jan Beulich
  2 siblings, 1 reply; 29+ messages in thread
From: Jan Beulich @ 2013-09-23 14:54 UTC (permalink / raw)
  To: xen-devel; +Cc: yann.collet.73, Keir Fraser, Ian Jackson, Ian Campbell

[-- Attachment #1: Type: text/plain, Size: 23663 bytes --]

Add support for LZ4 decompression in Xen. LZ4 Decompression APIs for
Xen are based on LZ4 implementation by Yann Collet.

Benchmark Results(PATCH v3)
Compiler: Linaro ARM gcc 4.6.2

1. ARMv7, 1.5GHz based board
   Kernel: linux 3.4
   Uncompressed Kernel Size: 14MB
        Compressed Size  Decompression Speed
   LZO  6.7MB            20.1MB/s, 25.2MB/s(UA)
   LZ4  7.3MB            29.1MB/s, 45.6MB/s(UA)

2. ARMv7, 1.7GHz based board
   Kernel: linux 3.7
   Uncompressed Kernel Size: 14MB
        Compressed Size  Decompression Speed
   LZO  6.0MB            34.1MB/s, 52.2MB/s(UA)
   LZ4  6.5MB            86.7MB/s
- UA: Unaligned memory Access support
- Latest patch set for LZO applied

This patch set is for adding support for LZ4-compressed Kernel.  LZ4 is a
very fast lossless compression algorithm and it also features an extremely
fast decoder [1].

But we have five of decompressors already and one question which does
arise, however, is that of where do we stop adding new ones?  This issue
had been discussed and came to the conclusion [2].

Russell King said that we should have:

 - one decompressor which is the fastest
 - one decompressor for the highest compression ratio
 - one popular decompressor (eg conventional gzip)

If we have a replacement one for one of these, then it should do exactly
that: replace it.

The benchmark shows that an 8% increase in image size vs a 66% increase
in decompression speed compared to LZO(which has been known as the
fastest decompressor in the Kernel).  Therefore the "fast but may not be
small" compression title has clearly been taken by LZ4 [3].

[1] http://code.google.com/p/lz4/ 
[2] http://thread.gmane.org/gmane.linux.kbuild.devel/9157 
[3] http://thread.gmane.org/gmane.linux.kbuild.devel/9347 

LZ4 homepage: http://fastcompression.blogspot.com/p/lz4.html 
LZ4 source repository: http://code.google.com/p/lz4/ 

Signed-off-by: Kyungsik Lee <kyungsik.lee@lge.com>
Signed-off-by: Yann Collet <yann.collet.73@gmail.com>

Adopted for Xen.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -49,7 +49,7 @@ obj-y += radix-tree.o
 obj-y += rbtree.o
 obj-y += lzo.o
 
-obj-bin-$(CONFIG_X86) += $(foreach n,decompress bunzip2 unxz unlzma unlzo,$(n).init.o)
+obj-bin-$(CONFIG_X86) += $(foreach n,decompress bunzip2 unxz unlzma unlzo unlz4,$(n).init.o)
 
 obj-$(perfc)       += perfc.o
 obj-$(crash_debug) += gdbstub.o
--- a/xen/common/decompress.c
+++ b/xen/common/decompress.c
@@ -29,5 +29,8 @@ int __init decompress(void *inbuf, unsig
     if ( len >= 5 && !memcmp(inbuf, "\x89LZO", 5) )
         return unlzo(inbuf, len, NULL, NULL, outbuf, NULL, error);
 
+    if ( len >= 2 && !memcmp(inbuf, "\x02\x21", 2) )
+	return unlz4(inbuf, len, NULL, NULL, outbuf, NULL, error);
+
     return 1;
 }
--- /dev/null
+++ b/xen/common/lz4/decompress.c
@@ -0,0 +1,314 @@
+/*
+ * LZ4 Decompressor for Linux kernel
+ *
+ * Copyright (C) 2013, LG Electronics, Kyungsik Lee <kyungsik.lee@lge.com>
+ *
+ * Based on LZ4 implementation by Yann Collet.
+ *
+ * LZ4 - Fast LZ compression algorithm
+ * Copyright (C) 2011-2012, Yann Collet.
+ * BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are
+ * met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following disclaimer
+ * in the documentation and/or other materials provided with the
+ * distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  You can contact the author at :
+ *  - LZ4 homepage : http://fastcompression.blogspot.com/p/lz4.html 
+ *  - LZ4 source repository : http://code.google.com/p/lz4/ 
+ */
+
+#include "defs.h"
+
+static int INIT lz4_uncompress(const unsigned char *source, unsigned char *dest,
+			       int osize)
+{
+	const BYTE *ip = (const BYTE *) source;
+	const BYTE *ref;
+	BYTE *op = (BYTE *) dest;
+	BYTE * const oend = op + osize;
+	BYTE *cpy;
+	unsigned token;
+	size_t length;
+	size_t dec32table[] = {0, 3, 2, 3, 0, 0, 0, 0};
+#if LZ4_ARCH64
+	size_t dec64table[] = {0, 0, 0, -1, 0, 1, 2, 3};
+#endif
+
+	while (1) {
+
+		/* get runlength */
+		token = *ip++;
+		length = (token >> ML_BITS);
+		if (length == RUN_MASK) {
+			size_t len;
+
+			len = *ip++;
+			for (; len == 255; length += 255)
+				len = *ip++;
+			length += len;
+		}
+
+		/* copy literals */
+		cpy = op + length;
+		if (unlikely(cpy > oend - COPYLENGTH)) {
+			/*
+			 * Error: not enough place for another match
+			 * (min 4) + 5 literals
+			 */
+			if (cpy != oend)
+				goto _output_error;
+
+			memcpy(op, ip, length);
+			ip += length;
+			break; /* EOF */
+		}
+		LZ4_WILDCOPY(ip, op, cpy);
+		ip -= (op - cpy);
+		op = cpy;
+
+		/* get offset */
+		LZ4_READ_LITTLEENDIAN_16(ref, cpy, ip);
+		ip += 2;
+
+		/* Error: offset create reference outside destination buffer */
+		if (unlikely(ref < (BYTE *const) dest))
+			goto _output_error;
+
+		/* get matchlength */
+		length = token & ML_MASK;
+		if (length == ML_MASK) {
+			for (; *ip == 255; length += 255)
+				ip++;
+			length += *ip++;
+		}
+
+		/* copy repeated sequence */
+		if (unlikely((op - ref) < STEPSIZE)) {
+#if LZ4_ARCH64
+			size_t dec64 = dec64table[op - ref];
+#else
+			const int dec64 = 0;
+#endif
+			op[0] = ref[0];
+			op[1] = ref[1];
+			op[2] = ref[2];
+			op[3] = ref[3];
+			op += 4;
+			ref += 4;
+			ref -= dec32table[op-ref];
+			PUT4(ref, op);
+			op += STEPSIZE - 4;
+			ref -= dec64;
+		} else {
+			LZ4_COPYSTEP(ref, op);
+		}
+		cpy = op + length - (STEPSIZE - 4);
+		if (cpy > (oend - COPYLENGTH)) {
+
+			/* Error: request to write beyond destination buffer */
+			if (cpy > oend)
+				goto _output_error;
+			LZ4_SECURECOPY(ref, op, (oend - COPYLENGTH));
+			while (op < cpy)
+				*op++ = *ref++;
+			op = cpy;
+			/*
+			 * Check EOF (should never happen, since last 5 bytes
+			 * are supposed to be literals)
+			 */
+			if (op == oend)
+				goto _output_error;
+			continue;
+		}
+		LZ4_SECURECOPY(ref, op, cpy);
+		op = cpy; /* correction */
+	}
+	/* end of decoding */
+	return (int) (((unsigned char *)ip) - source);
+
+	/* write overflow error detected */
+_output_error:
+	return (int) (-(((unsigned char *)ip) - source));
+}
+
+#ifndef __XEN__
+static int lz4_uncompress_unknownoutputsize(const char *source, char *dest,
+				int isize, size_t maxoutputsize)
+{
+	const BYTE *ip = (const BYTE *) source;
+	const BYTE *const iend = ip + isize;
+	const BYTE *ref;
+
+
+	BYTE *op = (BYTE *) dest;
+	BYTE * const oend = op + maxoutputsize;
+	BYTE *cpy;
+
+	size_t dec32table[] = {0, 3, 2, 3, 0, 0, 0, 0};
+#if LZ4_ARCH64
+	size_t dec64table[] = {0, 0, 0, -1, 0, 1, 2, 3};
+#endif
+
+	/* Main Loop */
+	while (ip < iend) {
+
+		unsigned token;
+		size_t length;
+
+		/* get runlength */
+		token = *ip++;
+		length = (token >> ML_BITS);
+		if (length == RUN_MASK) {
+			int s = 255;
+			while ((ip < iend) && (s == 255)) {
+				s = *ip++;
+				length += s;
+			}
+		}
+		/* copy literals */
+		cpy = op + length;
+		if ((cpy > oend - COPYLENGTH) ||
+			(ip + length > iend - COPYLENGTH)) {
+
+			if (cpy > oend)
+				goto _output_error;/* writes beyond buffer */
+
+			if (ip + length != iend)
+				goto _output_error;/*
+						    * Error: LZ4 format requires
+						    * to consume all input
+						    * at this stage
+						    */
+			memcpy(op, ip, length);
+			op += length;
+			break;/* Necessarily EOF, due to parsing restrictions */
+		}
+		LZ4_WILDCOPY(ip, op, cpy);
+		ip -= (op - cpy);
+		op = cpy;
+
+		/* get offset */
+		LZ4_READ_LITTLEENDIAN_16(ref, cpy, ip);
+		ip += 2;
+		if (ref < (BYTE * const) dest)
+			goto _output_error;
+			/*
+			 * Error : offset creates reference
+			 * outside of destination buffer
+			 */
+
+		/* get matchlength */
+		length = (token & ML_MASK);
+		if (length == ML_MASK) {
+			while (ip < iend) {
+				int s = *ip++;
+				length += s;
+				if (s == 255)
+					continue;
+				break;
+			}
+		}
+
+		/* copy repeated sequence */
+		if (unlikely((op - ref) < STEPSIZE)) {
+#if LZ4_ARCH64
+			size_t dec64 = dec64table[op - ref];
+#else
+			const int dec64 = 0;
+#endif
+				op[0] = ref[0];
+				op[1] = ref[1];
+				op[2] = ref[2];
+				op[3] = ref[3];
+				op += 4;
+				ref += 4;
+				ref -= dec32table[op - ref];
+				PUT4(ref, op);
+				op += STEPSIZE - 4;
+				ref -= dec64;
+		} else {
+			LZ4_COPYSTEP(ref, op);
+		}
+		cpy = op + length - (STEPSIZE-4);
+		if (cpy > oend - COPYLENGTH) {
+			if (cpy > oend)
+				goto _output_error; /* write outside of buf */
+
+			LZ4_SECURECOPY(ref, op, (oend - COPYLENGTH));
+			while (op < cpy)
+				*op++ = *ref++;
+			op = cpy;
+			/*
+			 * Check EOF (should never happen, since last 5 bytes
+			 * are supposed to be literals)
+			 */
+			if (op == oend)
+				goto _output_error;
+			continue;
+		}
+		LZ4_SECURECOPY(ref, op, cpy);
+		op = cpy; /* correction */
+	}
+	/* end of decoding */
+	return (int) (((char *) op) - dest);
+
+	/* write overflow error detected */
+_output_error:
+	return (int) (-(((char *) ip) - source));
+}
+#endif
+
+STATIC int INIT lz4_decompress(const unsigned char *src, size_t *src_len,
+			       unsigned char *dest, size_t actual_dest_len)
+{
+	int ret = -1;
+	int input_len = 0;
+
+	input_len = lz4_uncompress(src, dest, actual_dest_len);
+	if (input_len < 0)
+		goto exit_0;
+	*src_len = input_len;
+
+	return 0;
+exit_0:
+	return ret;
+}
+
+#ifndef __XEN__
+int lz4_decompress_unknownoutputsize(const char *src, size_t src_len,
+		char *dest, size_t *dest_len)
+{
+	int ret = -1;
+	int out_len = 0;
+
+	out_len = lz4_uncompress_unknownoutputsize(src, dest, src_len,
+					*dest_len);
+	if (out_len < 0)
+		goto exit_0;
+	*dest_len = out_len;
+
+	return 0;
+exit_0:
+	return ret;
+}
+#endif
--- /dev/null
+++ b/xen/common/lz4/defs.h
@@ -0,0 +1,184 @@
+/*
+ * lz4defs.h -- architecture specific defines
+ *
+ * Copyright (C) 2013, LG Electronics, Kyungsik Lee <kyungsik.lee@lge.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifdef __XEN__
+#include <asm/byteorder.h>
+#endif
+
+#ifdef __LITTLE_ENDIAN
+static inline u16 INIT get_unaligned_le16(const void *p)
+{
+	return le16_to_cpup(p);
+}
+
+static inline u32 INIT get_unaligned_le32(const void *p)
+{
+	return le32_to_cpup(p);
+}
+#else
+#include <asm/unaligned.h>
+
+static inline u16 INIT get_unaligned_le16(const void *p)
+{
+	return le16_to_cpu(__get_unaligned(p, 2));
+}
+
+static inline u32 INIT get_unaligned_le32(void *p)
+{
+	return le32_to_cpu(__get_unaligned(p, 4));
+}
+#endif
+
+/*
+ * Detects 64 bits mode
+ */
+#if (defined(__x86_64__) || defined(__x86_64) || defined(__amd64__) \
+	|| defined(__ppc64__) || defined(__LP64__))
+#define LZ4_ARCH64 1
+#else
+#define LZ4_ARCH64 0
+#endif
+
+/*
+ * Architecture-specific macros
+ */
+#define BYTE	u8
+typedef struct _U16_S { u16 v; } U16_S;
+typedef struct _U32_S { u32 v; } U32_S;
+typedef struct _U64_S { u64 v; } U64_S;
+#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)		\
+	|| defined(CONFIG_ARM) && __LINUX_ARM_ARCH__ >= 6	\
+	&& defined(ARM_EFFICIENT_UNALIGNED_ACCESS)
+
+#define A16(x) (((U16_S *)(x))->v)
+#define A32(x) (((U32_S *)(x))->v)
+#define A64(x) (((U64_S *)(x))->v)
+
+#define PUT4(s, d) (A32(d) = A32(s))
+#define PUT8(s, d) (A64(d) = A64(s))
+#define LZ4_WRITE_LITTLEENDIAN_16(p, v)	\
+	do {	\
+		A16(p) = v; \
+		p += 2; \
+	} while (0)
+#else /* CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS */
+
+#define A64(x) get_unaligned((u64 *)&(((U16_S *)(x))->v))
+#define A32(x) get_unaligned((u32 *)&(((U16_S *)(x))->v))
+#define A16(x) get_unaligned((u16 *)&(((U16_S *)(x))->v))
+
+#define PUT4(s, d) \
+	put_unaligned(get_unaligned((const u32 *) s), (u32 *) d)
+#define PUT8(s, d) \
+	put_unaligned(get_unaligned((const u64 *) s), (u64 *) d)
+
+#define LZ4_WRITE_LITTLEENDIAN_16(p, v)	\
+	do {	\
+		put_unaligned(v, (u16 *)(p)); \
+		p += 2; \
+	} while (0)
+#endif
+
+#define COPYLENGTH 8
+#define ML_BITS  4
+#define ML_MASK  ((1U << ML_BITS) - 1)
+#define RUN_BITS (8 - ML_BITS)
+#define RUN_MASK ((1U << RUN_BITS) - 1)
+#define MEMORY_USAGE	14
+#define MINMATCH	4
+#define SKIPSTRENGTH	6
+#define LASTLITERALS	5
+#define MFLIMIT		(COPYLENGTH + MINMATCH)
+#define MINLENGTH	(MFLIMIT + 1)
+#define MAXD_LOG	16
+#define MAXD		(1 << MAXD_LOG)
+#define MAXD_MASK	(u32)(MAXD - 1)
+#define MAX_DISTANCE	(MAXD - 1)
+#define HASH_LOG	(MAXD_LOG - 1)
+#define HASHTABLESIZE	(1 << HASH_LOG)
+#define MAX_NB_ATTEMPTS	256
+#define OPTIMAL_ML	(int)((ML_MASK-1)+MINMATCH)
+#define LZ4_64KLIMIT	((1<<16) + (MFLIMIT - 1))
+#define HASHLOG64K	((MEMORY_USAGE - 2) + 1)
+#define HASH64KTABLESIZE	(1U << HASHLOG64K)
+#define LZ4_HASH_VALUE(p)	(((A32(p)) * 2654435761U) >> \
+				((MINMATCH * 8) - (MEMORY_USAGE-2)))
+#define LZ4_HASH64K_VALUE(p)	(((A32(p)) * 2654435761U) >> \
+				((MINMATCH * 8) - HASHLOG64K))
+#define HASH_VALUE(p)		(((A32(p)) * 2654435761U) >> \
+				((MINMATCH * 8) - HASH_LOG))
+
+#if LZ4_ARCH64/* 64-bit */
+#define STEPSIZE 8
+
+#define LZ4_COPYSTEP(s, d)	\
+	do {			\
+		PUT8(s, d);	\
+		d += 8;		\
+		s += 8;		\
+	} while (0)
+
+#define LZ4_COPYPACKET(s, d)	LZ4_COPYSTEP(s, d)
+
+#define LZ4_SECURECOPY(s, d, e)			\
+	do {					\
+		if (d < e) {			\
+			LZ4_WILDCOPY(s, d, e);	\
+		}				\
+	} while (0)
+#define HTYPE u32
+
+#ifdef __BIG_ENDIAN
+#define LZ4_NBCOMMONBYTES(val) (__builtin_clzll(val) >> 3)
+#else
+#define LZ4_NBCOMMONBYTES(val) (__builtin_ctzll(val) >> 3)
+#endif
+
+#else	/* 32-bit */
+#define STEPSIZE 4
+
+#define LZ4_COPYSTEP(s, d)	\
+	do {			\
+		PUT4(s, d);	\
+		d += 4;		\
+		s += 4;		\
+	} while (0)
+
+#define LZ4_COPYPACKET(s, d)		\
+	do {				\
+		LZ4_COPYSTEP(s, d);	\
+		LZ4_COPYSTEP(s, d);	\
+	} while (0)
+
+#define LZ4_SECURECOPY	LZ4_WILDCOPY
+#define HTYPE const u8*
+
+#ifdef __BIG_ENDIAN
+#define LZ4_NBCOMMONBYTES(val) (__builtin_clz(val) >> 3)
+#else
+#define LZ4_NBCOMMONBYTES(val) (__builtin_ctz(val) >> 3)
+#endif
+
+#endif
+
+#define LZ4_READ_LITTLEENDIAN_16(d, s, p) \
+	(d = s - get_unaligned_le16(p))
+
+#define LZ4_WILDCOPY(s, d, e)		\
+	do {				\
+		LZ4_COPYPACKET(s, d);	\
+	} while (d < e)
+
+#define LZ4_BLINDCOPY(s, d, l)	\
+	do {	\
+		u8 *e = (d) + l;	\
+		LZ4_WILDCOPY(s, d, e);	\
+		d = e;	\
+	} while (0)
--- /dev/null
+++ b/xen/common/unlz4.c
@@ -0,0 +1,158 @@
+/*
+ * Wrapper for decompressing LZ4-compressed kernel, initramfs, and initrd
+ *
+ * Copyright (C) 2013, LG Electronics, Kyungsik Lee <kyungsik.lee@lge.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "decompress.h"
+#include <xen/lz4.h>
+#include "lz4/decompress.c"
+
+/*
+ * Note: Uncompressed chunk size is used in the compressor side
+ * (userspace side for compression).
+ * It is hardcoded because there is not proper way to extract it
+ * from the binary stream which is generated by the preliminary
+ * version of LZ4 tool so far.
+ */
+#define LZ4_DEFAULT_UNCOMPRESSED_CHUNK_SIZE (8 << 20)
+#define ARCHIVE_MAGICNUMBER 0x184C2102
+
+STATIC int INIT unlz4(unsigned char *input, unsigned int in_len,
+		      int (*fill)(void *, unsigned int),
+		      int (*flush)(void *, unsigned int),
+		      unsigned char *output,
+		      unsigned int *posp,
+		      void (*error)(const char *x))
+{
+	int ret = -1;
+	size_t chunksize = 0;
+	size_t uncomp_chunksize = LZ4_DEFAULT_UNCOMPRESSED_CHUNK_SIZE;
+	u8 *inp;
+	u8 *inp_start;
+	u8 *outp;
+	int size = in_len -= 4;
+	size_t out_len = get_unaligned_le32(input + in_len);
+	size_t dest_len;
+
+
+	if (output) {
+		outp = output;
+	} else if (!flush) {
+		error("NULL output pointer and no flush function provided");
+		goto exit_0;
+	} else {
+		outp = large_malloc(uncomp_chunksize);
+		if (!outp) {
+			error("Could not allocate output buffer");
+			goto exit_0;
+		}
+	}
+
+	if (input && fill) {
+		error("Both input pointer and fill function provided,");
+		goto exit_1;
+	} else if (input) {
+		inp = input;
+	} else if (!fill) {
+		error("NULL input pointer and missing fill function");
+		goto exit_1;
+	} else {
+		inp = large_malloc(lz4_compressbound(uncomp_chunksize));
+		if (!inp) {
+			error("Could not allocate input buffer");
+			goto exit_1;
+		}
+	}
+	inp_start = inp;
+
+	if (posp)
+		*posp = 0;
+
+	if (fill)
+		fill(inp, 4);
+
+	chunksize = get_unaligned_le32(inp);
+	if (chunksize == ARCHIVE_MAGICNUMBER) {
+		inp += 4;
+		size -= 4;
+	} else {
+		error("invalid header");
+		goto exit_2;
+	}
+
+	if (posp)
+		*posp += 4;
+
+	for (;;) {
+
+		if (fill)
+			fill(inp, 4);
+
+		chunksize = get_unaligned_le32(inp);
+		if (chunksize == ARCHIVE_MAGICNUMBER) {
+			inp += 4;
+			size -= 4;
+			if (posp)
+				*posp += 4;
+			continue;
+		}
+		inp += 4;
+		size -= 4;
+
+		if (posp)
+			*posp += 4;
+
+		if (fill) {
+			if (chunksize > lz4_compressbound(uncomp_chunksize)) {
+				error("chunk length is longer than allocated");
+				goto exit_2;
+			}
+			fill(inp, chunksize);
+		}
+		if (out_len >= uncomp_chunksize) {
+			dest_len = uncomp_chunksize;
+			out_len -= dest_len;
+		} else
+			dest_len = out_len;
+		ret = lz4_decompress(inp, &chunksize, outp, dest_len);
+		if (ret < 0) {
+			error("Decoding failed");
+			goto exit_2;
+		}
+
+		if (flush && flush(outp, dest_len) != dest_len)
+			goto exit_2;
+		if (output)
+			outp += dest_len;
+		if (posp)
+			*posp += chunksize;
+
+		size -= chunksize;
+
+		if (size == 0)
+			break;
+		else if (size < 0) {
+			error("data corrupted");
+			goto exit_2;
+		}
+
+		inp += chunksize;
+		if (fill)
+			inp = inp_start;
+	}
+
+	ret = 0;
+exit_2:
+	if (!input)
+		large_free(inp_start);
+exit_1:
+	if (!output)
+		large_free(outp);
+exit_0:
+	return ret;
+}
--- a/xen/include/asm-x86/config.h
+++ b/xen/include/asm-x86/config.h
@@ -25,6 +25,7 @@
 #define CONFIG_X86_PM_TIMER 1
 #define CONFIG_HPET_TIMER 1
 #define CONFIG_X86_MCE_THERMAL 1
+#define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS 1
 #define CONFIG_NUMA 1
 #define CONFIG_DISCONTIGMEM 1
 #define CONFIG_NUMA_EMU 1
--- a/xen/include/xen/decompress.h
+++ b/xen/include/xen/decompress.h
@@ -31,7 +31,7 @@ typedef int decompress_fn(unsigned char 
  * dependent).
  */
 
-decompress_fn bunzip2, unxz, unlzma, unlzo;
+decompress_fn bunzip2, unxz, unlzma, unlzo, unlz4;
 
 int decompress(void *inbuf, unsigned int len, void *outbuf);
 
--- /dev/null
+++ b/xen/include/xen/lz4.h
@@ -0,0 +1,90 @@
+#ifndef __LZ4_H__
+#define __LZ4_H__
+
+#include <xen/types.h>
+
+/*
+ * LZ4 Hypervisor Interface
+ *
+ * Copyright (C) 2013, LG Electronics, Kyungsik Lee <kyungsik.lee@lge.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#define LZ4_MEM_COMPRESS	(4096 * sizeof(unsigned char *))
+#define LZ4HC_MEM_COMPRESS	(65538 * sizeof(unsigned char *))
+
+/*
+ * lz4_compressbound()
+ * Provides the maximum size that LZ4 may output in a "worst case" scenario
+ * (input data not compressible)
+ */
+static inline size_t lz4_compressbound(size_t isize)
+{
+	return isize + (isize / 255) + 16;
+}
+
+/*
+ * lz4_compress()
+ *	src     : source address of the original data
+ *	src_len : size of the original data
+ *	dst	: output buffer address of the compressed data
+ *		This requires 'dst' of size LZ4_COMPRESSBOUND.
+ *	dst_len : is the output size, which is returned after compress done
+ *	workmem : address of the working memory.
+ *		This requires 'workmem' of size LZ4_MEM_COMPRESS.
+ *	return  : Success if return 0
+ *		  Error if return (< 0)
+ *	note :  Destination buffer and workmem must be already allocated with
+ *		the defined size.
+ */
+int lz4_compress(const unsigned char *src, size_t src_len,
+		unsigned char *dst, size_t *dst_len, void *wrkmem);
+
+ /*
+  * lz4hc_compress()
+  *	 src	 : source address of the original data
+  *	 src_len : size of the original data
+  *	 dst	 : output buffer address of the compressed data
+  *		This requires 'dst' of size LZ4_COMPRESSBOUND.
+  *	 dst_len : is the output size, which is returned after compress done
+  *	 workmem : address of the working memory.
+  *		This requires 'workmem' of size LZ4HC_MEM_COMPRESS.
+  *	 return  : Success if return 0
+  *		   Error if return (< 0)
+  *	 note :  Destination buffer and workmem must be already allocated with
+  *		 the defined size.
+  */
+int lz4hc_compress(const unsigned char *src, size_t src_len,
+		unsigned char *dst, size_t *dst_len, void *wrkmem);
+
+/*
+ * lz4_decompress()
+ *	src     : source address of the compressed data
+ *	src_len : is the input size, whcih is returned after decompress done
+ *	dest	: output buffer address of the decompressed data
+ *	actual_dest_len: is the size of uncompressed data, supposing it's known
+ *	return  : Success if return 0
+ *		  Error if return (< 0)
+ *	note :  Destination buffer must be already allocated.
+ *		slightly faster than lz4_decompress_unknownoutputsize()
+ */
+int lz4_decompress(const unsigned char *src, size_t *src_len,
+		unsigned char *dest, size_t actual_dest_len);
+
+/*
+ * lz4_decompress_unknownoutputsize()
+ *	src     : source address of the compressed data
+ *	src_len : is the input size, therefore the compressed size
+ *	dest	: output buffer address of the decompressed data
+ *	dest_len: is the max size of the destination buffer, which is
+ *			returned with actual size of decompressed data after
+ *			decompress done
+ *	return  : Success if return 0
+ *		  Error if return (< 0)
+ *	note :  Destination buffer must be already allocated.
+ */
+int lz4_decompress_unknownoutputsize(const char *src, size_t src_len,
+		char *dest, size_t *dest_len);
+#endif



[-- Attachment #2: unlz4-dom0.patch --]
[-- Type: text/plain, Size: 23690 bytes --]

xen: add LZ4 decompression support

Add support for LZ4 decompression in Xen. LZ4 Decompression APIs for
Xen are based on LZ4 implementation by Yann Collet.

Benchmark Results(PATCH v3)
Compiler: Linaro ARM gcc 4.6.2

1. ARMv7, 1.5GHz based board
   Kernel: linux 3.4
   Uncompressed Kernel Size: 14MB
        Compressed Size  Decompression Speed
   LZO  6.7MB            20.1MB/s, 25.2MB/s(UA)
   LZ4  7.3MB            29.1MB/s, 45.6MB/s(UA)

2. ARMv7, 1.7GHz based board
   Kernel: linux 3.7
   Uncompressed Kernel Size: 14MB
        Compressed Size  Decompression Speed
   LZO  6.0MB            34.1MB/s, 52.2MB/s(UA)
   LZ4  6.5MB            86.7MB/s
- UA: Unaligned memory Access support
- Latest patch set for LZO applied

This patch set is for adding support for LZ4-compressed Kernel.  LZ4 is a
very fast lossless compression algorithm and it also features an extremely
fast decoder [1].

But we have five of decompressors already and one question which does
arise, however, is that of where do we stop adding new ones?  This issue
had been discussed and came to the conclusion [2].

Russell King said that we should have:

 - one decompressor which is the fastest
 - one decompressor for the highest compression ratio
 - one popular decompressor (eg conventional gzip)

If we have a replacement one for one of these, then it should do exactly
that: replace it.

The benchmark shows that an 8% increase in image size vs a 66% increase
in decompression speed compared to LZO(which has been known as the
fastest decompressor in the Kernel).  Therefore the "fast but may not be
small" compression title has clearly been taken by LZ4 [3].

[1] http://code.google.com/p/lz4/
[2] http://thread.gmane.org/gmane.linux.kbuild.devel/9157
[3] http://thread.gmane.org/gmane.linux.kbuild.devel/9347

LZ4 homepage: http://fastcompression.blogspot.com/p/lz4.html
LZ4 source repository: http://code.google.com/p/lz4/

Signed-off-by: Kyungsik Lee <kyungsik.lee@lge.com>
Signed-off-by: Yann Collet <yann.collet.73@gmail.com>

Adopted for Xen.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -49,7 +49,7 @@ obj-y += radix-tree.o
 obj-y += rbtree.o
 obj-y += lzo.o
 
-obj-bin-$(CONFIG_X86) += $(foreach n,decompress bunzip2 unxz unlzma unlzo,$(n).init.o)
+obj-bin-$(CONFIG_X86) += $(foreach n,decompress bunzip2 unxz unlzma unlzo unlz4,$(n).init.o)
 
 obj-$(perfc)       += perfc.o
 obj-$(crash_debug) += gdbstub.o
--- a/xen/common/decompress.c
+++ b/xen/common/decompress.c
@@ -29,5 +29,8 @@ int __init decompress(void *inbuf, unsig
     if ( len >= 5 && !memcmp(inbuf, "\x89LZO", 5) )
         return unlzo(inbuf, len, NULL, NULL, outbuf, NULL, error);
 
+    if ( len >= 2 && !memcmp(inbuf, "\x02\x21", 2) )
+	return unlz4(inbuf, len, NULL, NULL, outbuf, NULL, error);
+
     return 1;
 }
--- /dev/null
+++ b/xen/common/lz4/decompress.c
@@ -0,0 +1,314 @@
+/*
+ * LZ4 Decompressor for Linux kernel
+ *
+ * Copyright (C) 2013, LG Electronics, Kyungsik Lee <kyungsik.lee@lge.com>
+ *
+ * Based on LZ4 implementation by Yann Collet.
+ *
+ * LZ4 - Fast LZ compression algorithm
+ * Copyright (C) 2011-2012, Yann Collet.
+ * BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are
+ * met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following disclaimer
+ * in the documentation and/or other materials provided with the
+ * distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  You can contact the author at :
+ *  - LZ4 homepage : http://fastcompression.blogspot.com/p/lz4.html
+ *  - LZ4 source repository : http://code.google.com/p/lz4/
+ */
+
+#include "defs.h"
+
+static int INIT lz4_uncompress(const unsigned char *source, unsigned char *dest,
+			       int osize)
+{
+	const BYTE *ip = (const BYTE *) source;
+	const BYTE *ref;
+	BYTE *op = (BYTE *) dest;
+	BYTE * const oend = op + osize;
+	BYTE *cpy;
+	unsigned token;
+	size_t length;
+	size_t dec32table[] = {0, 3, 2, 3, 0, 0, 0, 0};
+#if LZ4_ARCH64
+	size_t dec64table[] = {0, 0, 0, -1, 0, 1, 2, 3};
+#endif
+
+	while (1) {
+
+		/* get runlength */
+		token = *ip++;
+		length = (token >> ML_BITS);
+		if (length == RUN_MASK) {
+			size_t len;
+
+			len = *ip++;
+			for (; len == 255; length += 255)
+				len = *ip++;
+			length += len;
+		}
+
+		/* copy literals */
+		cpy = op + length;
+		if (unlikely(cpy > oend - COPYLENGTH)) {
+			/*
+			 * Error: not enough place for another match
+			 * (min 4) + 5 literals
+			 */
+			if (cpy != oend)
+				goto _output_error;
+
+			memcpy(op, ip, length);
+			ip += length;
+			break; /* EOF */
+		}
+		LZ4_WILDCOPY(ip, op, cpy);
+		ip -= (op - cpy);
+		op = cpy;
+
+		/* get offset */
+		LZ4_READ_LITTLEENDIAN_16(ref, cpy, ip);
+		ip += 2;
+
+		/* Error: offset create reference outside destination buffer */
+		if (unlikely(ref < (BYTE *const) dest))
+			goto _output_error;
+
+		/* get matchlength */
+		length = token & ML_MASK;
+		if (length == ML_MASK) {
+			for (; *ip == 255; length += 255)
+				ip++;
+			length += *ip++;
+		}
+
+		/* copy repeated sequence */
+		if (unlikely((op - ref) < STEPSIZE)) {
+#if LZ4_ARCH64
+			size_t dec64 = dec64table[op - ref];
+#else
+			const int dec64 = 0;
+#endif
+			op[0] = ref[0];
+			op[1] = ref[1];
+			op[2] = ref[2];
+			op[3] = ref[3];
+			op += 4;
+			ref += 4;
+			ref -= dec32table[op-ref];
+			PUT4(ref, op);
+			op += STEPSIZE - 4;
+			ref -= dec64;
+		} else {
+			LZ4_COPYSTEP(ref, op);
+		}
+		cpy = op + length - (STEPSIZE - 4);
+		if (cpy > (oend - COPYLENGTH)) {
+
+			/* Error: request to write beyond destination buffer */
+			if (cpy > oend)
+				goto _output_error;
+			LZ4_SECURECOPY(ref, op, (oend - COPYLENGTH));
+			while (op < cpy)
+				*op++ = *ref++;
+			op = cpy;
+			/*
+			 * Check EOF (should never happen, since last 5 bytes
+			 * are supposed to be literals)
+			 */
+			if (op == oend)
+				goto _output_error;
+			continue;
+		}
+		LZ4_SECURECOPY(ref, op, cpy);
+		op = cpy; /* correction */
+	}
+	/* end of decoding */
+	return (int) (((unsigned char *)ip) - source);
+
+	/* write overflow error detected */
+_output_error:
+	return (int) (-(((unsigned char *)ip) - source));
+}
+
+#ifndef __XEN__
+static int lz4_uncompress_unknownoutputsize(const char *source, char *dest,
+				int isize, size_t maxoutputsize)
+{
+	const BYTE *ip = (const BYTE *) source;
+	const BYTE *const iend = ip + isize;
+	const BYTE *ref;
+
+
+	BYTE *op = (BYTE *) dest;
+	BYTE * const oend = op + maxoutputsize;
+	BYTE *cpy;
+
+	size_t dec32table[] = {0, 3, 2, 3, 0, 0, 0, 0};
+#if LZ4_ARCH64
+	size_t dec64table[] = {0, 0, 0, -1, 0, 1, 2, 3};
+#endif
+
+	/* Main Loop */
+	while (ip < iend) {
+
+		unsigned token;
+		size_t length;
+
+		/* get runlength */
+		token = *ip++;
+		length = (token >> ML_BITS);
+		if (length == RUN_MASK) {
+			int s = 255;
+			while ((ip < iend) && (s == 255)) {
+				s = *ip++;
+				length += s;
+			}
+		}
+		/* copy literals */
+		cpy = op + length;
+		if ((cpy > oend - COPYLENGTH) ||
+			(ip + length > iend - COPYLENGTH)) {
+
+			if (cpy > oend)
+				goto _output_error;/* writes beyond buffer */
+
+			if (ip + length != iend)
+				goto _output_error;/*
+						    * Error: LZ4 format requires
+						    * to consume all input
+						    * at this stage
+						    */
+			memcpy(op, ip, length);
+			op += length;
+			break;/* Necessarily EOF, due to parsing restrictions */
+		}
+		LZ4_WILDCOPY(ip, op, cpy);
+		ip -= (op - cpy);
+		op = cpy;
+
+		/* get offset */
+		LZ4_READ_LITTLEENDIAN_16(ref, cpy, ip);
+		ip += 2;
+		if (ref < (BYTE * const) dest)
+			goto _output_error;
+			/*
+			 * Error : offset creates reference
+			 * outside of destination buffer
+			 */
+
+		/* get matchlength */
+		length = (token & ML_MASK);
+		if (length == ML_MASK) {
+			while (ip < iend) {
+				int s = *ip++;
+				length += s;
+				if (s == 255)
+					continue;
+				break;
+			}
+		}
+
+		/* copy repeated sequence */
+		if (unlikely((op - ref) < STEPSIZE)) {
+#if LZ4_ARCH64
+			size_t dec64 = dec64table[op - ref];
+#else
+			const int dec64 = 0;
+#endif
+				op[0] = ref[0];
+				op[1] = ref[1];
+				op[2] = ref[2];
+				op[3] = ref[3];
+				op += 4;
+				ref += 4;
+				ref -= dec32table[op - ref];
+				PUT4(ref, op);
+				op += STEPSIZE - 4;
+				ref -= dec64;
+		} else {
+			LZ4_COPYSTEP(ref, op);
+		}
+		cpy = op + length - (STEPSIZE-4);
+		if (cpy > oend - COPYLENGTH) {
+			if (cpy > oend)
+				goto _output_error; /* write outside of buf */
+
+			LZ4_SECURECOPY(ref, op, (oend - COPYLENGTH));
+			while (op < cpy)
+				*op++ = *ref++;
+			op = cpy;
+			/*
+			 * Check EOF (should never happen, since last 5 bytes
+			 * are supposed to be literals)
+			 */
+			if (op == oend)
+				goto _output_error;
+			continue;
+		}
+		LZ4_SECURECOPY(ref, op, cpy);
+		op = cpy; /* correction */
+	}
+	/* end of decoding */
+	return (int) (((char *) op) - dest);
+
+	/* write overflow error detected */
+_output_error:
+	return (int) (-(((char *) ip) - source));
+}
+#endif
+
+STATIC int INIT lz4_decompress(const unsigned char *src, size_t *src_len,
+			       unsigned char *dest, size_t actual_dest_len)
+{
+	int ret = -1;
+	int input_len = 0;
+
+	input_len = lz4_uncompress(src, dest, actual_dest_len);
+	if (input_len < 0)
+		goto exit_0;
+	*src_len = input_len;
+
+	return 0;
+exit_0:
+	return ret;
+}
+
+#ifndef __XEN__
+int lz4_decompress_unknownoutputsize(const char *src, size_t src_len,
+		char *dest, size_t *dest_len)
+{
+	int ret = -1;
+	int out_len = 0;
+
+	out_len = lz4_uncompress_unknownoutputsize(src, dest, src_len,
+					*dest_len);
+	if (out_len < 0)
+		goto exit_0;
+	*dest_len = out_len;
+
+	return 0;
+exit_0:
+	return ret;
+}
+#endif
--- /dev/null
+++ b/xen/common/lz4/defs.h
@@ -0,0 +1,184 @@
+/*
+ * lz4defs.h -- architecture specific defines
+ *
+ * Copyright (C) 2013, LG Electronics, Kyungsik Lee <kyungsik.lee@lge.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifdef __XEN__
+#include <asm/byteorder.h>
+#endif
+
+#ifdef __LITTLE_ENDIAN
+static inline u16 INIT get_unaligned_le16(const void *p)
+{
+	return le16_to_cpup(p);
+}
+
+static inline u32 INIT get_unaligned_le32(const void *p)
+{
+	return le32_to_cpup(p);
+}
+#else
+#include <asm/unaligned.h>
+
+static inline u16 INIT get_unaligned_le16(const void *p)
+{
+	return le16_to_cpu(__get_unaligned(p, 2));
+}
+
+static inline u32 INIT get_unaligned_le32(void *p)
+{
+	return le32_to_cpu(__get_unaligned(p, 4));
+}
+#endif
+
+/*
+ * Detects 64 bits mode
+ */
+#if (defined(__x86_64__) || defined(__x86_64) || defined(__amd64__) \
+	|| defined(__ppc64__) || defined(__LP64__))
+#define LZ4_ARCH64 1
+#else
+#define LZ4_ARCH64 0
+#endif
+
+/*
+ * Architecture-specific macros
+ */
+#define BYTE	u8
+typedef struct _U16_S { u16 v; } U16_S;
+typedef struct _U32_S { u32 v; } U32_S;
+typedef struct _U64_S { u64 v; } U64_S;
+#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)		\
+	|| defined(CONFIG_ARM) && __LINUX_ARM_ARCH__ >= 6	\
+	&& defined(ARM_EFFICIENT_UNALIGNED_ACCESS)
+
+#define A16(x) (((U16_S *)(x))->v)
+#define A32(x) (((U32_S *)(x))->v)
+#define A64(x) (((U64_S *)(x))->v)
+
+#define PUT4(s, d) (A32(d) = A32(s))
+#define PUT8(s, d) (A64(d) = A64(s))
+#define LZ4_WRITE_LITTLEENDIAN_16(p, v)	\
+	do {	\
+		A16(p) = v; \
+		p += 2; \
+	} while (0)
+#else /* CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS */
+
+#define A64(x) get_unaligned((u64 *)&(((U16_S *)(x))->v))
+#define A32(x) get_unaligned((u32 *)&(((U16_S *)(x))->v))
+#define A16(x) get_unaligned((u16 *)&(((U16_S *)(x))->v))
+
+#define PUT4(s, d) \
+	put_unaligned(get_unaligned((const u32 *) s), (u32 *) d)
+#define PUT8(s, d) \
+	put_unaligned(get_unaligned((const u64 *) s), (u64 *) d)
+
+#define LZ4_WRITE_LITTLEENDIAN_16(p, v)	\
+	do {	\
+		put_unaligned(v, (u16 *)(p)); \
+		p += 2; \
+	} while (0)
+#endif
+
+#define COPYLENGTH 8
+#define ML_BITS  4
+#define ML_MASK  ((1U << ML_BITS) - 1)
+#define RUN_BITS (8 - ML_BITS)
+#define RUN_MASK ((1U << RUN_BITS) - 1)
+#define MEMORY_USAGE	14
+#define MINMATCH	4
+#define SKIPSTRENGTH	6
+#define LASTLITERALS	5
+#define MFLIMIT		(COPYLENGTH + MINMATCH)
+#define MINLENGTH	(MFLIMIT + 1)
+#define MAXD_LOG	16
+#define MAXD		(1 << MAXD_LOG)
+#define MAXD_MASK	(u32)(MAXD - 1)
+#define MAX_DISTANCE	(MAXD - 1)
+#define HASH_LOG	(MAXD_LOG - 1)
+#define HASHTABLESIZE	(1 << HASH_LOG)
+#define MAX_NB_ATTEMPTS	256
+#define OPTIMAL_ML	(int)((ML_MASK-1)+MINMATCH)
+#define LZ4_64KLIMIT	((1<<16) + (MFLIMIT - 1))
+#define HASHLOG64K	((MEMORY_USAGE - 2) + 1)
+#define HASH64KTABLESIZE	(1U << HASHLOG64K)
+#define LZ4_HASH_VALUE(p)	(((A32(p)) * 2654435761U) >> \
+				((MINMATCH * 8) - (MEMORY_USAGE-2)))
+#define LZ4_HASH64K_VALUE(p)	(((A32(p)) * 2654435761U) >> \
+				((MINMATCH * 8) - HASHLOG64K))
+#define HASH_VALUE(p)		(((A32(p)) * 2654435761U) >> \
+				((MINMATCH * 8) - HASH_LOG))
+
+#if LZ4_ARCH64/* 64-bit */
+#define STEPSIZE 8
+
+#define LZ4_COPYSTEP(s, d)	\
+	do {			\
+		PUT8(s, d);	\
+		d += 8;		\
+		s += 8;		\
+	} while (0)
+
+#define LZ4_COPYPACKET(s, d)	LZ4_COPYSTEP(s, d)
+
+#define LZ4_SECURECOPY(s, d, e)			\
+	do {					\
+		if (d < e) {			\
+			LZ4_WILDCOPY(s, d, e);	\
+		}				\
+	} while (0)
+#define HTYPE u32
+
+#ifdef __BIG_ENDIAN
+#define LZ4_NBCOMMONBYTES(val) (__builtin_clzll(val) >> 3)
+#else
+#define LZ4_NBCOMMONBYTES(val) (__builtin_ctzll(val) >> 3)
+#endif
+
+#else	/* 32-bit */
+#define STEPSIZE 4
+
+#define LZ4_COPYSTEP(s, d)	\
+	do {			\
+		PUT4(s, d);	\
+		d += 4;		\
+		s += 4;		\
+	} while (0)
+
+#define LZ4_COPYPACKET(s, d)		\
+	do {				\
+		LZ4_COPYSTEP(s, d);	\
+		LZ4_COPYSTEP(s, d);	\
+	} while (0)
+
+#define LZ4_SECURECOPY	LZ4_WILDCOPY
+#define HTYPE const u8*
+
+#ifdef __BIG_ENDIAN
+#define LZ4_NBCOMMONBYTES(val) (__builtin_clz(val) >> 3)
+#else
+#define LZ4_NBCOMMONBYTES(val) (__builtin_ctz(val) >> 3)
+#endif
+
+#endif
+
+#define LZ4_READ_LITTLEENDIAN_16(d, s, p) \
+	(d = s - get_unaligned_le16(p))
+
+#define LZ4_WILDCOPY(s, d, e)		\
+	do {				\
+		LZ4_COPYPACKET(s, d);	\
+	} while (d < e)
+
+#define LZ4_BLINDCOPY(s, d, l)	\
+	do {	\
+		u8 *e = (d) + l;	\
+		LZ4_WILDCOPY(s, d, e);	\
+		d = e;	\
+	} while (0)
--- /dev/null
+++ b/xen/common/unlz4.c
@@ -0,0 +1,158 @@
+/*
+ * Wrapper for decompressing LZ4-compressed kernel, initramfs, and initrd
+ *
+ * Copyright (C) 2013, LG Electronics, Kyungsik Lee <kyungsik.lee@lge.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "decompress.h"
+#include <xen/lz4.h>
+#include "lz4/decompress.c"
+
+/*
+ * Note: Uncompressed chunk size is used in the compressor side
+ * (userspace side for compression).
+ * It is hardcoded because there is not proper way to extract it
+ * from the binary stream which is generated by the preliminary
+ * version of LZ4 tool so far.
+ */
+#define LZ4_DEFAULT_UNCOMPRESSED_CHUNK_SIZE (8 << 20)
+#define ARCHIVE_MAGICNUMBER 0x184C2102
+
+STATIC int INIT unlz4(unsigned char *input, unsigned int in_len,
+		      int (*fill)(void *, unsigned int),
+		      int (*flush)(void *, unsigned int),
+		      unsigned char *output,
+		      unsigned int *posp,
+		      void (*error)(const char *x))
+{
+	int ret = -1;
+	size_t chunksize = 0;
+	size_t uncomp_chunksize = LZ4_DEFAULT_UNCOMPRESSED_CHUNK_SIZE;
+	u8 *inp;
+	u8 *inp_start;
+	u8 *outp;
+	int size = in_len -= 4;
+	size_t out_len = get_unaligned_le32(input + in_len);
+	size_t dest_len;
+
+
+	if (output) {
+		outp = output;
+	} else if (!flush) {
+		error("NULL output pointer and no flush function provided");
+		goto exit_0;
+	} else {
+		outp = large_malloc(uncomp_chunksize);
+		if (!outp) {
+			error("Could not allocate output buffer");
+			goto exit_0;
+		}
+	}
+
+	if (input && fill) {
+		error("Both input pointer and fill function provided,");
+		goto exit_1;
+	} else if (input) {
+		inp = input;
+	} else if (!fill) {
+		error("NULL input pointer and missing fill function");
+		goto exit_1;
+	} else {
+		inp = large_malloc(lz4_compressbound(uncomp_chunksize));
+		if (!inp) {
+			error("Could not allocate input buffer");
+			goto exit_1;
+		}
+	}
+	inp_start = inp;
+
+	if (posp)
+		*posp = 0;
+
+	if (fill)
+		fill(inp, 4);
+
+	chunksize = get_unaligned_le32(inp);
+	if (chunksize == ARCHIVE_MAGICNUMBER) {
+		inp += 4;
+		size -= 4;
+	} else {
+		error("invalid header");
+		goto exit_2;
+	}
+
+	if (posp)
+		*posp += 4;
+
+	for (;;) {
+
+		if (fill)
+			fill(inp, 4);
+
+		chunksize = get_unaligned_le32(inp);
+		if (chunksize == ARCHIVE_MAGICNUMBER) {
+			inp += 4;
+			size -= 4;
+			if (posp)
+				*posp += 4;
+			continue;
+		}
+		inp += 4;
+		size -= 4;
+
+		if (posp)
+			*posp += 4;
+
+		if (fill) {
+			if (chunksize > lz4_compressbound(uncomp_chunksize)) {
+				error("chunk length is longer than allocated");
+				goto exit_2;
+			}
+			fill(inp, chunksize);
+		}
+		if (out_len >= uncomp_chunksize) {
+			dest_len = uncomp_chunksize;
+			out_len -= dest_len;
+		} else
+			dest_len = out_len;
+		ret = lz4_decompress(inp, &chunksize, outp, dest_len);
+		if (ret < 0) {
+			error("Decoding failed");
+			goto exit_2;
+		}
+
+		if (flush && flush(outp, dest_len) != dest_len)
+			goto exit_2;
+		if (output)
+			outp += dest_len;
+		if (posp)
+			*posp += chunksize;
+
+		size -= chunksize;
+
+		if (size == 0)
+			break;
+		else if (size < 0) {
+			error("data corrupted");
+			goto exit_2;
+		}
+
+		inp += chunksize;
+		if (fill)
+			inp = inp_start;
+	}
+
+	ret = 0;
+exit_2:
+	if (!input)
+		large_free(inp_start);
+exit_1:
+	if (!output)
+		large_free(outp);
+exit_0:
+	return ret;
+}
--- a/xen/include/asm-x86/config.h
+++ b/xen/include/asm-x86/config.h
@@ -25,6 +25,7 @@
 #define CONFIG_X86_PM_TIMER 1
 #define CONFIG_HPET_TIMER 1
 #define CONFIG_X86_MCE_THERMAL 1
+#define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS 1
 #define CONFIG_NUMA 1
 #define CONFIG_DISCONTIGMEM 1
 #define CONFIG_NUMA_EMU 1
--- a/xen/include/xen/decompress.h
+++ b/xen/include/xen/decompress.h
@@ -31,7 +31,7 @@ typedef int decompress_fn(unsigned char 
  * dependent).
  */
 
-decompress_fn bunzip2, unxz, unlzma, unlzo;
+decompress_fn bunzip2, unxz, unlzma, unlzo, unlz4;
 
 int decompress(void *inbuf, unsigned int len, void *outbuf);
 
--- /dev/null
+++ b/xen/include/xen/lz4.h
@@ -0,0 +1,90 @@
+#ifndef __LZ4_H__
+#define __LZ4_H__
+
+#include <xen/types.h>
+
+/*
+ * LZ4 Hypervisor Interface
+ *
+ * Copyright (C) 2013, LG Electronics, Kyungsik Lee <kyungsik.lee@lge.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#define LZ4_MEM_COMPRESS	(4096 * sizeof(unsigned char *))
+#define LZ4HC_MEM_COMPRESS	(65538 * sizeof(unsigned char *))
+
+/*
+ * lz4_compressbound()
+ * Provides the maximum size that LZ4 may output in a "worst case" scenario
+ * (input data not compressible)
+ */
+static inline size_t lz4_compressbound(size_t isize)
+{
+	return isize + (isize / 255) + 16;
+}
+
+/*
+ * lz4_compress()
+ *	src     : source address of the original data
+ *	src_len : size of the original data
+ *	dst	: output buffer address of the compressed data
+ *		This requires 'dst' of size LZ4_COMPRESSBOUND.
+ *	dst_len : is the output size, which is returned after compress done
+ *	workmem : address of the working memory.
+ *		This requires 'workmem' of size LZ4_MEM_COMPRESS.
+ *	return  : Success if return 0
+ *		  Error if return (< 0)
+ *	note :  Destination buffer and workmem must be already allocated with
+ *		the defined size.
+ */
+int lz4_compress(const unsigned char *src, size_t src_len,
+		unsigned char *dst, size_t *dst_len, void *wrkmem);
+
+ /*
+  * lz4hc_compress()
+  *	 src	 : source address of the original data
+  *	 src_len : size of the original data
+  *	 dst	 : output buffer address of the compressed data
+  *		This requires 'dst' of size LZ4_COMPRESSBOUND.
+  *	 dst_len : is the output size, which is returned after compress done
+  *	 workmem : address of the working memory.
+  *		This requires 'workmem' of size LZ4HC_MEM_COMPRESS.
+  *	 return  : Success if return 0
+  *		   Error if return (< 0)
+  *	 note :  Destination buffer and workmem must be already allocated with
+  *		 the defined size.
+  */
+int lz4hc_compress(const unsigned char *src, size_t src_len,
+		unsigned char *dst, size_t *dst_len, void *wrkmem);
+
+/*
+ * lz4_decompress()
+ *	src     : source address of the compressed data
+ *	src_len : is the input size, whcih is returned after decompress done
+ *	dest	: output buffer address of the decompressed data
+ *	actual_dest_len: is the size of uncompressed data, supposing it's known
+ *	return  : Success if return 0
+ *		  Error if return (< 0)
+ *	note :  Destination buffer must be already allocated.
+ *		slightly faster than lz4_decompress_unknownoutputsize()
+ */
+int lz4_decompress(const unsigned char *src, size_t *src_len,
+		unsigned char *dest, size_t actual_dest_len);
+
+/*
+ * lz4_decompress_unknownoutputsize()
+ *	src     : source address of the compressed data
+ *	src_len : is the input size, therefore the compressed size
+ *	dest	: output buffer address of the decompressed data
+ *	dest_len: is the max size of the destination buffer, which is
+ *			returned with actual size of decompressed data after
+ *			decompress done
+ *	return  : Success if return 0
+ *		  Error if return (< 0)
+ *	note :  Destination buffer must be already allocated.
+ */
+int lz4_decompress_unknownoutputsize(const char *src, size_t src_len,
+		char *dest, size_t *dest_len);
+#endif

[-- Attachment #3: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 2/2] libxc: add LZ4 decompression support
  2013-09-23 14:51 [PATCH 0/2] add LZ4 kernel decompression support Jan Beulich
  2013-09-23 14:54 ` [PATCH 1/2] xen: add LZ4 " Jan Beulich
@ 2013-09-23 14:56 ` Jan Beulich
  2013-09-23 15:00   ` Ian Campbell
  2013-09-30 11:40 ` [PATCH v3 0/2] add LZ4 kernel " Jan Beulich
  2 siblings, 1 reply; 29+ messages in thread
From: Jan Beulich @ 2013-09-23 14:56 UTC (permalink / raw)
  To: xen-devel; +Cc: yann.collet.73, Keir Fraser, Ian Jackson, Ian Campbell

[-- Attachment #1: Type: text/plain, Size: 4653 bytes --]

Since there's no shared or static library to link against, this simply
re-uses the hypervisor side code. However, I only audited the code
added here for possible security issues, not the referenced code in
the hypervisor tree.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
I intentionally retained the tab indentation in the code cloned from
its hypervisor original (which in turn retained it as being a clone
of the Linux original), to ease diff-ing. Perhaps this should therefore
be considered RFC.

--- a/tools/libxc/xc_dom_bzimageloader.c
+++ b/tools/libxc/xc_dom_bzimageloader.c
@@ -573,6 +573,124 @@ int xc_try_xz_decode(struct xc_dom_image
 
 #endif /* !__MINIOS__ */
 
+#define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
+#define STATIC static
+#define u8 uint8_t
+#define u16 uint16_t
+#define u32 uint32_t
+#define u64 uint64_t
+#define INIT
+#define unlikely(x) (x)
+
+static inline uint_fast16_t le16_to_cpup(const unsigned char *buf)
+{
+    return buf[0] | (buf[1] << 8);
+}
+
+static inline uint_fast32_t le32_to_cpup(const unsigned char *buf)
+{
+    return le16_to_cpup(buf) | ((uint32_t)le16_to_cpup(buf + 2) << 16);
+}
+
+#include "../../xen/common/lz4/decompress.c"
+
+/*
+ * Note: Uncompressed chunk size is used in the compressor side
+ * (userspace side for compression).
+ * It is hardcoded because there is not proper way to extract it
+ * from the binary stream which is generated by the preliminary
+ * version of LZ4 tool so far.
+ */
+#define LZ4_DEFAULT_UNCOMPRESSED_CHUNK_SIZE (8 << 20)
+#define ARCHIVE_MAGICNUMBER 0x184C2102
+
+static int xc_try_lz4_decode(
+	struct xc_dom_image *dom, void **blob, size_t *psize)
+{
+	int ret = -1;
+	size_t uncomp_chunksize = LZ4_DEFAULT_UNCOMPRESSED_CHUNK_SIZE;
+	unsigned char *inp = *blob, *output, *outp;
+	ssize_t size = *psize - 4;
+	size_t out_size, out_len, dest_len, chunksize;
+	const char *msg;
+
+	if (size < 4) {
+		msg = "input too small";
+		goto exit_0;
+	}
+
+	out_size = out_len = get_unaligned_le32(inp + size);
+	if (xc_dom_kernel_check_size(dom, out_len)) {
+		msg = "Decompressed image too large";
+		goto exit_0;
+	}
+
+	output = malloc(out_len);
+	if (!output) {
+		msg = "Could not allocate output buffer";
+		goto exit_0;
+	}
+	outp = output;
+
+	chunksize = get_unaligned_le32(inp);
+	if (chunksize == ARCHIVE_MAGICNUMBER) {
+		inp += 4;
+		size -= 4;
+	} else {
+		msg = "invalid header";
+		goto exit_2;
+	}
+
+	for (;;) {
+		if (size < 4) {
+			msg = "missing data";
+			goto exit_2;
+		}
+		chunksize = get_unaligned_le32(inp);
+		if (chunksize == ARCHIVE_MAGICNUMBER) {
+			inp += 4;
+			size -= 4;
+			continue;
+		}
+		inp += 4;
+		size -= 4;
+
+		if (out_len >= uncomp_chunksize) {
+			dest_len = uncomp_chunksize;
+			out_len -= dest_len;
+		} else
+			dest_len = out_len;
+		ret = lz4_decompress(inp, &chunksize, outp, dest_len);
+		if (ret < 0) {
+			msg = "decoding failed";
+			goto exit_2;
+		}
+
+		outp += dest_len;
+		size -= chunksize;
+
+		if (size == 0)
+		{
+			*blob = output;
+			*psize = out_size;
+			return 0;
+		}
+
+		if (size < 0) {
+			msg = "data corrupted";
+			goto exit_2;
+		}
+
+		inp += chunksize;
+	}
+
+exit_2:
+	free(output);
+	DOMPRINTF("LZ4 decompression error: %s\n", msg);
+exit_0:
+	return ret;
+}
+
 struct setup_header {
     uint8_t  _pad0[0x1f1];  /* skip uninteresting stuff */
     uint8_t  setup_sects;
@@ -733,6 +851,17 @@ static int xc_dom_probe_bzimage_kernel(s
             return -EINVAL;
         }
     }
+    else if ( check_magic(dom, "\x02\x21", 2) )
+    {
+        ret = xc_try_lz4_decode(dom, &dom->kernel_blob, &dom->kernel_size);
+        if ( ret < 0 )
+        {
+            xc_dom_panic(dom->xch, XC_INVALID_KERNEL,
+                         "%s unable to LZ4 decompress kernel\n",
+                         __FUNCTION__);
+            return -EINVAL;
+        }
+    }
     else
     {
         xc_dom_panic(dom->xch, XC_INVALID_KERNEL,
--- a/xen/common/lz4/decompress.c
+++ b/xen/common/lz4/decompress.c
@@ -151,7 +151,7 @@ _output_error:
 	return (int) (-(((unsigned char *)ip) - source));
 }
 
-#ifndef __XEN__
+#if !defined(__XEN__) && !defined(__XEN_TOOLS__)
 static int lz4_uncompress_unknownoutputsize(const char *source, char *dest,
 				int isize, size_t maxoutputsize)
 {
@@ -294,7 +294,7 @@ exit_0:
 	return ret;
 }
 
-#ifndef __XEN__
+#if !defined(__XEN__) && !defined(__XEN_TOOLS__)
 int lz4_decompress_unknownoutputsize(const char *src, size_t src_len,
 		char *dest, size_t *dest_len)
 {



[-- Attachment #2: unlz4-domU.patch --]
[-- Type: text/plain, Size: 4639 bytes --]

libxc: add LZ4 decompression support

Since there's no shared or static library to link against, this simply
re-uses the hypervisor side code. However, I only audited the code
added here for possible security issues, not the referenced code in
the hypervisor tree.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
I intentionally retained the tab indentation in the code cloned from
its hypervisor original (which in turn retained it as being a clone
of the Linux original), to ease diff-ing.

--- a/tools/libxc/xc_dom_bzimageloader.c
+++ b/tools/libxc/xc_dom_bzimageloader.c
@@ -573,6 +573,124 @@ int xc_try_xz_decode(struct xc_dom_image
 
 #endif /* !__MINIOS__ */
 
+#define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
+#define STATIC static
+#define u8 uint8_t
+#define u16 uint16_t
+#define u32 uint32_t
+#define u64 uint64_t
+#define INIT
+#define unlikely(x) (x)
+
+static inline uint_fast16_t le16_to_cpup(const unsigned char *buf)
+{
+    return buf[0] | (buf[1] << 8);
+}
+
+static inline uint_fast32_t le32_to_cpup(const unsigned char *buf)
+{
+    return le16_to_cpup(buf) | ((uint32_t)le16_to_cpup(buf + 2) << 16);
+}
+
+#include "../../xen/common/lz4/decompress.c"
+
+/*
+ * Note: Uncompressed chunk size is used in the compressor side
+ * (userspace side for compression).
+ * It is hardcoded because there is not proper way to extract it
+ * from the binary stream which is generated by the preliminary
+ * version of LZ4 tool so far.
+ */
+#define LZ4_DEFAULT_UNCOMPRESSED_CHUNK_SIZE (8 << 20)
+#define ARCHIVE_MAGICNUMBER 0x184C2102
+
+static int xc_try_lz4_decode(
+	struct xc_dom_image *dom, void **blob, size_t *psize)
+{
+	int ret = -1;
+	size_t uncomp_chunksize = LZ4_DEFAULT_UNCOMPRESSED_CHUNK_SIZE;
+	unsigned char *inp = *blob, *output, *outp;
+	ssize_t size = *psize - 4;
+	size_t out_size, out_len, dest_len, chunksize;
+	const char *msg;
+
+	if (size < 4) {
+		msg = "input too small";
+		goto exit_0;
+	}
+
+	out_size = out_len = get_unaligned_le32(inp + size);
+	if (xc_dom_kernel_check_size(dom, out_len)) {
+		msg = "Decompressed image too large";
+		goto exit_0;
+	}
+
+	output = malloc(out_len);
+	if (!output) {
+		msg = "Could not allocate output buffer";
+		goto exit_0;
+	}
+	outp = output;
+
+	chunksize = get_unaligned_le32(inp);
+	if (chunksize == ARCHIVE_MAGICNUMBER) {
+		inp += 4;
+		size -= 4;
+	} else {
+		msg = "invalid header";
+		goto exit_2;
+	}
+
+	for (;;) {
+		if (size < 4) {
+			msg = "missing data";
+			goto exit_2;
+		}
+		chunksize = get_unaligned_le32(inp);
+		if (chunksize == ARCHIVE_MAGICNUMBER) {
+			inp += 4;
+			size -= 4;
+			continue;
+		}
+		inp += 4;
+		size -= 4;
+
+		if (out_len >= uncomp_chunksize) {
+			dest_len = uncomp_chunksize;
+			out_len -= dest_len;
+		} else
+			dest_len = out_len;
+		ret = lz4_decompress(inp, &chunksize, outp, dest_len);
+		if (ret < 0) {
+			msg = "decoding failed";
+			goto exit_2;
+		}
+
+		outp += dest_len;
+		size -= chunksize;
+
+		if (size == 0)
+		{
+			*blob = output;
+			*psize = out_size;
+			return 0;
+		}
+
+		if (size < 0) {
+			msg = "data corrupted";
+			goto exit_2;
+		}
+
+		inp += chunksize;
+	}
+
+exit_2:
+	free(output);
+	DOMPRINTF("LZ4 decompression error: %s\n", msg);
+exit_0:
+	return ret;
+}
+
 struct setup_header {
     uint8_t  _pad0[0x1f1];  /* skip uninteresting stuff */
     uint8_t  setup_sects;
@@ -733,6 +851,17 @@ static int xc_dom_probe_bzimage_kernel(s
             return -EINVAL;
         }
     }
+    else if ( check_magic(dom, "\x02\x21", 2) )
+    {
+        ret = xc_try_lz4_decode(dom, &dom->kernel_blob, &dom->kernel_size);
+        if ( ret < 0 )
+        {
+            xc_dom_panic(dom->xch, XC_INVALID_KERNEL,
+                         "%s unable to LZ4 decompress kernel\n",
+                         __FUNCTION__);
+            return -EINVAL;
+        }
+    }
     else
     {
         xc_dom_panic(dom->xch, XC_INVALID_KERNEL,
--- a/xen/common/lz4/decompress.c
+++ b/xen/common/lz4/decompress.c
@@ -151,7 +151,7 @@ _output_error:
 	return (int) (-(((unsigned char *)ip) - source));
 }
 
-#ifndef __XEN__
+#if !defined(__XEN__) && !defined(__XEN_TOOLS__)
 static int lz4_uncompress_unknownoutputsize(const char *source, char *dest,
 				int isize, size_t maxoutputsize)
 {
@@ -294,7 +294,7 @@ exit_0:
 	return ret;
 }
 
-#ifndef __XEN__
+#if !defined(__XEN__) && !defined(__XEN_TOOLS__)
 int lz4_decompress_unknownoutputsize(const char *src, size_t src_len,
 		char *dest, size_t *dest_len)
 {

[-- Attachment #3: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] libxc: add LZ4 decompression support
  2013-09-23 14:56 ` [PATCH 2/2] libxc: " Jan Beulich
@ 2013-09-23 15:00   ` Ian Campbell
  2013-09-23 15:22     ` Jan Beulich
  0 siblings, 1 reply; 29+ messages in thread
From: Ian Campbell @ 2013-09-23 15:00 UTC (permalink / raw)
  To: Jan Beulich; +Cc: yann.collet.73, xen-devel, Keir Fraser, Ian Jackson

On Mon, 2013-09-23 at 15:56 +0100, Jan Beulich wrote:
> Since there's no shared or static library to link against, this simply
> re-uses the hypervisor side code. However, I only audited the code
> added here for possible security issues, not the referenced code in
> the hypervisor tree.

With that in mind, you aren't adding a
tools/libxc/xc_dom_decompress_unsafe_lz4.c which would enable this stuff
in pvgrub domains.

> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> I intentionally retained the tab indentation in the code cloned from
> its hypervisor original (which in turn retained it as being a clone
> of the Linux original), to ease diff-ing. Perhaps this should therefore
> be considered RFC.

That's fine IMHO.

> 
> --- a/tools/libxc/xc_dom_bzimageloader.c
> +++ b/tools/libxc/xc_dom_bzimageloader.c
> @@ -573,6 +573,124 @@ int xc_try_xz_decode(struct xc_dom_image
>  
>  #endif /* !__MINIOS__ */
>  
> +#define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
> +#define STATIC static
> +#define u8 uint8_t
> +#define u16 uint16_t
> +#define u32 uint32_t
> +#define u64 uint64_t
> +#define INIT
> +#define unlikely(x) (x)
> +
> +static inline uint_fast16_t le16_to_cpup(const unsigned char *buf)
> +{
> +    return buf[0] | (buf[1] << 8);
> +}
> +
> +static inline uint_fast32_t le32_to_cpup(const unsigned char *buf)
> +{
> +    return le16_to_cpup(buf) | ((uint32_t)le16_to_cpup(buf + 2) << 16);
> +}
> +
> +#include "../../xen/common/lz4/decompress.c"
> +
> +/*
> + * Note: Uncompressed chunk size is used in the compressor side
> + * (userspace side for compression).
> + * It is hardcoded because there is not proper way to extract it
> + * from the binary stream which is generated by the preliminary
> + * version of LZ4 tool so far.
> + */
> +#define LZ4_DEFAULT_UNCOMPRESSED_CHUNK_SIZE (8 << 20)
> +#define ARCHIVE_MAGICNUMBER 0x184C2102
> +
> +static int xc_try_lz4_decode(
> +	struct xc_dom_image *dom, void **blob, size_t *psize)
> +{
> +	int ret = -1;
> +	size_t uncomp_chunksize = LZ4_DEFAULT_UNCOMPRESSED_CHUNK_SIZE;
> +	unsigned char *inp = *blob, *output, *outp;
> +	ssize_t size = *psize - 4;
> +	size_t out_size, out_len, dest_len, chunksize;
> +	const char *msg;
> +
> +	if (size < 4) {
> +		msg = "input too small";
> +		goto exit_0;
> +	}
> +
> +	out_size = out_len = get_unaligned_le32(inp + size);
> +	if (xc_dom_kernel_check_size(dom, out_len)) {
> +		msg = "Decompressed image too large";
> +		goto exit_0;
> +	}
> +
> +	output = malloc(out_len);
> +	if (!output) {
> +		msg = "Could not allocate output buffer";
> +		goto exit_0;
> +	}
> +	outp = output;
> +
> +	chunksize = get_unaligned_le32(inp);
> +	if (chunksize == ARCHIVE_MAGICNUMBER) {
> +		inp += 4;
> +		size -= 4;
> +	} else {
> +		msg = "invalid header";
> +		goto exit_2;
> +	}
> +
> +	for (;;) {
> +		if (size < 4) {
> +			msg = "missing data";
> +			goto exit_2;
> +		}
> +		chunksize = get_unaligned_le32(inp);
> +		if (chunksize == ARCHIVE_MAGICNUMBER) {
> +			inp += 4;
> +			size -= 4;
> +			continue;
> +		}
> +		inp += 4;
> +		size -= 4;
> +
> +		if (out_len >= uncomp_chunksize) {
> +			dest_len = uncomp_chunksize;
> +			out_len -= dest_len;
> +		} else
> +			dest_len = out_len;
> +		ret = lz4_decompress(inp, &chunksize, outp, dest_len);
> +		if (ret < 0) {
> +			msg = "decoding failed";
> +			goto exit_2;
> +		}
> +
> +		outp += dest_len;
> +		size -= chunksize;
> +
> +		if (size == 0)
> +		{
> +			*blob = output;
> +			*psize = out_size;
> +			return 0;
> +		}
> +
> +		if (size < 0) {
> +			msg = "data corrupted";
> +			goto exit_2;
> +		}
> +
> +		inp += chunksize;
> +	}
> +
> +exit_2:
> +	free(output);
> +	DOMPRINTF("LZ4 decompression error: %s\n", msg);
> +exit_0:
> +	return ret;
> +}
> +
>  struct setup_header {
>      uint8_t  _pad0[0x1f1];  /* skip uninteresting stuff */
>      uint8_t  setup_sects;
> @@ -733,6 +851,17 @@ static int xc_dom_probe_bzimage_kernel(s
>              return -EINVAL;
>          }
>      }
> +    else if ( check_magic(dom, "\x02\x21", 2) )
> +    {
> +        ret = xc_try_lz4_decode(dom, &dom->kernel_blob, &dom->kernel_size);
> +        if ( ret < 0 )
> +        {
> +            xc_dom_panic(dom->xch, XC_INVALID_KERNEL,
> +                         "%s unable to LZ4 decompress kernel\n",
> +                         __FUNCTION__);
> +            return -EINVAL;
> +        }
> +    }
>      else
>      {
>          xc_dom_panic(dom->xch, XC_INVALID_KERNEL,
> --- a/xen/common/lz4/decompress.c
> +++ b/xen/common/lz4/decompress.c
> @@ -151,7 +151,7 @@ _output_error:
>  	return (int) (-(((unsigned char *)ip) - source));
>  }
>  
> -#ifndef __XEN__
> +#if !defined(__XEN__) && !defined(__XEN_TOOLS__)
>  static int lz4_uncompress_unknownoutputsize(const char *source, char *dest,
>  				int isize, size_t maxoutputsize)
>  {
> @@ -294,7 +294,7 @@ exit_0:
>  	return ret;
>  }
>  
> -#ifndef __XEN__
> +#if !defined(__XEN__) && !defined(__XEN_TOOLS__)
>  int lz4_decompress_unknownoutputsize(const char *src, size_t src_len,
>  		char *dest, size_t *dest_len)
>  {
> 
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/2] xen: add LZ4 decompression support
  2013-09-23 14:54 ` [PATCH 1/2] xen: add LZ4 " Jan Beulich
@ 2013-09-23 15:00   ` Ian Campbell
  2013-09-23 15:10     ` Jan Beulich
  0 siblings, 1 reply; 29+ messages in thread
From: Ian Campbell @ 2013-09-23 15:00 UTC (permalink / raw)
  To: Jan Beulich; +Cc: yann.collet.73, xen-devel, Keir Fraser, Ian Jackson

On Mon, 2013-09-23 at 15:54 +0100, Jan Beulich wrote:
> Add support for LZ4 decompression in Xen. LZ4 Decompression APIs for
> Xen are based on LZ4 implementation by Yann Collet.
> 
> Benchmark Results(PATCH v3)
> Compiler: Linaro ARM gcc 4.6.2
> 
> 1. ARMv7, 1.5GHz based board
> [...]
> 2. ARMv7, 1.7GHz based board

BTW Xen doesn't use these decompressors on ARM, the kernel self extracts
in guest context. I see you aren't adding this to the ARM build anyway,
just thought I'd explain ;-)

[...]
> Russell King said that we should have:
> 
>  - one decompressor which is the fastest
>  - one decompressor for the highest compression ratio
>  - one popular decompressor (eg conventional gzip)
> 
> If we have a replacement one for one of these, then it should do exactly
> that: replace it.

What a shame this logic doesn't apply to Xen's use too since we need to
support old kernels :-(

Ian.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/2] xen: add LZ4 decompression support
  2013-09-23 15:00   ` Ian Campbell
@ 2013-09-23 15:10     ` Jan Beulich
  0 siblings, 0 replies; 29+ messages in thread
From: Jan Beulich @ 2013-09-23 15:10 UTC (permalink / raw)
  To: Ian Campbell; +Cc: yann.collet.73, xen-devel, Keir Fraser, Ian Jackson

>>> On 23.09.13 at 17:00, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Mon, 2013-09-23 at 15:54 +0100, Jan Beulich wrote:
>> Add support for LZ4 decompression in Xen. LZ4 Decompression APIs for
>> Xen are based on LZ4 implementation by Yann Collet.
>> 
>> Benchmark Results(PATCH v3)
>> Compiler: Linaro ARM gcc 4.6.2
>> 
>> 1. ARMv7, 1.5GHz based board
>> [...]
>> 2. ARMv7, 1.7GHz based board
> 
> BTW Xen doesn't use these decompressors on ARM, the kernel self extracts
> in guest context. I see you aren't adding this to the ARM build anyway,
> just thought I'd explain ;-)

Yeah, I just copied the Linux commit message mostly verbatim.

>> Russell King said that we should have:
>> 
>>  - one decompressor which is the fastest
>>  - one decompressor for the highest compression ratio
>>  - one popular decompressor (eg conventional gzip)
>> 
>> If we have a replacement one for one of these, then it should do exactly
>> that: replace it.
> 
> What a shame this logic doesn't apply to Xen's use too since we need to
> support old kernels :-(

And they didn't rip out the LZO one either...

Jan

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] libxc: add LZ4 decompression support
  2013-09-23 15:00   ` Ian Campbell
@ 2013-09-23 15:22     ` Jan Beulich
  2013-09-24  9:34       ` Ian Campbell
  0 siblings, 1 reply; 29+ messages in thread
From: Jan Beulich @ 2013-09-23 15:22 UTC (permalink / raw)
  To: Ian Campbell; +Cc: yann.collet.73, xen-devel, Keir Fraser, Ian Jackson

>>> On 23.09.13 at 17:00, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Mon, 2013-09-23 at 15:56 +0100, Jan Beulich wrote:
>> Since there's no shared or static library to link against, this simply
>> re-uses the hypervisor side code. However, I only audited the code
>> added here for possible security issues, not the referenced code in
>> the hypervisor tree.
> 
> With that in mind, you aren't adding a
> tools/libxc/xc_dom_decompress_unsafe_lz4.c which would enable this stuff
> in pvgrub domains.

I'd prefer to leave this to someone who knows what he's doing;
I don't think I even know when the respective build portion gets
enabled. Does this hang off the stubdom build (which I try to
avoid because of the big number of build warnings it produces,
all giving the impression that I'm not doing this right)? Despite it
being guarded by __MINIOS__ checks, I don't think the mini-os
build alone triggers this.

And then again I specifically placed the newly added code outside
the __MINIOS__ guarded region, as from looking at the code I
concluded that these "unsafe" decompressors do nothing else
than what I did here: Include the hypervisor sources. So I was
sort of hoping that the extra stuff wouldn't even be necessary
here.

Jan

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] libxc: add LZ4 decompression support
  2013-09-23 15:22     ` Jan Beulich
@ 2013-09-24  9:34       ` Ian Campbell
  2013-09-24  9:49         ` Jan Beulich
  2013-09-24 10:05         ` Ian Campbell
  0 siblings, 2 replies; 29+ messages in thread
From: Ian Campbell @ 2013-09-24  9:34 UTC (permalink / raw)
  To: Jan Beulich; +Cc: yann.collet.73, xen-devel, Keir Fraser, Ian Jackson

On Mon, 2013-09-23 at 16:22 +0100, Jan Beulich wrote:
> >>> On 23.09.13 at 17:00, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> > On Mon, 2013-09-23 at 15:56 +0100, Jan Beulich wrote:
> >> Since there's no shared or static library to link against, this simply
> >> re-uses the hypervisor side code. However, I only audited the code
> >> added here for possible security issues, not the referenced code in
> >> the hypervisor tree.
> > 
> > With that in mind, you aren't adding a
> > tools/libxc/xc_dom_decompress_unsafe_lz4.c which would enable this stuff
> > in pvgrub domains.
> 
> I'd prefer to leave this to someone who knows what he's doing;
> I don't think I even know when the respective build portion gets
> enabled. Does this hang off the stubdom build

Yes.

>  (which I try to
> avoid because of the big number of build warnings it produces,
> all giving the impression that I'm not doing this right)?

The warnings are "normal". I think they mostly come from the third party
stuff which stubdom builds.

>  Despite it
> being guarded by __MINIOS__ checks, I don't think the mini-os
> build alone triggers this.

No, you would need one of the stubdoms which enables libxc.

> And then again I specifically placed the newly added code outside
> the __MINIOS__ guarded region, as from looking at the code I
> concluded that these "unsafe" decompressors do nothing else
> than what I did here: Include the hypervisor sources. So I was
> sort of hoping that the extra stuff wouldn't even be necessary
> here.

I'll give it a go and see what happens...

Ian.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] libxc: add LZ4 decompression support
  2013-09-24  9:34       ` Ian Campbell
@ 2013-09-24  9:49         ` Jan Beulich
  2013-09-24 10:01           ` Ian Campbell
  2013-09-24 10:05         ` Ian Campbell
  1 sibling, 1 reply; 29+ messages in thread
From: Jan Beulich @ 2013-09-24  9:49 UTC (permalink / raw)
  To: Ian Campbell; +Cc: yann.collet.73, xen-devel, Keir Fraser, Ian Jackson

>>> On 24.09.13 at 11:34, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Mon, 2013-09-23 at 16:22 +0100, Jan Beulich wrote:
>>  (which I try to
>> avoid because of the big number of build warnings it produces,
>> all giving the impression that I'm not doing this right)?
> 
> The warnings are "normal". I think they mostly come from the third party
> stuff which stubdom builds.

I realize they're "normal", but that way "not normal" ones are
generally hidden in the noise of all the existing ones, which
results in a build that I personally don't have much faith in (i.e.
seeing a successful build doesn't really mean much).

Jan

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] libxc: add LZ4 decompression support
  2013-09-24  9:49         ` Jan Beulich
@ 2013-09-24 10:01           ` Ian Campbell
  2013-09-24 10:17             ` Jan Beulich
  0 siblings, 1 reply; 29+ messages in thread
From: Ian Campbell @ 2013-09-24 10:01 UTC (permalink / raw)
  To: Jan Beulich; +Cc: yann.collet.73, xen-devel, Keir Fraser, Ian Jackson

On Tue, 2013-09-24 at 10:49 +0100, Jan Beulich wrote:
> >>> On 24.09.13 at 11:34, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> > On Mon, 2013-09-23 at 16:22 +0100, Jan Beulich wrote:
> >>  (which I try to
> >> avoid because of the big number of build warnings it produces,
> >> all giving the impression that I'm not doing this right)?
> > 
> > The warnings are "normal". I think they mostly come from the third party
> > stuff which stubdom builds.
> 
> I realize they're "normal", but that way "not normal" ones are
> generally hidden in the noise of all the existing ones, which
> results in a build that I personally don't have much faith in (i.e.
> seeing a successful build doesn't really mean much).

Under the circumstances I'd normally diff the before and after logs.

Ian.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] libxc: add LZ4 decompression support
  2013-09-24  9:34       ` Ian Campbell
  2013-09-24  9:49         ` Jan Beulich
@ 2013-09-24 10:05         ` Ian Campbell
  2013-09-24 15:02           ` Jan Beulich
  1 sibling, 1 reply; 29+ messages in thread
From: Ian Campbell @ 2013-09-24 10:05 UTC (permalink / raw)
  To: Jan Beulich; +Cc: yann.collet.73, xen-devel, Keir Fraser, Ian Jackson

On Tue, 2013-09-24 at 10:34 +0100, Ian Campbell wrote:
> I'll give it a go and see what happens...

gcc [...] -c -o xc_dom_bzimageloader.o xc_dom_bzimageloader.c 
In file included from ../../xen/common/lz4/decompress.c:40:0,
                 from xc_dom_bzimageloader.c:595:
../../xen/common/lz4/defs.h:26:27: fatal error: asm/unaligned.h: No such file or directory
compilation terminated.
make[2]: *** [xc_dom_bzimageloader.o] Error 1
make[2]: Leaving directory `/local/scratch/ianc/devel/xen.git/stubdom/libxc-x86_64'
make[1]: *** [build] Error 2
make[1]: Leaving directory `/local/scratch/ianc/devel/xen.git/stubdom/libxc-x86_64'
make: *** [libxc-x86_64/libxenctrl.a] Error 2
make: Leaving directory `/local/scratch/ianc/devel/xen.git/stubdom'

This incremental hunk fixes it.  xen/common/unlzo.c uses an uglier #if
1, but this seems correct to me.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>

Feel free to fold it in.

diff --git a/xen/common/lz4/defs.h b/xen/common/lz4/defs.h
index f46df08..dca9519 100644
--- a/xen/common/lz4/defs.h
+++ b/xen/common/lz4/defs.h
@@ -12,6 +12,10 @@
 #include <asm/byteorder.h>
 #endif
 
+#ifdef __MINIOS__
+#include <mini-os/byteorder.h>
+#endif
+
 #ifdef __LITTLE_ENDIAN
 static inline u16 INIT get_unaligned_le16(const void *p)
 {

With that and a hack to not inline xc_try_lz4_decode I see the expected
symbols in stubdom/mini-os-x86_64-grub/mini-os (I've not actually run
it)

BTW, I noticed that xc_try_lz4_decode does not prefix its errors with
"LZ4" like most of the other such functions do.

Ian.

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] libxc: add LZ4 decompression support
  2013-09-24 10:01           ` Ian Campbell
@ 2013-09-24 10:17             ` Jan Beulich
  2013-09-24 10:18               ` Ian Campbell
  0 siblings, 1 reply; 29+ messages in thread
From: Jan Beulich @ 2013-09-24 10:17 UTC (permalink / raw)
  To: Ian Campbell; +Cc: yann.collet.73, xen-devel, Keir Fraser, Ian Jackson

>>> On 24.09.13 at 12:01, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Tue, 2013-09-24 at 10:49 +0100, Jan Beulich wrote:
>> >>> On 24.09.13 at 11:34, Ian Campbell <Ian.Campbell@citrix.com> wrote:
>> > On Mon, 2013-09-23 at 16:22 +0100, Jan Beulich wrote:
>> >>  (which I try to
>> >> avoid because of the big number of build warnings it produces,
>> >> all giving the impression that I'm not doing this right)?
>> > 
>> > The warnings are "normal". I think they mostly come from the third party
>> > stuff which stubdom builds.
>> 
>> I realize they're "normal", but that way "not normal" ones are
>> generally hidden in the noise of all the existing ones, which
>> results in a build that I personally don't have much faith in (i.e.
>> seeing a successful build doesn't really mean much).
> 
> Under the circumstances I'd normally diff the before and after logs.

With parallel builds that usually isn't very helpful.

Jan

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] libxc: add LZ4 decompression support
  2013-09-24 10:17             ` Jan Beulich
@ 2013-09-24 10:18               ` Ian Campbell
  0 siblings, 0 replies; 29+ messages in thread
From: Ian Campbell @ 2013-09-24 10:18 UTC (permalink / raw)
  To: Jan Beulich; +Cc: yann.collet.73, xen-devel, Keir Fraser, Ian Jackson

On Tue, 2013-09-24 at 11:17 +0100, Jan Beulich wrote:
> >>> On 24.09.13 at 12:01, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> > On Tue, 2013-09-24 at 10:49 +0100, Jan Beulich wrote:
> >> >>> On 24.09.13 at 11:34, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> >> > On Mon, 2013-09-23 at 16:22 +0100, Jan Beulich wrote:
> >> >>  (which I try to
> >> >> avoid because of the big number of build warnings it produces,
> >> >> all giving the impression that I'm not doing this right)?
> >> > 
> >> > The warnings are "normal". I think they mostly come from the third party
> >> > stuff which stubdom builds.
> >> 
> >> I realize they're "normal", but that way "not normal" ones are
> >> generally hidden in the noise of all the existing ones, which
> >> results in a build that I personally don't have much faith in (i.e.
> >> seeing a successful build doesn't really mean much).
> > 
> > Under the circumstances I'd normally diff the before and after logs.
> 
> With parallel builds that usually isn't very helpful.

I usually either don't do a parallel build in such cases or I sort the
log first, which is enough to spot any new message...

Ian.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] libxc: add LZ4 decompression support
  2013-09-24 10:05         ` Ian Campbell
@ 2013-09-24 15:02           ` Jan Beulich
  2013-09-24 16:29             ` [PATCH v2] " Jan Beulich
  0 siblings, 1 reply; 29+ messages in thread
From: Jan Beulich @ 2013-09-24 15:02 UTC (permalink / raw)
  To: Ian Campbell; +Cc: yann.collet.73, xen-devel, Keir Fraser, Ian Jackson

>>> On 24.09.13 at 12:05, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Tue, 2013-09-24 at 10:34 +0100, Ian Campbell wrote:
>> I'll give it a go and see what happens...
> 
> gcc [...] -c -o xc_dom_bzimageloader.o xc_dom_bzimageloader.c 
> In file included from ../../xen/common/lz4/decompress.c:40:0,
>                  from xc_dom_bzimageloader.c:595:
> ../../xen/common/lz4/defs.h:26:27: fatal error: asm/unaligned.h: No such 
> file or directory
> compilation terminated.
> make[2]: *** [xc_dom_bzimageloader.o] Error 1
> make[2]: Leaving directory 
> `/local/scratch/ianc/devel/xen.git/stubdom/libxc-x86_64'
> make[1]: *** [build] Error 2
> make[1]: Leaving directory 
> `/local/scratch/ianc/devel/xen.git/stubdom/libxc-x86_64'
> make: *** [libxc-x86_64/libxenctrl.a] Error 2
> make: Leaving directory `/local/scratch/ianc/devel/xen.git/stubdom'
> 
> This incremental hunk fixes it.  xen/common/unlzo.c uses an uglier #if
> 1, but this seems correct to me.
> 
> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
> 
> Feel free to fold it in.

Done, albeit not in the xen/ subtree but instead in
tools/libxc/xc_dom_bzimageloader.c itself. Thanks for helping
out with this!

> --- a/xen/common/lz4/defs.h
> +++ b/xen/common/lz4/defs.h
> @@ -12,6 +12,10 @@
>  #include <asm/byteorder.h>
>  #endif
>  
> +#ifdef __MINIOS__
> +#include <mini-os/byteorder.h>
> +#endif
> +
>  #ifdef __LITTLE_ENDIAN
>  static inline u16 INIT get_unaligned_le16(const void *p)
>  {
> 
> With that and a hack to not inline xc_try_lz4_decode I see the expected
> symbols in stubdom/mini-os-x86_64-grub/mini-os (I've not actually run
> it)
> 
> BTW, I noticed that xc_try_lz4_decode does not prefix its errors with
> "LZ4" like most of the other such functions do.

Just like the LZO code it simply stores a message in a local variable
and has a single point where DOMPRINTF() gets used, and that
single point does have a proper prefix.

Jan

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v2] libxc: add LZ4 decompression support
  2013-09-24 15:02           ` Jan Beulich
@ 2013-09-24 16:29             ` Jan Beulich
  2013-09-24 16:53               ` Ian Campbell
  0 siblings, 1 reply; 29+ messages in thread
From: Jan Beulich @ 2013-09-24 16:29 UTC (permalink / raw)
  To: xen-devel; +Cc: yann.collet.73, Keir Fraser, Ian Jackson, Ian Campbell

[-- Attachment #1: Type: text/plain, Size: 5125 bytes --]

Since there's no shared or static library to link against, this simply
re-uses the hypervisor side code. However, I only audited the code
added here for possible security issues, not the referenced code in
the hypervisor tree.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: Fold in the addition of a previously missing include in the Mini-OS
    case - provided by Ian Campbell - to allow stubdom compilation to
    succeed.
---
I intentionally retained the tab indentation in the code cloned from
its hypervisor original (which in turn retained it as being a clone
of the Linux original), to ease diff-ing.

--- a/tools/libxc/xc_dom_bzimageloader.c
+++ b/tools/libxc/xc_dom_bzimageloader.c
@@ -566,6 +566,8 @@ static int xc_try_lzo1x_decode(
 
 #else /* __MINIOS__ */
 
+#include <mini-os/byteorder.h>
+
 int xc_try_bzip2_decode(struct xc_dom_image *dom, void **blob, size_t *size);
 int xc_try_lzma_decode(struct xc_dom_image *dom, void **blob, size_t *size);
 int xc_try_lzo1x_decode(struct xc_dom_image *dom, void **blob, size_t *size);
@@ -573,6 +575,124 @@ int xc_try_xz_decode(struct xc_dom_image
 
 #endif /* !__MINIOS__ */
 
+#define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
+#define STATIC static
+#define u8 uint8_t
+#define u16 uint16_t
+#define u32 uint32_t
+#define u64 uint64_t
+#define INIT
+#define unlikely(x) (x)
+
+static inline uint_fast16_t le16_to_cpup(const unsigned char *buf)
+{
+    return buf[0] | (buf[1] << 8);
+}
+
+static inline uint_fast32_t le32_to_cpup(const unsigned char *buf)
+{
+    return le16_to_cpup(buf) | ((uint32_t)le16_to_cpup(buf + 2) << 16);
+}
+
+#include "../../xen/common/lz4/decompress.c"
+
+/*
+ * Note: Uncompressed chunk size is used in the compressor side
+ * (userspace side for compression).
+ * It is hardcoded because there is not proper way to extract it
+ * from the binary stream which is generated by the preliminary
+ * version of LZ4 tool so far.
+ */
+#define LZ4_DEFAULT_UNCOMPRESSED_CHUNK_SIZE (8 << 20)
+#define ARCHIVE_MAGICNUMBER 0x184C2102
+
+static int xc_try_lz4_decode(
+	struct xc_dom_image *dom, void **blob, size_t *psize)
+{
+	int ret = -1;
+	size_t uncomp_chunksize = LZ4_DEFAULT_UNCOMPRESSED_CHUNK_SIZE;
+	unsigned char *inp = *blob, *output, *outp;
+	ssize_t size = *psize - 4;
+	size_t out_size, out_len, dest_len, chunksize;
+	const char *msg;
+
+	if (size < 4) {
+		msg = "input too small";
+		goto exit_0;
+	}
+
+	out_size = out_len = get_unaligned_le32(inp + size);
+	if (xc_dom_kernel_check_size(dom, out_len)) {
+		msg = "Decompressed image too large";
+		goto exit_0;
+	}
+
+	output = malloc(out_len);
+	if (!output) {
+		msg = "Could not allocate output buffer";
+		goto exit_0;
+	}
+	outp = output;
+
+	chunksize = get_unaligned_le32(inp);
+	if (chunksize == ARCHIVE_MAGICNUMBER) {
+		inp += 4;
+		size -= 4;
+	} else {
+		msg = "invalid header";
+		goto exit_2;
+	}
+
+	for (;;) {
+		if (size < 4) {
+			msg = "missing data";
+			goto exit_2;
+		}
+		chunksize = get_unaligned_le32(inp);
+		if (chunksize == ARCHIVE_MAGICNUMBER) {
+			inp += 4;
+			size -= 4;
+			continue;
+		}
+		inp += 4;
+		size -= 4;
+
+		if (out_len >= uncomp_chunksize) {
+			dest_len = uncomp_chunksize;
+			out_len -= dest_len;
+		} else
+			dest_len = out_len;
+		ret = lz4_decompress(inp, &chunksize, outp, dest_len);
+		if (ret < 0) {
+			msg = "decoding failed";
+			goto exit_2;
+		}
+
+		outp += dest_len;
+		size -= chunksize;
+
+		if (size == 0)
+		{
+			*blob = output;
+			*psize = out_size;
+			return 0;
+		}
+
+		if (size < 0) {
+			msg = "data corrupted";
+			goto exit_2;
+		}
+
+		inp += chunksize;
+	}
+
+exit_2:
+	free(output);
+	DOMPRINTF("LZ4 decompression error: %s\n", msg);
+exit_0:
+	return ret;
+}
+
 struct setup_header {
     uint8_t  _pad0[0x1f1];  /* skip uninteresting stuff */
     uint8_t  setup_sects;
@@ -733,6 +853,17 @@ static int xc_dom_probe_bzimage_kernel(s
             return -EINVAL;
         }
     }
+    else if ( check_magic(dom, "\x02\x21", 2) )
+    {
+        ret = xc_try_lz4_decode(dom, &dom->kernel_blob, &dom->kernel_size);
+        if ( ret < 0 )
+        {
+            xc_dom_panic(dom->xch, XC_INVALID_KERNEL,
+                         "%s unable to LZ4 decompress kernel\n",
+                         __FUNCTION__);
+            return -EINVAL;
+        }
+    }
     else
     {
         xc_dom_panic(dom->xch, XC_INVALID_KERNEL,
--- a/xen/common/lz4/decompress.c
+++ b/xen/common/lz4/decompress.c
@@ -151,7 +151,7 @@ _output_error:
 	return (int) (-(((unsigned char *)ip) - source));
 }
 
-#ifndef __XEN__
+#if !defined(__XEN__) && !defined(__XEN_TOOLS__)
 static int lz4_uncompress_unknownoutputsize(const char *source, char *dest,
 				int isize, size_t maxoutputsize)
 {
@@ -294,7 +294,7 @@ exit_0:
 	return ret;
 }
 
-#ifndef __XEN__
+#if !defined(__XEN__) && !defined(__XEN_TOOLS__)
 int lz4_decompress_unknownoutputsize(const char *src, size_t src_len,
 		char *dest, size_t *dest_len)
 {



[-- Attachment #2: unlz4-domU.patch --]
[-- Type: text/plain, Size: 5161 bytes --]

libxc: add LZ4 decompression support

Since there's no shared or static library to link against, this simply
re-uses the hypervisor side code. However, I only audited the code
added here for possible security issues, not the referenced code in
the hypervisor tree.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: Fold in the addition of a previously missing include in the Mini-OS
    case - provided by Ian Campbell - to allow stubdom compilation to
    succeed.
---
I intentionally retained the tab indentation in the code cloned from
its hypervisor original (which in turn retained it as being a clone
of the Linux original), to ease diff-ing.

--- a/tools/libxc/xc_dom_bzimageloader.c
+++ b/tools/libxc/xc_dom_bzimageloader.c
@@ -566,6 +566,8 @@ static int xc_try_lzo1x_decode(
 
 #else /* __MINIOS__ */
 
+#include <mini-os/byteorder.h>
+
 int xc_try_bzip2_decode(struct xc_dom_image *dom, void **blob, size_t *size);
 int xc_try_lzma_decode(struct xc_dom_image *dom, void **blob, size_t *size);
 int xc_try_lzo1x_decode(struct xc_dom_image *dom, void **blob, size_t *size);
@@ -573,6 +575,124 @@ int xc_try_xz_decode(struct xc_dom_image
 
 #endif /* !__MINIOS__ */
 
+#define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
+#define STATIC static
+#define u8 uint8_t
+#define u16 uint16_t
+#define u32 uint32_t
+#define u64 uint64_t
+#define INIT
+#define unlikely(x) (x)
+
+static inline uint_fast16_t le16_to_cpup(const unsigned char *buf)
+{
+    return buf[0] | (buf[1] << 8);
+}
+
+static inline uint_fast32_t le32_to_cpup(const unsigned char *buf)
+{
+    return le16_to_cpup(buf) | ((uint32_t)le16_to_cpup(buf + 2) << 16);
+}
+
+#include "../../xen/common/lz4/decompress.c"
+
+/*
+ * Note: Uncompressed chunk size is used in the compressor side
+ * (userspace side for compression).
+ * It is hardcoded because there is not proper way to extract it
+ * from the binary stream which is generated by the preliminary
+ * version of LZ4 tool so far.
+ */
+#define LZ4_DEFAULT_UNCOMPRESSED_CHUNK_SIZE (8 << 20)
+#define ARCHIVE_MAGICNUMBER 0x184C2102
+
+static int xc_try_lz4_decode(
+	struct xc_dom_image *dom, void **blob, size_t *psize)
+{
+	int ret = -1;
+	size_t uncomp_chunksize = LZ4_DEFAULT_UNCOMPRESSED_CHUNK_SIZE;
+	unsigned char *inp = *blob, *output, *outp;
+	ssize_t size = *psize - 4;
+	size_t out_size, out_len, dest_len, chunksize;
+	const char *msg;
+
+	if (size < 4) {
+		msg = "input too small";
+		goto exit_0;
+	}
+
+	out_size = out_len = get_unaligned_le32(inp + size);
+	if (xc_dom_kernel_check_size(dom, out_len)) {
+		msg = "Decompressed image too large";
+		goto exit_0;
+	}
+
+	output = malloc(out_len);
+	if (!output) {
+		msg = "Could not allocate output buffer";
+		goto exit_0;
+	}
+	outp = output;
+
+	chunksize = get_unaligned_le32(inp);
+	if (chunksize == ARCHIVE_MAGICNUMBER) {
+		inp += 4;
+		size -= 4;
+	} else {
+		msg = "invalid header";
+		goto exit_2;
+	}
+
+	for (;;) {
+		if (size < 4) {
+			msg = "missing data";
+			goto exit_2;
+		}
+		chunksize = get_unaligned_le32(inp);
+		if (chunksize == ARCHIVE_MAGICNUMBER) {
+			inp += 4;
+			size -= 4;
+			continue;
+		}
+		inp += 4;
+		size -= 4;
+
+		if (out_len >= uncomp_chunksize) {
+			dest_len = uncomp_chunksize;
+			out_len -= dest_len;
+		} else
+			dest_len = out_len;
+		ret = lz4_decompress(inp, &chunksize, outp, dest_len);
+		if (ret < 0) {
+			msg = "decoding failed";
+			goto exit_2;
+		}
+
+		outp += dest_len;
+		size -= chunksize;
+
+		if (size == 0)
+		{
+			*blob = output;
+			*psize = out_size;
+			return 0;
+		}
+
+		if (size < 0) {
+			msg = "data corrupted";
+			goto exit_2;
+		}
+
+		inp += chunksize;
+	}
+
+exit_2:
+	free(output);
+	DOMPRINTF("LZ4 decompression error: %s\n", msg);
+exit_0:
+	return ret;
+}
+
 struct setup_header {
     uint8_t  _pad0[0x1f1];  /* skip uninteresting stuff */
     uint8_t  setup_sects;
@@ -733,6 +853,17 @@ static int xc_dom_probe_bzimage_kernel(s
             return -EINVAL;
         }
     }
+    else if ( check_magic(dom, "\x02\x21", 2) )
+    {
+        ret = xc_try_lz4_decode(dom, &dom->kernel_blob, &dom->kernel_size);
+        if ( ret < 0 )
+        {
+            xc_dom_panic(dom->xch, XC_INVALID_KERNEL,
+                         "%s unable to LZ4 decompress kernel\n",
+                         __FUNCTION__);
+            return -EINVAL;
+        }
+    }
     else
     {
         xc_dom_panic(dom->xch, XC_INVALID_KERNEL,
--- a/xen/common/lz4/decompress.c
+++ b/xen/common/lz4/decompress.c
@@ -151,7 +151,7 @@ _output_error:
 	return (int) (-(((unsigned char *)ip) - source));
 }
 
-#ifndef __XEN__
+#if !defined(__XEN__) && !defined(__XEN_TOOLS__)
 static int lz4_uncompress_unknownoutputsize(const char *source, char *dest,
 				int isize, size_t maxoutputsize)
 {
@@ -294,7 +294,7 @@ exit_0:
 	return ret;
 }
 
-#ifndef __XEN__
+#if !defined(__XEN__) && !defined(__XEN_TOOLS__)
 int lz4_decompress_unknownoutputsize(const char *src, size_t src_len,
 		char *dest, size_t *dest_len)
 {

[-- Attachment #3: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2] libxc: add LZ4 decompression support
  2013-09-24 16:29             ` [PATCH v2] " Jan Beulich
@ 2013-09-24 16:53               ` Ian Campbell
  2013-09-24 19:55                 ` Yann Collet
  2013-09-25  7:23                 ` Jan Beulich
  0 siblings, 2 replies; 29+ messages in thread
From: Ian Campbell @ 2013-09-24 16:53 UTC (permalink / raw)
  To: Jan Beulich; +Cc: yann.collet.73, xen-devel, Keir Fraser, Ian Jackson

On Tue, 2013-09-24 at 17:29 +0100, Jan Beulich wrote:
> Since there's no shared or static library to link against, this simply
> re-uses the hypervisor side code. However, I only audited the code
> added here for possible security issues, not the referenced code in
> the hypervisor tree.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> v2: Fold in the addition of a previously missing include in the Mini-OS
>     case - provided by Ian Campbell - to allow stubdom compilation to
>     succeed.
> ---
> I intentionally retained the tab indentation in the code cloned from
> its hypervisor original (which in turn retained it as being a clone
> of the Linux original), to ease diff-ing.
> 
> --- a/tools/libxc/xc_dom_bzimageloader.c
> +++ b/tools/libxc/xc_dom_bzimageloader.c
> @@ -566,6 +566,8 @@ static int xc_try_lzo1x_decode(
>  
>  #else /* __MINIOS__ */
>  
> +#include <mini-os/byteorder.h>
> +
>  int xc_try_bzip2_decode(struct xc_dom_image *dom, void **blob, size_t *size);
>  int xc_try_lzma_decode(struct xc_dom_image *dom, void **blob, size_t *size);
>  int xc_try_lzo1x_decode(struct xc_dom_image *dom, void **blob, size_t *size);
> @@ -573,6 +575,124 @@ int xc_try_xz_decode(struct xc_dom_image
>  
>  #endif /* !__MINIOS__ */
>  
> +#define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
> +#define STATIC static
> +#define u8 uint8_t
> +#define u16 uint16_t
> +#define u32 uint32_t
> +#define u64 uint64_t
> +#define INIT
> +#define unlikely(x) (x)

I think rather than pollute this file with this sort of thing (which
might have unexpected consequences in the future) this would be better
of placed in a separate file compiled for both regular and stub use.

> +
> +static inline uint_fast16_t le16_to_cpup(const unsigned char *buf)
> +{
> +    return buf[0] | (buf[1] << 8);
> +}
> +
> +static inline uint_fast32_t le32_to_cpup(const unsigned char *buf)
> +{
> +    return le16_to_cpup(buf) | ((uint32_t)le16_to_cpup(buf + 2) << 16);
> +}
> +
> +#include "../../xen/common/lz4/decompress.c"
> +
> +/*
> + * Note: Uncompressed chunk size is used in the compressor side
> + * (userspace side for compression).
> + * It is hardcoded because there is not proper way to extract it
> + * from the binary stream which is generated by the preliminary
> + * version of LZ4 tool so far.
> + */
> +#define LZ4_DEFAULT_UNCOMPRESSED_CHUNK_SIZE (8 << 20)
> +#define ARCHIVE_MAGICNUMBER 0x184C2102
> +
> +static int xc_try_lz4_decode(
> +	struct xc_dom_image *dom, void **blob, size_t *psize)
> +{
> +	int ret = -1;
> +	size_t uncomp_chunksize = LZ4_DEFAULT_UNCOMPRESSED_CHUNK_SIZE;
> +	unsigned char *inp = *blob, *output, *outp;
> +	ssize_t size = *psize - 4;
> +	size_t out_size, out_len, dest_len, chunksize;
> +	const char *msg;
> +
> +	if (size < 4) {
> +		msg = "input too small";
> +		goto exit_0;

The exit_0 loop is after the error print, so these errors don't get
printed.

I don't really like exit_0 and exit_2 as names (what happened to
exit_1?). "err" and "err_free" would be more usual.


> +	}
> +
> +	out_size = out_len = get_unaligned_le32(inp + size);
> +	if (xc_dom_kernel_check_size(dom, out_len)) {
> +		msg = "Decompressed image too large";
> +		goto exit_0;
> +	}
> +
> +	output = malloc(out_len);
> +	if (!output) {
> +		msg = "Could not allocate output buffer";
> +		goto exit_0;
> +	}
> +	outp = output;
> +
> +	chunksize = get_unaligned_le32(inp);
> +	if (chunksize == ARCHIVE_MAGICNUMBER) {
> +		inp += 4;
> +		size -= 4;
> +	} else {
> +		msg = "invalid header";
> +		goto exit_2;
> +	}
> +
> +	for (;;) {
> +		if (size < 4) {
> +			msg = "missing data";
> +			goto exit_2;
> +		}
> +		chunksize = get_unaligned_le32(inp);
> +		if (chunksize == ARCHIVE_MAGICNUMBER) {
> +			inp += 4;
> +			size -= 4;
> +			continue;
> +		}
> +		inp += 4;
> +		size -= 4;

We know size is at least 4 from the check at the head of the loop, but
now we subtracted 4 so size could be 0, but lz4_decompress doesn't take
size as an argument so whatever it does it isn't basing the return value
in &chunksize on it, so I assume it reads over the end of the buffer via
inp.

I didn't look to see what the minimum number of bytes which
lz4_decompress will consume is, but I think we need a check of some
description.

> +
> +		if (out_len >= uncomp_chunksize) {
> +			dest_len = uncomp_chunksize;
> +			out_len -= dest_len;
> +		} else
> +			dest_len = out_len;
> +		ret = lz4_decompress(inp, &chunksize, outp, dest_len);
> +		if (ret < 0) {
> +			msg = "decoding failed";
> +			goto exit_2;
> +		}
> +
> +		outp += dest_len;
> +		size -= chunksize;
> +
> +		if (size == 0)
> +		{
> +			*blob = output;
> +			*psize = out_size;
> +			return 0;
> +		}
> +
> +		if (size < 0) {
> +			msg = "data corrupted";
> +			goto exit_2;
> +		}
> +
> +		inp += chunksize;
> +	}
> +
> +exit_2:
> +	free(output);
> +	DOMPRINTF("LZ4 decompression error: %s\n", msg);
> +exit_0:
> +	return ret;
> +}
> +
>  struct setup_header {
>      uint8_t  _pad0[0x1f1];  /* skip uninteresting stuff */
>      uint8_t  setup_sects;
> @@ -733,6 +853,17 @@ static int xc_dom_probe_bzimage_kernel(s
>              return -EINVAL;
>          }
>      }
> +    else if ( check_magic(dom, "\x02\x21", 2) )
> +    {
> +        ret = xc_try_lz4_decode(dom, &dom->kernel_blob, &dom->kernel_size);
> +        if ( ret < 0 )
> +        {
> +            xc_dom_panic(dom->xch, XC_INVALID_KERNEL,
> +                         "%s unable to LZ4 decompress kernel\n",
> +                         __FUNCTION__);
> +            return -EINVAL;
> +        }
> +    }
>      else
>      {
>          xc_dom_panic(dom->xch, XC_INVALID_KERNEL,
> --- a/xen/common/lz4/decompress.c
> +++ b/xen/common/lz4/decompress.c
> @@ -151,7 +151,7 @@ _output_error:
>  	return (int) (-(((unsigned char *)ip) - source));
>  }
>  
> -#ifndef __XEN__
> +#if !defined(__XEN__) && !defined(__XEN_TOOLS__)
>  static int lz4_uncompress_unknownoutputsize(const char *source, char *dest,
>  				int isize, size_t maxoutputsize)
>  {
> @@ -294,7 +294,7 @@ exit_0:
>  	return ret;
>  }
>  
> -#ifndef __XEN__
> +#if !defined(__XEN__) && !defined(__XEN_TOOLS__)
>  int lz4_decompress_unknownoutputsize(const char *src, size_t src_len,
>  		char *dest, size_t *dest_len)
>  {
> 
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2] libxc: add LZ4 decompression support
  2013-09-24 16:53               ` Ian Campbell
@ 2013-09-24 19:55                 ` Yann Collet
  2013-09-25  7:37                   ` Ian Campbell
  2013-09-25  7:23                 ` Jan Beulich
  1 sibling, 1 reply; 29+ messages in thread
From: Yann Collet @ 2013-09-24 19:55 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, Keir Fraser, Ian Jackson, Jan Beulich


[-- Attachment #1.1: Type: text/plain, Size: 8735 bytes --]

Hi,


Just for your information, regarding limit case of size 0 :

lz4 decompression functions are routinely tested to decode a compressed
block of null original size, and they handle this use case correctly.

Note that, due to format definition, a block of size 0 is compressed into a
compressed-block of size 1 (yeah, I know...). That byte, by the way, will
be 0.

Consequently, a compressed-block of size 0 is not supposed to exist (break
format). For such a situation, the decompression functions should generate
an error (negative function result).


Now, I realize that what I just said apply to the generic version, hosted
at http://code.google.com/p/lz4/

Since you are probably using the Linux kernel version, it could be
interesting to check if this corner case still holds. Obviously,
compressing a null-sized block is a questionable use case, and most lz4
implementations ensure this never happens . But well, there must be a
definition for it, just in case it does nonetheless happen.
The linux kernel version, while heavily based on original version, is a
slightly modified one, most probably to respect some kernel rule
requirement. That work was done by Kyungsik Lee, so you may want to include
him in the discussion if you want to ask questions on his version.


Best Regards

Yann


2013/9/24 Ian Campbell <Ian.Campbell@citrix.com>

> On Tue, 2013-09-24 at 17:29 +0100, Jan Beulich wrote:
> > Since there's no shared or static library to link against, this simply
> > re-uses the hypervisor side code. However, I only audited the code
> > added here for possible security issues, not the referenced code in
> > the hypervisor tree.
> >
> > Signed-off-by: Jan Beulich <jbeulich@suse.com>
> > ---
> > v2: Fold in the addition of a previously missing include in the Mini-OS
> >     case - provided by Ian Campbell - to allow stubdom compilation to
> >     succeed.
> > ---
> > I intentionally retained the tab indentation in the code cloned from
> > its hypervisor original (which in turn retained it as being a clone
> > of the Linux original), to ease diff-ing.
> >
> > --- a/tools/libxc/xc_dom_bzimageloader.c
> > +++ b/tools/libxc/xc_dom_bzimageloader.c
> > @@ -566,6 +566,8 @@ static int xc_try_lzo1x_decode(
> >
> >  #else /* __MINIOS__ */
> >
> > +#include <mini-os/byteorder.h>
> > +
> >  int xc_try_bzip2_decode(struct xc_dom_image *dom, void **blob, size_t
> *size);
> >  int xc_try_lzma_decode(struct xc_dom_image *dom, void **blob, size_t
> *size);
> >  int xc_try_lzo1x_decode(struct xc_dom_image *dom, void **blob, size_t
> *size);
> > @@ -573,6 +575,124 @@ int xc_try_xz_decode(struct xc_dom_image
> >
> >  #endif /* !__MINIOS__ */
> >
> > +#define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
> > +#define STATIC static
> > +#define u8 uint8_t
> > +#define u16 uint16_t
> > +#define u32 uint32_t
> > +#define u64 uint64_t
> > +#define INIT
> > +#define unlikely(x) (x)
>
> I think rather than pollute this file with this sort of thing (which
> might have unexpected consequences in the future) this would be better
> of placed in a separate file compiled for both regular and stub use.
>
> > +
> > +static inline uint_fast16_t le16_to_cpup(const unsigned char *buf)
> > +{
> > +    return buf[0] | (buf[1] << 8);
> > +}
> > +
> > +static inline uint_fast32_t le32_to_cpup(const unsigned char *buf)
> > +{
> > +    return le16_to_cpup(buf) | ((uint32_t)le16_to_cpup(buf + 2) << 16);
> > +}
> > +
> > +#include "../../xen/common/lz4/decompress.c"
> > +
> > +/*
> > + * Note: Uncompressed chunk size is used in the compressor side
> > + * (userspace side for compression).
> > + * It is hardcoded because there is not proper way to extract it
> > + * from the binary stream which is generated by the preliminary
> > + * version of LZ4 tool so far.
> > + */
> > +#define LZ4_DEFAULT_UNCOMPRESSED_CHUNK_SIZE (8 << 20)
> > +#define ARCHIVE_MAGICNUMBER 0x184C2102
> > +
> > +static int xc_try_lz4_decode(
> > +     struct xc_dom_image *dom, void **blob, size_t *psize)
> > +{
> > +     int ret = -1;
> > +     size_t uncomp_chunksize = LZ4_DEFAULT_UNCOMPRESSED_CHUNK_SIZE;
> > +     unsigned char *inp = *blob, *output, *outp;
> > +     ssize_t size = *psize - 4;
> > +     size_t out_size, out_len, dest_len, chunksize;
> > +     const char *msg;
> > +
> > +     if (size < 4) {
> > +             msg = "input too small";
> > +             goto exit_0;
>
> The exit_0 loop is after the error print, so these errors don't get
> printed.
>
> I don't really like exit_0 and exit_2 as names (what happened to
> exit_1?). "err" and "err_free" would be more usual.
>
>
> > +     }
> > +
> > +     out_size = out_len = get_unaligned_le32(inp + size);
> > +     if (xc_dom_kernel_check_size(dom, out_len)) {
> > +             msg = "Decompressed image too large";
> > +             goto exit_0;
> > +     }
> > +
> > +     output = malloc(out_len);
> > +     if (!output) {
> > +             msg = "Could not allocate output buffer";
> > +             goto exit_0;
> > +     }
> > +     outp = output;
> > +
> > +     chunksize = get_unaligned_le32(inp);
> > +     if (chunksize == ARCHIVE_MAGICNUMBER) {
> > +             inp += 4;
> > +             size -= 4;
> > +     } else {
> > +             msg = "invalid header";
> > +             goto exit_2;
> > +     }
> > +
> > +     for (;;) {
> > +             if (size < 4) {
> > +                     msg = "missing data";
> > +                     goto exit_2;
> > +             }
> > +             chunksize = get_unaligned_le32(inp);
> > +             if (chunksize == ARCHIVE_MAGICNUMBER) {
> > +                     inp += 4;
> > +                     size -= 4;
> > +                     continue;
> > +             }
> > +             inp += 4;
> > +             size -= 4;
>
> We know size is at least 4 from the check at the head of the loop, but
> now we subtracted 4 so size could be 0, but lz4_decompress doesn't take
> size as an argument so whatever it does it isn't basing the return value
> in &chunksize on it, so I assume it reads over the end of the buffer via
> inp.
>
> I didn't look to see what the minimum number of bytes which
> lz4_decompress will consume is, but I think we need a check of some
> description.
>
> > +
> > +             if (out_len >= uncomp_chunksize) {
> > +                     dest_len = uncomp_chunksize;
> > +                     out_len -= dest_len;
> > +             } else
> > +                     dest_len = out_len;
> > +             ret = lz4_decompress(inp, &chunksize, outp, dest_len);
> > +             if (ret < 0) {
> > +                     msg = "decoding failed";
> > +                     goto exit_2;
> > +             }
> > +
> > +             outp += dest_len;
> > +             size -= chunksize;
> > +
> > +             if (size == 0)
> > +             {
> > +                     *blob = output;
> > +                     *psize = out_size;
> > +                     return 0;
> > +             }
> > +
> > +             if (size < 0) {
> > +                     msg = "data corrupted";
> > +                     goto exit_2;
> > +             }
> > +
> > +             inp += chunksize;
> > +     }
> > +
> > +exit_2:
> > +     free(output);
> > +     DOMPRINTF("LZ4 decompression error: %s\n", msg);
> > +exit_0:
> > +     return ret;
> > +}
> > +
> >  struct setup_header {
> >      uint8_t  _pad0[0x1f1];  /* skip uninteresting stuff */
> >      uint8_t  setup_sects;
> > @@ -733,6 +853,17 @@ static int xc_dom_probe_bzimage_kernel(s
> >              return -EINVAL;
> >          }
> >      }
> > +    else if ( check_magic(dom, "\x02\x21", 2) )
> > +    {
> > +        ret = xc_try_lz4_decode(dom, &dom->kernel_blob,
> &dom->kernel_size);
> > +        if ( ret < 0 )
> > +        {
> > +            xc_dom_panic(dom->xch, XC_INVALID_KERNEL,
> > +                         "%s unable to LZ4 decompress kernel\n",
> > +                         __FUNCTION__);
> > +            return -EINVAL;
> > +        }
> > +    }
> >      else
> >      {
> >          xc_dom_panic(dom->xch, XC_INVALID_KERNEL,
> > --- a/xen/common/lz4/decompress.c
> > +++ b/xen/common/lz4/decompress.c
> > @@ -151,7 +151,7 @@ _output_error:
> >       return (int) (-(((unsigned char *)ip) - source));
> >  }
> >
> > -#ifndef __XEN__
> > +#if !defined(__XEN__) && !defined(__XEN_TOOLS__)
> >  static int lz4_uncompress_unknownoutputsize(const char *source, char
> *dest,
> >                               int isize, size_t maxoutputsize)
> >  {
> > @@ -294,7 +294,7 @@ exit_0:
> >       return ret;
> >  }
> >
> > -#ifndef __XEN__
> > +#if !defined(__XEN__) && !defined(__XEN_TOOLS__)
> >  int lz4_decompress_unknownoutputsize(const char *src, size_t src_len,
> >               char *dest, size_t *dest_len)
> >  {
> >
> >
>
>
>

[-- Attachment #1.2: Type: text/html, Size: 11124 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2] libxc: add LZ4 decompression support
  2013-09-24 16:53               ` Ian Campbell
  2013-09-24 19:55                 ` Yann Collet
@ 2013-09-25  7:23                 ` Jan Beulich
  2013-09-25  7:35                   ` Ian Campbell
  1 sibling, 1 reply; 29+ messages in thread
From: Jan Beulich @ 2013-09-25  7:23 UTC (permalink / raw)
  To: Ian Campbell, yann.collet.73; +Cc: xen-devel, Keir Fraser, Ian Jackson

>>> On 24.09.13 at 18:53, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Tue, 2013-09-24 at 17:29 +0100, Jan Beulich wrote:
>> @@ -573,6 +575,124 @@ int xc_try_xz_decode(struct xc_dom_image
>>  
>>  #endif /* !__MINIOS__ */
>>  
>> +#define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
>> +#define STATIC static
>> +#define u8 uint8_t
>> +#define u16 uint16_t
>> +#define u32 uint32_t
>> +#define u64 uint64_t
>> +#define INIT
>> +#define unlikely(x) (x)
> 
> I think rather than pollute this file with this sort of thing (which
> might have unexpected consequences in the future) this would be better
> of placed in a separate file compiled for both regular and stub use.

So you mean to move the whole thing quoted above down to ...

>> +
>> +static inline uint_fast16_t le16_to_cpup(const unsigned char *buf)
>> +{
>> +    return buf[0] | (buf[1] << 8);
>> +}
>> +
>> +static inline uint_fast32_t le32_to_cpup(const unsigned char *buf)
>> +{
>> +    return le16_to_cpup(buf) | ((uint32_t)le16_to_cpup(buf + 2) << 16);
>> +}
>> +
>> +#include "../../xen/common/lz4/decompress.c"

... here out into a separate file. I can surely do that. Any
preference for the name (xc_dom_decompress_unsafe_lz4.c
would seem the prime candidate, albeit the other similarly
named files have a slightly different purpose).

The main problem doing so is going to be the uses of
get_unaligned_le32() in xc_try_lz4_decode() - they'd need to
be adjusted, increasing the amount of change from the original.
And lz4_decompress() would the also need to be declared in
some (new?) header. All of which didn't seem very desirable...

>> +
>> +/*
>> + * Note: Uncompressed chunk size is used in the compressor side
>> + * (userspace side for compression).
>> + * It is hardcoded because there is not proper way to extract it
>> + * from the binary stream which is generated by the preliminary
>> + * version of LZ4 tool so far.
>> + */
>> +#define LZ4_DEFAULT_UNCOMPRESSED_CHUNK_SIZE (8 << 20)
>> +#define ARCHIVE_MAGICNUMBER 0x184C2102
>> +
>> +static int xc_try_lz4_decode(
>> +	struct xc_dom_image *dom, void **blob, size_t *psize)
>> +{
>> +	int ret = -1;
>> +	size_t uncomp_chunksize = LZ4_DEFAULT_UNCOMPRESSED_CHUNK_SIZE;
>> +	unsigned char *inp = *blob, *output, *outp;
>> +	ssize_t size = *psize - 4;
>> +	size_t out_size, out_len, dest_len, chunksize;
>> +	const char *msg;
>> +
>> +	if (size < 4) {
>> +		msg = "input too small";
>> +		goto exit_0;
> 
> The exit_0 loop is after the error print, so these errors don't get
> printed.

Oops - fixed.

> I don't really like exit_0 and exit_2 as names (what happened to
> exit_1?). "err" and "err_free" would be more usual.

I didn't like those too, but tried to modify the original as little
as possible. exit_1 disappeared as having got orphaned in the
process of the adaptation.

>> +	}
>> +
>> +	out_size = out_len = get_unaligned_le32(inp + size);
>> +	if (xc_dom_kernel_check_size(dom, out_len)) {
>> +		msg = "Decompressed image too large";
>> +		goto exit_0;
>> +	}
>> +
>> +	output = malloc(out_len);
>> +	if (!output) {
>> +		msg = "Could not allocate output buffer";
>> +		goto exit_0;
>> +	}
>> +	outp = output;
>> +
>> +	chunksize = get_unaligned_le32(inp);
>> +	if (chunksize == ARCHIVE_MAGICNUMBER) {
>> +		inp += 4;
>> +		size -= 4;
>> +	} else {
>> +		msg = "invalid header";
>> +		goto exit_2;
>> +	}
>> +
>> +	for (;;) {
>> +		if (size < 4) {
>> +			msg = "missing data";
>> +			goto exit_2;
>> +		}
>> +		chunksize = get_unaligned_le32(inp);
>> +		if (chunksize == ARCHIVE_MAGICNUMBER) {
>> +			inp += 4;
>> +			size -= 4;
>> +			continue;
>> +		}
>> +		inp += 4;
>> +		size -= 4;
> 
> We know size is at least 4 from the check at the head of the loop, but
> now we subtracted 4 so size could be 0, but lz4_decompress doesn't take
> size as an argument so whatever it does it isn't basing the return value
> in &chunksize on it, so I assume it reads over the end of the buffer via
> inp.
> 
> I didn't look to see what the minimum number of bytes which
> lz4_decompress will consume is, but I think we need a check of some
> description.

Sadly there doesn't appear to be a minimum - the main loop in
lz4_uncompress() has just a single exit point, which isn't
dependent upon the input size at all (the function doesn't even
know how much input there is).

Immediately before the call to lz4_decompress() chunksize gets
read, but I can't tell whether we could expect that to be the
required amount of bytes to follow. Yann - can you clarify this?

Jan

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2] libxc: add LZ4 decompression support
  2013-09-25  7:23                 ` Jan Beulich
@ 2013-09-25  7:35                   ` Ian Campbell
  0 siblings, 0 replies; 29+ messages in thread
From: Ian Campbell @ 2013-09-25  7:35 UTC (permalink / raw)
  To: Jan Beulich; +Cc: yann.collet.73, xen-devel, Keir Fraser, Ian Jackson

On Wed, 2013-09-25 at 08:23 +0100, Jan Beulich wrote:
> >>> On 24.09.13 at 18:53, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> > On Tue, 2013-09-24 at 17:29 +0100, Jan Beulich wrote:
> >> @@ -573,6 +575,124 @@ int xc_try_xz_decode(struct xc_dom_image
> >>  
> >>  #endif /* !__MINIOS__ */
> >>  
> >> +#define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
> >> +#define STATIC static
> >> +#define u8 uint8_t
> >> +#define u16 uint16_t
> >> +#define u32 uint32_t
> >> +#define u64 uint64_t
> >> +#define INIT
> >> +#define unlikely(x) (x)
> > 
> > I think rather than pollute this file with this sort of thing (which
> > might have unexpected consequences in the future) this would be better
> > of placed in a separate file compiled for both regular and stub use.
> 
> So you mean to move the whole thing quoted above down to ...
> 
> >> +
> >> +static inline uint_fast16_t le16_to_cpup(const unsigned char *buf)
> >> +{
> >> +    return buf[0] | (buf[1] << 8);
> >> +}
> >> +
> >> +static inline uint_fast32_t le32_to_cpup(const unsigned char *buf)
> >> +{
> >> +    return le16_to_cpup(buf) | ((uint32_t)le16_to_cpup(buf + 2) << 16);
> >> +}
> >> +
> >> +#include "../../xen/common/lz4/decompress.c"
> 
> ... here out into a separate file. I can surely do that.

I was thinking to include xc_try_lz4_decode too, but this would work
well, whichever you prefer.

> Any preference for the name (xc_dom_decompress_unsafe_lz4.c
> would seem the prime candidate, albeit the other similarly
> named files have a slightly different purpose).

Following the pattern used by the existing unsafe decompressors would
seem reasonable.

> The main problem doing so is going to be the uses of
> get_unaligned_le32() in xc_try_lz4_decode() - they'd need to
> be adjusted, increasing the amount of change from the original.
> And lz4_decompress() would the also need to be declared in
> some (new?) header. All of which didn't seem very desirable...

In which case maybe moving xc_try.. too is the right answer, there is
already a list of prototypes for the others (albeit in a minios ifdef)

> > I don't really like exit_0 and exit_2 as names (what happened to
> > exit_1?). "err" and "err_free" would be more usual.
> 
> I didn't like those too, but tried to modify the original as little
> as possible. exit_1 disappeared as having got orphaned in the
> process of the adaptation.

I didn't realise this had come from elsewhere rather than being newly
written xc glue code. I suppose its fine then.

> 
> >> +	}
> >> +
> >> +	out_size = out_len = get_unaligned_le32(inp + size);
> >> +	if (xc_dom_kernel_check_size(dom, out_len)) {
> >> +		msg = "Decompressed image too large";
> >> +		goto exit_0;
> >> +	}
> >> +
> >> +	output = malloc(out_len);
> >> +	if (!output) {
> >> +		msg = "Could not allocate output buffer";
> >> +		goto exit_0;
> >> +	}
> >> +	outp = output;
> >> +
> >> +	chunksize = get_unaligned_le32(inp);
> >> +	if (chunksize == ARCHIVE_MAGICNUMBER) {
> >> +		inp += 4;
> >> +		size -= 4;
> >> +	} else {
> >> +		msg = "invalid header";
> >> +		goto exit_2;
> >> +	}
> >> +
> >> +	for (;;) {
> >> +		if (size < 4) {
> >> +			msg = "missing data";
> >> +			goto exit_2;
> >> +		}
> >> +		chunksize = get_unaligned_le32(inp);
> >> +		if (chunksize == ARCHIVE_MAGICNUMBER) {
> >> +			inp += 4;
> >> +			size -= 4;
> >> +			continue;
> >> +		}
> >> +		inp += 4;
> >> +		size -= 4;
> > 
> > We know size is at least 4 from the check at the head of the loop, but
> > now we subtracted 4 so size could be 0, but lz4_decompress doesn't take
> > size as an argument so whatever it does it isn't basing the return value
> > in &chunksize on it, so I assume it reads over the end of the buffer via
> > inp.
> > 
> > I didn't look to see what the minimum number of bytes which
> > lz4_decompress will consume is, but I think we need a check of some
> > description.
> 
> Sadly there doesn't appear to be a minimum - the main loop in
> lz4_uncompress() has just a single exit point, which isn't
> dependent upon the input size at all (the function doesn't even
> know how much input there is).
> 
> Immediately before the call to lz4_decompress() chunksize gets
> read, but I can't tell whether we could expect that to be the
> required amount of bytes to follow. Yann - can you clarify this?

Whatever expectations we might have from the encoder a malicious
attacker can construct a byte stream which violates it and has
ARCHIVE_MAGICNUMBER as the last thing.

Under the circumstances I don't see how lz4_decompress() can be safely
used on untrusted data since it has no knowledge of the remaining input
length and therefore no way to avoid running off the end of the data.
Digging down a couple of layers of callchain it seems like this
knowledge isn't use anywhere (i.e. it's not just lost in the outermost
API.

This is a blocker for inclusion IMHO.

Ian.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2] libxc: add LZ4 decompression support
  2013-09-24 19:55                 ` Yann Collet
@ 2013-09-25  7:37                   ` Ian Campbell
  2013-09-25  8:06                     ` Yann Collet
  0 siblings, 1 reply; 29+ messages in thread
From: Ian Campbell @ 2013-09-25  7:37 UTC (permalink / raw)
  To: Yann Collet; +Cc: xen-devel, Keir Fraser, Ian Jackson, Jan Beulich

On Tue, 2013-09-24 at 21:55 +0200, Yann Collet wrote:
> Consequently, a compressed-block of size 0 is not supposed to exist
> (break format).

Unfortunately an attacker is free to break the format. The question is
what will the decoder do when faced with such invalid inputs?

It seems like such concerns were not considered at all during
implementation? (which is fair enough, since the data stream is
implicitly trusted in the original target use case)

Ian.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2] libxc: add LZ4 decompression support
  2013-09-25  7:37                   ` Ian Campbell
@ 2013-09-25  8:06                     ` Yann Collet
  2013-09-25  8:36                       ` Jan Beulich
  0 siblings, 1 reply; 29+ messages in thread
From: Yann Collet @ 2013-09-25  8:06 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, Keir Fraser, Ian Jackson, Jan Beulich


[-- Attachment #1.1: Type: text/plain, Size: 1491 bytes --]

There are 2 families of decoding functions within LZ4 :

1) LZ4_decompress_fast* : These decoding functions must be used with
trusted sources only. They only guarantee that they will write exactly the
size of output buffer, but cannot guarantee anything regarding input
buffer, since its size is unknown. (by the way, the amount of bytes read
into input buffer is the result of the function).

2) LZ4_decompress_safe* : These decoding functions are protected against
malicious input. It resists fuzzer attack. This is the recommended choice
for "general decompression usage".

Looking at the kernel code, at
https://github.com/torvalds/linux/blob/master/lib/lz4/lz4_decompress.c
the naming seems different, but both variants are still there :

lz4_decompress is the equivalent of LZ4_decompress_fast.
lz4_decompress_unknownoutputsize is the equivalent of LZ4_decompress_safe.

I would recommend to use the second one for untrusted sources.


Regards



2013/9/25 Ian Campbell <Ian.Campbell@citrix.com>

> On Tue, 2013-09-24 at 21:55 +0200, Yann Collet wrote:
> > Consequently, a compressed-block of size 0 is not supposed to exist
> > (break format).
>
> Unfortunately an attacker is free to break the format. The question is
> what will the decoder do when faced with such invalid inputs?
>
> It seems like such concerns were not considered at all during
> implementation? (which is fair enough, since the data stream is
> implicitly trusted in the original target use case)
>
> Ian.
>
>
>

[-- Attachment #1.2: Type: text/html, Size: 2683 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2] libxc: add LZ4 decompression support
  2013-09-25  8:06                     ` Yann Collet
@ 2013-09-25  8:36                       ` Jan Beulich
  0 siblings, 0 replies; 29+ messages in thread
From: Jan Beulich @ 2013-09-25  8:36 UTC (permalink / raw)
  To: Ian Campbell, Yann Collet; +Cc: xen-devel, Keir Fraser, Ian Jackson

>>> On 25.09.13 at 10:06, Yann Collet <yann.collet.73@gmail.com> wrote:
> There are 2 families of decoding functions within LZ4 :
> 
> 1) LZ4_decompress_fast* : These decoding functions must be used with
> trusted sources only. They only guarantee that they will write exactly the
> size of output buffer, but cannot guarantee anything regarding input
> buffer, since its size is unknown. (by the way, the amount of bytes read
> into input buffer is the result of the function).
> 
> 2) LZ4_decompress_safe* : These decoding functions are protected against
> malicious input. It resists fuzzer attack. This is the recommended choice
> for "general decompression usage".
> 
> Looking at the kernel code, at
> https://github.com/torvalds/linux/blob/master/lib/lz4/lz4_decompress.c 
> the naming seems different, but both variants are still there :
> 
> lz4_decompress is the equivalent of LZ4_decompress_fast.
> lz4_decompress_unknownoutputsize is the equivalent of LZ4_decompress_safe.
> 
> I would recommend to use the second one for untrusted sources.

Ah, okay. Will re-do the DomU patch then.

Jan

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v3 0/2] add LZ4 kernel decompression support
  2013-09-23 14:51 [PATCH 0/2] add LZ4 kernel decompression support Jan Beulich
  2013-09-23 14:54 ` [PATCH 1/2] xen: add LZ4 " Jan Beulich
  2013-09-23 14:56 ` [PATCH 2/2] libxc: " Jan Beulich
@ 2013-09-30 11:40 ` Jan Beulich
  2013-09-30 11:44   ` [PATCH v3 1/2] xen: add LZ4 " Jan Beulich
                     ` (2 more replies)
  2 siblings, 3 replies; 29+ messages in thread
From: Jan Beulich @ 2013-09-30 11:40 UTC (permalink / raw)
  To: xen-devel; +Cc: yann.collet.73, Keir Fraser, Ian Jackson, Ian Campbell

Linux 3.11 added respective support, so I think we should follow
suit.

1: xen: add LZ4 decompression support
2: libxc: add LZ4 decompression support

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v3: Tools side adjustments: Fix placement of DOMPRINTF()
    invocation. Split implementation into its own source file. Use the
    safe decompressor variant in non-Mini-OS libxc
    (lz4_decompress_unknownoutputsize()). Minor adjustments to
    hypervisor side avoid the tools side patch having to touch the
    former.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v3 1/2] xen: add LZ4 decompression support
  2013-09-30 11:40 ` [PATCH v3 0/2] add LZ4 kernel " Jan Beulich
@ 2013-09-30 11:44   ` Jan Beulich
  2013-09-30 11:45   ` [PATCH v3 2/2] libxc: " Jan Beulich
  2013-09-30 12:10   ` [PATCH v3 0/2] add LZ4 kernel " Keir Fraser
  2 siblings, 0 replies; 29+ messages in thread
From: Jan Beulich @ 2013-09-30 11:44 UTC (permalink / raw)
  To: xen-devel; +Cc: yann.collet.73, Keir Fraser, Ian Jackson, Ian Campbell

[-- Attachment #1: Type: text/plain, Size: 24173 bytes --]

xen: add LZ4 decompression support

Add support for LZ4 decompression in Xen. LZ4 Decompression APIs for
Xen are based on LZ4 implementation by Yann Collet.

Benchmark Results(PATCH v3)
Compiler: Linaro ARM gcc 4.6.2

1. ARMv7, 1.5GHz based board
   Kernel: linux 3.4
   Uncompressed Kernel Size: 14MB
        Compressed Size  Decompression Speed
   LZO  6.7MB            20.1MB/s, 25.2MB/s(UA)
   LZ4  7.3MB            29.1MB/s, 45.6MB/s(UA)

2. ARMv7, 1.7GHz based board
   Kernel: linux 3.7
   Uncompressed Kernel Size: 14MB
        Compressed Size  Decompression Speed
   LZO  6.0MB            34.1MB/s, 52.2MB/s(UA)
   LZ4  6.5MB            86.7MB/s
- UA: Unaligned memory Access support
- Latest patch set for LZO applied

This patch set is for adding support for LZ4-compressed Kernel.  LZ4 is a
very fast lossless compression algorithm and it also features an extremely
fast decoder [1].

But we have five of decompressors already and one question which does
arise, however, is that of where do we stop adding new ones?  This issue
had been discussed and came to the conclusion [2].

Russell King said that we should have:

 - one decompressor which is the fastest
 - one decompressor for the highest compression ratio
 - one popular decompressor (eg conventional gzip)

If we have a replacement one for one of these, then it should do exactly
that: replace it.

The benchmark shows that an 8% increase in image size vs a 66% increase
in decompression speed compared to LZO(which has been known as the
fastest decompressor in the Kernel).  Therefore the "fast but may not be
small" compression title has clearly been taken by LZ4 [3].

[1] http://code.google.com/p/lz4/ 
[2] http://thread.gmane.org/gmane.linux.kbuild.devel/9157 
[3] http://thread.gmane.org/gmane.linux.kbuild.devel/9347 

LZ4 homepage: http://fastcompression.blogspot.com/p/lz4.html 
LZ4 source repository: http://code.google.com/p/lz4/ 

Signed-off-by: Kyungsik Lee <kyungsik.lee@lge.com>
Signed-off-by: Yann Collet <yann.collet.73@gmail.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v3: Minor adjustments to avoid the tools side patch having to touch
    hypervisor side sources.

--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -49,7 +49,7 @@ obj-y += radix-tree.o
 obj-y += rbtree.o
 obj-y += lzo.o
 
-obj-bin-$(CONFIG_X86) += $(foreach n,decompress bunzip2 unxz unlzma unlzo earlycpio,$(n).init.o)
+obj-bin-$(CONFIG_X86) += $(foreach n,decompress bunzip2 unxz unlzma unlzo unlz4 earlycpio,$(n).init.o)
 
 obj-$(perfc)       += perfc.o
 obj-$(crash_debug) += gdbstub.o
--- a/xen/common/decompress.c
+++ b/xen/common/decompress.c
@@ -29,5 +29,8 @@ int __init decompress(void *inbuf, unsig
     if ( len >= 5 && !memcmp(inbuf, "\x89LZO", 5) )
         return unlzo(inbuf, len, NULL, NULL, outbuf, NULL, error);
 
+    if ( len >= 2 && !memcmp(inbuf, "\x02\x21", 2) )
+	return unlz4(inbuf, len, NULL, NULL, outbuf, NULL, error);
+
     return 1;
 }
--- /dev/null
+++ b/xen/common/lz4/decompress.c
@@ -0,0 +1,323 @@
+/*
+ * LZ4 Decompressor for Linux kernel
+ *
+ * Copyright (C) 2013, LG Electronics, Kyungsik Lee <kyungsik.lee@lge.com>
+ *
+ * Based on LZ4 implementation by Yann Collet.
+ *
+ * LZ4 - Fast LZ compression algorithm
+ * Copyright (C) 2011-2012, Yann Collet.
+ * BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are
+ * met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following disclaimer
+ * in the documentation and/or other materials provided with the
+ * distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  You can contact the author at :
+ *  - LZ4 homepage : http://fastcompression.blogspot.com/p/lz4.html 
+ *  - LZ4 source repository : http://code.google.com/p/lz4/ 
+ */
+
+#include "defs.h"
+
+#if defined(__XEN__) || defined(__MINIOS__)
+
+static int INIT lz4_uncompress(const unsigned char *source, unsigned char *dest,
+			       int osize)
+{
+	const BYTE *ip = (const BYTE *) source;
+	const BYTE *ref;
+	BYTE *op = (BYTE *) dest;
+	BYTE * const oend = op + osize;
+	BYTE *cpy;
+	unsigned token;
+	size_t length;
+	size_t dec32table[] = {0, 3, 2, 3, 0, 0, 0, 0};
+#if LZ4_ARCH64
+	size_t dec64table[] = {0, 0, 0, -1, 0, 1, 2, 3};
+#endif
+
+	while (1) {
+
+		/* get runlength */
+		token = *ip++;
+		length = (token >> ML_BITS);
+		if (length == RUN_MASK) {
+			size_t len;
+
+			len = *ip++;
+			for (; len == 255; length += 255)
+				len = *ip++;
+			length += len;
+		}
+
+		/* copy literals */
+		cpy = op + length;
+		if (unlikely(cpy > oend - COPYLENGTH)) {
+			/*
+			 * Error: not enough place for another match
+			 * (min 4) + 5 literals
+			 */
+			if (cpy != oend)
+				goto _output_error;
+
+			memcpy(op, ip, length);
+			ip += length;
+			break; /* EOF */
+		}
+		LZ4_WILDCOPY(ip, op, cpy);
+		ip -= (op - cpy);
+		op = cpy;
+
+		/* get offset */
+		LZ4_READ_LITTLEENDIAN_16(ref, cpy, ip);
+		ip += 2;
+
+		/* Error: offset create reference outside destination buffer */
+		if (unlikely(ref < (BYTE *const) dest))
+			goto _output_error;
+
+		/* get matchlength */
+		length = token & ML_MASK;
+		if (length == ML_MASK) {
+			for (; *ip == 255; length += 255)
+				ip++;
+			length += *ip++;
+		}
+
+		/* copy repeated sequence */
+		if (unlikely((op - ref) < STEPSIZE)) {
+#if LZ4_ARCH64
+			size_t dec64 = dec64table[op - ref];
+#else
+			const int dec64 = 0;
+#endif
+			op[0] = ref[0];
+			op[1] = ref[1];
+			op[2] = ref[2];
+			op[3] = ref[3];
+			op += 4;
+			ref += 4;
+			ref -= dec32table[op-ref];
+			PUT4(ref, op);
+			op += STEPSIZE - 4;
+			ref -= dec64;
+		} else {
+			LZ4_COPYSTEP(ref, op);
+		}
+		cpy = op + length - (STEPSIZE - 4);
+		if (cpy > (oend - COPYLENGTH)) {
+
+			/* Error: request to write beyond destination buffer */
+			if (cpy > oend)
+				goto _output_error;
+			LZ4_SECURECOPY(ref, op, (oend - COPYLENGTH));
+			while (op < cpy)
+				*op++ = *ref++;
+			op = cpy;
+			/*
+			 * Check EOF (should never happen, since last 5 bytes
+			 * are supposed to be literals)
+			 */
+			if (op == oend)
+				goto _output_error;
+			continue;
+		}
+		LZ4_SECURECOPY(ref, op, cpy);
+		op = cpy; /* correction */
+	}
+	/* end of decoding */
+	return (int) (ip - source);
+
+	/* write overflow error detected */
+_output_error:
+	return (int) (-(ip - source));
+}
+
+#else /* defined(__XEN__) || defined(__MINIOS__) */
+
+static int lz4_uncompress_unknownoutputsize(const unsigned char *source,
+				unsigned char *dest, int isize,
+				size_t maxoutputsize)
+{
+	const BYTE *ip = (const BYTE *) source;
+	const BYTE *const iend = ip + isize;
+	const BYTE *ref;
+
+
+	BYTE *op = (BYTE *) dest;
+	BYTE * const oend = op + maxoutputsize;
+	BYTE *cpy;
+
+	size_t dec32table[] = {0, 3, 2, 3, 0, 0, 0, 0};
+#if LZ4_ARCH64
+	size_t dec64table[] = {0, 0, 0, -1, 0, 1, 2, 3};
+#endif
+
+	/* Main Loop */
+	while (ip < iend) {
+
+		unsigned token;
+		size_t length;
+
+		/* get runlength */
+		token = *ip++;
+		length = (token >> ML_BITS);
+		if (length == RUN_MASK) {
+			int s = 255;
+			while ((ip < iend) && (s == 255)) {
+				s = *ip++;
+				length += s;
+			}
+		}
+		/* copy literals */
+		cpy = op + length;
+		if ((cpy > oend - COPYLENGTH) ||
+			(ip + length > iend - COPYLENGTH)) {
+
+			if (cpy > oend)
+				goto _output_error;/* writes beyond buffer */
+
+			if (ip + length != iend)
+				goto _output_error;/*
+						    * Error: LZ4 format requires
+						    * to consume all input
+						    * at this stage
+						    */
+			memcpy(op, ip, length);
+			op += length;
+			break;/* Necessarily EOF, due to parsing restrictions */
+		}
+		LZ4_WILDCOPY(ip, op, cpy);
+		ip -= (op - cpy);
+		op = cpy;
+
+		/* get offset */
+		LZ4_READ_LITTLEENDIAN_16(ref, cpy, ip);
+		ip += 2;
+		if (ref < (BYTE * const) dest)
+			goto _output_error;
+			/*
+			 * Error : offset creates reference
+			 * outside of destination buffer
+			 */
+
+		/* get matchlength */
+		length = (token & ML_MASK);
+		if (length == ML_MASK) {
+			while (ip < iend) {
+				int s = *ip++;
+				length += s;
+				if (s == 255)
+					continue;
+				break;
+			}
+		}
+
+		/* copy repeated sequence */
+		if (unlikely((op - ref) < STEPSIZE)) {
+#if LZ4_ARCH64
+			size_t dec64 = dec64table[op - ref];
+#else
+			const int dec64 = 0;
+#endif
+				op[0] = ref[0];
+				op[1] = ref[1];
+				op[2] = ref[2];
+				op[3] = ref[3];
+				op += 4;
+				ref += 4;
+				ref -= dec32table[op - ref];
+				PUT4(ref, op);
+				op += STEPSIZE - 4;
+				ref -= dec64;
+		} else {
+			LZ4_COPYSTEP(ref, op);
+		}
+		cpy = op + length - (STEPSIZE-4);
+		if (cpy > oend - COPYLENGTH) {
+			if (cpy > oend)
+				goto _output_error; /* write outside of buf */
+
+			LZ4_SECURECOPY(ref, op, (oend - COPYLENGTH));
+			while (op < cpy)
+				*op++ = *ref++;
+			op = cpy;
+			/*
+			 * Check EOF (should never happen, since last 5 bytes
+			 * are supposed to be literals)
+			 */
+			if (op == oend)
+				goto _output_error;
+			continue;
+		}
+		LZ4_SECURECOPY(ref, op, cpy);
+		op = cpy; /* correction */
+	}
+	/* end of decoding */
+	return (int) (op - dest);
+
+	/* write overflow error detected */
+_output_error:
+	return (int) (-(ip - source));
+}
+
+#endif
+
+#if defined(__XEN__) || defined(__MINIOS__)
+
+int INIT lz4_decompress(const unsigned char *src, size_t *src_len,
+		unsigned char *dest, size_t actual_dest_len)
+{
+	int ret = -1;
+	int input_len = 0;
+
+	input_len = lz4_uncompress(src, dest, actual_dest_len);
+	if (input_len < 0)
+		goto exit_0;
+	*src_len = input_len;
+
+	return 0;
+exit_0:
+	return ret;
+}
+
+#else /* defined(__XEN__) || defined(__MINIOS__) */
+
+int lz4_decompress_unknownoutputsize(const unsigned char *src, size_t src_len,
+		 unsigned char *dest, size_t *dest_len)
+{
+	int ret = -1;
+	int out_len = 0;
+
+	out_len = lz4_uncompress_unknownoutputsize(src, dest, src_len,
+					*dest_len);
+	if (out_len < 0)
+		goto exit_0;
+	*dest_len = out_len;
+
+	return 0;
+exit_0:
+	return ret;
+}
+
+#endif
--- /dev/null
+++ b/xen/common/lz4/defs.h
@@ -0,0 +1,184 @@
+/*
+ * lz4defs.h -- architecture specific defines
+ *
+ * Copyright (C) 2013, LG Electronics, Kyungsik Lee <kyungsik.lee@lge.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifdef __XEN__
+#include <asm/byteorder.h>
+#endif
+
+#ifdef __LITTLE_ENDIAN
+static inline u16 INIT get_unaligned_le16(const void *p)
+{
+	return le16_to_cpup(p);
+}
+
+static inline u32 INIT get_unaligned_le32(const void *p)
+{
+	return le32_to_cpup(p);
+}
+#else
+#include <asm/unaligned.h>
+
+static inline u16 INIT get_unaligned_le16(const void *p)
+{
+	return le16_to_cpu(__get_unaligned(p, 2));
+}
+
+static inline u32 INIT get_unaligned_le32(void *p)
+{
+	return le32_to_cpu(__get_unaligned(p, 4));
+}
+#endif
+
+/*
+ * Detects 64 bits mode
+ */
+#if (defined(__x86_64__) || defined(__x86_64) || defined(__amd64__) \
+	|| defined(__ppc64__) || defined(__LP64__))
+#define LZ4_ARCH64 1
+#else
+#define LZ4_ARCH64 0
+#endif
+
+/*
+ * Architecture-specific macros
+ */
+#define BYTE	u8
+typedef struct _U16_S { u16 v; } U16_S;
+typedef struct _U32_S { u32 v; } U32_S;
+typedef struct _U64_S { u64 v; } U64_S;
+#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)		\
+	|| defined(CONFIG_ARM) && __LINUX_ARM_ARCH__ >= 6	\
+	&& defined(ARM_EFFICIENT_UNALIGNED_ACCESS)
+
+#define A16(x) (((U16_S *)(x))->v)
+#define A32(x) (((U32_S *)(x))->v)
+#define A64(x) (((U64_S *)(x))->v)
+
+#define PUT4(s, d) (A32(d) = A32(s))
+#define PUT8(s, d) (A64(d) = A64(s))
+#define LZ4_WRITE_LITTLEENDIAN_16(p, v)	\
+	do {	\
+		A16(p) = v; \
+		p += 2; \
+	} while (0)
+#else /* CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS */
+
+#define A64(x) get_unaligned((u64 *)&(((U16_S *)(x))->v))
+#define A32(x) get_unaligned((u32 *)&(((U16_S *)(x))->v))
+#define A16(x) get_unaligned((u16 *)&(((U16_S *)(x))->v))
+
+#define PUT4(s, d) \
+	put_unaligned(get_unaligned((const u32 *) s), (u32 *) d)
+#define PUT8(s, d) \
+	put_unaligned(get_unaligned((const u64 *) s), (u64 *) d)
+
+#define LZ4_WRITE_LITTLEENDIAN_16(p, v)	\
+	do {	\
+		put_unaligned(v, (u16 *)(p)); \
+		p += 2; \
+	} while (0)
+#endif
+
+#define COPYLENGTH 8
+#define ML_BITS  4
+#define ML_MASK  ((1U << ML_BITS) - 1)
+#define RUN_BITS (8 - ML_BITS)
+#define RUN_MASK ((1U << RUN_BITS) - 1)
+#define MEMORY_USAGE	14
+#define MINMATCH	4
+#define SKIPSTRENGTH	6
+#define LASTLITERALS	5
+#define MFLIMIT		(COPYLENGTH + MINMATCH)
+#define MINLENGTH	(MFLIMIT + 1)
+#define MAXD_LOG	16
+#define MAXD		(1 << MAXD_LOG)
+#define MAXD_MASK	(u32)(MAXD - 1)
+#define MAX_DISTANCE	(MAXD - 1)
+#define HASH_LOG	(MAXD_LOG - 1)
+#define HASHTABLESIZE	(1 << HASH_LOG)
+#define MAX_NB_ATTEMPTS	256
+#define OPTIMAL_ML	(int)((ML_MASK-1)+MINMATCH)
+#define LZ4_64KLIMIT	((1<<16) + (MFLIMIT - 1))
+#define HASHLOG64K	((MEMORY_USAGE - 2) + 1)
+#define HASH64KTABLESIZE	(1U << HASHLOG64K)
+#define LZ4_HASH_VALUE(p)	(((A32(p)) * 2654435761U) >> \
+				((MINMATCH * 8) - (MEMORY_USAGE-2)))
+#define LZ4_HASH64K_VALUE(p)	(((A32(p)) * 2654435761U) >> \
+				((MINMATCH * 8) - HASHLOG64K))
+#define HASH_VALUE(p)		(((A32(p)) * 2654435761U) >> \
+				((MINMATCH * 8) - HASH_LOG))
+
+#if LZ4_ARCH64/* 64-bit */
+#define STEPSIZE 8
+
+#define LZ4_COPYSTEP(s, d)	\
+	do {			\
+		PUT8(s, d);	\
+		d += 8;		\
+		s += 8;		\
+	} while (0)
+
+#define LZ4_COPYPACKET(s, d)	LZ4_COPYSTEP(s, d)
+
+#define LZ4_SECURECOPY(s, d, e)			\
+	do {					\
+		if (d < e) {			\
+			LZ4_WILDCOPY(s, d, e);	\
+		}				\
+	} while (0)
+#define HTYPE u32
+
+#ifdef __BIG_ENDIAN
+#define LZ4_NBCOMMONBYTES(val) (__builtin_clzll(val) >> 3)
+#else
+#define LZ4_NBCOMMONBYTES(val) (__builtin_ctzll(val) >> 3)
+#endif
+
+#else	/* 32-bit */
+#define STEPSIZE 4
+
+#define LZ4_COPYSTEP(s, d)	\
+	do {			\
+		PUT4(s, d);	\
+		d += 4;		\
+		s += 4;		\
+	} while (0)
+
+#define LZ4_COPYPACKET(s, d)		\
+	do {				\
+		LZ4_COPYSTEP(s, d);	\
+		LZ4_COPYSTEP(s, d);	\
+	} while (0)
+
+#define LZ4_SECURECOPY	LZ4_WILDCOPY
+#define HTYPE const u8*
+
+#ifdef __BIG_ENDIAN
+#define LZ4_NBCOMMONBYTES(val) (__builtin_clz(val) >> 3)
+#else
+#define LZ4_NBCOMMONBYTES(val) (__builtin_ctz(val) >> 3)
+#endif
+
+#endif
+
+#define LZ4_READ_LITTLEENDIAN_16(d, s, p) \
+	(d = s - get_unaligned_le16(p))
+
+#define LZ4_WILDCOPY(s, d, e)		\
+	do {				\
+		LZ4_COPYPACKET(s, d);	\
+	} while (d < e)
+
+#define LZ4_BLINDCOPY(s, d, l)	\
+	do {	\
+		u8 *e = (d) + l;	\
+		LZ4_WILDCOPY(s, d, e);	\
+		d = e;	\
+	} while (0)
--- /dev/null
+++ b/xen/common/unlz4.c
@@ -0,0 +1,166 @@
+/*
+ * Wrapper for decompressing LZ4-compressed kernel, initramfs, and initrd
+ *
+ * Copyright (C) 2013, LG Electronics, Kyungsik Lee <kyungsik.lee@lge.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "decompress.h"
+#include <xen/lz4.h>
+#include "lz4/decompress.c"
+
+/*
+ * Note: Uncompressed chunk size is used in the compressor side
+ * (userspace side for compression).
+ * It is hardcoded because there is not proper way to extract it
+ * from the binary stream which is generated by the preliminary
+ * version of LZ4 tool so far.
+ */
+#define LZ4_DEFAULT_UNCOMPRESSED_CHUNK_SIZE (8 << 20)
+#define ARCHIVE_MAGICNUMBER 0x184C2102
+
+STATIC int INIT unlz4(unsigned char *input, unsigned int in_len,
+		      int (*fill)(void *, unsigned int),
+		      int (*flush)(void *, unsigned int),
+		      unsigned char *output,
+		      unsigned int *posp,
+		      void (*error)(const char *x))
+{
+	int ret = -1;
+	size_t chunksize = 0;
+	size_t uncomp_chunksize = LZ4_DEFAULT_UNCOMPRESSED_CHUNK_SIZE;
+	u8 *inp;
+	u8 *inp_start;
+	u8 *outp;
+	int size = in_len -= 4;
+#if defined(__XEN__) || defined(__MINIOS__)
+	size_t out_len = get_unaligned_le32(input + in_len);
+#endif
+	size_t dest_len;
+
+
+	if (output) {
+		outp = output;
+	} else if (!flush) {
+		error("NULL output pointer and no flush function provided");
+		goto exit_0;
+	} else {
+		outp = large_malloc(uncomp_chunksize);
+		if (!outp) {
+			error("Could not allocate output buffer");
+			goto exit_0;
+		}
+	}
+
+	if (input && fill) {
+		error("Both input pointer and fill function provided,");
+		goto exit_1;
+	} else if (input) {
+		inp = input;
+	} else if (!fill) {
+		error("NULL input pointer and missing fill function");
+		goto exit_1;
+	} else {
+		inp = large_malloc(lz4_compressbound(uncomp_chunksize));
+		if (!inp) {
+			error("Could not allocate input buffer");
+			goto exit_1;
+		}
+	}
+	inp_start = inp;
+
+	if (posp)
+		*posp = 0;
+
+	if (fill)
+		fill(inp, 4);
+
+	chunksize = get_unaligned_le32(inp);
+	if (chunksize == ARCHIVE_MAGICNUMBER) {
+		inp += 4;
+		size -= 4;
+	} else {
+		error("invalid header");
+		goto exit_2;
+	}
+
+	if (posp)
+		*posp += 4;
+
+	for (;;) {
+
+		if (fill)
+			fill(inp, 4);
+
+		chunksize = get_unaligned_le32(inp);
+		if (chunksize == ARCHIVE_MAGICNUMBER) {
+			inp += 4;
+			size -= 4;
+			if (posp)
+				*posp += 4;
+			continue;
+		}
+		inp += 4;
+		size -= 4;
+
+		if (posp)
+			*posp += 4;
+
+		if (fill) {
+			if (chunksize > lz4_compressbound(uncomp_chunksize)) {
+				error("chunk length is longer than allocated");
+				goto exit_2;
+			}
+			fill(inp, chunksize);
+		}
+#if defined(__XEN__) || defined(__MINIOS__)
+		if (out_len >= uncomp_chunksize) {
+			dest_len = uncomp_chunksize;
+			out_len -= dest_len;
+		} else
+			dest_len = out_len;
+		ret = lz4_decompress(inp, &chunksize, outp, dest_len);
+#else
+		dest_len = uncomp_chunksize;
+		ret = lz4_decompress_unknownoutputsize(inp, chunksize, outp,
+				&dest_len);
+#endif
+		if (ret < 0) {
+			error("Decoding failed");
+			goto exit_2;
+		}
+
+		if (flush && flush(outp, dest_len) != dest_len)
+			goto exit_2;
+		if (output)
+			outp += dest_len;
+		if (posp)
+			*posp += chunksize;
+
+		size -= chunksize;
+
+		if (size == 0)
+			break;
+		else if (size < 0) {
+			error("data corrupted");
+			goto exit_2;
+		}
+
+		inp += chunksize;
+		if (fill)
+			inp = inp_start;
+	}
+
+	ret = 0;
+exit_2:
+	if (!input)
+		large_free(inp_start);
+exit_1:
+	if (!output)
+		large_free(outp);
+exit_0:
+	return ret;
+}
--- a/xen/include/asm-x86/config.h
+++ b/xen/include/asm-x86/config.h
@@ -25,6 +25,7 @@
 #define CONFIG_X86_PM_TIMER 1
 #define CONFIG_HPET_TIMER 1
 #define CONFIG_X86_MCE_THERMAL 1
+#define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS 1
 #define CONFIG_NUMA 1
 #define CONFIG_DISCONTIGMEM 1
 #define CONFIG_NUMA_EMU 1
--- a/xen/include/xen/decompress.h
+++ b/xen/include/xen/decompress.h
@@ -31,7 +31,7 @@ typedef int decompress_fn(unsigned char 
  * dependent).
  */
 
-decompress_fn bunzip2, unxz, unlzma, unlzo;
+decompress_fn bunzip2, unxz, unlzma, unlzo, unlz4;
 
 int decompress(void *inbuf, unsigned int len, void *outbuf);
 
--- /dev/null
+++ b/xen/include/xen/lz4.h
@@ -0,0 +1,88 @@
+#ifndef __LZ4_H__
+#define __LZ4_H__
+
+/*
+ * LZ4 Kernel Interface
+ *
+ * Copyright (C) 2013, LG Electronics, Kyungsik Lee <kyungsik.lee@lge.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#define LZ4_MEM_COMPRESS	(4096 * sizeof(unsigned char *))
+#define LZ4HC_MEM_COMPRESS	(65538 * sizeof(unsigned char *))
+
+/*
+ * lz4_compressbound()
+ * Provides the maximum size that LZ4 may output in a "worst case" scenario
+ * (input data not compressible)
+ */
+static inline size_t lz4_compressbound(size_t isize)
+{
+	return isize + (isize / 255) + 16;
+}
+
+/*
+ * lz4_compress()
+ *	src     : source address of the original data
+ *	src_len : size of the original data
+ *	dst	: output buffer address of the compressed data
+ *		This requires 'dst' of size LZ4_COMPRESSBOUND.
+ *	dst_len : is the output size, which is returned after compress done
+ *	workmem : address of the working memory.
+ *		This requires 'workmem' of size LZ4_MEM_COMPRESS.
+ *	return  : Success if return 0
+ *		  Error if return (< 0)
+ *	note :  Destination buffer and workmem must be already allocated with
+ *		the defined size.
+ */
+int lz4_compress(const unsigned char *src, size_t src_len,
+		unsigned char *dst, size_t *dst_len, void *wrkmem);
+
+ /*
+  * lz4hc_compress()
+  *	 src	 : source address of the original data
+  *	 src_len : size of the original data
+  *	 dst	 : output buffer address of the compressed data
+  *		This requires 'dst' of size LZ4_COMPRESSBOUND.
+  *	 dst_len : is the output size, which is returned after compress done
+  *	 workmem : address of the working memory.
+  *		This requires 'workmem' of size LZ4HC_MEM_COMPRESS.
+  *	 return  : Success if return 0
+  *		   Error if return (< 0)
+  *	 note :  Destination buffer and workmem must be already allocated with
+  *		 the defined size.
+  */
+int lz4hc_compress(const unsigned char *src, size_t src_len,
+		unsigned char *dst, size_t *dst_len, void *wrkmem);
+
+/*
+ * lz4_decompress()
+ *	src     : source address of the compressed data
+ *	src_len : is the input size, whcih is returned after decompress done
+ *	dest	: output buffer address of the decompressed data
+ *	actual_dest_len: is the size of uncompressed data, supposing it's known
+ *	return  : Success if return 0
+ *		  Error if return (< 0)
+ *	note :  Destination buffer must be already allocated.
+ *		slightly faster than lz4_decompress_unknownoutputsize()
+ */
+int lz4_decompress(const unsigned char *src, size_t *src_len,
+		unsigned char *dest, size_t actual_dest_len);
+
+/*
+ * lz4_decompress_unknownoutputsize()
+ *	src     : source address of the compressed data
+ *	src_len : is the input size, therefore the compressed size
+ *	dest	: output buffer address of the decompressed data
+ *	dest_len: is the max size of the destination buffer, which is
+ *			returned with actual size of decompressed data after
+ *			decompress done
+ *	return  : Success if return 0
+ *		  Error if return (< 0)
+ *	note :  Destination buffer must be already allocated.
+ */
+int lz4_decompress_unknownoutputsize(const unsigned char *src, size_t src_len,
+		unsigned char *dest, size_t *dest_len);
+#endif



[-- Attachment #2: unlz4-dom0.patch --]
[-- Type: text/plain, Size: 24162 bytes --]

xen: add LZ4 decompression support

Add support for LZ4 decompression in Xen. LZ4 Decompression APIs for
Xen are based on LZ4 implementation by Yann Collet.

Benchmark Results(PATCH v3)
Compiler: Linaro ARM gcc 4.6.2

1. ARMv7, 1.5GHz based board
   Kernel: linux 3.4
   Uncompressed Kernel Size: 14MB
        Compressed Size  Decompression Speed
   LZO  6.7MB            20.1MB/s, 25.2MB/s(UA)
   LZ4  7.3MB            29.1MB/s, 45.6MB/s(UA)

2. ARMv7, 1.7GHz based board
   Kernel: linux 3.7
   Uncompressed Kernel Size: 14MB
        Compressed Size  Decompression Speed
   LZO  6.0MB            34.1MB/s, 52.2MB/s(UA)
   LZ4  6.5MB            86.7MB/s
- UA: Unaligned memory Access support
- Latest patch set for LZO applied

This patch set is for adding support for LZ4-compressed Kernel.  LZ4 is a
very fast lossless compression algorithm and it also features an extremely
fast decoder [1].

But we have five of decompressors already and one question which does
arise, however, is that of where do we stop adding new ones?  This issue
had been discussed and came to the conclusion [2].

Russell King said that we should have:

 - one decompressor which is the fastest
 - one decompressor for the highest compression ratio
 - one popular decompressor (eg conventional gzip)

If we have a replacement one for one of these, then it should do exactly
that: replace it.

The benchmark shows that an 8% increase in image size vs a 66% increase
in decompression speed compared to LZO(which has been known as the
fastest decompressor in the Kernel).  Therefore the "fast but may not be
small" compression title has clearly been taken by LZ4 [3].

[1] http://code.google.com/p/lz4/
[2] http://thread.gmane.org/gmane.linux.kbuild.devel/9157
[3] http://thread.gmane.org/gmane.linux.kbuild.devel/9347

LZ4 homepage: http://fastcompression.blogspot.com/p/lz4.html
LZ4 source repository: http://code.google.com/p/lz4/

Signed-off-by: Kyungsik Lee <kyungsik.lee@lge.com>
Signed-off-by: Yann Collet <yann.collet.73@gmail.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v3: Minor adjustments to avoid the tools side patch having to touch
    hypervisor side sources.

--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -49,7 +49,7 @@ obj-y += radix-tree.o
 obj-y += rbtree.o
 obj-y += lzo.o
 
-obj-bin-$(CONFIG_X86) += $(foreach n,decompress bunzip2 unxz unlzma unlzo earlycpio,$(n).init.o)
+obj-bin-$(CONFIG_X86) += $(foreach n,decompress bunzip2 unxz unlzma unlzo unlz4 earlycpio,$(n).init.o)
 
 obj-$(perfc)       += perfc.o
 obj-$(crash_debug) += gdbstub.o
--- a/xen/common/decompress.c
+++ b/xen/common/decompress.c
@@ -29,5 +29,8 @@ int __init decompress(void *inbuf, unsig
     if ( len >= 5 && !memcmp(inbuf, "\x89LZO", 5) )
         return unlzo(inbuf, len, NULL, NULL, outbuf, NULL, error);
 
+    if ( len >= 2 && !memcmp(inbuf, "\x02\x21", 2) )
+	return unlz4(inbuf, len, NULL, NULL, outbuf, NULL, error);
+
     return 1;
 }
--- /dev/null
+++ b/xen/common/lz4/decompress.c
@@ -0,0 +1,323 @@
+/*
+ * LZ4 Decompressor for Linux kernel
+ *
+ * Copyright (C) 2013, LG Electronics, Kyungsik Lee <kyungsik.lee@lge.com>
+ *
+ * Based on LZ4 implementation by Yann Collet.
+ *
+ * LZ4 - Fast LZ compression algorithm
+ * Copyright (C) 2011-2012, Yann Collet.
+ * BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are
+ * met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following disclaimer
+ * in the documentation and/or other materials provided with the
+ * distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  You can contact the author at :
+ *  - LZ4 homepage : http://fastcompression.blogspot.com/p/lz4.html
+ *  - LZ4 source repository : http://code.google.com/p/lz4/
+ */
+
+#include "defs.h"
+
+#if defined(__XEN__) || defined(__MINIOS__)
+
+static int INIT lz4_uncompress(const unsigned char *source, unsigned char *dest,
+			       int osize)
+{
+	const BYTE *ip = (const BYTE *) source;
+	const BYTE *ref;
+	BYTE *op = (BYTE *) dest;
+	BYTE * const oend = op + osize;
+	BYTE *cpy;
+	unsigned token;
+	size_t length;
+	size_t dec32table[] = {0, 3, 2, 3, 0, 0, 0, 0};
+#if LZ4_ARCH64
+	size_t dec64table[] = {0, 0, 0, -1, 0, 1, 2, 3};
+#endif
+
+	while (1) {
+
+		/* get runlength */
+		token = *ip++;
+		length = (token >> ML_BITS);
+		if (length == RUN_MASK) {
+			size_t len;
+
+			len = *ip++;
+			for (; len == 255; length += 255)
+				len = *ip++;
+			length += len;
+		}
+
+		/* copy literals */
+		cpy = op + length;
+		if (unlikely(cpy > oend - COPYLENGTH)) {
+			/*
+			 * Error: not enough place for another match
+			 * (min 4) + 5 literals
+			 */
+			if (cpy != oend)
+				goto _output_error;
+
+			memcpy(op, ip, length);
+			ip += length;
+			break; /* EOF */
+		}
+		LZ4_WILDCOPY(ip, op, cpy);
+		ip -= (op - cpy);
+		op = cpy;
+
+		/* get offset */
+		LZ4_READ_LITTLEENDIAN_16(ref, cpy, ip);
+		ip += 2;
+
+		/* Error: offset create reference outside destination buffer */
+		if (unlikely(ref < (BYTE *const) dest))
+			goto _output_error;
+
+		/* get matchlength */
+		length = token & ML_MASK;
+		if (length == ML_MASK) {
+			for (; *ip == 255; length += 255)
+				ip++;
+			length += *ip++;
+		}
+
+		/* copy repeated sequence */
+		if (unlikely((op - ref) < STEPSIZE)) {
+#if LZ4_ARCH64
+			size_t dec64 = dec64table[op - ref];
+#else
+			const int dec64 = 0;
+#endif
+			op[0] = ref[0];
+			op[1] = ref[1];
+			op[2] = ref[2];
+			op[3] = ref[3];
+			op += 4;
+			ref += 4;
+			ref -= dec32table[op-ref];
+			PUT4(ref, op);
+			op += STEPSIZE - 4;
+			ref -= dec64;
+		} else {
+			LZ4_COPYSTEP(ref, op);
+		}
+		cpy = op + length - (STEPSIZE - 4);
+		if (cpy > (oend - COPYLENGTH)) {
+
+			/* Error: request to write beyond destination buffer */
+			if (cpy > oend)
+				goto _output_error;
+			LZ4_SECURECOPY(ref, op, (oend - COPYLENGTH));
+			while (op < cpy)
+				*op++ = *ref++;
+			op = cpy;
+			/*
+			 * Check EOF (should never happen, since last 5 bytes
+			 * are supposed to be literals)
+			 */
+			if (op == oend)
+				goto _output_error;
+			continue;
+		}
+		LZ4_SECURECOPY(ref, op, cpy);
+		op = cpy; /* correction */
+	}
+	/* end of decoding */
+	return (int) (ip - source);
+
+	/* write overflow error detected */
+_output_error:
+	return (int) (-(ip - source));
+}
+
+#else /* defined(__XEN__) || defined(__MINIOS__) */
+
+static int lz4_uncompress_unknownoutputsize(const unsigned char *source,
+				unsigned char *dest, int isize,
+				size_t maxoutputsize)
+{
+	const BYTE *ip = (const BYTE *) source;
+	const BYTE *const iend = ip + isize;
+	const BYTE *ref;
+
+
+	BYTE *op = (BYTE *) dest;
+	BYTE * const oend = op + maxoutputsize;
+	BYTE *cpy;
+
+	size_t dec32table[] = {0, 3, 2, 3, 0, 0, 0, 0};
+#if LZ4_ARCH64
+	size_t dec64table[] = {0, 0, 0, -1, 0, 1, 2, 3};
+#endif
+
+	/* Main Loop */
+	while (ip < iend) {
+
+		unsigned token;
+		size_t length;
+
+		/* get runlength */
+		token = *ip++;
+		length = (token >> ML_BITS);
+		if (length == RUN_MASK) {
+			int s = 255;
+			while ((ip < iend) && (s == 255)) {
+				s = *ip++;
+				length += s;
+			}
+		}
+		/* copy literals */
+		cpy = op + length;
+		if ((cpy > oend - COPYLENGTH) ||
+			(ip + length > iend - COPYLENGTH)) {
+
+			if (cpy > oend)
+				goto _output_error;/* writes beyond buffer */
+
+			if (ip + length != iend)
+				goto _output_error;/*
+						    * Error: LZ4 format requires
+						    * to consume all input
+						    * at this stage
+						    */
+			memcpy(op, ip, length);
+			op += length;
+			break;/* Necessarily EOF, due to parsing restrictions */
+		}
+		LZ4_WILDCOPY(ip, op, cpy);
+		ip -= (op - cpy);
+		op = cpy;
+
+		/* get offset */
+		LZ4_READ_LITTLEENDIAN_16(ref, cpy, ip);
+		ip += 2;
+		if (ref < (BYTE * const) dest)
+			goto _output_error;
+			/*
+			 * Error : offset creates reference
+			 * outside of destination buffer
+			 */
+
+		/* get matchlength */
+		length = (token & ML_MASK);
+		if (length == ML_MASK) {
+			while (ip < iend) {
+				int s = *ip++;
+				length += s;
+				if (s == 255)
+					continue;
+				break;
+			}
+		}
+
+		/* copy repeated sequence */
+		if (unlikely((op - ref) < STEPSIZE)) {
+#if LZ4_ARCH64
+			size_t dec64 = dec64table[op - ref];
+#else
+			const int dec64 = 0;
+#endif
+				op[0] = ref[0];
+				op[1] = ref[1];
+				op[2] = ref[2];
+				op[3] = ref[3];
+				op += 4;
+				ref += 4;
+				ref -= dec32table[op - ref];
+				PUT4(ref, op);
+				op += STEPSIZE - 4;
+				ref -= dec64;
+		} else {
+			LZ4_COPYSTEP(ref, op);
+		}
+		cpy = op + length - (STEPSIZE-4);
+		if (cpy > oend - COPYLENGTH) {
+			if (cpy > oend)
+				goto _output_error; /* write outside of buf */
+
+			LZ4_SECURECOPY(ref, op, (oend - COPYLENGTH));
+			while (op < cpy)
+				*op++ = *ref++;
+			op = cpy;
+			/*
+			 * Check EOF (should never happen, since last 5 bytes
+			 * are supposed to be literals)
+			 */
+			if (op == oend)
+				goto _output_error;
+			continue;
+		}
+		LZ4_SECURECOPY(ref, op, cpy);
+		op = cpy; /* correction */
+	}
+	/* end of decoding */
+	return (int) (op - dest);
+
+	/* write overflow error detected */
+_output_error:
+	return (int) (-(ip - source));
+}
+
+#endif
+
+#if defined(__XEN__) || defined(__MINIOS__)
+
+int INIT lz4_decompress(const unsigned char *src, size_t *src_len,
+		unsigned char *dest, size_t actual_dest_len)
+{
+	int ret = -1;
+	int input_len = 0;
+
+	input_len = lz4_uncompress(src, dest, actual_dest_len);
+	if (input_len < 0)
+		goto exit_0;
+	*src_len = input_len;
+
+	return 0;
+exit_0:
+	return ret;
+}
+
+#else /* defined(__XEN__) || defined(__MINIOS__) */
+
+int lz4_decompress_unknownoutputsize(const unsigned char *src, size_t src_len,
+		 unsigned char *dest, size_t *dest_len)
+{
+	int ret = -1;
+	int out_len = 0;
+
+	out_len = lz4_uncompress_unknownoutputsize(src, dest, src_len,
+					*dest_len);
+	if (out_len < 0)
+		goto exit_0;
+	*dest_len = out_len;
+
+	return 0;
+exit_0:
+	return ret;
+}
+
+#endif
--- /dev/null
+++ b/xen/common/lz4/defs.h
@@ -0,0 +1,184 @@
+/*
+ * lz4defs.h -- architecture specific defines
+ *
+ * Copyright (C) 2013, LG Electronics, Kyungsik Lee <kyungsik.lee@lge.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifdef __XEN__
+#include <asm/byteorder.h>
+#endif
+
+#ifdef __LITTLE_ENDIAN
+static inline u16 INIT get_unaligned_le16(const void *p)
+{
+	return le16_to_cpup(p);
+}
+
+static inline u32 INIT get_unaligned_le32(const void *p)
+{
+	return le32_to_cpup(p);
+}
+#else
+#include <asm/unaligned.h>
+
+static inline u16 INIT get_unaligned_le16(const void *p)
+{
+	return le16_to_cpu(__get_unaligned(p, 2));
+}
+
+static inline u32 INIT get_unaligned_le32(void *p)
+{
+	return le32_to_cpu(__get_unaligned(p, 4));
+}
+#endif
+
+/*
+ * Detects 64 bits mode
+ */
+#if (defined(__x86_64__) || defined(__x86_64) || defined(__amd64__) \
+	|| defined(__ppc64__) || defined(__LP64__))
+#define LZ4_ARCH64 1
+#else
+#define LZ4_ARCH64 0
+#endif
+
+/*
+ * Architecture-specific macros
+ */
+#define BYTE	u8
+typedef struct _U16_S { u16 v; } U16_S;
+typedef struct _U32_S { u32 v; } U32_S;
+typedef struct _U64_S { u64 v; } U64_S;
+#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)		\
+	|| defined(CONFIG_ARM) && __LINUX_ARM_ARCH__ >= 6	\
+	&& defined(ARM_EFFICIENT_UNALIGNED_ACCESS)
+
+#define A16(x) (((U16_S *)(x))->v)
+#define A32(x) (((U32_S *)(x))->v)
+#define A64(x) (((U64_S *)(x))->v)
+
+#define PUT4(s, d) (A32(d) = A32(s))
+#define PUT8(s, d) (A64(d) = A64(s))
+#define LZ4_WRITE_LITTLEENDIAN_16(p, v)	\
+	do {	\
+		A16(p) = v; \
+		p += 2; \
+	} while (0)
+#else /* CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS */
+
+#define A64(x) get_unaligned((u64 *)&(((U16_S *)(x))->v))
+#define A32(x) get_unaligned((u32 *)&(((U16_S *)(x))->v))
+#define A16(x) get_unaligned((u16 *)&(((U16_S *)(x))->v))
+
+#define PUT4(s, d) \
+	put_unaligned(get_unaligned((const u32 *) s), (u32 *) d)
+#define PUT8(s, d) \
+	put_unaligned(get_unaligned((const u64 *) s), (u64 *) d)
+
+#define LZ4_WRITE_LITTLEENDIAN_16(p, v)	\
+	do {	\
+		put_unaligned(v, (u16 *)(p)); \
+		p += 2; \
+	} while (0)
+#endif
+
+#define COPYLENGTH 8
+#define ML_BITS  4
+#define ML_MASK  ((1U << ML_BITS) - 1)
+#define RUN_BITS (8 - ML_BITS)
+#define RUN_MASK ((1U << RUN_BITS) - 1)
+#define MEMORY_USAGE	14
+#define MINMATCH	4
+#define SKIPSTRENGTH	6
+#define LASTLITERALS	5
+#define MFLIMIT		(COPYLENGTH + MINMATCH)
+#define MINLENGTH	(MFLIMIT + 1)
+#define MAXD_LOG	16
+#define MAXD		(1 << MAXD_LOG)
+#define MAXD_MASK	(u32)(MAXD - 1)
+#define MAX_DISTANCE	(MAXD - 1)
+#define HASH_LOG	(MAXD_LOG - 1)
+#define HASHTABLESIZE	(1 << HASH_LOG)
+#define MAX_NB_ATTEMPTS	256
+#define OPTIMAL_ML	(int)((ML_MASK-1)+MINMATCH)
+#define LZ4_64KLIMIT	((1<<16) + (MFLIMIT - 1))
+#define HASHLOG64K	((MEMORY_USAGE - 2) + 1)
+#define HASH64KTABLESIZE	(1U << HASHLOG64K)
+#define LZ4_HASH_VALUE(p)	(((A32(p)) * 2654435761U) >> \
+				((MINMATCH * 8) - (MEMORY_USAGE-2)))
+#define LZ4_HASH64K_VALUE(p)	(((A32(p)) * 2654435761U) >> \
+				((MINMATCH * 8) - HASHLOG64K))
+#define HASH_VALUE(p)		(((A32(p)) * 2654435761U) >> \
+				((MINMATCH * 8) - HASH_LOG))
+
+#if LZ4_ARCH64/* 64-bit */
+#define STEPSIZE 8
+
+#define LZ4_COPYSTEP(s, d)	\
+	do {			\
+		PUT8(s, d);	\
+		d += 8;		\
+		s += 8;		\
+	} while (0)
+
+#define LZ4_COPYPACKET(s, d)	LZ4_COPYSTEP(s, d)
+
+#define LZ4_SECURECOPY(s, d, e)			\
+	do {					\
+		if (d < e) {			\
+			LZ4_WILDCOPY(s, d, e);	\
+		}				\
+	} while (0)
+#define HTYPE u32
+
+#ifdef __BIG_ENDIAN
+#define LZ4_NBCOMMONBYTES(val) (__builtin_clzll(val) >> 3)
+#else
+#define LZ4_NBCOMMONBYTES(val) (__builtin_ctzll(val) >> 3)
+#endif
+
+#else	/* 32-bit */
+#define STEPSIZE 4
+
+#define LZ4_COPYSTEP(s, d)	\
+	do {			\
+		PUT4(s, d);	\
+		d += 4;		\
+		s += 4;		\
+	} while (0)
+
+#define LZ4_COPYPACKET(s, d)		\
+	do {				\
+		LZ4_COPYSTEP(s, d);	\
+		LZ4_COPYSTEP(s, d);	\
+	} while (0)
+
+#define LZ4_SECURECOPY	LZ4_WILDCOPY
+#define HTYPE const u8*
+
+#ifdef __BIG_ENDIAN
+#define LZ4_NBCOMMONBYTES(val) (__builtin_clz(val) >> 3)
+#else
+#define LZ4_NBCOMMONBYTES(val) (__builtin_ctz(val) >> 3)
+#endif
+
+#endif
+
+#define LZ4_READ_LITTLEENDIAN_16(d, s, p) \
+	(d = s - get_unaligned_le16(p))
+
+#define LZ4_WILDCOPY(s, d, e)		\
+	do {				\
+		LZ4_COPYPACKET(s, d);	\
+	} while (d < e)
+
+#define LZ4_BLINDCOPY(s, d, l)	\
+	do {	\
+		u8 *e = (d) + l;	\
+		LZ4_WILDCOPY(s, d, e);	\
+		d = e;	\
+	} while (0)
--- /dev/null
+++ b/xen/common/unlz4.c
@@ -0,0 +1,166 @@
+/*
+ * Wrapper for decompressing LZ4-compressed kernel, initramfs, and initrd
+ *
+ * Copyright (C) 2013, LG Electronics, Kyungsik Lee <kyungsik.lee@lge.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "decompress.h"
+#include <xen/lz4.h>
+#include "lz4/decompress.c"
+
+/*
+ * Note: Uncompressed chunk size is used in the compressor side
+ * (userspace side for compression).
+ * It is hardcoded because there is not proper way to extract it
+ * from the binary stream which is generated by the preliminary
+ * version of LZ4 tool so far.
+ */
+#define LZ4_DEFAULT_UNCOMPRESSED_CHUNK_SIZE (8 << 20)
+#define ARCHIVE_MAGICNUMBER 0x184C2102
+
+STATIC int INIT unlz4(unsigned char *input, unsigned int in_len,
+		      int (*fill)(void *, unsigned int),
+		      int (*flush)(void *, unsigned int),
+		      unsigned char *output,
+		      unsigned int *posp,
+		      void (*error)(const char *x))
+{
+	int ret = -1;
+	size_t chunksize = 0;
+	size_t uncomp_chunksize = LZ4_DEFAULT_UNCOMPRESSED_CHUNK_SIZE;
+	u8 *inp;
+	u8 *inp_start;
+	u8 *outp;
+	int size = in_len -= 4;
+#if defined(__XEN__) || defined(__MINIOS__)
+	size_t out_len = get_unaligned_le32(input + in_len);
+#endif
+	size_t dest_len;
+
+
+	if (output) {
+		outp = output;
+	} else if (!flush) {
+		error("NULL output pointer and no flush function provided");
+		goto exit_0;
+	} else {
+		outp = large_malloc(uncomp_chunksize);
+		if (!outp) {
+			error("Could not allocate output buffer");
+			goto exit_0;
+		}
+	}
+
+	if (input && fill) {
+		error("Both input pointer and fill function provided,");
+		goto exit_1;
+	} else if (input) {
+		inp = input;
+	} else if (!fill) {
+		error("NULL input pointer and missing fill function");
+		goto exit_1;
+	} else {
+		inp = large_malloc(lz4_compressbound(uncomp_chunksize));
+		if (!inp) {
+			error("Could not allocate input buffer");
+			goto exit_1;
+		}
+	}
+	inp_start = inp;
+
+	if (posp)
+		*posp = 0;
+
+	if (fill)
+		fill(inp, 4);
+
+	chunksize = get_unaligned_le32(inp);
+	if (chunksize == ARCHIVE_MAGICNUMBER) {
+		inp += 4;
+		size -= 4;
+	} else {
+		error("invalid header");
+		goto exit_2;
+	}
+
+	if (posp)
+		*posp += 4;
+
+	for (;;) {
+
+		if (fill)
+			fill(inp, 4);
+
+		chunksize = get_unaligned_le32(inp);
+		if (chunksize == ARCHIVE_MAGICNUMBER) {
+			inp += 4;
+			size -= 4;
+			if (posp)
+				*posp += 4;
+			continue;
+		}
+		inp += 4;
+		size -= 4;
+
+		if (posp)
+			*posp += 4;
+
+		if (fill) {
+			if (chunksize > lz4_compressbound(uncomp_chunksize)) {
+				error("chunk length is longer than allocated");
+				goto exit_2;
+			}
+			fill(inp, chunksize);
+		}
+#if defined(__XEN__) || defined(__MINIOS__)
+		if (out_len >= uncomp_chunksize) {
+			dest_len = uncomp_chunksize;
+			out_len -= dest_len;
+		} else
+			dest_len = out_len;
+		ret = lz4_decompress(inp, &chunksize, outp, dest_len);
+#else
+		dest_len = uncomp_chunksize;
+		ret = lz4_decompress_unknownoutputsize(inp, chunksize, outp,
+				&dest_len);
+#endif
+		if (ret < 0) {
+			error("Decoding failed");
+			goto exit_2;
+		}
+
+		if (flush && flush(outp, dest_len) != dest_len)
+			goto exit_2;
+		if (output)
+			outp += dest_len;
+		if (posp)
+			*posp += chunksize;
+
+		size -= chunksize;
+
+		if (size == 0)
+			break;
+		else if (size < 0) {
+			error("data corrupted");
+			goto exit_2;
+		}
+
+		inp += chunksize;
+		if (fill)
+			inp = inp_start;
+	}
+
+	ret = 0;
+exit_2:
+	if (!input)
+		large_free(inp_start);
+exit_1:
+	if (!output)
+		large_free(outp);
+exit_0:
+	return ret;
+}
--- a/xen/include/asm-x86/config.h
+++ b/xen/include/asm-x86/config.h
@@ -25,6 +25,7 @@
 #define CONFIG_X86_PM_TIMER 1
 #define CONFIG_HPET_TIMER 1
 #define CONFIG_X86_MCE_THERMAL 1
+#define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS 1
 #define CONFIG_NUMA 1
 #define CONFIG_DISCONTIGMEM 1
 #define CONFIG_NUMA_EMU 1
--- a/xen/include/xen/decompress.h
+++ b/xen/include/xen/decompress.h
@@ -31,7 +31,7 @@ typedef int decompress_fn(unsigned char 
  * dependent).
  */
 
-decompress_fn bunzip2, unxz, unlzma, unlzo;
+decompress_fn bunzip2, unxz, unlzma, unlzo, unlz4;
 
 int decompress(void *inbuf, unsigned int len, void *outbuf);
 
--- /dev/null
+++ b/xen/include/xen/lz4.h
@@ -0,0 +1,88 @@
+#ifndef __LZ4_H__
+#define __LZ4_H__
+
+/*
+ * LZ4 Kernel Interface
+ *
+ * Copyright (C) 2013, LG Electronics, Kyungsik Lee <kyungsik.lee@lge.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#define LZ4_MEM_COMPRESS	(4096 * sizeof(unsigned char *))
+#define LZ4HC_MEM_COMPRESS	(65538 * sizeof(unsigned char *))
+
+/*
+ * lz4_compressbound()
+ * Provides the maximum size that LZ4 may output in a "worst case" scenario
+ * (input data not compressible)
+ */
+static inline size_t lz4_compressbound(size_t isize)
+{
+	return isize + (isize / 255) + 16;
+}
+
+/*
+ * lz4_compress()
+ *	src     : source address of the original data
+ *	src_len : size of the original data
+ *	dst	: output buffer address of the compressed data
+ *		This requires 'dst' of size LZ4_COMPRESSBOUND.
+ *	dst_len : is the output size, which is returned after compress done
+ *	workmem : address of the working memory.
+ *		This requires 'workmem' of size LZ4_MEM_COMPRESS.
+ *	return  : Success if return 0
+ *		  Error if return (< 0)
+ *	note :  Destination buffer and workmem must be already allocated with
+ *		the defined size.
+ */
+int lz4_compress(const unsigned char *src, size_t src_len,
+		unsigned char *dst, size_t *dst_len, void *wrkmem);
+
+ /*
+  * lz4hc_compress()
+  *	 src	 : source address of the original data
+  *	 src_len : size of the original data
+  *	 dst	 : output buffer address of the compressed data
+  *		This requires 'dst' of size LZ4_COMPRESSBOUND.
+  *	 dst_len : is the output size, which is returned after compress done
+  *	 workmem : address of the working memory.
+  *		This requires 'workmem' of size LZ4HC_MEM_COMPRESS.
+  *	 return  : Success if return 0
+  *		   Error if return (< 0)
+  *	 note :  Destination buffer and workmem must be already allocated with
+  *		 the defined size.
+  */
+int lz4hc_compress(const unsigned char *src, size_t src_len,
+		unsigned char *dst, size_t *dst_len, void *wrkmem);
+
+/*
+ * lz4_decompress()
+ *	src     : source address of the compressed data
+ *	src_len : is the input size, whcih is returned after decompress done
+ *	dest	: output buffer address of the decompressed data
+ *	actual_dest_len: is the size of uncompressed data, supposing it's known
+ *	return  : Success if return 0
+ *		  Error if return (< 0)
+ *	note :  Destination buffer must be already allocated.
+ *		slightly faster than lz4_decompress_unknownoutputsize()
+ */
+int lz4_decompress(const unsigned char *src, size_t *src_len,
+		unsigned char *dest, size_t actual_dest_len);
+
+/*
+ * lz4_decompress_unknownoutputsize()
+ *	src     : source address of the compressed data
+ *	src_len : is the input size, therefore the compressed size
+ *	dest	: output buffer address of the decompressed data
+ *	dest_len: is the max size of the destination buffer, which is
+ *			returned with actual size of decompressed data after
+ *			decompress done
+ *	return  : Success if return 0
+ *		  Error if return (< 0)
+ *	note :  Destination buffer must be already allocated.
+ */
+int lz4_decompress_unknownoutputsize(const unsigned char *src, size_t src_len,
+		unsigned char *dest, size_t *dest_len);
+#endif

[-- Attachment #3: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v3 2/2] libxc: add LZ4 decompression support
  2013-09-30 11:40 ` [PATCH v3 0/2] add LZ4 kernel " Jan Beulich
  2013-09-30 11:44   ` [PATCH v3 1/2] xen: add LZ4 " Jan Beulich
@ 2013-09-30 11:45   ` Jan Beulich
  2013-10-04 11:16     ` Ian Campbell
  2013-09-30 12:10   ` [PATCH v3 0/2] add LZ4 kernel " Keir Fraser
  2 siblings, 1 reply; 29+ messages in thread
From: Jan Beulich @ 2013-09-30 11:45 UTC (permalink / raw)
  To: xen-devel; +Cc: yann.collet.73, Keir Fraser, Ian Jackson, Ian Campbell

[-- Attachment #1: Type: text/plain, Size: 5023 bytes --]

Since there's no shared or static library to link against, this simply
re-uses the hypervisor side code. However, I only audited the code
added here for possible security issues, not the referenced code in
the hypervisor tree.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v3: Fix placement of DOMPRINTF() invocation. Split implementation into
    its own source file. Use the safe decompressor variant in non-
    Mini-OS libxc (lz4_decompress_unknownoutputsize()).
---
I intentionally retained the tab indentation in the code cloned from
its hypervisor original (which in turn retained it as being a clone
of the Linux original), to ease diff-ing.

--- a/tools/libxc/Makefile
+++ b/tools/libxc/Makefile
@@ -63,6 +63,7 @@ $(patsubst %.c,%.opic,$(ELF_SRCS-y)): CF
 GUEST_SRCS-y                 += xc_dom_core.c xc_dom_boot.c
 GUEST_SRCS-y                 += xc_dom_elfloader.c
 GUEST_SRCS-$(CONFIG_X86)     += xc_dom_bzimageloader.c
+GUEST_SRCS-$(CONFIG_X86)     += xc_dom_decompress_lz4.c
 GUEST_SRCS-$(CONFIG_ARM)     += xc_dom_armzimageloader.c
 GUEST_SRCS-y                 += xc_dom_binloader.c
 GUEST_SRCS-y                 += xc_dom_compat_linux.c
--- a/tools/libxc/xc_dom_bzimageloader.c
+++ b/tools/libxc/xc_dom_bzimageloader.c
@@ -33,7 +33,7 @@
 #include <inttypes.h>
 
 #include "xg_private.h"
-#include "xc_dom.h"
+#include "xc_dom_decompress.h"
 
 #ifndef __MINIOS__
 
@@ -733,6 +733,17 @@ static int xc_dom_probe_bzimage_kernel(s
             return -EINVAL;
         }
     }
+    else if ( check_magic(dom, "\x02\x21", 2) )
+    {
+        ret = xc_try_lz4_decode(dom, &dom->kernel_blob, &dom->kernel_size);
+        if ( ret < 0 )
+        {
+            xc_dom_panic(dom->xch, XC_INVALID_KERNEL,
+                         "%s unable to LZ4 decompress kernel\n",
+                         __FUNCTION__);
+            return -EINVAL;
+        }
+    }
     else
     {
         xc_dom_panic(dom->xch, XC_INVALID_KERNEL,
--- /dev/null
+++ b/tools/libxc/xc_dom_decompress.h
@@ -0,0 +1,8 @@
+#ifndef __MINIOS__
+# include "xc_dom.h"
+#else
+# include "xc_dom_decompress_unsafe.h"
+#endif
+
+int xc_try_lz4_decode(struct xc_dom_image *dom, void **blob, size_t *size);
+
--- /dev/null
+++ b/tools/libxc/xc_dom_decompress_lz4.c
@@ -0,0 +1,132 @@
+#include <stdio.h>
+#include <stdlib.h>
+#include <inttypes.h>
+#include <endian.h>
+#include <stdint.h>
+
+#include "xg_private.h"
+#include "xc_dom_decompress.h"
+
+#define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
+
+typedef uint8_t u8;
+typedef uint16_t u16;
+typedef uint32_t u32;
+typedef uint64_t u64;
+
+#define likely(a) a
+#define unlikely(a) a
+
+static inline uint_fast16_t le16_to_cpup(const unsigned char *buf)
+{
+    return buf[0] | (buf[1] << 8);
+}
+
+static inline uint_fast32_t le32_to_cpup(const unsigned char *buf)
+{
+    return le16_to_cpup(buf) | ((uint32_t)le16_to_cpup(buf + 2) << 16);
+}
+
+#include "../../xen/include/xen/lz4.h"
+#include "../../xen/common/decompress.h"
+
+#ifndef __MINIOS__
+
+#include "../../xen/common/lz4/decompress.c"
+
+#define ARCHIVE_MAGICNUMBER 0x184C2102
+
+int xc_try_lz4_decode(
+	struct xc_dom_image *dom, void **blob, size_t *psize)
+{
+	int ret = -1;
+	unsigned char *inp = *blob, *output, *outp;
+	ssize_t size = *psize - 4;
+	size_t out_len, dest_len, chunksize;
+	const char *msg;
+
+	if (size < 4) {
+		msg = "input too small";
+		goto exit_0;
+	}
+
+	out_len = get_unaligned_le32(inp + size);
+	if (xc_dom_kernel_check_size(dom, out_len)) {
+		msg = "Decompressed image too large";
+		goto exit_0;
+	}
+
+	output = malloc(out_len);
+	if (!output) {
+		msg = "Could not allocate output buffer";
+		goto exit_0;
+	}
+	outp = output;
+
+	chunksize = get_unaligned_le32(inp);
+	if (chunksize == ARCHIVE_MAGICNUMBER) {
+		inp += 4;
+		size -= 4;
+	} else {
+		msg = "invalid header";
+		goto exit_2;
+	}
+
+	for (;;) {
+		if (size < 4) {
+			msg = "missing data";
+			goto exit_2;
+		}
+		chunksize = get_unaligned_le32(inp);
+		if (chunksize == ARCHIVE_MAGICNUMBER) {
+			inp += 4;
+			size -= 4;
+			continue;
+		}
+		inp += 4;
+		size -= 4;
+
+		dest_len = out_len - (outp - output);
+		ret = lz4_decompress_unknownoutputsize(inp, chunksize, outp,
+				&dest_len);
+		if (ret < 0) {
+			msg = "decoding failed";
+			goto exit_2;
+		}
+
+		outp += dest_len;
+		size -= chunksize;
+
+		if (size == 0)
+		{
+			*blob = output;
+			*psize = out_len;
+			return 0;
+		}
+
+		if (size < 0) {
+			msg = "data corrupted";
+			goto exit_2;
+		}
+
+		inp += chunksize;
+	}
+
+exit_2:
+	free(output);
+exit_0:
+	DOMPRINTF("LZ4 decompression error: %s\n", msg);
+	return ret;
+}
+
+#else /* __MINIOS__ */
+
+#include "../../xen/common/unlz4.c"
+
+int xc_try_lz4_decode(
+    struct xc_dom_image *dom, void **blob, size_t *size)
+{
+    return xc_dom_decompress_unsafe(unlz4, dom, blob, size);
+}
+
+#endif



[-- Attachment #2: unlz4-domU.patch --]
[-- Type: text/plain, Size: 5059 bytes --]

libxc: add LZ4 decompression support

Since there's no shared or static library to link against, this simply
re-uses the hypervisor side code. However, I only audited the code
added here for possible security issues, not the referenced code in
the hypervisor tree.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v3: Fix placement of DOMPRINTF() invocation. Split implementation into
    its own source file. Use the safe decompressor variant in non-
    Mini-OS libxc (lz4_decompress_unknownoutputsize()).
---
I intentionally retained the tab indentation in the code cloned from
its hypervisor original (which in turn retained it as being a clone
of the Linux original), to ease diff-ing.

--- a/tools/libxc/Makefile
+++ b/tools/libxc/Makefile
@@ -63,6 +63,7 @@ $(patsubst %.c,%.opic,$(ELF_SRCS-y)): CF
 GUEST_SRCS-y                 += xc_dom_core.c xc_dom_boot.c
 GUEST_SRCS-y                 += xc_dom_elfloader.c
 GUEST_SRCS-$(CONFIG_X86)     += xc_dom_bzimageloader.c
+GUEST_SRCS-$(CONFIG_X86)     += xc_dom_decompress_lz4.c
 GUEST_SRCS-$(CONFIG_ARM)     += xc_dom_armzimageloader.c
 GUEST_SRCS-y                 += xc_dom_binloader.c
 GUEST_SRCS-y                 += xc_dom_compat_linux.c
--- a/tools/libxc/xc_dom_bzimageloader.c
+++ b/tools/libxc/xc_dom_bzimageloader.c
@@ -33,7 +33,7 @@
 #include <inttypes.h>
 
 #include "xg_private.h"
-#include "xc_dom.h"
+#include "xc_dom_decompress.h"
 
 #ifndef __MINIOS__
 
@@ -733,6 +733,17 @@ static int xc_dom_probe_bzimage_kernel(s
             return -EINVAL;
         }
     }
+    else if ( check_magic(dom, "\x02\x21", 2) )
+    {
+        ret = xc_try_lz4_decode(dom, &dom->kernel_blob, &dom->kernel_size);
+        if ( ret < 0 )
+        {
+            xc_dom_panic(dom->xch, XC_INVALID_KERNEL,
+                         "%s unable to LZ4 decompress kernel\n",
+                         __FUNCTION__);
+            return -EINVAL;
+        }
+    }
     else
     {
         xc_dom_panic(dom->xch, XC_INVALID_KERNEL,
--- /dev/null
+++ b/tools/libxc/xc_dom_decompress.h
@@ -0,0 +1,8 @@
+#ifndef __MINIOS__
+# include "xc_dom.h"
+#else
+# include "xc_dom_decompress_unsafe.h"
+#endif
+
+int xc_try_lz4_decode(struct xc_dom_image *dom, void **blob, size_t *size);
+
--- /dev/null
+++ b/tools/libxc/xc_dom_decompress_lz4.c
@@ -0,0 +1,132 @@
+#include <stdio.h>
+#include <stdlib.h>
+#include <inttypes.h>
+#include <endian.h>
+#include <stdint.h>
+
+#include "xg_private.h"
+#include "xc_dom_decompress.h"
+
+#define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
+
+typedef uint8_t u8;
+typedef uint16_t u16;
+typedef uint32_t u32;
+typedef uint64_t u64;
+
+#define likely(a) a
+#define unlikely(a) a
+
+static inline uint_fast16_t le16_to_cpup(const unsigned char *buf)
+{
+    return buf[0] | (buf[1] << 8);
+}
+
+static inline uint_fast32_t le32_to_cpup(const unsigned char *buf)
+{
+    return le16_to_cpup(buf) | ((uint32_t)le16_to_cpup(buf + 2) << 16);
+}
+
+#include "../../xen/include/xen/lz4.h"
+#include "../../xen/common/decompress.h"
+
+#ifndef __MINIOS__
+
+#include "../../xen/common/lz4/decompress.c"
+
+#define ARCHIVE_MAGICNUMBER 0x184C2102
+
+int xc_try_lz4_decode(
+	struct xc_dom_image *dom, void **blob, size_t *psize)
+{
+	int ret = -1;
+	unsigned char *inp = *blob, *output, *outp;
+	ssize_t size = *psize - 4;
+	size_t out_len, dest_len, chunksize;
+	const char *msg;
+
+	if (size < 4) {
+		msg = "input too small";
+		goto exit_0;
+	}
+
+	out_len = get_unaligned_le32(inp + size);
+	if (xc_dom_kernel_check_size(dom, out_len)) {
+		msg = "Decompressed image too large";
+		goto exit_0;
+	}
+
+	output = malloc(out_len);
+	if (!output) {
+		msg = "Could not allocate output buffer";
+		goto exit_0;
+	}
+	outp = output;
+
+	chunksize = get_unaligned_le32(inp);
+	if (chunksize == ARCHIVE_MAGICNUMBER) {
+		inp += 4;
+		size -= 4;
+	} else {
+		msg = "invalid header";
+		goto exit_2;
+	}
+
+	for (;;) {
+		if (size < 4) {
+			msg = "missing data";
+			goto exit_2;
+		}
+		chunksize = get_unaligned_le32(inp);
+		if (chunksize == ARCHIVE_MAGICNUMBER) {
+			inp += 4;
+			size -= 4;
+			continue;
+		}
+		inp += 4;
+		size -= 4;
+
+		dest_len = out_len - (outp - output);
+		ret = lz4_decompress_unknownoutputsize(inp, chunksize, outp,
+				&dest_len);
+		if (ret < 0) {
+			msg = "decoding failed";
+			goto exit_2;
+		}
+
+		outp += dest_len;
+		size -= chunksize;
+
+		if (size == 0)
+		{
+			*blob = output;
+			*psize = out_len;
+			return 0;
+		}
+
+		if (size < 0) {
+			msg = "data corrupted";
+			goto exit_2;
+		}
+
+		inp += chunksize;
+	}
+
+exit_2:
+	free(output);
+exit_0:
+	DOMPRINTF("LZ4 decompression error: %s\n", msg);
+	return ret;
+}
+
+#else /* __MINIOS__ */
+
+#include "../../xen/common/unlz4.c"
+
+int xc_try_lz4_decode(
+    struct xc_dom_image *dom, void **blob, size_t *size)
+{
+    return xc_dom_decompress_unsafe(unlz4, dom, blob, size);
+}
+
+#endif

[-- Attachment #3: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 0/2] add LZ4 kernel decompression support
  2013-09-30 11:40 ` [PATCH v3 0/2] add LZ4 kernel " Jan Beulich
  2013-09-30 11:44   ` [PATCH v3 1/2] xen: add LZ4 " Jan Beulich
  2013-09-30 11:45   ` [PATCH v3 2/2] libxc: " Jan Beulich
@ 2013-09-30 12:10   ` Keir Fraser
  2 siblings, 0 replies; 29+ messages in thread
From: Keir Fraser @ 2013-09-30 12:10 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: yann.collet.73, Ian Jackson, Ian Campbell

On 30/09/2013 12:40, "Jan Beulich" <JBeulich@suse.com> wrote:

> Linux 3.11 added respective support, so I think we should follow
> suit.
> 
> 1: xen: add LZ4 decompression support
> 2: libxc: add LZ4 decompression support
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Keir Fraser <keir@xen.org>

> ---
> v3: Tools side adjustments: Fix placement of DOMPRINTF()
>     invocation. Split implementation into its own source file. Use the
>     safe decompressor variant in non-Mini-OS libxc
>     (lz4_decompress_unknownoutputsize()). Minor adjustments to
>     hypervisor side avoid the tools side patch having to touch the
>     former.
> 
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 2/2] libxc: add LZ4 decompression support
  2013-09-30 11:45   ` [PATCH v3 2/2] libxc: " Jan Beulich
@ 2013-10-04 11:16     ` Ian Campbell
  2013-10-04 14:19       ` Jan Beulich
  0 siblings, 1 reply; 29+ messages in thread
From: Ian Campbell @ 2013-10-04 11:16 UTC (permalink / raw)
  To: Jan Beulich; +Cc: yann.collet.73, xen-devel, Keir Fraser, Ian Jackson

On Mon, 2013-09-30 at 12:45 +0100, Jan Beulich wrote:

> +		chunksize = get_unaligned_le32(inp);
> +		if (chunksize == ARCHIVE_MAGICNUMBER) {
> +			inp += 4;
> +			size -= 4;
> +			continue;
> +		}
> +		inp += 4;
> +		size -= 4;

I think you want a chunksize < size check here, in case a malicious file
tries to run off the end.

The rest looks good, so with e.g. this inserted:
		if (chunksize < size) {
			msg = "insufficient input data";
			goto exit_2;
		}

Acked-by: Ian Campbell <ian.campbell@citrix.com>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 2/2] libxc: add LZ4 decompression support
  2013-10-04 11:16     ` Ian Campbell
@ 2013-10-04 14:19       ` Jan Beulich
  2013-10-04 14:30         ` Ian Campbell
  0 siblings, 1 reply; 29+ messages in thread
From: Jan Beulich @ 2013-10-04 14:19 UTC (permalink / raw)
  To: Ian Campbell; +Cc: yann.collet.73, xen-devel, Keir Fraser, Ian Jackson

>>> On 04.10.13 at 13:16, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Mon, 2013-09-30 at 12:45 +0100, Jan Beulich wrote:
> 
>> +		chunksize = get_unaligned_le32(inp);
>> +		if (chunksize == ARCHIVE_MAGICNUMBER) {
>> +			inp += 4;
>> +			size -= 4;
>> +			continue;
>> +		}
>> +		inp += 4;
>> +		size -= 4;
> 
> I think you want a chunksize < size check here, in case a malicious file
> tries to run off the end.
> 
> The rest looks good, so with e.g. this inserted:
> 		if (chunksize < size) {

Did you perhaps mean the opposite

 		if (chunksize > size) {

Jan

> 			msg = "insufficient input data";
> 			goto exit_2;
> 		}
> 
> Acked-by: Ian Campbell <ian.campbell@citrix.com>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 2/2] libxc: add LZ4 decompression support
  2013-10-04 14:19       ` Jan Beulich
@ 2013-10-04 14:30         ` Ian Campbell
  0 siblings, 0 replies; 29+ messages in thread
From: Ian Campbell @ 2013-10-04 14:30 UTC (permalink / raw)
  To: Jan Beulich; +Cc: yann.collet.73, xen-devel, Keir Fraser, Ian Jackson

On Fri, 2013-10-04 at 15:19 +0100, Jan Beulich wrote:
> >>> On 04.10.13 at 13:16, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> > On Mon, 2013-09-30 at 12:45 +0100, Jan Beulich wrote:
> > 
> >> +		chunksize = get_unaligned_le32(inp);
> >> +		if (chunksize == ARCHIVE_MAGICNUMBER) {
> >> +			inp += 4;
> >> +			size -= 4;
> >> +			continue;
> >> +		}
> >> +		inp += 4;
> >> +		size -= 4;
> > 
> > I think you want a chunksize < size check here, in case a malicious file
> > tries to run off the end.
> > 
> > The rest looks good, so with e.g. this inserted:
> > 		if (chunksize < size) {
> 
> Did you perhaps mean the opposite
> 
>  		if (chunksize > size) {

Yes. Doh!

> 
> Jan
> 
> > 			msg = "insufficient input data";
> > 			goto exit_2;
> > 		}
> > 
> > Acked-by: Ian Campbell <ian.campbell@citrix.com>
> 
> 
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2013-10-04 14:30 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-09-23 14:51 [PATCH 0/2] add LZ4 kernel decompression support Jan Beulich
2013-09-23 14:54 ` [PATCH 1/2] xen: add LZ4 " Jan Beulich
2013-09-23 15:00   ` Ian Campbell
2013-09-23 15:10     ` Jan Beulich
2013-09-23 14:56 ` [PATCH 2/2] libxc: " Jan Beulich
2013-09-23 15:00   ` Ian Campbell
2013-09-23 15:22     ` Jan Beulich
2013-09-24  9:34       ` Ian Campbell
2013-09-24  9:49         ` Jan Beulich
2013-09-24 10:01           ` Ian Campbell
2013-09-24 10:17             ` Jan Beulich
2013-09-24 10:18               ` Ian Campbell
2013-09-24 10:05         ` Ian Campbell
2013-09-24 15:02           ` Jan Beulich
2013-09-24 16:29             ` [PATCH v2] " Jan Beulich
2013-09-24 16:53               ` Ian Campbell
2013-09-24 19:55                 ` Yann Collet
2013-09-25  7:37                   ` Ian Campbell
2013-09-25  8:06                     ` Yann Collet
2013-09-25  8:36                       ` Jan Beulich
2013-09-25  7:23                 ` Jan Beulich
2013-09-25  7:35                   ` Ian Campbell
2013-09-30 11:40 ` [PATCH v3 0/2] add LZ4 kernel " Jan Beulich
2013-09-30 11:44   ` [PATCH v3 1/2] xen: add LZ4 " Jan Beulich
2013-09-30 11:45   ` [PATCH v3 2/2] libxc: " Jan Beulich
2013-10-04 11:16     ` Ian Campbell
2013-10-04 14:19       ` Jan Beulich
2013-10-04 14:30         ` Ian Campbell
2013-09-30 12:10   ` [PATCH v3 0/2] add LZ4 kernel " Keir Fraser

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.