linux-kernel.vger.kernel.org archive mirror
* [RFC PATCH 0/4] Add support for LZ4-compressed kernels
@ 2013-01-26  5:50 Kyungsik Lee
  2013-01-26  5:50 ` [RFC PATCH 1/4] decompressors: add lz4 decompressor module Kyungsik Lee
                   ` (5 more replies)
  0 siblings, 6 replies; 30+ messages in thread
From: Kyungsik Lee @ 2013-01-26  5:50 UTC (permalink / raw)
  To: Russell King, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Michal Marek, linux-arm-kernel, linux-kernel, linux-kbuild, x86
  Cc: Nitin Gupta, Richard Purdie, Josh Triplett, Joe Millenbach,
	Andrew Morton, Albin Tonnerre, hyojun.im, chan.jeong, gunho.lee,
	minchan.kim, namhyung.kim, raphael.andy.lee, Kyungsik Lee

This patchset adds support for LZ4-compressed kernels and initial ramdisks
on the x86 and ARM architectures.

According to http://code.google.com/p/lz4/, LZ4 is a very fast lossless
compression algorithm and also features an extremely fast decoder.

The kernel decompression APIs are based on the implementation by Yann Collet
(http://code.google.com/p/lz4/source/checkout).
Compression and decompression tools are also available from the site above.

Initial test results on an ARM(v7)-based board show that an LZ4-compressed
kernel is 8% bigger than an LZO-compressed one, but it decompresses faster
(especially when unaligned memory access is enabled).

Test: 3.4-based kernel built with many modules
Uncompressed kernel size: 13MB
lzo: 6.3MB, 301ms
lz4: 6.8MB, 251ms (167ms with unaligned memory access enabled)

It seems worthwhile to try an LZ4-compressed kernel image or ramdisk
to make the kernel boot faster.
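
As a worked check of the figures quoted above, the following sketch computes
the relative size and time differences between LZ4 and LZO. The function names
are made up for this illustration; only the input numbers come from the test
results in this mail.

```c
#include <assert.h>
#include <stdio.h>

/* Relative change of `value` against `baseline`, in percent. */
double percent_change(double baseline, double value)
{
	assert(baseline != 0.0);
	return (value - baseline) / baseline * 100.0;
}

/* Print the LZ4-vs-LZO comparison using the figures from the cover letter. */
void print_lz4_vs_lzo(void)
{
	const double lzo_mb = 6.3, lz4_mb = 6.8;
	const double lzo_ms = 301.0, lz4_ms = 251.0, lz4_unaligned_ms = 167.0;

	printf("size: %+.1f%%\n", percent_change(lzo_mb, lz4_mb));
	printf("time: %+.1f%%\n", percent_change(lzo_ms, lz4_ms));
	printf("time (unaligned access): %+.1f%%\n",
	       percent_change(lzo_ms, lz4_unaligned_ms));
}
```

Calling print_lz4_vs_lzo() reports roughly +7.9% for size, -16.6% for
decompression time, and -44.5% with unaligned access enabled, which matches
the "8% bigger" figure above.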

Thanks,
Kyungsik


Kyungsik Lee (4):
  decompressors: add lz4 decompressor module
  lib: add support for LZ4-compressed kernels
  arm: add support for LZ4-compressed kernels
  x86: add support for LZ4-compressed kernels

 arch/arm/Kconfig                      |   1 +
 arch/arm/boot/compressed/.gitignore   |   1 +
 arch/arm/boot/compressed/Makefile     |   3 +-
 arch/arm/boot/compressed/decompress.c |   4 +
 arch/arm/boot/compressed/piggy.lz4.S  |   6 +
 arch/x86/Kconfig                      |   1 +
 arch/x86/boot/compressed/Makefile     |   5 +-
 arch/x86/boot/compressed/misc.c       |   4 +
 include/linux/decompress/unlz4.h      |  10 ++
 include/linux/lz4.h                   |  62 +++++++++++
 init/Kconfig                          |  13 ++-
 lib/Kconfig                           |   7 ++
 lib/Makefile                          |   2 +
 lib/decompress.c                      |   5 +
 lib/decompress_unlz4.c                | 199 ++++++++++++++++++++++++++++++++++
 lib/lz4/Makefile                      |   1 +
 lib/lz4/lz4_decompress.c              | 199 ++++++++++++++++++++++++++++++++++
 lib/lz4/lz4defs.h                     | 129 ++++++++++++++++++++++
 scripts/Makefile.lib                  |   5 +
 usr/Kconfig                           |   9 ++
 20 files changed, 663 insertions(+), 3 deletions(-)
 create mode 100644 arch/arm/boot/compressed/piggy.lz4.S
 create mode 100644 include/linux/decompress/unlz4.h
 create mode 100644 include/linux/lz4.h
 create mode 100644 lib/decompress_unlz4.c
 create mode 100644 lib/lz4/Makefile
 create mode 100644 lib/lz4/lz4_decompress.c
 create mode 100644 lib/lz4/lz4defs.h

-- 
1.8.0.3



* [RFC PATCH 1/4] decompressors: add lz4 decompressor module
  2013-01-26  5:50 [RFC PATCH 0/4] Add support for LZ4-compressed kernels Kyungsik Lee
@ 2013-01-26  5:50 ` Kyungsik Lee
  2013-01-26  5:50 ` [RFC PATCH 2/4] lib: add support for LZ4-compressed kernels Kyungsik Lee
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 30+ messages in thread
From: Kyungsik Lee @ 2013-01-26  5:50 UTC (permalink / raw)
  To: Russell King, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Michal Marek, linux-arm-kernel, linux-kernel, linux-kbuild, x86
  Cc: Nitin Gupta, Richard Purdie, Josh Triplett, Joe Millenbach,
	Andrew Morton, Albin Tonnerre, hyojun.im, chan.jeong, gunho.lee,
	minchan.kim, namhyung.kim, raphael.andy.lee, Kyungsik Lee

This patch adds support for LZ4 decompression in the kernel.
The kernel's LZ4 decompression APIs are based on the LZ4 implementation
by Yann Collet.

LZ4 homepage : http://fastcompression.blogspot.com/p/lz4.html
LZ4 source repository : http://code.google.com/p/lz4/
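
For readers unfamiliar with the block format that the decompressor in this
patch walks through, here is a minimal userspace sketch of an LZ4 block
decoder. It is not the code from this patch: it copies byte by byte with
explicit bounds checks instead of using the wild-copy macros, and the name
lz4_block_decode() is made up for illustration. It assumes the usual LZ4 block
layout: a token byte (high nibble literal length, low nibble match length
minus 4), optional length extension bytes, the literals, then a 16-bit
little-endian match offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Decode one raw LZ4 block from src into dst.
 * Returns the number of bytes written, or -1 on malformed input or overflow.
 * Hypothetical helper for illustration; not the kernel's lz4_decompress().
 */
long lz4_block_decode(const uint8_t *src, size_t src_len,
		      uint8_t *dst, size_t dst_cap)
{
	size_t ip = 0, op = 0;

	assert(src && dst);
	while (ip < src_len) {
		unsigned token = src[ip++];
		size_t len = token >> 4;	/* literal run length */

		if (len == 15) {		/* length continues in extra bytes */
			uint8_t s;
			do {
				if (ip >= src_len)
					return -1;
				s = src[ip++];
				len += s;
			} while (s == 255);
		}
		if (len > src_len - ip || len > dst_cap - op)
			return -1;
		memcpy(dst + op, src + ip, len);
		ip += len;
		op += len;
		if (ip == src_len)		/* last sequence: literals only */
			return (long)op;

		if (src_len - ip < 2)
			return -1;
		size_t offset = src[ip] | ((size_t)src[ip + 1] << 8);
		ip += 2;
		if (offset == 0 || offset > op)
			return -1;		/* match would start before dst */

		len = token & 15;		/* match length minus the minimum of 4 */
		if (len == 15) {
			uint8_t s;
			do {
				if (ip >= src_len)
					return -1;
				s = src[ip++];
				len += s;
			} while (s == 255);
		}
		len += 4;
		if (len > dst_cap - op)
			return -1;
		/* Byte-wise copy: source and destination may overlap. */
		for (size_t i = 0; i < len; i++, op++)
			dst[op] = dst[op - offset];
	}
	return -1;	/* input ended without a final literal-only sequence */
}
```

For example, the 8-byte block { 0x35, 'a', 'b', 'c', 0x03, 0x00, 0x10, 'X' }
decodes to the 13 bytes "abcabcabcabcX": three literals, a 9-byte match at
offset 3, and one trailing literal.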

Signed-off-by: Kyungsik Lee <kyungsik.lee@lge.com>
---
 include/linux/lz4.h      |  62 +++++++++++++++
 lib/lz4/lz4_decompress.c | 199 +++++++++++++++++++++++++++++++++++++++++++++++
 lib/lz4/lz4defs.h        | 129 ++++++++++++++++++++++++++++++
 3 files changed, 390 insertions(+)
 create mode 100644 include/linux/lz4.h
 create mode 100644 lib/lz4/lz4_decompress.c
 create mode 100644 lib/lz4/lz4defs.h

diff --git a/include/linux/lz4.h b/include/linux/lz4.h
new file mode 100644
index 0000000..df03dd8
--- /dev/null
+++ b/include/linux/lz4.h
@@ -0,0 +1,62 @@
+#ifndef __LZ4_H__
+#define __LZ4_H__
+/*
+ * LZ4 Decompressor Kernel Interface
+ *
+ * Copyright (C) 2013, LG Electronics, Kyungsik Lee <kyungsik.lee@lge.com>
+ * Based on LZ4 implementation by Yann Collet.
+ *
+ * LZ4 - Fast LZ compression algorithm
+ * Copyright (C) 2011-2012, Yann Collet.
+ * BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are
+ * met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following disclaimer
+ * in the documentation and/or other materials provided with the
+ * distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * You can contact the author at :
+ * - LZ4 homepage : http://fastcompression.blogspot.com/p/lz4.html
+ * - LZ4 source repository : http://code.google.com/p/lz4/
+ */
+
+
+/*
+ * LZ4_COMPRESSBOUND()
+ * Provides the maximum size that LZ4 may output in a "worst case" scenario
+ * (input data not compressible)
+ */
+#define LZ4_COMPRESSBOUND(isize) ((isize) + ((isize)/255) + 16)
+
+/*
+ * lz4_decompress()
+ *	src     : address of the compressed data
+ *	src_len : size of the compressed data, in bytes
+ *	dest	: address of the output buffer for the decompressed data
+ *	dest_len: in: size of the destination buffer, in bytes;
+ *		  out: number of bytes actually decompressed
+ *	return  : 0 on success
+ *		  a negative value on error
+ *	note    : The destination buffer must already be allocated.
+ */
+int lz4_decompress(const char *src, size_t src_len, char *dest,
+			size_t *dest_len);
+#endif
diff --git a/lib/lz4/lz4_decompress.c b/lib/lz4/lz4_decompress.c
new file mode 100644
index 0000000..e8beb6b
--- /dev/null
+++ b/lib/lz4/lz4_decompress.c
@@ -0,0 +1,199 @@
+/*
+ * LZ4 Decompressor for Linux kernel
+ *
+ * Copyright (C) 2013 LG Electronics Co., Ltd. (http://www.lge.com/)
+ *
+ * Based on LZ4 implementation by Yann Collet.
+ *
+ * LZ4 - Fast LZ compression algorithm
+ * Copyright (C) 2011-2012, Yann Collet.
+ * BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are
+ * met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following disclaimer
+ * in the documentation and/or other materials provided with the
+ * distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  You can contact the author at :
+ *  - LZ4 homepage : http://fastcompression.blogspot.com/p/lz4.html
+ *  - LZ4 source repository : http://code.google.com/p/lz4/
+ */
+
+#ifndef STATIC
+#include <linux/module.h>
+#include <linux/kernel.h>
+#endif
+
+#include <asm/unaligned.h>
+#include <linux/lz4.h>
+#include "lz4defs.h"
+
+
+int lz4_uncompress_unknownoutputsize(
+				const char *source,
+				char *dest,
+				int isize,
+				size_t maxoutputsize)
+{
+	const BYTE * restrict ip = (const BYTE*) source;
+	const BYTE * const iend = ip + isize;
+	const BYTE *ref;
+
+
+	BYTE *op = (BYTE *) dest;
+	BYTE * const oend = op + maxoutputsize;
+	BYTE *cpy;
+
+	size_t dec32table[] = {0, 3, 2, 3, 0, 0, 0, 0};
+#if LZ4_ARCH64
+	size_t dec64table[] = {0, 0, 0, -1, 0, 1, 2, 3};
+#endif
+
+	/* Main Loop */
+	while (ip < iend) {
+
+		unsigned token;
+		size_t length;
+
+		/* get runlength */
+		token = *ip++;
+		length = (token >> ML_BITS);
+		if (length == RUN_MASK) {
+			int s = 255;
+			while ((ip < iend) && (s == 255)) {
+				s = *ip++;
+				length += s;
+			}
+		}
+		/* copy literals */
+		cpy = op + length;
+		if ((cpy > oend - COPYLENGTH) ||
+			(ip + length > iend - COPYLENGTH)) {
+
+			if (cpy > oend)
+				goto _output_error;/* writes beyond buffer */
+
+			if (ip + length != iend)
+				goto _output_error;/*
+						    * Error: LZ4 format requires
+						    * to consume all input
+						    * at this stage
+						    */
+			memcpy(op, ip, length);
+			op += length;
+			break;/* Necessarily EOF, due to parsing restrictions */
+		}
+		LZ4_WILDCOPY(ip, op, cpy);
+		ip -= (op-cpy);
+		op = cpy;
+
+		/* get offset */
+		LZ4_READ_LITTLEENDIAN_16(ref, cpy, ip);
+		ip += 2;
+		if (ref < (BYTE * const)dest)
+			goto _output_error;
+			/*
+			 * Error : offset creates reference
+			 * outside of destination buffer
+			 */
+
+		/* get matchlength */
+		length = (token & ML_MASK);
+		if (length == ML_MASK) {
+			while (ip < iend) {
+				int s = *ip++;
+				length += s;
+				if (s == 255)
+					continue;
+				break;
+			}
+		}
+
+		/* copy repeated sequence */
+		if (unlikely(op - ref < STEPSIZE)) {
+#if LZ4_ARCH64
+			size_t dec64 = dec64table[op - ref];
+#else
+			const int dec64 = 0;
+#endif
+				op[0] = ref[0];
+				op[1] = ref[1];
+				op[2] = ref[2];
+				op[3] = ref[3];
+				op += 4;
+				ref += 4;
+				ref -= dec32table[op - ref];
+				PUT4(ref, op);
+				op += STEPSIZE-4; ref -= dec64;
+		} else {
+			LZ4_COPYSTEP(ref, op);
+		}
+		cpy = op + length - (STEPSIZE-4);
+		if (cpy > oend - COPYLENGTH) {
+			if (cpy > oend)
+				goto _output_error; /* write outside of buf */
+
+			LZ4_SECURECOPY(ref, op, (oend - COPYLENGTH));
+			while (op < cpy)
+				*op++ = *ref++;
+			op = cpy;
+			/*
+			 * Check EOF (should never happen, since last 5 bytes
+			 * are supposed to be literals)
+			 */
+			if (op == oend)
+				goto _output_error;
+			continue;
+		}
+		LZ4_SECURECOPY(ref, op, cpy);
+		op = cpy; /* correction */
+	}
+	/* end of decoding */
+	return (int) (((char *)op)-dest);
+
+	/* write overflow error detected */
+_output_error:
+	return (int) (-(((char *)ip)-source));
+}
+
+int lz4_decompress(const char *src, size_t src_len, char *dest,
+		size_t *dest_len)
+{
+	int ret = -1;
+	int out_len = 0;
+
+	out_len = lz4_uncompress_unknownoutputsize(src, dest, src_len,
+					*dest_len);
+	if (out_len < 0)
+		goto exit_0;
+	*dest_len = out_len;
+
+	return 0;
+exit_0:
+	return ret;
+}
+
+#ifndef STATIC
+EXPORT_SYMBOL_GPL(lz4_decompress);
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("LZ4 Decompressor");
+#endif
diff --git a/lib/lz4/lz4defs.h b/lib/lz4/lz4defs.h
new file mode 100644
index 0000000..5b9666b
--- /dev/null
+++ b/lib/lz4/lz4defs.h
@@ -0,0 +1,129 @@
+/*
+ * LZ4 Decompressor for Linux kernel
+ *
+ * Copyright (C) 2013, LG Electronics, Kyungsik Lee <kyungsik.lee@lge.com>
+ *
+ * Based on LZ4 implementation by Yann Collet.
+ *
+ * LZ4 - Fast LZ compression algorithm
+ * Copyright (C) 2011-2012, Yann Collet.
+ * BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are
+ * met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following disclaimer
+ * in the documentation and/or other materials provided with the
+ * distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  You can contact the author at :
+ *  - LZ4 homepage : http://fastcompression.blogspot.com/p/lz4.html
+ *  - LZ4 source repository : http://code.google.com/p/lz4/
+ */
+
+/*
+ * Detects 64 bits mode
+ */
+#if (defined(__x86_64__) || defined(__x86_64) || defined(__amd64__) \
+	|| defined(__ppc64__) || defined(__LP64__))
+#define LZ4_ARCH64 1
+#else
+#define LZ4_ARCH64 0
+#endif
+
+/*
+ * Compiler Options
+ */
+#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L /* C99 */
+/* "restrict" is a known keyword */
+#else
+#define restrict	/* Disable restrict */
+#endif
+
+/*
+ * Architecture-specific macros
+ */
+#define BYTE	u8
+#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)
+typedef struct _U32_S { u32 v; } U32_S;
+typedef struct _U64_S { u64 v; } U64_S;
+
+#define A32(x) (((U32_S *)(x))->v)
+#define A64(x) (((U64_S *)(x))->v)
+
+#define PUT4(s, d) (A32(d) = A32(s))
+#define PUT8(s, d) (A64(d) = A64(s))
+#else /* CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS */
+
+#define PUT4(s, d) \
+	put_unaligned(get_unaligned((const u32 *) s), (u32 *) d)
+#define PUT8(s, d) \
+	put_unaligned(get_unaligned((const u64 *) s), (u64 *) d)
+#endif
+
+#define COPYLENGTH 8
+#define ML_BITS  4
+#define ML_MASK  ((1U << ML_BITS) - 1)
+#define RUN_BITS (8 - ML_BITS)
+#define RUN_MASK ((1U << RUN_BITS) - 1)
+
+#if LZ4_ARCH64/* 64-bit */
+#define STEPSIZE 8
+
+#define LZ4_COPYSTEP(s, d)	\
+	do {	\
+		PUT8(s, d);	\
+		d += 8;	\
+		s += 8;	\
+	} while (0)
+
+#define LZ4_COPYPACKET(s, d)	LZ4_COPYSTEP(s, d)
+
+#define LZ4_SECURECOPY(s, d, e)	\
+	do {				\
+		if (d < e) {		\
+			LZ4_WILDCOPY(s, d, e);	\
+		}	\
+	} while (0)
+
+#else	/* 32-bit */
+#define STEPSIZE 4
+
+#define LZ4_COPYSTEP(s, d)	\
+	do {	\
+		PUT4(s, d);	\
+		d += 4;	\
+		s += 4;	\
+	} while (0)
+
+#define LZ4_COPYPACKET(s, d)	\
+	do {			\
+		LZ4_COPYSTEP(s, d);	\
+		LZ4_COPYSTEP(s, d);	\
+	} while (0)
+
+#define LZ4_SECURECOPY	LZ4_WILDCOPY
+#endif
+
+#define LZ4_READ_LITTLEENDIAN_16(d, s, p) \
+	(d = s - get_unaligned_le16(p))
+#define LZ4_WILDCOPY(s, d, e)	\
+	do {				\
+		LZ4_COPYPACKET(s, d);	\
+	} while (d < e)
-- 
1.8.0.3



* [RFC PATCH 2/4] lib: add support for LZ4-compressed kernels
  2013-01-26  5:50 [RFC PATCH 0/4] Add support for LZ4-compressed kernels Kyungsik Lee
  2013-01-26  5:50 ` [RFC PATCH 1/4] decompressors: add lz4 decompressor module Kyungsik Lee
@ 2013-01-26  5:50 ` Kyungsik Lee
  2013-01-26  5:50 ` [RFC PATCH 3/4] arm: " Kyungsik Lee
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 30+ messages in thread
From: Kyungsik Lee @ 2013-01-26  5:50 UTC (permalink / raw)
  To: Russell King, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Michal Marek, linux-arm-kernel, linux-kernel, linux-kbuild, x86
  Cc: Nitin Gupta, Richard Purdie, Josh Triplett, Joe Millenbach,
	Andrew Morton, Albin Tonnerre, hyojun.im, chan.jeong, gunho.lee,
	minchan.kim, namhyung.kim, raphael.andy.lee, Kyungsik Lee

This patch adds support for extracting LZ4-compressed kernel images,
as well as LZ4-compressed ramdisk images in the kernel boot process.

This patch depends on the following patch:
decompressors: add lz4 decompressor module
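
The stream layout that unlz4() in this patch expects can be read off its loop:
a 4-byte little-endian magic number (0x184C2102), followed by chunks that are
each prefixed with a 4-byte little-endian compressed size. The sketch below
walks such a stream in memory and counts the chunks without decompressing
them. The helper names read_le32() and lz4_count_chunks() are hypothetical;
the constants mirror the ones in the patch, with the LZ4_COMPRESSBOUND()
argument fully parenthesized.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ARCHIVE_MAGICNUMBER	0x184C2102u	/* value used by the patch */
#define LZ4_CHUNK_SIZE		(8u << 20)	/* 8 MiB, as in the patch */
/* Worst-case compressed size for an input of isize bytes. */
#define LZ4_COMPRESSBOUND(isize) ((isize) + ((isize) / 255) + 16)

/* Read a 32-bit little-endian value from a possibly unaligned address. */
uint32_t read_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Walk an in-memory stream and count its chunks without decompressing.
 * Returns the chunk count, or -1 if the stream is malformed.
 * Hypothetical helper for illustration only.
 */
int lz4_count_chunks(const uint8_t *buf, size_t len)
{
	size_t pos = 4;
	int chunks = 0;

	assert(buf);
	if (len < 4 || read_le32(buf) != ARCHIVE_MAGICNUMBER)
		return -1;			/* invalid header */
	while (pos < len) {
		uint32_t chunksize;

		if (len - pos < 4)
			return -1;		/* truncated chunk header */
		chunksize = read_le32(buf + pos);
		pos += 4;
		if (chunksize == ARCHIVE_MAGICNUMBER)
			continue;		/* concatenated stream: skip magic */
		if (chunksize > LZ4_COMPRESSBOUND(LZ4_CHUNK_SIZE) ||
		    chunksize > len - pos)
			return -1;		/* oversized or truncated chunk */
		pos += chunksize;
		chunks++;
	}
	return chunks;
}
```

Stored little-endian, the magic number begins with the bytes 0x02 0x21, which
is the two-byte signature this patch adds to compressed_formats[] in
lib/decompress.c. With the 8 MiB chunk size, LZ4_COMPRESSBOUND() evaluates to
8421520 bytes, the size of the input buffer allocated in the fill() case.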

Signed-off-by: Kyungsik Lee <kyungsik.lee@lge.com>
---
 include/linux/decompress/unlz4.h |  10 ++
 init/Kconfig                     |  13 ++-
 lib/Kconfig                      |   7 ++
 lib/Makefile                     |   2 +
 lib/decompress.c                 |   5 +
 lib/decompress_unlz4.c           | 199 +++++++++++++++++++++++++++++++++++++++
 lib/lz4/Makefile                 |   1 +
 lib/lz4/lz4_decompress.c         |   2 +-
 scripts/Makefile.lib             |   5 +
 usr/Kconfig                      |   9 ++
 10 files changed, 251 insertions(+), 2 deletions(-)
 create mode 100644 include/linux/decompress/unlz4.h
 create mode 100644 lib/decompress_unlz4.c
 create mode 100644 lib/lz4/Makefile

diff --git a/include/linux/decompress/unlz4.h b/include/linux/decompress/unlz4.h
new file mode 100644
index 0000000..d5b68bf
--- /dev/null
+++ b/include/linux/decompress/unlz4.h
@@ -0,0 +1,10 @@
+#ifndef DECOMPRESS_UNLZ4_H
+#define DECOMPRESS_UNLZ4_H
+
+int unlz4(unsigned char *inbuf, int len,
+	int(*fill)(void*, unsigned int),
+	int(*flush)(void*, unsigned int),
+	unsigned char *output,
+	int *pos,
+	void(*error)(char *x));
+#endif
diff --git a/init/Kconfig b/init/Kconfig
index 1aefe1a..be3753e 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -102,10 +102,13 @@ config HAVE_KERNEL_XZ
 config HAVE_KERNEL_LZO
 	bool
 
+config HAVE_KERNEL_LZ4
+	bool
+
 choice
 	prompt "Kernel compression mode"
 	default KERNEL_GZIP
-	depends on HAVE_KERNEL_GZIP || HAVE_KERNEL_BZIP2 || HAVE_KERNEL_LZMA || HAVE_KERNEL_XZ || HAVE_KERNEL_LZO
+	depends on HAVE_KERNEL_GZIP || HAVE_KERNEL_BZIP2 || HAVE_KERNEL_LZMA || HAVE_KERNEL_XZ || HAVE_KERNEL_LZO || HAVE_KERNEL_LZ4
 	help
 	  The linux kernel is a kind of self-extracting executable.
 	  Several compression algorithms are available, which differ
@@ -172,6 +175,14 @@ config KERNEL_LZO
 	  size is about 10% bigger than gzip; however its speed
 	  (both compression and decompression) is the fastest.
 
+config KERNEL_LZ4
+	bool "LZ4"
+	depends on HAVE_KERNEL_LZ4
+	help
+	  Its compression ratio is worse than LZO: the kernel image is
+	  about 5% bigger than with LZO. However, its decompression
+	  speed is faster than LZO.
+
 endchoice
 
 config DEFAULT_HOSTNAME
diff --git a/lib/Kconfig b/lib/Kconfig
index 75cdb77..b108047 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -189,6 +189,9 @@ config LZO_COMPRESS
 config LZO_DECOMPRESS
 	tristate
 
+config LZ4_DECOMPRESS
+	tristate
+
 source "lib/xz/Kconfig"
 
 #
@@ -213,6 +216,10 @@ config DECOMPRESS_LZO
 	select LZO_DECOMPRESS
 	tristate
 
+config DECOMPRESS_LZ4
+	select LZ4_DECOMPRESS
+	tristate
+
 #
 # Generic allocator support is selected if needed
 #
diff --git a/lib/Makefile b/lib/Makefile
index 02ed6c0..c2073bf 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -72,6 +72,7 @@ obj-$(CONFIG_REED_SOLOMON) += reed_solomon/
 obj-$(CONFIG_BCH) += bch.o
 obj-$(CONFIG_LZO_COMPRESS) += lzo/
 obj-$(CONFIG_LZO_DECOMPRESS) += lzo/
+obj-$(CONFIG_LZ4_DECOMPRESS) += lz4/
 obj-$(CONFIG_XZ_DEC) += xz/
 obj-$(CONFIG_RAID6_PQ) += raid6/
 
@@ -80,6 +81,7 @@ lib-$(CONFIG_DECOMPRESS_BZIP2) += decompress_bunzip2.o
 lib-$(CONFIG_DECOMPRESS_LZMA) += decompress_unlzma.o
 lib-$(CONFIG_DECOMPRESS_XZ) += decompress_unxz.o
 lib-$(CONFIG_DECOMPRESS_LZO) += decompress_unlzo.o
+lib-$(CONFIG_DECOMPRESS_LZ4) += decompress_unlz4.o
 
 obj-$(CONFIG_TEXTSEARCH) += textsearch.o
 obj-$(CONFIG_TEXTSEARCH_KMP) += ts_kmp.o
diff --git a/lib/decompress.c b/lib/decompress.c
index 31a8042..c70810e 100644
--- a/lib/decompress.c
+++ b/lib/decompress.c
@@ -11,6 +11,7 @@
 #include <linux/decompress/unxz.h>
 #include <linux/decompress/inflate.h>
 #include <linux/decompress/unlzo.h>
+#include <linux/decompress/unlz4.h>
 
 #include <linux/types.h>
 #include <linux/string.h>
@@ -31,6 +32,9 @@
 #ifndef CONFIG_DECOMPRESS_LZO
 # define unlzo NULL
 #endif
+#ifndef CONFIG_DECOMPRESS_LZ4
+# define unlz4 NULL
+#endif
 
 struct compress_format {
 	unsigned char magic[2];
@@ -45,6 +49,7 @@ static const struct compress_format compressed_formats[] __initdata = {
 	{ {0x5d, 0x00}, "lzma", unlzma },
 	{ {0xfd, 0x37}, "xz", unxz },
 	{ {0x89, 0x4c}, "lzo", unlzo },
+	{ {0x02, 0x21}, "lz4", unlz4 },
 	{ {0, 0}, NULL, NULL }
 };
 
diff --git a/lib/decompress_unlz4.c b/lib/decompress_unlz4.c
new file mode 100644
index 0000000..6b6a8d0
--- /dev/null
+++ b/lib/decompress_unlz4.c
@@ -0,0 +1,199 @@
+/*
+ * LZ4 decompressor for the Linux kernel.
+ *
+ * Linux kernel adaptation:
+ * Copyright (C) 2013, LG Electronics, Kyungsik Lee <kyungsik.lee@lge.com>
+ *
+ * Based on LZ4 implementation by Yann Collet.
+ *
+ * LZ4 - Fast LZ compression algorithm
+ * Copyright (C) 2011-2012, Yann Collet.
+ * BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are
+ * met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above
+ *   copyright notice, this list of conditions and the following disclaimer
+ *   in the documentation and/or other materials provided with the
+ *   distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  You can contact the author at :
+ *  - LZ4 homepage : http://fastcompression.blogspot.com/p/lz4.html
+ *  - LZ4 source repository : http://code.google.com/p/lz4/
+ */
+
+#ifdef STATIC
+#define PREBOOT
+#include "lz4/lz4_decompress.c"
+#else
+#include <linux/decompress/unlz4.h>
+#endif
+
+#include <linux/types.h>
+#include <linux/lz4.h>
+#include <linux/decompress/mm.h>
+
+#include <linux/compiler.h>
+#include <asm/unaligned.h>
+
+
+#define LZ4_CHUNK_SIZE (8<<20)
+#define ARCHIVE_MAGICNUMBER 0x184C2102
+
+STATIC inline int INIT unlz4(u8 *input, int in_len,
+				int (*fill) (void *, unsigned int),
+				int (*flush) (void *, unsigned int),
+				u8 *output, int *posp,
+				void (*error) (char *x))
+{
+	int ret = -1;
+	u32 chunksize = 0;
+	u8 *inp;
+	u8 *inp_start;
+	u8 *outp;
+	int size = in_len;
+	size_t dest_len;
+
+
+	if (output) {
+		outp = output;
+	} else if (!flush) {
+		error("NULL output pointer and no flush function provided");
+		goto exit_0;
+	} else {
+		outp = large_malloc(LZ4_CHUNK_SIZE);
+		if (!outp) {
+			error("Could not allocate output buffer");
+			goto exit_0;
+		}
+	}
+
+	if (input && fill) {
+		error("Both input pointer and fill function provided");
+		goto exit_1;
+	} else if (input) {
+		inp = input;
+	} else if (!fill) {
+		error("NULL input pointer and missing fill function");
+		goto exit_1;
+	} else {
+		inp = large_malloc(LZ4_COMPRESSBOUND(LZ4_CHUNK_SIZE));
+		if (!inp) {
+			error("Could not allocate input buffer");
+			goto exit_1;
+		}
+	}
+	inp_start = inp;
+
+	if (posp)
+		*posp = 0;
+
+	if (fill)
+		fill(inp, 4);
+
+	chunksize = get_unaligned_le32(inp);
+	if (chunksize == ARCHIVE_MAGICNUMBER) {
+		inp += 4;
+		size -= 4;
+	} else {
+		error("invalid header");
+		goto exit_2;
+	}
+
+	if (posp)
+		*posp += 4;
+
+	for (;;) {
+
+		if (fill)
+			fill(inp, 4);
+
+		chunksize = get_unaligned_le32(inp);
+		if (chunksize == ARCHIVE_MAGICNUMBER) {
+			inp += 4;
+			size -= 4;
+			if (posp)
+				*posp += 4;
+			continue;
+		}
+		inp += 4;
+		size -= 4;
+
+		if (posp)
+			*posp += 4;
+
+		if (fill) {
+			if (chunksize > LZ4_COMPRESSBOUND(LZ4_CHUNK_SIZE)) {
+				error("chunk length is longer than allocated");
+				goto exit_2;
+			}
+			fill(inp, chunksize);
+		}
+		dest_len = LZ4_CHUNK_SIZE;
+		ret = lz4_decompress(inp, chunksize, outp, &dest_len);
+		if (ret < 0) {
+			error("Decoding failed");
+			goto exit_2;
+		}
+
+		if (flush && flush(outp, dest_len) != dest_len)
+			goto exit_2;
+		if (output)
+			outp += dest_len;
+		if (posp)
+			*posp += chunksize;
+
+		inp += chunksize;
+		size -= chunksize;
+
+		if (size == 0)
+			break;
+		else if (size < 0) {
+			error("data corrupted");
+			goto exit_2;
+		}
+
+		if (fill)
+			inp = inp_start;
+	}
+
+	ret = 0;
+exit_2:
+	if (!input)
+		large_free(inp_start);
+exit_1:
+	if (!output)
+		large_free(outp);
+
+exit_0:
+	return ret;
+}
+
+#ifdef PREBOOT
+STATIC int INIT decompress(unsigned char *buf, int in_len,
+			      int(*fill)(void*, unsigned int),
+			      int(*flush)(void*, unsigned int),
+			      unsigned char *output,
+			      int *posp,
+			      void(*error)(char *x)
+	)
+{
+	return unlz4(buf, in_len - 4, fill, flush, output, posp, error);
+}
+#endif
diff --git a/lib/lz4/Makefile b/lib/lz4/Makefile
new file mode 100644
index 0000000..7f548c6
--- /dev/null
+++ b/lib/lz4/Makefile
@@ -0,0 +1 @@
+obj-$(CONFIG_LZ4_DECOMPRESS) += lz4_decompress.o
diff --git a/lib/lz4/lz4_decompress.c b/lib/lz4/lz4_decompress.c
index e8beb6b..c89467a 100644
--- a/lib/lz4/lz4_decompress.c
+++ b/lib/lz4/lz4_decompress.c
@@ -1,7 +1,7 @@
 /*
  * LZ4 Decompressor for Linux kernel
  *
- * Copyright (C) 2013 LG Electronics Co., Ltd. (http://www.lge.com/)
+ * Copyright (C) 2013, LG Electronics, Kyungsik Lee <kyungsik.lee@lge.com>
  *
  * Based on LZ4 implementation by Yann Collet.
  *
diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
index bdf42fd..9293ca1 100644
--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -307,6 +307,11 @@ cmd_lzo = (cat $(filter-out FORCE,$^) | \
 	lzop -9 && $(call size_append, $(filter-out FORCE,$^))) > $@ || \
 	(rm -f $@ ; false)
 
+quiet_cmd_lz4 = LZ4     $@
+cmd_lz4 = (cat $(filter-out FORCE,$^) | \
+	lz4 -c1 stdin stdout && $(call size_append, $(filter-out FORCE,$^))) > $@ || \
+	(rm -f $@ ; false)
+
 # U-Boot mkimage
 # ---------------------------------------------------------------------------
 
diff --git a/usr/Kconfig b/usr/Kconfig
index 085872b..642f503 100644
--- a/usr/Kconfig
+++ b/usr/Kconfig
@@ -90,6 +90,15 @@ config RD_LZO
 	  Support loading of a LZO encoded initial ramdisk or cpio buffer
 	  If unsure, say N.
 
+config RD_LZ4
+	bool "Support initial ramdisks compressed using LZ4" if EXPERT
+	default !EXPERT
+	depends on BLK_DEV_INITRD
+	select DECOMPRESS_LZ4
+	help
+	  Support loading of an LZ4 encoded initial ramdisk or cpio buffer.
+	  If unsure, say N.
+
 choice
 	prompt "Built-in initramfs compression mode" if INITRAMFS_SOURCE!=""
 	help
-- 
1.8.0.3



* [RFC PATCH 3/4] arm: add support for LZ4-compressed kernels
  2013-01-26  5:50 [RFC PATCH 0/4] Add support for LZ4-compressed kernels Kyungsik Lee
  2013-01-26  5:50 ` [RFC PATCH 1/4] decompressors: add lz4 decompressor module Kyungsik Lee
  2013-01-26  5:50 ` [RFC PATCH 2/4] lib: add support for LZ4-compressed kernels Kyungsik Lee
@ 2013-01-26  5:50 ` Kyungsik Lee
  2013-01-26  5:50 ` [RFC PATCH 4/4] x86: " Kyungsik Lee
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 30+ messages in thread
From: Kyungsik Lee @ 2013-01-26  5:50 UTC (permalink / raw)
  To: Russell King, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Michal Marek, linux-arm-kernel, linux-kernel, linux-kbuild, x86
  Cc: Nitin Gupta, Richard Purdie, Josh Triplett, Joe Millenbach,
	Andrew Morton, Albin Tonnerre, hyojun.im, chan.jeong, gunho.lee,
	minchan.kim, namhyung.kim, raphael.andy.lee, Kyungsik Lee

This patch integrates the LZ4 decompression code into the ARM pre-boot code.
It depends on the two patches below:

lib: add support for LZ4-compressed kernels
decompressors: add lz4 decompressor module
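
One detail of the pre-boot path is that the PREBOOT decompress() wrapper in
lib/decompress_unlz4.c passes in_len - 4 to unlz4(). The sketch below assumes
that this accounts for a 4-byte little-endian uncompressed size that
size_append in scripts/Makefile.lib adds after the compressed payload; that
reading is an assumption of this example, not something stated in the patch
description. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Split of a piggy image into compressed payload and trailing size word. */
struct piggy_view {
	size_t payload_len;	/* bytes to hand to the decompressor */
	uint32_t unpacked_size;	/* value stored in the 4-byte trailer */
};

/*
 * Hypothetical helper: interpret the last 4 bytes of `image` as a
 * little-endian uncompressed size. Returns 0 on success, -1 if too short.
 */
int piggy_split(const uint8_t *image, size_t len, struct piggy_view *out)
{
	const uint8_t *t;

	assert(image && out);
	if (len < 4)
		return -1;
	t = image + len - 4;
	out->payload_len = len - 4;
	out->unpacked_size = (uint32_t)t[0] | ((uint32_t)t[1] << 8) |
			     ((uint32_t)t[2] << 16) | ((uint32_t)t[3] << 24);
	return 0;
}
```

Under that assumption, payload_len corresponds to the length the wrapper
forwards to unlz4(), and unpacked_size is the value the boot code can use to
learn the size of the decompressed kernel.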

Signed-off-by: Kyungsik Lee <kyungsik.lee@lge.com>
---
 arch/arm/Kconfig                      | 1 +
 arch/arm/boot/compressed/.gitignore   | 1 +
 arch/arm/boot/compressed/Makefile     | 3 ++-
 arch/arm/boot/compressed/decompress.c | 4 ++++
 arch/arm/boot/compressed/piggy.lz4.S  | 6 ++++++
 5 files changed, 14 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm/boot/compressed/piggy.lz4.S

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 91f8d78..1b3621d 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -37,6 +37,7 @@ config ARM
 	select HAVE_HW_BREAKPOINT if (PERF_EVENTS && (CPU_V6 || CPU_V6K || CPU_V7))
 	select HAVE_IDE if PCI || ISA || PCMCIA
 	select HAVE_KERNEL_GZIP
+	select HAVE_KERNEL_LZ4
 	select HAVE_KERNEL_LZMA
 	select HAVE_KERNEL_LZO
 	select HAVE_KERNEL_XZ
diff --git a/arch/arm/boot/compressed/.gitignore b/arch/arm/boot/compressed/.gitignore
index f79a08e..47279aa 100644
--- a/arch/arm/boot/compressed/.gitignore
+++ b/arch/arm/boot/compressed/.gitignore
@@ -6,6 +6,7 @@ piggy.gzip
 piggy.lzo
 piggy.lzma
 piggy.xzkern
+piggy.lz4
 vmlinux
 vmlinux.lds
 
diff --git a/arch/arm/boot/compressed/Makefile b/arch/arm/boot/compressed/Makefile
index 5cad8a6..8b5c79a 100644
--- a/arch/arm/boot/compressed/Makefile
+++ b/arch/arm/boot/compressed/Makefile
@@ -88,6 +88,7 @@ suffix_$(CONFIG_KERNEL_GZIP) = gzip
 suffix_$(CONFIG_KERNEL_LZO)  = lzo
 suffix_$(CONFIG_KERNEL_LZMA) = lzma
 suffix_$(CONFIG_KERNEL_XZ)   = xzkern
+suffix_$(CONFIG_KERNEL_LZ4)  = lz4
 
 # Borrowed libfdt files for the ATAG compatibility mode
 
@@ -112,7 +113,7 @@ targets       := vmlinux vmlinux.lds \
 		 font.o font.c head.o misc.o $(OBJS)
 
 # Make sure files are removed during clean
-extra-y       += piggy.gzip piggy.lzo piggy.lzma piggy.xzkern \
+extra-y       += piggy.gzip piggy.lzo piggy.lzma piggy.xzkern piggy.lz4 \
 		 lib1funcs.S ashldi3.S $(libfdt) $(libfdt_hdrs)
 
 ifeq ($(CONFIG_FUNCTION_TRACER),y)
diff --git a/arch/arm/boot/compressed/decompress.c b/arch/arm/boot/compressed/decompress.c
index 9deb56a..a95f071 100644
--- a/arch/arm/boot/compressed/decompress.c
+++ b/arch/arm/boot/compressed/decompress.c
@@ -53,6 +53,10 @@ extern char * strstr(const char * s1, const char *s2);
 #include "../../../../lib/decompress_unxz.c"
 #endif
 
+#ifdef CONFIG_KERNEL_LZ4
+#include "../../../../lib/decompress_unlz4.c"
+#endif
+
 int do_decompress(u8 *input, int len, u8 *output, void (*error)(char *x))
 {
 	return decompress(input, len, NULL, NULL, output, NULL, error);
diff --git a/arch/arm/boot/compressed/piggy.lz4.S b/arch/arm/boot/compressed/piggy.lz4.S
new file mode 100644
index 0000000..3d9a575
--- /dev/null
+++ b/arch/arm/boot/compressed/piggy.lz4.S
@@ -0,0 +1,6 @@
+	.section .piggydata,#alloc
+	.globl	input_data
+input_data:
+	.incbin	"arch/arm/boot/compressed/piggy.lz4"
+	.globl	input_data_end
+input_data_end:
-- 
1.8.0.3


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC PATCH 4/4] x86: add support for LZ4-compressed kernels
  2013-01-26  5:50 [RFC PATCH 0/4] Add support for LZ4-compressed kernels Kyungsik Lee
                   ` (2 preceding siblings ...)
  2013-01-26  5:50 ` [RFC PATCH 3/4] arm: " Kyungsik Lee
@ 2013-01-26  5:50 ` Kyungsik Lee
  2013-01-28 22:25 ` [RFC PATCH 0/4] Add " Andrew Morton
  2013-01-29 22:55 ` David Sterba
  5 siblings, 0 replies; 30+ messages in thread
From: Kyungsik Lee @ 2013-01-26  5:50 UTC (permalink / raw)
  To: Russell King, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Michal Marek, linux-arm-kernel, linux-kernel, linux-kbuild, x86
  Cc: Nitin Gupta, Richard Purdie, Josh Triplett, Joe Millenbach,
	Andrew Morton, Albin Tonnerre, hyojun.im, chan.jeong, gunho.lee,
	minchan.kim, namhyung.kim, raphael.andy.lee, Kyungsik Lee

This patch integrates the LZ4 decompression code into the x86 pre-boot code.
It depends on the two patches below:

lib: add support for LZ4-compressed kernels
decompressors: add lz4 decompressor module

Signed-off-by: Kyungsik Lee <kyungsik.lee@lge.com>
---
 arch/x86/Kconfig                  | 1 +
 arch/x86/boot/compressed/Makefile | 5 ++++-
 arch/x86/boot/compressed/misc.c   | 4 ++++
 3 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 8c185d0..7142bef 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -62,6 +62,7 @@ config X86
 	select HAVE_KERNEL_LZMA
 	select HAVE_KERNEL_XZ
 	select HAVE_KERNEL_LZO
+	select HAVE_KERNEL_LZ4
 	select HAVE_HW_BREAKPOINT
 	select HAVE_MIXED_BREAKPOINTS_REGS
 	select PERF_EVENTS
diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index 8a84501..c275db5 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -4,7 +4,7 @@
 # create a compressed vmlinux image from the original vmlinux
 #
 
-targets := vmlinux.lds vmlinux vmlinux.bin vmlinux.bin.gz vmlinux.bin.bz2 vmlinux.bin.lzma vmlinux.bin.xz vmlinux.bin.lzo head_$(BITS).o misc.o string.o cmdline.o early_serial_console.o piggy.o
+targets := vmlinux.lds vmlinux vmlinux.bin vmlinux.bin.gz vmlinux.bin.bz2 vmlinux.bin.lzma vmlinux.bin.xz vmlinux.bin.lzo vmlinux.bin.lz4 head_$(BITS).o misc.o string.o cmdline.o early_serial_console.o piggy.o
 
 KBUILD_CFLAGS := -m$(BITS) -D__KERNEL__ $(LINUX_INCLUDE) -O2
 KBUILD_CFLAGS += -fno-strict-aliasing -fPIC
@@ -64,12 +64,15 @@ $(obj)/vmlinux.bin.xz: $(vmlinux.bin.all-y) FORCE
 	$(call if_changed,xzkern)
 $(obj)/vmlinux.bin.lzo: $(vmlinux.bin.all-y) FORCE
 	$(call if_changed,lzo)
+$(obj)/vmlinux.bin.lz4: $(vmlinux.bin.all-y) FORCE
+	$(call if_changed,lz4)
 
 suffix-$(CONFIG_KERNEL_GZIP)	:= gz
 suffix-$(CONFIG_KERNEL_BZIP2)	:= bz2
 suffix-$(CONFIG_KERNEL_LZMA)	:= lzma
 suffix-$(CONFIG_KERNEL_XZ)	:= xz
 suffix-$(CONFIG_KERNEL_LZO) 	:= lzo
+suffix-$(CONFIG_KERNEL_LZ4) 	:= lz4
 
 quiet_cmd_mkpiggy = MKPIGGY $@
       cmd_mkpiggy = $(obj)/mkpiggy $< > $@ || ( rm -f $@ ; false )
diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
index 88f7ff6..166a0a8 100644
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -145,6 +145,10 @@ static int lines, cols;
 #include "../../../../lib/decompress_unlzo.c"
 #endif
 
+#ifdef CONFIG_KERNEL_LZ4
+#include "../../../../lib/decompress_unlz4.c"
+#endif
+
 static void scroll(void)
 {
 	int i;
-- 
1.8.0.3


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 0/4] Add support for LZ4-compressed kernels
  2013-01-26  5:50 [RFC PATCH 0/4] Add support for LZ4-compressed kernels Kyungsik Lee
                   ` (3 preceding siblings ...)
  2013-01-26  5:50 ` [RFC PATCH 4/4] x86: " Kyungsik Lee
@ 2013-01-28 22:25 ` Andrew Morton
  2013-01-29  1:16   ` kyungsik.lee
                     ` (4 more replies)
  2013-01-29 22:55 ` David Sterba
  5 siblings, 5 replies; 30+ messages in thread
From: Andrew Morton @ 2013-01-28 22:25 UTC (permalink / raw)
  To: Kyungsik Lee
  Cc: Russell King, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Michal Marek, linux-arm-kernel, linux-kernel, linux-kbuild, x86,
	Nitin Gupta, Richard Purdie, Josh Triplett, Joe Millenbach,
	Albin Tonnerre, hyojun.im, chan.jeong, gunho.lee, minchan.kim,
	namhyung.kim, raphael.andy.lee, CE Linux Developers List

On Sat, 26 Jan 2013 14:50:43 +0900
Kyungsik Lee <kyungsik.lee@lge.com> wrote:

> This patchset is for supporting LZ4 compressed kernel and initial ramdisk on
> the x86 and ARM architectures.
> 
> According to http://code.google.com/p/lz4/, LZ4 is a very fast lossless
> compression algorithm and also features an extremely fast decoder.
> 
> Kernel Decompression APIs are based on implementation by Yann Collet
> (http://code.google.com/p/lz4/source/checkout).
> De/compression Tools are also provided from the site above.
> 
> The initial test result on ARM(v7) based board shows that the size of kernel
> with LZ4 compressed is 8% bigger than LZO compressed  but the decompressing
> speed is faster(especially under the enabled unaligned memory access).
> 
> Test: 3.4 based kernel built with many modules
> Uncompressed kernel size: 13MB
> lzo: 6.3MB, 301ms
> lz4: 6.8MB, 251ms(167ms, with enabled unaligned memory access)
> 
> It seems that it’s worth trying LZ4 compressed kernel image or ramdisk
> for making the kernel boot more faster.
>
> ...
>
>  20 files changed, 663 insertions(+), 3 deletions(-)
>
> ...
>

What's this "with enabled unaligned memory access" thing?  You mean "if
the arch supports CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS"?  If so,
that's only x86, which isn't really in the target market for this
patch, yes?

It's a lot of code for a 50ms boot-time improvement.  Does anyone have
any opinions on whether or not the benefits are worth the cost?


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 0/4] Add support for LZ4-compressed kernels
  2013-01-28 22:25 ` [RFC PATCH 0/4] Add " Andrew Morton
@ 2013-01-29  1:16   ` kyungsik.lee
  2013-01-29  4:29   ` Nicolas Pitre
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 30+ messages in thread
From: kyungsik.lee @ 2013-01-29  1:16 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Russell King, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Michal Marek, linux-arm-kernel, linux-kernel, linux-kbuild, x86,
	Nitin Gupta, Richard Purdie, Josh Triplett, Joe Millenbach,
	Albin Tonnerre, hyojun.im, chan.jeong, gunho.lee, minchan.kim,
	namhyung.kim, raphael.andy.lee, CE Linux Developers List,
	Dave Martin

On 2013-01-29 7:25 AM, Andrew Morton wrote:
> On Sat, 26 Jan 2013 14:50:43 +0900
> Kyungsik Lee <kyungsik.lee@lge.com> wrote:
>
>> This patchset is for supporting LZ4 compressed kernel and initial ramdisk on
>> the x86 and ARM architectures.
>>
>> According to http://code.google.com/p/lz4/, LZ4 is a very fast lossless
>> compression algorithm and also features an extremely fast decoder.
>>
>> Kernel Decompression APIs are based on implementation by Yann Collet
>> (http://code.google.com/p/lz4/source/checkout).
>> De/compression Tools are also provided from the site above.
>>
>> The initial test result on ARM(v7) based board shows that the size of kernel
>> with LZ4 compressed is 8% bigger than LZO compressed  but the decompressing
>> speed is faster(especially under the enabled unaligned memory access).
>>
>> Test: 3.4 based kernel built with many modules
>> Uncompressed kernel size: 13MB
>> lzo: 6.3MB, 301ms
>> lz4: 6.8MB, 251ms(167ms, with enabled unaligned memory access)
>>
>> It seems that it’s worth trying LZ4 compressed kernel image or ramdisk
>> for making the kernel boot more faster.
>>
>> ...
>>
>>   20 files changed, 663 insertions(+), 3 deletions(-)
>>
>> ...
>>
> What's this "with enabled unaligned memory access" thing?  You mean "if
> the arch supports CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS"?  If so,
> that's only x86, which isn't really in the target market for this
> patch, yes?

Yes, exactly. If the arch supports CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS,
the LZ4 decompressor is expected to give a larger boot-time improvement.

Currently two architectures support it in mainline: x86 and powerpc.
ARM (v6 or above) is also expected to support it since the commit below:

Commit ID: 5010192d5
ARM: 7583/1: decompressor: Enable unaligned memory access for v6 and above
by Dave Martin

The 167ms result comes from an ARM (v7) MSM8960-based board with
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS set.
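
To illustrate why efficient unaligned access matters for a decoder that mostly copies bytes, here is a minimal userspace sketch. It is not the code from this patchset; the names copy8_bytewise(), copy8_wordwise() and copies_match() are hypothetical, and the kernel has its own helpers (such as get_unaligned()) that this sketch does not use. The assumption is that a fixed-size memcpy() can be compiled into a single word-sized load and store on CPUs that tolerate unaligned addresses.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Byte-at-a-time copy: works on any CPU, but needs eight loads and eight stores. */
void copy8_bytewise(uint8_t *dst, const uint8_t *src)
{
	assert(dst && src);
	for (int i = 0; i < 8; i++)
		dst[i] = src[i];
}

/*
 * Word-sized copy: a fixed-size memcpy() lets the compiler emit one
 * (possibly unaligned) load and one store where the CPU handles that well.
 */
void copy8_wordwise(uint8_t *dst, const uint8_t *src)
{
	uint64_t tmp;

	assert(dst && src);
	memcpy(&tmp, src, sizeof(tmp));
	memcpy(dst, &tmp, sizeof(tmp));
}

/* Returns 1 when both routines copy src[0..7] identically. */
int copies_match(const uint8_t *src)
{
	uint8_t a[8], b[8];

	copy8_bytewise(a, src);
	copy8_wordwise(b, src);
	return memcmp(a, b, sizeof(a)) == 0 && memcmp(a, src, sizeof(a)) == 0;
}
```

Both routines give the same bytes; only the number of memory operations differs, which is where a gap like the one between the 251ms and 167ms figures could plausibly come from.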
  

>
> It's a lot of code for a 50ms boot-time improvement.  Does anyone have
> any opinions on whether or not the benefits are worth the cost?
>

Not only the kernel but also the ramdisk can be compressed with LZ4, so
the boot time would improve further. The test case above did not include
the decompression time for an LZ4-compressed ramdisk.

So far the implementation only targets boot-time improvement for
LZ4-compressed kernel and ramdisk images, but the decompressor module is
exported as an interface for other uses, as LZO is.
Once an LZ4 compressor exists (it is not yet implemented for the kernel), it is
expected to be used in many places in the kernel, such as crypto and fs (squashfs, btrfs).

Thanks,
Kyungsik

  





^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 0/4] Add support for LZ4-compressed kernels
  2013-01-28 22:25 ` [RFC PATCH 0/4] Add " Andrew Morton
  2013-01-29  1:16   ` kyungsik.lee
@ 2013-01-29  4:29   ` Nicolas Pitre
  2013-01-29  6:18     ` H. Peter Anvin
  2013-01-30 10:23     ` Johannes Stezenbach
  2013-01-29  7:26   ` Richard Cochran
                     ` (2 subsequent siblings)
  4 siblings, 2 replies; 30+ messages in thread
From: Nicolas Pitre @ 2013-01-29  4:29 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Kyungsik Lee, Russell King, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Michal Marek, linux-arm-kernel, linux-kernel,
	linux-kbuild, x86, Nitin Gupta, Richard Purdie, Josh Triplett,
	Joe Millenbach, Albin Tonnerre, hyojun.im, chan.jeong, gunho.lee,
	minchan.kim, namhyung.kim, raphael.andy.lee,
	CE Linux Developers List

On Mon, 28 Jan 2013, Andrew Morton wrote:

> On Sat, 26 Jan 2013 14:50:43 +0900
> Kyungsik Lee <kyungsik.lee@lge.com> wrote:
> 
> > This patchset is for supporting LZ4 compressed kernel and initial ramdisk on
> > the x86 and ARM architectures.
> > 
> > According to http://code.google.com/p/lz4/, LZ4 is a very fast lossless
> > compression algorithm and also features an extremely fast decoder.
> > 
> > Kernel Decompression APIs are based on implementation by Yann Collet
> > (http://code.google.com/p/lz4/source/checkout).
> > De/compression Tools are also provided from the site above.
> > 
> > The initial test result on ARM(v7) based board shows that the size of kernel
> > with LZ4 compressed is 8% bigger than LZO compressed  but the decompressing
> > speed is faster(especially under the enabled unaligned memory access).
> > 
> > Test: 3.4 based kernel built with many modules
> > Uncompressed kernel size: 13MB
> > lzo: 6.3MB, 301ms
> > lz4: 6.8MB, 251ms(167ms, with enabled unaligned memory access)
> > 
> > It seems that it’s worth trying LZ4 compressed kernel image or ramdisk
> > for making the kernel boot more faster.
> >
> > ...
> >
> >  20 files changed, 663 insertions(+), 3 deletions(-)
> >
> > ...
> >
> 
> What's this "with enabled unaligned memory access" thing?  You mean "if
> the arch supports CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS"?  If so,
> that's only x86, which isn't really in the target market for this
> patch, yes?

I'm guessing this is referring to commit 5010192d5a.

> It's a lot of code for a 50ms boot-time improvement.  Does anyone have
> any opinions on whether or not the benefits are worth the cost?

Well, we used to have only one compressed format.  Now we have nearly 
half a dozen, with the same worthiness issue between themselves.  
Either we keep it very simple, or we make it very flexible.  The former 
would argue in favor of removing some of the existing formats, the latter
would let this new format in.


Nicolas

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 0/4] Add support for LZ4-compressed kernels
  2013-01-29  4:29   ` Nicolas Pitre
@ 2013-01-29  6:18     ` H. Peter Anvin
  2013-01-30 10:23     ` Johannes Stezenbach
  1 sibling, 0 replies; 30+ messages in thread
From: H. Peter Anvin @ 2013-01-29  6:18 UTC (permalink / raw)
  To: Nicolas Pitre, Andrew Morton
  Cc: Kyungsik Lee, Russell King, Thomas Gleixner, Ingo Molnar,
	Michal Marek, linux-arm-kernel, linux-kernel, linux-kbuild, x86,
	Nitin Gupta, Richard Purdie, Josh Triplett, Joe Millenbach,
	Albin Tonnerre, hyojun.im, chan.jeong, gunho.lee, minchan.kim,
	namhyung.kim, raphael.andy.lee, CE Linux Developers List

Uhm... you're saying we have to be at one extreme or the other?

We probably could drop the legacy lzma format, but someone might rely on it.

Nicolas Pitre <nico@fluxnic.net> wrote:

>On Mon, 28 Jan 2013, Andrew Morton wrote:
>
>> On Sat, 26 Jan 2013 14:50:43 +0900
>> Kyungsik Lee <kyungsik.lee@lge.com> wrote:
>> 
>> > This patchset is for supporting LZ4 compressed kernel and initial
>ramdisk on
>> > the x86 and ARM architectures.
>> > 
>> > According to http://code.google.com/p/lz4/, LZ4 is a very fast
>lossless
>> > compression algorithm and also features an extremely fast decoder.
>> > 
>> > Kernel Decompression APIs are based on implementation by Yann
>Collet
>> > (http://code.google.com/p/lz4/source/checkout).
>> > De/compression Tools are also provided from the site above.
>> > 
>> > The initial test result on ARM(v7) based board shows that the size
>of kernel
>> > with LZ4 compressed is 8% bigger than LZO compressed  but the
>decompressing
>> > speed is faster(especially under the enabled unaligned memory
>access).
>> > 
>> > Test: 3.4 based kernel built with many modules
>> > Uncompressed kernel size: 13MB
>> > lzo: 6.3MB, 301ms
>> > lz4: 6.8MB, 251ms(167ms, with enabled unaligned memory access)
>> > 
>> > It seems that it’s worth trying LZ4 compressed kernel image or
>ramdisk 
>> > for making the kernel boot more faster.
>> >
>> > ...
>> >
>> >  20 files changed, 663 insertions(+), 3 deletions(-)
>> >
>> > ...
>> >
>> 
>> What's this "with enabled unaligned memory access" thing?  You mean
>"if
>> the arch supports CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS"?  If so,
>> that's only x86, which isn't really in the target market for this
>> patch, yes?
>
>I'm guessing this is referring to commit 5010192d5a.
>
>> It's a lot of code for a 50ms boot-time improvement.  Does anyone
>have
>> any opinions on whether or not the benefits are worth the cost?
>
>Well, we used to have only one compressed format.  Now we have nearly 
>half a dozen, with the same worthiness issue between themselves.  
>Either we keep it very simple, or we make it very flexible.  The former
>
>would argue in favor of removing some of the existing formats, the
>latter
>would let this new format in.
>
>
>Nicolas

-- 
Sent from my mobile phone. Please excuse brevity and lack of formatting.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 0/4] Add support for LZ4-compressed kernels
  2013-01-28 22:25 ` [RFC PATCH 0/4] Add " Andrew Morton
  2013-01-29  1:16   ` kyungsik.lee
  2013-01-29  4:29   ` Nicolas Pitre
@ 2013-01-29  7:26   ` Richard Cochran
  2013-01-29 10:15   ` Russell King - ARM Linux
  2013-01-29 21:09   ` Rajesh Pawar
  4 siblings, 0 replies; 30+ messages in thread
From: Richard Cochran @ 2013-01-29  7:26 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Kyungsik Lee, Russell King, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Michal Marek, linux-arm-kernel, linux-kernel,
	linux-kbuild, x86, Nitin Gupta, Richard Purdie, Josh Triplett,
	Joe Millenbach, Albin Tonnerre, hyojun.im, chan.jeong, gunho.lee,
	minchan.kim, namhyung.kim, raphael.andy.lee,
	CE Linux Developers List

On Mon, Jan 28, 2013 at 02:25:10PM -0800, Andrew Morton wrote:
> 
> It's a lot of code for a 50ms boot-time improvement.  Does anyone have
> any opinions on whether or not the benefits are worth the cost?

In the embedded space, quick boot is a really important feature to
have. Many people resort to awful hacks in order to improve boot time,
and so I would welcome this option.

I have seen arm systems that boot in 300 ms. I would say that 50 ms is
maybe not such a small improvement after all.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 0/4] Add support for LZ4-compressed kernels
  2013-01-28 22:25 ` [RFC PATCH 0/4] Add " Andrew Morton
                     ` (2 preceding siblings ...)
  2013-01-29  7:26   ` Richard Cochran
@ 2013-01-29 10:15   ` Russell King - ARM Linux
  2013-01-29 11:43     ` Egon Alter
  2013-01-30  3:36     ` H. Peter Anvin
  2013-01-29 21:09   ` Rajesh Pawar
  4 siblings, 2 replies; 30+ messages in thread
From: Russell King - ARM Linux @ 2013-01-29 10:15 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Kyungsik Lee, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Michal Marek, linux-arm-kernel, linux-kernel, linux-kbuild, x86,
	Nitin Gupta, Richard Purdie, Josh Triplett, Joe Millenbach,
	Albin Tonnerre, hyojun.im, chan.jeong, gunho.lee, minchan.kim,
	namhyung.kim, raphael.andy.lee, CE Linux Developers List

On Mon, Jan 28, 2013 at 02:25:10PM -0800, Andrew Morton wrote:
> What's this "with enabled unaligned memory access" thing?  You mean "if
> the arch supports CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS"?  If so,
> that's only x86, which isn't really in the target market for this
> patch, yes?
> 
> It's a lot of code for a 50ms boot-time improvement.  Does anyone have
> any opinions on whether or not the benefits are worth the cost?

Well... when I saw this my immediate reaction was "oh no, yet another
decompressor for the kernel".  We have five of these things already.
Do we really need a sixth?

My feeling is that we should have:
- one decompressor which is the fastest
- one decompressor for the highest compression ratio
- one popular decompressor (eg conventional gzip)

And if we have a replacement one for one of these, then it should do
exactly that: replace it.  I realise that various architectures will
behave differently, so we should really be looking at numbers across
several arches.

Otherwise, where do we stop adding new ones?  After we have 6 of these
(which is after this one).  After 12?  After the 20th?

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 0/4] Add support for LZ4-compressed kernels
  2013-01-29 10:15   ` Russell King - ARM Linux
@ 2013-01-29 11:43     ` Egon Alter
  2013-01-29 12:15       ` Russell King - ARM Linux
  2013-02-01  8:15       ` kyungsik.lee
  2013-01-30  3:36     ` H. Peter Anvin
  1 sibling, 2 replies; 30+ messages in thread
From: Egon Alter @ 2013-01-29 11:43 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Russell King - ARM Linux, Andrew Morton, Michal Marek, hyojun.im,
	raphael.andy.lee, linux-kbuild, gunho.lee, namhyung.kim, x86,
	linux-kernel, Josh Triplett, Nitin Gupta, Richard Purdie,
	Ingo Molnar, Joe Millenbach, chan.jeong, Kyungsik Lee,
	H. Peter Anvin, Thomas Gleixner, Albin Tonnerre,
	CE Linux Developers List, minchan.kim

On Tuesday, 29 January 2013, 10:15:49, Russell King - ARM Linux wrote:
> On Mon, Jan 28, 2013 at 02:25:10PM -0800, Andrew Morton wrote:
> > What's this "with enabled unaligned memory access" thing?  You mean "if
> > the arch supports CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS"?  If so,
> > that's only x86, which isn't really in the target market for this
> > patch, yes?
> > 
> > It's a lot of code for a 50ms boot-time improvement.  Does anyone have
> > any opinions on whether or not the benefits are worth the cost?
> 
> Well... when I saw this my immediate reaction was "oh no, yet another
> decompressor for the kernel".  We have five of these things already.
> Do we really need a sixth?
> 
> My feeling is that we should have:
> - one decompressor which is the fastest
> - one decompressor for the highest compression ratio
> - one popular decompressor (eg conventional gzip)

The problem gets more complicated, as the "fastest" decompressor usually
creates larger images, which take more time to load from storage. For example,
a 1 MB larger image on 10 MB/s storage (note: bootloaders often configure
the storage controllers in slow modes) adds 100 ms of boot time, thus
eating the gain of a "fast decompressor".
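
The trade-off described above can be sketched as a small calculation. This is only an illustration under the stated assumptions (a constant storage read rate, sizes in decimal bytes); the function names extra_load_ms() and net_gain_ms() are hypothetical and do not come from the patchset.

```c
#include <assert.h>
#include <stdint.h>

/* Extra time, in ms, needed to read `extra_bytes` more data at `bytes_per_sec`. */
int64_t extra_load_ms(int64_t extra_bytes, int64_t bytes_per_sec)
{
	assert(extra_bytes >= 0 && bytes_per_sec > 0);
	return extra_bytes * 1000 / bytes_per_sec;
}

/*
 * Net boot-time change in ms from switching to a faster decompressor
 * whose image is `extra_bytes` larger. A negative result means the
 * extra load time outweighs the decompression saving.
 */
int64_t net_gain_ms(int64_t decompress_saving_ms, int64_t extra_bytes,
		    int64_t bytes_per_sec)
{
	return decompress_saving_ms - extra_load_ms(extra_bytes, bytes_per_sec);
}
```

With the 1 MB and 10 MB/s figures from the paragraph, extra_load_ms() yields 100 ms, so a decompressor that saves only 50 ms would come out 50 ms behind overall.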

	Egon


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 0/4] Add support for LZ4-compressed kernels
  2013-01-29 11:43     ` Egon Alter
@ 2013-01-29 12:15       ` Russell King - ARM Linux
  2013-02-01  8:15       ` kyungsik.lee
  1 sibling, 0 replies; 30+ messages in thread
From: Russell King - ARM Linux @ 2013-01-29 12:15 UTC (permalink / raw)
  To: Egon Alter
  Cc: linux-arm-kernel, Andrew Morton, Michal Marek, hyojun.im,
	raphael.andy.lee, linux-kbuild, gunho.lee, namhyung.kim, x86,
	linux-kernel, Josh Triplett, Nitin Gupta, Richard Purdie,
	Ingo Molnar, Joe Millenbach, chan.jeong, Kyungsik Lee,
	H. Peter Anvin, Thomas Gleixner, Albin Tonnerre,
	CE Linux Developers List, minchan.kim

On Tue, Jan 29, 2013 at 12:43:20PM +0100, Egon Alter wrote:
> On Tuesday, 29 January 2013, 10:15:49, Russell King - ARM Linux wrote:
> > On Mon, Jan 28, 2013 at 02:25:10PM -0800, Andrew Morton wrote:
> > > What's this "with enabled unaligned memory access" thing?  You mean "if
> > > the arch supports CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS"?  If so,
> > > that's only x86, which isn't really in the target market for this
> > > patch, yes?
> > > 
> > > It's a lot of code for a 50ms boot-time improvement.  Does anyone have
> > > any opinions on whether or not the benefits are worth the cost?
> > 
> > Well... when I saw this my immediate reaction was "oh no, yet another
> > decompressor for the kernel".  We have five of these things already.
> > Do we really need a sixth?
> > 
> > My feeling is that we should have:
> > - one decompressor which is the fastest
> > - one decompressor for the highest compression ratio
> > - one popular decompressor (eg conventional gzip)
> 
> the problem gets more complicated as the "fastest" decompressor usually 
> creates larger images which need more time to load from the storage, e.g. a 
> one MB larger image on a 10 MB/s storage (note: bootloaders often configure 
> the  storage controllers in slow modes) gives 100 ms more boot time, thus 
> eating the gain of a "fast decompressor".

Ok.

We already have:

- lzma: 33% smaller than gzip, decompression speed between gzip and bzip2
- xz: 30% smaller than gzip, decompression speed similar to lzma
- bzip2: 10% smaller than gzip, slowest decompression
- gzip: reference implementation
- lzo: 10% bigger than gzip, fastest

And now:

- lz4: 8% bigger than lzo, 16% faster than lzo?
  (I make that about 19% bigger than gzip)

So, image size wise, on a 2MB compressed gzip image, we're looking at
the difference between LZO at 2.2MB and LZ4 at 2.38MB.

But let's not stop there - the figures given for a 13MB decompressed
image were:

lzo: 6.3MB, 301ms
lz4: 6.8MB, 251ms(167ms, with enabled unaligned memory access)

At 10MB/s (your figure), it takes .68s to read 6.8MB as opposed to .63s
for LZO.  So, totalling up these figures gives the overall figure:

lzo: 301ms + 630ms = 931ms
lz4: 167ms + 680ms = 847ms

Which gives the tradeoff at 10MB/s of 9% faster (but only with efficient
unaligned memory access.)  So... this faster decompressor is still the
fastest even with your media transfer rate factored in.

That gives an argument for replacing lzo with lz4...
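
For readers who want to redo the sums, here is a minimal sketch of the load-plus-decompress total and of the storage rate at which the two formats break even. It assumes a constant read rate and sizes in units of 1000 bytes; struct boot_image, total_boot_ms() and break_even_kb_per_s() are hypothetical names, not part of the patchset.

```c
#include <assert.h>
#include <stdint.h>

struct boot_image {
	uint64_t size_kb;       /* compressed size, in units of 1000 bytes */
	uint64_t decompress_ms; /* measured decompression time */
};

/* Load time plus decompression time, in ms, at a constant read rate. */
uint64_t total_boot_ms(struct boot_image img, uint64_t rate_kb_per_s)
{
	assert(rate_kb_per_s > 0);
	return img.decompress_ms + img.size_kb * 1000 / rate_kb_per_s;
}

/*
 * Read rate (KB/s, truncated) at which both images take the same total
 * time. Above it the larger-but-faster image wins, below it the
 * smaller-but-slower one wins.
 */
uint64_t break_even_kb_per_s(struct boot_image small_slow,
			     struct boot_image big_fast)
{
	assert(big_fast.size_kb > small_slow.size_kb);
	assert(small_slow.decompress_ms > big_fast.decompress_ms);
	return (big_fast.size_kb - small_slow.size_kb) * 1000 /
	       (small_slow.decompress_ms - big_fast.decompress_ms);
}
```

Summing the quoted figures at 10MB/s gives 931ms for LZO (6.3MB, 301ms) and 847ms for LZ4 with unaligned access (6.8MB, 167ms). With the 251ms LZ4 figure the total is also 931ms, so 10MB/s is exactly the break-even rate in that case; with the 167ms figure the break-even rate drops to roughly 3.7MB/s.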

^ permalink raw reply	[flat|nested] 30+ messages in thread

* RE: Re: [RFC PATCH 0/4] Add support for LZ4-compressed kernels
  2013-01-28 22:25 ` [RFC PATCH 0/4] Add " Andrew Morton
                     ` (3 preceding siblings ...)
  2013-01-29 10:15   ` Russell King - ARM Linux
@ 2013-01-29 21:09   ` Rajesh Pawar
  2013-02-01  7:00     ` kyungsik.lee
  4 siblings, 1 reply; 30+ messages in thread
From: Rajesh Pawar @ 2013-01-29 21:09 UTC (permalink / raw)
  To: Andrew Morton, Kyungsik Lee
  Cc: H. Peter Anvin, Michal Marek, Ingo Molnar, Thomas Gleixner,
	Russell King, linux-kernel, linux-kbuild, Rajesh Pawar,
	linux-arm-kernel

> Andrew Morton <akpm@linux-foundation.org> wrote:
>
>On Sat, 26 Jan 2013 14:50:43 +0900
>Kyungsik Lee <kyungsik.lee@lge.com> wrote:
>> This patchset is for supporting LZ4 compressed kernel and initial ramdisk on
>> the x86 and ARM architectures.
>> 
>> According to http://code.google.com/p/lz4/, LZ4 is a very fast lossless
>> compression algorithm and also features an extremely fast decoder.
>> 
>> Kernel Decompression APIs are based on implementation by Yann Collet
>> (http://code.google.com/p/lz4/source/checkout).
>> De/compression Tools are also provided from the site above.
>> 
>> The initial test result on ARM(v7) based board shows that the size of kernel
>> with LZ4 compressed is 8% bigger than LZO compressed but the decompressing
>> speed is faster(especially under the enabled unaligned memory access).
>> 
>> Test: 3.4 based kernel built with many modules
>> Uncompressed kernel size: 13MB
>> lzo: 6.3MB, 301ms
>> lz4: 6.8MB, 251ms(167ms, with enabled unaligned memory access)
>> 
>> It seems that it’s worth trying LZ4 compressed kernel image or ramdisk
>> for making the kernel boot more faster.
>>
>> ...
>>
>> 20 files changed, 663 insertions(+), 3 deletions(-)
>>
>> ...
>>
>What's this "with enabled unaligned memory access" thing? You mean "if
>the arch supports CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS"? If so,
>that's only x86, which isn't really in the target market for this
>patch, yes?
>It's a lot of code for a 50ms boot-time improvement. Does anyone have
>any opinions on whether or not the benefits are worth the cost?

BTW, what happened to the proposed LZO update? Wouldn't it be better to merge that first?

Also, under the hood LZ4 seems to be quite similar to LZO, so LZO speed
would probably also benefit greatly from unaligned access and some other
ARM optimisations.
 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 0/4] Add support for LZ4-compressed kernels
  2013-01-26  5:50 [RFC PATCH 0/4] Add support for LZ4-compressed kernels Kyungsik Lee
                   ` (4 preceding siblings ...)
  2013-01-28 22:25 ` [RFC PATCH 0/4] Add " Andrew Morton
@ 2013-01-29 22:55 ` David Sterba
  2013-02-01  7:13   ` kyungsik.lee
  5 siblings, 1 reply; 30+ messages in thread
From: David Sterba @ 2013-01-29 22:55 UTC (permalink / raw)
  To: Kyungsik Lee
  Cc: Russell King, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Michal Marek, linux-arm-kernel, linux-kernel, linux-kbuild, x86,
	Nitin Gupta, Richard Purdie, Josh Triplett, Joe Millenbach,
	Andrew Morton, Albin Tonnerre, hyojun.im, chan.jeong, gunho.lee,
	minchan.kim, namhyung.kim, raphael.andy.lee

On Sat, Jan 26, 2013 at 02:50:43PM +0900, Kyungsik Lee wrote:
> This patchset is for supporting LZ4 compressed kernel and initial ramdisk on
> the x86 and ARM architectures.

Have you considered the 'high compression' mode of lz4?
http://code.google.com/p/lz4/source/browse/trunk/lz4hc.c

The compression format remains the same; the compressor tries harder
(but is slower), and the resulting compression ratio is better.

An example compression of vmlinux.bin from an x86_64 build:

input size: 16509520 bytes

lz4 (svn 88):
output size:        6393684    (38.7%)
compression time:      41.7 ms (395 MB/s)
decompression time:    13.7 ms (1204 MB/s)

lz4hc (svn 88):
output size:        5319137    (32.2%)
compression time:       683 ms (24 MB/s)
decompression time:    13.1 ms (1259 MB/s)

compressed file delta: 6393684 - 5319137 = 1074547 ~ 1MB

tested on a Nehalem box; same test on my slow desktop gives

lz4:
compression time:      97   ms (169 MB/s)
decompression time:    25.7 ms (643 MB/s)

lz4hc:
compression time:    1386 ms (11 MB/s)
decompression time:    26 ms (619 MB/s)

While the decompression time is almost the same, the image size is smaller.
The kernel image compression is run in userspace, and the low speed is
not much of a concern for a one-time operation.

For reference, lzo (current kernel version) run on the desktop:

output size:         6026256 (36.5%)
decompression time:     79.6 ms (207 MB/s)
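
The percentages and MB/s values in these tables can be rechecked with a short sketch like the one below. It assumes MB means 10^6 bytes and truncates rather than rounds, so a result may differ from a listed value in the last digit; ratio_permille() and throughput_mb_per_s() are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/* Compressed size in tenths of a percent of the input (387 means 38.7%). */
uint64_t ratio_permille(uint64_t out_bytes, uint64_t in_bytes)
{
	assert(in_bytes > 0);
	return out_bytes * 1000 / in_bytes;
}

/*
 * Throughput in MB/s, with MB taken as 10^6 bytes. Bytes per
 * microsecond is numerically the same as MB per second.
 */
uint64_t throughput_mb_per_s(uint64_t bytes, uint64_t time_us)
{
	assert(time_us > 0);
	return bytes / time_us;
}
```

For the 16509520-byte input, ratio_permille() gives 387 for the lz4 output and 322 for the lz4hc output, and throughput_mb_per_s() gives 395 for the 41.7 ms compression run.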

> It seems that it’s worth trying LZ4 compressed kernel image or ramdisk 
> for making the kernel boot more faster.

There's another potential user of lz4: btrfs. I've submitted a feature
preview integrating lz4 compression
http://thread.gmane.org/gmane.comp.file-systems.btrfs/15744
and we have tried to integrate the HC mode as well
http://thread.gmane.org/gmane.comp.file-systems.btrfs/18165

So far it's on a slow track. Conceptually it works, but the code needs
some work before it could live under lib/* (we've used the svn sources
with minor changes, no kernel coding style). It would be easier for me
to enhance the existing lib/lz4/* codebase.

Also zram could consider lz4, I'm not sure if there are other potential
users.


david

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 0/4] Add support for LZ4-compressed kernels
  2013-01-29 10:15   ` Russell King - ARM Linux
  2013-01-29 11:43     ` Egon Alter
@ 2013-01-30  3:36     ` H. Peter Anvin
  2013-01-30 18:33       ` Nicolas Pitre
  1 sibling, 1 reply; 30+ messages in thread
From: H. Peter Anvin @ 2013-01-30  3:36 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Andrew Morton, Kyungsik Lee, Thomas Gleixner, Ingo Molnar,
	Michal Marek, linux-arm-kernel, linux-kernel, linux-kbuild, x86,
	Nitin Gupta, Richard Purdie, Josh Triplett, Joe Millenbach,
	Albin Tonnerre, hyojun.im, chan.jeong, gunho.lee, minchan.kim,
	namhyung.kim, raphael.andy.lee, CE Linux Developers List

On 01/29/2013 02:15 AM, Russell King - ARM Linux wrote:
> On Mon, Jan 28, 2013 at 02:25:10PM -0800, Andrew Morton wrote:
>> What's this "with enabled unaligned memory access" thing?  You mean "if
>> the arch supports CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS"?  If so,
>> that's only x86, which isn't really in the target market for this
>> patch, yes?
>>
>> It's a lot of code for a 50ms boot-time improvement.  Does anyone have
>> any opinions on whether or not the benefits are worth the cost?
>
> Well... when I saw this my immediate reaction was "oh no, yet another
> decompressor for the kernel".  We have five of these things already.
> Do we really need a sixth?
>
> My feeling is that we should have:
> - one decompressor which is the fastest
> - one decompressor for the highest compression ratio
> - one popular decompressor (eg conventional gzip)
>
> And if we have a replacement one for one of these, then it should do
> exactly that: replace it.  I realise that various architectures will
> behave differently, so we should really be looking at numbers across
> several arches.
>
> Otherwise, where do we stop adding new ones?  After we have 6 of these
> (which is after this one).  After 12?  After the 20th?
>

The only concern I have with that is if someone paints themselves into a 
corner and absolutely wants, say, LZO.

Otherwise, per your list it pretty much sounds like we should have lz4, 
gzip, and xz.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 0/4] Add support for LZ4-compressed kernels
  2013-01-29  4:29   ` Nicolas Pitre
  2013-01-29  6:18     ` H. Peter Anvin
@ 2013-01-30 10:23     ` Johannes Stezenbach
  2013-02-04  2:02       ` Markus F.X.J. Oberhumer
  1 sibling, 1 reply; 30+ messages in thread
From: Johannes Stezenbach @ 2013-01-30 10:23 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Andrew Morton, Kyungsik Lee, Russell King, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, Michal Marek, linux-arm-kernel,
	linux-kernel, linux-kbuild, x86, Nitin Gupta, Richard Purdie,
	Josh Triplett, Joe Millenbach, Albin Tonnerre, hyojun.im,
	chan.jeong, gunho.lee, minchan.kim, namhyung.kim,
	raphael.andy.lee, CE Linux Developers List,
	Markus F.X.J. Oberhumer

On Mon, Jan 28, 2013 at 11:29:14PM -0500, Nicolas Pitre wrote:
> On Mon, 28 Jan 2013, Andrew Morton wrote:
> 
> > On Sat, 26 Jan 2013 14:50:43 +0900
> > Kyungsik Lee <kyungsik.lee@lge.com> wrote:
> > 
> > > This patchset is for supporting LZ4 compressed kernel and initial ramdisk on
> > > the x86 and ARM architectures.
> > > 
> > > According to http://code.google.com/p/lz4/, LZ4 is a very fast lossless
> > > compression algorithm and also features an extremely fast decoder.
> > > 
> > > Kernel Decompression APIs are based on implementation by Yann Collet
> > > (http://code.google.com/p/lz4/source/checkout).
> > > De/compression Tools are also provided from the site above.
> > > 
> > > The initial test result on ARM(v7) based board shows that the size of kernel
> > > with LZ4 compressed is 8% bigger than LZO compressed  but the decompressing
> > > speed is faster(especially under the enabled unaligned memory access).
> > > 
> > > Test: 3.4 based kernel built with many modules
> > > Uncompressed kernel size: 13MB
> > > lzo: 6.3MB, 301ms
> > > lz4: 6.8MB, 251ms(167ms, with enabled unaligned memory access)
> > 
> > What's this "with enabled unaligned memory access" thing?  You mean "if
> > the arch supports CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS"?  If so,
> > that's only x86, which isn't really in the target market for this
> > patch, yes?
> 
> I'm guessing this is referring to commit 5010192d5a.
> 
> > It's a lot of code for a 50ms boot-time improvement.  Does anyone have
> > any opinions on whether or not the benefits are worth the cost?
> 
> Well, we used to have only one compressed format.  Now we have nearly 
> half a dozen, with the same worthiness issue between themselves.  
> Either we keep it very simple, or we make it very flexible.  The former 
> would argue in favor of removing some of the existing formats, the latter 
> would let this new format in.

This reminded me to check the status of the lzo update and it
seems it got lost?
http://lkml.org/lkml/2012/10/3/144

(Cc: added, I hope Markus still cares and someone could
eventually take his patch once he resends it.)

Johannes

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 0/4] Add support for LZ4-compressed kernels
  2013-01-30  3:36     ` H. Peter Anvin
@ 2013-01-30 18:33       ` Nicolas Pitre
  2013-01-31 21:48         ` H. Peter Anvin
  0 siblings, 1 reply; 30+ messages in thread
From: Nicolas Pitre @ 2013-01-30 18:33 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Russell King - ARM Linux, Michal Marek, hyojun.im,
	raphael.andy.lee, linux-kbuild, gunho.lee, namhyung.kim, x86,
	minchan.kim, linux-kernel, Josh Triplett, Nitin Gupta,
	Richard Purdie, Ingo Molnar, Joe Millenbach, chan.jeong,
	Kyungsik Lee, Andrew Morton, Albin Tonnerre,
	CE Linux Developers List, Thomas Gleixner, linux-arm-kernel

On Tue, 29 Jan 2013, H. Peter Anvin wrote:

> On 01/29/2013 02:15 AM, Russell King - ARM Linux wrote:
> > On Mon, Jan 28, 2013 at 02:25:10PM -0800, Andrew Morton wrote:
> > > What's this "with enabled unaligned memory access" thing?  You mean "if
> > > the arch supports CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS"?  If so,
> > > that's only x86, which isn't really in the target market for this
> > > patch, yes?
> > > 
> > > It's a lot of code for a 50ms boot-time improvement.  Does anyone have
> > > any opinions on whether or not the benefits are worth the cost?
> > 
> > Well... when I saw this my immediate reaction was "oh no, yet another
> > decompressor for the kernel".  We have five of these things already.
> > Do we really need a sixth?
> > 
> > My feeling is that we should have:
> > - one decompressor which is the fastest
> > - one decompressor for the highest compression ratio
> > - one popular decompressor (eg conventional gzip)
> > 
> > And if we have a replacement one for one of these, then it should do
> > exactly that: replace it.  I realise that various architectures will
> > behave differently, so we should really be looking at numbers across
> > several arches.
> > 
> > Otherwise, where do we stop adding new ones?  After we have 6 of these
> > (which is after this one).  After 12?  After the 20th?
> > 
> 
> The only concern I have with that is if someone paints themselves into a
> corner and absolutely wants, say, LZO.

That would be hard to justify given that the kernel provides its own 
decompressor code, making the compression format transparent to 
bootloaders, etc.  And no one should be poking into the compressed 
zImage.

> Otherwise, per your list it pretty much sounds like we should have lz4, gzip,
> and xz.

I do agree with that.


Nicolas

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 0/4] Add support for LZ4-compressed kernels
  2013-01-30 18:33       ` Nicolas Pitre
@ 2013-01-31 21:48         ` H. Peter Anvin
  2013-01-31 22:16           ` Nicolas Pitre
  0 siblings, 1 reply; 30+ messages in thread
From: H. Peter Anvin @ 2013-01-31 21:48 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Russell King - ARM Linux, Michal Marek, hyojun.im,
	raphael.andy.lee, linux-kbuild, gunho.lee, namhyung.kim, x86,
	minchan.kim, linux-kernel, Josh Triplett, Nitin Gupta,
	Richard Purdie, Ingo Molnar, Joe Millenbach, chan.jeong,
	Kyungsik Lee, Andrew Morton, Albin Tonnerre,
	CE Linux Developers List, Thomas Gleixner, linux-arm-kernel

On 01/30/2013 10:33 AM, Nicolas Pitre wrote:
>>
>> The only concern I have with that is if someone paints themselves into a
>> corner and absolutely wants, say, LZO.
> 
> That would be hard to justify given that the kernel provides its own 
> decompressor code, making the compression format transparent to 
> bootloaders, etc.  And no one should be poking into the compressed 
> zImage.
> 

Some utterly weird things like the Xen domain builder do that, because
they have to.  That is why we explicitly document that the payload is
ELF and how to access it in the bzImage spec.

	-hpa



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 0/4] Add support for LZ4-compressed kernels
  2013-01-31 21:48         ` H. Peter Anvin
@ 2013-01-31 22:16           ` Nicolas Pitre
  2013-01-31 22:18             ` H. Peter Anvin
  0 siblings, 1 reply; 30+ messages in thread
From: Nicolas Pitre @ 2013-01-31 22:16 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Russell King - ARM Linux, Michal Marek, hyojun.im,
	raphael.andy.lee, linux-kbuild, gunho.lee, namhyung.kim, x86,
	minchan.kim, linux-kernel, Josh Triplett, Nitin Gupta,
	Richard Purdie, Ingo Molnar, Joe Millenbach, chan.jeong,
	Kyungsik Lee, Andrew Morton, Albin Tonnerre,
	CE Linux Developers List, Thomas Gleixner, linux-arm-kernel

On Thu, 31 Jan 2013, H. Peter Anvin wrote:

> On 01/30/2013 10:33 AM, Nicolas Pitre wrote:
> >>
> >> The only concern I have with that is if someone paints themselves into a
> >> corner and absolutely wants, say, LZO.
> > 
> > That would be hard to justify given that the kernel provides its own 
> > decompressor code, making the compression format transparent to 
> > bootloaders, etc.  And no one should be poking into the compressed 
> > zImage.
> > 
> 
> Some utterly weird things like the Xen domain builder do that, because
> they have to.  That is why we explicitly document that the payload is
> ELF and how to access it in the bzImage spec.

Are you kidding?

And what format do they expect?

If people are doing weird things with formats we're about to remove then 
it's their fault if they didn't make upstream developers aware of it.  
And if the reason they didn't tell anyone is because it is too nasty for 
public confession then they simply deserve to be broken and come up with 
a more sustainable solution.


Nicolas

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 0/4] Add support for LZ4-compressed kernels
  2013-01-31 22:16           ` Nicolas Pitre
@ 2013-01-31 22:18             ` H. Peter Anvin
  2013-02-01  2:28               ` Nicolas Pitre
  0 siblings, 1 reply; 30+ messages in thread
From: H. Peter Anvin @ 2013-01-31 22:18 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Russell King - ARM Linux, Michal Marek, hyojun.im,
	raphael.andy.lee, linux-kbuild, gunho.lee, namhyung.kim, x86,
	minchan.kim, linux-kernel, Josh Triplett, Nitin Gupta,
	Richard Purdie, Ingo Molnar, Joe Millenbach, chan.jeong,
	Kyungsik Lee, Andrew Morton, Albin Tonnerre,
	CE Linux Developers List, Thomas Gleixner, linux-arm-kernel

On 01/31/2013 02:16 PM, Nicolas Pitre wrote:
>>
>> Some utterly weird things like the Xen domain builder do that, because
>> they have to.  That is why we explicitly document that the payload is
>> ELF and how to access it in the bzImage spec.
> 
> Are you kidding?
> 
> And what format do they expect?
> 

I think they can be fairly flexible.  Obviously gzip is always
supported.  I don't know the details.

> If people are doing weird things with formats we're about to remove then 
> it's their fault if they didn't make upstream developers aware of it.  
> And if the reason they didn't tell anyone is because it is too nasty for 
> public confession then they simply deserve to be broken and come up with 
> a more sustainable solution.

Well, it is too nasty for public confession, but it's called
"paravirtualization".

	-hpa



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 0/4] Add support for LZ4-compressed kernels
  2013-01-31 22:18             ` H. Peter Anvin
@ 2013-02-01  2:28               ` Nicolas Pitre
  2013-02-01  6:37                 ` H. Peter Anvin
  0 siblings, 1 reply; 30+ messages in thread
From: Nicolas Pitre @ 2013-02-01  2:28 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Russell King - ARM Linux, Michal Marek, hyojun.im,
	raphael.andy.lee, linux-kbuild, gunho.lee, namhyung.kim, x86,
	minchan.kim, linux-kernel, Josh Triplett, Nitin Gupta,
	Richard Purdie, Ingo Molnar, Joe Millenbach, chan.jeong,
	Kyungsik Lee, Andrew Morton, Albin Tonnerre,
	CE Linux Developers List, Thomas Gleixner, linux-arm-kernel

On Thu, 31 Jan 2013, H. Peter Anvin wrote:

> On 01/31/2013 02:16 PM, Nicolas Pitre wrote:
> >>
> >> Some utterly weird things like the Xen domain builder do that, because
> >> they have to.  That is why we explicitly document that the payload is
> >> ELF and how to access it in the bzImage spec.
> > 
> > Are you kidding?
> > 
> > And what format do they expect?
> > 
> 
> I think they can be fairly flexible.  Obviously gzip is always
> supported.  I don't know the details.
> 
> > If people are doing weird things with formats we're about to remove then 
> > it's their fault if they didn't make upstream developers aware of it.  
> > And if the reason they didn't tell anyone is because it is too nasty for 
> > public confession then they simply deserve to be broken and come up with 
> > a more sustainable solution.
> 
> Well, it is too nasty for public confession, but it's called
> "paravirtualization".

The fact that you are aware of it means we're not going to break them.  

But my point is that we must not be held back just in case someone out 
there might have painted himself in a corner without telling anyone.


Nicolas

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 0/4] Add support for LZ4-compressed kernels
  2013-02-01  2:28               ` Nicolas Pitre
@ 2013-02-01  6:37                 ` H. Peter Anvin
  0 siblings, 0 replies; 30+ messages in thread
From: H. Peter Anvin @ 2013-02-01  6:37 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Russell King - ARM Linux, Michal Marek, hyojun.im,
	raphael.andy.lee, linux-kbuild, gunho.lee, namhyung.kim, x86,
	minchan.kim, linux-kernel, Josh Triplett, Nitin Gupta,
	Richard Purdie, Ingo Molnar, Joe Millenbach, chan.jeong,
	Kyungsik Lee, Andrew Morton, Albin Tonnerre,
	CE Linux Developers List, Thomas Gleixner, linux-arm-kernel

On 01/31/2013 06:28 PM, Nicolas Pitre wrote:
>>
>> Well, it is too nasty for public confession, but it's called
>> "paravirtualization".
> 
> The fact that you are aware of it means we're not going to break them.  
> 
> But my point is that we must not be held back just in case someone out 
> there might have painted himself in a corner without telling anyone.
> 

Yes.  However, it makes it more questionable to simply rip out
compression methods without warning.  Not that warnings help, as we have
learned.

	-hpa



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 0/4] Add support for LZ4-compressed kernels
  2013-01-29 21:09   ` Rajesh Pawar
@ 2013-02-01  7:00     ` kyungsik.lee
  2013-02-04  1:37       ` Markus F.X.J. Oberhumer
  0 siblings, 1 reply; 30+ messages in thread
From: kyungsik.lee @ 2013-02-01  7:00 UTC (permalink / raw)
  To: Rajesh Pawar
  Cc: Andrew Morton, H. Peter Anvin, Michal Marek, Ingo Molnar,
	Thomas Gleixner, Russell King, linux-kernel, linux-kbuild,
	linux-arm-kernel, David Sterba, 임효준,
	정찬균, minchan, 김남형,
	Richard Cochran, Egon Alter, CE Linux Developers List, markus,
	raphael.andy.lee

On 2013-01-30 6:09 AM, Rajesh Pawar wrote:
>> Andrew Morton <akpm@linux-foundation.org> wrote:
>>
>> On Sat, 26 Jan 2013 14:50:43 +0900
>> Kyungsik Lee <kyungsik.lee@lge.com> wrote:
>>> This patchset is for supporting LZ4 compressed kernel and initial ramdisk on
>>> the x86 and ARM architectures.
>>>
>>> According to http://code.google.com/p/lz4/, LZ4 is a very fast lossless
>>> compression algorithm and also features an extremely fast decoder.
>>>
>>> Kernel Decompression APIs are based on implementation by Yann Collet
>>> (http://code.google.com/p/lz4/source/checkout).
>>> De/compression Tools are also provided from the site above.
>>>
>>> The initial test result on ARM(v7) based board shows that the size of kernel
>>> with LZ4 compressed is 8% bigger than LZO compressed but the decompressing
>>> speed is faster(especially under the enabled unaligned memory access).
>>>
>>> Test: 3.4 based kernel built with many modules
>>> Uncompressed kernel size: 13MB
>>> lzo: 6.3MB, 301ms
>>> lz4: 6.8MB, 251ms(167ms, with enabled unaligned memory access)
>>>
>>> It seems that it's worth trying LZ4 compressed kernel image or ramdisk
>>> for making the kernel boot more faster.
>>>
>>> ...
>>>
>>> 20 files changed, 663 insertions(+), 3 deletions(-)
>>>
>>> ...
>>>
>> What's this "with enabled unaligned memory access" thing? You mean "if
>> the arch supports CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS"? If so,
>> that's only x86, which isn't really in the target market for this
>> patch, yes?
>> It's a lot of code for a 50ms boot-time improvement. Does anyone have
>> any opinions on whether or not the benefits are worth the cost?
> BTW, what happened to the proposed LZO update - wouldn't it be better to merge this first?
>
> Also, under the hood LZ4 seems to be quite similar to LZO, so probably
> LZO speed would also greatly benefit from unaligned access and some other
> ARM optimisations
>   
I didn't test with the proposed LZO update you mentioned. Sorry, which 
one do you mean?
I did some tests with the latest LZO in the mainline.

The result: LZO does not get faster with unaligned access enabled on ARM. 
It is actually slower.

Decompression time: 336ms (383ms with unaligned access enabled)

See https://lkml.org/lkml/2012/10/7/85 for more details.

Thanks,
Kyungsik



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 0/4] Add support for LZ4-compressed kernels
  2013-01-29 22:55 ` David Sterba
@ 2013-02-01  7:13   ` kyungsik.lee
  0 siblings, 0 replies; 30+ messages in thread
From: kyungsik.lee @ 2013-02-01  7:13 UTC (permalink / raw)
  To: dsterba
  Cc: Russell King, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Michal Marek, linux-arm-kernel, linux-kernel, linux-kbuild, x86,
	Nitin Gupta, Richard Purdie, Josh Triplett, Joe Millenbach,
	Andrew Morton, Albin Tonnerre, hyojun.im, chan.jeong, gunho.lee,
	minchan.kim, namhyung.kim, raphael.andy.lee

On 2013-01-30 7:55 AM, David Sterba wrote:
> On Sat, Jan 26, 2013 at 02:50:43PM +0900, Kyungsik Lee wrote:
>> This patchset is for supporting LZ4 compressed kernel and initial ramdisk on
>> the x86 and ARM architectures.
> Have you considered the 'high compression' mode of lz4?
> http://code.google.com/p/lz4/source/browse/trunk/lz4hc.c
>
> The compression format remains the same, the compressor tries harder
> (but is slower), resulting compression ratio is better.
>
> an example compression for vmlinux.bin of x86_64 build:
>
> input size: 16509520 bytes
>
> lz4 (svn 88):
> output size:        6393684    (38.7%)
> compression time:      41.7 ms (395 MB/s)
> decompression time:    13.7 ms (1204 MB/s)
>
> lz4hc (svn 88):
> output size:        5319137    (32.2%)
> compression time:       683 ms (24 MB/s)
> decompression time:    13.1 ms (1259 MB/s)
>
> compressed file delta: 6393684 - 5319137 = 1074547 ~ 1MB
>
> tested on a Nehalem box; same test on my slow desktop gives
>
> lz4:
> compression time:      97   ms (169 MB/s)
> decompression time:    25.7 ms (643 MB/s)
>
> lz4hc:
> compression time:    1386 ms (11 MB/s)
> decompression time:    26 ms (619 MB/s)
>
> While the decompression time is almost the same, image size is smaller.
> The kernel image compression is run in userspace and the low speed is
> not much of concern for a one-time operation.
>
> For reference, lzo (current kernel version) run on the desktop:
>
> output size:         6026256 (36.5%)
> decompression time:     79.6 ms (207 MB/s)
>
>> It seems that it’s worth trying LZ4 compressed kernel image or ramdisk
>> for making the kernel boot more faster.
> There's another potential user of lz4: btrfs. I've submitted a feature
> preview integrating lz4 compression
> http://thread.gmane.org/gmane.comp.file-systems.btrfs/15744
> and we have tried to integrate the HC mode as well
> http://thread.gmane.org/gmane.comp.file-systems.btrfs/18165
> .
> So far it's on a slow track, conceptually it works, but the code needs
> some work so it could live under lib/* (we've used the svn sources
> with minor changes, no kernel coding style). It would be easier for me
> to enhance the existing lib/lz4/* codebase.
>
> Also zram could consider lz4, I'm not sure if there are other potential
> users.
Yes, I guess squashfs and crypto would also benefit from lz4.


Thanks,
Kyungsik




^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 0/4] Add support for LZ4-compressed kernels
  2013-01-29 11:43     ` Egon Alter
  2013-01-29 12:15       ` Russell King - ARM Linux
@ 2013-02-01  8:15       ` kyungsik.lee
  1 sibling, 0 replies; 30+ messages in thread
From: kyungsik.lee @ 2013-02-01  8:15 UTC (permalink / raw)
  To: Egon Alter
  Cc: linux-arm-kernel, Russell King - ARM Linux, Andrew Morton,
	Michal Marek, hyojun.im, raphael.andy.lee, linux-kbuild,
	gunho.lee, namhyung.kim, x86, linux-kernel, Josh Triplett,
	Nitin Gupta, Richard Purdie, Ingo Molnar, Joe Millenbach,
	chan.jeong, H. Peter Anvin, Thomas Gleixner, Albin Tonnerre,
	CE Linux Developers List, minchan.kim

On 2013-01-29 8:43 PM, Egon Alter wrote:
> On Tuesday, 29 January 2013, 10:15:49, Russell King - ARM Linux wrote:
>> On Mon, Jan 28, 2013 at 02:25:10PM -0800, Andrew Morton wrote:
>>> What's this "with enabled unaligned memory access" thing?  You mean "if
>>> the arch supports CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS"?  If so,
>>> that's only x86, which isn't really in the target market for this
>>> patch, yes?
>>>
>>> It's a lot of code for a 50ms boot-time improvement.  Does anyone have
>>> any opinions on whether or not the benefits are worth the cost?
>> Well... when I saw this my immediate reaction was "oh no, yet another
>> decompressor for the kernel".  We have five of these things already.
>> Do we really need a sixth?
>>
>> My feeling is that we should have:
>> - one decompressor which is the fastest
>> - one decompressor for the highest compression ratio
>> - one popular decompressor (eg conventional gzip)
> the problem gets more complicated as the "fastest" decompressor usually
> creates larger images which need more time to load from the storage, e.g. a
> one MB larger image on a 10 MB/s storage (note: bootloaders often configure
> the  storage controllers in slow modes) gives 100 ms more boot time, thus
> eating the gain of a "fast decompressor".
Yes, the larger image matters. It definitely takes longer to load.

Here are some updated test results, this time including loading time:

                      lzo      lz4
loading time:         480ms    510ms
decompression time:   336ms    180ms (*)
total time:           816ms    690ms

(*) with efficient unaligned memory access enabled and ARM optimization

lz4 is still 15% faster in total time. This is similar to the 
simulated result by Russell King.
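
For anyone who wants to redo this arithmetic with their own numbers, here
is a minimal sketch in C. The helper names are made up for illustration
and are not part of the patchset; the only assumption is that boot cost
is modelled as load time plus decompression time.

```c
/* Hypothetical helpers for illustration, not part of the patchset. */

/* Time in ms to read `bytes` from storage at `bytes_per_sec`. */
double load_ms(double bytes, double bytes_per_sec)
{
	return bytes / bytes_per_sec * 1000.0;
}

/* Boot cost modelled as load time plus decompression time (both in ms). */
double total_ms(double load, double decompress)
{
	return load + decompress;
}

/* Relative saving of `candidate` over `baseline`, in percent. */
double saving_percent(double baseline, double candidate)
{
	return (baseline - candidate) / baseline * 100.0;
}
```

With the figures above, total_ms(480, 336) gives 816 and
total_ms(510, 180) gives 690, so saving_percent(816, 690) comes out at
roughly 15%. load_ms(1e6, 10e6) reproduces the 100 ms penalty Egon
mentioned for one extra MB at 10 MB/s.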

Thanks,
Kyungsik

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 0/4] Add support for LZ4-compressed kernels
  2013-02-01  7:00     ` kyungsik.lee
@ 2013-02-04  1:37       ` Markus F.X.J. Oberhumer
  0 siblings, 0 replies; 30+ messages in thread
From: Markus F.X.J. Oberhumer @ 2013-02-04  1:37 UTC (permalink / raw)
  To: kyungsik.lee
  Cc: Rajesh Pawar, Andrew Morton, H. Peter Anvin, Michal Marek,
	Ingo Molnar, Thomas Gleixner, Russell King, linux-kernel,
	linux-kbuild, linux-arm-kernel, David Sterba,
	임효준, 정찬균,
	minchan, 김남형,
	Richard Cochran, Egon Alter, CE Linux Developers List,
	raphael.andy.lee

On 2013-02-01 08:00, kyungsik.lee wrote:
> On 2013-01-30 6:09 AM, Rajesh Pawar wrote:
>>> Andrew Morton <akpm@linux-foundation.org> wrote:
>>>
>>> On Sat, 26 Jan 2013 14:50:43 +0900
>>> Kyungsik Lee <kyungsik.lee@lge.com> wrote:
>>>> [...]
>>>>
>>> What's this "with enabled unaligned memory access" thing? You mean "if
>>> the arch supports CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS"? If so,
>>> that's only x86, which isn't really in the target market for this
>>> patch, yes?
>>> It's a lot of code for a 50ms boot-time improvement. Does anyone have
>>> any opinions on whether or not the benefits are worth the cost?
>> BTW, what happened to the proposed LZO update - wouldn't it be better to
>> merge this first?
>>
>> Also, under the hood LZ4 seems to be quite similar to LZO, so probably
>> LZO speed would also greatly benefit from unaligned access and some other
>> ARM optimisations
>>   
> I didn't test with the proposed LZO update you mentioned. Sorry, which one do
> you mean?
> I did some tests with the latest LZO in the mainline.

In fact you can easily improve LZO decompression speed on armv7 by almost 50%
by adding just a few lines to enable unaligned access:

armv7 (Cortex-A9), Linaro gcc-4.6 -O3, Silesia test corpus, 256 kB block-size:

                   compression speed   decompression speed

  LZO-2005    :          27 MB/sec           84 MB/sec
  LZO-2012    :          44 MB/sec          117 MB/sec
  LZO-2013-UA :          47 MB/sec          167 MB/sec

Please see my other mail to LKML for details.
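
As a rough userspace illustration of what an unaligned-tolerant copy
looks like (this is not the kernel code; copy4 and copy8 are names
invented for this example): memcpy() is well-defined for any alignment,
and compilers commonly lower a fixed-size memcpy() like this to a single
load/store on targets that permit unaligned access. That this matches the
effect of the direct u32 store in the patch is an assumption.

```c
#include <stdint.h>
#include <string.h>

/* Copy 4 bytes between possibly unaligned addresses. */
void copy4(void *dst, const void *src)
{
	uint32_t v;

	memcpy(&v, src, sizeof(v));	/* alignment-safe load */
	memcpy(dst, &v, sizeof(v));	/* alignment-safe store */
}

/* Copy 8 bytes as two 4-byte steps, as a 32-bit target might. */
void copy8(void *dst, const void *src)
{
	copy4(dst, src);
	copy4((unsigned char *)dst + 4, (const unsigned char *)src + 4);
}
```

Calling copy8(dst + 1, src + 3) on two byte buffers copies eight bytes
even though neither pointer is 4-byte aligned.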

Cheers,
Markus

> The result: LZO does not get faster with unaligned access enabled on ARM.
> It is actually slower.
> 
> Decompression time: 336ms (383ms with unaligned access enabled)
> 
> See https://lkml.org/lkml/2012/10/7/85 for more details.
> 
> Thanks,
> Kyungsik
> 
> 
> 

-- 
Markus Oberhumer, <markus@oberhumer.com>, http://www.oberhumer.com/

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 0/4] Add support for LZ4-compressed kernels
  2013-01-30 10:23     ` Johannes Stezenbach
@ 2013-02-04  2:02       ` Markus F.X.J. Oberhumer
  2013-02-04 10:50         ` Russell King - ARM Linux
  0 siblings, 1 reply; 30+ messages in thread
From: Markus F.X.J. Oberhumer @ 2013-02-04  2:02 UTC (permalink / raw)
  To: Johannes Stezenbach
  Cc: Nicolas Pitre, Andrew Morton, Kyungsik Lee, Russell King,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Michal Marek,
	linux-arm-kernel, linux-kernel, linux-kbuild, x86, Nitin Gupta,
	Richard Purdie, Josh Triplett, Joe Millenbach, Albin Tonnerre,
	hyojun.im, chan.jeong, gunho.lee, minchan.kim, namhyung.kim,
	raphael.andy.lee, CE Linux Developers List

[-- Attachment #1: Type: text/plain, Size: 4335 bytes --]

On 2013-01-30 11:23, Johannes Stezenbach wrote:
> On Mon, Jan 28, 2013 at 11:29:14PM -0500, Nicolas Pitre wrote:
>> On Mon, 28 Jan 2013, Andrew Morton wrote:
>>
>>> On Sat, 26 Jan 2013 14:50:43 +0900
>>> Kyungsik Lee <kyungsik.lee@lge.com> wrote:
>>>
>>>> This patchset is for supporting LZ4 compressed kernel and initial ramdisk on
>>>> the x86 and ARM architectures.
>>>>
>>>> According to http://code.google.com/p/lz4/, LZ4 is a very fast lossless
>>>> compression algorithm and also features an extremely fast decoder.
>>>>
>>>> Kernel Decompression APIs are based on implementation by Yann Collet
>>>> (http://code.google.com/p/lz4/source/checkout).
>>>> De/compression Tools are also provided from the site above.
>>>>
>>>> The initial test result on ARM(v7) based board shows that the size of kernel
>>>> with LZ4 compressed is 8% bigger than LZO compressed  but the decompressing
>>>> speed is faster(especially under the enabled unaligned memory access).
>>>>
>>>> Test: 3.4 based kernel built with many modules
>>>> Uncompressed kernel size: 13MB
>>>> lzo: 6.3MB, 301ms
>>>> lz4: 6.8MB, 251ms(167ms, with enabled unaligned memory access)
>>>
>>> What's this "with enabled unaligned memory access" thing?  You mean "if
>>> the arch supports CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS"?  If so,
>>> that's only x86, which isn't really in the target market for this
>>> patch, yes?
>>
>> I'm guessing this is referring to commit 5010192d5a.
>>
>>> It's a lot of code for a 50ms boot-time improvement.  Does anyone have
>>> any opinions on whether or not the benefits are worth the cost?
>>
>> Well, we used to have only one compressed format.  Now we have nearly 
>> half a dozen, with the same worthiness issue between themselves.  
>> Either we keep it very simple, or we make it very flexible.  The former 
>> would argue in favor of removing some of the existing formats, the latter 
>> would let this new format in.
> 
> This reminded me to check the status of the lzo update and it
> seems it got lost?
> http://lkml.org/lkml/2012/10/3/144

The proposed LZO update currently lives in the linux-next tree.

I had tried several times during the last 12 months to provide an update
of the kernel LZO version, but community interest seemed low and I
basically got no feedback about performance improvements - which made
me wonder if people actually care.

At least akpm did approve the LZO update for inclusion into 3.7, but the code
still has not been merged into the main tree.
  > On 2012-10-09 21:26, Andrew Morton wrote:
  > [...]
  > The changes look OK to me.  Please ask Stephen to include the tree in
  > linux-next, for a 3.7 merge.

Well, this probably means I have done a rather poor job of marketing. Anyway, as
people seem to love *synthetic* benchmarks I'm finally posting some timings
(including a brand new ARM unaligned version - this is just a quick hack which
probably can still be optimized further).

Hopefully publishing these numbers will help arouse more interest. :-)

Cheers,
Markus


x86_64 (Sandy Bridge), gcc-4.6 -O3, Silesia test corpus, 256 kB block-size:

                   compression speed   decompression speed

  LZO-2005    :         150 MB/sec          468 MB/sec
  LZO-2012    :         434 MB/sec         1210 MB/sec

i386 (Sandy Bridge), gcc-4.6 -O3, Silesia test corpus, 256 kB block-size:

                   compression speed   decompression speed

  LZO-2005    :         143 MB/sec          409 MB/sec
  LZO-2012    :         372 MB/sec         1121 MB/sec

armv7 (Cortex-A9), Linaro gcc-4.6 -O3, Silesia test corpus, 256 kB block-size:

                   compression speed   decompression speed

  LZO-2005    :          27 MB/sec           84 MB/sec
  LZO-2012    :          44 MB/sec          117 MB/sec
  LZO-2013-UA :          47 MB/sec          167 MB/sec

Legend:

  LZO-2005    : LZO version in current 3.8 rc6 kernel (which is based on
                   the LZO 2.02 release from 2005)
  LZO-2012    : updated LZO version available in linux-next
  LZO-2013-UA : updated LZO version available in linux-next plus
                   ARM Unaligned Access patch (attached below)
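
In case it helps when comparing rows of the tables above, a small C
sketch for turning two throughput figures into a relative speedup. The
helper name is invented for this example.

```c
/* Relative speedup of `new_mbps` over `old_mbps`, in percent. */
double speedup_percent(double old_mbps, double new_mbps)
{
	return (new_mbps - old_mbps) / old_mbps * 100.0;
}
```

For the armv7 decompression column, speedup_percent(117, 167) is about
43% (LZO-2013-UA over LZO-2012), and speedup_percent(84, 167) is about
99% (LZO-2013-UA over LZO-2005).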


> (Cc: added, I hope Markus still cares and someone could
> eventually take his patch once he resends it.)
> 
> Johannes
> 

-- 
Markus Oberhumer, <markus@oberhumer.com>, http://www.oberhumer.com/




[-- Attachment #2: lib-lzo-huge-LZO-decompression-speedup-on-ARM.patch --]
[-- Type: text/x-patch, Size: 1584 bytes --]

commit 8745b927fcfcd6953ada9bd1220a73083db5948a
Author: Markus F.X.J. Oberhumer <markus@oberhumer.com>
Date:   Mon Feb 4 02:26:14 2013 +0100

    lib/lzo: huge LZO decompression speedup on ARM by using unaligned access
    
    Signed-off-by: Markus F.X.J. Oberhumer <markus@oberhumer.com>

diff --git a/lib/lzo/lzo1x_decompress_safe.c b/lib/lzo/lzo1x_decompress_safe.c
index 569985d..e3edc5f 100644
--- a/lib/lzo/lzo1x_decompress_safe.c
+++ b/lib/lzo/lzo1x_decompress_safe.c
@@ -72,9 +72,11 @@ copy_literal_run:
 						COPY8(op, ip);
 						op += 8;
 						ip += 8;
+#  if !defined(__arm__)
 						COPY8(op, ip);
 						op += 8;
 						ip += 8;
+#  endif
 					} while (ip < ie);
 					ip = ie;
 					op = oe;
@@ -159,9 +161,11 @@ copy_literal_run:
 					COPY8(op, m_pos);
 					op += 8;
 					m_pos += 8;
+#  if !defined(__arm__)
 					COPY8(op, m_pos);
 					op += 8;
 					m_pos += 8;
+#  endif
 				} while (op < oe);
 				op = oe;
 				if (HAVE_IP(6)) {
diff --git a/lib/lzo/lzodefs.h b/lib/lzo/lzodefs.h
index 5a4beb2..b230601 100644
--- a/lib/lzo/lzodefs.h
+++ b/lib/lzo/lzodefs.h
@@ -12,8 +12,14 @@
  */
 
 
+#if 1 && defined(__arm__) && ((__LINUX_ARM_ARCH__ >= 6) || defined(__ARM_FEATURE_UNALIGNED))
+#define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS 1
+#define COPY4(dst, src)	\
+		* (u32 *) (void *) (dst) = * (const u32 *) (const void *) (src)
+#else
 #define COPY4(dst, src)	\
 		put_unaligned(get_unaligned((const u32 *)(src)), (u32 *)(dst))
+#endif
 #if defined(__x86_64__)
 #define COPY8(dst, src)	\
 		put_unaligned(get_unaligned((const u64 *)(src)), (u64 *)(dst))




^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 0/4] Add support for LZ4-compressed kernels
  2013-02-04  2:02       ` Markus F.X.J. Oberhumer
@ 2013-02-04 10:50         ` Russell King - ARM Linux
  2013-02-05 11:39           ` Johannes Stezenbach
  0 siblings, 1 reply; 30+ messages in thread
From: Russell King - ARM Linux @ 2013-02-04 10:50 UTC (permalink / raw)
  To: Markus F.X.J. Oberhumer
  Cc: Johannes Stezenbach, Nicolas Pitre, Andrew Morton, Kyungsik Lee,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Michal Marek,
	linux-arm-kernel, linux-kernel, linux-kbuild, x86, Nitin Gupta,
	Richard Purdie, Josh Triplett, Joe Millenbach, Albin Tonnerre,
	hyojun.im, chan.jeong, gunho.lee, minchan.kim, namhyung.kim,
	raphael.andy.lee, CE Linux Developers List

On Mon, Feb 04, 2013 at 03:02:49AM +0100, Markus F.X.J. Oberhumer wrote:
> At least akpm did approve the LZO update for inclusion into 3.7, but the code
> still has not been merged into the main tree.
>   > On 2012-10-09 21:26, Andrew Morton wrote:
>   > [...]
>   > The changes look OK to me.  Please ask Stephen to include the tree in
>   > linux-next, for a 3.7 merge.
> 
> Well, this probably means I have done rather poor marketing.

I assume this code is sitting in *your* tree?  How do you think it gets
into mainline?

There is no automatic way that code from linux-next gets merged into
mainline.  That is up to the tree owner to make happen, either by getting
their tree into a parent maintainer's tree, or if there is none, asking
Linus to pull your tree at the appropriate time.


* Re: [RFC PATCH 0/4] Add support for LZ4-compressed kernels
  2013-02-04 10:50         ` Russell King - ARM Linux
@ 2013-02-05 11:39           ` Johannes Stezenbach
  0 siblings, 0 replies; 30+ messages in thread
From: Johannes Stezenbach @ 2013-02-05 11:39 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Markus F.X.J. Oberhumer, Nicolas Pitre, Andrew Morton,
	Kyungsik Lee, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Michal Marek, linux-arm-kernel, linux-kernel, linux-kbuild, x86,
	Nitin Gupta, Richard Purdie, Josh Triplett, Joe Millenbach,
	Albin Tonnerre, hyojun.im, chan.jeong, gunho.lee, minchan.kim,
	namhyung.kim, raphael.andy.lee, CE Linux Developers List

On Mon, Feb 04, 2013 at 10:50:52AM +0000, Russell King - ARM Linux wrote:
> On Mon, Feb 04, 2013 at 03:02:49AM +0100, Markus F.X.J. Oberhumer wrote:
> > At least akpm did approve the LZO update for inclusion into 3.7, but the code
> > still has not been merged into the main tree.
> >   > On 2012-10-09 21:26, Andrew Morton wrote:
> >   > [...]
> >   > The changes look OK to me.  Please ask Stephen to include the tree in
> >   > linux-next, for a 3.7 merge.
> > 
> > Well, this probably means I have done rather poor marketing.
> 
> I assume this code is sitting in *your* tree?  How do you think it gets
> into mainline?
> 
> There is no automatic way that code from linux-next gets merged into
> mainline.  That is up to the tree owner to make happen, either by getting
> their tree into a parent maintainer's tree, or if there is none, asking
> Linus to pull your tree at the appropriate time.

My feeling is that in this case it is unnecessarily hard
for an outside contributor to get a patch accepted, all
because get_maintainer.pl doesn't put someone in charge.

Apparently it doesn't work to put all the usual maintainer
responsibilities onto the shoulders of a Linux development novice.
Thus it would be nice if some maintainer would come
forward and offer to handle the patches for Markus.


Thanks,
Johannes


end of thread, other threads:[~2013-02-05 11:41 UTC | newest]

Thread overview: 30+ messages
2013-01-26  5:50 [RFC PATCH 0/4] Add support for LZ4-compressed kernels Kyungsik Lee
2013-01-26  5:50 ` [RFC PATCH 1/4] decompressors: add lz4 decompressor module Kyungsik Lee
2013-01-26  5:50 ` [RFC PATCH 2/4] lib: add support for LZ4-compressed kernels Kyungsik Lee
2013-01-26  5:50 ` [RFC PATCH 3/4] arm: " Kyungsik Lee
2013-01-26  5:50 ` [RFC PATCH 4/4] x86: " Kyungsik Lee
2013-01-28 22:25 ` [RFC PATCH 0/4] Add " Andrew Morton
2013-01-29  1:16   ` kyungsik.lee
2013-01-29  4:29   ` Nicolas Pitre
2013-01-29  6:18     ` H. Peter Anvin
2013-01-30 10:23     ` Johannes Stezenbach
2013-02-04  2:02       ` Markus F.X.J. Oberhumer
2013-02-04 10:50         ` Russell King - ARM Linux
2013-02-05 11:39           ` Johannes Stezenbach
2013-01-29  7:26   ` Richard Cochran
2013-01-29 10:15   ` Russell King - ARM Linux
2013-01-29 11:43     ` Egon Alter
2013-01-29 12:15       ` Russell King - ARM Linux
2013-02-01  8:15       ` kyungsik.lee
2013-01-30  3:36     ` H. Peter Anvin
2013-01-30 18:33       ` Nicolas Pitre
2013-01-31 21:48         ` H. Peter Anvin
2013-01-31 22:16           ` Nicolas Pitre
2013-01-31 22:18             ` H. Peter Anvin
2013-02-01  2:28               ` Nicolas Pitre
2013-02-01  6:37                 ` H. Peter Anvin
2013-01-29 21:09   ` Rajesh Pawar
2013-02-01  7:00     ` kyungsik.lee
2013-02-04  1:37       ` Markus F.X.J. Oberhumer
2013-01-29 22:55 ` David Sterba
2013-02-01  7:13   ` kyungsik.lee
