linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/3] Update LZO compression
@ 2012-10-07 15:07 Markus F.X.J. Oberhumer
  2012-10-07 15:08 ` [PATCH 1/3] lib/lzo: Rename lzo1x_decompress.c to lzo1x_decompress_safe.c Markus F.X.J. Oberhumer
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Markus F.X.J. Oberhumer @ 2012-10-07 15:07 UTC (permalink / raw)
  To: linux-kernel
  Cc: Markus F.X.J. Oberhumer, Andi Kleen, Andrew Morton,
	Johannes Stezenbach, richard -rw- weinberger

As requested by akpm I am sending my "lzo-update" branch at

  git://github.com/markus-oberhumer/linux.git lzo-update

to lkml as a patch series created by "git format-patch -M v3.5..lzo-update".

You can also browse the branch at

  https://github.com/markus-oberhumer/linux/compare/lzo-update

and review the three patches at

  https://github.com/markus-oberhumer/linux/commit/7c979cebc0f93dc692b734c12665a6824d219c20
  https://github.com/markus-oberhumer/linux/commit/10f6781c8591fe5fe4c8c733131915e5ae057826
  https://github.com/markus-oberhumer/linux/commit/5f702781f158cb59075cfa97e5c21f52275057f1

Share and enjoy,
Markus


Markus F.X.J. Oberhumer (3):
  lib/lzo: Rename lzo1x_decompress.c to lzo1x_decompress_safe.c
  lib/lzo: Update LZO compression to current upstream version
  lib/lzo: Optimize code for CPUs with inefficient unaligned access

 include/linux/lzo.h             |   15 +-
 lib/lzo/Makefile                |    2 +-
 lib/lzo/lzo1x_compress.c        |  309 +++++++++++++++++++++------------------
 lib/lzo/lzo1x_decompress.c      |  255 --------------------------------
 lib/lzo/lzo1x_decompress_safe.c |  237 ++++++++++++++++++++++++++++++
 lib/lzo/lzodefs.h               |   34 ++++-
 6 files changed, 441 insertions(+), 411 deletions(-)
 delete mode 100644 lib/lzo/lzo1x_decompress.c
 create mode 100644 lib/lzo/lzo1x_decompress_safe.c



* [PATCH 1/3] lib/lzo: Rename lzo1x_decompress.c to lzo1x_decompress_safe.c
  2012-10-07 15:07 [PATCH 0/3] Update LZO compression Markus F.X.J. Oberhumer
@ 2012-10-07 15:08 ` Markus F.X.J. Oberhumer
  2012-10-07 15:08 ` [PATCH 2/3] lib/lzo: Update LZO compression to current upstream version Markus F.X.J. Oberhumer
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 13+ messages in thread
From: Markus F.X.J. Oberhumer @ 2012-10-07 15:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Markus F.X.J. Oberhumer, Andi Kleen, Andrew Morton,
	Johannes Stezenbach, richard -rw- weinberger

Rename the source file to match the function name, which also
makes room for a possible future, slightly faster "non-safe"
decompressor version.

Signed-off-by: Markus F.X.J. Oberhumer <markus@oberhumer.com>
---
 lib/lzo/Makefile                                   |    2 +-
 ...{lzo1x_decompress.c => lzo1x_decompress_safe.c} |    0
 2 files changed, 1 insertions(+), 1 deletions(-)
 rename lib/lzo/{lzo1x_decompress.c => lzo1x_decompress_safe.c} (100%)

diff --git a/lib/lzo/Makefile b/lib/lzo/Makefile
index e764116..f0f7d7c 100644
--- a/lib/lzo/Makefile
+++ b/lib/lzo/Makefile
@@ -1,5 +1,5 @@
 lzo_compress-objs := lzo1x_compress.o
-lzo_decompress-objs := lzo1x_decompress.o
+lzo_decompress-objs := lzo1x_decompress_safe.o
 
 obj-$(CONFIG_LZO_COMPRESS) += lzo_compress.o
 obj-$(CONFIG_LZO_DECOMPRESS) += lzo_decompress.o
diff --git a/lib/lzo/lzo1x_decompress.c b/lib/lzo/lzo1x_decompress_safe.c
similarity index 100%
rename from lib/lzo/lzo1x_decompress.c
rename to lib/lzo/lzo1x_decompress_safe.c
-- 
1.7.1



* [PATCH 2/3] lib/lzo: Update LZO compression to current upstream version
  2012-10-07 15:07 [PATCH 0/3] Update LZO compression Markus F.X.J. Oberhumer
  2012-10-07 15:08 ` [PATCH 1/3] lib/lzo: Rename lzo1x_decompress.c to lzo1x_decompress_safe.c Markus F.X.J. Oberhumer
@ 2012-10-07 15:08 ` Markus F.X.J. Oberhumer
  2012-10-07 15:09 ` [PATCH 3/3] lib/lzo: Optimize code for CPUs with inefficient unaligned access Markus F.X.J. Oberhumer
  2012-10-09 19:26 ` [PATCH 0/3] Update LZO compression Andrew Morton
  3 siblings, 0 replies; 13+ messages in thread
From: Markus F.X.J. Oberhumer @ 2012-10-07 15:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Markus F.X.J. Oberhumer, Andi Kleen, Andrew Morton,
	Johannes Stezenbach, richard -rw- weinberger

This commit updates the kernel LZO code to the current upstream version,
which features a significant speed improvement: benchmarking the Calgary
and Silesia test corpora typically shows doubled performance in both
compression and decompression on modern i386/x86_64/powerpc machines.
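
One change visible in the compressor diff of this patch is that match
lengths are extended a machine word at a time: two words are XORed and
the position of the first differing byte is read off with a
count-zero-bits builtin. The following is only a userspace sketch of
that idea, not the kernel code. It assumes GCC or Clang (for
__builtin_ctzll, __builtin_clzll and __BYTE_ORDER__), uses memcpy in
place of get_unaligned(), and adds an explicit limit check; the names
load_u64 and common_prefix_len are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Load 8 bytes from a possibly unaligned address. memcpy stands in for
 * the kernel's get_unaligned() in this userspace sketch. */
static uint64_t load_u64(const unsigned char *p)
{
	uint64_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}

/* Return how many leading bytes of a and b are equal, reading at most
 * limit bytes from either buffer. */
static size_t common_prefix_len(const unsigned char *a,
				const unsigned char *b, size_t limit)
{
	size_t len = 0;

	assert(a != NULL && b != NULL);

	/* Compare 8 bytes per step while a full word is still in range. */
	while (len + 8 <= limit) {
		uint64_t v = load_u64(a + len) ^ load_u64(b + len);

		if (v != 0) {
			/* The first differing byte is the lowest non-zero
			 * byte on little endian, the highest on big endian. */
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
			return len + (size_t)__builtin_ctzll(v) / 8;
#else
			return len + (size_t)__builtin_clzll(v) / 8;
#endif
		}
		len += 8;
	}

	/* Finish the last few bytes one at a time. */
	while (len < limit && a[len] == b[len])
		len++;
	return len;
}
```

For two buffers that first differ at byte 11, common_prefix_len()
returns 11 after a single full-word step plus one XOR, where a bytewise
loop would need twelve comparisons.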

Signed-off-by: Markus F.X.J. Oberhumer <markus@oberhumer.com>
---
 include/linux/lzo.h             |   15 +-
 lib/lzo/lzo1x_compress.c        |  309 +++++++++++++++++++----------------
 lib/lzo/lzo1x_decompress_safe.c |  341 ++++++++++++++++++---------------------
 lib/lzo/lzodefs.h               |   34 +++-
 4 files changed, 360 insertions(+), 339 deletions(-)

diff --git a/include/linux/lzo.h b/include/linux/lzo.h
index d793497..a0848d9 100644
--- a/include/linux/lzo.h
+++ b/include/linux/lzo.h
@@ -4,28 +4,28 @@
  *  LZO Public Kernel Interface
  *  A mini subset of the LZO real-time data compression library
  *
- *  Copyright (C) 1996-2005 Markus F.X.J. Oberhumer <markus@oberhumer.com>
+ *  Copyright (C) 1996-2012 Markus F.X.J. Oberhumer <markus@oberhumer.com>
  *
  *  The full LZO package can be found at:
  *  http://www.oberhumer.com/opensource/lzo/
  *
- *  Changed for kernel use by:
+ *  Changed for Linux kernel use by:
  *  Nitin Gupta <nitingupta910@gmail.com>
  *  Richard Purdie <rpurdie@openedhand.com>
  */
 
-#define LZO1X_MEM_COMPRESS	(16384 * sizeof(unsigned char *))
-#define LZO1X_1_MEM_COMPRESS	LZO1X_MEM_COMPRESS
+#define LZO1X_1_MEM_COMPRESS	(8192 * sizeof(unsigned short))
+#define LZO1X_MEM_COMPRESS	LZO1X_1_MEM_COMPRESS
 
 #define lzo1x_worst_compress(x) ((x) + ((x) / 16) + 64 + 3)
 
-/* This requires 'workmem' of size LZO1X_1_MEM_COMPRESS */
+/* This requires 'wrkmem' of size LZO1X_1_MEM_COMPRESS */
 int lzo1x_1_compress(const unsigned char *src, size_t src_len,
-			unsigned char *dst, size_t *dst_len, void *wrkmem);
+		     unsigned char *dst, size_t *dst_len, void *wrkmem);
 
 /* safe decompression with overrun testing */
 int lzo1x_decompress_safe(const unsigned char *src, size_t src_len,
-			unsigned char *dst, size_t *dst_len);
+			  unsigned char *dst, size_t *dst_len);
 
 /*
  * Return values (< 0 = Error)
@@ -40,5 +40,6 @@ int lzo1x_decompress_safe(const unsigned char *src, size_t src_len,
 #define LZO_E_EOF_NOT_FOUND		(-7)
 #define LZO_E_INPUT_NOT_CONSUMED	(-8)
 #define LZO_E_NOT_YET_IMPLEMENTED	(-9)
+#define LZO_E_INVALID_ARGUMENT		(-10)
 
 #endif
diff --git a/lib/lzo/lzo1x_compress.c b/lib/lzo/lzo1x_compress.c
index a604099..d42efe5 100644
--- a/lib/lzo/lzo1x_compress.c
+++ b/lib/lzo/lzo1x_compress.c
@@ -1,194 +1,217 @@
 /*
- *  LZO1X Compressor from MiniLZO
+ *  LZO1X Compressor from LZO
  *
- *  Copyright (C) 1996-2005 Markus F.X.J. Oberhumer <markus@oberhumer.com>
+ *  Copyright (C) 1996-2012 Markus F.X.J. Oberhumer <markus@oberhumer.com>
  *
  *  The full LZO package can be found at:
  *  http://www.oberhumer.com/opensource/lzo/
  *
- *  Changed for kernel use by:
+ *  Changed for Linux kernel use by:
  *  Nitin Gupta <nitingupta910@gmail.com>
  *  Richard Purdie <rpurdie@openedhand.com>
  */
 
 #include <linux/module.h>
 #include <linux/kernel.h>
-#include <linux/lzo.h>
 #include <asm/unaligned.h>
+#include <linux/lzo.h>
 #include "lzodefs.h"
 
 static noinline size_t
-_lzo1x_1_do_compress(const unsigned char *in, size_t in_len,
-		unsigned char *out, size_t *out_len, void *wrkmem)
+lzo1x_1_do_compress(const unsigned char *in, size_t in_len,
+		    unsigned char *out, size_t *out_len,
+		    size_t ti, void *wrkmem)
 {
+	const unsigned char *ip;
+	unsigned char *op;
 	const unsigned char * const in_end = in + in_len;
-	const unsigned char * const ip_end = in + in_len - M2_MAX_LEN - 5;
-	const unsigned char ** const dict = wrkmem;
-	const unsigned char *ip = in, *ii = ip;
-	const unsigned char *end, *m, *m_pos;
-	size_t m_off, m_len, dindex;
-	unsigned char *op = out;
+	const unsigned char * const ip_end = in + in_len - 20;
+	const unsigned char *ii;
+	lzo_dict_t * const dict = (lzo_dict_t *) wrkmem;
 
-	ip += 4;
+	op = out;
+	ip = in;
+	ii = ip;
+	ip += ti < 4 ? 4 - ti : 0;
 
 	for (;;) {
-		dindex = ((size_t)(0x21 * DX3(ip, 5, 5, 6)) >> 5) & D_MASK;
-		m_pos = dict[dindex];
-
-		if (m_pos < in)
-			goto literal;
-
-		if (ip == m_pos || ((size_t)(ip - m_pos) > M4_MAX_OFFSET))
-			goto literal;
-
-		m_off = ip - m_pos;
-		if (m_off <= M2_MAX_OFFSET || m_pos[3] == ip[3])
-			goto try_match;
-
-		dindex = (dindex & (D_MASK & 0x7ff)) ^ (D_HIGH | 0x1f);
-		m_pos = dict[dindex];
-
-		if (m_pos < in)
-			goto literal;
-
-		if (ip == m_pos || ((size_t)(ip - m_pos) > M4_MAX_OFFSET))
-			goto literal;
-
-		m_off = ip - m_pos;
-		if (m_off <= M2_MAX_OFFSET || m_pos[3] == ip[3])
-			goto try_match;
-
-		goto literal;
-
-try_match:
-		if (get_unaligned((const unsigned short *)m_pos)
-				== get_unaligned((const unsigned short *)ip)) {
-			if (likely(m_pos[2] == ip[2]))
-					goto match;
-		}
-
+		const unsigned char *m_pos;
+		size_t t, m_len, m_off;
+		u32 dv;
 literal:
-		dict[dindex] = ip;
-		++ip;
+		ip += 1 + ((ip - ii) >> 5);
+next:
 		if (unlikely(ip >= ip_end))
 			break;
-		continue;
-
-match:
-		dict[dindex] = ip;
-		if (ip != ii) {
-			size_t t = ip - ii;
+		dv = get_unaligned_le32(ip);
+		t = ((dv * 0x1824429d) >> (32 - D_BITS)) & D_MASK;
+		m_pos = in + dict[t];
+		dict[t] = (lzo_dict_t) (ip - in);
+		if (unlikely(dv != get_unaligned_le32(m_pos)))
+			goto literal;
 
+		ii -= ti;
+		ti = 0;
+		t = ip - ii;
+		if (t != 0) {
 			if (t <= 3) {
 				op[-2] |= t;
-			} else if (t <= 18) {
+				COPY4(op, ii);
+				op += t;
+			} else if (t <= 16) {
 				*op++ = (t - 3);
+				COPY8(op, ii);
+				COPY8(op + 8, ii + 8);
+				op += t;
 			} else {
-				size_t tt = t - 18;
-
-				*op++ = 0;
-				while (tt > 255) {
-					tt -= 255;
+				if (t <= 18) {
+					*op++ = (t - 3);
+				} else {
+					size_t tt = t - 18;
 					*op++ = 0;
+					while (unlikely(tt > 255)) {
+						tt -= 255;
+						*op++ = 0;
+					}
+					*op++ = tt;
 				}
-				*op++ = tt;
+				do {
+					COPY8(op, ii);
+					COPY8(op + 8, ii + 8);
+					op += 16;
+					ii += 16;
+					t -= 16;
+				} while (t >= 16);
+				if (t > 0) do {
+					*op++ = *ii++;
+				} while (--t > 0);
 			}
-			do {
-				*op++ = *ii++;
-			} while (--t > 0);
 		}
 
-		ip += 3;
-		if (m_pos[3] != *ip++ || m_pos[4] != *ip++
-				|| m_pos[5] != *ip++ || m_pos[6] != *ip++
-				|| m_pos[7] != *ip++ || m_pos[8] != *ip++) {
-			--ip;
-			m_len = ip - ii;
+		m_len = 4;
+		{
+#if defined(LZO_USE_CTZ64)
+		u64 v;
+		v = get_unaligned((const u64 *) (ip + m_len)) ^
+		    get_unaligned((const u64 *) (m_pos + m_len));
+		if (unlikely(v == 0)) {
+			do {
+				m_len += 8;
+				v = get_unaligned((const u64 *) (ip + m_len)) ^
+				    get_unaligned((const u64 *) (m_pos + m_len));
+				if (unlikely(ip + m_len >= ip_end))
+					goto m_len_done;
+			} while (v == 0);
+		}
+#  if defined(__LITTLE_ENDIAN)
+		m_len += (unsigned) __builtin_ctzll(v) / 8;
+#  elif defined(__BIG_ENDIAN)
+		m_len += (unsigned) __builtin_clzll(v) / 8;
+#  else
+#    error "missing endian definition"
+#  endif
+#elif defined(LZO_USE_CTZ32)
+		u32 v;
+		v = get_unaligned((const u32 *) (ip + m_len)) ^
+		    get_unaligned((const u32 *) (m_pos + m_len));
+		if (unlikely(v == 0)) {
+			do {
+				m_len += 4;
+				v = get_unaligned((const u32 *) (ip + m_len)) ^
+				    get_unaligned((const u32 *) (m_pos + m_len));
+				if (unlikely(ip + m_len >= ip_end))
+					goto m_len_done;
+			} while (v == 0);
+		}
+#  if defined(__LITTLE_ENDIAN)
+		m_len += (unsigned) __builtin_ctz(v) / 8;
+#  elif defined(__BIG_ENDIAN)
+		m_len += (unsigned) __builtin_clz(v) / 8;
+#  else
+#    error "missing endian definition"
+#  endif
+#else
+		if (unlikely(ip[m_len] == m_pos[m_len])) {
+			do {
+				m_len += 1;
+				if (unlikely(ip + m_len >= ip_end))
+					goto m_len_done;
+			} while (ip[m_len] == m_pos[m_len]);
+		}
+#endif
+		}
+m_len_done:
 
-			if (m_off <= M2_MAX_OFFSET) {
-				m_off -= 1;
-				*op++ = (((m_len - 1) << 5)
-						| ((m_off & 7) << 2));
-				*op++ = (m_off >> 3);
-			} else if (m_off <= M3_MAX_OFFSET) {
-				m_off -= 1;
+		m_off = ip - m_pos;
+		ip += m_len;
+		ii = ip;
+		if (m_len <= M2_MAX_LEN && m_off <= M2_MAX_OFFSET) {
+			m_off -= 1;
+			*op++ = (((m_len - 1) << 5) | ((m_off & 7) << 2));
+			*op++ = (m_off >> 3);
+		} else if (m_off <= M3_MAX_OFFSET) {
+			m_off -= 1;
+			if (m_len <= M3_MAX_LEN)
 				*op++ = (M3_MARKER | (m_len - 2));
-				goto m3_m4_offset;
-			} else {
-				m_off -= 0x4000;
-
-				*op++ = (M4_MARKER | ((m_off & 0x4000) >> 11)
-						| (m_len - 2));
-				goto m3_m4_offset;
+			else {
+				m_len -= M3_MAX_LEN;
+				*op++ = M3_MARKER | 0;
+				while (unlikely(m_len > 255)) {
+					m_len -= 255;
+					*op++ = 0;
+				}
+				*op++ = (m_len);
 			}
+			*op++ = (m_off << 2);
+			*op++ = (m_off >> 6);
 		} else {
-			end = in_end;
-			m = m_pos + M2_MAX_LEN + 1;
-
-			while (ip < end && *m == *ip) {
-				m++;
-				ip++;
-			}
-			m_len = ip - ii;
-
-			if (m_off <= M3_MAX_OFFSET) {
-				m_off -= 1;
-				if (m_len <= 33) {
-					*op++ = (M3_MARKER | (m_len - 2));
-				} else {
-					m_len -= 33;
-					*op++ = M3_MARKER | 0;
-					goto m3_m4_len;
-				}
-			} else {
-				m_off -= 0x4000;
-				if (m_len <= M4_MAX_LEN) {
-					*op++ = (M4_MARKER
-						| ((m_off & 0x4000) >> 11)
+			m_off -= 0x4000;
+			if (m_len <= M4_MAX_LEN)
+				*op++ = (M4_MARKER | ((m_off >> 11) & 8)
 						| (m_len - 2));
-				} else {
-					m_len -= M4_MAX_LEN;
-					*op++ = (M4_MARKER
-						| ((m_off & 0x4000) >> 11));
-m3_m4_len:
-					while (m_len > 255) {
-						m_len -= 255;
-						*op++ = 0;
-					}
-
-					*op++ = (m_len);
+			else {
+				m_len -= M4_MAX_LEN;
+				*op++ = (M4_MARKER | ((m_off >> 11) & 8));
+				while (unlikely(m_len > 255)) {
+					m_len -= 255;
+					*op++ = 0;
 				}
+				*op++ = (m_len);
 			}
-m3_m4_offset:
-			*op++ = ((m_off & 63) << 2);
+			*op++ = (m_off << 2);
 			*op++ = (m_off >> 6);
 		}
-
-		ii = ip;
-		if (unlikely(ip >= ip_end))
-			break;
+		goto next;
 	}
-
 	*out_len = op - out;
-	return in_end - ii;
+	return in_end - (ii - ti);
 }
 
-int lzo1x_1_compress(const unsigned char *in, size_t in_len, unsigned char *out,
-			size_t *out_len, void *wrkmem)
+int lzo1x_1_compress(const unsigned char *in, size_t in_len,
+		     unsigned char *out, size_t *out_len,
+		     void *wrkmem)
 {
-	const unsigned char *ii;
+	const unsigned char *ip = in;
 	unsigned char *op = out;
-	size_t t;
+	size_t l = in_len;
+	size_t t = 0;
 
-	if (unlikely(in_len <= M2_MAX_LEN + 5)) {
-		t = in_len;
-	} else {
-		t = _lzo1x_1_do_compress(in, in_len, op, out_len, wrkmem);
+	while (l > 20) {
+		size_t ll = l <= (M4_MAX_OFFSET + 1) ? l : (M4_MAX_OFFSET + 1);
+		uintptr_t ll_end = (uintptr_t) ip + ll;
+		if ((ll_end + ((t + ll) >> 5)) <= ll_end)
+			break;
+		BUILD_BUG_ON(D_SIZE * sizeof(lzo_dict_t) > LZO1X_1_MEM_COMPRESS);
+		memset(wrkmem, 0, D_SIZE * sizeof(lzo_dict_t));
+		t = lzo1x_1_do_compress(ip, ll, op, out_len, t, wrkmem);
+		ip += ll;
 		op += *out_len;
+		l  -= ll;
 	}
+	t += l;
 
 	if (t > 0) {
-		ii = in + in_len - t;
+		const unsigned char *ii = in + in_len - t;
 
 		if (op == out && t <= 238) {
 			*op++ = (17 + t);
@@ -198,16 +221,21 @@ int lzo1x_1_compress(const unsigned char *in, size_t in_len, unsigned char *out,
 			*op++ = (t - 3);
 		} else {
 			size_t tt = t - 18;
-
 			*op++ = 0;
 			while (tt > 255) {
 				tt -= 255;
 				*op++ = 0;
 			}
-
 			*op++ = tt;
 		}
-		do {
+		if (t >= 16) do {
+			COPY8(op, ii);
+			COPY8(op + 8, ii + 8);
+			op += 16;
+			ii += 16;
+			t -= 16;
+		} while (t >= 16);
+		if (t > 0) do {
 			*op++ = *ii++;
 		} while (--t > 0);
 	}
@@ -223,4 +251,3 @@ EXPORT_SYMBOL_GPL(lzo1x_1_compress);
 
 MODULE_LICENSE("GPL");
 MODULE_DESCRIPTION("LZO1X-1 Compressor");
-
diff --git a/lib/lzo/lzo1x_decompress_safe.c b/lib/lzo/lzo1x_decompress_safe.c
index f2fd098..0dba30c 100644
--- a/lib/lzo/lzo1x_decompress_safe.c
+++ b/lib/lzo/lzo1x_decompress_safe.c
@@ -1,12 +1,12 @@
 /*
- *  LZO1X Decompressor from MiniLZO
+ *  LZO1X Decompressor from LZO
  *
- *  Copyright (C) 1996-2005 Markus F.X.J. Oberhumer <markus@oberhumer.com>
+ *  Copyright (C) 1996-2012 Markus F.X.J. Oberhumer <markus@oberhumer.com>
  *
  *  The full LZO package can be found at:
  *  http://www.oberhumer.com/opensource/lzo/
  *
- *  Changed for kernel use by:
+ *  Changed for Linux kernel use by:
  *  Nitin Gupta <nitingupta910@gmail.com>
  *  Richard Purdie <rpurdie@openedhand.com>
  */
@@ -15,225 +15,198 @@
 #include <linux/module.h>
 #include <linux/kernel.h>
 #endif
-
 #include <asm/unaligned.h>
 #include <linux/lzo.h>
 #include "lzodefs.h"
 
-#define HAVE_IP(x, ip_end, ip) ((size_t)(ip_end - ip) < (x))
-#define HAVE_OP(x, op_end, op) ((size_t)(op_end - op) < (x))
-#define HAVE_LB(m_pos, out, op) (m_pos < out || m_pos >= op)
-
-#define COPY4(dst, src)	\
-		put_unaligned(get_unaligned((const u32 *)(src)), (u32 *)(dst))
+#define HAVE_IP(x)      ((size_t)(ip_end - ip) >= (size_t)(x))
+#define HAVE_OP(x)      ((size_t)(op_end - op) >= (size_t)(x))
+#define NEED_IP(x)      if (!HAVE_IP(x)) goto input_overrun
+#define NEED_OP(x)      if (!HAVE_OP(x)) goto output_overrun
+#define TEST_LB(m_pos)  if ((m_pos) < out) goto lookbehind_overrun
 
 int lzo1x_decompress_safe(const unsigned char *in, size_t in_len,
-			unsigned char *out, size_t *out_len)
+			  unsigned char *out, size_t *out_len)
 {
+	unsigned char *op;
+	const unsigned char *ip;
+	size_t t, next;
+	size_t state = 0;
+	const unsigned char *m_pos;
 	const unsigned char * const ip_end = in + in_len;
 	unsigned char * const op_end = out + *out_len;
-	const unsigned char *ip = in, *m_pos;
-	unsigned char *op = out;
-	size_t t;
 
-	*out_len = 0;
+	op = out;
+	ip = in;
 
+	if (unlikely(in_len < 3))
+		goto input_overrun;
 	if (*ip > 17) {
 		t = *ip++ - 17;
-		if (t < 4)
+		if (t < 4) {
+			next = t;
 			goto match_next;
-		if (HAVE_OP(t, op_end, op))
-			goto output_overrun;
-		if (HAVE_IP(t + 1, ip_end, ip))
-			goto input_overrun;
-		do {
-			*op++ = *ip++;
-		} while (--t > 0);
-		goto first_literal_run;
-	}
-
-	while ((ip < ip_end)) {
-		t = *ip++;
-		if (t >= 16)
-			goto match;
-		if (t == 0) {
-			if (HAVE_IP(1, ip_end, ip))
-				goto input_overrun;
-			while (*ip == 0) {
-				t += 255;
-				ip++;
-				if (HAVE_IP(1, ip_end, ip))
-					goto input_overrun;
-			}
-			t += 15 + *ip++;
-		}
-		if (HAVE_OP(t + 3, op_end, op))
-			goto output_overrun;
-		if (HAVE_IP(t + 4, ip_end, ip))
-			goto input_overrun;
-
-		COPY4(op, ip);
-		op += 4;
-		ip += 4;
-		if (--t > 0) {
-			if (t >= 4) {
-				do {
-					COPY4(op, ip);
-					op += 4;
-					ip += 4;
-					t -= 4;
-				} while (t >= 4);
-				if (t > 0) {
-					do {
-						*op++ = *ip++;
-					} while (--t > 0);
-				}
-			} else {
-				do {
-					*op++ = *ip++;
-				} while (--t > 0);
-			}
 		}
+		goto copy_literal_run;
+	}
 
-first_literal_run:
+	for (;;) {
 		t = *ip++;
-		if (t >= 16)
-			goto match;
-		m_pos = op - (1 + M2_MAX_OFFSET);
-		m_pos -= t >> 2;
-		m_pos -= *ip++ << 2;
-
-		if (HAVE_LB(m_pos, out, op))
-			goto lookbehind_overrun;
-
-		if (HAVE_OP(3, op_end, op))
-			goto output_overrun;
-		*op++ = *m_pos++;
-		*op++ = *m_pos++;
-		*op++ = *m_pos;
-
-		goto match_done;
-
-		do {
-match:
-			if (t >= 64) {
-				m_pos = op - 1;
-				m_pos -= (t >> 2) & 7;
-				m_pos -= *ip++ << 3;
-				t = (t >> 5) - 1;
-				if (HAVE_LB(m_pos, out, op))
-					goto lookbehind_overrun;
-				if (HAVE_OP(t + 3 - 1, op_end, op))
-					goto output_overrun;
-				goto copy_match;
-			} else if (t >= 32) {
-				t &= 31;
-				if (t == 0) {
-					if (HAVE_IP(1, ip_end, ip))
-						goto input_overrun;
-					while (*ip == 0) {
+		if (t < 16) {
+			if (likely(state == 0)) {
+				if (unlikely(t == 0)) {
+					while (unlikely(*ip == 0)) {
 						t += 255;
 						ip++;
-						if (HAVE_IP(1, ip_end, ip))
-							goto input_overrun;
+						NEED_IP(1);
 					}
-					t += 31 + *ip++;
+					t += 15 + *ip++;
 				}
-				m_pos = op - 1;
-				m_pos -= get_unaligned_le16(ip) >> 2;
-				ip += 2;
-			} else if (t >= 16) {
-				m_pos = op;
-				m_pos -= (t & 8) << 11;
-
-				t &= 7;
-				if (t == 0) {
-					if (HAVE_IP(1, ip_end, ip))
-						goto input_overrun;
-					while (*ip == 0) {
-						t += 255;
-						ip++;
-						if (HAVE_IP(1, ip_end, ip))
-							goto input_overrun;
-					}
-					t += 7 + *ip++;
+				t += 3;
+copy_literal_run:
+				if (likely(HAVE_IP(t + 15) && HAVE_OP(t + 15))) {
+					const unsigned char *ie = ip + t;
+					unsigned char *oe = op + t;
+					do {
+						COPY8(op, ip);
+						op += 8;
+						ip += 8;
+						COPY8(op, ip);
+						op += 8;
+						ip += 8;
+					} while (ip < ie);
+					ip = ie;
+					op = oe;
+				} else {
+					NEED_OP(t);
+					NEED_IP(t + 3);
+					do {
+						*op++ = *ip++;
+					} while (--t > 0);
 				}
-				m_pos -= get_unaligned_le16(ip) >> 2;
-				ip += 2;
-				if (m_pos == op)
-					goto eof_found;
-				m_pos -= 0x4000;
-			} else {
+				state = 4;
+				continue;
+			} else if (state != 4) {
+				next = t & 3;
 				m_pos = op - 1;
 				m_pos -= t >> 2;
 				m_pos -= *ip++ << 2;
-
-				if (HAVE_LB(m_pos, out, op))
-					goto lookbehind_overrun;
-				if (HAVE_OP(2, op_end, op))
-					goto output_overrun;
-
-				*op++ = *m_pos++;
-				*op++ = *m_pos;
-				goto match_done;
+				TEST_LB(m_pos);
+				NEED_OP(2);
+				op[0] = m_pos[0];
+				op[1] = m_pos[1];
+				op += 2;
+				goto match_next;
+			} else {
+				next = t & 3;
+				m_pos = op - (1 + M2_MAX_OFFSET);
+				m_pos -= t >> 2;
+				m_pos -= *ip++ << 2;
+				t = 3;
 			}
-
-			if (HAVE_LB(m_pos, out, op))
-				goto lookbehind_overrun;
-			if (HAVE_OP(t + 3 - 1, op_end, op))
-				goto output_overrun;
-
-			if (t >= 2 * 4 - (3 - 1) && (op - m_pos) >= 4) {
-				COPY4(op, m_pos);
-				op += 4;
-				m_pos += 4;
-				t -= 4 - (3 - 1);
+		} else if (t >= 64) {
+			next = t & 3;
+			m_pos = op - 1;
+			m_pos -= (t >> 2) & 7;
+			m_pos -= *ip++ << 3;
+			t = (t >> 5) - 1 + (3 - 1);
+		} else if (t >= 32) {
+			t = (t & 31) + (3 - 1);
+			if (unlikely(t == 2)) {
+				while (unlikely(*ip == 0)) {
+					t += 255;
+					ip++;
+					NEED_IP(1);
+				}
+				t += 31 + *ip++;
+				NEED_IP(2);
+			}
+			m_pos = op - 1;
+			next = get_unaligned_le16(ip);
+			ip += 2;
+			m_pos -= next >> 2;
+			next &= 3;
+		} else {
+			m_pos = op;
+			m_pos -= (t & 8) << 11;
+			t = (t & 7) + (3 - 1);
+			if (unlikely(t == 2)) {
+				while (unlikely(*ip == 0)) {
+					t += 255;
+					ip++;
+					NEED_IP(1);
+				}
+				t += 7 + *ip++;
+				NEED_IP(2);
+			}
+			next = get_unaligned_le16(ip);
+			ip += 2;
+			m_pos -= next >> 2;
+			next &= 3;
+			if (m_pos == op)
+				goto eof_found;
+			m_pos -= 0x4000;
+		}
+		TEST_LB(m_pos);
+		if (op - m_pos >= 8) {
+			unsigned char *oe = op + t;
+			if (likely(HAVE_OP(t + 15))) {
 				do {
-					COPY4(op, m_pos);
-					op += 4;
-					m_pos += 4;
-					t -= 4;
-				} while (t >= 4);
-				if (t > 0)
-					do {
-						*op++ = *m_pos++;
-					} while (--t > 0);
+					COPY8(op, m_pos);
+					op += 8;
+					m_pos += 8;
+					COPY8(op, m_pos);
+					op += 8;
+					m_pos += 8;
+				} while (op < oe);
+				op = oe;
+				if (HAVE_IP(6)) {
+					state = next;
+					COPY4(op, ip);
+					op += next;
+					ip += next;
+					continue;
+				}
 			} else {
-copy_match:
-				*op++ = *m_pos++;
-				*op++ = *m_pos++;
+				NEED_OP(t);
 				do {
 					*op++ = *m_pos++;
-				} while (--t > 0);
+				} while (op < oe);
 			}
-match_done:
-			t = ip[-2] & 3;
-			if (t == 0)
-				break;
+		} else {
+			unsigned char *oe = op + t;
+			NEED_OP(t);
+			op[0] = m_pos[0];
+			op[1] = m_pos[1];
+			op += 2;
+			m_pos += 2;
+			do {
+				*op++ = *m_pos++;
+			} while (op < oe);
+		}
 match_next:
-			if (HAVE_OP(t, op_end, op))
-				goto output_overrun;
-			if (HAVE_IP(t + 1, ip_end, ip))
-				goto input_overrun;
-
-			*op++ = *ip++;
-			if (t > 1) {
+		state = next;
+		t = next;
+		if (likely(HAVE_IP(6) && HAVE_OP(4))) {
+			COPY4(op, ip);
+			op += t;
+			ip += t;
+		} else {
+			NEED_IP(t + 3);
+			NEED_OP(t);
+			while (t > 0) {
 				*op++ = *ip++;
-				if (t > 2)
-					*op++ = *ip++;
+				t--;
 			}
-
-			t = *ip++;
-		} while (ip < ip_end);
+		}
 	}
 
-	*out_len = op - out;
-	return LZO_E_EOF_NOT_FOUND;
-
 eof_found:
 	*out_len = op - out;
-	return (ip == ip_end ? LZO_E_OK :
-		(ip < ip_end ? LZO_E_INPUT_NOT_CONSUMED : LZO_E_INPUT_OVERRUN));
+	return (t != 3       ? LZO_E_ERROR :
+		ip == ip_end ? LZO_E_OK :
+		ip <  ip_end ? LZO_E_INPUT_NOT_CONSUMED : LZO_E_INPUT_OVERRUN);
+
 input_overrun:
 	*out_len = op - out;
 	return LZO_E_INPUT_OVERRUN;
diff --git a/lib/lzo/lzodefs.h b/lib/lzo/lzodefs.h
index b6d482c..ddc8db5 100644
--- a/lib/lzo/lzodefs.h
+++ b/lib/lzo/lzodefs.h
@@ -1,19 +1,37 @@
 /*
  *  lzodefs.h -- architecture, OS and compiler specific defines
  *
- *  Copyright (C) 1996-2005 Markus F.X.J. Oberhumer <markus@oberhumer.com>
+ *  Copyright (C) 1996-2012 Markus F.X.J. Oberhumer <markus@oberhumer.com>
  *
  *  The full LZO package can be found at:
  *  http://www.oberhumer.com/opensource/lzo/
  *
- *  Changed for kernel use by:
+ *  Changed for Linux kernel use by:
  *  Nitin Gupta <nitingupta910@gmail.com>
  *  Richard Purdie <rpurdie@openedhand.com>
  */
 
-#define LZO_VERSION		0x2020
-#define LZO_VERSION_STRING	"2.02"
-#define LZO_VERSION_DATE	"Oct 17 2005"
+
+#define COPY4(dst, src)	\
+		put_unaligned(get_unaligned((const u32 *)(src)), (u32 *)(dst))
+#if defined(__x86_64__)
+#define COPY8(dst, src)	\
+		put_unaligned(get_unaligned((const u64 *)(src)), (u64 *)(dst))
+#else
+#define COPY8(dst, src)	\
+		COPY4(dst, src); COPY4((dst) + 4, (src) + 4)
+#endif
+
+#if defined(__BIG_ENDIAN) && defined(__LITTLE_ENDIAN)
+#error "conflicting endian definitions"
+#elif defined(__x86_64__)
+#define LZO_USE_CTZ64	1
+#define LZO_USE_CTZ32	1
+#elif defined(__i386__) || defined(__powerpc__)
+#define LZO_USE_CTZ32	1
+#else
+#define LZO_USE_CTZ32	1
+#endif
 
 #define M1_MAX_OFFSET	0x0400
 #define M2_MAX_OFFSET	0x0800
@@ -34,8 +52,10 @@
 #define M3_MARKER	32
 #define M4_MARKER	16
 
-#define D_BITS		14
-#define D_MASK		((1u << D_BITS) - 1)
+#define lzo_dict_t      unsigned short
+#define D_BITS		13
+#define D_SIZE		(1u << D_BITS)
+#define D_MASK		(D_SIZE - 1)
 #define D_HIGH		((D_MASK >> 1) + 1)
 
 #define DX2(p, s1, s2)	(((((size_t)((p)[2]) << (s2)) ^ (p)[1]) \
-- 
1.7.1



* [PATCH 3/3] lib/lzo: Optimize code for CPUs with inefficient unaligned access
  2012-10-07 15:07 [PATCH 0/3] Update LZO compression Markus F.X.J. Oberhumer
  2012-10-07 15:08 ` [PATCH 1/3] lib/lzo: Rename lzo1x_decompress.c to lzo1x_decompress_safe.c Markus F.X.J. Oberhumer
  2012-10-07 15:08 ` [PATCH 2/3] lib/lzo: Update LZO compression to current upstream version Markus F.X.J. Oberhumer
@ 2012-10-07 15:09 ` Markus F.X.J. Oberhumer
  2012-10-09 19:26 ` [PATCH 0/3] Update LZO compression Andrew Morton
  3 siblings, 0 replies; 13+ messages in thread
From: Markus F.X.J. Oberhumer @ 2012-10-07 15:09 UTC (permalink / raw)
  To: linux-kernel
  Cc: Markus F.X.J. Oberhumer, Andi Kleen, Andrew Morton,
	Johannes Stezenbach, richard -rw- weinberger

Some code paths are only beneficial on machines with fast unaligned
loads, so only use these if CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
is defined.
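
The shape of that guard can be sketched in userspace as follows. This
is an illustration under stated assumptions, not the kernel code:
SKETCH_FAST_UNALIGNED is a hypothetical stand-in for
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS, and copy8 and copy_literals are
hypothetical names. The wide path copies 16 bytes per iteration and may
write up to 15 bytes past the requested length, so it is only taken when
both buffers have that much slack; otherwise the bytewise loop runs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS.
 * Build with -DSKETCH_FAST_UNALIGNED=0 to compile out the wide path. */
#ifndef SKETCH_FAST_UNALIGNED
#define SKETCH_FAST_UNALIGNED 1
#endif

/* Copy 8 bytes between possibly unaligned addresses. */
static void copy8(unsigned char *dst, const unsigned char *src)
{
	uint64_t v;

	memcpy(&v, src, sizeof(v));
	memcpy(dst, &v, sizeof(v));
}

/* Copy len bytes from src to dst. dst_avail and src_avail are the bytes
 * remaining in each buffer. Returns 0 on success, -1 if len does not fit. */
static int copy_literals(unsigned char *dst, size_t dst_avail,
			 const unsigned char *src, size_t src_avail,
			 size_t len)
{
	if (len > dst_avail || len > src_avail)
		return -1;

#if SKETCH_FAST_UNALIGNED
	/* Wide path: needs 15 spare bytes on both sides because the loop
	 * rounds the copy up to a multiple of 16. */
	if (len > 0 && dst_avail >= len + 15 && src_avail >= len + 15) {
		size_t done = 0;

		do {
			copy8(dst + done, src + done);
			copy8(dst + done + 8, src + done + 8);
			done += 16;
		} while (done < len);
		return 0;
	}
#endif

	/* Fallback: exact bytewise copy, used near the buffer ends and on
	 * machines where unaligned word access is slow. */
	while (len > 0) {
		*dst++ = *src++;
		len--;
	}
	return 0;
}
```

Keeping the bytewise fallback unconditional means the wide path is
purely an optimization: compiling it out changes speed, not results.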

Signed-off-by: Markus F.X.J. Oberhumer <markus@oberhumer.com>
---
 lib/lzo/lzo1x_compress.c        |    4 ++--
 lib/lzo/lzo1x_decompress_safe.c |   15 ++++++++++++---
 lib/lzo/lzodefs.h               |    2 +-
 3 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/lib/lzo/lzo1x_compress.c b/lib/lzo/lzo1x_compress.c
index d42efe5..1593dba 100644
--- a/lib/lzo/lzo1x_compress.c
+++ b/lib/lzo/lzo1x_compress.c
@@ -90,7 +90,7 @@ next:
 
 		m_len = 4;
 		{
-#if defined(LZO_USE_CTZ64)
+#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && defined(LZO_USE_CTZ64)
 		u64 v;
 		v = get_unaligned((const u64 *) (ip + m_len)) ^
 		    get_unaligned((const u64 *) (m_pos + m_len));
@@ -110,7 +110,7 @@ next:
 #  else
 #    error "missing endian definition"
 #  endif
-#elif defined(LZO_USE_CTZ32)
+#elif defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && defined(LZO_USE_CTZ32)
 		u32 v;
 		v = get_unaligned((const u32 *) (ip + m_len)) ^
 		    get_unaligned((const u32 *) (m_pos + m_len));
diff --git a/lib/lzo/lzo1x_decompress_safe.c b/lib/lzo/lzo1x_decompress_safe.c
index 0dba30c..569985d 100644
--- a/lib/lzo/lzo1x_decompress_safe.c
+++ b/lib/lzo/lzo1x_decompress_safe.c
@@ -64,6 +64,7 @@ int lzo1x_decompress_safe(const unsigned char *in, size_t in_len,
 				}
 				t += 3;
 copy_literal_run:
+#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)
 				if (likely(HAVE_IP(t + 15) && HAVE_OP(t + 15))) {
 					const unsigned char *ie = ip + t;
 					unsigned char *oe = op + t;
@@ -77,7 +78,9 @@ copy_literal_run:
 					} while (ip < ie);
 					ip = ie;
 					op = oe;
-				} else {
+				} else
+#endif
+				{
 					NEED_OP(t);
 					NEED_IP(t + 3);
 					do {
@@ -148,6 +151,7 @@ copy_literal_run:
 			m_pos -= 0x4000;
 		}
 		TEST_LB(m_pos);
+#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)
 		if (op - m_pos >= 8) {
 			unsigned char *oe = op + t;
 			if (likely(HAVE_OP(t + 15))) {
@@ -173,7 +177,9 @@ copy_literal_run:
 					*op++ = *m_pos++;
 				} while (op < oe);
 			}
-		} else {
+		} else
+#endif
+		{
 			unsigned char *oe = op + t;
 			NEED_OP(t);
 			op[0] = m_pos[0];
@@ -187,11 +193,14 @@ copy_literal_run:
 match_next:
 		state = next;
 		t = next;
+#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)
 		if (likely(HAVE_IP(6) && HAVE_OP(4))) {
 			COPY4(op, ip);
 			op += t;
 			ip += t;
-		} else {
+		} else
+#endif
+		{
 			NEED_IP(t + 3);
 			NEED_OP(t);
 			while (t > 0) {
diff --git a/lib/lzo/lzodefs.h b/lib/lzo/lzodefs.h
index ddc8db5..5a4beb2 100644
--- a/lib/lzo/lzodefs.h
+++ b/lib/lzo/lzodefs.h
@@ -29,7 +29,7 @@
 #define LZO_USE_CTZ32	1
 #elif defined(__i386__) || defined(__powerpc__)
 #define LZO_USE_CTZ32	1
-#else
+#elif defined(__arm__) && (__LINUX_ARM_ARCH__ >= 5)
 #define LZO_USE_CTZ32	1
 #endif
 
-- 
1.7.1



* Re: [PATCH 0/3] Update LZO compression
  2012-10-07 15:07 [PATCH 0/3] Update LZO compression Markus F.X.J. Oberhumer
                   ` (2 preceding siblings ...)
  2012-10-07 15:09 ` [PATCH 3/3] lib/lzo: Optimize code for CPUs with inefficient unaligned access Markus F.X.J. Oberhumer
@ 2012-10-09 19:26 ` Andrew Morton
  2012-10-09 19:54   ` Markus F.X.J. Oberhumer
  3 siblings, 1 reply; 13+ messages in thread
From: Andrew Morton @ 2012-10-09 19:26 UTC (permalink / raw)
  To: Markus F.X.J. Oberhumer
  Cc: linux-kernel, Andi Kleen, Johannes Stezenbach,
	richard -rw- weinberger, linux-btrfs, linux-crypto,
	Artem Bityutskiy, Adrian Hunter, David Woodhouse,
	Phillip Lougher, Dan Magenheimer, Dan Carpenter,
	Stephen Rothwell

On Sun, 7 Oct 2012 17:07:55 +0200
"Markus F.X.J. Oberhumer" <markus@oberhumer.com> wrote:

> As requested by akpm I am sending my "lzo-update" branch at
> 
>   git://github.com/markus-oberhumer/linux.git lzo-update
> 
> to lkml as a patch series created by "git format-patch -M v3.5..lzo-update".
> 
> You can also browse the branch at
> 
>   https://github.com/markus-oberhumer/linux/compare/lzo-update
> 
> and review the three patches at
> 
>   https://github.com/markus-oberhumer/linux/commit/7c979cebc0f93dc692b734c12665a6824d219c20
>   https://github.com/markus-oberhumer/linux/commit/10f6781c8591fe5fe4c8c733131915e5ae057826
>   https://github.com/markus-oberhumer/linux/commit/5f702781f158cb59075cfa97e5c21f52275057f1

The changes look OK to me.  Please ask Stephen to include the tree in
linux-next, for a 3.7 merge.



The changelog for patch 2/3 says:

: This commit updates the kernel LZO code to the current upstream version
: which features a significant speed improvement - benchmarking the Calgary
: and Silesia test corpora typically shows a doubled performance in
: both compression and decompression on modern i386/x86_64/powerpc machines.


There are significant clients of the LZO library - crypto, btrfs,
jffs2, ubifs, squashfs and zcache.  So let's give all those people a cc
and ask that they test the LZO changes once they land in linux-next. 
For correctness and performance, please.



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/3] Update LZO compression
  2012-10-09 19:26 ` [PATCH 0/3] Update LZO compression Andrew Morton
@ 2012-10-09 19:54   ` Markus F.X.J. Oberhumer
  2012-10-09 22:43     ` Stephen Rothwell
  2012-10-11 11:41     ` Arnd Bergmann
  0 siblings, 2 replies; 13+ messages in thread
From: Markus F.X.J. Oberhumer @ 2012-10-09 19:54 UTC (permalink / raw)
  To: Stephen Rothwell
  Cc: Andrew Morton, linux-kernel, Andi Kleen, Johannes Stezenbach,
	richard -rw- weinberger, linux-btrfs, linux-crypto,
	Artem Bityutskiy, Adrian Hunter, David Woodhouse,
	Phillip Lougher, Dan Magenheimer, Dan Carpenter

Hi Stephen,

On 2012-10-09 21:26, Andrew Morton wrote:
> On Sun, 7 Oct 2012 17:07:55 +0200
> "Markus F.X.J. Oberhumer" <markus@oberhumer.com> wrote:
> 
>> As requested by akpm I am sending my "lzo-update" branch at
>>
>>   git://github.com/markus-oberhumer/linux.git lzo-update
>>
>> to lkml as a patch series created by "git format-patch -M v3.5..lzo-update".
>>
>> You can also browse the branch at
>>
>>   https://github.com/markus-oberhumer/linux/compare/lzo-update
>>
>> and review the three patches at
>>
>>   https://github.com/markus-oberhumer/linux/commit/7c979cebc0f93dc692b734c12665a6824d219c20
>>   https://github.com/markus-oberhumer/linux/commit/10f6781c8591fe5fe4c8c733131915e5ae057826
>>   https://github.com/markus-oberhumer/linux/commit/5f702781f158cb59075cfa97e5c21f52275057f1
> 
> The changes look OK to me.  Please ask Stephen to include the tree in
> linux-next, for a 3.7 merge.

I'd ask you to include my "lzo-update" branch in linux-next:

  git://github.com/markus-oberhumer/linux.git lzo-update

> The changelog for patch 2/3 says:
> 
> : This commit updates the kernel LZO code to the current upstream version
> : which features a significant speed improvement - benchmarking the Calgary
> : and Silesia test corpora typically shows a doubled performance in
> : both compression and decompression on modern i386/x86_64/powerpc machines.
> 
> There are significant clients of the LZO library - crypto, btrfs,
> jffs2, ubifs, squashfs and zcache.  So let's give all those people a cc
> and ask that they test the LZO changes once they land in linux-next. 
> For correctness and performance, please.

The core compression and decompression code has been thoroughly tested, so I
do not expect major problems.

Good testing after the merge and feedback about build or performance issues
(and improvements!) is highly appreciated.

Many thanks,
Markus

-- 
Markus Oberhumer, <markus@oberhumer.com>, http://www.oberhumer.com/

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/3] Update LZO compression
  2012-10-09 19:54   ` Markus F.X.J. Oberhumer
@ 2012-10-09 22:43     ` Stephen Rothwell
  2012-10-11 11:41     ` Arnd Bergmann
  1 sibling, 0 replies; 13+ messages in thread
From: Stephen Rothwell @ 2012-10-09 22:43 UTC (permalink / raw)
  To: Markus F.X.J. Oberhumer
  Cc: Andrew Morton, linux-kernel, Andi Kleen, Johannes Stezenbach,
	richard -rw- weinberger, linux-btrfs, linux-crypto,
	Artem Bityutskiy, Adrian Hunter, David Woodhouse,
	Phillip Lougher, Dan Magenheimer, Dan Carpenter

Hi Markus,

On Tue, 09 Oct 2012 21:54:59 +0200 "Markus F.X.J. Oberhumer" <markus@oberhumer.com> wrote:
>
> On 2012-10-09 21:26, Andrew Morton wrote:
> > On Sun, 7 Oct 2012 17:07:55 +0200
> > "Markus F.X.J. Oberhumer" <markus@oberhumer.com> wrote:
> > 
> >> As requested by akpm I am sending my "lzo-update" branch at
> >>
> >>   git://github.com/markus-oberhumer/linux.git lzo-update
> >>
> >> to lkml as a patch series created by "git format-patch -M v3.5..lzo-update".
> >>
> >> You can also browse the branch at
> >>
> >>   https://github.com/markus-oberhumer/linux/compare/lzo-update
> >>
> >> and review the three patches at
> >>
> >>   https://github.com/markus-oberhumer/linux/commit/7c979cebc0f93dc692b734c12665a6824d219c20
> >>   https://github.com/markus-oberhumer/linux/commit/10f6781c8591fe5fe4c8c733131915e5ae057826
> >>   https://github.com/markus-oberhumer/linux/commit/5f702781f158cb59075cfa97e5c21f52275057f1
> > 
> > The changes look OK to me.  Please ask Stephen to include the tree in
> > linux-next, for a 3.7 merge.
> 
> I'd ask you to include my "lzo-update" branch in linux-next:
> 
>   git://github.com/markus-oberhumer/linux.git lzo-update

I have added this from today.

Thanks for adding your subsystem tree as a participant of linux-next.  As
you may know, this is not a judgment of your code.  The purpose of
linux-next is for integration testing and to lower the impact of
conflicts between subsystems in the next merge window. 

You will need to ensure that the patches/commits in your tree/series have
been:
     * submitted under GPL v2 (or later) and include the Contributor's
	Signed-off-by,
     * posted to the relevant mailing list,
     * reviewed by you (or another maintainer of your subsystem tree),
     * successfully unit tested, and 
     * destined for the current or next Linux merge window.

Basically, this should be just what you would send to Linus (or ask him
to fetch).  It is allowed to be rebased if you deem it necessary.

-- 
Cheers,
Stephen Rothwell 
sfr@canb.auug.org.au

Legal Stuff:
By participating in linux-next, your subsystem tree contributions are
public and will be included in the linux-next trees.  You may be sent
e-mail messages indicating errors or other issues when the
patches/commits from your subsystem tree are merged and tested in
linux-next.  These messages may also be cross-posted to the linux-next
mailing list, the linux-kernel mailing list, etc.  The linux-next tree
project and IBM (my employer) make no warranties regarding the linux-next
project, the testing procedures, the results, the e-mails, etc.  If you
don't agree to these ground rules, let me know and I'll remove your tree
from participation in linux-next.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/3] Update LZO compression
  2012-10-09 19:54   ` Markus F.X.J. Oberhumer
  2012-10-09 22:43     ` Stephen Rothwell
@ 2012-10-11 11:41     ` Arnd Bergmann
  2012-10-11 16:28       ` Markus F.X.J. Oberhumer
  1 sibling, 1 reply; 13+ messages in thread
From: Arnd Bergmann @ 2012-10-11 11:41 UTC (permalink / raw)
  To: Markus F.X.J. Oberhumer
  Cc: Stephen Rothwell, Andrew Morton, linux-kernel, Andi Kleen,
	Johannes Stezenbach, richard -rw- weinberger, linux-btrfs,
	linux-crypto, Artem Bityutskiy, Adrian Hunter, David Woodhouse,
	Phillip Lougher, Dan Magenheimer, Dan Carpenter

On Tuesday 09 October 2012, Markus F.X.J. Oberhumer wrote:
> > 
> > : This commit updates the kernel LZO code to the current upstream version
> > : which features a significant speed improvement - benchmarking the Calgary
> > : and Silesia test corpora typically shows a doubled performance in
> > : both compression and decompression on modern i386/x86_64/powerpc machines.
> > 
> > There are significant clients of the LZO library - crypto, btrfs,
> > jffs2, ubifs, squashfs and zcache.  So let's give all those people a cc
> > and ask that they test the LZO changes once they land in linux-next. 
> > For correctness and performance, please.
> 
> The core compression and decompression code has been thoroughly tested, so I
> do not expect major problems.
> 
> Good testing after the merge and feedback about build or performance issues
> (and improvements!) is highly appreciated.

The addition of the lzo tree to linux-next caused this problem for ARM
imx_v6_v7_defconfig:

In file included from /home/arnd/linux-arm/arch/arm/boot/compressed/decompress.c:40:0:
/home/arnd/linux-arm/arch/arm/boot/compressed/../../../../lib/decompress_unlzo.c:34:34: fatal error: lzo/lzo1x_decompress.c: No such file or directory

Since the file was renamed, anything including it needs to be updated to the
new file name.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>

diff --git a/lib/decompress_unlzo.c b/lib/decompress_unlzo.c
index 4531294..960183d 100644
--- a/lib/decompress_unlzo.c
+++ b/lib/decompress_unlzo.c
@@ -31,7 +31,7 @@
  */
 
 #ifdef STATIC
-#include "lzo/lzo1x_decompress.c"
+#include "lzo/lzo1x_decompress_safe.c"
 #else
 #include <linux/decompress/unlzo.h>
 #endif

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/3] Update LZO compression
  2012-10-11 11:41     ` Arnd Bergmann
@ 2012-10-11 16:28       ` Markus F.X.J. Oberhumer
  2012-12-21  2:03         ` Dan Magenheimer
  0 siblings, 1 reply; 13+ messages in thread
From: Markus F.X.J. Oberhumer @ 2012-10-11 16:28 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Stephen Rothwell, Andrew Morton, linux-kernel, Andi Kleen,
	Johannes Stezenbach, richard -rw- weinberger, linux-btrfs,
	linux-crypto, Artem Bityutskiy, Adrian Hunter, David Woodhouse,
	Phillip Lougher, Dan Magenheimer, Dan Carpenter

Thanks Arnd,

On 2012-10-11 13:41, Arnd Bergmann wrote:
> On Tuesday 09 October 2012, Markus F.X.J. Oberhumer wrote:
>>>
>>> : This commit updates the kernel LZO code to the current upstream version
>>> : which features a significant speed improvement - benchmarking the Calgary
>>> : and Silesia test corpora typically shows a doubled performance in
>>> : both compression and decompression on modern i386/x86_64/powerpc machines.
>>>
>>> There are significant clients of the LZO library - crypto, btrfs,
>>> jffs2, ubifs, squashfs and zcache.  So let's give all those people a cc
>>> and ask that they test the LZO changes once they land in linux-next. 
>>> For correctness and performance, please.
>>
>> The core compression and decompression code has been thoroughly tested, so I
>> do not expect major problems.
>>
>> Good testing after the merge and feedback about build or performance issues
>> (and improvements!) is highly appreciated.
> 
> The addition of the lzo tree to linux-next caused this problem for ARM
> imx_v6_v7_defconfig:
> 
> In file included from /home/arnd/linux-arm/arch/arm/boot/compressed/decompress.c:40:0:
> /home/arnd/linux-arm/arch/arm/boot/compressed/../../../../lib/decompress_unlzo.c:34:34: fatal error: lzo/lzo1x_decompress.c: No such file or directory
> 
> Since the file was renamed, anything including it needs to be updated to the
> new file name.

I will add that patch to my tree.

Cheers,
Markus

> 
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> 
> diff --git a/lib/decompress_unlzo.c b/lib/decompress_unlzo.c
> index 4531294..960183d 100644
> --- a/lib/decompress_unlzo.c
> +++ b/lib/decompress_unlzo.c
> @@ -31,7 +31,7 @@
>   */
>  
>  #ifdef STATIC
> -#include "lzo/lzo1x_decompress.c"
> +#include "lzo/lzo1x_decompress_safe.c"
>  #else
>  #include <linux/decompress/unlzo.h>
>  #endif

-- 
Markus Oberhumer, <markus@oberhumer.com>, http://www.oberhumer.com/

^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [PATCH 0/3] Update LZO compression
  2012-10-11 16:28       ` Markus F.X.J. Oberhumer
@ 2012-12-21  2:03         ` Dan Magenheimer
  0 siblings, 0 replies; 13+ messages in thread
From: Dan Magenheimer @ 2012-12-21  2:03 UTC (permalink / raw)
  To: Markus F.X.J. Oberhumer, Arnd Bergmann
  Cc: Stephen Rothwell, Andrew Morton, linux-kernel, Andi Kleen,
	Johannes Stezenbach, richard -rw- weinberger, linux-btrfs,
	linux-crypto, Artem Bityutskiy, Adrian Hunter, David Woodhouse,
	Phillip Lougher, Dan Carpenter

> From: Markus F.X.J. Oberhumer [mailto:markus@oberhumer.com]
> Subject: Re: [PATCH 0/3] Update LZO compression
> 
> Thanks Arnd,
> 
> On 2012-10-11 13:41, Arnd Bergmann wrote:
> > On Tuesday 09 October 2012, Markus F.X.J. Oberhumer wrote:
> >>>
> >>> : This commit updates the kernel LZO code to the current upstream version
> >>> : which features a significant speed improvement - benchmarking the Calgary
> >>> : and Silesia test corpora typically shows a doubled performance in
> >>> : both compression and decompression on modern i386/x86_64/powerpc machines.
> >>>
> >>> There are significant clients of the LZO library - crypto, btrfs,
> >>> jffs2, ubifs, squashfs and zcache.  So let's give all those people a cc
> >>> and ask that they test the LZO changes once they land in linux-next.
> >>> For correctness and performance, please.
> >>
> >> The core compression and decompression code has been thoroughly tested, so I
> >> do not expect major problems.
> >>
> >> Good testing after the merge and feedback about build or performance issues
> >> (and improvements!) is highly appreciated.
> >
> > The addition of the lzo tree to linux-next caused this problem for ARM
> > imx_v6_v7_defconfig:
> >
> > In file included from /home/arnd/linux-arm/arch/arm/boot/compressed/decompress.c:40:0:
> > /home/arnd/linux-arm/arch/arm/boot/compressed/../../../../lib/decompress_unlzo.c:34:34: fatal error: lzo/lzo1x_decompress.c: No such file or directory
> >
> > Since the file was renamed, anything including it needs to be updated to the
> > new file name.
> 
> I will add that patch to my tree.
> 
> Cheers,
> Markus

Sorry if I missed it (bad connectivity this week), but is someone going
to send a pull request to get this LZO update from linux-next into
Linus's tree?  The window is closing soon isn't it?

Dan

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/3] Update LZO compression
  2012-10-15 23:45 ` Markus F.X.J. Oberhumer
@ 2012-10-16 16:50   ` Seth Jennings
  0 siblings, 0 replies; 13+ messages in thread
From: Seth Jennings @ 2012-10-16 16:50 UTC (permalink / raw)
  To: Markus F.X.J. Oberhumer
  Cc: Robert Jennings, Andrew Morton, LKML, Andi Kleen,
	Johannes Stezenbach, Richard Weinberger

On 10/15/2012 06:45 PM, Markus F.X.J. Oberhumer wrote:
> The crypto LZO test vectors had to be updated - this should land in linux-next
> soon (or you can just pull from my branch).

Thanks, I pulled from your branch and now it works fine.

I noticed that you made the test vectors longer.  Any particular
reason? Is there a new minimum length now or something?

I guess I'm not understanding why a change to the test vectors was needed.

Seth
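
One plausible reading of the failure discussed here, not confirmed in the thread: a compression self-test that compares the compressor's output byte for byte against a stored vector will fail whenever the compressor starts emitting a different (but still valid) stream, even though decompression remains correct. The sketch below shows that effect with a toy run-length coder. All names are hypothetical and none of this is the crypto test manager or LZO code; max_run must be between 1 and 255.

```c
#include <stddef.h>
#include <string.h>

/*
 * Toy run-length encoder: emits (count, byte) pairs. max_run stands in
 * for "the compressor version"; changing it changes the output stream.
 */
static size_t rle_compress(const unsigned char *in, size_t in_len,
                           unsigned char *out, unsigned max_run)
{
    size_t i = 0, o = 0;

    while (i < in_len) {
        unsigned run = 1;

        while (i + run < in_len && in[i + run] == in[i] && run < max_run)
            run++;
        out[o++] = (unsigned char)run;
        out[o++] = in[i];
        i += run;
    }
    return o;
}

/* One decoder handles the output of both encoder "versions". */
static size_t rle_decompress(const unsigned char *in, size_t in_len,
                             unsigned char *out)
{
    size_t i, o = 0;

    for (i = 0; i + 1 < in_len; i += 2) {
        memset(out + o, in[i + 1], in[i]);
        o += in[i];
    }
    return o;
}

/* Vector-style check: length and bytes must match the stored output. */
static int matches_vector(const unsigned char *got, size_t got_len,
                          const unsigned char *want, size_t want_len)
{
    return got_len == want_len && memcmp(got, want, got_len) == 0;
}
```

With the input "aaaaaaaab", a vector recorded with max_run = 4 is 6 bytes long, while max_run = 255 produces 4 bytes. matches_vector() then reports a mismatch on the length alone, yet both streams decompress to the original 9 bytes. A test of this kind has to be regenerated whenever the encoder's output format or heuristics change.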


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/3] Update LZO compression
  2012-10-15 19:19 Seth Jennings
@ 2012-10-15 23:45 ` Markus F.X.J. Oberhumer
  2012-10-16 16:50   ` Seth Jennings
  0 siblings, 1 reply; 13+ messages in thread
From: Markus F.X.J. Oberhumer @ 2012-10-15 23:45 UTC (permalink / raw)
  To: cover.1349621096.git.markus
  Cc: Seth Jennings, Robert Jennings, Andrew Morton, LKML, Andi Kleen,
	Johannes Stezenbach, Richard Weinberger

On 2012-10-15 21:19, Seth Jennings wrote:
>> As requested by akpm I am sending my "lzo-update" branch at
>>
>>   git://github.com/markus-oberhumer/linux.git lzo-update
>>
>> to lkml as a patch series created by "git format-patch -M v3.5..lzo-update".
>>
>> You can also browse the branch at
>>
>>   https://github.com/markus-oberhumer/linux/compare/lzo-update
>>
>> and review the three patches at
>>
>>   https://github.com/markus-oberhumer/linux/commit/7c979cebc0f93dc692b734c12665a6824d219c20
>>   https://github.com/markus-oberhumer/linux/commit/10f6781c8591fe5fe4c8c733131915e5ae057826
>>   https://github.com/markus-oberhumer/linux/commit/5f702781f158cb59075cfa97e5c21f52275057f1
> 
> As this relates to my work on zcache, I just tested these patches on PPC64 and
> they cause the LZO crypto module to fail its self-test:
> 
> [    0.521137] alg: comp: Compression test 1 failed for lzo-generic: output len = 62
> 
> I built the exact same kernel for x86_64 and all is fine.  I suspect an endianness
> related bug, but I haven't looked at the code that closely yet.
> 
> Any ideas?  I'd be happy to test any potential fixes.

The crypto LZO test vectors had to be updated - this should land in linux-next
soon (or you can just pull from my branch).

BTW, this cannot have worked on x86_64 (or any other arch), so you probably
tested the wrong kernel.

Cheers,
Markus

> Seth

-- 
Markus Oberhumer, <markus@oberhumer.com>, http://www.oberhumer.com/

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/3] Update LZO compression
@ 2012-10-15 19:19 Seth Jennings
  2012-10-15 23:45 ` Markus F.X.J. Oberhumer
  0 siblings, 1 reply; 13+ messages in thread
From: Seth Jennings @ 2012-10-15 19:19 UTC (permalink / raw)
  To: Markus F.X.J. Oberhumer
  Cc: Robert Jennings, Andrew Morton, LKML, Andi Kleen,
	Johannes Stezenbach, Richard Weinberger

> As requested by akpm I am sending my "lzo-update" branch at
> 
>   git://github.com/markus-oberhumer/linux.git lzo-update
> 
> to lkml as a patch series created by "git format-patch -M v3.5..lzo-update".
> 
> You can also browse the branch at
> 
>   https://github.com/markus-oberhumer/linux/compare/lzo-update
> 
> and review the three patches at
> 
>   https://github.com/markus-oberhumer/linux/commit/7c979cebc0f93dc692b734c12665a6824d219c20
>   https://github.com/markus-oberhumer/linux/commit/10f6781c8591fe5fe4c8c733131915e5ae057826
>   https://github.com/markus-oberhumer/linux/commit/5f702781f158cb59075cfa97e5c21f52275057f1

As this relates to my work on zcache, I just tested these patches on PPC64 and
they cause the LZO crypto module to fail its self-test:

[    0.521137] alg: comp: Compression test 1 failed for lzo-generic: output len = 62

I built the exact same kernel for x86_64 and all is fine.  I suspect an endianness
related bug, but I haven't looked at the code that closely yet.

Any ideas?  I'd be happy to test any potential fixes.

Seth


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2012-12-21  2:04 UTC | newest]

Thread overview: 13+ messages
2012-10-07 15:07 [PATCH 0/3] Update LZO compression Markus F.X.J. Oberhumer
2012-10-07 15:08 ` [PATCH 1/3] lib/lzo: Rename lzo1x_decompress.c to lzo1x_decompress_safe.c Markus F.X.J. Oberhumer
2012-10-07 15:08 ` [PATCH 2/3] lib/lzo: Update LZO compression to current upstream version Markus F.X.J. Oberhumer
2012-10-07 15:09 ` [PATCH 3/3] lib/lzo: Optimize code for CPUs with inefficient unaligned access Markus F.X.J. Oberhumer
2012-10-09 19:26 ` [PATCH 0/3] Update LZO compression Andrew Morton
2012-10-09 19:54   ` Markus F.X.J. Oberhumer
2012-10-09 22:43     ` Stephen Rothwell
2012-10-11 11:41     ` Arnd Bergmann
2012-10-11 16:28       ` Markus F.X.J. Oberhumer
2012-12-21  2:03         ` Dan Magenheimer
2012-10-15 19:19 Seth Jennings
2012-10-15 23:45 ` Markus F.X.J. Oberhumer
2012-10-16 16:50   ` Seth Jennings
