* [PATCH 0/3] Update LZO compression
@ 2012-10-07 15:07 Markus F.X.J. Oberhumer
  2012-10-07 15:08 ` [PATCH 1/3] lib/lzo: Rename lzo1x_decompress.c to lzo1x_decompress_safe.c Markus F.X.J. Oberhumer
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Markus F.X.J. Oberhumer @ 2012-10-07 15:07 UTC
  To: linux-kernel
  Cc: Markus F.X.J. Oberhumer, Andi Kleen, Andrew Morton,
	Johannes Stezenbach, richard -rw- weinberger

As requested by akpm I am sending my "lzo-update" branch at

  git://github.com/markus-oberhumer/linux.git lzo-update

to lkml as a patch series created by "git format-patch -M v3.5..lzo-update".

You can also browse the branch at

  https://github.com/markus-oberhumer/linux/compare/lzo-update

and review the three patches at

  https://github.com/markus-oberhumer/linux/commit/7c979cebc0f93dc692b734c12665a6824d219c20
  https://github.com/markus-oberhumer/linux/commit/10f6781c8591fe5fe4c8c733131915e5ae057826
  https://github.com/markus-oberhumer/linux/commit/5f702781f158cb59075cfa97e5c21f52275057f1

Share and enjoy,
Markus

Markus F.X.J. Oberhumer (3):
  lib/lzo: Rename lzo1x_decompress.c to lzo1x_decompress_safe.c
  lib/lzo: Update LZO compression to current upstream version
  lib/lzo: Optimize code for CPUs with inefficient unaligned access

 include/linux/lzo.h             |   15 +-
 lib/lzo/Makefile                |    2 +-
 lib/lzo/lzo1x_compress.c        |  309 +++++++++++++++++++++------------------
 lib/lzo/lzo1x_decompress.c      |  255 --------------------------------
 lib/lzo/lzo1x_decompress_safe.c |  237 ++++++++++++++++++++++++++++++
 lib/lzo/lzodefs.h               |   34 ++++-
 6 files changed, 441 insertions(+), 411 deletions(-)
 delete mode 100644 lib/lzo/lzo1x_decompress.c
 create mode 100644 lib/lzo/lzo1x_decompress_safe.c
* [PATCH 1/3] lib/lzo: Rename lzo1x_decompress.c to lzo1x_decompress_safe.c
  2012-10-07 15:07 [PATCH 0/3] Update LZO compression Markus F.X.J. Oberhumer
@ 2012-10-07 15:08 ` Markus F.X.J. Oberhumer
  2012-10-07 15:08 ` [PATCH 2/3] lib/lzo: Update LZO compression to current upstream version Markus F.X.J. Oberhumer
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 10+ messages in thread
From: Markus F.X.J. Oberhumer @ 2012-10-07 15:08 UTC
  To: linux-kernel
  Cc: Markus F.X.J. Oberhumer, Andi Kleen, Andrew Morton,
	Johannes Stezenbach, richard -rw- weinberger

Rename the source file to match the function name and thereby
also make room for a possible future even slightly faster
"non-safe" decompressor version.

Signed-off-by: Markus F.X.J. Oberhumer <markus@oberhumer.com>
---
 lib/lzo/Makefile                                   |    2 +-
 ...{lzo1x_decompress.c => lzo1x_decompress_safe.c} |    0
 2 files changed, 1 insertions(+), 1 deletions(-)
 rename lib/lzo/{lzo1x_decompress.c => lzo1x_decompress_safe.c} (100%)

diff --git a/lib/lzo/Makefile b/lib/lzo/Makefile
index e764116..f0f7d7c 100644
--- a/lib/lzo/Makefile
+++ b/lib/lzo/Makefile
@@ -1,5 +1,5 @@
 lzo_compress-objs := lzo1x_compress.o
-lzo_decompress-objs := lzo1x_decompress.o
+lzo_decompress-objs := lzo1x_decompress_safe.o
 
 obj-$(CONFIG_LZO_COMPRESS) += lzo_compress.o
 obj-$(CONFIG_LZO_DECOMPRESS) += lzo_decompress.o
diff --git a/lib/lzo/lzo1x_decompress.c b/lib/lzo/lzo1x_decompress_safe.c
similarity index 100%
rename from lib/lzo/lzo1x_decompress.c
rename to lib/lzo/lzo1x_decompress_safe.c
-- 
1.7.1
* [PATCH 2/3] lib/lzo: Update LZO compression to current upstream version
  2012-10-07 15:07 [PATCH 0/3] Update LZO compression Markus F.X.J. Oberhumer
  2012-10-07 15:08 ` [PATCH 1/3] lib/lzo: Rename lzo1x_decompress.c to lzo1x_decompress_safe.c Markus F.X.J. Oberhumer
@ 2012-10-07 15:08 ` Markus F.X.J. Oberhumer
  2012-10-07 15:09 ` [PATCH 3/3] lib/lzo: Optimize code for CPUs with inefficient unaligned access Markus F.X.J. Oberhumer
  2012-10-09 19:26 ` [PATCH 0/3] Update LZO compression Andrew Morton
  3 siblings, 0 replies; 10+ messages in thread
From: Markus F.X.J. Oberhumer @ 2012-10-07 15:08 UTC
  To: linux-kernel
  Cc: Markus F.X.J. Oberhumer, Andi Kleen, Andrew Morton,
	Johannes Stezenbach, richard -rw- weinberger

This commit updates the kernel LZO code to the current upstream version
which features a significant speed improvement - benchmarking the Calgary
and Silesia test corpora typically shows a doubled performance in
both compression and decompression on modern i386/x86_64/powerpc machines.

Signed-off-by: Markus F.X.J. Oberhumer <markus@oberhumer.com>
---
 include/linux/lzo.h             |   15 +-
 lib/lzo/lzo1x_compress.c        |  309 +++++++++++++++++++----------------
 lib/lzo/lzo1x_decompress_safe.c |  341 ++++++++++++++++++---------------------
 lib/lzo/lzodefs.h               |   34 +++-
 4 files changed, 360 insertions(+), 339 deletions(-)

diff --git a/include/linux/lzo.h b/include/linux/lzo.h
index d793497..a0848d9 100644
--- a/include/linux/lzo.h
+++ b/include/linux/lzo.h
@@ -4,28 +4,28 @@
  * LZO Public Kernel Interface
  * A mini subset of the LZO real-time data compression library
  *
- * Copyright (C) 1996-2005 Markus F.X.J. Oberhumer <markus@oberhumer.com>
+ * Copyright (C) 1996-2012 Markus F.X.J. Oberhumer <markus@oberhumer.com>
  *
  * The full LZO package can be found at:
  * http://www.oberhumer.com/opensource/lzo/
  *
- * Changed for kernel use by:
+ * Changed for Linux kernel use by:
  * Nitin Gupta <nitingupta910@gmail.com>
  * Richard Purdie <rpurdie@openedhand.com>
  */
 
-#define LZO1X_MEM_COMPRESS	(16384 * sizeof(unsigned char *))
-#define LZO1X_1_MEM_COMPRESS	LZO1X_MEM_COMPRESS
+#define LZO1X_1_MEM_COMPRESS	(8192 * sizeof(unsigned short))
+#define LZO1X_MEM_COMPRESS	LZO1X_1_MEM_COMPRESS
 
 #define lzo1x_worst_compress(x) ((x) + ((x) / 16) + 64 + 3)
 
-/* This requires 'workmem' of size LZO1X_1_MEM_COMPRESS */
+/* This requires 'wrkmem' of size LZO1X_1_MEM_COMPRESS */
 int lzo1x_1_compress(const unsigned char *src, size_t src_len,
-			unsigned char *dst, size_t *dst_len, void *wrkmem);
+		     unsigned char *dst, size_t *dst_len, void *wrkmem);
 
 /* safe decompression with overrun testing */
 int lzo1x_decompress_safe(const unsigned char *src, size_t src_len,
-			unsigned char *dst, size_t *dst_len);
+			  unsigned char *dst, size_t *dst_len);
 
 /*
  * Return values (< 0 = Error)
@@ -40,5 +40,6 @@ int lzo1x_decompress_safe(const unsigned char *src, size_t src_len,
 #define LZO_E_EOF_NOT_FOUND		(-7)
 #define LZO_E_INPUT_NOT_CONSUMED	(-8)
 #define LZO_E_NOT_YET_IMPLEMENTED	(-9)
+#define LZO_E_INVALID_ARGUMENT		(-10)
 
 #endif
diff --git a/lib/lzo/lzo1x_compress.c b/lib/lzo/lzo1x_compress.c
index a604099..d42efe5 100644
--- a/lib/lzo/lzo1x_compress.c
+++ b/lib/lzo/lzo1x_compress.c
@@ -1,194 +1,217 @@
 /*
- * LZO1X Compressor from MiniLZO
+ * LZO1X Compressor from LZO
  *
- * Copyright (C) 1996-2005 Markus F.X.J. Oberhumer <markus@oberhumer.com>
+ * Copyright (C) 1996-2012 Markus F.X.J.
Oberhumer <markus@oberhumer.com> * * The full LZO package can be found at: * http://www.oberhumer.com/opensource/lzo/ * - * Changed for kernel use by: + * Changed for Linux kernel use by: * Nitin Gupta <nitingupta910@gmail.com> * Richard Purdie <rpurdie@openedhand.com> */ #include <linux/module.h> #include <linux/kernel.h> -#include <linux/lzo.h> #include <asm/unaligned.h> +#include <linux/lzo.h> #include "lzodefs.h" static noinline size_t -_lzo1x_1_do_compress(const unsigned char *in, size_t in_len, - unsigned char *out, size_t *out_len, void *wrkmem) +lzo1x_1_do_compress(const unsigned char *in, size_t in_len, + unsigned char *out, size_t *out_len, + size_t ti, void *wrkmem) { + const unsigned char *ip; + unsigned char *op; const unsigned char * const in_end = in + in_len; - const unsigned char * const ip_end = in + in_len - M2_MAX_LEN - 5; - const unsigned char ** const dict = wrkmem; - const unsigned char *ip = in, *ii = ip; - const unsigned char *end, *m, *m_pos; - size_t m_off, m_len, dindex; - unsigned char *op = out; + const unsigned char * const ip_end = in + in_len - 20; + const unsigned char *ii; + lzo_dict_t * const dict = (lzo_dict_t *) wrkmem; - ip += 4; + op = out; + ip = in; + ii = ip; + ip += ti < 4 ? 
4 - ti : 0; for (;;) { - dindex = ((size_t)(0x21 * DX3(ip, 5, 5, 6)) >> 5) & D_MASK; - m_pos = dict[dindex]; - - if (m_pos < in) - goto literal; - - if (ip == m_pos || ((size_t)(ip - m_pos) > M4_MAX_OFFSET)) - goto literal; - - m_off = ip - m_pos; - if (m_off <= M2_MAX_OFFSET || m_pos[3] == ip[3]) - goto try_match; - - dindex = (dindex & (D_MASK & 0x7ff)) ^ (D_HIGH | 0x1f); - m_pos = dict[dindex]; - - if (m_pos < in) - goto literal; - - if (ip == m_pos || ((size_t)(ip - m_pos) > M4_MAX_OFFSET)) - goto literal; - - m_off = ip - m_pos; - if (m_off <= M2_MAX_OFFSET || m_pos[3] == ip[3]) - goto try_match; - - goto literal; - -try_match: - if (get_unaligned((const unsigned short *)m_pos) - == get_unaligned((const unsigned short *)ip)) { - if (likely(m_pos[2] == ip[2])) - goto match; - } - + const unsigned char *m_pos; + size_t t, m_len, m_off; + u32 dv; literal: - dict[dindex] = ip; - ++ip; + ip += 1 + ((ip - ii) >> 5); +next: if (unlikely(ip >= ip_end)) break; - continue; - -match: - dict[dindex] = ip; - if (ip != ii) { - size_t t = ip - ii; + dv = get_unaligned_le32(ip); + t = ((dv * 0x1824429d) >> (32 - D_BITS)) & D_MASK; + m_pos = in + dict[t]; + dict[t] = (lzo_dict_t) (ip - in); + if (unlikely(dv != get_unaligned_le32(m_pos))) + goto literal; + ii -= ti; + ti = 0; + t = ip - ii; + if (t != 0) { if (t <= 3) { op[-2] |= t; - } else if (t <= 18) { + COPY4(op, ii); + op += t; + } else if (t <= 16) { *op++ = (t - 3); + COPY8(op, ii); + COPY8(op + 8, ii + 8); + op += t; } else { - size_t tt = t - 18; - - *op++ = 0; - while (tt > 255) { - tt -= 255; + if (t <= 18) { + *op++ = (t - 3); + } else { + size_t tt = t - 18; *op++ = 0; + while (unlikely(tt > 255)) { + tt -= 255; + *op++ = 0; + } + *op++ = tt; } - *op++ = tt; + do { + COPY8(op, ii); + COPY8(op + 8, ii + 8); + op += 16; + ii += 16; + t -= 16; + } while (t >= 16); + if (t > 0) do { + *op++ = *ii++; + } while (--t > 0); } - do { - *op++ = *ii++; - } while (--t > 0); } - ip += 3; - if (m_pos[3] != *ip++ || m_pos[4] != 
*ip++ - || m_pos[5] != *ip++ || m_pos[6] != *ip++ - || m_pos[7] != *ip++ || m_pos[8] != *ip++) { - --ip; - m_len = ip - ii; + m_len = 4; + { +#if defined(LZO_USE_CTZ64) + u64 v; + v = get_unaligned((const u64 *) (ip + m_len)) ^ + get_unaligned((const u64 *) (m_pos + m_len)); + if (unlikely(v == 0)) { + do { + m_len += 8; + v = get_unaligned((const u64 *) (ip + m_len)) ^ + get_unaligned((const u64 *) (m_pos + m_len)); + if (unlikely(ip + m_len >= ip_end)) + goto m_len_done; + } while (v == 0); + } +# if defined(__LITTLE_ENDIAN) + m_len += (unsigned) __builtin_ctzll(v) / 8; +# elif defined(__BIG_ENDIAN) + m_len += (unsigned) __builtin_clzll(v) / 8; +# else +# error "missing endian definition" +# endif +#elif defined(LZO_USE_CTZ32) + u32 v; + v = get_unaligned((const u32 *) (ip + m_len)) ^ + get_unaligned((const u32 *) (m_pos + m_len)); + if (unlikely(v == 0)) { + do { + m_len += 4; + v = get_unaligned((const u32 *) (ip + m_len)) ^ + get_unaligned((const u32 *) (m_pos + m_len)); + if (unlikely(ip + m_len >= ip_end)) + goto m_len_done; + } while (v == 0); + } +# if defined(__LITTLE_ENDIAN) + m_len += (unsigned) __builtin_ctz(v) / 8; +# elif defined(__BIG_ENDIAN) + m_len += (unsigned) __builtin_clz(v) / 8; +# else +# error "missing endian definition" +# endif +#else + if (unlikely(ip[m_len] == m_pos[m_len])) { + do { + m_len += 1; + if (unlikely(ip + m_len >= ip_end)) + goto m_len_done; + } while (ip[m_len] == m_pos[m_len]); + } +#endif + } +m_len_done: - if (m_off <= M2_MAX_OFFSET) { - m_off -= 1; - *op++ = (((m_len - 1) << 5) - | ((m_off & 7) << 2)); - *op++ = (m_off >> 3); - } else if (m_off <= M3_MAX_OFFSET) { - m_off -= 1; + m_off = ip - m_pos; + ip += m_len; + ii = ip; + if (m_len <= M2_MAX_LEN && m_off <= M2_MAX_OFFSET) { + m_off -= 1; + *op++ = (((m_len - 1) << 5) | ((m_off & 7) << 2)); + *op++ = (m_off >> 3); + } else if (m_off <= M3_MAX_OFFSET) { + m_off -= 1; + if (m_len <= M3_MAX_LEN) *op++ = (M3_MARKER | (m_len - 2)); - goto m3_m4_offset; - } else { - m_off 
-= 0x4000; - - *op++ = (M4_MARKER | ((m_off & 0x4000) >> 11) - | (m_len - 2)); - goto m3_m4_offset; + else { + m_len -= M3_MAX_LEN; + *op++ = M3_MARKER | 0; + while (unlikely(m_len > 255)) { + m_len -= 255; + *op++ = 0; + } + *op++ = (m_len); } + *op++ = (m_off << 2); + *op++ = (m_off >> 6); } else { - end = in_end; - m = m_pos + M2_MAX_LEN + 1; - - while (ip < end && *m == *ip) { - m++; - ip++; - } - m_len = ip - ii; - - if (m_off <= M3_MAX_OFFSET) { - m_off -= 1; - if (m_len <= 33) { - *op++ = (M3_MARKER | (m_len - 2)); - } else { - m_len -= 33; - *op++ = M3_MARKER | 0; - goto m3_m4_len; - } - } else { - m_off -= 0x4000; - if (m_len <= M4_MAX_LEN) { - *op++ = (M4_MARKER - | ((m_off & 0x4000) >> 11) + m_off -= 0x4000; + if (m_len <= M4_MAX_LEN) + *op++ = (M4_MARKER | ((m_off >> 11) & 8) | (m_len - 2)); - } else { - m_len -= M4_MAX_LEN; - *op++ = (M4_MARKER - | ((m_off & 0x4000) >> 11)); -m3_m4_len: - while (m_len > 255) { - m_len -= 255; - *op++ = 0; - } - - *op++ = (m_len); + else { + m_len -= M4_MAX_LEN; + *op++ = (M4_MARKER | ((m_off >> 11) & 8)); + while (unlikely(m_len > 255)) { + m_len -= 255; + *op++ = 0; } + *op++ = (m_len); } -m3_m4_offset: - *op++ = ((m_off & 63) << 2); + *op++ = (m_off << 2); *op++ = (m_off >> 6); } - - ii = ip; - if (unlikely(ip >= ip_end)) - break; + goto next; } - *out_len = op - out; - return in_end - ii; + return in_end - (ii - ti); } -int lzo1x_1_compress(const unsigned char *in, size_t in_len, unsigned char *out, - size_t *out_len, void *wrkmem) +int lzo1x_1_compress(const unsigned char *in, size_t in_len, + unsigned char *out, size_t *out_len, + void *wrkmem) { - const unsigned char *ii; + const unsigned char *ip = in; unsigned char *op = out; - size_t t; + size_t l = in_len; + size_t t = 0; - if (unlikely(in_len <= M2_MAX_LEN + 5)) { - t = in_len; - } else { - t = _lzo1x_1_do_compress(in, in_len, op, out_len, wrkmem); + while (l > 20) { + size_t ll = l <= (M4_MAX_OFFSET + 1) ? 
l : (M4_MAX_OFFSET + 1); + uintptr_t ll_end = (uintptr_t) ip + ll; + if ((ll_end + ((t + ll) >> 5)) <= ll_end) + break; + BUILD_BUG_ON(D_SIZE * sizeof(lzo_dict_t) > LZO1X_1_MEM_COMPRESS); + memset(wrkmem, 0, D_SIZE * sizeof(lzo_dict_t)); + t = lzo1x_1_do_compress(ip, ll, op, out_len, t, wrkmem); + ip += ll; op += *out_len; + l -= ll; } + t += l; if (t > 0) { - ii = in + in_len - t; + const unsigned char *ii = in + in_len - t; if (op == out && t <= 238) { *op++ = (17 + t); @@ -198,16 +221,21 @@ int lzo1x_1_compress(const unsigned char *in, size_t in_len, unsigned char *out, *op++ = (t - 3); } else { size_t tt = t - 18; - *op++ = 0; while (tt > 255) { tt -= 255; *op++ = 0; } - *op++ = tt; } - do { + if (t >= 16) do { + COPY8(op, ii); + COPY8(op + 8, ii + 8); + op += 16; + ii += 16; + t -= 16; + } while (t >= 16); + if (t > 0) do { *op++ = *ii++; } while (--t > 0); } @@ -223,4 +251,3 @@ EXPORT_SYMBOL_GPL(lzo1x_1_compress); MODULE_LICENSE("GPL"); MODULE_DESCRIPTION("LZO1X-1 Compressor"); - diff --git a/lib/lzo/lzo1x_decompress_safe.c b/lib/lzo/lzo1x_decompress_safe.c index f2fd098..0dba30c 100644 --- a/lib/lzo/lzo1x_decompress_safe.c +++ b/lib/lzo/lzo1x_decompress_safe.c @@ -1,12 +1,12 @@ /* - * LZO1X Decompressor from MiniLZO + * LZO1X Decompressor from LZO * - * Copyright (C) 1996-2005 Markus F.X.J. Oberhumer <markus@oberhumer.com> + * Copyright (C) 1996-2012 Markus F.X.J. 
Oberhumer <markus@oberhumer.com> * * The full LZO package can be found at: * http://www.oberhumer.com/opensource/lzo/ * - * Changed for kernel use by: + * Changed for Linux kernel use by: * Nitin Gupta <nitingupta910@gmail.com> * Richard Purdie <rpurdie@openedhand.com> */ @@ -15,225 +15,198 @@ #include <linux/module.h> #include <linux/kernel.h> #endif - #include <asm/unaligned.h> #include <linux/lzo.h> #include "lzodefs.h" -#define HAVE_IP(x, ip_end, ip) ((size_t)(ip_end - ip) < (x)) -#define HAVE_OP(x, op_end, op) ((size_t)(op_end - op) < (x)) -#define HAVE_LB(m_pos, out, op) (m_pos < out || m_pos >= op) - -#define COPY4(dst, src) \ - put_unaligned(get_unaligned((const u32 *)(src)), (u32 *)(dst)) +#define HAVE_IP(x) ((size_t)(ip_end - ip) >= (size_t)(x)) +#define HAVE_OP(x) ((size_t)(op_end - op) >= (size_t)(x)) +#define NEED_IP(x) if (!HAVE_IP(x)) goto input_overrun +#define NEED_OP(x) if (!HAVE_OP(x)) goto output_overrun +#define TEST_LB(m_pos) if ((m_pos) < out) goto lookbehind_overrun int lzo1x_decompress_safe(const unsigned char *in, size_t in_len, - unsigned char *out, size_t *out_len) + unsigned char *out, size_t *out_len) { + unsigned char *op; + const unsigned char *ip; + size_t t, next; + size_t state = 0; + const unsigned char *m_pos; const unsigned char * const ip_end = in + in_len; unsigned char * const op_end = out + *out_len; - const unsigned char *ip = in, *m_pos; - unsigned char *op = out; - size_t t; - *out_len = 0; + op = out; + ip = in; + if (unlikely(in_len < 3)) + goto input_overrun; if (*ip > 17) { t = *ip++ - 17; - if (t < 4) + if (t < 4) { + next = t; goto match_next; - if (HAVE_OP(t, op_end, op)) - goto output_overrun; - if (HAVE_IP(t + 1, ip_end, ip)) - goto input_overrun; - do { - *op++ = *ip++; - } while (--t > 0); - goto first_literal_run; - } - - while ((ip < ip_end)) { - t = *ip++; - if (t >= 16) - goto match; - if (t == 0) { - if (HAVE_IP(1, ip_end, ip)) - goto input_overrun; - while (*ip == 0) { - t += 255; - ip++; - if 
(HAVE_IP(1, ip_end, ip)) - goto input_overrun; - } - t += 15 + *ip++; - } - if (HAVE_OP(t + 3, op_end, op)) - goto output_overrun; - if (HAVE_IP(t + 4, ip_end, ip)) - goto input_overrun; - - COPY4(op, ip); - op += 4; - ip += 4; - if (--t > 0) { - if (t >= 4) { - do { - COPY4(op, ip); - op += 4; - ip += 4; - t -= 4; - } while (t >= 4); - if (t > 0) { - do { - *op++ = *ip++; - } while (--t > 0); - } - } else { - do { - *op++ = *ip++; - } while (--t > 0); - } } + goto copy_literal_run; + } -first_literal_run: + for (;;) { t = *ip++; - if (t >= 16) - goto match; - m_pos = op - (1 + M2_MAX_OFFSET); - m_pos -= t >> 2; - m_pos -= *ip++ << 2; - - if (HAVE_LB(m_pos, out, op)) - goto lookbehind_overrun; - - if (HAVE_OP(3, op_end, op)) - goto output_overrun; - *op++ = *m_pos++; - *op++ = *m_pos++; - *op++ = *m_pos; - - goto match_done; - - do { -match: - if (t >= 64) { - m_pos = op - 1; - m_pos -= (t >> 2) & 7; - m_pos -= *ip++ << 3; - t = (t >> 5) - 1; - if (HAVE_LB(m_pos, out, op)) - goto lookbehind_overrun; - if (HAVE_OP(t + 3 - 1, op_end, op)) - goto output_overrun; - goto copy_match; - } else if (t >= 32) { - t &= 31; - if (t == 0) { - if (HAVE_IP(1, ip_end, ip)) - goto input_overrun; - while (*ip == 0) { + if (t < 16) { + if (likely(state == 0)) { + if (unlikely(t == 0)) { + while (unlikely(*ip == 0)) { t += 255; ip++; - if (HAVE_IP(1, ip_end, ip)) - goto input_overrun; + NEED_IP(1); } - t += 31 + *ip++; + t += 15 + *ip++; } - m_pos = op - 1; - m_pos -= get_unaligned_le16(ip) >> 2; - ip += 2; - } else if (t >= 16) { - m_pos = op; - m_pos -= (t & 8) << 11; - - t &= 7; - if (t == 0) { - if (HAVE_IP(1, ip_end, ip)) - goto input_overrun; - while (*ip == 0) { - t += 255; - ip++; - if (HAVE_IP(1, ip_end, ip)) - goto input_overrun; - } - t += 7 + *ip++; + t += 3; +copy_literal_run: + if (likely(HAVE_IP(t + 15) && HAVE_OP(t + 15))) { + const unsigned char *ie = ip + t; + unsigned char *oe = op + t; + do { + COPY8(op, ip); + op += 8; + ip += 8; + COPY8(op, ip); + op += 8; + ip 
+= 8; + } while (ip < ie); + ip = ie; + op = oe; + } else { + NEED_OP(t); + NEED_IP(t + 3); + do { + *op++ = *ip++; + } while (--t > 0); } - m_pos -= get_unaligned_le16(ip) >> 2; - ip += 2; - if (m_pos == op) - goto eof_found; - m_pos -= 0x4000; - } else { + state = 4; + continue; + } else if (state != 4) { + next = t & 3; m_pos = op - 1; m_pos -= t >> 2; m_pos -= *ip++ << 2; - - if (HAVE_LB(m_pos, out, op)) - goto lookbehind_overrun; - if (HAVE_OP(2, op_end, op)) - goto output_overrun; - - *op++ = *m_pos++; - *op++ = *m_pos; - goto match_done; + TEST_LB(m_pos); + NEED_OP(2); + op[0] = m_pos[0]; + op[1] = m_pos[1]; + op += 2; + goto match_next; + } else { + next = t & 3; + m_pos = op - (1 + M2_MAX_OFFSET); + m_pos -= t >> 2; + m_pos -= *ip++ << 2; + t = 3; } - - if (HAVE_LB(m_pos, out, op)) - goto lookbehind_overrun; - if (HAVE_OP(t + 3 - 1, op_end, op)) - goto output_overrun; - - if (t >= 2 * 4 - (3 - 1) && (op - m_pos) >= 4) { - COPY4(op, m_pos); - op += 4; - m_pos += 4; - t -= 4 - (3 - 1); + } else if (t >= 64) { + next = t & 3; + m_pos = op - 1; + m_pos -= (t >> 2) & 7; + m_pos -= *ip++ << 3; + t = (t >> 5) - 1 + (3 - 1); + } else if (t >= 32) { + t = (t & 31) + (3 - 1); + if (unlikely(t == 2)) { + while (unlikely(*ip == 0)) { + t += 255; + ip++; + NEED_IP(1); + } + t += 31 + *ip++; + NEED_IP(2); + } + m_pos = op - 1; + next = get_unaligned_le16(ip); + ip += 2; + m_pos -= next >> 2; + next &= 3; + } else { + m_pos = op; + m_pos -= (t & 8) << 11; + t = (t & 7) + (3 - 1); + if (unlikely(t == 2)) { + while (unlikely(*ip == 0)) { + t += 255; + ip++; + NEED_IP(1); + } + t += 7 + *ip++; + NEED_IP(2); + } + next = get_unaligned_le16(ip); + ip += 2; + m_pos -= next >> 2; + next &= 3; + if (m_pos == op) + goto eof_found; + m_pos -= 0x4000; + } + TEST_LB(m_pos); + if (op - m_pos >= 8) { + unsigned char *oe = op + t; + if (likely(HAVE_OP(t + 15))) { do { - COPY4(op, m_pos); - op += 4; - m_pos += 4; - t -= 4; - } while (t >= 4); - if (t > 0) - do { - *op++ = *m_pos++; - } 
while (--t > 0); + COPY8(op, m_pos); + op += 8; + m_pos += 8; + COPY8(op, m_pos); + op += 8; + m_pos += 8; + } while (op < oe); + op = oe; + if (HAVE_IP(6)) { + state = next; + COPY4(op, ip); + op += next; + ip += next; + continue; + } } else { -copy_match: - *op++ = *m_pos++; - *op++ = *m_pos++; + NEED_OP(t); do { *op++ = *m_pos++; - } while (--t > 0); + } while (op < oe); } -match_done: - t = ip[-2] & 3; - if (t == 0) - break; + } else { + unsigned char *oe = op + t; + NEED_OP(t); + op[0] = m_pos[0]; + op[1] = m_pos[1]; + op += 2; + m_pos += 2; + do { + *op++ = *m_pos++; + } while (op < oe); + } match_next: - if (HAVE_OP(t, op_end, op)) - goto output_overrun; - if (HAVE_IP(t + 1, ip_end, ip)) - goto input_overrun; - - *op++ = *ip++; - if (t > 1) { + state = next; + t = next; + if (likely(HAVE_IP(6) && HAVE_OP(4))) { + COPY4(op, ip); + op += t; + ip += t; + } else { + NEED_IP(t + 3); + NEED_OP(t); + while (t > 0) { *op++ = *ip++; - if (t > 2) - *op++ = *ip++; + t--; } - - t = *ip++; - } while (ip < ip_end); + } } - *out_len = op - out; - return LZO_E_EOF_NOT_FOUND; - eof_found: *out_len = op - out; - return (ip == ip_end ? LZO_E_OK : - (ip < ip_end ? LZO_E_INPUT_NOT_CONSUMED : LZO_E_INPUT_OVERRUN)); + return (t != 3 ? LZO_E_ERROR : + ip == ip_end ? LZO_E_OK : + ip < ip_end ? LZO_E_INPUT_NOT_CONSUMED : LZO_E_INPUT_OVERRUN); + input_overrun: *out_len = op - out; return LZO_E_INPUT_OVERRUN; diff --git a/lib/lzo/lzodefs.h b/lib/lzo/lzodefs.h index b6d482c..ddc8db5 100644 --- a/lib/lzo/lzodefs.h +++ b/lib/lzo/lzodefs.h @@ -1,19 +1,37 @@ /* * lzodefs.h -- architecture, OS and compiler specific defines * - * Copyright (C) 1996-2005 Markus F.X.J. Oberhumer <markus@oberhumer.com> + * Copyright (C) 1996-2012 Markus F.X.J. 
Oberhumer <markus@oberhumer.com>
  *
  * The full LZO package can be found at:
  * http://www.oberhumer.com/opensource/lzo/
  *
- * Changed for kernel use by:
+ * Changed for Linux kernel use by:
  * Nitin Gupta <nitingupta910@gmail.com>
  * Richard Purdie <rpurdie@openedhand.com>
  */
 
-#define LZO_VERSION		0x2020
-#define LZO_VERSION_STRING	"2.02"
-#define LZO_VERSION_DATE	"Oct 17 2005"
+
+#define COPY4(dst, src)	\
+		put_unaligned(get_unaligned((const u32 *)(src)), (u32 *)(dst))
+#if defined(__x86_64__)
+#define COPY8(dst, src)	\
+		put_unaligned(get_unaligned((const u64 *)(src)), (u64 *)(dst))
+#else
+#define COPY8(dst, src)	\
+		COPY4(dst, src); COPY4((dst) + 4, (src) + 4)
+#endif
+
+#if defined(__BIG_ENDIAN) && defined(__LITTLE_ENDIAN)
+#error "conflicting endian definitions"
+#elif defined(__x86_64__)
+#define LZO_USE_CTZ64	1
+#define LZO_USE_CTZ32	1
+#elif defined(__i386__) || defined(__powerpc__)
+#define LZO_USE_CTZ32	1
+#else
+#define LZO_USE_CTZ32	1
+#endif
 
 #define M1_MAX_OFFSET	0x0400
 #define M2_MAX_OFFSET	0x0800
@@ -34,8 +52,10 @@
 #define M3_MARKER	32
 #define M4_MARKER	16
 
-#define D_BITS		14
-#define D_MASK		((1u << D_BITS) - 1)
+#define lzo_dict_t	unsigned short
+#define D_BITS		13
+#define D_SIZE		(1u << D_BITS)
+#define D_MASK		(D_SIZE - 1)
 #define D_HIGH		((D_MASK >> 1) + 1)
 
 #define DX2(p, s1, s2)	(((((size_t)((p)[2]) << (s2)) ^ (p)[1]) \
-- 
1.7.1
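
An illustration for readers of the match-length hunk in lzo1x_compress.c
in this patch: the LZO_USE_CTZ64 branch XORs eight bytes of input against
eight bytes of the candidate match and, once the result is nonzero, counts
trailing (little-endian) or leading (big-endian) zero bits to locate the
first differing byte. The following is only a minimal userspace sketch of
that idea, assuming a GCC- or Clang-compatible compiler. load_u64() and
match_len() are hypothetical names that do not appear in the patch,
memcpy() stands in for the kernel's get_unaligned(), and the explicit
limit argument replaces the patch's ip_end check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not in the patch): read 8 bytes from an address
 * that may be unaligned. memcpy() is the portable userspace stand-in
 * for the kernel's get_unaligned(). */
static uint64_t load_u64(const unsigned char *p)
{
	uint64_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}

/* Hypothetical helper (not in the patch): number of leading bytes that
 * are equal in a and b, examining at most `limit` bytes of each. */
static size_t match_len(const unsigned char *a, const unsigned char *b,
			size_t limit)
{
	size_t len = 0;

	assert(a != NULL && b != NULL);

	/* Compare one 64-bit word at a time while a full word fits. */
	while (len + 8 <= limit) {
		uint64_t v = load_u64(a + len) ^ load_u64(b + len);

		if (v != 0) {
			/* The first differing byte is the lowest nonzero
			 * byte on little-endian hosts and the highest one
			 * on big-endian hosts. */
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
			return len + (size_t)__builtin_ctzll(v) / 8;
#else
			return len + (size_t)__builtin_clzll(v) / 8;
#endif
		}
		len += 8;
	}

	/* Fewer than 8 bytes left: finish byte by byte. */
	while (len < limit && a[len] == b[len])
		len++;
	return len;
}
```

For two 24-byte buffers that first differ at index 11, match_len(a, b, 24)
returns 11: the first word compares equal, and the second XOR has its
first nonzero byte at position 3 within the word.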
* [PATCH 3/3] lib/lzo: Optimize code for CPUs with inefficient unaligned access
  2012-10-07 15:07 [PATCH 0/3] Update LZO compression Markus F.X.J. Oberhumer
  2012-10-07 15:08 ` [PATCH 1/3] lib/lzo: Rename lzo1x_decompress.c to lzo1x_decompress_safe.c Markus F.X.J. Oberhumer
  2012-10-07 15:08 ` [PATCH 2/3] lib/lzo: Update LZO compression to current upstream version Markus F.X.J. Oberhumer
@ 2012-10-07 15:09 ` Markus F.X.J. Oberhumer
  2012-10-09 19:26 ` [PATCH 0/3] Update LZO compression Andrew Morton
  3 siblings, 0 replies; 10+ messages in thread
From: Markus F.X.J. Oberhumer @ 2012-10-07 15:09 UTC
  To: linux-kernel
  Cc: Markus F.X.J. Oberhumer, Andi Kleen, Andrew Morton,
	Johannes Stezenbach, richard -rw- weinberger

Some code paths are only beneficial on machines with fast unaligned
loads, so only use these if CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
is defined.

Signed-off-by: Markus F.X.J. Oberhumer <markus@oberhumer.com>
---
 lib/lzo/lzo1x_compress.c        |    4 ++--
 lib/lzo/lzo1x_decompress_safe.c |   15 ++++++++++++---
 lib/lzo/lzodefs.h               |    2 +-
 3 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/lib/lzo/lzo1x_compress.c b/lib/lzo/lzo1x_compress.c
index d42efe5..1593dba 100644
--- a/lib/lzo/lzo1x_compress.c
+++ b/lib/lzo/lzo1x_compress.c
@@ -90,7 +90,7 @@ next:
 
 		m_len = 4;
 		{
-#if defined(LZO_USE_CTZ64)
+#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && defined(LZO_USE_CTZ64)
 		u64 v;
 		v = get_unaligned((const u64 *) (ip + m_len)) ^
 		    get_unaligned((const u64 *) (m_pos + m_len));
@@ -110,7 +110,7 @@ next:
 #  else
 #    error "missing endian definition"
 #  endif
-#elif defined(LZO_USE_CTZ32)
+#elif defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && defined(LZO_USE_CTZ32)
 		u32 v;
 		v = get_unaligned((const u32 *) (ip + m_len)) ^
 		    get_unaligned((const u32 *) (m_pos + m_len));
diff --git a/lib/lzo/lzo1x_decompress_safe.c b/lib/lzo/lzo1x_decompress_safe.c
index 0dba30c..569985d 100644
--- a/lib/lzo/lzo1x_decompress_safe.c
+++ b/lib/lzo/lzo1x_decompress_safe.c
@@ -64,6 +64,7 @@ int lzo1x_decompress_safe(const unsigned char *in, size_t in_len,
 				}
 				t += 3;
 copy_literal_run:
+#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)
 				if (likely(HAVE_IP(t + 15) && HAVE_OP(t + 15))) {
 					const unsigned char *ie = ip + t;
 					unsigned char *oe = op + t;
@@ -77,7 +78,9 @@ copy_literal_run:
 					} while (ip < ie);
 					ip = ie;
 					op = oe;
-				} else {
+				} else
+#endif
+				{
 					NEED_OP(t);
 					NEED_IP(t + 3);
 					do {
@@ -148,6 +151,7 @@ copy_literal_run:
 			m_pos -= 0x4000;
 		}
 		TEST_LB(m_pos);
+#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)
 		if (op - m_pos >= 8) {
 			unsigned char *oe = op + t;
 			if (likely(HAVE_OP(t + 15))) {
@@ -173,7 +177,9 @@ copy_literal_run:
 					*op++ = *m_pos++;
 				} while (op < oe);
 			}
-		} else {
+		} else
+#endif
+		{
 			unsigned char *oe = op + t;
 			NEED_OP(t);
 			op[0] = m_pos[0];
@@ -187,11 +193,14 @@ copy_literal_run:
 match_next:
 		state = next;
 		t = next;
+#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)
 		if (likely(HAVE_IP(6) && HAVE_OP(4))) {
 			COPY4(op, ip);
 			op += t;
 			ip += t;
-		} else {
+		} else
+#endif
+		{
 			NEED_IP(t + 3);
 			NEED_OP(t);
 			while (t > 0) {
diff --git a/lib/lzo/lzodefs.h b/lib/lzo/lzodefs.h
index ddc8db5..5a4beb2 100644
--- a/lib/lzo/lzodefs.h
+++ b/lib/lzo/lzodefs.h
@@ -29,7 +29,7 @@
 #define LZO_USE_CTZ32	1
 #elif defined(__i386__) || defined(__powerpc__)
 #define LZO_USE_CTZ32	1
-#else
+#elif defined(__arm__) && (__LINUX_ARM_ARCH__ >= 5)
 #define LZO_USE_CTZ32	1
 #endif
 
-- 
1.7.1
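
To illustrate the "} else / #endif / {" construction used in the
decompressor hunks of this patch: the word-at-a-time path may write a few
bytes past the requested length, so it is compiled in only when unaligned
access is cheap, and taken only when both buffers have slack; otherwise the
exact byte loop runs. This is a minimal userspace sketch under stated
assumptions, not the kernel code. EXAMPLE_FAST_UNALIGNED is a hypothetical
stand-in for CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS, copy_run() is not a
function from the patch, and it moves 8 bytes per step where the patch
moves 16.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical switch standing in for the kernel's
 * CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS; remove the define to compile
 * only the byte-by-byte path. */
#define EXAMPLE_FAST_UNALIGNED 1

/* Hypothetical helper (not in the patch): copy t bytes from ip to op.
 * op_avail and ip_avail are the bytes remaining in each buffer. */
static void copy_run(unsigned char *op, size_t op_avail,
		     const unsigned char *ip, size_t ip_avail, size_t t)
{
	assert(t <= op_avail && t <= ip_avail);

#if defined(EXAMPLE_FAST_UNALIGNED)
	/* Fast path: whole 8-byte words. The last word can reach up to
	 * 7 bytes beyond t, so both buffers need that much slack. */
	if (ip_avail >= t + 7 && op_avail >= t + 7) {
		size_t done = 0;

		while (done < t) {
			memcpy(op + done, ip + done, 8);
			done += 8;
		}
	} else
#endif
	{
		/* Exact path: never touches a byte beyond t. */
		size_t i;

		for (i = 0; i < t; i++)
			op[i] = ip[i];
	}
}
```

With slack available the fast path may overwrite up to 7 bytes after the
run, which the caller must tolerate; without slack the byte loop leaves
everything past t untouched.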
* Re: [PATCH 0/3] Update LZO compression
  2012-10-07 15:07 [PATCH 0/3] Update LZO compression Markus F.X.J. Oberhumer
                   ` (2 preceding siblings ...)
  2012-10-07 15:09 ` [PATCH 3/3] lib/lzo: Optimize code for CPUs with inefficient unaligned access Markus F.X.J. Oberhumer
@ 2012-10-09 19:26 ` Andrew Morton
  2012-10-09 19:54   ` Markus F.X.J. Oberhumer
  3 siblings, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2012-10-09 19:26 UTC
  To: Markus F.X.J. Oberhumer
  Cc: linux-kernel, Andi Kleen, Johannes Stezenbach,
	richard -rw- weinberger, linux-btrfs, linux-crypto,
	Artem Bityutskiy, Adrian Hunter, David Woodhouse,
	Phillip Lougher, Dan Magenheimer, Dan Carpenter,
	Stephen Rothwell

On Sun, 7 Oct 2012 17:07:55 +0200
"Markus F.X.J. Oberhumer" <markus@oberhumer.com> wrote:

> As requested by akpm I am sending my "lzo-update" branch at
>
>   git://github.com/markus-oberhumer/linux.git lzo-update
>
> to lkml as a patch series created by "git format-patch -M v3.5..lzo-update".
>
> You can also browse the branch at
>
>   https://github.com/markus-oberhumer/linux/compare/lzo-update
>
> and review the three patches at
>
>   https://github.com/markus-oberhumer/linux/commit/7c979cebc0f93dc692b734c12665a6824d219c20
>   https://github.com/markus-oberhumer/linux/commit/10f6781c8591fe5fe4c8c733131915e5ae057826
>   https://github.com/markus-oberhumer/linux/commit/5f702781f158cb59075cfa97e5c21f52275057f1

The changes look OK to me.  Please ask Stephen to include the tree in
linux-next, for a 3.7 merge.

The changelog for patch 2/3 says:

: This commit updates the kernel LZO code to the current upstream version
: which features a significant speed improvement - benchmarking the Calgary
: and Silesia test corpora typically shows a doubled performance in
: both compression and decompression on modern i386/x86_64/powerpc machines.

There are significant clients of the LZO library - crypto, btrfs,
jffs2, ubifs, squashfs and zcache.  So let's give all those people a cc
and ask that they test the LZO changes once they land in linux-next.
For correctness and performance, please.
* Re: [PATCH 0/3] Update LZO compression
  2012-10-09 19:26 ` [PATCH 0/3] Update LZO compression Andrew Morton
@ 2012-10-09 19:54   ` Markus F.X.J. Oberhumer
  2012-10-09 22:43     ` Stephen Rothwell
  2012-10-11 11:41     ` Arnd Bergmann
  0 siblings, 2 replies; 10+ messages in thread
From: Markus F.X.J. Oberhumer @ 2012-10-09 19:54 UTC
  To: Stephen Rothwell
  Cc: Andrew Morton, linux-kernel, Andi Kleen, Johannes Stezenbach,
	richard -rw- weinberger, linux-btrfs, linux-crypto,
	Artem Bityutskiy, Adrian Hunter, David Woodhouse,
	Phillip Lougher, Dan Magenheimer, Dan Carpenter

Hi Stephen,

On 2012-10-09 21:26, Andrew Morton wrote:
> On Sun, 7 Oct 2012 17:07:55 +0200
> "Markus F.X.J. Oberhumer" <markus@oberhumer.com> wrote:
>
>> As requested by akpm I am sending my "lzo-update" branch at
>>
>>   git://github.com/markus-oberhumer/linux.git lzo-update
>>
>> to lkml as a patch series created by "git format-patch -M v3.5..lzo-update".
>>
>> You can also browse the branch at
>>
>>   https://github.com/markus-oberhumer/linux/compare/lzo-update
>>
>> and review the three patches at
>>
>>   https://github.com/markus-oberhumer/linux/commit/7c979cebc0f93dc692b734c12665a6824d219c20
>>   https://github.com/markus-oberhumer/linux/commit/10f6781c8591fe5fe4c8c733131915e5ae057826
>>   https://github.com/markus-oberhumer/linux/commit/5f702781f158cb59075cfa97e5c21f52275057f1
>
> The changes look OK to me.  Please ask Stephen to include the tree in
> linux-next, for a 3.7 merge.

I'd ask you to include my "lzo-update" branch in linux-next:

  git://github.com/markus-oberhumer/linux.git lzo-update

> The changelog for patch 2/3 says:
>
> : This commit updates the kernel LZO code to the current upstream version
> : which features a significant speed improvement - benchmarking the Calgary
> : and Silesia test corpora typically shows a doubled performance in
> : both compression and decompression on modern i386/x86_64/powerpc machines.
>
> There are significant clients of the LZO library - crypto, btrfs,
> jffs2, ubifs, squashfs and zcache.  So let's give all those people a cc
> and ask that they test the LZO changes once they land in linux-next.
> For correctness and performance, please.

The core compression and decompression code has been thoroughly tested, so
I do not expect major problems. Good testing after the merge and feedback
about build or performance issues (and improvements!) are highly appreciated.

Many thanks,
Markus

-- 
Markus Oberhumer, <markus@oberhumer.com>, http://www.oberhumer.com/
* Re: [PATCH 0/3] Update LZO compression
From: Stephen Rothwell @ 2012-10-09 22:43 UTC
To: Markus F.X.J. Oberhumer
Cc: Andrew Morton, linux-kernel, Andi Kleen, Johannes Stezenbach, richard -rw- weinberger, linux-btrfs, linux-crypto, Artem Bityutskiy, Adrian Hunter, David Woodhouse, Phillip Lougher, Dan Magenheimer, Dan Carpenter

Hi Markus,

On Tue, 09 Oct 2012 21:54:59 +0200 "Markus F.X.J. Oberhumer" <markus@oberhumer.com> wrote:
>
> On 2012-10-09 21:26, Andrew Morton wrote:
> > On Sun, 7 Oct 2012 17:07:55 +0200
> > "Markus F.X.J. Oberhumer" <markus@oberhumer.com> wrote:
> >
> >> As requested by akpm I am sending my "lzo-update" branch at
> >>
> >> git://github.com/markus-oberhumer/linux.git lzo-update
> >>
> >> to lkml as a patch series created by "git format-patch -M v3.5..lzo-update".
> >>
> >> You can also browse the branch at
> >>
> >> https://github.com/markus-oberhumer/linux/compare/lzo-update
> >>
> >> and review the three patches at
> >>
> >> https://github.com/markus-oberhumer/linux/commit/7c979cebc0f93dc692b734c12665a6824d219c20
> >> https://github.com/markus-oberhumer/linux/commit/10f6781c8591fe5fe4c8c733131915e5ae057826
> >> https://github.com/markus-oberhumer/linux/commit/5f702781f158cb59075cfa97e5c21f52275057f1
> >
> > The changes look OK to me. Please ask Stephen to include the tree in
> > linux-next, for a 3.7 merge.
>
> I'd ask you to include my "lzo-update" branch in linux-next:
>
> git://github.com/markus-oberhumer/linux.git lzo-update

I have added this from today.

Thanks for adding your subsystem tree as a participant of linux-next. As
you may know, this is not a judgment of your code. The purpose of linux-next
is for integration testing and to lower the impact of conflicts between
subsystems in the next merge window.

You will need to ensure that the patches/commits in your tree/series have
been:
     * submitted under GPL v2 (or later) and include the Contributor's
       Signed-off-by,
     * posted to the relevant mailing list,
     * reviewed by you (or another maintainer of your subsystem tree),
     * successfully unit tested, and
     * destined for the current or next Linux merge window.

Basically, this should be just what you would send to Linus (or ask him to
fetch). It is allowed to be rebased if you deem it necessary.

-- 
Cheers,
Stephen Rothwell
sfr@canb.auug.org.au

Legal Stuff:
By participating in linux-next, your subsystem tree contributions are
public and will be included in the linux-next trees. You may be sent
e-mail messages indicating errors or other issues when the
patches/commits from your subsystem tree are merged and tested in
linux-next. These messages may also be cross-posted to the linux-next
mailing list, the linux-kernel mailing list, etc. The linux-next tree
project and IBM (my employer) make no warranties regarding the linux-next
project, the testing procedures, the results, the e-mails, etc. If you
don't agree to these ground rules, let me know and I'll remove your tree
from participation in linux-next.

* Re: [PATCH 0/3] Update LZO compression
From: Arnd Bergmann @ 2012-10-11 11:41 UTC
To: Markus F.X.J. Oberhumer
Cc: Stephen Rothwell, Andrew Morton, linux-kernel, Andi Kleen, Johannes Stezenbach, richard -rw- weinberger, linux-btrfs, linux-crypto, Artem Bityutskiy, Adrian Hunter, David Woodhouse, Phillip Lougher, Dan Magenheimer, Dan Carpenter

On Tuesday 09 October 2012, Markus F.X.J. Oberhumer wrote:
> >
> > : This commit updates the kernel LZO code to the current upstream version
> > : which features a significant speed improvement - benchmarking the Calgary
> > : and Silesia test corpora typically shows a doubled performance in
> > : both compression and decompression on modern i386/x86_64/powerpc machines.
> >
> > There are significant clients of the LZO library - crypto, btrfs,
> > jffs2, ubifs, squashfs and zcache. So let's give all those people a cc
> > and ask that they test the LZO changes once they land in linux-next.
> > For correctness and performance, please.
>
> The core compression and decompression code has been thoroughly tested, so I
> do not expect major problems.
>
> Good testing after the merge and feedback about build or performance issues
> (and improvements!) is highly appreciated.

The addition of the lzo tree to linux-next caused this problem for ARM
imx_v6_v7_defconfig:

In file included from /home/arnd/linux-arm/arch/arm/boot/compressed/decompress.c:40:0:
/home/arnd/linux-arm/arch/arm/boot/compressed/../../../../lib/decompress_unlzo.c:34:34: fatal error: lzo/lzo1x_decompress.c: No such file or directory

Since the file was renamed, anything including it needs to be updated to the
new file name.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>

diff --git a/lib/decompress_unlzo.c b/lib/decompress_unlzo.c
index 4531294..960183d 100644
--- a/lib/decompress_unlzo.c
+++ b/lib/decompress_unlzo.c
@@ -31,7 +31,7 @@
  */
 
 #ifdef STATIC
-#include "lzo/lzo1x_decompress.c"
+#include "lzo/lzo1x_decompress_safe.c"
 #else
 #include <linux/decompress/unlzo.h>
 #endif

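The build break reported above comes from a source file that still includes the old file name after the rename. As an illustration only, here is a minimal Python sketch (not part of the thread or of any kernel tooling) that scans a tree for `#include` lines naming a renamed file. The function name `find_stale_includes` and the throwaway demo tree are hypothetical; in a real checkout, `git grep` for the old file name would serve the same purpose.

```python
import os
import re
import tempfile


def find_stale_includes(root, old_name):
    """Return (path, line_number, text) for every #include naming old_name.

    Hypothetical helper for illustration; it only inspects .c and .h files.
    """
    # Match  #include "dir/old_name"  or  #include <dir/old_name>.
    pattern = re.compile(
        r'^\s*#\s*include\s*["<](?:[^">]*/)?' + re.escape(old_name) + r'[">]'
    )
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in sorted(filenames):
            if not filename.endswith((".c", ".h")):
                continue
            path = os.path.join(dirpath, filename)
            with open(path, encoding="utf-8", errors="replace") as handle:
                for number, text in enumerate(handle, start=1):
                    if pattern.match(text):
                        hits.append((path, number, text.strip()))
    return hits


def demo():
    """Build a throwaway tree with one stale and one updated include."""
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, "stale.c"), "w", encoding="utf-8") as f:
            f.write('#ifdef STATIC\n#include "lzo/lzo1x_decompress.c"\n#endif\n')
        with open(os.path.join(root, "fixed.c"), "w", encoding="utf-8") as f:
            f.write('#ifdef STATIC\n#include "lzo/lzo1x_decompress_safe.c"\n#endif\n')
        hits = find_stale_includes(root, "lzo1x_decompress.c")
        # Report base names only, since the temporary directory path varies.
        return [(os.path.basename(p), n, t) for p, n, t in hits]


if __name__ == "__main__":
    for name, number, text in demo():
        print(f"{name}:{number}: {text}")
```

The pattern anchors on the full old file name followed by the closing quote or bracket, so an include of `lzo1x_decompress_safe.c` is not reported as stale.
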
* Re: [PATCH 0/3] Update LZO compression
From: Markus F.X.J. Oberhumer @ 2012-10-11 16:28 UTC
To: Arnd Bergmann
Cc: Stephen Rothwell, Andrew Morton, linux-kernel, Andi Kleen, Johannes Stezenbach, richard -rw- weinberger, linux-btrfs, linux-crypto, Artem Bityutskiy, Adrian Hunter, David Woodhouse, Phillip Lougher, Dan Magenheimer, Dan Carpenter

Thanks Arnd,

On 2012-10-11 13:41, Arnd Bergmann wrote:
> On Tuesday 09 October 2012, Markus F.X.J. Oberhumer wrote:
>>>
>>> : This commit updates the kernel LZO code to the current upstream version
>>> : which features a significant speed improvement - benchmarking the Calgary
>>> : and Silesia test corpora typically shows a doubled performance in
>>> : both compression and decompression on modern i386/x86_64/powerpc machines.
>>>
>>> There are significant clients of the LZO library - crypto, btrfs,
>>> jffs2, ubifs, squashfs and zcache. So let's give all those people a cc
>>> and ask that they test the LZO changes once they land in linux-next.
>>> For correctness and performance, please.
>>
>> The core compression and decompression code has been thoroughly tested, so I
>> do not expect major problems.
>>
>> Good testing after the merge and feedback about build or performance issues
>> (and improvements!) is highly appreciated.
>
> The addition of the lzo tree to linux-next caused this problem for ARM
> imx_v6_v7_defconfig:
>
> In file included from /home/arnd/linux-arm/arch/arm/boot/compressed/decompress.c:40:0:
> /home/arnd/linux-arm/arch/arm/boot/compressed/../../../../lib/decompress_unlzo.c:34:34: fatal error: lzo/lzo1x_decompress.c: No such file or directory
>
> Since the file was renamed, anything including it needs to be updated to the
> new file name.

I will add that patch to my tree.

Cheers,
Markus

>
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
>
> diff --git a/lib/decompress_unlzo.c b/lib/decompress_unlzo.c
> index 4531294..960183d 100644
> --- a/lib/decompress_unlzo.c
> +++ b/lib/decompress_unlzo.c
> @@ -31,7 +31,7 @@
>   */
>
>  #ifdef STATIC
> -#include "lzo/lzo1x_decompress.c"
> +#include "lzo/lzo1x_decompress_safe.c"
>  #else
>  #include <linux/decompress/unlzo.h>
>  #endif

-- 
Markus Oberhumer, <markus@oberhumer.com>, http://www.oberhumer.com/

* RE: [PATCH 0/3] Update LZO compression
From: Dan Magenheimer @ 2012-12-21 2:03 UTC
To: Markus F.X.J. Oberhumer, Arnd Bergmann
Cc: Stephen Rothwell, Andrew Morton, linux-kernel, Andi Kleen, Johannes Stezenbach, richard -rw- weinberger, linux-btrfs, linux-crypto, Artem Bityutskiy, Adrian Hunter, David Woodhouse, Phillip Lougher, Dan Carpenter

> From: Markus F.X.J. Oberhumer [mailto:markus@oberhumer.com]
> Subject: Re: [PATCH 0/3] Update LZO compression
>
> Thanks Arnd,
>
> On 2012-10-11 13:41, Arnd Bergmann wrote:
> > On Tuesday 09 October 2012, Markus F.X.J. Oberhumer wrote:
> >>>
> >>> : This commit updates the kernel LZO code to the current upstream version
> >>> : which features a significant speed improvement - benchmarking the Calgary
> >>> : and Silesia test corpora typically shows a doubled performance in
> >>> : both compression and decompression on modern i386/x86_64/powerpc machines.
> >>>
> >>> There are significant clients of the LZO library - crypto, btrfs,
> >>> jffs2, ubifs, squashfs and zcache. So let's give all those people a cc
> >>> and ask that they test the LZO changes once they land in linux-next.
> >>> For correctness and performance, please.
> >>
> >> The core compression and decompression code has been thoroughly tested, so I
> >> do not expect major problems.
> >>
> >> Good testing after the merge and feedback about build or performance issues
> >> (and improvements!) is highly appreciated.
> >
> > The addition of the lzo tree to linux-next caused this problem for ARM
> > imx_v6_v7_defconfig:
> >
> > In file included from /home/arnd/linux-arm/arch/arm/boot/compressed/decompress.c:40:0:
> > /home/arnd/linux-arm/arch/arm/boot/compressed/../../../../lib/decompress_unlzo.c:34:34: fatal error:
> lzo/lzo1x_decompress.c: No such file or directory
> >
> > Since the file was renamed, anything including it needs to be updated to the
> > new file name.
>
> I will add that patch to my tree.
>
> Cheers,
> Markus

Sorry if I missed it (bad connectivity this week), but is someone going to
send a pull request to get this LZO update from linux-next into Linus's
tree? The window is closing soon, isn't it?

Dan