* [PATCH v6 01/10] crc32-remove-trailing-whitespace.diff
       [not found] <20110831213729.395283830@systemfabricworks.com>
@ 2011-08-31 22:29 ` Bob Pearson
  2011-08-31 22:29 ` [PATCH v6 02/10] crc32-move-to-documentation.diff Bob Pearson
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 28+ messages in thread
From: Bob Pearson @ 2011-08-31 22:29 UTC (permalink / raw)
  To: linux-kernel; +Cc: fzago, rpearson, Joakim Tjernlund, George Spelvin, akpm

Removed two instances of trailing whitespace:
	- remove trailing whitespace from lib/crc32.c
	- remove trailing whitespace from lib/crc32defs.h

Signed-off-by: Bob Pearson <rpearson@systemfabricworks.com>

---
 lib/crc32.c     |    2 +-
 lib/crc32defs.h |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

Index: for-next/lib/crc32.c
===================================================================
--- for-next.orig/lib/crc32.c
+++ for-next/lib/crc32.c
@@ -316,7 +316,7 @@ EXPORT_SYMBOL(crc32_be);
  * in the correct multiple to subtract, we can shift a byte at a time.
  * This produces a 40-bit (rather than a 33-bit) intermediate remainder,
  * but again the multiple of the polynomial to subtract depends only on
- * the high bits, the high 8 bits in this case.  
+ * the high bits, the high 8 bits in this case.
  *
  * The multiple we need in that case is the low 32 bits of a 40-bit
  * value whose high 8 bits are given, and which is a multiple of the
Index: for-next/lib/crc32defs.h
===================================================================
--- for-next.orig/lib/crc32defs.h
+++ for-next/lib/crc32defs.h
@@ -8,7 +8,7 @@
 
 /* How many bits at a time to use.  Requires a table of 4<<CRC_xx_BITS bytes. */
 /* For less performance-sensitive, use 4 */
-#ifndef CRC_LE_BITS 
+#ifndef CRC_LE_BITS
 # define CRC_LE_BITS 8
 #endif
 #ifndef CRC_BE_BITS




* [PATCH v6 02/10] crc32-move-to-documentation.diff
       [not found] <20110831213729.395283830@systemfabricworks.com>
  2011-08-31 22:29 ` [PATCH v6 01/10] crc32-remove-trailing-whitespace.diff Bob Pearson
@ 2011-08-31 22:29 ` Bob Pearson
  2011-08-31 22:29 ` [PATCH v6 03/10] crc32-replace-self-test.diff Bob Pearson
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 28+ messages in thread
From: Bob Pearson @ 2011-08-31 22:29 UTC (permalink / raw)
  To: linux-kernel; +Cc: fzago, rpearson, Joakim Tjernlund, George Spelvin, akpm

Moved a long comment from lib/crc32.c to Documentation/crc32.txt,
where it is more likely to be read.
	- Edited the resulting document to add an explanation of the slicing-by-n
	  algorithm.

Signed-off-by: George Spelvin <linux@horizon.com>
Signed-off-by: Bob Pearson <rpearson@systemfabricworks.com>

---
 Documentation/00-INDEX  |    2 
 Documentation/crc32.txt |  183 ++++++++++++++++++++++++++++++++++++++++++++++++
 lib/crc32.c             |  127 ---------------------------------
 3 files changed, 185 insertions(+), 127 deletions(-)

Index: for-next/lib/crc32.c
===================================================================
--- for-next.orig/lib/crc32.c
+++ for-next/lib/crc32.c
@@ -20,6 +20,8 @@
  * Version 2.  See the file COPYING for more details.
  */
 
+/* see: Documentation/crc32.txt for a description of algorithms */
+
 #include <linux/crc32.h>
 #include <linux/kernel.h>
 #include <linux/module.h>
@@ -208,133 +210,6 @@ u32 __pure crc32_be(u32 crc, unsigned ch
 EXPORT_SYMBOL(crc32_le);
 EXPORT_SYMBOL(crc32_be);
 
-/*
- * A brief CRC tutorial.
- *
- * A CRC is a long-division remainder.  You add the CRC to the message,
- * and the whole thing (message+CRC) is a multiple of the given
- * CRC polynomial.  To check the CRC, you can either check that the
- * CRC matches the recomputed value, *or* you can check that the
- * remainder computed on the message+CRC is 0.  This latter approach
- * is used by a lot of hardware implementations, and is why so many
- * protocols put the end-of-frame flag after the CRC.
- *
- * It's actually the same long division you learned in school, except that
- * - We're working in binary, so the digits are only 0 and 1, and
- * - When dividing polynomials, there are no carries.  Rather than add and
- *   subtract, we just xor.  Thus, we tend to get a bit sloppy about
- *   the difference between adding and subtracting.
- *
- * A 32-bit CRC polynomial is actually 33 bits long.  But since it's
- * 33 bits long, bit 32 is always going to be set, so usually the CRC
- * is written in hex with the most significant bit omitted.  (If you're
- * familiar with the IEEE 754 floating-point format, it's the same idea.)
- *
- * Note that a CRC is computed over a string of *bits*, so you have
- * to decide on the endianness of the bits within each byte.  To get
- * the best error-detecting properties, this should correspond to the
- * order they're actually sent.  For example, standard RS-232 serial is
- * little-endian; the most significant bit (sometimes used for parity)
- * is sent last.  And when appending a CRC word to a message, you should
- * do it in the right order, matching the endianness.
- *
- * Just like with ordinary division, the remainder is always smaller than
- * the divisor (the CRC polynomial) you're dividing by.  Each step of the
- * division, you take one more digit (bit) of the dividend and append it
- * to the current remainder.  Then you figure out the appropriate multiple
- * of the divisor to subtract to being the remainder back into range.
- * In binary, it's easy - it has to be either 0 or 1, and to make the
- * XOR cancel, it's just a copy of bit 32 of the remainder.
- *
- * When computing a CRC, we don't care about the quotient, so we can
- * throw the quotient bit away, but subtract the appropriate multiple of
- * the polynomial from the remainder and we're back to where we started,
- * ready to process the next bit.
- *
- * A big-endian CRC written this way would be coded like:
- * for (i = 0; i < input_bits; i++) {
- * 	multiple = remainder & 0x80000000 ? CRCPOLY : 0;
- * 	remainder = (remainder << 1 | next_input_bit()) ^ multiple;
- * }
- * Notice how, to get at bit 32 of the shifted remainder, we look
- * at bit 31 of the remainder *before* shifting it.
- *
- * But also notice how the next_input_bit() bits we're shifting into
- * the remainder don't actually affect any decision-making until
- * 32 bits later.  Thus, the first 32 cycles of this are pretty boring.
- * Also, to add the CRC to a message, we need a 32-bit-long hole for it at
- * the end, so we have to add 32 extra cycles shifting in zeros at the
- * end of every message,
- *
- * So the standard trick is to rearrage merging in the next_input_bit()
- * until the moment it's needed.  Then the first 32 cycles can be precomputed,
- * and merging in the final 32 zero bits to make room for the CRC can be
- * skipped entirely.
- * This changes the code to:
- * for (i = 0; i < input_bits; i++) {
- *      remainder ^= next_input_bit() << 31;
- * 	multiple = (remainder & 0x80000000) ? CRCPOLY : 0;
- * 	remainder = (remainder << 1) ^ multiple;
- * }
- * With this optimization, the little-endian code is simpler:
- * for (i = 0; i < input_bits; i++) {
- *      remainder ^= next_input_bit();
- * 	multiple = (remainder & 1) ? CRCPOLY : 0;
- * 	remainder = (remainder >> 1) ^ multiple;
- * }
- *
- * Note that the other details of endianness have been hidden in CRCPOLY
- * (which must be bit-reversed) and next_input_bit().
- *
- * However, as long as next_input_bit is returning the bits in a sensible
- * order, we can actually do the merging 8 or more bits at a time rather
- * than one bit at a time:
- * for (i = 0; i < input_bytes; i++) {
- * 	remainder ^= next_input_byte() << 24;
- * 	for (j = 0; j < 8; j++) {
- * 		multiple = (remainder & 0x80000000) ? CRCPOLY : 0;
- * 		remainder = (remainder << 1) ^ multiple;
- * 	}
- * }
- * Or in little-endian:
- * for (i = 0; i < input_bytes; i++) {
- * 	remainder ^= next_input_byte();
- * 	for (j = 0; j < 8; j++) {
- * 		multiple = (remainder & 1) ? CRCPOLY : 0;
- * 		remainder = (remainder << 1) ^ multiple;
- * 	}
- * }
- * If the input is a multiple of 32 bits, you can even XOR in a 32-bit
- * word at a time and increase the inner loop count to 32.
- *
- * You can also mix and match the two loop styles, for example doing the
- * bulk of a message byte-at-a-time and adding bit-at-a-time processing
- * for any fractional bytes at the end.
- *
- * The only remaining optimization is to the byte-at-a-time table method.
- * Here, rather than just shifting one bit of the remainder to decide
- * in the correct multiple to subtract, we can shift a byte at a time.
- * This produces a 40-bit (rather than a 33-bit) intermediate remainder,
- * but again the multiple of the polynomial to subtract depends only on
- * the high bits, the high 8 bits in this case.
- *
- * The multiple we need in that case is the low 32 bits of a 40-bit
- * value whose high 8 bits are given, and which is a multiple of the
- * generator polynomial.  This is simply the CRC-32 of the given
- * one-byte message.
- *
- * Two more details: normally, appending zero bits to a message which
- * is already a multiple of a polynomial produces a larger multiple of that
- * polynomial.  To enable a CRC to detect this condition, it's common to
- * invert the CRC before appending it.  This makes the remainder of the
- * message+crc come out not as zero, but some fixed non-zero value.
- *
- * The same problem applies to zero bits prepended to the message, and
- * a similar solution is used.  Instead of starting with a remainder of
- * 0, an initial remainder of all ones is used.  As long as you start
- * the same way on decoding, it doesn't make a difference.
- */
-
 #ifdef UNITTEST
 
 #include <stdlib.h>
Index: for-next/Documentation/crc32.txt
===================================================================
--- /dev/null
+++ for-next/Documentation/crc32.txt
@@ -0,0 +1,183 @@
+A brief CRC tutorial.
+
+A CRC is a long-division remainder.  You add the CRC to the message,
+and the whole thing (message+CRC) is a multiple of the given
+CRC polynomial.  To check the CRC, you can either check that the
+CRC matches the recomputed value, *or* you can check that the
+remainder computed on the message+CRC is 0.  This latter approach
+is used by a lot of hardware implementations, and is why so many
+protocols put the end-of-frame flag after the CRC.
+
+It's actually the same long division you learned in school, except that
+- We're working in binary, so the digits are only 0 and 1, and
+- When dividing polynomials, there are no carries.  Rather than add and
+  subtract, we just xor.  Thus, we tend to get a bit sloppy about
+  the difference between adding and subtracting.
+
+Like all division, the remainder is always smaller than the divisor.
+To produce a 32-bit CRC, the divisor is actually a 33-bit CRC polynomial.
+Since it's 33 bits long, bit 32 is always going to be set, so usually the
+CRC is written in hex with the most significant bit omitted.  (If you're
+familiar with the IEEE 754 floating-point format, it's the same idea.)
+
+Note that a CRC is computed over a string of *bits*, so you have
+to decide on the endianness of the bits within each byte.  To get
+the best error-detecting properties, this should correspond to the
+order they're actually sent.  For example, standard RS-232 serial is
+little-endian; the most significant bit (sometimes used for parity)
+is sent last.  And when appending a CRC word to a message, you should
+do it in the right order, matching the endianness.
+
+Just like with ordinary division, you proceed one digit (bit) at a time.
+Each step of the division, you take one more digit (bit) of the
+dividend and append it to the current remainder.  Then you figure out the
+appropriate multiple of the divisor to subtract to bring the remainder
+back into range.  In binary, this is easy - it has to be either 0 or 1,
+and to make the XOR cancel, it's just a copy of bit 32 of the remainder.
+
+When computing a CRC, we don't care about the quotient, so we can
+throw the quotient bit away, but subtract the appropriate multiple of
+the polynomial from the remainder and we're back to where we started,
+ready to process the next bit.
+
+A big-endian CRC written this way would be coded like:
+for (i = 0; i < input_bits; i++) {
+	multiple = remainder & 0x80000000 ? CRCPOLY : 0;
+	remainder = (remainder << 1 | next_input_bit()) ^ multiple;
+}
+
+Notice how, to get at bit 32 of the shifted remainder, we look
+at bit 31 of the remainder *before* shifting it.
+
+But also notice how the next_input_bit() bits we're shifting into
+the remainder don't actually affect any decision-making until
+32 bits later.  Thus, the first 32 cycles of this are pretty boring.
+Also, to add the CRC to a message, we need a 32-bit-long hole for it at
+the end, so we have to add 32 extra cycles shifting in zeros at the
+end of every message.
+
+These details lead to a standard trick: delay merging in the
+next_input_bit() until the moment it's needed.  Then the first 32 cycles
+can be precomputed, and merging in the final 32 zero bits to make room
+for the CRC can be skipped entirely.  This changes the code to:
+
+for (i = 0; i < input_bits; i++) {
+	remainder ^= next_input_bit() << 31;
+	multiple = (remainder & 0x80000000) ? CRCPOLY : 0;
+	remainder = (remainder << 1) ^ multiple;
+}
+
+With this optimization, the little-endian code is particularly simple:
+for (i = 0; i < input_bits; i++) {
+	remainder ^= next_input_bit();
+	multiple = (remainder & 1) ? CRCPOLY : 0;
+	remainder = (remainder >> 1) ^ multiple;
+}
+
+The most significant coefficient of the remainder polynomial is stored
+in the least significant bit of the binary "remainder" variable.
+The other details of endianness have been hidden in CRCPOLY (which must
+be bit-reversed) and next_input_bit().
+
+As long as next_input_bit is returning the bits in a sensible order, we don't
+*have* to wait until the last possible moment to merge in additional bits.
+We can do it 8 bits at a time rather than 1 bit at a time:
+for (i = 0; i < input_bytes; i++) {
+	remainder ^= next_input_byte() << 24;
+	for (j = 0; j < 8; j++) {
+		multiple = (remainder & 0x80000000) ? CRCPOLY : 0;
+		remainder = (remainder << 1) ^ multiple;
+	}
+}
+
+Or in little-endian:
+for (i = 0; i < input_bytes; i++) {
+	remainder ^= next_input_byte();
+	for (j = 0; j < 8; j++) {
+		multiple = (remainder & 1) ? CRCPOLY : 0;
+		remainder = (remainder >> 1) ^ multiple;
+	}
+}
+
+If the input is a multiple of 32 bits, you can even XOR in a 32-bit
+word at a time and increase the inner loop count to 32.
+
+You can also mix and match the two loop styles, for example doing the
+bulk of a message byte-at-a-time and adding bit-at-a-time processing
+for any fractional bytes at the end.
+
+To reduce the number of conditional branches, software commonly uses
+the byte-at-a-time table method, popularized by Dilip V. Sarwate,
+"Computation of Cyclic Redundancy Checks via Table Look-Up", Comm. ACM
+v.31 no.8 (August 1988) p. 1008-1013.
+
+Here, rather than just shifting one bit of the remainder to decide
+the correct multiple to subtract, we can shift a byte at a time.
+This produces a 40-bit (rather than a 33-bit) intermediate remainder,
+and the correct multiple of the polynomial to subtract is found using
+a 256-entry lookup table indexed by the high 8 bits.
+
+(The table entries are simply the CRC-32 of the given one-byte messages.)
+
+When space is more constrained, smaller tables can be used, e.g. two
+4-bit shifts followed by a lookup in a 16-entry table.
+
+It is not practical to process much more than 8 bits at a time using this
+technique, because tables larger than 256 entries use too much memory and,
+more importantly, too much of the L1 cache.
+
+To get higher software performance, a "slicing" technique can be used.
+See "High Octane CRC Generation with the Intel Slicing-by-8 Algorithm",
+ftp://download.intel.com/technology/comms/perfnet/download/slicing-by-8.pdf
+
+This does not change the number of table lookups, but does increase
+the parallelism.  With the classic Sarwate algorithm, each table lookup
+must be completed before the index of the next can be computed.
+
+A "slicing by 2" technique would shift the remainder 16 bits at a time,
+producing a 48-bit intermediate remainder.  Rather than doing a single
+lookup in a 65536-entry table, the two high bytes are looked up in
+two different 256-entry tables.  Each contains the remainder required
+to cancel out the corresponding byte.  The tables are different because the
+polynomials to cancel are different.  One has non-zero coefficients from
+x^32 to x^39, while the other goes from x^40 to x^47.
+
+Since modern processors can handle many parallel memory operations, this
+takes barely longer than a single table look-up and thus performs almost
+twice as fast as the basic Sarwate algorithm.
+
+This can be extended to "slicing by 4" using 4 256-entry tables.
+Each step, 32 bits of data is fetched, XORed with the CRC, and the result
+broken into bytes and looked up in the tables.  Because the 32-bit shift
+leaves the low-order bits of the intermediate remainder zero, the
+final CRC is simply the XOR of the 4 table look-ups.
+
+But this still enforces sequential execution: a second group of table
+look-ups cannot begin until the previous group's 4 table look-ups have all
+been completed.  Thus, the processor's load/store unit is sometimes idle.
+
+To make maximum use of the processor, "slicing by 8" performs 8 look-ups
+in parallel.  Each step, the 32-bit CRC is shifted 64 bits and XORed
+with 64 bits of input data.  What is important to note is that 4 of
+those 8 bytes are simply copies of the input data; they do not depend
+on the previous CRC at all.  Thus, those 4 table look-ups may commence
+immediately, without waiting for the previous loop iteration.
+
+By always having 4 loads in flight, a modern superscalar processor can
+be kept busy and make full use of its L1 cache.
+
+Two more details about CRC implementation in the real world:
+
+Normally, appending zero bits to a message which is already a multiple
+of a polynomial produces a larger multiple of that polynomial.  Thus,
+a basic CRC will not detect appended zero bits (or bytes).  To enable
+a CRC to detect this condition, it's common to invert the CRC before
+appending it.  This makes the remainder of the message+crc come out not
+as zero, but some fixed non-zero value.  (The CRC of the inversion
+pattern, 0xffffffff.)
+
+The same problem applies to zero bits prepended to the message, and a
+similar solution is used.  Instead of starting the CRC computation with
+a remainder of 0, an initial remainder of all ones is used.  As long as
+you start the same way on decoding, it doesn't make a difference.
+
Index: for-next/Documentation/00-INDEX
===================================================================
--- for-next.orig/Documentation/00-INDEX
+++ for-next/Documentation/00-INDEX
@@ -104,6 +104,8 @@ cpuidle/
 	- info on CPU_IDLE, CPU idle state management subsystem.
 cputopology.txt
 	- documentation on how CPU topology info is exported via sysfs.
+crc32.txt
+	- brief tutorial on CRC computation
 cris/
 	- directory with info about Linux on CRIS architecture.
 crypto/
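
For readers who want to experiment with the algorithms described in the
Documentation/crc32.txt text above, here is a minimal user-space sketch of
the little-endian variants: bit-at-a-time, the byte-at-a-time table method,
and slicing by 4. It is not the code in lib/crc32.c; the function names, the
table layout and the use of <stdint.h> types are assumptions made for
illustration. Input words are assembled byte by byte so that the result does
not depend on host endianness.

```c
#include <stddef.h>
#include <stdint.h>

#define CRCPOLY_LE 0xedb88320u	/* bit-reversed CRC-32 polynomial */

/* Bit-at-a-time little-endian CRC: XOR in a byte, then 8 shift/XOR steps. */
uint32_t crc32_le_bitwise(uint32_t crc, const unsigned char *p, size_t len)
{
	while (len--) {
		crc ^= *p++;
		for (int j = 0; j < 8; j++)
			crc = (crc >> 1) ^ ((crc & 1) ? CRCPOLY_LE : 0);
	}
	return crc;
}

static uint32_t tab[4][256];

/*
 * tab[0][i] is the remainder of the one-byte message i (the Sarwate table).
 * tab[k][i] is that same remainder advanced by k further zero bytes.
 */
void crc32_le_init(void)
{
	for (uint32_t i = 0; i < 256; i++) {
		uint32_t crc = i;

		for (int j = 0; j < 8; j++)
			crc = (crc >> 1) ^ ((crc & 1) ? CRCPOLY_LE : 0);
		tab[0][i] = crc;
	}
	for (int k = 1; k < 4; k++)
		for (int i = 0; i < 256; i++)
			tab[k][i] = (tab[k - 1][i] >> 8) ^
				    tab[0][tab[k - 1][i] & 0xff];
}

/* Byte-at-a-time table method: one dependent lookup per input byte. */
uint32_t crc32_le_table(uint32_t crc, const unsigned char *p, size_t len)
{
	while (len--)
		crc = (crc >> 8) ^ tab[0][(crc ^ *p++) & 0xff];
	return crc;
}

/* Slicing by 4: four independent lookups per 32 bits of input. */
uint32_t crc32_le_slice4(uint32_t crc, const unsigned char *p, size_t len)
{
	for (; len >= 4; len -= 4, p += 4) {
		crc ^= (uint32_t)p[0] | (uint32_t)p[1] << 8 |
		       (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
		crc = tab[3][crc & 0xff] ^ tab[2][(crc >> 8) & 0xff] ^
		      tab[1][(crc >> 16) & 0xff] ^ tab[0][crc >> 24];
	}
	return crc32_le_table(crc, p, len);	/* trailing 0..3 bytes */
}
```

Call crc32_le_init() once before using the table-driven functions. With an
initial remainder of all ones and a final inversion, as the tutorial
describes, all three functions should agree on any input; for the 9-byte
ASCII message "123456789" the result is the well-known CRC-32 check value
0xcbf43926.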




* [PATCH v6 03/10] crc32-replace-self-test.diff
       [not found] <20110831213729.395283830@systemfabricworks.com>
  2011-08-31 22:29 ` [PATCH v6 01/10] crc32-remove-trailing-whitespace.diff Bob Pearson
  2011-08-31 22:29 ` [PATCH v6 02/10] crc32-move-to-documentation.diff Bob Pearson
@ 2011-08-31 22:29 ` Bob Pearson
  2011-09-02 23:51   ` Andrew Morton
  2011-08-31 22:30 ` [PATCH v6 04/10] crc32-add-pointer-to-tab.diff Bob Pearson
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 28+ messages in thread
From: Bob Pearson @ 2011-08-31 22:29 UTC (permalink / raw)
  To: linux-kernel; +Cc: fzago, rpearson, Joakim Tjernlund, George Spelvin, akpm

Replaced the unit test provided in crc32.c, which doesn't have a
makefile and doesn't compile with current headers, with a simpler
self test routine that also gives a measure of performance and
runs at module init time. The self test option can be enabled
through a configuration option CONFIG_CRC32_SELFTEST.

The test stresses the pre and post loops and is thus not very
realistic since actual uses will likely have addresses and lengths
that are at least 4 byte aligned. However, the main loop is long
enough so that the performance is dominated by that loop.

The expected values for crc32_le and crc32_be were generated
with the original version of crc32.c using CRC_LE_BITS = 8 and
CRC_BE_BITS = 8. These values were then used to check all the
values of the BITS parameters in both the original and new versions.

The performance results show some variability from run to run
in spite of attempts to both warm the cache and reduce the amount
of OS noise by limiting interrupts during the test. To get comparable
results and to analyse options with respect to performance, the best
time reported over a small sample of runs has been taken.
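
The approach described above can be illustrated with a small user-space
sketch. This is not the self test added by this patch: the function names,
the LCG used to fill the buffer, the byte-at-a-time "fast" path and the use
of clock() are all assumptions chosen to keep the example short. It checks a
table-driven CRC against a bit-at-a-time reference over every combination of
start offset and trailing length from 0 to 7, and reports the best time over
a few runs.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#define CRCPOLY_LE 0xedb88320u	/* bit-reversed CRC-32 polynomial */

static unsigned char buf[4096];
static uint32_t tab[256];
static int ready;

/* Reference: bit-at-a-time little-endian CRC. */
static uint32_t ref_crc(uint32_t crc, const unsigned char *p, size_t len)
{
	while (len--) {
		crc ^= *p++;
		for (int j = 0; j < 8; j++)
			crc = (crc >> 1) ^ ((crc & 1) ? CRCPOLY_LE : 0);
	}
	return crc;
}

/* Fill the lookup table and the test buffer once. */
static void setup(void)
{
	uint32_t seed = 1;

	if (ready)
		return;
	for (uint32_t i = 0; i < 256; i++) {
		unsigned char b = (unsigned char)i;

		tab[i] = ref_crc(0, &b, 1);
	}
	for (size_t i = 0; i < sizeof(buf); i++) {
		seed = seed * 1664525u + 1013904223u;	/* hypothetical filler */
		buf[i] = (unsigned char)(seed >> 24);
	}
	ready = 1;
}

/* Implementation under test: byte-at-a-time table lookup. */
static uint32_t fast_crc(uint32_t crc, const unsigned char *p, size_t len)
{
	while (len--)
		crc = (crc >> 8) ^ tab[(crc ^ *p++) & 0xff];
	return crc;
}

/* Returns the number of offset/length combinations that disagree. */
int crc32_selftest_sketch(void)
{
	int errors = 0;

	setup();
	for (size_t off = 0; off < 8; off++)
		for (size_t tail = 0; tail < 8; tail++) {
			size_t len = sizeof(buf) - off - tail;

			if (fast_crc(0xffffffffu, buf + off, len) !=
			    ref_crc(0xffffffffu, buf + off, len))
				errors++;
		}
	return errors;
}

/* Best (smallest) CPU time in seconds over the given number of runs. */
double crc32_best_time(int runs)
{
	double best = -1.0;

	setup();
	for (int r = 0; r < runs; r++) {
		clock_t t0 = clock();
		volatile uint32_t sink = fast_crc(0xffffffffu, buf, sizeof(buf));
		double t = (double)(clock() - t0) / CLOCKS_PER_SEC;

		(void)sink;
		if (best < 0.0 || t < best)
			best = t;
	}
	return best;
}
```

Taking the minimum rather than the mean is one way to discard runs that were
disturbed by interrupts or cache misses. A 4096-byte buffer is probably too
small for clock() to resolve on most systems, so a real measurement would
repeat the call many times per run or use a finer timer.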

Signed-off-by: Bob Pearson <rpearson@systemfabricworks.com>

---
 lib/Kconfig |   10 
 lib/crc32.c |  794 +++++++++++++++++++++++++++++++++++++++++++++++++++---------
 2 files changed, 689 insertions(+), 115 deletions(-)

Index: for-next/lib/crc32.c
===================================================================
--- for-next.orig/lib/crc32.c
+++ for-next/lib/crc32.c
@@ -210,137 +210,701 @@ u32 __pure crc32_be(u32 crc, unsigned ch
 EXPORT_SYMBOL(crc32_le);
 EXPORT_SYMBOL(crc32_be);
 
-#ifdef UNITTEST
+#ifdef CONFIG_CRC32_SELFTEST
 
-#include <stdlib.h>
-#include <stdio.h>
-
-#if 0				/*Not used at present */
-static void
-buf_dump(char const *prefix, unsigned char const *buf, size_t len)
+/* 4096 random bytes */
+static u8 __attribute__((__aligned__(8))) test_buf[] =
 {
-	fputs(prefix, stdout);
-	while (len--)
-		printf(" %02x", *buf++);
-	putchar('\n');
-
-}
-#endif
-
-static void bytereverse(unsigned char *buf, size_t len)
+	0x5b, 0x85, 0x21, 0xcb, 0x09, 0x68, 0x7d, 0x30,
+	0xc7, 0x69, 0xd7, 0x30, 0x92, 0xde, 0x59, 0xe4,
+	0xc9, 0x6e, 0x8b, 0xdb, 0x98, 0x6b, 0xaa, 0x60,
+	0xa8, 0xb5, 0xbc, 0x6c, 0xa9, 0xb1, 0x5b, 0x2c,
+	0xea, 0xb4, 0x92, 0x6a, 0x3f, 0x79, 0x91, 0xe4,
+	0xe9, 0x70, 0x51, 0x8c, 0x7f, 0x95, 0x6f, 0x1a,
+	0x56, 0xa1, 0x5c, 0x27, 0x03, 0x67, 0x9f, 0x3a,
+	0xe2, 0x31, 0x11, 0x29, 0x6b, 0x98, 0xfc, 0xc4,
+	0x53, 0x24, 0xc5, 0x8b, 0xce, 0x47, 0xb2, 0xb9,
+	0x32, 0xcb, 0xc1, 0xd0, 0x03, 0x57, 0x4e, 0xd4,
+	0xe9, 0x3c, 0xa1, 0x63, 0xcf, 0x12, 0x0e, 0xca,
+	0xe1, 0x13, 0xd1, 0x93, 0xa6, 0x88, 0x5c, 0x61,
+	0x5b, 0xbb, 0xf0, 0x19, 0x46, 0xb4, 0xcf, 0x9e,
+	0xb6, 0x6b, 0x4c, 0x3a, 0xcf, 0x60, 0xf9, 0x7a,
+	0x8d, 0x07, 0x63, 0xdb, 0x40, 0xe9, 0x0b, 0x6f,
+	0xad, 0x97, 0xf1, 0xed, 0xd0, 0x1e, 0x26, 0xfd,
+	0xbf, 0xb7, 0xc8, 0x04, 0x94, 0xf8, 0x8b, 0x8c,
+	0xf1, 0xab, 0x7a, 0xd4, 0xdd, 0xf3, 0xe8, 0x88,
+	0xc3, 0xed, 0x17, 0x8a, 0x9b, 0x40, 0x0d, 0x53,
+	0x62, 0x12, 0x03, 0x5f, 0x1b, 0x35, 0x32, 0x1f,
+	0xb4, 0x7b, 0x93, 0x78, 0x0d, 0xdb, 0xce, 0xa4,
+	0xc0, 0x47, 0xd5, 0xbf, 0x68, 0xe8, 0x5d, 0x74,
+	0x8f, 0x8e, 0x75, 0x1c, 0xb2, 0x4f, 0x9a, 0x60,
+	0xd1, 0xbe, 0x10, 0xf4, 0x5c, 0xa1, 0x53, 0x09,
+	0xa5, 0xe0, 0x09, 0x54, 0x85, 0x5c, 0xdc, 0x07,
+	0xe7, 0x21, 0x69, 0x7b, 0x8a, 0xfd, 0x90, 0xf1,
+	0x22, 0xd0, 0xb4, 0x36, 0x28, 0xe6, 0xb8, 0x0f,
+	0x39, 0xde, 0xc8, 0xf3, 0x86, 0x60, 0x34, 0xd2,
+	0x5e, 0xdf, 0xfd, 0xcf, 0x0f, 0xa9, 0x65, 0xf0,
+	0xd5, 0x4d, 0x96, 0x40, 0xe3, 0xdf, 0x3f, 0x95,
+	0x5a, 0x39, 0x19, 0x93, 0xf4, 0x75, 0xce, 0x22,
+	0x00, 0x1c, 0x93, 0xe2, 0x03, 0x66, 0xf4, 0x93,
+	0x73, 0x86, 0x81, 0x8e, 0x29, 0x44, 0x48, 0x86,
+	0x61, 0x7c, 0x48, 0xa3, 0x43, 0xd2, 0x9c, 0x8d,
+	0xd4, 0x95, 0xdd, 0xe1, 0x22, 0x89, 0x3a, 0x40,
+	0x4c, 0x1b, 0x8a, 0x04, 0xa8, 0x09, 0x69, 0x8b,
+	0xea, 0xc6, 0x55, 0x8e, 0x57, 0xe6, 0x64, 0x35,
+	0xf0, 0xc7, 0x16, 0x9f, 0x5d, 0x5e, 0x86, 0x40,
+	0x46, 0xbb, 0xe5, 0x45, 0x88, 0xfe, 0xc9, 0x63,
+	0x15, 0xfb, 0xf5, 0xbd, 0x71, 0x61, 0xeb, 0x7b,
+	0x78, 0x70, 0x07, 0x31, 0x03, 0x9f, 0xb2, 0xc8,
+	0xa7, 0xab, 0x47, 0xfd, 0xdf, 0xa0, 0x78, 0x72,
+	0xa4, 0x2a, 0xe4, 0xb6, 0xba, 0xc0, 0x1e, 0x86,
+	0x71, 0xe6, 0x3d, 0x18, 0x37, 0x70, 0xe6, 0xff,
+	0xe0, 0xbc, 0x0b, 0x22, 0xa0, 0x1f, 0xd3, 0xed,
+	0xa2, 0x55, 0x39, 0xab, 0xa8, 0x13, 0x73, 0x7c,
+	0x3f, 0xb2, 0xd6, 0x19, 0xac, 0xff, 0x99, 0xed,
+	0xe8, 0xe6, 0xa6, 0x22, 0xe3, 0x9c, 0xf1, 0x30,
+	0xdc, 0x01, 0x0a, 0x56, 0xfa, 0xe4, 0xc9, 0x99,
+	0xdd, 0xa8, 0xd8, 0xda, 0x35, 0x51, 0x73, 0xb4,
+	0x40, 0x86, 0x85, 0xdb, 0x5c, 0xd5, 0x85, 0x80,
+	0x14, 0x9c, 0xfd, 0x98, 0xa9, 0x82, 0xc5, 0x37,
+	0xff, 0x32, 0x5d, 0xd0, 0x0b, 0xfa, 0xdc, 0x04,
+	0x5e, 0x09, 0xd2, 0xca, 0x17, 0x4b, 0x1a, 0x8e,
+	0x15, 0xe1, 0xcc, 0x4e, 0x52, 0x88, 0x35, 0xbd,
+	0x48, 0xfe, 0x15, 0xa0, 0x91, 0xfd, 0x7e, 0x6c,
+	0x0e, 0x5d, 0x79, 0x1b, 0x81, 0x79, 0xd2, 0x09,
+	0x34, 0x70, 0x3d, 0x81, 0xec, 0xf6, 0x24, 0xbb,
+	0xfb, 0xf1, 0x7b, 0xdf, 0x54, 0xea, 0x80, 0x9b,
+	0xc7, 0x99, 0x9e, 0xbd, 0x16, 0x78, 0x12, 0x53,
+	0x5e, 0x01, 0xa7, 0x4e, 0xbd, 0x67, 0xe1, 0x9b,
+	0x4c, 0x0e, 0x61, 0x45, 0x97, 0xd2, 0xf0, 0x0f,
+	0xfe, 0x15, 0x08, 0xb7, 0x11, 0x4c, 0xe7, 0xff,
+	0x81, 0x53, 0xff, 0x91, 0x25, 0x38, 0x7e, 0x40,
+	0x94, 0xe5, 0xe0, 0xad, 0xe6, 0xd9, 0x79, 0xb6,
+	0x92, 0xc9, 0xfc, 0xde, 0xc3, 0x1a, 0x23, 0xbb,
+	0xdd, 0xc8, 0x51, 0x0c, 0x3a, 0x72, 0xfa, 0x73,
+	0x6f, 0xb7, 0xee, 0x61, 0x39, 0x03, 0x01, 0x3f,
+	0x7f, 0x94, 0x2e, 0x2e, 0xba, 0x3a, 0xbb, 0xb4,
+	0xfa, 0x6a, 0x17, 0xfe, 0xea, 0xef, 0x5e, 0x66,
+	0x97, 0x3f, 0x32, 0x3d, 0xd7, 0x3e, 0xb1, 0xf1,
+	0x6c, 0x14, 0x4c, 0xfd, 0x37, 0xd3, 0x38, 0x80,
+	0xfb, 0xde, 0xa6, 0x24, 0x1e, 0xc8, 0xca, 0x7f,
+	0x3a, 0x93, 0xd8, 0x8b, 0x18, 0x13, 0xb2, 0xe5,
+	0xe4, 0x93, 0x05, 0x53, 0x4f, 0x84, 0x66, 0xa7,
+	0x58, 0x5c, 0x7b, 0x86, 0x52, 0x6d, 0x0d, 0xce,
+	0xa4, 0x30, 0x7d, 0xb6, 0x18, 0x9f, 0xeb, 0xff,
+	0x22, 0xbb, 0x72, 0x29, 0xb9, 0x44, 0x0b, 0x48,
+	0x1e, 0x84, 0x71, 0x81, 0xe3, 0x6d, 0x73, 0x26,
+	0x92, 0xb4, 0x4d, 0x2a, 0x29, 0xb8, 0x1f, 0x72,
+	0xed, 0xd0, 0xe1, 0x64, 0x77, 0xea, 0x8e, 0x88,
+	0x0f, 0xef, 0x3f, 0xb1, 0x3b, 0xad, 0xf9, 0xc9,
+	0x8b, 0xd0, 0xac, 0xc6, 0xcc, 0xa9, 0x40, 0xcc,
+	0x76, 0xf6, 0x3b, 0x53, 0xb5, 0x88, 0xcb, 0xc8,
+	0x37, 0xf1, 0xa2, 0xba, 0x23, 0x15, 0x99, 0x09,
+	0xcc, 0xe7, 0x7a, 0x3b, 0x37, 0xf7, 0x58, 0xc8,
+	0x46, 0x8c, 0x2b, 0x2f, 0x4e, 0x0e, 0xa6, 0x5c,
+	0xea, 0x85, 0x55, 0xba, 0x02, 0x0e, 0x0e, 0x48,
+	0xbc, 0xe1, 0xb1, 0x01, 0x35, 0x79, 0x13, 0x3d,
+	0x1b, 0xc0, 0x53, 0x68, 0x11, 0xe7, 0x95, 0x0f,
+	0x9d, 0x3f, 0x4c, 0x47, 0x7b, 0x4d, 0x1c, 0xae,
+	0x50, 0x9b, 0xcb, 0xdd, 0x05, 0x8d, 0x9a, 0x97,
+	0xfd, 0x8c, 0xef, 0x0c, 0x1d, 0x67, 0x73, 0xa8,
+	0x28, 0x36, 0xd5, 0xb6, 0x92, 0x33, 0x40, 0x75,
+	0x0b, 0x51, 0xc3, 0x64, 0xba, 0x1d, 0xc2, 0xcc,
+	0xee, 0x7d, 0x54, 0x0f, 0x27, 0x69, 0xa7, 0x27,
+	0x63, 0x30, 0x29, 0xd9, 0xc8, 0x84, 0xd8, 0xdf,
+	0x9f, 0x68, 0x8d, 0x04, 0xca, 0xa6, 0xc5, 0xc7,
+	0x7a, 0x5c, 0xc8, 0xd1, 0xcb, 0x4a, 0xec, 0xd0,
+	0xd8, 0x20, 0x69, 0xc5, 0x17, 0xcd, 0x78, 0xc8,
+	0x75, 0x23, 0x30, 0x69, 0xc9, 0xd4, 0xea, 0x5c,
+	0x4f, 0x6b, 0x86, 0x3f, 0x8b, 0xfe, 0xee, 0x44,
+	0xc9, 0x7c, 0xb7, 0xdd, 0x3e, 0xe5, 0xec, 0x54,
+	0x03, 0x3e, 0xaa, 0x82, 0xc6, 0xdf, 0xb2, 0x38,
+	0x0e, 0x5d, 0xb3, 0x88, 0xd9, 0xd3, 0x69, 0x5f,
+	0x8f, 0x70, 0x8a, 0x7e, 0x11, 0xd9, 0x1e, 0x7b,
+	0x38, 0xf1, 0x42, 0x1a, 0xc0, 0x35, 0xf5, 0xc7,
+	0x36, 0x85, 0xf5, 0xf7, 0xb8, 0x7e, 0xc7, 0xef,
+	0x18, 0xf1, 0x63, 0xd6, 0x7a, 0xc6, 0xc9, 0x0e,
+	0x4d, 0x69, 0x4f, 0x84, 0xef, 0x26, 0x41, 0x0c,
+	0xec, 0xc7, 0xe0, 0x7e, 0x3c, 0x67, 0x01, 0x4c,
+	0x62, 0x1a, 0x20, 0x6f, 0xee, 0x47, 0x4d, 0xc0,
+	0x99, 0x13, 0x8d, 0x91, 0x4a, 0x26, 0xd4, 0x37,
+	0x28, 0x90, 0x58, 0x75, 0x66, 0x2b, 0x0a, 0xdf,
+	0xda, 0xee, 0x92, 0x25, 0x90, 0x62, 0x39, 0x9e,
+	0x44, 0x98, 0xad, 0xc1, 0x88, 0xed, 0xe4, 0xb4,
+	0xaf, 0xf5, 0x8c, 0x9b, 0x48, 0x4d, 0x56, 0x60,
+	0x97, 0x0f, 0x61, 0x59, 0x9e, 0xa6, 0x27, 0xfe,
+	0xc1, 0x91, 0x15, 0x38, 0xb8, 0x0f, 0xae, 0x61,
+	0x7d, 0x26, 0x13, 0x5a, 0x73, 0xff, 0x1c, 0xa3,
+	0x61, 0x04, 0x58, 0x48, 0x55, 0x44, 0x11, 0xfe,
+	0x15, 0xca, 0xc3, 0xbd, 0xca, 0xc5, 0xb4, 0x40,
+	0x5d, 0x1b, 0x7f, 0x39, 0xb5, 0x9c, 0x35, 0xec,
+	0x61, 0x15, 0x32, 0x32, 0xb8, 0x4e, 0x40, 0x9f,
+	0x17, 0x1f, 0x0a, 0x4d, 0xa9, 0x91, 0xef, 0xb7,
+	0xb0, 0xeb, 0xc2, 0x83, 0x9a, 0x6c, 0xd2, 0x79,
+	0x43, 0x78, 0x5e, 0x2f, 0xe5, 0xdd, 0x1a, 0x3c,
+	0x45, 0xab, 0x29, 0x40, 0x3a, 0x37, 0x5b, 0x6f,
+	0xd7, 0xfc, 0x48, 0x64, 0x3c, 0x49, 0xfb, 0x21,
+	0xbe, 0xc3, 0xff, 0x07, 0xfb, 0x17, 0xe9, 0xc9,
+	0x0c, 0x4c, 0x5c, 0x15, 0x9e, 0x8e, 0x22, 0x30,
+	0x0a, 0xde, 0x48, 0x7f, 0xdb, 0x0d, 0xd1, 0x2b,
+	0x87, 0x38, 0x9e, 0xcc, 0x5a, 0x01, 0x16, 0xee,
+	0x75, 0x49, 0x0d, 0x30, 0x01, 0x34, 0x6a, 0xb6,
+	0x9a, 0x5a, 0x2a, 0xec, 0xbb, 0x48, 0xac, 0xd3,
+	0x77, 0x83, 0xd8, 0x08, 0x86, 0x4f, 0x48, 0x09,
+	0x29, 0x41, 0x79, 0xa1, 0x03, 0x12, 0xc4, 0xcd,
+	0x90, 0x55, 0x47, 0x66, 0x74, 0x9a, 0xcc, 0x4f,
+	0x35, 0x8c, 0xd6, 0x98, 0xef, 0xeb, 0x45, 0xb9,
+	0x9a, 0x26, 0x2f, 0x39, 0xa5, 0x70, 0x6d, 0xfc,
+	0xb4, 0x51, 0xee, 0xf4, 0x9c, 0xe7, 0x38, 0x59,
+	0xad, 0xf4, 0xbc, 0x46, 0xff, 0x46, 0x8e, 0x60,
+	0x9c, 0xa3, 0x60, 0x1d, 0xf8, 0x26, 0x72, 0xf5,
+	0x72, 0x9d, 0x68, 0x80, 0x04, 0xf6, 0x0b, 0xa1,
+	0x0a, 0xd5, 0xa7, 0x82, 0x3a, 0x3e, 0x47, 0xa8,
+	0x5a, 0xde, 0x59, 0x4f, 0x7b, 0x07, 0xb3, 0xe9,
+	0x24, 0x19, 0x3d, 0x34, 0x05, 0xec, 0xf1, 0xab,
+	0x6e, 0x64, 0x8f, 0xd3, 0xe6, 0x41, 0x86, 0x80,
+	0x70, 0xe3, 0x8d, 0x60, 0x9c, 0x34, 0x25, 0x01,
+	0x07, 0x4d, 0x19, 0x41, 0x4e, 0x3d, 0x5c, 0x7e,
+	0xa8, 0xf5, 0xcc, 0xd5, 0x7b, 0xe2, 0x7d, 0x3d,
+	0x49, 0x86, 0x7d, 0x07, 0xb7, 0x10, 0xe3, 0x35,
+	0xb8, 0x84, 0x6d, 0x76, 0xab, 0x17, 0xc6, 0x38,
+	0xb4, 0xd3, 0x28, 0x57, 0xad, 0xd3, 0x88, 0x5a,
+	0xda, 0xea, 0xc8, 0x94, 0xcc, 0x37, 0x19, 0xac,
+	0x9c, 0x9f, 0x4b, 0x00, 0x15, 0xc0, 0xc8, 0xca,
+	0x1f, 0x15, 0xaa, 0xe0, 0xdb, 0xf9, 0x2f, 0x57,
+	0x1b, 0x24, 0xc7, 0x6f, 0x76, 0x29, 0xfb, 0xed,
+	0x25, 0x0d, 0xc0, 0xfe, 0xbd, 0x5a, 0xbf, 0x20,
+	0x08, 0x51, 0x05, 0xec, 0x71, 0xa3, 0xbf, 0xef,
+	0x5e, 0x99, 0x75, 0xdb, 0x3c, 0x5f, 0x9a, 0x8c,
+	0xbb, 0x19, 0x5c, 0x0e, 0x93, 0x19, 0xf8, 0x6a,
+	0xbc, 0xf2, 0x12, 0x54, 0x2f, 0xcb, 0x28, 0x64,
+	0x88, 0xb3, 0x92, 0x0d, 0x96, 0xd1, 0xa6, 0xe4,
+	0x1f, 0xf1, 0x4d, 0xa4, 0xab, 0x1c, 0xee, 0x54,
+	0xf2, 0xad, 0x29, 0x6d, 0x32, 0x37, 0xb2, 0x16,
+	0x77, 0x5c, 0xdc, 0x2e, 0x54, 0xec, 0x75, 0x26,
+	0xc6, 0x36, 0xd9, 0x17, 0x2c, 0xf1, 0x7a, 0xdc,
+	0x4b, 0xf1, 0xe2, 0xd9, 0x95, 0xba, 0xac, 0x87,
+	0xc1, 0xf3, 0x8e, 0x58, 0x08, 0xd8, 0x87, 0x60,
+	0xc9, 0xee, 0x6a, 0xde, 0xa4, 0xd2, 0xfc, 0x0d,
+	0xe5, 0x36, 0xc4, 0x5c, 0x52, 0xb3, 0x07, 0x54,
+	0x65, 0x24, 0xc1, 0xb1, 0xd1, 0xb1, 0x53, 0x13,
+	0x31, 0x79, 0x7f, 0x05, 0x76, 0xeb, 0x37, 0x59,
+	0x15, 0x2b, 0xd1, 0x3f, 0xac, 0x08, 0x97, 0xeb,
+	0x91, 0x98, 0xdf, 0x6c, 0x09, 0x0d, 0x04, 0x9f,
+	0xdc, 0x3b, 0x0e, 0x60, 0x68, 0x47, 0x23, 0x15,
+	0x16, 0xc6, 0x0b, 0x35, 0xf8, 0x77, 0xa2, 0x78,
+	0x50, 0xd4, 0x64, 0x22, 0x33, 0xff, 0xfb, 0x93,
+	0x71, 0x46, 0x50, 0x39, 0x1b, 0x9c, 0xea, 0x4e,
+	0x8d, 0x0c, 0x37, 0xe5, 0x5c, 0x51, 0x3a, 0x31,
+	0xb2, 0x85, 0x84, 0x3f, 0x41, 0xee, 0xa2, 0xc1,
+	0xc6, 0x13, 0x3b, 0x54, 0x28, 0xd2, 0x18, 0x37,
+	0xcc, 0x46, 0x9f, 0x6a, 0x91, 0x3d, 0x5a, 0x15,
+	0x3c, 0x89, 0xa3, 0x61, 0x06, 0x7d, 0x2e, 0x78,
+	0xbe, 0x7d, 0x40, 0xba, 0x2f, 0x95, 0xb1, 0x2f,
+	0x87, 0x3b, 0x8a, 0xbe, 0x6a, 0xf4, 0xc2, 0x31,
+	0x74, 0xee, 0x91, 0xe0, 0x23, 0xaa, 0x5d, 0x7f,
+	0xdd, 0xf0, 0x44, 0x8c, 0x0b, 0x59, 0x2b, 0xfc,
+	0x48, 0x3a, 0xdf, 0x07, 0x05, 0x38, 0x6c, 0xc9,
+	0xeb, 0x18, 0x24, 0x68, 0x8d, 0x58, 0x98, 0xd3,
+	0x31, 0xa3, 0xe4, 0x70, 0x59, 0xb1, 0x21, 0xbe,
+	0x7e, 0x65, 0x7d, 0xb8, 0x04, 0xab, 0xf6, 0xe4,
+	0xd7, 0xda, 0xec, 0x09, 0x8f, 0xda, 0x6d, 0x24,
+	0x07, 0xcc, 0x29, 0x17, 0x05, 0x78, 0x1a, 0xc1,
+	0xb1, 0xce, 0xfc, 0xaa, 0x2d, 0xe7, 0xcc, 0x85,
+	0x84, 0x84, 0x03, 0x2a, 0x0c, 0x3f, 0xa9, 0xf8,
+	0xfd, 0x84, 0x53, 0x59, 0x5c, 0xf0, 0xd4, 0x09,
+	0xf0, 0xd2, 0x6c, 0x32, 0x03, 0xb0, 0xa0, 0x8c,
+	0x52, 0xeb, 0x23, 0x91, 0x88, 0x43, 0x13, 0x46,
+	0xf6, 0x1e, 0xb4, 0x1b, 0xf5, 0x8e, 0x3a, 0xb5,
+	0x3d, 0x00, 0xf6, 0xe5, 0x08, 0x3d, 0x5f, 0x39,
+	0xd3, 0x21, 0x69, 0xbc, 0x03, 0x22, 0x3a, 0xd2,
+	0x5c, 0x84, 0xf8, 0x15, 0xc4, 0x80, 0x0b, 0xbc,
+	0x29, 0x3c, 0xf3, 0x95, 0x98, 0xcd, 0x8f, 0x35,
+	0xbc, 0xa5, 0x3e, 0xfc, 0xd4, 0x13, 0x9e, 0xde,
+	0x4f, 0xce, 0x71, 0x9d, 0x09, 0xad, 0xf2, 0x80,
+	0x6b, 0x65, 0x7f, 0x03, 0x00, 0x14, 0x7c, 0x15,
+	0x85, 0x40, 0x6d, 0x70, 0xea, 0xdc, 0xb3, 0x63,
+	0x35, 0x4f, 0x4d, 0xe0, 0xd9, 0xd5, 0x3c, 0x58,
+	0x56, 0x23, 0x80, 0xe2, 0x36, 0xdd, 0x75, 0x1d,
+	0x94, 0x11, 0x41, 0x8e, 0xe0, 0x81, 0x8e, 0xcf,
+	0xe0, 0xe5, 0xf6, 0xde, 0xd1, 0xe7, 0x04, 0x12,
+	0x79, 0x92, 0x2b, 0x71, 0x2a, 0x79, 0x8b, 0x7c,
+	0x44, 0x79, 0x16, 0x30, 0x4e, 0xf4, 0xf6, 0x9b,
+	0xb7, 0x40, 0xa3, 0x5a, 0xa7, 0x69, 0x3e, 0xc1,
+	0x3a, 0x04, 0xd0, 0x88, 0xa0, 0x3b, 0xdd, 0xc6,
+	0x9e, 0x7e, 0x1e, 0x1e, 0x8f, 0x44, 0xf7, 0x73,
+	0x67, 0x1e, 0x1a, 0x78, 0xfa, 0x62, 0xf4, 0xa9,
+	0xa8, 0xc6, 0x5b, 0xb8, 0xfa, 0x06, 0x7d, 0x5e,
+	0x38, 0x1c, 0x9a, 0x39, 0xe9, 0x39, 0x98, 0x22,
+	0x0b, 0xa7, 0xac, 0x0b, 0xf3, 0xbc, 0xf1, 0xeb,
+	0x8c, 0x81, 0xe3, 0x48, 0x8a, 0xed, 0x42, 0xc2,
+	0x38, 0xcf, 0x3e, 0xda, 0xd2, 0x89, 0x8d, 0x9c,
+	0x53, 0xb5, 0x2f, 0x41, 0x01, 0x26, 0x84, 0x9c,
+	0xa3, 0x56, 0xf6, 0x49, 0xc7, 0xd4, 0x9f, 0x93,
+	0x1b, 0x96, 0x49, 0x5e, 0xad, 0xb3, 0x84, 0x1f,
+	0x3c, 0xa4, 0xe0, 0x9b, 0xd1, 0x90, 0xbc, 0x38,
+	0x6c, 0xdd, 0x95, 0x4d, 0x9d, 0xb1, 0x71, 0x57,
+	0x2d, 0x34, 0xe8, 0xb8, 0x42, 0xc7, 0x99, 0x03,
+	0xc7, 0x07, 0x30, 0x65, 0x91, 0x55, 0xd5, 0x90,
+	0x70, 0x97, 0x37, 0x68, 0xd4, 0x11, 0xf9, 0xe8,
+	0xce, 0xec, 0xdc, 0x34, 0xd5, 0xd3, 0xb7, 0xc4,
+	0xb8, 0x97, 0x05, 0x92, 0xad, 0xf8, 0xe2, 0x36,
+	0x64, 0x41, 0xc9, 0xc5, 0x41, 0x77, 0x52, 0xd7,
+	0x2c, 0xa5, 0x24, 0x2f, 0xd9, 0x34, 0x0b, 0x47,
+	0x35, 0xa7, 0x28, 0x8b, 0xc5, 0xcd, 0xe9, 0x46,
+	0xac, 0x39, 0x94, 0x3c, 0x10, 0xc6, 0x29, 0x73,
+	0x0e, 0x0e, 0x5d, 0xe0, 0x71, 0x03, 0x8a, 0x72,
+	0x0e, 0x26, 0xb0, 0x7d, 0x84, 0xed, 0x95, 0x23,
+	0x49, 0x5a, 0x45, 0x83, 0x45, 0x60, 0x11, 0x4a,
+	0x46, 0x31, 0xd4, 0xd8, 0x16, 0x54, 0x98, 0x58,
+	0xed, 0x6d, 0xcc, 0x5d, 0xd6, 0x50, 0x61, 0x9f,
+	0x9d, 0xc5, 0x3e, 0x9d, 0x32, 0x47, 0xde, 0x96,
+	0xe1, 0x5d, 0xd8, 0xf8, 0xb4, 0x69, 0x6f, 0xb9,
+	0x15, 0x90, 0x57, 0x7a, 0xf6, 0xad, 0xb0, 0x5b,
+	0xf5, 0xa6, 0x36, 0x94, 0xfd, 0x84, 0xce, 0x1c,
+	0x0f, 0x4b, 0xd0, 0xc2, 0x5b, 0x6b, 0x56, 0xef,
+	0x73, 0x93, 0x0b, 0xc3, 0xee, 0xd9, 0xcf, 0xd3,
+	0xa4, 0x22, 0x58, 0xcd, 0x50, 0x6e, 0x65, 0xf4,
+	0xe9, 0xb7, 0x71, 0xaf, 0x4b, 0xb3, 0xb6, 0x2f,
+	0x0f, 0x0e, 0x3b, 0xc9, 0x85, 0x14, 0xf5, 0x17,
+	0xe8, 0x7a, 0x3a, 0xbf, 0x5f, 0x5e, 0xf8, 0x18,
+	0x48, 0xa6, 0x72, 0xab, 0x06, 0x95, 0xe9, 0xc8,
+	0xa7, 0xf4, 0x32, 0x44, 0x04, 0x0c, 0x84, 0x98,
+	0x73, 0xe3, 0x89, 0x8d, 0x5f, 0x7e, 0x4a, 0x42,
+	0x8f, 0xc5, 0x28, 0xb1, 0x82, 0xef, 0x1c, 0x97,
+	0x31, 0x3b, 0x4d, 0xe0, 0x0e, 0x10, 0x10, 0x97,
+	0x93, 0x49, 0x78, 0x2f, 0x0d, 0x86, 0x8b, 0xa1,
+	0x53, 0xa9, 0x81, 0x20, 0x79, 0xe7, 0x07, 0x77,
+	0xb6, 0xac, 0x5e, 0xd2, 0x05, 0xcd, 0xe9, 0xdb,
+	0x8a, 0x94, 0x82, 0x8a, 0x23, 0xb9, 0x3d, 0x1c,
+	0xa9, 0x7d, 0x72, 0x4a, 0xed, 0x33, 0xa3, 0xdb,
+	0x21, 0xa7, 0x86, 0x33, 0x45, 0xa5, 0xaa, 0x56,
+	0x45, 0xb5, 0x83, 0x29, 0x40, 0x47, 0x79, 0x04,
+	0x6e, 0xb9, 0x95, 0xd0, 0x81, 0x77, 0x2d, 0x48,
+	0x1e, 0xfe, 0xc3, 0xc2, 0x1e, 0xe5, 0xf2, 0xbe,
+	0xfd, 0x3b, 0x94, 0x9f, 0xc4, 0xc4, 0x26, 0x9d,
+	0xe4, 0x66, 0x1e, 0x19, 0xee, 0x6c, 0x79, 0x97,
+	0x11, 0x31, 0x4b, 0x0d, 0x01, 0xcb, 0xde, 0xa8,
+	0xf6, 0x6d, 0x7c, 0x39, 0x46, 0x4e, 0x7e, 0x3f,
+	0x94, 0x17, 0xdf, 0xa1, 0x7d, 0xd9, 0x1c, 0x8e,
+	0xbc, 0x7d, 0x33, 0x7d, 0xe3, 0x12, 0x40, 0xca,
+	0xab, 0x37, 0x11, 0x46, 0xd4, 0xae, 0xef, 0x44,
+	0xa2, 0xb3, 0x6a, 0x66, 0x0e, 0x0c, 0x90, 0x7f,
+	0xdf, 0x5c, 0x66, 0x5f, 0xf2, 0x94, 0x9f, 0xa6,
+	0x73, 0x4f, 0xeb, 0x0d, 0xad, 0xbf, 0xc0, 0x63,
+	0x5c, 0xdc, 0x46, 0x51, 0xe8, 0x8e, 0x90, 0x19,
+	0xa8, 0xa4, 0x3c, 0x91, 0x79, 0xfa, 0x7e, 0x58,
+	0x85, 0x13, 0x55, 0xc5, 0x19, 0x82, 0x37, 0x1b,
+	0x0a, 0x02, 0x1f, 0x99, 0x6b, 0x18, 0xf1, 0x28,
+	0x08, 0xa2, 0x73, 0xb8, 0x0f, 0x2e, 0xcd, 0xbf,
+	0xf3, 0x86, 0x7f, 0xea, 0xef, 0xd0, 0xbb, 0xa6,
+	0x21, 0xdf, 0x49, 0x73, 0x51, 0xcc, 0x36, 0xd3,
+	0x3e, 0xa0, 0xf8, 0x44, 0xdf, 0xd3, 0xa6, 0xbe,
+	0x8a, 0xd4, 0x57, 0xdd, 0x72, 0x94, 0x61, 0x0f,
+	0x82, 0xd1, 0x07, 0xb8, 0x7c, 0x18, 0x83, 0xdf,
+	0x3a, 0xe5, 0x50, 0x6a, 0x82, 0x20, 0xac, 0xa9,
+	0xa8, 0xff, 0xd9, 0xf3, 0x77, 0x33, 0x5a, 0x9e,
+	0x7f, 0x6d, 0xfe, 0x5d, 0x33, 0x41, 0x42, 0xe7,
+	0x6c, 0x19, 0xe0, 0x44, 0x8a, 0x15, 0xf6, 0x70,
+	0x98, 0xb7, 0x68, 0x4d, 0xfa, 0x97, 0x39, 0xb0,
+	0x8e, 0xe8, 0x84, 0x8b, 0x75, 0x30, 0xb7, 0x7d,
+	0x92, 0x69, 0x20, 0x9c, 0x81, 0xfb, 0x4b, 0xf4,
+	0x01, 0x50, 0xeb, 0xce, 0x0c, 0x1c, 0x6c, 0xb5,
+	0x4a, 0xd7, 0x27, 0x0c, 0xce, 0xbb, 0xe5, 0x85,
+	0xf0, 0xb6, 0xee, 0xd5, 0x70, 0xdd, 0x3b, 0xfc,
+	0xd4, 0x99, 0xf1, 0x33, 0xdd, 0x8b, 0xc4, 0x2f,
+	0xae, 0xab, 0x74, 0x96, 0x32, 0xc7, 0x4c, 0x56,
+	0x3c, 0x89, 0x0f, 0x96, 0x0b, 0x42, 0xc0, 0xcb,
+	0xee, 0x0f, 0x0b, 0x8c, 0xfb, 0x7e, 0x47, 0x7b,
+	0x64, 0x48, 0xfd, 0xb2, 0x00, 0x80, 0x89, 0xa5,
+	0x13, 0x55, 0x62, 0xfc, 0x8f, 0xe2, 0x42, 0x03,
+	0xb7, 0x4e, 0x2a, 0x79, 0xb4, 0x82, 0xea, 0x23,
+	0x49, 0xda, 0xaf, 0x52, 0x63, 0x1e, 0x60, 0x03,
+	0x89, 0x06, 0x44, 0x46, 0x08, 0xc3, 0xc4, 0x87,
+	0x70, 0x2e, 0xda, 0x94, 0xad, 0x6b, 0xe0, 0xe4,
+	0xd1, 0x8a, 0x06, 0xc2, 0xa8, 0xc0, 0xa7, 0x43,
+	0x3c, 0x47, 0x52, 0x0e, 0xc3, 0x77, 0x81, 0x11,
+	0x67, 0x0e, 0xa0, 0x70, 0x04, 0x47, 0x29, 0x40,
+	0x86, 0x0d, 0x34, 0x56, 0xa7, 0xc9, 0x35, 0x59,
+	0x68, 0xdc, 0x93, 0x81, 0x70, 0xee, 0x86, 0xd9,
+	0x80, 0x06, 0x40, 0x4f, 0x1a, 0x0d, 0x40, 0x30,
+	0x0b, 0xcb, 0x96, 0x47, 0xc1, 0xb7, 0x52, 0xfd,
+	0x56, 0xe0, 0x72, 0x4b, 0xfb, 0xbd, 0x92, 0x45,
+	0x61, 0x71, 0xc2, 0x33, 0x11, 0xbf, 0x52, 0x83,
+	0x79, 0x26, 0xe0, 0x49, 0x6b, 0xb7, 0x05, 0x8b,
+	0xe8, 0x0e, 0x87, 0x31, 0xd7, 0x9d, 0x8a, 0xf5,
+	0xc0, 0x5f, 0x2e, 0x58, 0x4a, 0xdb, 0x11, 0xb3,
+	0x6c, 0x30, 0x2a, 0x46, 0x19, 0xe3, 0x27, 0x84,
+	0x1f, 0x63, 0x6e, 0xf6, 0x57, 0xc7, 0xc9, 0xd8,
+	0x5e, 0xba, 0xb3, 0x87, 0xd5, 0x83, 0x26, 0x34,
+	0x21, 0x9e, 0x65, 0xde, 0x42, 0xd3, 0xbe, 0x7b,
+	0xbc, 0x91, 0x71, 0x44, 0x4d, 0x99, 0x3b, 0x31,
+	0xe5, 0x3f, 0x11, 0x4e, 0x7f, 0x13, 0x51, 0x3b,
+	0xae, 0x79, 0xc9, 0xd3, 0x81, 0x8e, 0x25, 0x40,
+	0x10, 0xfc, 0x07, 0x1e, 0xf9, 0x7b, 0x9a, 0x4b,
+	0x6c, 0xe3, 0xb3, 0xad, 0x1a, 0x0a, 0xdd, 0x9e,
+	0x59, 0x0c, 0xa2, 0xcd, 0xae, 0x48, 0x4a, 0x38,
+	0x5b, 0x47, 0x41, 0x94, 0x65, 0x6b, 0xbb, 0xeb,
+	0x5b, 0xe3, 0xaf, 0x07, 0x5b, 0xd4, 0x4a, 0xa2,
+	0xc9, 0x5d, 0x2f, 0x64, 0x03, 0xd7, 0x3a, 0x2c,
+	0x6e, 0xce, 0x76, 0x95, 0xb4, 0xb3, 0xc0, 0xf1,
+	0xe2, 0x45, 0x73, 0x7a, 0x5c, 0xab, 0xc1, 0xfc,
+	0x02, 0x8d, 0x81, 0x29, 0xb3, 0xac, 0x07, 0xec,
+	0x40, 0x7d, 0x45, 0xd9, 0x7a, 0x59, 0xee, 0x34,
+	0xf0, 0xe9, 0xd5, 0x7b, 0x96, 0xb1, 0x3d, 0x95,
+	0xcc, 0x86, 0xb5, 0xb6, 0x04, 0x2d, 0xb5, 0x92,
+	0x7e, 0x76, 0xf4, 0x06, 0xa9, 0xa3, 0x12, 0x0f,
+	0xb1, 0xaf, 0x26, 0xba, 0x7c, 0xfc, 0x7e, 0x1c,
+	0xbc, 0x2c, 0x49, 0x97, 0x53, 0x60, 0x13, 0x0b,
+	0xa6, 0x61, 0x83, 0x89, 0x42, 0xd4, 0x17, 0x0c,
+	0x6c, 0x26, 0x52, 0xc3, 0xb3, 0xd4, 0x67, 0xf5,
+	0xe3, 0x04, 0xb7, 0xf4, 0xcb, 0x80, 0xb8, 0xcb,
+	0x77, 0x56, 0x3e, 0xaa, 0x57, 0x54, 0xee, 0xb4,
+	0x2c, 0x67, 0xcf, 0xf2, 0xdc, 0xbe, 0x55, 0xf9,
+	0x43, 0x1f, 0x6e, 0x22, 0x97, 0x67, 0x7f, 0xc4,
+	0xef, 0xb1, 0x26, 0x31, 0x1e, 0x27, 0xdf, 0x41,
+	0x80, 0x47, 0x6c, 0xe2, 0xfa, 0xa9, 0x8c, 0x2a,
+	0xf6, 0xf2, 0xab, 0xf0, 0x15, 0xda, 0x6c, 0xc8,
+	0xfe, 0xb5, 0x23, 0xde, 0xa9, 0x05, 0x3f, 0x06,
+	0x54, 0x4c, 0xcd, 0xe1, 0xab, 0xfc, 0x0e, 0x62,
+	0x33, 0x31, 0x73, 0x2c, 0x76, 0xcb, 0xb4, 0x47,
+	0x1e, 0x20, 0xad, 0xd8, 0xf2, 0x31, 0xdd, 0xc4,
+	0x8b, 0x0c, 0x77, 0xbe, 0xe1, 0x8b, 0x26, 0x00,
+	0x02, 0x58, 0xd6, 0x8d, 0xef, 0xad, 0x74, 0x67,
+	0xab, 0x3f, 0xef, 0xcb, 0x6f, 0xb0, 0xcc, 0x81,
+	0x44, 0x4c, 0xaf, 0xe9, 0x49, 0x4f, 0xdb, 0xa0,
+	0x25, 0xa4, 0xf0, 0x89, 0xf1, 0xbe, 0xd8, 0x10,
+	0xff, 0xb1, 0x3b, 0x4b, 0xfa, 0x98, 0xf5, 0x79,
+	0x6d, 0x1e, 0x69, 0x4d, 0x57, 0xb1, 0xc8, 0x19,
+	0x1b, 0xbd, 0x1e, 0x8c, 0x84, 0xb7, 0x7b, 0xe8,
+	0xd2, 0x2d, 0x09, 0x41, 0x41, 0x37, 0x3d, 0xb1,
+	0x6f, 0x26, 0x5d, 0x71, 0x16, 0x3d, 0xb7, 0x83,
+	0x27, 0x2c, 0xa7, 0xb6, 0x50, 0xbd, 0x91, 0x86,
+	0xab, 0x24, 0xa1, 0x38, 0xfd, 0xea, 0x71, 0x55,
+	0x7e, 0x9a, 0x07, 0x77, 0x4b, 0xfa, 0x61, 0x66,
+	0x20, 0x1e, 0x28, 0x95, 0x18, 0x1b, 0xa4, 0xa0,
+	0xfd, 0xc0, 0x89, 0x72, 0x43, 0xd9, 0x3b, 0x49,
+	0x5a, 0x3f, 0x9d, 0xbf, 0xdb, 0xb4, 0x46, 0xea,
+	0x42, 0x01, 0x77, 0x23, 0x68, 0x95, 0xb6, 0x24,
+	0xb3, 0xa8, 0x6c, 0x28, 0x3b, 0x11, 0x40, 0x7e,
+	0x18, 0x65, 0x6d, 0xd8, 0x24, 0x42, 0x7d, 0x88,
+	0xc0, 0x52, 0xd9, 0x05, 0xe4, 0x95, 0x90, 0x87,
+	0x8c, 0xf4, 0xd0, 0x6b, 0xb9, 0x83, 0x99, 0x34,
+	0x6d, 0xfe, 0x54, 0x40, 0x94, 0x52, 0x21, 0x4f,
+	0x14, 0x25, 0xc5, 0xd6, 0x5e, 0x95, 0xdc, 0x0a,
+	0x2b, 0x89, 0x20, 0x11, 0x84, 0x48, 0xd6, 0x3a,
+	0xcd, 0x5c, 0x24, 0xad, 0x62, 0xe3, 0xb1, 0x93,
+	0x25, 0x8d, 0xcd, 0x7e, 0xfc, 0x27, 0xa3, 0x37,
+	0xfd, 0x84, 0xfc, 0x1b, 0xb2, 0xf1, 0x27, 0x38,
+	0x5a, 0xb7, 0xfc, 0xf2, 0xfa, 0x95, 0x66, 0xd4,
+	0xfb, 0xba, 0xa7, 0xd7, 0xa3, 0x72, 0x69, 0x48,
+	0x48, 0x8c, 0xeb, 0x28, 0x89, 0xfe, 0x33, 0x65,
+	0x5a, 0x36, 0x01, 0x7e, 0x06, 0x79, 0x0a, 0x09,
+	0x3b, 0x74, 0x11, 0x9a, 0x6e, 0xbf, 0xd4, 0x9e,
+	0x58, 0x90, 0x49, 0x4f, 0x4d, 0x08, 0xd4, 0xe5,
+	0x4a, 0x09, 0x21, 0xef, 0x8b, 0xb8, 0x74, 0x3b,
+	0x91, 0xdd, 0x36, 0x85, 0x60, 0x2d, 0xfa, 0xd4,
+	0x45, 0x7b, 0x45, 0x53, 0xf5, 0x47, 0x87, 0x7e,
+	0xa6, 0x37, 0xc8, 0x78, 0x7a, 0x68, 0x9d, 0x8d,
+	0x65, 0x2c, 0x0e, 0x91, 0x5c, 0xa2, 0x60, 0xf0,
+	0x8e, 0x3f, 0xe9, 0x1a, 0xcd, 0xaa, 0xe7, 0xd5,
+	0x77, 0x18, 0xaf, 0xc9, 0xbc, 0x18, 0xea, 0x48,
+	0x1b, 0xfb, 0x22, 0x48, 0x70, 0x16, 0x29, 0x9e,
+	0x5b, 0xc1, 0x2c, 0x66, 0x23, 0xbc, 0xf0, 0x1f,
+	0xef, 0xaf, 0xe4, 0xd6, 0x04, 0x19, 0x82, 0x7a,
+	0x0b, 0xba, 0x4b, 0x46, 0xb1, 0x6a, 0x85, 0x5d,
+	0xb4, 0x73, 0xd6, 0x21, 0xa1, 0x71, 0x60, 0x14,
+	0xee, 0x0a, 0x77, 0xc4, 0x66, 0x2e, 0xf9, 0x69,
+	0x30, 0xaf, 0x41, 0x0b, 0xc8, 0x83, 0x3c, 0x53,
+	0x99, 0x19, 0x27, 0x46, 0xf7, 0x41, 0x6e, 0x56,
+	0xdc, 0x94, 0x28, 0x67, 0x4e, 0xb7, 0x25, 0x48,
+	0x8a, 0xc2, 0xe0, 0x60, 0x96, 0xcc, 0x18, 0xf4,
+	0x84, 0xdd, 0xa7, 0x5e, 0x3e, 0x05, 0x0b, 0x26,
+	0x26, 0xb2, 0x5c, 0x1f, 0x57, 0x1a, 0x04, 0x7e,
+	0x6a, 0xe3, 0x2f, 0xb4, 0x35, 0xb6, 0x38, 0x40,
+	0x40, 0xcd, 0x6f, 0x87, 0x2e, 0xef, 0xa3, 0xd7,
+	0xa9, 0xc2, 0xe8, 0x0d, 0x27, 0xdf, 0x44, 0x62,
+	0x99, 0xa0, 0xfc, 0xcf, 0x81, 0x78, 0xcb, 0xfe,
+	0xe5, 0xa0, 0x03, 0x4e, 0x6c, 0xd7, 0xf4, 0xaf,
+	0x7a, 0xbb, 0x61, 0x82, 0xfe, 0x71, 0x89, 0xb2,
+	0x22, 0x7c, 0x8e, 0x83, 0x04, 0xce, 0xf6, 0x5d,
+	0x84, 0x8f, 0x95, 0x6a, 0x7f, 0xad, 0xfd, 0x32,
+	0x9c, 0x5e, 0xe4, 0x9c, 0x89, 0x60, 0x54, 0xaa,
+	0x96, 0x72, 0xd2, 0xd7, 0x36, 0x85, 0xa9, 0x45,
+	0xd2, 0x2a, 0xa1, 0x81, 0x49, 0x6f, 0x7e, 0x04,
+	0xfa, 0xe2, 0xfe, 0x90, 0x26, 0x77, 0x5a, 0x33,
+	0xb8, 0x04, 0x9a, 0x7a, 0xe6, 0x4c, 0x4f, 0xad,
+	0x72, 0x96, 0x08, 0x28, 0x58, 0x13, 0xf8, 0xc4,
+	0x1c, 0xf0, 0xc3, 0x45, 0x95, 0x49, 0x20, 0x8c,
+	0x9f, 0x39, 0x70, 0xe1, 0x77, 0xfe, 0xd5, 0x4b,
+	0xaf, 0x86, 0xda, 0xef, 0x22, 0x06, 0x83, 0x36,
+	0x29, 0x12, 0x11, 0x40, 0xbc, 0x3b, 0x86, 0xaa,
+	0xaa, 0x65, 0x60, 0xc3, 0x80, 0xca, 0xed, 0xa9,
+	0xf3, 0xb0, 0x79, 0x96, 0xa2, 0x55, 0x27, 0x28,
+	0x55, 0x73, 0x26, 0xa5, 0x50, 0xea, 0x92, 0x4b,
+	0x3c, 0x5c, 0x82, 0x33, 0xf0, 0x01, 0x3f, 0x03,
+	0xc1, 0x08, 0x05, 0xbf, 0x98, 0xf4, 0x9b, 0x6d,
+	0xa5, 0xa8, 0xb4, 0x82, 0x0c, 0x06, 0xfa, 0xff,
+	0x2d, 0x08, 0xf3, 0x05, 0x4f, 0x57, 0x2a, 0x39,
+	0xd4, 0x83, 0x0d, 0x75, 0x51, 0xd8, 0x5b, 0x1b,
+	0xd3, 0x51, 0x5a, 0x32, 0x2a, 0x9b, 0x32, 0xb2,
+	0xf2, 0xa4, 0x96, 0x12, 0xf2, 0xae, 0x40, 0x34,
+	0x67, 0xa8, 0xf5, 0x44, 0xd5, 0x35, 0x53, 0xfe,
+	0xa3, 0x60, 0x96, 0x63, 0x0f, 0x1f, 0x6e, 0xb0,
+	0x5a, 0x42, 0xa6, 0xfc, 0x51, 0x0b, 0x60, 0x27,
+	0xbc, 0x06, 0x71, 0xed, 0x65, 0x5b, 0x23, 0x86,
+	0x4a, 0x07, 0x3b, 0x22, 0x07, 0x46, 0xe6, 0x90,
+	0x3e, 0xf3, 0x25, 0x50, 0x1b, 0x4c, 0x7f, 0x03,
+	0x08, 0xa8, 0x36, 0x6b, 0x87, 0xe5, 0xe3, 0xdb,
+	0x9a, 0x38, 0x83, 0xff, 0x9f, 0x1a, 0x9f, 0x57,
+	0xa4, 0x2a, 0xf6, 0x37, 0xbc, 0x1a, 0xff, 0xc9,
+	0x1e, 0x35, 0x0c, 0xc3, 0x7c, 0xa3, 0xb2, 0xe5,
+	0xd2, 0xc6, 0xb4, 0x57, 0x47, 0xe4, 0x32, 0x16,
+	0x6d, 0xa9, 0xae, 0x64, 0xe6, 0x2d, 0x8d, 0xc5,
+	0x8d, 0x50, 0x8e, 0xe8, 0x1a, 0x22, 0x34, 0x2a,
+	0xd9, 0xeb, 0x51, 0x90, 0x4a, 0xb1, 0x41, 0x7d,
+	0x64, 0xf9, 0xb9, 0x0d, 0xf6, 0x23, 0x33, 0xb0,
+	0x33, 0xf4, 0xf7, 0x3f, 0x27, 0x84, 0xc6, 0x0f,
+	0x54, 0xa5, 0xc0, 0x2e, 0xec, 0x0b, 0x3a, 0x48,
+	0x6e, 0x80, 0x35, 0x81, 0x43, 0x9b, 0x90, 0xb1,
+	0xd0, 0x2b, 0xea, 0x21, 0xdc, 0xda, 0x5b, 0x09,
+	0xf4, 0xcc, 0x10, 0xb4, 0xc7, 0xfe, 0x79, 0x51,
+	0xc3, 0xc5, 0xac, 0x88, 0x74, 0x84, 0x0b, 0x4b,
+	0xca, 0x79, 0x16, 0x29, 0xfb, 0x69, 0x54, 0xdf,
+	0x41, 0x7e, 0xe9, 0xc7, 0x8e, 0xea, 0xa5, 0xfe,
+	0xfc, 0x76, 0x0e, 0x90, 0xc4, 0x92, 0x38, 0xad,
+	0x7b, 0x48, 0xe6, 0x6e, 0xf7, 0x21, 0xfd, 0x4e,
+	0x93, 0x0a, 0x7b, 0x41, 0x83, 0x68, 0xfb, 0x57,
+	0x51, 0x76, 0x34, 0xa9, 0x6c, 0x00, 0xaa, 0x4f,
+	0x66, 0x65, 0x98, 0x4a, 0x4f, 0xa3, 0xa0, 0xef,
+	0x69, 0x3f, 0xe3, 0x1c, 0x92, 0x8c, 0xfd, 0xd8,
+	0xe8, 0xde, 0x7c, 0x7f, 0x3e, 0x84, 0x8e, 0x69,
+	0x3c, 0xf1, 0xf2, 0x05, 0x46, 0xdc, 0x2f, 0x9d,
+	0x5e, 0x6e, 0x4c, 0xfb, 0xb5, 0x99, 0x2a, 0x59,
+	0x63, 0xc1, 0x34, 0xbc, 0x57, 0xc0, 0x0d, 0xb9,
+	0x61, 0x25, 0xf3, 0x33, 0x23, 0x51, 0xb6, 0x0d,
+	0x07, 0xa6, 0xab, 0x94, 0x4a, 0xb7, 0x2a, 0xea,
+	0xee, 0xac, 0xa3, 0xc3, 0x04, 0x8b, 0x0e, 0x56,
+	0xfe, 0x44, 0xa7, 0x39, 0xe2, 0xed, 0xed, 0xb4,
+	0x22, 0x2b, 0xac, 0x12, 0x32, 0x28, 0x91, 0xd8,
+	0xa5, 0xab, 0xff, 0x5f, 0xe0, 0x4b, 0xda, 0x78,
+	0x17, 0xda, 0xf1, 0x01, 0x5b, 0xcd, 0xe2, 0x5f,
+	0x50, 0x45, 0x73, 0x2b, 0xe4, 0x76, 0x77, 0xf4,
+	0x64, 0x1d, 0x43, 0xfb, 0x84, 0x7a, 0xea, 0x91,
+	0xae, 0xf9, 0x9e, 0xb7, 0xb4, 0xb0, 0x91, 0x5f,
+	0x16, 0x35, 0x9a, 0x11, 0xb8, 0xc7, 0xc1, 0x8c,
+	0xc6, 0x10, 0x8d, 0x2f, 0x63, 0x4a, 0xa7, 0x57,
+	0x3a, 0x51, 0xd6, 0x32, 0x2d, 0x64, 0x72, 0xd4,
+	0x66, 0xdc, 0x10, 0xa6, 0x67, 0xd6, 0x04, 0x23,
+	0x9d, 0x0a, 0x11, 0x77, 0xdd, 0x37, 0x94, 0x17,
+	0x3c, 0xbf, 0x8b, 0x65, 0xb0, 0x2e, 0x5e, 0x66,
+	0x47, 0x64, 0xac, 0xdd, 0xf0, 0x84, 0xfd, 0x39,
+	0xfa, 0x15, 0x5d, 0xef, 0xae, 0xca, 0xc1, 0x36,
+	0xa7, 0x5c, 0xbf, 0xc7, 0x08, 0xc2, 0x66, 0x00,
+	0x74, 0x74, 0x4e, 0x27, 0x3f, 0x55, 0x8a, 0xb7,
+	0x38, 0x66, 0x83, 0x6d, 0xcf, 0x99, 0x9e, 0x60,
+	0x8f, 0xdd, 0x2e, 0x62, 0x22, 0x0e, 0xef, 0x0c,
+	0x98, 0xa7, 0x85, 0x74, 0x3b, 0x9d, 0xec, 0x9e,
+	0xa9, 0x19, 0x72, 0xa5, 0x7f, 0x2c, 0x39, 0xb7,
+	0x7d, 0xb7, 0xf1, 0x12, 0x65, 0x27, 0x4b, 0x5a,
+	0xde, 0x17, 0xfe, 0xad, 0x44, 0xf3, 0x20, 0x4d,
+	0xfd, 0xe4, 0x1f, 0xb5, 0x81, 0xb0, 0x36, 0x37,
+	0x08, 0x6f, 0xc3, 0x0c, 0xe9, 0x85, 0x98, 0x82,
+	0xa9, 0x62, 0x0c, 0xc4, 0x97, 0xc0, 0x50, 0xc8,
+	0xa7, 0x3c, 0x50, 0x9f, 0x43, 0xb9, 0xcd, 0x5e,
+	0x4d, 0xfa, 0x1c, 0x4b, 0x0b, 0xa9, 0x98, 0x85,
+	0x38, 0x92, 0xac, 0x8d, 0xe4, 0xad, 0x9b, 0x98,
+	0xab, 0xd9, 0x38, 0xac, 0x62, 0x52, 0xa3, 0x22,
+	0x63, 0x0f, 0xbf, 0x95, 0x48, 0xdf, 0x69, 0xe7,
+	0x8b, 0x33, 0xd5, 0xb2, 0xbd, 0x05, 0x49, 0x49,
+	0x9d, 0x57, 0x73, 0x19, 0x33, 0xae, 0xfa, 0x33,
+	0xf1, 0x19, 0xa8, 0x80, 0xce, 0x04, 0x9f, 0xbc,
+	0x1d, 0x65, 0x82, 0x1b, 0xe5, 0x3a, 0x51, 0xc8,
+	0x1c, 0x21, 0xe3, 0x5d, 0xf3, 0x7d, 0x9b, 0x2f,
+	0x2c, 0x1d, 0x4a, 0x7f, 0x9b, 0x68, 0x35, 0xa3,
+	0xb2, 0x50, 0xf7, 0x62, 0x79, 0xcd, 0xf4, 0x98,
+	0x4f, 0xe5, 0x63, 0x7c, 0x3e, 0x45, 0x31, 0x8c,
+	0x16, 0xa0, 0x12, 0xc8, 0x58, 0xce, 0x39, 0xa6,
+	0xbc, 0x54, 0xdb, 0xc5, 0xe0, 0xd5, 0xba, 0xbc,
+	0xb9, 0x04, 0xf4, 0x8d, 0xe8, 0x2f, 0x15, 0x9d,
+};
+
+/* 100 test cases */
+static struct crc_test {
+	u32 crc;	/* random starting crc */
+	u32 start;	/* random 6 bit offset in buf */
+	u32 length;	/* random 11 bit length of test */
+	u32 crc_le;	/* expected crc32_le result */
+	u32 crc_be;	/* expected crc32_be result */
+} test[] =
 {
-	while (len--) {
-		unsigned char x = bitrev8(*buf);
-		*buf++ = x;
-	}
-}
+	{0x674bf11d, 0x00000038, 0x00000542, 0x0af6d466, 0xd8b6e4c1},
+	{0x35c672c6, 0x0000003a, 0x000001aa, 0xc6d3dfba, 0x28aaf3ad},
+	{0x496da28e, 0x00000039, 0x000005af, 0xd933660f, 0x5d57e81f},
+	{0x09a9b90e, 0x00000027, 0x000001f8, 0xb45fe007, 0xf45fca9a},
+	{0xdc97e5a9, 0x00000025, 0x000003b6, 0xf81a3562, 0xe0126ba2},
+	{0x47c58900, 0x0000000a, 0x000000b9, 0x8e58eccf, 0xf3afc793},
+	{0x292561e8, 0x0000000c, 0x00000403, 0xa2ba8aaf, 0x0b797aed},
+	{0x415037f6, 0x00000003, 0x00000676, 0xa17d52e8, 0x7f0fdf35},
+	{0x3466e707, 0x00000026, 0x00000042, 0x258319be, 0x75c484a2},
+	{0xafd1281b, 0x00000023, 0x000002ee, 0x4428eaf8, 0x06c7ad10},
+	{0xd3857b18, 0x00000028, 0x000004a2, 0x5c430821, 0xb062b7cb},
+	{0x1d825a8f, 0x0000002b, 0x0000050b, 0xd2c45f0c, 0xd68634e0},
+	{0x5033e3bc, 0x0000000b, 0x00000078, 0xa3ea4113, 0xac6d31fb},
+	{0x94f1fb5e, 0x0000000f, 0x000003a2, 0xfbfc50b1, 0x3cfe50ed},
+	{0xc9a0fe14, 0x00000009, 0x00000473, 0x5fb61894, 0x87070591},
+	{0x88a034b1, 0x0000001c, 0x000005ad, 0xc1b16053, 0x46f95c67},
+	{0xf0f72239, 0x00000020, 0x0000026d, 0xa6fa58f3, 0xf8c2c1dd},
+	{0xcc20a5e3, 0x0000003b, 0x0000067a, 0x7740185a, 0x308b979a},
+	{0xce589c95, 0x0000002b, 0x00000641, 0xd055e987, 0x40aae25b},
+	{0x78edc885, 0x00000035, 0x000005be, 0xa39cb14b, 0x035b0d1f},
+	{0x9d40a377, 0x0000003b, 0x00000038, 0x1f47ccd2, 0x197fbc9d},
+	{0x703d0e01, 0x0000003c, 0x000006f1, 0x88735e7c, 0xfed57c5a},
+	{0x776bf505, 0x0000000f, 0x000005b2, 0x5cc4fc01, 0xf32efb97},
+	{0x4a3e7854, 0x00000027, 0x000004b8, 0x8d923c82, 0x0cbfb4a2},
+	{0x209172dd, 0x0000003b, 0x00000356, 0xb89e9c2b, 0xd7868138},
+	{0x3ba4cc5b, 0x0000002f, 0x00000203, 0xe51601a9, 0x5b2a1032},
+	{0xfc62f297, 0x00000000, 0x00000079, 0x71a8e1a2, 0x5d88685f},
+	{0x64280b8b, 0x00000016, 0x000007ab, 0x0fa7a30c, 0xda3a455f},
+	{0x97dd724b, 0x00000033, 0x000007ad, 0x5788b2f4, 0xd7326d32},
+	{0x61394b52, 0x00000035, 0x00000571, 0xc66525f1, 0xcabe7fef},
+	{0x29b4faff, 0x00000024, 0x0000006e, 0xca13751e, 0x993648e0},
+	{0x29bfb1dc, 0x0000000b, 0x00000244, 0x436c43f7, 0x429f7a59},
+	{0x86ae934b, 0x00000035, 0x00000104, 0x0760ec93, 0x9cf7d0f4},
+	{0xc4c1024e, 0x0000002e, 0x000006b1, 0x6516a3ec, 0x19321f9c},
+	{0x3287a80a, 0x00000026, 0x00000496, 0x0b257eb1, 0x754ebd51},
+	{0xa4db423e, 0x00000023, 0x0000045d, 0x9b3a66dc, 0x873e9f11},
+	{0x7a1078df, 0x00000015, 0x0000014a, 0x8c2484c5, 0x6a628659},
+	{0x6048bd5b, 0x00000006, 0x0000006a, 0x897e3559, 0xac9961af},
+	{0xd8f9ea20, 0x0000003d, 0x00000277, 0x60eb905b, 0xed2aaf99},
+	{0xea5ec3b4, 0x0000002a, 0x000004fe, 0x869965dc, 0x6c1f833b},
+	{0x2dfb005d, 0x00000016, 0x00000345, 0x6a3b117e, 0xf05e8521},
+	{0x5a214ade, 0x00000020, 0x000005b6, 0x467f70be, 0xcb22ccd3},
+	{0xf0ab9cca, 0x00000032, 0x00000515, 0xed223df3, 0x7f3ef01d},
+	{0x91b444f9, 0x0000002e, 0x000007f8, 0x84e9a983, 0x5676756f},
+	{0x1b5d2ddb, 0x0000002e, 0x0000012c, 0xba638c4c, 0x3f42047b},
+	{0xd824d1bb, 0x0000003a, 0x000007b5, 0x6288653b, 0x3a3ebea0},
+	{0x0470180c, 0x00000034, 0x000001f0, 0x9d5b80d6, 0x3de08195},
+	{0xffaa3a3f, 0x00000036, 0x00000299, 0xf3a82ab8, 0x53e0c13d},
+	{0x6406cfeb, 0x00000023, 0x00000600, 0xa920b8e8, 0xe4e2acf4},
+	{0xb24aaa38, 0x0000003e, 0x000004a1, 0x657cc328, 0x5077b2c3},
+	{0x58b2ab7c, 0x00000039, 0x000002b4, 0x3a17ee7e, 0x9dcb3643},
+	{0x3db85970, 0x00000006, 0x000002b6, 0x95268b59, 0xb9812c10},
+	{0x857830c5, 0x00000003, 0x00000590, 0x4ef439d5, 0xf042161d},
+	{0xe1fcd978, 0x0000003e, 0x000007d8, 0xae8d8699, 0xce0a1ef5},
+	{0xb982a768, 0x00000016, 0x000006e0, 0x62fad3df, 0x5f8a067b},
+	{0x1d581ce8, 0x0000001e, 0x0000058b, 0xf0f5da53, 0x26e39eee},
+	{0x2456719b, 0x00000025, 0x00000503, 0x4296ac64, 0xd50e4c14},
+	{0xfae6d8f2, 0x00000000, 0x0000055d, 0x057fdf2e, 0x2a31391a},
+	{0xcba828e3, 0x00000039, 0x000002ce, 0xe3f22351, 0x8f00877b},
+	{0x13d25952, 0x0000000a, 0x0000072d, 0x76d4b4cc, 0x5eb67ec3},
+	{0x0342be3f, 0x00000015, 0x00000599, 0xec75d9f1, 0x9d4d2826},
+	{0xeaa344e0, 0x00000014, 0x000004d8, 0x72a4c981, 0x2064ea06},
+	{0xbbb52021, 0x0000003b, 0x00000272, 0x04af99fc, 0xaf042d35},
+	{0xb66384dc, 0x0000001d, 0x000007fc, 0xd7629116, 0x782bd801},
+	{0x616c01b6, 0x00000022, 0x000002c8, 0x5b1dab30, 0x783ce7d2},
+	{0xce2bdaad, 0x00000016, 0x0000062a, 0x932535c8, 0x3f02926d},
+	{0x00fe84d7, 0x00000005, 0x00000205, 0x850e50aa, 0x753d649c},
+	{0xbebdcb4c, 0x00000006, 0x0000055d, 0xbeaa37a2, 0x2d8c9eba},
+	{0xd8b1a02a, 0x00000010, 0x00000387, 0x5017d2fc, 0x503541a5},
+	{0x3b96cad2, 0x00000036, 0x00000347, 0x1d2372ae, 0x926cd90b},
+	{0xc94c1ed7, 0x00000005, 0x0000038b, 0x9e9fdb22, 0x144a9178},
+	{0x1aad454e, 0x00000025, 0x000002b2, 0xc3f6315c, 0x5c7a35b3},
+	{0xa4fec9a6, 0x00000000, 0x000006d6, 0x90be5080, 0xa4107605},
+	{0x1bbe71e2, 0x0000001f, 0x000002fd, 0x4e504c3b, 0x284ccaf1},
+	{0x4201c7e4, 0x00000002, 0x000002b7, 0x7822e3f9, 0x0cc912a9},
+	{0x23fddc96, 0x00000003, 0x00000627, 0x8a385125, 0x07767e78},
+	{0xd82ba25c, 0x00000016, 0x0000063e, 0x98e4148a, 0x283330c9},
+	{0x786f2032, 0x0000002d, 0x0000060f, 0xf201600a, 0xf561bfcd},
+	{0xfebe4e1f, 0x0000002a, 0x000004f2, 0x95e51961, 0xfd80dcab},
+	{0x1a6e0a39, 0x00000008, 0x00000672, 0x8af6c2a5, 0x78dd84cb},
+	{0x56000ab8, 0x0000000e, 0x000000e5, 0x36bacb8f, 0x22ee1f77},
+	{0x4717fe0c, 0x00000000, 0x000006ec, 0x8439f342, 0x5c8e03da},
+	{0xd5d5d68e, 0x0000003c, 0x000003a3, 0x46fff083, 0x177d1b39},
+	{0xc25dd6c6, 0x00000024, 0x000006c0, 0x5ceb8eb4, 0x892b0d16},
+	{0xe9b11300, 0x00000023, 0x00000683, 0x07a5d59a, 0x6c6a3208},
+	{0x95cd285e, 0x00000001, 0x00000047, 0x7b3a4368, 0x0202c07e},
+	{0xd9245a25, 0x0000001e, 0x000003a6, 0xd33c1841, 0x1936c0d5},
+	{0x103279db, 0x00000006, 0x0000039b, 0xca09b8a0, 0x77d62892},
+	{0x1cba3172, 0x00000027, 0x000001c8, 0xcb377194, 0xebe682db},
+	{0x8f613739, 0x0000000c, 0x000001df, 0xb4b0bc87, 0x7710bd43},
+	{0x1c6aa90d, 0x0000001b, 0x0000053c, 0x70559245, 0xda7894ac},
+	{0xaabe5b93, 0x0000003d, 0x00000715, 0xcdbf42fa, 0x0c3b99e7},
+	{0xf15dd038, 0x00000006, 0x000006db, 0x6e104aea, 0x8d5967f2},
+	{0x584dd49c, 0x00000020, 0x000007bc, 0x36b6cfd6, 0xad4e23b2},
+	{0x5d8c9506, 0x00000020, 0x00000470, 0x4c62378e, 0x31d92640},
+	{0xb80d17b0, 0x00000032, 0x00000346, 0x22a5bb88, 0x9a7ec89f},
+	{0xdaf0592e, 0x00000023, 0x000007b0, 0x3cab3f99, 0x9b1fdd99},
+	{0x4793cc85, 0x0000000d, 0x00000706, 0xe82e04f6, 0xed3db6b7},
+	{0x82ebf64e, 0x00000009, 0x000007c3, 0x69d590a9, 0x9efa8499},
+	{0xb18a0319, 0x00000026, 0x000007db, 0x1cf98dcc, 0x8fa9ad6a},
+};
 
-static void random_garbage(unsigned char *buf, size_t len)
-{
-	while (len--)
-		*buf++ = (unsigned char) random();
-}
+#include <linux/time.h>
 
-#if 0				/* Not used at present */
-static void store_le(u32 x, unsigned char *buf)
+static int __init crc32_init(void)
 {
-	buf[0] = (unsigned char) x;
-	buf[1] = (unsigned char) (x >> 8);
-	buf[2] = (unsigned char) (x >> 16);
-	buf[3] = (unsigned char) (x >> 24);
-}
-#endif
+	int i;
+	int errors = 0;
+	int bytes = 0;
+	struct timespec start, stop;
+	u64 nsec;
+	unsigned long flags;
+
+	/* keep static to prevent cache warming code from
+	 * getting eliminated by the compiler */
+	static u32 crc;
+
+	/* pre-warm the cache */
+	for (i = 0; i < 100; i++) {
+		bytes += 2*test[i].length;
 
-static void store_be(u32 x, unsigned char *buf)
-{
-	buf[0] = (unsigned char) (x >> 24);
-	buf[1] = (unsigned char) (x >> 16);
-	buf[2] = (unsigned char) (x >> 8);
-	buf[3] = (unsigned char) x;
-}
+		crc ^= crc32_le(test[i].crc, test_buf +
+		    test[i].start, test[i].length);
 
-/*
- * This checks that CRC(buf + CRC(buf)) = 0, and that
- * CRC commutes with bit-reversal.  This has the side effect
- * of bytewise bit-reversing the input buffer, and returns
- * the CRC of the reversed buffer.
- */
-static u32 test_step(u32 init, unsigned char *buf, size_t len)
-{
-	u32 crc1, crc2;
-	size_t i;
+		crc ^= crc32_be(test[i].crc, test_buf +
+		    test[i].start, test[i].length);
+	}
 
-	crc1 = crc32_be(init, buf, len);
-	store_be(crc1, buf + len);
-	crc2 = crc32_be(init, buf, len + 4);
-	if (crc2)
-		printf("\nCRC cancellation fail: 0x%08x should be 0\n",
-		       crc2);
-
-	for (i = 0; i <= len + 4; i++) {
-		crc2 = crc32_be(init, buf, i);
-		crc2 = crc32_be(crc2, buf + i, len + 4 - i);
-		if (crc2)
-			printf("\nCRC split fail: 0x%08x\n", crc2);
+	/* reduce OS noise */
+	local_irq_save(flags);
+	local_irq_disable();
+
+	getnstimeofday(&start);
+	for (i = 0; i < 100; i++) {
+		if (test[i].crc_le != crc32_le(test[i].crc, test_buf +
+		    test[i].start, test[i].length))
+			errors++;
+
+		if (test[i].crc_be != crc32_be(test[i].crc, test_buf +
+		    test[i].start, test[i].length))
+			errors++;
 	}
+	getnstimeofday(&stop);
+
+	local_irq_restore(flags);
+	local_irq_enable();
 
-	/* Now swap it around for the other test */
+	nsec = stop.tv_nsec - start.tv_nsec +
+		1000000000 * (stop.tv_sec - start.tv_sec);
 
-	bytereverse(buf, len + 4);
-	init = bitrev32(init);
-	crc2 = bitrev32(crc1);
-	if (crc1 != bitrev32(crc2))
-		printf("\nBit reversal fail: 0x%08x -> 0x%08x -> 0x%08x\n",
-		       crc1, crc2, bitrev32(crc2));
-	crc1 = crc32_le(init, buf, len);
-	if (crc1 != crc2)
-		printf("\nCRC endianness fail: 0x%08x != 0x%08x\n", crc1,
-		       crc2);
-	crc2 = crc32_le(init, buf, len + 4);
-	if (crc2)
-		printf("\nCRC cancellation fail: 0x%08x should be 0\n",
-		       crc2);
-
-	for (i = 0; i <= len + 4; i++) {
-		crc2 = crc32_le(init, buf, i);
-		crc2 = crc32_le(crc2, buf + i, len + 4 - i);
-		if (crc2)
-			printf("\nCRC split fail: 0x%08x\n", crc2);
+	pr_info("crc32: CRC_LE_BITS = %d, CRC_BE_BITS = %d\n",
+		 CRC_LE_BITS, CRC_BE_BITS);
+
+	if (errors)
+		pr_warn("crc32: %d self tests failed\n", errors);
+	else {
+		pr_info("crc32: self tests passed, processed %d bytes in %llu nsec\n",
+			bytes, nsec);
 	}
 
-	return crc1;
+	return 0;
 }
 
-#define SIZE 64
-#define INIT1 0
-#define INIT2 0
-
-int main(void)
+static void __exit crc32_exit(void)
 {
-	unsigned char buf1[SIZE + 4];
-	unsigned char buf2[SIZE + 4];
-	unsigned char buf3[SIZE + 4];
-	int i, j;
-	u32 crc1, crc2, crc3;
-
-	for (i = 0; i <= SIZE; i++) {
-		printf("\rTesting length %d...", i);
-		fflush(stdout);
-		random_garbage(buf1, i);
-		random_garbage(buf2, i);
-		for (j = 0; j < i; j++)
-			buf3[j] = buf1[j] ^ buf2[j];
-
-		crc1 = test_step(INIT1, buf1, i);
-		crc2 = test_step(INIT2, buf2, i);
-		/* Now check that CRC(buf1 ^ buf2) = CRC(buf1) ^ CRC(buf2) */
-		crc3 = test_step(INIT1 ^ INIT2, buf3, i);
-		if (crc3 != (crc1 ^ crc2))
-			printf("CRC XOR fail: 0x%08x != 0x%08x ^ 0x%08x\n",
-			       crc3, crc1, crc2);
-	}
-	printf("\nAll test complete.  No failures expected.\n");
-	return 0;
 }
 
-#endif				/* UNITTEST */
+module_init(crc32_init);
+module_exit(crc32_exit);
+#endif /* CONFIG_CRC32_SELFTEST */
Index: for-next/lib/Kconfig
===================================================================
--- for-next.orig/lib/Kconfig
+++ for-next/lib/Kconfig
@@ -60,6 +60,16 @@ config CRC32
 	  kernel tree does. Such modules that use library CRC32 functions
 	  require M here.
 
+config CRC32_SELFTEST
+	bool "CRC32 perform self test on init"
+	default n
+	depends on CRC32
+	help
+	  This option enables the CRC32 library functions to perform a
+	  self test on initialization. The self test computes crc32_le
+	  and crc32_be over byte strings with random alignment and length
+	  and reports the total elapsed time and number of bytes processed.
+
 config CRC7
 	tristate "CRC7 functions"
 	help
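
The table-driven self test added above can be tried outside the kernel
with a sketch like the following. It is an illustration under stated
assumptions, not the code from this patch: ref_crc32_le, ref_crc32_be,
ref_buf, ref_test and ref_self_test are hypothetical names, the two CRC
routines are plain bit-at-a-time versions standing in for the library
functions, and the single test vector uses the widely published check
values for the string "123456789" instead of the random vectors in the
patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CRCPOLY_LE 0xedb88320u	/* polynomial, least significant bit first */
#define CRCPOLY_BE 0x04c11db7u	/* same polynomial, most significant bit first */

/* Hypothetical bit-at-a-time stand-in for crc32_le(). */
static uint32_t ref_crc32_le(uint32_t crc, const unsigned char *p, size_t len)
{
	int i;

	assert(p != NULL || len == 0);
	while (len--) {
		crc ^= *p++;
		for (i = 0; i < 8; i++)
			crc = (crc >> 1) ^ ((crc & 1) ? CRCPOLY_LE : 0);
	}
	return crc;
}

/* Hypothetical bit-at-a-time stand-in for crc32_be(). */
static uint32_t ref_crc32_be(uint32_t crc, const unsigned char *p, size_t len)
{
	int i;

	assert(p != NULL || len == 0);
	while (len--) {
		crc ^= (uint32_t)*p++ << 24;
		for (i = 0; i < 8; i++)
			crc = (crc << 1) ^ ((crc & 0x80000000u) ? CRCPOLY_BE : 0);
	}
	return crc;
}

/* Same field layout as struct crc_test in the patch. */
struct ref_crc_test {
	uint32_t crc;		/* starting crc */
	uint32_t start;		/* offset into ref_buf */
	uint32_t length;	/* number of bytes to checksum */
	uint32_t crc_le;	/* expected little-endian result */
	uint32_t crc_be;	/* expected big-endian result */
};

/* Two pad bytes so the checked string starts at a non-zero offset. */
static const unsigned char ref_buf[] = "xx123456789";

static const struct ref_crc_test ref_test[] = {
	/*
	 * Seed ~0 over "123456789": the published check values for
	 * CRC-32 (0xcbf43926) and CRC-32/BZIP2 (0xfc891918), inverted
	 * because these routines do not apply the final XOR.
	 */
	{ 0xffffffffu, 2, 9, 0x340bc6d9u, 0x0376e6e7u },
};

/* Return the number of mismatches, like the errors counter in the patch. */
static int ref_self_test(void)
{
	size_t i;
	int errors = 0;

	for (i = 0; i < sizeof(ref_test) / sizeof(ref_test[0]); i++) {
		const struct ref_crc_test *t = &ref_test[i];

		if (t->crc_le != ref_crc32_le(t->crc, ref_buf + t->start, t->length))
			errors++;
		if (t->crc_be != ref_crc32_be(t->crc, ref_buf + t->start, t->length))
			errors++;
	}
	return errors;
}
```

Calling ref_self_test() from a small main() and checking for a zero
return is enough to exercise it. Timing and interrupt masking are left
out because they only make sense inside the kernel.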



^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v6 04/10] crc32-add-pointer-to-tab.diff
       [not found] <20110831213729.395283830@systemfabricworks.com>
                   ` (2 preceding siblings ...)
  2011-08-31 22:29 ` [PATCH v6 03/10] crc32-replace-self-test.diff Bob Pearson
@ 2011-08-31 22:30 ` Bob Pearson
  2011-09-01  8:16   ` Joakim Tjernlund
  2011-08-31 22:30 ` [PATCH v6 05/10] crc32-misc-cleanup.diff Bob Pearson
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 28+ messages in thread
From: Bob Pearson @ 2011-08-31 22:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: fzago, rpearson, Joakim Tjernlund, George Spelvin, akpm

Replace 2D array references with pointer references in loops.
This change has no effect on x86 code but improves PPC
performance.

Signed-off-by: Bob Pearson <rpearson@systemfabricworks.com>

---
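The transformation described in the changelog can be sketched in
isolation as below. This is only an illustration, assuming a
little-endian style lookup step: fold_words_2d, fold_words_ptr and
fill_demo_table are hypothetical names, and the table holds arbitrary
demo values, not real CRC constants. Both functions compute the same
result; the second loads one pointer per table row before the loop, as
the patch does with t0..t3.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Index the 2D table on every lookup, as the code did before the patch. */
static uint32_t fold_words_2d(uint32_t crc, const uint32_t *b, size_t n,
			      const uint32_t (*tab)[256])
{
	assert(b != NULL || n == 0);
	while (n--) {
		crc ^= *b++;
		crc = tab[3][crc & 255] ^
		      tab[2][(crc >> 8) & 255] ^
		      tab[1][(crc >> 16) & 255] ^
		      tab[0][(crc >> 24) & 255];
	}
	return crc;
}

/* Hoist one pointer per row out of the loop, as the patch does. */
static uint32_t fold_words_ptr(uint32_t crc, const uint32_t *b, size_t n,
			       const uint32_t (*tab)[256])
{
	const uint32_t *t0 = tab[0], *t1 = tab[1], *t2 = tab[2], *t3 = tab[3];

	assert(b != NULL || n == 0);
	while (n--) {
		crc ^= *b++;
		crc = t3[crc & 255] ^
		      t2[(crc >> 8) & 255] ^
		      t1[(crc >> 16) & 255] ^
		      t0[(crc >> 24) & 255];
	}
	return crc;
}

/* Fill a 4x256 table with arbitrary demo values (not CRC constants). */
static void fill_demo_table(uint32_t (*tab)[256])
{
	uint32_t r, i;

	for (r = 0; r < 4; r++)
		for (i = 0; i < 256; i++)
			tab[r][i] = (r * 256u + i) * 2654435761u;
}
```

Whether the hoisted form is faster depends on the compiler and the
architecture, which is why the changelog reports a gain only on PPC.
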
 lib/crc32.c |   21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

Index: for-next/lib/crc32.c
===================================================================
--- for-next.orig/lib/crc32.c
+++ for-next/lib/crc32.c
@@ -53,20 +53,21 @@ static inline u32
 crc32_body(u32 crc, unsigned char const *buf, size_t len, const u32 (*tab)[256])
 {
 # ifdef __LITTLE_ENDIAN
-#  define DO_CRC(x) crc = tab[0][(crc ^ (x)) & 255] ^ (crc >> 8)
-#  define DO_CRC4 crc = tab[3][(crc) & 255] ^ \
-		tab[2][(crc >> 8) & 255] ^ \
-		tab[1][(crc >> 16) & 255] ^ \
-		tab[0][(crc >> 24) & 255]
+#  define DO_CRC(x) (crc = t0[(crc ^ (x)) & 255] ^ (crc >> 8))
+#  define DO_CRC4 crc = t3[(crc) & 255] ^ \
+			t2[(crc >> 8) & 255] ^ \
+			t1[(crc >> 16) & 255] ^ \
+			t0[(crc >> 24) & 255]
 # else
-#  define DO_CRC(x) crc = tab[0][((crc >> 24) ^ (x)) & 255] ^ (crc << 8)
-#  define DO_CRC4 crc = tab[0][(crc) & 255] ^ \
-		tab[1][(crc >> 8) & 255] ^ \
-		tab[2][(crc >> 16) & 255] ^ \
-		tab[3][(crc >> 24) & 255]
+#  define DO_CRC(x) (crc = t0[((crc >> 24) ^ (x)) & 255] ^ (crc << 8))
+#  define DO_CRC4 crc = t0[(crc) & 255] ^ \
+			t1[(crc >> 8) & 255] ^ \
+			t2[(crc >> 16) & 255] ^ \
+			t3[(crc >> 24) & 255]
 # endif
 	const u32 *b;
 	size_t    rem_len;
+	const u32 *t0 = tab[0], *t1 = tab[1], *t2 = tab[2], *t3 = tab[3];
 
 	/* Align it */
 	if (unlikely((long)buf & 3 && len)) {



^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v6 05/10] crc32-misc-cleanup.diff
       [not found] <20110831213729.395283830@systemfabricworks.com>
                   ` (3 preceding siblings ...)
  2011-08-31 22:30 ` [PATCH v6 04/10] crc32-add-pointer-to-tab.diff Bob Pearson
@ 2011-08-31 22:30 ` Bob Pearson
  2011-09-02 23:50   ` Andrew Morton
  2011-08-31 22:30 ` [PATCH v6 06/10] crc32-fix-check-endian-warnings.diff Bob Pearson
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 28+ messages in thread
From: Bob Pearson @ 2011-08-31 22:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: fzago, rpearson, Joakim Tjernlund, George Spelvin, akpm

Misc cleanup of lib/crc32.c and related files
	- removed unnecessary header files.
	- straightened out some convoluted ifdefs.
	- fixed references that indexed the two-dimensional tables as if
	  they were one-dimensional, i.e. replaced tab[i] with tab[0][i].
	- a few trivial whitespace changes.
	- fixed a warning in gen_crc32table.c caused by a mismatch in the
	  type of the pointer passed to output_table. Since the table is
	  only used at kernel compile time, it is simpler to make the table
	  big enough to hold the largest column size used. One cannot make the
	  column size smaller in output_table because it has to be used by
	  both the le and be tables and they can have different column sizes.

Signed-off-by: Bob Pearson <rpearson@systemfabricworks.com>

---
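The tab[i] to tab[0][i] change in the list above can be illustrated
with a standalone sketch of the 4-bits-at-a-time case. It rests on
assumptions: demo_table_le, demo_table_init and demo_crc32_le are
hypothetical names, and the table is generated by a simple loop that
stands in for gen_crc32table.c. With a two-dimensional table,
demo_table_le[i] names a whole row and not a u32 entry, so the lookup
has to go through row 0 explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CRCPOLY_LE	0xedb88320u
#define DEMO_LE_BITS	4
#define DEMO_ROWS	1
#define DEMO_COLS	(1u << DEMO_LE_BITS)

/* One row of 16 entries: the table shape for 4 bits per step. */
static uint32_t demo_table_le[DEMO_ROWS][DEMO_COLS];

/* Entry i is the result of pushing the 4-bit value i through the CRC. */
static void demo_table_init(void)
{
	uint32_t i, crc;
	int bit;

	for (i = 0; i < DEMO_COLS; i++) {
		crc = i;
		for (bit = 0; bit < DEMO_LE_BITS; bit++)
			crc = (crc >> 1) ^ ((crc & 1) ? CRCPOLY_LE : 0);
		demo_table_le[0][i] = crc;
	}
}

/* Consume each input byte as two 4-bit steps, indexing row 0 explicitly. */
static uint32_t demo_crc32_le(uint32_t crc, const unsigned char *p, size_t len)
{
	assert(p != NULL || len == 0);
	while (len--) {
		crc ^= *p++;
		crc = (crc >> 4) ^ demo_table_le[0][crc & 15];
		crc = (crc >> 4) ^ demo_table_le[0][crc & 15];
	}
	return crc;
}
```

After demo_table_init(), demo_crc32_le(~0, "123456789", 9) XORed with
~0 gives the standard CRC-32 check value 0xcbf43926, the same result
the bit-at-a-time loop produces.
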
 lib/crc32.c          |  104 +++++++++++++++++----------------------------------
 lib/gen_crc32table.c |    6 +-
 2 files changed, 39 insertions(+), 71 deletions(-)

Index: for-next/lib/crc32.c
===================================================================
--- for-next.orig/lib/crc32.c
+++ for-next/lib/crc32.c
@@ -23,13 +23,10 @@
 /* see: Documentation/crc32.txt for a description of algorithms */
 
 #include <linux/crc32.h>
-#include <linux/kernel.h>
 #include <linux/module.h>
-#include <linux/compiler.h>
 #include <linux/types.h>
-#include <linux/init.h>
-#include <asm/atomic.h>
 #include "crc32defs.h"
+
 #if CRC_LE_BITS == 8
 # define tole(x) __constant_cpu_to_le32(x)
 #else
@@ -41,6 +38,7 @@
 #else
 # define tobe(x) (x)
 #endif
+
 #include "crc32table.h"
 
 MODULE_AUTHOR("Matt Domsch <Matt_Domsch@dell.com>");
@@ -96,6 +94,7 @@ crc32_body(u32 crc, unsigned char const 
 #undef DO_CRC4
 }
 #endif
+
 /**
  * crc32_le() - Calculate bitwise little-endian Ethernet AUTODIN II CRC32
  * @crc: seed value for computation.  ~0 for Ethernet, sometimes 0 for
@@ -103,53 +102,39 @@ crc32_body(u32 crc, unsigned char const 
  * @p: pointer to buffer over which CRC is run
  * @len: length of buffer @p
  */
-u32 __pure crc32_le(u32 crc, unsigned char const *p, size_t len);
-
-#if CRC_LE_BITS == 1
-/*
- * In fact, the table-based code will work in this case, but it can be
- * simplified by inlining the table in ?: form.
- */
-
 u32 __pure crc32_le(u32 crc, unsigned char const *p, size_t len)
 {
+#if CRC_LE_BITS == 1
 	int i;
 	while (len--) {
 		crc ^= *p++;
 		for (i = 0; i < 8; i++)
 			crc = (crc >> 1) ^ ((crc & 1) ? CRCPOLY_LE : 0);
 	}
-	return crc;
-}
-#else				/* Table-based approach */
-
-u32 __pure crc32_le(u32 crc, unsigned char const *p, size_t len)
-{
-# if CRC_LE_BITS == 8
-	const u32      (*tab)[] = crc32table_le;
-
-	crc = __cpu_to_le32(crc);
-	crc = crc32_body(crc, p, len, tab);
-	return __le32_to_cpu(crc);
-# elif CRC_LE_BITS == 4
+# elif CRC_LE_BITS == 2
 	while (len--) {
 		crc ^= *p++;
-		crc = (crc >> 4) ^ crc32table_le[crc & 15];
-		crc = (crc >> 4) ^ crc32table_le[crc & 15];
+		crc = (crc >> 2) ^ crc32table_le[0][crc & 3];
+		crc = (crc >> 2) ^ crc32table_le[0][crc & 3];
+		crc = (crc >> 2) ^ crc32table_le[0][crc & 3];
+		crc = (crc >> 2) ^ crc32table_le[0][crc & 3];
 	}
-	return crc;
-# elif CRC_LE_BITS == 2
+# elif CRC_LE_BITS == 4
 	while (len--) {
 		crc ^= *p++;
-		crc = (crc >> 2) ^ crc32table_le[crc & 3];
-		crc = (crc >> 2) ^ crc32table_le[crc & 3];
-		crc = (crc >> 2) ^ crc32table_le[crc & 3];
-		crc = (crc >> 2) ^ crc32table_le[crc & 3];
+		crc = (crc >> 4) ^ crc32table_le[0][crc & 15];
+		crc = (crc >> 4) ^ crc32table_le[0][crc & 15];
 	}
+# elif CRC_LE_BITS == 8
+	const u32      (*tab)[] = crc32table_le;
+
+	crc = __cpu_to_le32(crc);
+	crc = crc32_body(crc, p, len, tab);
+	crc = __le32_to_cpu(crc);
+#endif
 	return crc;
-# endif
 }
-#endif
+EXPORT_SYMBOL(crc32_le);
 
 /**
  * crc32_be() - Calculate bitwise big-endian Ethernet AUTODIN II CRC32
@@ -158,16 +143,9 @@ u32 __pure crc32_le(u32 crc, unsigned ch
  * @p: pointer to buffer over which CRC is run
  * @len: length of buffer @p
  */
-u32 __pure crc32_be(u32 crc, unsigned char const *p, size_t len);
-
-#if CRC_BE_BITS == 1
-/*
- * In fact, the table-based code will work in this case, but it can be
- * simplified by inlining the table in ?: form.
- */
-
 u32 __pure crc32_be(u32 crc, unsigned char const *p, size_t len)
 {
+#if CRC_BE_BITS == 1
 	int i;
 	while (len--) {
 		crc ^= *p++ << 24;
@@ -176,39 +154,29 @@ u32 __pure crc32_be(u32 crc, unsigned ch
 			    (crc << 1) ^ ((crc & 0x80000000) ? CRCPOLY_BE :
 					  0);
 	}
-	return crc;
-}
-
-#else				/* Table-based approach */
-u32 __pure crc32_be(u32 crc, unsigned char const *p, size_t len)
-{
-# if CRC_BE_BITS == 8
-	const u32      (*tab)[] = crc32table_be;
-
-	crc = __cpu_to_be32(crc);
-	crc = crc32_body(crc, p, len, tab);
-	return __be32_to_cpu(crc);
-# elif CRC_BE_BITS == 4
+# elif CRC_BE_BITS == 2
 	while (len--) {
 		crc ^= *p++ << 24;
-		crc = (crc << 4) ^ crc32table_be[crc >> 28];
-		crc = (crc << 4) ^ crc32table_be[crc >> 28];
+		crc = (crc << 2) ^ crc32table_be[0][crc >> 30];
+		crc = (crc << 2) ^ crc32table_be[0][crc >> 30];
+		crc = (crc << 2) ^ crc32table_be[0][crc >> 30];
+		crc = (crc << 2) ^ crc32table_be[0][crc >> 30];
 	}
-	return crc;
-# elif CRC_BE_BITS == 2
+# elif CRC_BE_BITS == 4
 	while (len--) {
 		crc ^= *p++ << 24;
-		crc = (crc << 2) ^ crc32table_be[crc >> 30];
-		crc = (crc << 2) ^ crc32table_be[crc >> 30];
-		crc = (crc << 2) ^ crc32table_be[crc >> 30];
-		crc = (crc << 2) ^ crc32table_be[crc >> 30];
+		crc = (crc << 4) ^ crc32table_be[0][crc >> 28];
+		crc = (crc << 4) ^ crc32table_be[0][crc >> 28];
 	}
-	return crc;
+# elif CRC_BE_BITS == 8
+	const u32      (*tab)[] = crc32table_be;
+
+	crc = __cpu_to_be32(crc);
+	crc = crc32_body(crc, p, len, tab);
+	crc = __be32_to_cpu(crc);
 # endif
+	return crc;
 }
-#endif
-
-EXPORT_SYMBOL(crc32_le);
 EXPORT_SYMBOL(crc32_be);
 
 #ifdef CONFIG_CRC32_SELFTEST
Index: for-next/lib/gen_crc32table.c
===================================================================
--- for-next.orig/lib/gen_crc32table.c
+++ for-next/lib/gen_crc32table.c
@@ -7,8 +7,8 @@
 #define LE_TABLE_SIZE (1 << CRC_LE_BITS)
 #define BE_TABLE_SIZE (1 << CRC_BE_BITS)
 
-static uint32_t crc32table_le[4][LE_TABLE_SIZE];
-static uint32_t crc32table_be[4][BE_TABLE_SIZE];
+static uint32_t crc32table_le[4][256];
+static uint32_t crc32table_be[4][256];
 
 /**
  * crc32init_le() - allocate and initialize LE table data
@@ -62,7 +62,7 @@ static void crc32init_be(void)
 	}
 }
 
-static void output_table(uint32_t table[4][256], int len, char *trans)
+static void output_table(uint32_t (*table)[256], int len, char *trans)
 {
 	int i, j;
 




* [PATCH v6 06/10] crc32-fix-check-endian-warnings.diff
       [not found] <20110831213729.395283830@systemfabricworks.com>
                   ` (4 preceding siblings ...)
  2011-08-31 22:30 ` [PATCH v6 05/10] crc32-misc-cleanup.diff Bob Pearson
@ 2011-08-31 22:30 ` Bob Pearson
  2011-08-31 22:30 ` [PATCH v6 07/10] crc32-add-real-8-bit.diff Bob Pearson
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 28+ messages in thread
From: Bob Pearson @ 2011-08-31 22:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: fzago, rpearson, Joakim Tjernlund, George Spelvin, akpm

crc32.c in its original version freely mixed u32, __le32 and __be32 types
which caused warnings from sparse with __CHECK_ENDIAN__.
This patch fixes these by forcing the types to u32.

Signed-off-by: Bob Pearson <rpearson@systemfabricworks.com>

---
 lib/crc32.c |   12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

Index: for-next/lib/crc32.c
===================================================================
--- for-next.orig/lib/crc32.c
+++ for-next/lib/crc32.c
@@ -28,13 +28,13 @@
 #include "crc32defs.h"
 
 #if CRC_LE_BITS == 8
-# define tole(x) __constant_cpu_to_le32(x)
+# define tole(x) (__force u32) __constant_cpu_to_le32(x)
 #else
 # define tole(x) (x)
 #endif
 
 #if CRC_BE_BITS == 8
-# define tobe(x) __constant_cpu_to_be32(x)
+# define tobe(x) (__force u32) __constant_cpu_to_be32(x)
 #else
 # define tobe(x) (x)
 #endif
@@ -128,9 +128,9 @@ u32 __pure crc32_le(u32 crc, unsigned ch
 # elif CRC_LE_BITS == 8
 	const u32      (*tab)[] = crc32table_le;
 
-	crc = __cpu_to_le32(crc);
+	crc = (__force u32) __cpu_to_le32(crc);
 	crc = crc32_body(crc, p, len, tab);
-	crc = __le32_to_cpu(crc);
+	crc = __le32_to_cpu((__force __le32)crc);
 #endif
 	return crc;
 }
@@ -171,9 +171,9 @@ u32 __pure crc32_be(u32 crc, unsigned ch
 # elif CRC_BE_BITS == 8
 	const u32      (*tab)[] = crc32table_be;
 
-	crc = __cpu_to_be32(crc);
+	crc = (__force u32) __cpu_to_be32(crc);
 	crc = crc32_body(crc, p, len, tab);
-	crc = __be32_to_cpu(crc);
+	crc = __be32_to_cpu((__force __be32)crc);
 # endif
 	return crc;
 }




* [PATCH v6 07/10] crc32-add-real-8-bit.diff
       [not found] <20110831213729.395283830@systemfabricworks.com>
                   ` (5 preceding siblings ...)
  2011-08-31 22:30 ` [PATCH v6 06/10] crc32-fix-check-endian-warnings.diff Bob Pearson
@ 2011-08-31 22:30 ` Bob Pearson
  2011-08-31 22:30 ` [PATCH v6 08/10] crc32-add-slicing-by-8.diff Bob Pearson
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 28+ messages in thread
From: Bob Pearson @ 2011-08-31 22:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: fzago, rpearson, Joakim Tjernlund, George Spelvin, akpm

crc32.c provides a choice of one of several algorithms for
computing the LSB and MSB versions of the CRC32 checksum
based on the parameters CRC_LE_BITS and CRC_BE_BITS. In the
original version the values 1, 2, 4 and 8 respectively selected
versions of the algorithm that computed the CRC 1, 2, 4 and 32
bits at a time. This patch series adds a new version that computes
the CRC 64 bits at a time. To make things easier to understand,
the parameter has been reinterpreted to actually stand for the
number of bits processed in each step of the algorithm, so that
the old value 8 has been replaced with the value 32. This also
allows us to add a widely used CRC algorithm, called the Sarwate
algorithm, that computes the CRC 8 bits at a time.
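
For illustration only (this is not part of the patch): a minimal
userspace sketch of the byte-at-a-time step, assuming a 256-entry table
built from CRCPOLY_LE. The names sarwate_tab, sarwate_init,
crc32_le_bitwise, crc32_le_sarwate and sarwate_selfcheck are made up
for the example; only CRCPOLY_LE and the two loop bodies follow
lib/crc32.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CRCPOLY_LE 0xedb88320u

static uint32_t sarwate_tab[256];	/* hypothetical name, one entry per byte value */

/* Bit-at-a-time reference, same shape as the CRC_LE_BITS == 1 branch. */
static uint32_t crc32_le_bitwise(uint32_t crc, const unsigned char *p, size_t len)
{
	int i;

	while (len--) {
		crc ^= *p++;
		for (i = 0; i < 8; i++)
			crc = (crc >> 1) ^ ((crc & 1) ? CRCPOLY_LE : 0);
	}
	return crc;
}

/* Entry i is the state reached by pushing the single byte i through 8 bit steps. */
static void sarwate_init(void)
{
	uint32_t i, crc;
	int k;

	for (i = 0; i < 256; i++) {
		crc = i;
		for (k = 0; k < 8; k++)
			crc = (crc >> 1) ^ ((crc & 1) ? CRCPOLY_LE : 0);
		sarwate_tab[i] = crc;
	}
}

/* One table lookup per input byte, same shape as the CRC_LE_BITS == 8 branch. */
static uint32_t crc32_le_sarwate(uint32_t crc, const unsigned char *p, size_t len)
{
	while (len--) {
		crc ^= *p++;
		crc = (crc >> 8) ^ sarwate_tab[crc & 255];
	}
	return crc;
}

/* Both loops should agree for the same seed and buffer. */
static void sarwate_selfcheck(void)
{
	static const unsigned char msg[] = "123456789";

	sarwate_init();
	assert(crc32_le_sarwate(~0u, msg, 9) == crc32_le_bitwise(~0u, msg, 9));
}
```

The table lookup works because the low byte of crc is the only part
that can trigger a polynomial XOR during the next 8 bit steps; the
upper 24 bits simply shift down by 8.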

Signed-off-by: Bob Pearson <rpearson@systemfabricworks.com>

---
 lib/crc32.c          |   17 ++++++++++++++---
 lib/crc32defs.h      |   18 ++++++++++--------
 lib/gen_crc32table.c |   11 ++++++++++-
 3 files changed, 34 insertions(+), 12 deletions(-)

Index: for-next/lib/crc32.c
===================================================================
--- for-next.orig/lib/crc32.c
+++ for-next/lib/crc32.c
@@ -27,13 +27,13 @@
 #include <linux/types.h>
 #include "crc32defs.h"
 
-#if CRC_LE_BITS == 8
+#if CRC_LE_BITS > 8
 # define tole(x) (__force u32) __constant_cpu_to_le32(x)
 #else
 # define tole(x) (x)
 #endif
 
-#if CRC_BE_BITS == 8
+#if CRC_BE_BITS > 8
 # define tobe(x) (__force u32) __constant_cpu_to_be32(x)
 #else
 # define tobe(x) (x)
@@ -45,7 +45,7 @@ MODULE_AUTHOR("Matt Domsch <Matt_Domsch@
 MODULE_DESCRIPTION("Ethernet CRC32 calculations");
 MODULE_LICENSE("GPL");
 
-#if CRC_LE_BITS == 8 || CRC_BE_BITS == 8
+#if CRC_LE_BITS > 8 || CRC_BE_BITS > 8
 
 static inline u32
 crc32_body(u32 crc, unsigned char const *buf, size_t len, const u32 (*tab)[256])
@@ -126,6 +126,12 @@ u32 __pure crc32_le(u32 crc, unsigned ch
 		crc = (crc >> 4) ^ crc32table_le[0][crc & 15];
 	}
 # elif CRC_LE_BITS == 8
+	/* aka Sarwate algorithm */
+	while (len--) {
+		crc ^= *p++;
+		crc = (crc >> 8) ^ crc32table_le[0][crc & 255];
+	}
+# else
 	const u32      (*tab)[] = crc32table_le;
 
 	crc = (__force u32) __cpu_to_le32(crc);
@@ -169,6 +175,11 @@ u32 __pure crc32_be(u32 crc, unsigned ch
 		crc = (crc << 4) ^ crc32table_be[0][crc >> 28];
 	}
 # elif CRC_BE_BITS == 8
+	while (len--) {
+		crc ^= *p++ << 24;
+		crc = (crc << 8) ^ crc32table_be[0][crc >> 24];
+	}
+# else
 	const u32      (*tab)[] = crc32table_be;
 
 	crc = (__force u32) __cpu_to_be32(crc);
Index: for-next/lib/crc32defs.h
===================================================================
--- for-next.orig/lib/crc32defs.h
+++ for-next/lib/crc32defs.h
@@ -6,27 +6,29 @@
 #define CRCPOLY_LE 0xedb88320
 #define CRCPOLY_BE 0x04c11db7
 
-/* How many bits at a time to use.  Requires a table of 4<<CRC_xx_BITS bytes. */
-/* For less performance-sensitive, use 4 */
+/* How many bits at a time to use.  Valid values are 1, 2, 4, 8, and 32. */
+/* For less performance-sensitive, use 4 or 8 */
 #ifndef CRC_LE_BITS
-# define CRC_LE_BITS 8
+# define CRC_LE_BITS 32
 #endif
 #ifndef CRC_BE_BITS
-# define CRC_BE_BITS 8
+# define CRC_BE_BITS 32
 #endif
 
 /*
  * Little-endian CRC computation.  Used with serial bit streams sent
  * lsbit-first.  Be sure to use cpu_to_le32() to append the computed CRC.
  */
-#if CRC_LE_BITS > 8 || CRC_LE_BITS < 1 || CRC_LE_BITS & CRC_LE_BITS-1
-# error CRC_LE_BITS must be a power of 2 between 1 and 8
+#if CRC_LE_BITS > 32 || CRC_LE_BITS < 1 || CRC_LE_BITS == 16 || \
+	CRC_LE_BITS & CRC_LE_BITS-1
+# error "CRC_LE_BITS must be one of {1, 2, 4, 8, 32}"
 #endif
 
 /*
  * Big-endian CRC computation.  Used with serial bit streams sent
  * msbit-first.  Be sure to use cpu_to_be32() to append the computed CRC.
  */
-#if CRC_BE_BITS > 8 || CRC_BE_BITS < 1 || CRC_BE_BITS & CRC_BE_BITS-1
-# error CRC_BE_BITS must be a power of 2 between 1 and 8
+#if CRC_BE_BITS > 32 || CRC_BE_BITS < 1 || CRC_BE_BITS == 16 || \
+	CRC_BE_BITS & CRC_BE_BITS-1
+# error "CRC_BE_BITS must be one of {1, 2, 4, 8, 32}"
 #endif
Index: for-next/lib/gen_crc32table.c
===================================================================
--- for-next.orig/lib/gen_crc32table.c
+++ for-next/lib/gen_crc32table.c
@@ -4,8 +4,17 @@
 
 #define ENTRIES_PER_LINE 4
 
+#if CRC_LE_BITS <= 8
 #define LE_TABLE_SIZE (1 << CRC_LE_BITS)
+#else
+#define LE_TABLE_SIZE 256
+#endif
+
+#if CRC_BE_BITS <= 8
 #define BE_TABLE_SIZE (1 << CRC_BE_BITS)
+#else
+#define BE_TABLE_SIZE 256
+#endif
 
 static uint32_t crc32table_le[4][256];
 static uint32_t crc32table_be[4][256];
@@ -24,7 +33,7 @@ static void crc32init_le(void)
 
 	crc32table_le[0][0] = 0;
 
-	for (i = 1 << (CRC_LE_BITS - 1); i; i >>= 1) {
+	for (i = LE_TABLE_SIZE >> 1; i; i >>= 1) {
 		crc = (crc >> 1) ^ ((crc & 1) ? CRCPOLY_LE : 0);
 		for (j = 0; j < LE_TABLE_SIZE; j += 2 * i)
 			crc32table_le[0][i + j] = crc ^ crc32table_le[0][j];




* [PATCH v6 08/10] crc32-add-slicing-by-8.diff
       [not found] <20110831213729.395283830@systemfabricworks.com>
                   ` (6 preceding siblings ...)
  2011-08-31 22:30 ` [PATCH v6 07/10] crc32-add-real-8-bit.diff Bob Pearson
@ 2011-08-31 22:30 ` Bob Pearson
  2011-09-07  7:31   ` Joakim Tjernlund
       [not found]   ` <OF3D37A60B.7A33B855-ONC1257904.00276B5B-C1257904.002951AF@LocalDomain>
  2011-08-31 22:30 ` [PATCH v6 09/10] crc32-optimize-loops-for-x86.diff Bob Pearson
                   ` (2 subsequent siblings)
  10 siblings, 2 replies; 28+ messages in thread
From: Bob Pearson @ 2011-08-31 22:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: fzago, rpearson, Joakim Tjernlund, George Spelvin, akpm

Add the slicing-by-8 algorithm alongside the existing
slicing-by-4 algorithm. This consists of:
	- extending the largest BITS size from 32 to 64
	- extending the tables from tab[4][256] to up to tab[8][256]
	- adding code for the inner loop.
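
For illustration only (this is not part of the patch): a minimal
userspace sketch of a little-endian slicing-by-8 inner loop, assuming
eight 256-entry rows generated the same way crc32init_le() extends
row 0. The names tab, slice8_init, load_le32, crc32_le_bytewise,
crc32_le_slice8 and slice8_selfcheck are made up for the example, and
words are assembled bytewise so that the sketch does not depend on host
endianness or alignment, unlike crc32_body().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CRCPOLY_LE 0xedb88320u

static uint32_t tab[8][256];	/* hypothetical stand-in for crc32table_le */

static void slice8_init(void)
{
	uint32_t i, crc;
	int j, k;

	for (i = 0; i < 256; i++) {
		crc = i;
		for (k = 0; k < 8; k++)
			crc = (crc >> 1) ^ ((crc & 1) ? CRCPOLY_LE : 0);
		tab[0][i] = crc;
	}
	/* Row j advances row j-1 by one more zero byte. */
	for (i = 0; i < 256; i++) {
		crc = tab[0][i];
		for (j = 1; j < 8; j++) {
			crc = tab[0][crc & 0xff] ^ (crc >> 8);
			tab[j][i] = crc;
		}
	}
}

/* Assemble a little-endian 32-bit word one byte at a time. */
static uint32_t load_le32(const unsigned char *p)
{
	return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
	       (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Byte-at-a-time loop, used for the tail and as a reference. */
static uint32_t crc32_le_bytewise(uint32_t crc, const unsigned char *p, size_t len)
{
	while (len--)
		crc = tab[0][(crc ^ *p++) & 255] ^ (crc >> 8);
	return crc;
}

/* Consume 8 bytes per step: rows 7..4 for the first word, 3..0 for the second. */
static uint32_t crc32_le_slice8(uint32_t crc, const unsigned char *p, size_t len)
{
	uint32_t q;

	for (; len >= 8; len -= 8, p += 8) {
		q = crc ^ load_le32(p);
		crc = tab[7][q & 255] ^ tab[6][(q >> 8) & 255] ^
		      tab[5][(q >> 16) & 255] ^ tab[4][(q >> 24) & 255];
		q = load_le32(p + 4);
		crc ^= tab[3][q & 255] ^ tab[2][(q >> 8) & 255] ^
		       tab[1][(q >> 16) & 255] ^ tab[0][(q >> 24) & 255];
	}
	return crc32_le_bytewise(crc, p, len);	/* remaining 0..7 bytes */
}

/* The sliced loop should match the byte-at-a-time loop on the same input. */
static void slice8_selfcheck(void)
{
	static const unsigned char msg[] = "slicing-by-8 handles 8 bytes per step";

	slice8_init();
	assert(crc32_le_slice8(~0u, msg, sizeof(msg) - 1) ==
	       crc32_le_bytewise(~0u, msg, sizeof(msg) - 1));
}
```

The first byte of each 8-byte block still has seven bytes following it
inside the block, which is why it is looked up in row 7, while the last
byte uses row 0.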

Signed-off-by: Bob Pearson <rpearson@systemfabricworks.com>

---
 lib/crc32.c          |   40 ++++++++++++++++++++++++++++------------
 lib/crc32defs.h      |   29 +++++++++++++++++++++--------
 lib/gen_crc32table.c |   43 +++++++++++++++++++++++++++----------------
 3 files changed, 76 insertions(+), 36 deletions(-)

Index: for-next/lib/crc32.c
===================================================================
--- for-next.orig/lib/crc32.c
+++ for-next/lib/crc32.c
@@ -47,25 +47,28 @@ MODULE_LICENSE("GPL");
 
 #if CRC_LE_BITS > 8 || CRC_BE_BITS > 8
 
+/* implements slicing-by-4 or slicing-by-8 algorithm */
 static inline u32
 crc32_body(u32 crc, unsigned char const *buf, size_t len, const u32 (*tab)[256])
 {
 # ifdef __LITTLE_ENDIAN
 #  define DO_CRC(x) (crc = t0[(crc ^ (x)) & 255] ^ (crc >> 8))
-#  define DO_CRC4 crc = t3[(crc) & 255] ^ \
-			t2[(crc >> 8) & 255] ^ \
-			t1[(crc >> 16) & 255] ^ \
-			t0[(crc >> 24) & 255]
+#  define DO_CRC4 (t3[(q) & 255] ^ t2[(q >> 8) & 255] ^ \
+		   t1[(q >> 16) & 255] ^ t0[(q >> 24) & 255])
+#  define DO_CRC8 (t7[(q) & 255] ^ t6[(q >> 8) & 255] ^ \
+		   t5[(q >> 16) & 255] ^ t4[(q >> 24) & 255])
 # else
 #  define DO_CRC(x) (crc = t0[((crc >> 24) ^ (x)) & 255] ^ (crc << 8))
-#  define DO_CRC4 crc = t0[(crc) & 255] ^ \
-			t1[(crc >> 8) & 255] ^ \
-			t2[(crc >> 16) & 255] ^ \
-			t3[(crc >> 24) & 255]
+#  define DO_CRC4 (t0[(q) & 255] ^ t1[(q >> 8) & 255] ^ \
+		   t2[(q >> 16) & 255] ^ t3[(q >> 24) & 255])
+#  define DO_CRC8 (t4[(q) & 255] ^ t5[(q >> 8) & 255] ^ \
+		   t6[(q >> 16) & 255] ^ t7[(q >> 24) & 255])
 # endif
 	const u32 *b;
-	size_t    rem_len;
+	size_t rem_len;
 	const u32 *t0 = tab[0], *t1 = tab[1], *t2 = tab[2], *t3 = tab[3];
+	const u32 *t4 = tab[4], *t5 = tab[5], *t6 = tab[6], *t7 = tab[7];
+	u32 q;
 
 	/* Align it */
 	if (unlikely((long)buf & 3 && len)) {
@@ -73,13 +76,25 @@ crc32_body(u32 crc, unsigned char const 
 			DO_CRC(*buf++);
 		} while ((--len) && ((long)buf)&3);
 	}
+
+# if CRC_LE_BITS == 32
 	rem_len = len & 3;
-	/* load data 32 bits wide, xor data 32 bits wide. */
 	len = len >> 2;
+# else
+	rem_len = len & 7;
+	len = len >> 3;
+# endif
+
 	b = (const u32 *)buf;
 	for (--b; len; --len) {
-		crc ^= *++b; /* use pre increment for speed */
-		DO_CRC4;
+		q = crc ^ *++b; /* use pre increment for speed */
+# if CRC_LE_BITS == 32
+		crc = DO_CRC4;
+# else
+		crc = DO_CRC8;
+		q = *++b;
+		crc ^= DO_CRC4;
+# endif
 	}
 	len = rem_len;
 	/* And the last few bytes */
@@ -92,6 +107,7 @@ crc32_body(u32 crc, unsigned char const 
 	return crc;
 #undef DO_CRC
 #undef DO_CRC4
+#undef DO_CRC8
 }
 #endif
 
Index: for-next/lib/crc32defs.h
===================================================================
--- for-next.orig/lib/crc32defs.h
+++ for-next/lib/crc32defs.h
@@ -6,29 +6,42 @@
 #define CRCPOLY_LE 0xedb88320
 #define CRCPOLY_BE 0x04c11db7
 
-/* How many bits at a time to use.  Valid values are 1, 2, 4, 8, and 32. */
-/* For less performance-sensitive, use 4 or 8 */
+/*
+ * How many bits at a time to use.  Valid values are 1, 2, 4, 8, 32 and 64.
+ * For less performance-sensitive, use 4 or 8 to save table size.
+ * For larger systems choose same as CPU architecture as default.
+ * This works well on X86_64, SPARC64 systems. This may require some
+ * elaboration after experiments with other architectures.
+ */
 #ifndef CRC_LE_BITS
-# define CRC_LE_BITS 32
+#  ifdef CONFIG_64BIT
+#  define CRC_LE_BITS 64
+#  else
+#  define CRC_LE_BITS 32
+#  endif
 #endif
 #ifndef CRC_BE_BITS
-# define CRC_BE_BITS 32
+#  ifdef CONFIG_64BIT
+#  define CRC_BE_BITS 64
+#  else
+#  define CRC_BE_BITS 32
+#  endif
 #endif
 
 /*
  * Little-endian CRC computation.  Used with serial bit streams sent
  * lsbit-first.  Be sure to use cpu_to_le32() to append the computed CRC.
  */
-#if CRC_LE_BITS > 32 || CRC_LE_BITS < 1 || CRC_LE_BITS == 16 || \
+#if CRC_LE_BITS > 64 || CRC_LE_BITS < 1 || CRC_LE_BITS == 16 || \
 	CRC_LE_BITS & CRC_LE_BITS-1
-# error "CRC_LE_BITS must be one of {1, 2, 4, 8, 32}"
+# error "CRC_LE_BITS must be one of {1, 2, 4, 8, 32, 64}"
 #endif
 
 /*
  * Big-endian CRC computation.  Used with serial bit streams sent
  * msbit-first.  Be sure to use cpu_to_be32() to append the computed CRC.
  */
-#if CRC_BE_BITS > 32 || CRC_BE_BITS < 1 || CRC_BE_BITS == 16 || \
+#if CRC_BE_BITS > 64 || CRC_BE_BITS < 1 || CRC_BE_BITS == 16 || \
 	CRC_BE_BITS & CRC_BE_BITS-1
-# error "CRC_BE_BITS must be one of {1, 2, 4, 8, 32}"
+# error "CRC_BE_BITS must be one of {1, 2, 4, 8, 32, 64}"
 #endif
Index: for-next/lib/gen_crc32table.c
===================================================================
--- for-next.orig/lib/gen_crc32table.c
+++ for-next/lib/gen_crc32table.c
@@ -1,23 +1,28 @@
 #include <stdio.h>
+#include "../include/generated/autoconf.h"
 #include "crc32defs.h"
 #include <inttypes.h>
 
 #define ENTRIES_PER_LINE 4
 
-#if CRC_LE_BITS <= 8
-#define LE_TABLE_SIZE (1 << CRC_LE_BITS)
+#if CRC_LE_BITS > 8
+# define LE_TABLE_ROWS (CRC_LE_BITS/8)
+# define LE_TABLE_SIZE 256
 #else
-#define LE_TABLE_SIZE 256
+# define LE_TABLE_ROWS 1
+# define LE_TABLE_SIZE (1 << CRC_LE_BITS)
 #endif
 
-#if CRC_BE_BITS <= 8
-#define BE_TABLE_SIZE (1 << CRC_BE_BITS)
+#if CRC_BE_BITS > 8
+# define BE_TABLE_ROWS (CRC_BE_BITS/8)
+# define BE_TABLE_SIZE 256
 #else
-#define BE_TABLE_SIZE 256
+# define BE_TABLE_ROWS 1
+# define BE_TABLE_SIZE (1 << CRC_BE_BITS)
 #endif
 
-static uint32_t crc32table_le[4][256];
-static uint32_t crc32table_be[4][256];
+static uint32_t crc32table_le[LE_TABLE_ROWS][256];
+static uint32_t crc32table_be[BE_TABLE_ROWS][256];
 
 /**
  * crc32init_le() - allocate and initialize LE table data
@@ -40,7 +45,7 @@ static void crc32init_le(void)
 	}
 	for (i = 0; i < LE_TABLE_SIZE; i++) {
 		crc = crc32table_le[0][i];
-		for (j = 1; j < 4; j++) {
+		for (j = 1; j < LE_TABLE_ROWS; j++) {
 			crc = crc32table_le[0][crc & 0xff] ^ (crc >> 8);
 			crc32table_le[j][i] = crc;
 		}
@@ -64,18 +69,18 @@ static void crc32init_be(void)
 	}
 	for (i = 0; i < BE_TABLE_SIZE; i++) {
 		crc = crc32table_be[0][i];
-		for (j = 1; j < 4; j++) {
+		for (j = 1; j < BE_TABLE_ROWS; j++) {
 			crc = crc32table_be[0][(crc >> 24) & 0xff] ^ (crc << 8);
 			crc32table_be[j][i] = crc;
 		}
 	}
 }
 
-static void output_table(uint32_t (*table)[256], int len, char *trans)
+static void output_table(uint32_t (*table)[256], int rows, int len, char *trans)
 {
 	int i, j;
 
-	for (j = 0 ; j < 4; j++) {
+	for (j = 0 ; j < rows; j++) {
 		printf("{");
 		for (i = 0; i < len - 1; i++) {
 			if (i % ENTRIES_PER_LINE == 0)
@@ -92,15 +97,21 @@ int main(int argc, char** argv)
 
 	if (CRC_LE_BITS > 1) {
 		crc32init_le();
-		printf("static const u32 crc32table_le[4][256] = {");
-		output_table(crc32table_le, LE_TABLE_SIZE, "tole");
+		printf("static const u32 __cacheline_aligned "
+		       "crc32table_le[%d][%d] = {",
+		       LE_TABLE_ROWS, LE_TABLE_SIZE);
+		output_table(crc32table_le, LE_TABLE_ROWS,
+			     LE_TABLE_SIZE, "tole");
 		printf("};\n");
 	}
 
 	if (CRC_BE_BITS > 1) {
 		crc32init_be();
-		printf("static const u32 crc32table_be[4][256] = {");
-		output_table(crc32table_be, BE_TABLE_SIZE, "tobe");
+		printf("static const u32 __cacheline_aligned "
+		       "crc32table_be[%d][%d] = {",
+		       BE_TABLE_ROWS, BE_TABLE_SIZE);
+		output_table(crc32table_be, BE_TABLE_ROWS,
+			     BE_TABLE_SIZE, "tobe");
 		printf("};\n");
 	}
 




* [PATCH v6 09/10] crc32-optimize-loops-for-x86.diff
       [not found] <20110831213729.395283830@systemfabricworks.com>
                   ` (7 preceding siblings ...)
  2011-08-31 22:30 ` [PATCH v6 08/10] crc32-add-slicing-by-8.diff Bob Pearson
@ 2011-08-31 22:30 ` Bob Pearson
  2011-08-31 22:30 ` [PATCH v6 10/10] crc32-final.diff Bob Pearson
  2011-09-01  3:03 ` [PATCH v6 08/10] crc32-add-slicing-by-8.diff Bob Pearson
  10 siblings, 0 replies; 28+ messages in thread
From: Bob Pearson @ 2011-08-31 22:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: fzago, rpearson, Joakim Tjernlund, George Spelvin, akpm

Add two changes that improve performance on x86 systems
	1. replace the main loop with an incrementing counter.
	   This change improves the performance of the selftest
	   by about 5-6% on Nehalem CPUs. The apparent
	   reason is that the compiler can use the loop index
	   to perform an indexed memory access. This is
	   reported to make performance worse on PowerPC CPUs.
	2. replace the rem_len loop with an incrementing counter.
	   This change improves the performance of the selftest,
	   which hits this loop more often than usual,
	   by about 1-2% on x86 CPUs. In actual workloads
	   the length is most often a multiple of 4 bytes and
	   this code does not get executed as often, if at all.
	   Again, this change is reported to make performance
	   worse on PowerPC.
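
For illustration only (this is not part of the patch): a minimal
userspace sketch of the two loop shapes being compared. step() is a
made-up stand-in for the per-word work done by DO_CRC4/DO_CRC8, and
fold_countdown, fold_countup and fold_selfcheck are hypothetical names.
The sketch uses post-increment rather than the --b / *++b idiom of
crc32_body() so that it never forms a pointer before the start of the
array. It shows only that both shapes compute the same result and makes
no claim about speed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-word work (DO_CRC4/DO_CRC8 in the patch). */
static uint32_t step(uint32_t acc, uint32_t word)
{
	return (acc ^ word) * 0x9e3779b1u;
}

/* Countdown form: len itself is the loop counter (the non-x86 branch). */
static uint32_t fold_countdown(uint32_t acc, const uint32_t *b, size_t len)
{
	for (; len; --len)
		acc = step(acc, *b++);
	return acc;
}

/*
 * Counting-up form: a separate index i runs from 0 to len (the CONFIG_X86
 * branch). The commit message's explanation is that the compiler can then
 * use i to form an indexed memory access.
 */
static uint32_t fold_countup(uint32_t acc, const uint32_t *b, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		acc = step(acc, *b++);
	return acc;
}

/* Both loop shapes visit the same words in the same order. */
static void fold_selfcheck(void)
{
	static const uint32_t words[5] = { 1, 2, 3, 0xdeadbeefu, 0xffffffffu };

	assert(fold_countdown(~0u, words, 5) == fold_countup(~0u, words, 5));
}
```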

Signed-off-by: Bob Pearson <rpearson@systemfabricworks.com>

---
 lib/crc32.c |   13 +++++++++++++
 1 file changed, 13 insertions(+)

Index: for-next/lib/crc32.c
===================================================================
--- for-next.orig/lib/crc32.c
+++ for-next/lib/crc32.c
@@ -66,6 +66,9 @@ crc32_body(u32 crc, unsigned char const 
 # endif
 	const u32 *b;
 	size_t rem_len;
+# ifdef CONFIG_X86
+	size_t i;
+# endif
 	const u32 *t0 = tab[0], *t1 = tab[1], *t2 = tab[2], *t3 = tab[3];
 	const u32 *t4 = tab[4], *t5 = tab[5], *t6 = tab[6], *t7 = tab[7];
 	u32 q;
@@ -86,7 +89,12 @@ crc32_body(u32 crc, unsigned char const 
 # endif
 
 	b = (const u32 *)buf;
+# ifdef CONFIG_X86
+	--b;
+	for (i = 0; i < len; i++) {
+# else
 	for (--b; len; --len) {
+# endif
 		q = crc ^ *++b; /* use pre increment for speed */
 # if CRC_LE_BITS == 32
 		crc = DO_CRC4;
@@ -100,9 +108,14 @@ crc32_body(u32 crc, unsigned char const 
 	/* And the last few bytes */
 	if (len) {
 		u8 *p = (u8 *)(b + 1) - 1;
+# ifdef CONFIG_X86
+		for (i = 0; i < len; i++)
+			DO_CRC(*++p); /* use pre increment for speed */
+# else
 		do {
 			DO_CRC(*++p); /* use pre increment for speed */
 		} while (--len);
+# endif
 	}
 	return crc;
 #undef DO_CRC




* [PATCH v6 10/10] crc32-final.diff
       [not found] <20110831213729.395283830@systemfabricworks.com>
                   ` (8 preceding siblings ...)
  2011-08-31 22:30 ` [PATCH v6 09/10] crc32-optimize-loops-for-x86.diff Bob Pearson
@ 2011-08-31 22:30 ` Bob Pearson
  2011-09-01  3:03 ` [PATCH v6 08/10] crc32-add-slicing-by-8.diff Bob Pearson
  10 siblings, 0 replies; 28+ messages in thread
From: Bob Pearson @ 2011-08-31 22:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: fzago, rpearson, Joakim Tjernlund, George Spelvin, akpm

Some final changes
	- added a comment at the top of crc32.c

Signed-off-by: Bob Pearson <rpearson@systemfabricworks.com>

---
 lib/crc32.c |    4 ++++
 1 file changed, 4 insertions(+)

Index: for-next/lib/crc32.c
===================================================================
--- for-next.orig/lib/crc32.c
+++ for-next/lib/crc32.c
@@ -1,4 +1,8 @@
 /*
+ * Aug 8, 2011 Bob Pearson with help from Joakim Tjernlund and George Spelvin
+ * cleaned up code to current version of sparse and added the slicing-by-8
+ * algorithm to the closely similar existing slicing-by-4 algorithm.
+ *
  * Oct 15, 2000 Matt Domsch <Matt_Domsch@dell.com>
  * Nicer crc32 functions/docs submitted by linux@horizon.com.  Thanks!
  * Code was from the public domain, copyright abandoned.  Code was




* Re: [PATCH v6 08/10] crc32-add-slicing-by-8.diff
       [not found] <20110831213729.395283830@systemfabricworks.com>
                   ` (9 preceding siblings ...)
  2011-08-31 22:30 ` [PATCH v6 10/10] crc32-final.diff Bob Pearson
@ 2011-09-01  3:03 ` Bob Pearson
  2011-09-07  7:32   ` Joakim Tjernlund
  10 siblings, 1 reply; 28+ messages in thread
From: Bob Pearson @ 2011-09-01  3:03 UTC (permalink / raw)
  To: linux-kernel; +Cc: fzago, rpearson, Joakim Tjernlund, George Spelvin, akpm

I've been looking at this stuff for too long! I just noticed that
crc32_body incorrectly always uses CRC_LE_BITS to pick the algorithm.
Replace it with a function parameter that will get optimized out
by the compiler since crc32_body is inlined.
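
For illustration only (this is not part of the patch): a minimal
userspace sketch of selecting the step size through a constant
parameter of an inline function. step_count, rem_count and
split_selfcheck are made-up names; the arithmetic mirrors the
bits == 32 and else branches of crc32_body(). Whether the unused branch
is actually removed depends on the compiler and optimization level.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Number of whole steps in len bytes. When the function is inlined and
 * bits is a compile-time constant at the call site, the compiler can
 * fold the comparison and keep only one of the two shifts.
 */
static inline size_t step_count(size_t len, const unsigned bits)
{
	if (bits == 32)
		return len >> 2;	/* slicing-by-4: 4 bytes per step */
	return len >> 3;		/* slicing-by-8: 8 bytes per step */
}

/* Bytes left over for the byte-at-a-time tail loop. */
static inline size_t rem_count(size_t len, const unsigned bits)
{
	if (bits == 32)
		return len & 3;
	return len & 7;
}

/* Steps plus remainder should always account for every byte. */
static void split_selfcheck(void)
{
	size_t len;

	for (len = 0; len < 64; len++) {
		assert(step_count(len, 32) * 4 + rem_count(len, 32) == len);
		assert(step_count(len, 64) * 8 + rem_count(len, 64) == len);
	}
}
```

A caller would pass its own constant, in the way crc32_le() passes
CRC_LE_BITS and crc32_be() passes CRC_BE_BITS, so the two entry points
no longer share one preprocessor test.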

Add the slicing-by-8 algorithm alongside the existing
slicing-by-4 algorithm. This consists of:
	- extending the largest BITS size from 32 to 64
	- extending the tables from tab[4][256] to up to tab[8][256]
	- adding code for the inner loop.

Signed-off-by: Bob Pearson <rpearson@systemfabricworks.com>

---
 lib/crc32.c          |   51 ++++++++++++++++++++++++++++++++++-----------------
 lib/crc32defs.h      |   29 +++++++++++++++++++++--------
 lib/gen_crc32table.c |   43 +++++++++++++++++++++++++++----------------
 3 files changed, 82 insertions(+), 41 deletions(-)

Index: for-next/lib/crc32.c
===================================================================
--- for-next.orig/lib/crc32.c
+++ for-next/lib/crc32.c
@@ -47,25 +47,29 @@ MODULE_LICENSE("GPL");
 
 #if CRC_LE_BITS > 8 || CRC_BE_BITS > 8
 
+/* implements slicing-by-4 or slicing-by-8 algorithm */
 static inline u32
-crc32_body(u32 crc, unsigned char const *buf, size_t len, const u32 (*tab)[256])
+crc32_body(u32 crc, unsigned char const *buf, size_t len,
+	   const u32 (*tab)[256], const unsigned bits)
 {
 # ifdef __LITTLE_ENDIAN
 #  define DO_CRC(x) (crc = t0[(crc ^ (x)) & 255] ^ (crc >> 8))
-#  define DO_CRC4 crc = t3[(crc) & 255] ^ \
-			t2[(crc >> 8) & 255] ^ \
-			t1[(crc >> 16) & 255] ^ \
-			t0[(crc >> 24) & 255]
+#  define DO_CRC4 (t3[(q) & 255] ^ t2[(q >> 8) & 255] ^ \
+		   t1[(q >> 16) & 255] ^ t0[(q >> 24) & 255])
+#  define DO_CRC8 (t7[(q) & 255] ^ t6[(q >> 8) & 255] ^ \
+		   t5[(q >> 16) & 255] ^ t4[(q >> 24) & 255])
 # else
 #  define DO_CRC(x) (crc = t0[((crc >> 24) ^ (x)) & 255] ^ (crc << 8))
-#  define DO_CRC4 crc = t0[(crc) & 255] ^ \
-			t1[(crc >> 8) & 255] ^ \
-			t2[(crc >> 16) & 255] ^ \
-			t3[(crc >> 24) & 255]
+#  define DO_CRC4 (t0[(q) & 255] ^ t1[(q >> 8) & 255] ^ \
+		   t2[(q >> 16) & 255] ^ t3[(q >> 24) & 255])
+#  define DO_CRC8 (t4[(q) & 255] ^ t5[(q >> 8) & 255] ^ \
+		   t6[(q >> 16) & 255] ^ t7[(q >> 24) & 255])
 # endif
 	const u32 *b;
-	size_t    rem_len;
+	size_t rem_len;
 	const u32 *t0 = tab[0], *t1 = tab[1], *t2 = tab[2], *t3 = tab[3];
+	const u32 *t4 = tab[4], *t5 = tab[5], *t6 = tab[6], *t7 = tab[7];
+	u32 q;
 
 	/* Align it */
 	if (unlikely((long)buf & 3 && len)) {
@@ -73,13 +77,25 @@ crc32_body(u32 crc, unsigned char const 
 			DO_CRC(*buf++);
 		} while ((--len) && ((long)buf)&3);
 	}
-	rem_len = len & 3;
-	/* load data 32 bits wide, xor data 32 bits wide. */
-	len = len >> 2;
+
+	if (bits == 32) {
+		rem_len = len & 3;
+		len = len >> 2;
+	} else {
+		rem_len = len & 7;
+		len = len >> 3;
+	}
+
 	b = (const u32 *)buf;
 	for (--b; len; --len) {
-		crc ^= *++b; /* use pre increment for speed */
-		DO_CRC4;
+		q = crc ^ *++b; /* use pre increment for speed */
+		if (bits == 32)
+			crc = DO_CRC4;
+		else {
+			crc = DO_CRC8;
+			q = *++b;
+			crc ^= DO_CRC4;
+		}
 	}
 	len = rem_len;
 	/* And the last few bytes */
@@ -92,6 +108,7 @@ crc32_body(u32 crc, unsigned char const 
 	return crc;
 #undef DO_CRC
 #undef DO_CRC4
+#undef DO_CRC8
 }
 #endif
 
@@ -135,7 +152,7 @@ u32 __pure crc32_le(u32 crc, unsigned ch
 	const u32      (*tab)[] = crc32table_le;
 
 	crc = (__force u32) __cpu_to_le32(crc);
-	crc = crc32_body(crc, p, len, tab);
+	crc = crc32_body(crc, p, len, tab, CRC_LE_BITS);
 	crc = __le32_to_cpu((__force __le32)crc);
 #endif
 	return crc;
@@ -183,7 +200,7 @@ u32 __pure crc32_be(u32 crc, unsigned ch
 	const u32      (*tab)[] = crc32table_be;
 
 	crc = (__force u32) __cpu_to_be32(crc);
-	crc = crc32_body(crc, p, len, tab);
+	crc = crc32_body(crc, p, len, tab, CRC_BE_BITS);
 	crc = __be32_to_cpu((__force __be32)crc);
 # endif
 	return crc;
Index: for-next/lib/crc32defs.h
===================================================================
--- for-next.orig/lib/crc32defs.h
+++ for-next/lib/crc32defs.h
@@ -6,29 +6,42 @@
 #define CRCPOLY_LE 0xedb88320
 #define CRCPOLY_BE 0x04c11db7
 
-/* How many bits at a time to use.  Valid values are 1, 2, 4, 8, and 32. */
-/* For less performance-sensitive, use 4 or 8 */
+/*
+ * How many bits at a time to use.  Valid values are 1, 2, 4, 8, 32 and 64.
+ * For less performance-sensitive, use 4 or 8 to save table size.
+ * For larger systems choose same as CPU architecture as default.
+ * This works well on X86_64, SPARC64 systems. This may require some
+ * elaboration after experiments with other architectures.
+ */
 #ifndef CRC_LE_BITS
-# define CRC_LE_BITS 32
+#  ifdef CONFIG_64BIT
+#  define CRC_LE_BITS 64
+#  else
+#  define CRC_LE_BITS 32
+#  endif
 #endif
 #ifndef CRC_BE_BITS
-# define CRC_BE_BITS 32
+#  ifdef CONFIG_64BIT
+#  define CRC_BE_BITS 64
+#  else
+#  define CRC_BE_BITS 32
+#  endif
 #endif
 
 /*
  * Little-endian CRC computation.  Used with serial bit streams sent
  * lsbit-first.  Be sure to use cpu_to_le32() to append the computed CRC.
  */
-#if CRC_LE_BITS > 32 || CRC_LE_BITS < 1 || CRC_LE_BITS == 16 || \
+#if CRC_LE_BITS > 64 || CRC_LE_BITS < 1 || CRC_LE_BITS == 16 || \
 	CRC_LE_BITS & CRC_LE_BITS-1
-# error "CRC_LE_BITS must be one of {1, 2, 4, 8, 32}"
+# error "CRC_LE_BITS must be one of {1, 2, 4, 8, 32, 64}"
 #endif
 
 /*
  * Big-endian CRC computation.  Used with serial bit streams sent
  * msbit-first.  Be sure to use cpu_to_be32() to append the computed CRC.
  */
-#if CRC_BE_BITS > 32 || CRC_BE_BITS < 1 || CRC_BE_BITS == 16 || \
+#if CRC_BE_BITS > 64 || CRC_BE_BITS < 1 || CRC_BE_BITS == 16 || \
 	CRC_BE_BITS & CRC_BE_BITS-1
-# error "CRC_BE_BITS must be one of {1, 2, 4, 8, 32}"
+# error "CRC_BE_BITS must be one of {1, 2, 4, 8, 32, 64}"
 #endif
Index: for-next/lib/gen_crc32table.c
===================================================================
--- for-next.orig/lib/gen_crc32table.c
+++ for-next/lib/gen_crc32table.c
@@ -1,23 +1,28 @@
 #include <stdio.h>
+#include "../include/generated/autoconf.h"
 #include "crc32defs.h"
 #include <inttypes.h>
 
 #define ENTRIES_PER_LINE 4
 
-#if CRC_LE_BITS <= 8
-#define LE_TABLE_SIZE (1 << CRC_LE_BITS)
+#if CRC_LE_BITS > 8
+# define LE_TABLE_ROWS (CRC_LE_BITS/8)
+# define LE_TABLE_SIZE 256
 #else
-#define LE_TABLE_SIZE 256
+# define LE_TABLE_ROWS 1
+# define LE_TABLE_SIZE (1 << CRC_LE_BITS)
 #endif
 
-#if CRC_BE_BITS <= 8
-#define BE_TABLE_SIZE (1 << CRC_BE_BITS)
+#if CRC_BE_BITS > 8
+# define BE_TABLE_ROWS (CRC_BE_BITS/8)
+# define BE_TABLE_SIZE 256
 #else
-#define BE_TABLE_SIZE 256
+# define BE_TABLE_ROWS 1
+# define BE_TABLE_SIZE (1 << CRC_BE_BITS)
 #endif
 
-static uint32_t crc32table_le[4][256];
-static uint32_t crc32table_be[4][256];
+static uint32_t crc32table_le[LE_TABLE_ROWS][256];
+static uint32_t crc32table_be[BE_TABLE_ROWS][256];
 
 /**
  * crc32init_le() - allocate and initialize LE table data
@@ -40,7 +45,7 @@ static void crc32init_le(void)
 	}
 	for (i = 0; i < LE_TABLE_SIZE; i++) {
 		crc = crc32table_le[0][i];
-		for (j = 1; j < 4; j++) {
+		for (j = 1; j < LE_TABLE_ROWS; j++) {
 			crc = crc32table_le[0][crc & 0xff] ^ (crc >> 8);
 			crc32table_le[j][i] = crc;
 		}
@@ -64,18 +69,18 @@ static void crc32init_be(void)
 	}
 	for (i = 0; i < BE_TABLE_SIZE; i++) {
 		crc = crc32table_be[0][i];
-		for (j = 1; j < 4; j++) {
+		for (j = 1; j < BE_TABLE_ROWS; j++) {
 			crc = crc32table_be[0][(crc >> 24) & 0xff] ^ (crc << 8);
 			crc32table_be[j][i] = crc;
 		}
 	}
 }
 
-static void output_table(uint32_t (*table)[256], int len, char *trans)
+static void output_table(uint32_t (*table)[256], int rows, int len, char *trans)
 {
 	int i, j;
 
-	for (j = 0 ; j < 4; j++) {
+	for (j = 0 ; j < rows; j++) {
 		printf("{");
 		for (i = 0; i < len - 1; i++) {
 			if (i % ENTRIES_PER_LINE == 0)
@@ -92,15 +97,21 @@ int main(int argc, char** argv)
 
 	if (CRC_LE_BITS > 1) {
 		crc32init_le();
-		printf("static const u32 crc32table_le[4][256] = {");
-		output_table(crc32table_le, LE_TABLE_SIZE, "tole");
+		printf("static const u32 __cacheline_aligned "
+		       "crc32table_le[%d][%d] = {",
+		       LE_TABLE_ROWS, LE_TABLE_SIZE);
+		output_table(crc32table_le, LE_TABLE_ROWS,
+			     LE_TABLE_SIZE, "tole");
 		printf("};\n");
 	}
 
 	if (CRC_BE_BITS > 1) {
 		crc32init_be();
-		printf("static const u32 crc32table_be[4][256] = {");
-		output_table(crc32table_be, BE_TABLE_SIZE, "tobe");
+		printf("static const u32 __cacheline_aligned "
+		       "crc32table_be[%d][%d] = {",
+		       BE_TABLE_ROWS, BE_TABLE_SIZE);
+		output_table(crc32table_be, BE_TABLE_ROWS,
+			     BE_TABLE_SIZE, "tobe");
 		printf("};\n");
 	}
 

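The size arithmetic in the patch above can be checked in user space. This is a minimal sketch, not code from the series: the function names (`crc_bits_valid`, `crc_table_rows`, `crc_table_size`, `crc_table_bytes`) are hypothetical stand-ins for the preprocessor logic in crc32defs.h and gen_crc32table.c, and each table entry is assumed to be a 32-bit word.

```c
#include <stddef.h>

/* Mirrors the #if check: reject out-of-range values, 16, and non-powers of two. */
static int crc_bits_valid(int bits)
{
	return !(bits > 64 || bits < 1 || bits == 16 || (bits & (bits - 1)));
}

/* Rows in the generated table: one per byte consumed per step once bits > 8. */
static int crc_table_rows(int bits)
{
	return bits > 8 ? bits / 8 : 1;
}

/* Entries per row: 256 for the byte-sliced variants, 2^bits otherwise. */
static int crc_table_size(int bits)
{
	return bits > 8 ? 256 : 1 << bits;
}

/* Bytes of table data emitted; the generator prints no table for bits == 1. */
static size_t crc_table_bytes(int bits)
{
	if (bits <= 1)
		return 0;
	return (size_t)crc_table_rows(bits) * (size_t)crc_table_size(bits) * 4;
}
```

Under these assumptions the 32-bit setting emits 4 rows of 256 entries (4096 bytes) and the 64-bit setting emits 8 rows (8192 bytes), which is the table-size cost the comment in crc32defs.h refers to.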
^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v6 04/10] crc32-add-pointer-to-tab.diff
  2011-08-31 22:30 ` [PATCH v6 04/10] crc32-add-pointer-to-tab.diff Bob Pearson
@ 2011-09-01  8:16   ` Joakim Tjernlund
  0 siblings, 0 replies; 28+ messages in thread
From: Joakim Tjernlund @ 2011-09-01  8:16 UTC (permalink / raw)
  To: Bob Pearson; +Cc: akpm, fzago, George Spelvin, linux-kernel

Bob Pearson <rpearson@systemfabricworks.com> wrote on 2011/09/01 00:30:06:

> From: Bob Pearson <rpearson@systemfabricworks.com>
> To: linux-kernel@vger.kernel.org
> Cc: fzago@systemfabricworks.com, rpearson@systemfabricworks.com, Joakim Tjernlund <joakim.tjernlund@transmode.se>, George Spelvin <linux@horizon.com>, akpm@linux-foundation.org
> Date: 2011/09/01 00:30
> Subject: [PATCH v6 04/10] crc32-add-pointer-to-tab.diff
>
> Replace 2D array references by pointer references in loops.
> This change has no effect on X86 code but improves PPC
> performance.
>
> Signed-off-by: Bob Pearson <rpearson@systemfabricworks.com>

Yes, this makes a significant difference on ppc32, your self test
went from
  crc32: self tests passed, processed 225944 bytes in 2257673 nsec
to
  crc32: self tests passed, processed 225944 bytes in 1949869 nsec
About 15% faster.
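
For reference, the two timings above can be turned into percentages in two ways. The sketch below is only illustrative arithmetic; the helper names are made up and are not part of the self test.

```c
/* Percent reduction in elapsed time going from before_ns to after_ns. */
static double time_reduction_pct(double before_ns, double after_ns)
{
	return 100.0 * (before_ns - after_ns) / before_ns;
}

/* Percent gain in throughput for the same number of bytes processed. */
static double throughput_gain_pct(double before_ns, double after_ns)
{
	return 100.0 * (before_ns / after_ns - 1.0);
}
```

With 2257673 nsec before and 1949869 nsec after, the elapsed time drops by roughly 13.6%, which is the same thing as roughly 15.8% more bytes per unit time.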

Technically this is my patch so I am adding my SOB:

Signed-off-by: Joakim Tjernlund <joakim.tjernlund@transmode.se>

>
> ---
>  lib/crc32.c |   21 +++++++++++----------
>  1 file changed, 11 insertions(+), 10 deletions(-)
>
> Index: for-next/lib/crc32.c
> ===================================================================
> --- for-next.orig/lib/crc32.c
> +++ for-next/lib/crc32.c
> @@ -53,20 +53,21 @@ static inline u32
>  crc32_body(u32 crc, unsigned char const *buf, size_t len, const u32 (*tab)[256])
>  {
>  # ifdef __LITTLE_ENDIAN
> -#  define DO_CRC(x) crc = tab[0][(crc ^ (x)) & 255] ^ (crc >> 8)
> -#  define DO_CRC4 crc = tab[3][(crc) & 255] ^ \
> -      tab[2][(crc >> 8) & 255] ^ \
> -      tab[1][(crc >> 16) & 255] ^ \
> -      tab[0][(crc >> 24) & 255]
> +#  define DO_CRC(x) (crc = t0[(crc ^ (x)) & 255] ^ (crc >> 8))
> +#  define DO_CRC4 crc = t3[(crc) & 255] ^ \
> +         t2[(crc >> 8) & 255] ^ \
> +         t1[(crc >> 16) & 255] ^ \
> +         t0[(crc >> 24) & 255]
>  # else
> -#  define DO_CRC(x) crc = tab[0][((crc >> 24) ^ (x)) & 255] ^ (crc << 8)
> -#  define DO_CRC4 crc = tab[0][(crc) & 255] ^ \
> -      tab[1][(crc >> 8) & 255] ^ \
> -      tab[2][(crc >> 16) & 255] ^ \
> -      tab[3][(crc >> 24) & 255]
> +#  define DO_CRC(x) (crc = t0[((crc >> 24) ^ (x)) & 255] ^ (crc << 8))
> +#  define DO_CRC4 crc = t0[(crc) & 255] ^ \
> +         t1[(crc >> 8) & 255] ^ \
> +         t2[(crc >> 16) & 255] ^ \
> +         t3[(crc >> 24) & 255]
>  # endif
>     const u32 *b;
>     size_t    rem_len;
> +   const u32 *t0 = tab[0], *t1 = tab[1], *t2 = tab[2], *t3 = tab[3];
>
>     /* Align it */
>     if (unlikely((long)buf & 3 && len)) {
>
>


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v6 05/10] crc32-misc-cleanup.diff
  2011-08-31 22:30 ` [PATCH v6 05/10] crc32-misc-cleanup.diff Bob Pearson
@ 2011-09-02 23:50   ` Andrew Morton
  2011-09-03  1:44     ` Stephen Rothwell
  2011-09-06 16:05     ` Bob Pearson
  0 siblings, 2 replies; 28+ messages in thread
From: Andrew Morton @ 2011-09-02 23:50 UTC (permalink / raw)
  To: Bob Pearson; +Cc: linux-kernel, fzago, Joakim Tjernlund, George Spelvin

On Wed, 31 Aug 2011 17:30:12 -0500
Bob Pearson <rpearson@systemfabricworks.com> wrote:

> Misc cleanup of lib/crc32.c and related files
> 	- removed unnecessary header files.
> 	- straightened out some convoluted ifdef's
> 	- rewrote some references to 2 dimensional arrays as 1 dimensional
> 	  arrays to make them correct. I.e. replaced tab[i] with tab[0][i].
> 	- a few trivial whitespace changes
> 	- fixed a warning in gen_crc32tables.c caused by a mismatch in the
> 	  type of the pointer passed to output table. Since the table is
> 	  only used at kernel compile time, it is simpler to make the table
> 	  big enough to hold the largest column size used. One cannot make the
> 	  column size smaller in output_table because it has to be used by
> 	  both the le and be tables and they can have different column sizes.
>
> ...
>
> --- for-next.orig/lib/crc32.c
> +++ for-next/lib/crc32.c
> @@ -23,13 +23,10 @@
>  /* see: Documentation/crc32.txt for a description of algorithms */
>  
>  #include <linux/crc32.h>
> -#include <linux/kernel.h>
>  #include <linux/module.h>
> -#include <linux/compiler.h>
>  #include <linux/types.h>
> -#include <linux/init.h>
> -#include <asm/atomic.h>
>  #include "crc32defs.h"

I don't like this bit much.  Surely there's _something_ in here which
needs kernel.h, and crc32_init() is marked __init so init.h is
certainly needed.

Sure, these things may be accidentally dragged in via nested
includes but it's bad to depend upon that - such things regularly cause
breakage when configs are changed.



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v6 03/10] crc32-replace-self-test.diff
  2011-08-31 22:29 ` [PATCH v6 03/10] crc32-replace-self-test.diff Bob Pearson
@ 2011-09-02 23:51   ` Andrew Morton
  2011-09-06 16:14     ` Bob Pearson
  0 siblings, 1 reply; 28+ messages in thread
From: Andrew Morton @ 2011-09-02 23:51 UTC (permalink / raw)
  To: Bob Pearson; +Cc: linux-kernel, fzago, Joakim Tjernlund, George Spelvin

On Wed, 31 Aug 2011 17:29:58 -0500
Bob Pearson <rpearson@systemfabricworks.com> wrote:

> Replaced the unit test provided in crc32.c, which doesn't have a
> makefile and doesn't compile with current headers, with a simpler
> self test routine that also gives a measure of performance and
> runs at module init time. The self test option can be enabled
> through a configuration option CONFIG_CRC32_SELFTEST.
> 
> The test stresses the pre and post loops and is thus not very
> realistic since actual uses will likely have addresses and lengths
> that are at least 4 byte aligned. However, the main loop is long
> enough so that the performance is dominated by that loop.
> 
> The expected values for crc32_le and crc32_be were generated
> with the original version of crc32.c using CRC_LE_BITS = 8 and
> CRC_BE_BITS = 8. These values were then used to check all the
> values of the BITS parameters in both the original and new versions.
> 
> The performance results show some variability from run to run
> in spite of attempts to both warm the cache and reduce the amount
> of OS noise by limiting interrupts during the test. To get comparable
> results and to analyse options wrt performance the best time
> reported over a small sample of runs has been taken.
> 

I don't object to a self-test which actually works, but it seems pretty
lame that the self-test exists in kernel mode when it is so simple to
prepare a much more useful and powerful correctness/performance test
harness in userspace.

> ...
>
> -static u32 test_step(u32 init, unsigned char *buf, size_t len)
> -{
> -	u32 crc1, crc2;
> -	size_t i;
> +		crc ^= crc32_be(test[i].crc, test_buf +
> +		    test[i].start, test[i].length);
> +	}
>  
> -	crc1 = crc32_be(init, buf, len);
> -	store_be(crc1, buf + len);
> -	crc2 = crc32_be(init, buf, len + 4);
> -	if (crc2)
> -		printf("\nCRC cancellation fail: 0x%08x should be 0\n",
> -		       crc2);
> -
> -	for (i = 0; i <= len + 4; i++) {
> -		crc2 = crc32_be(init, buf, i);
> -		crc2 = crc32_be(crc2, buf + i, len + 4 - i);
> -		if (crc2)
> -			printf("\nCRC split fail: 0x%08x\n", crc2);
> +	/* reduce OS noise */

This comment is useless.

> +	local_irq_save(flags);
> +	local_irq_disable();

local_irq_save() already does local_irq_disable().

local_irq_disable() doesn't protect against actions of other CPUs.  I'd
know if this was a bug if the comment wasn't useless :)

> +	getnstimeofday(&start);
> +	for (i = 0; i < 100; i++) {
> +		if (test[i].crc_le != crc32_le(test[i].crc, test_buf +
> +		    test[i].start, test[i].length))
> +			errors++;
> +
> +		if (test[i].crc_be != crc32_be(test[i].crc, test_buf +
> +		    test[i].start, test[i].length))
> +			errors++;
>  	}
> ...

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v6 05/10] crc32-misc-cleanup.diff
  2011-09-02 23:50   ` Andrew Morton
@ 2011-09-03  1:44     ` Stephen Rothwell
  2011-09-06 13:40       ` Joakim Tjernlund
  2011-09-06 16:05     ` Bob Pearson
  1 sibling, 1 reply; 28+ messages in thread
From: Stephen Rothwell @ 2011-09-03  1:44 UTC (permalink / raw)
  To: Bob Pearson
  Cc: Andrew Morton, linux-kernel, fzago, Joakim Tjernlund, George Spelvin

[-- Attachment #1: Type: text/plain, Size: 513 bytes --]

On Fri, 2 Sep 2011 16:50:47 -0700 Andrew Morton <akpm@linux-foundation.org> wrote:
>
> Sure, these things may be accidentally dragged in via nested
> includes but it's bad to depend upon that - such things regularly cause
> breakage when configs are changed.

And even the same config on different architectures or platforms.  That's
why we have Rule 1 in Documentation/SubmitChecklist ...

-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

[-- Attachment #2: Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v6 05/10] crc32-misc-cleanup.diff
  2011-09-03  1:44     ` Stephen Rothwell
@ 2011-09-06 13:40       ` Joakim Tjernlund
  2011-09-06 14:50         ` Stephen Rothwell
  0 siblings, 1 reply; 28+ messages in thread
From: Joakim Tjernlund @ 2011-09-06 13:40 UTC (permalink / raw)
  To: Stephen Rothwell
  Cc: Andrew Morton, fzago, George Spelvin, linux-kernel, Bob Pearson

Stephen Rothwell <sfr@canb.auug.org.au> wrote on 2011/09/03 03:44:03:
>
> On Fri, 2 Sep 2011 16:50:47 -0700 Andrew Morton <akpm@linux-foundation.org> wrote:
> >
> > Sure, these things may be accidentally dragged in via nested
> > includes but it's bad to depend upon that - such things regularly cause
> > breakage when configs are changed.
>
> And even the same config on different architectures or platforms.  That's
> why we have Rule 1 in Documentation/SubmitChecklist ...

It seems like an early version of the slice-by-8 crc32 algo. is in linux-next,
committed by you.
That version is crap, what's the plan?


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v6 05/10] crc32-misc-cleanup.diff
  2011-09-06 13:40       ` Joakim Tjernlund
@ 2011-09-06 14:50         ` Stephen Rothwell
  2011-09-06 19:38           ` Andrew Morton
  0 siblings, 1 reply; 28+ messages in thread
From: Stephen Rothwell @ 2011-09-06 14:50 UTC (permalink / raw)
  To: Joakim Tjernlund
  Cc: Andrew Morton, fzago, George Spelvin, linux-kernel, Bob Pearson

[-- Attachment #1: Type: text/plain, Size: 559 bytes --]

Hi,

On Tue, 6 Sep 2011 15:40:10 +0200 Joakim Tjernlund <joakim.tjernlund@transmode.se> wrote:
>
> It seems like an early version of the slice-by-8 crc32 algo. is in linux-next,
> committed by you.
> That version is crap, what's the plan?

I assume that it is one of the patches in Andrew's tree. When things get
back closer to normality with kernel.org, I assume that Andrew will
replace it with a newer version (if he has been sent one).

-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

[-- Attachment #2: Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH v6 05/10] crc32-misc-cleanup.diff
  2011-09-02 23:50   ` Andrew Morton
  2011-09-03  1:44     ` Stephen Rothwell
@ 2011-09-06 16:05     ` Bob Pearson
  1 sibling, 0 replies; 28+ messages in thread
From: Bob Pearson @ 2011-09-06 16:05 UTC (permalink / raw)
  To: 'Andrew Morton'
  Cc: linux-kernel, fzago, 'Joakim Tjernlund',
	'George Spelvin'



> -----Original Message-----
> From: Andrew Morton [mailto:akpm@linux-foundation.org]
> Sent: Friday, September 02, 2011 6:51 PM
> To: Bob Pearson
> Cc: linux-kernel@vger.kernel.org; fzago@systemfabricworks.com; Joakim
> Tjernlund; George Spelvin
> Subject: Re: [PATCH v6 05/10] crc32-misc-cleanup.diff
> 
> On Wed, 31 Aug 2011 17:30:12 -0500
> Bob Pearson <rpearson@systemfabricworks.com> wrote:
> 
> > Misc cleanup of lib/crc32.c and related files
> > 	- removed unnecessary header files.
> > 	- straightened out some convoluted ifdef's
> > 	- rewrote some references to 2 dimensional arrays as 1 dimensional
> > 	  arrays to make them correct. I.e. replaced tab[i] with tab[0][i].
> > 	- a few trivial whitespace changes
> > 	- fixed a warning in gen_crc32tables.c caused by a mismatch in the
> > 	  type of the pointer passed to output table. Since the table is
> > 	  only used at kernel compile time, it is simpler to make the table
> > 	  big enough to hold the largest column size used. One cannot make the
> > 	  column size smaller in output_table because it has to be used by
> > 	  both the le and be tables and they can have different column sizes.
> >
> > ...
> >
> > --- for-next.orig/lib/crc32.c
> > +++ for-next/lib/crc32.c
> > @@ -23,13 +23,10 @@
> >  /* see: Documentation/crc32.txt for a description of algorithms */
> >
> >  #include <linux/crc32.h>
> > -#include <linux/kernel.h>
> >  #include <linux/module.h>
> > -#include <linux/compiler.h>
> >  #include <linux/types.h>
> > -#include <linux/init.h>
> > -#include <asm/atomic.h>
> >  #include "crc32defs.h"
> 
> I don't like this bit much.  Surely there's _something_ in here which
> needs kernel.h, and crc32_init() is marked __init so init.h is
> certainly needed.
> 
> Sure, these things may be accidentally dragged in via nested
> includes but it's bad to depend upon that - such things regularly cause
> breakage when configs are changed.

I tried to copy the usage in other drivers. Happy to oblige.

> 



^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH v6 03/10] crc32-replace-self-test.diff
  2011-09-02 23:51   ` Andrew Morton
@ 2011-09-06 16:14     ` Bob Pearson
  0 siblings, 0 replies; 28+ messages in thread
From: Bob Pearson @ 2011-09-06 16:14 UTC (permalink / raw)
  To: 'Andrew Morton'
  Cc: linux-kernel, fzago, 'Joakim Tjernlund',
	'George Spelvin'



> -----Original Message-----
> From: Andrew Morton [mailto:akpm@linux-foundation.org]
> Sent: Friday, September 02, 2011 6:52 PM
> To: Bob Pearson
> Cc: linux-kernel@vger.kernel.org; fzago@systemfabricworks.com; Joakim
> Tjernlund; George Spelvin
> Subject: Re: [PATCH v6 03/10] crc32-replace-self-test.diff
> 
> On Wed, 31 Aug 2011 17:29:58 -0500
> Bob Pearson <rpearson@systemfabricworks.com> wrote:
> 
> > Replaced the unit test provided in crc32.c, which doesn't have a
> > makefile and doesn't compile with current headers, with a simpler
> > self test routine that also gives a measure of performance and
> > runs at module init time. The self test option can be enabled
> > through a configuration option CONFIG_CRC32_SELFTEST.
> >
> > The test stresses the pre and post loops and is thus not very
> > realistic since actual uses will likely have addresses and lengths
> > that are at least 4 byte aligned. However, the main loop is long
> > enough so that the performance is dominated by that loop.
> >
> > The expected values for crc32_le and crc32_be were generated
> > with the original version of crc32.c using CRC_LE_BITS = 8 and
> > CRC_BE_BITS = 8. These values were then used to check all the
> > values of the BITS parameters in both the original and new versions.
> >
> > The performance results show some variability from run to run
> > in spite of attempts to both warm the cache and reduce the amount
> > of OS noise by limiting interrupts during the test. To get comparable
> > results and to analyse options wrt performance the best time
> > reported over a small sample of runs has been taken.
> >
> 
> I don't object to a self-test which actually works, but it seems pretty
> lame that the self-test exists in kernel mode when it is so simple to
> prepare a much more useful and powerful correctness/performance test
> harness in userspace.
> 
> > ...
> >
> > -static u32 test_step(u32 init, unsigned char *buf, size_t len)
> > -{
> > -	u32 crc1, crc2;
> > -	size_t i;
> > +		crc ^= crc32_be(test[i].crc, test_buf +
> > +		    test[i].start, test[i].length);
> > +	}
> >
> > -	crc1 = crc32_be(init, buf, len);
> > -	store_be(crc1, buf + len);
> > -	crc2 = crc32_be(init, buf, len + 4);
> > -	if (crc2)
> > -		printf("\nCRC cancellation fail: 0x%08x should be 0\n",
> > -		       crc2);
> > -
> > -	for (i = 0; i <= len + 4; i++) {
> > -		crc2 = crc32_be(init, buf, i);
> > -		crc2 = crc32_be(crc2, buf + i, len + 4 - i);
> > -		if (crc2)
> > -			printf("\nCRC split fail: 0x%08x\n", crc2);
> > +	/* reduce OS noise */
> 
> This comment is useless.

I wasn't trying to claim perfection here. The variance in resulting
performance results was significantly reduced. I didn't make a detailed
statistical study but the range of outliers was smaller by at least 10X.
Without this change the effect of the coding changes that were being debated
was swamped by random variation.

> 
> > +	local_irq_save(flags);
> > +	local_irq_disable();
> 
> local_irq_save() already does local_irq_disable().

OK

> 
> local_irq_disable() doesn't protect against actions of other CPUs.  I'd
> know if this was a bug if the comment wasn't useless :)
> 
> > +	getnstimeofday(&start);
> > +	for (i = 0; i < 100; i++) {
> > +		if (test[i].crc_le != crc32_le(test[i].crc, test_buf +
> > +		    test[i].start, test[i].length))
> > +			errors++;
> > +
> > +		if (test[i].crc_be != crc32_be(test[i].crc, test_buf +
> > +		    test[i].start, test[i].length))
> > +			errors++;
> >  	}
> > ...


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v6 05/10] crc32-misc-cleanup.diff
  2011-09-06 14:50         ` Stephen Rothwell
@ 2011-09-06 19:38           ` Andrew Morton
  2011-09-06 20:18             ` Bob Pearson
  2011-09-07 16:30             ` Bob Pearson
  0 siblings, 2 replies; 28+ messages in thread
From: Andrew Morton @ 2011-09-06 19:38 UTC (permalink / raw)
  To: Stephen Rothwell
  Cc: Joakim Tjernlund, fzago, George Spelvin, linux-kernel, Bob Pearson

On Wed, 7 Sep 2011 00:50:13 +1000
Stephen Rothwell <sfr@canb.auug.org.au> wrote:

> Hi,
> 
> On Tue, 6 Sep 2011 15:40:10 +0200 Joakim Tjernlund <joakim.tjernlund@transmode.se> wrote:
> >
> > It seems like an early version of the slice-by-8 crc32 algo. is in linux-next,
> > committed by you.
> > That version is crap, what's the plan?
> 
> I assume that it is one of the patches in Andrew's tree. When things get
> back closer to normality with kernel.org, I assume that Andrew will
> replace it with a newer version (if he has been sent one).

Yup.  lib-crc-add-slice-by-8-algorithm-to-crc32c.patch is dead meat.  I
sometimes keep things like that around to get them a bit of testing
while reminding myself that there's an open issue to track.


^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH v6 05/10] crc32-misc-cleanup.diff
  2011-09-06 19:38           ` Andrew Morton
@ 2011-09-06 20:18             ` Bob Pearson
  2011-09-07  7:39               ` Joakim Tjernlund
  2011-09-07 16:30             ` Bob Pearson
  1 sibling, 1 reply; 28+ messages in thread
From: Bob Pearson @ 2011-09-06 20:18 UTC (permalink / raw)
  To: 'Andrew Morton', 'Stephen Rothwell'
  Cc: 'Joakim Tjernlund', fzago, 'George Spelvin',
	linux-kernel

> 
> Yup.  lib-crc-add-slice-by-8-algorithm-to-crc32c.patch is dead meat.  I
> sometimes keep things like that around to get them a bit of testing
> while reminding myself that there's an open issue to track.

I thought I was getting close until recently someone sent out a patch set
for crc32c.c, the other 32 bit CRC in common use, based on an earlier
version of the changes we have been working on for crc32.c. This has brought
in other interested parties and created a bit of duplicated code. I am at a
loss as to the best way to proceed. Personally I would like to see this
change go upstream and then let the rest of the world figure out how to best
merge things.

The list of needed changes based on recent comments I am aware of are:
 - put back in a couple of header files per Andrew
 - fix the summary phrases to conform to coding standards per Andrew
 - add signed off by for Joakim to patch 04/10 per Joakim
 - fix bug in patch 06/10 noted in my email by passing bits as a parameter
   to crc32_body

If anyone wants additional changes please let me know and I can put out a
clean version.



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v6 08/10] crc32-add-slicing-by-8.diff
  2011-08-31 22:30 ` [PATCH v6 08/10] crc32-add-slicing-by-8.diff Bob Pearson
@ 2011-09-07  7:31   ` Joakim Tjernlund
  2011-09-07 19:44     ` Bob Pearson
       [not found]   ` <OF3D37A60B.7A33B855-ONC1257904.00276B5B-C1257904.002951AF@LocalDomain>
  1 sibling, 1 reply; 28+ messages in thread
From: Joakim Tjernlund @ 2011-09-07  7:31 UTC (permalink / raw)
  To: Bob Pearson; +Cc: akpm, fzago, George Spelvin, linux-kernel

Bob Pearson <rpearson@systemfabricworks.com> wrote on 2011/09/01 00:30:32:
>
> add slicing-by-8 algorithm to the existing
> slicing-by-4 algorithm. This consists of:
>    - extend largest BITS size from 32 to 64
>    - extend tables from tab[4][256] to up to tab[8][256]
>    - Add code for inner loop.
>
> Signed-off-by: Bob Pearson <rpearson@systemfabricworks.com>
>
> ---
>  lib/crc32.c          |   40 ++++++++++++++++++++++++++++------------
>  lib/crc32defs.h      |   29 +++++++++++++++++++++--------
>  lib/gen_crc32table.c |   43 +++++++++++++++++++++++++++----------------
>  3 files changed, 76 insertions(+), 36 deletions(-)
>
> Index: for-next/lib/crc32.c
> ===================================================================
> --- for-next.orig/lib/crc32.c
> +++ for-next/lib/crc32.c
> @@ -47,25 +47,28 @@ MODULE_LICENSE("GPL");
>
>  #if CRC_LE_BITS > 8 || CRC_BE_BITS > 8
>
> +/* implements slicing-by-4 or slicing-by-8 algorithm */
>  static inline u32
>  crc32_body(u32 crc, unsigned char const *buf, size_t len, const u32 (*tab)[256])
>  {
>  # ifdef __LITTLE_ENDIAN
>  #  define DO_CRC(x) (crc = t0[(crc ^ (x)) & 255] ^ (crc >> 8))
> -#  define DO_CRC4 crc = t3[(crc) & 255] ^ \
> -         t2[(crc >> 8) & 255] ^ \
> -         t1[(crc >> 16) & 255] ^ \
> -         t0[(crc >> 24) & 255]
> +#  define DO_CRC4 (t3[(q) & 255] ^ t2[(q >> 8) & 255] ^ \
> +         t1[(q >> 16) & 255] ^ t0[(q >> 24) & 255])
> +#  define DO_CRC8 (t7[(q) & 255] ^ t6[(q >> 8) & 255] ^ \
> +         t5[(q >> 16) & 255] ^ t4[(q >> 24) & 255])
>  # else
>  #  define DO_CRC(x) (crc = t0[((crc >> 24) ^ (x)) & 255] ^ (crc << 8))
> -#  define DO_CRC4 crc = t0[(crc) & 255] ^ \
> -         t1[(crc >> 8) & 255] ^ \
> -         t2[(crc >> 16) & 255] ^ \
> -         t3[(crc >> 24) & 255]
> +#  define DO_CRC4 (t0[(q) & 255] ^ t1[(q >> 8) & 255] ^ \
> +         t2[(q >> 16) & 255] ^ t3[(q >> 24) & 255])
> +#  define DO_CRC8 (t4[(q) & 255] ^ t5[(q >> 8) & 255] ^ \
> +         t6[(q >> 16) & 255] ^ t7[(q >> 24) & 255])

Don't like the new DO_CRC8 macro. You could get by with my earlier
suggestion:
#  define DO_CRC4(crc, x0, x1, x2, x3) \
		x3[(crc) & 255] ^		\
		x2[(crc >> 8) & 255] ^	\
		x1[(crc >> 16) & 255] ^ \
		x0[(crc >> 24) & 255]

Then the code becomes something like
if (bits == 64) {
		crc = DO_CRC4(crc, t4, t5, t6, t7);
		++b;
		crc ^= DO_CRC4(*b, t0, t1, t2, t3);
} else
		crc = DO_CRC4(crc, t0, t1, t2, t3);
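
The single-macro idea above can be tried out in user space. What follows is only a sketch under stated assumptions, not the kernel code: it covers the little-endian CRC only, assembles words with a hypothetical `load_le32()` helper so host endianness does not matter, parenthesises the macro, and uses made-up names (`tab`, `init_tab`, `crc32_le_sliced`, `crc32_le_bitwise`). `bits` is an ordinary parameter here; with a constant argument and inlining a compiler can fold the branch away.

```c
#include <stddef.h>
#include <stdint.h>

#define CRCPOLY_LE 0xedb88320u	/* reflected CRC-32 polynomial */

/* One parameterised macro instead of separate DO_CRC4/DO_CRC8. */
#define DO_CRC4(q, x0, x1, x2, x3) \
	((x3)[(q) & 255] ^ (x2)[((q) >> 8) & 255] ^ \
	 (x1)[((q) >> 16) & 255] ^ (x0)[((q) >> 24) & 255])

static uint32_t tab[8][256];

/* Bit-at-a-time reference with no pre/post inversion. */
static uint32_t crc32_le_bitwise(uint32_t crc, const unsigned char *p, size_t len)
{
	while (len--) {
		crc ^= *p++;
		for (int i = 0; i < 8; i++)
			crc = (crc >> 1) ^ ((crc & 1) ? CRCPOLY_LE : 0);
	}
	return crc;
}

/* tab[j][i] is the CRC of byte i followed by j zero bytes. */
static void init_tab(void)
{
	for (unsigned i = 0; i < 256; i++) {
		unsigned char b = (unsigned char)i;

		tab[0][i] = crc32_le_bitwise(0, &b, 1);
	}
	for (unsigned i = 0; i < 256; i++) {
		uint32_t crc = tab[0][i];

		for (int j = 1; j < 8; j++) {
			crc = tab[0][crc & 255] ^ (crc >> 8);
			tab[j][i] = crc;
		}
	}
}

/* Build a 32-bit word from bytes in little-endian order. */
static uint32_t load_le32(const unsigned char *p)
{
	return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
	       (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Slicing-by-4 (bits == 32) or slicing-by-8 (bits == 64), then a byte tail. */
static uint32_t crc32_le_sliced(uint32_t crc, const unsigned char *p,
				size_t len, int bits)
{
	const uint32_t *t0 = tab[0], *t1 = tab[1], *t2 = tab[2], *t3 = tab[3];
	const uint32_t *t4 = tab[4], *t5 = tab[5], *t6 = tab[6], *t7 = tab[7];
	size_t step = (bits == 64) ? 8 : 4;

	while (len >= step) {
		uint32_t q = crc ^ load_le32(p);

		if (bits == 64) {
			uint32_t q2 = load_le32(p + 4);

			crc = DO_CRC4(q, t4, t5, t6, t7);
			crc ^= DO_CRC4(q2, t0, t1, t2, t3);
		} else {
			crc = DO_CRC4(q, t0, t1, t2, t3);
		}
		p += step;
		len -= step;
	}
	while (len--)
		crc = t0[(crc ^ *p++) & 255] ^ (crc >> 8);
	return crc;
}
```

Call `init_tab()` once before `crc32_le_sliced()`. Both `bits` settings should agree with the bitwise reference for any buffer length, which makes the reference a convenient correctness check for this kind of refactoring.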

>  # endif
>     const u32 *b;
> -   size_t    rem_len;
> +   size_t rem_len;
>     const u32 *t0 = tab[0], *t1 = tab[1], *t2 = tab[2], *t3 = tab[3];
> +   const u32 *t4 = tab[4], *t5 = tab[5], *t6 = tab[6], *t7 = tab[7];

t4 to t7 are only used in 64-bit mode.

BTW, is the bug with 64 CRC bits on a 32-bit BE arch fixed?

> +   u32 q;
>
>     /* Align it */
>     if (unlikely((long)buf & 3 && len)) {
> @@ -73,13 +76,25 @@ crc32_body(u32 crc, unsigned char const
>           DO_CRC(*buf++);
>        } while ((--len) && ((long)buf)&3);
>     }
> +
> +# if CRC_LE_BITS == 32
>     rem_len = len & 3;
> -   /* load data 32 bits wide, xor data 32 bits wide. */
>     len = len >> 2;
> +# else
> +   rem_len = len & 7;
> +   len = len >> 3;

I still fail to see why this is needed. You still do 32 bit loads so this
only makes the code uglier, harder to maintain and makes small unaligned crc bufs
slower.

....

> Index: for-next/lib/gen_crc32table.c
> ===================================================================
> --- for-next.orig/lib/gen_crc32table.c
> +++ for-next/lib/gen_crc32table.c
> @@ -1,23 +1,28 @@
..
>
> -static void output_table(uint32_t (*table)[256], int len, char *trans)
> +static void output_table(uint32_t (*table)[256], int rows, int len, char *trans)

This table is not always 256 entries. I suggested a cleaner impl. earlier.
Something like this:

-static void output_table(uint32_t table[4][256], int len, char *trans)
+static void output_table(uint32_t table[], int len, char *trans)
 {
-	int i, j;
-
-	for (j = 0 ; j < 4; j++) {
-		printf("{");
-		for (i = 0; i < len - 1; i++) {
-			if (i % ENTRIES_PER_LINE == 0)
-				printf("\n");
-			printf("%s(0x%8.8xL), ", trans, table[j][i]);
-		}
-		printf("%s(0x%8.8xL)},\n", trans, table[j][len - 1]);
+	int i;
+
+	printf("{");
+	for (i = 0; i < len - 1; i++) {
+		if (i % ENTRIES_PER_LINE == 0)
+			printf("\n");
+		printf("%s(0x%8.8xL), ", trans, table[i]);
 	}
+	printf("%s(0x%8.8xL)},\n", trans, table[len - 1]);
 }

 int main(int argc, char** argv)
 {
+	int i;
+
 	printf("/* this file is generated - do not edit */\n\n");

 	if (CRC_LE_BITS > 1) {
 		crc32init_le();
-		printf("static const u32 crc32table_le[4][256] = {");
-		output_table(crc32table_le, LE_TABLE_SIZE, "tole");
+		printf("static const u32 crc32table_le[%d][%d] = {",
+		       LE_TABLE_ROWS, LE_TABLE_SIZE);
+		for (i = 0 ; i < LE_TABLE_ROWS; i++)
+			output_table(crc32table_le[i], LE_TABLE_SIZE, "tole");
 		printf("};\n");
 	}

 	if (CRC_BE_BITS > 1) {
 		crc32init_be();
-		printf("static const u32 crc32table_be[4][256] = {");
-		output_table(crc32table_be, BE_TABLE_SIZE, "tobe");
+		printf("static const u32 crc32table_be[%d][%d] = {",
+		       BE_TABLE_ROWS, BE_TABLE_SIZE);
+		for (i = 0 ; i < BE_TABLE_ROWS; i++)
+			output_table(crc32table_be[i], BE_TABLE_SIZE, "tobe");
 		printf("};\n");
 	}
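
The per-row approach above can be exercised outside the build. The sketch below is an assumption-laden variant, not the proposed patch: it formats into a caller-supplied buffer instead of calling printf() so the result can be compared, and `emit()` and `format_row()` are hypothetical names. Taking a flat row pointer plus a length is what sidesteps the `(*table)[256]` type question when a row holds fewer than 256 entries.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define ENTRIES_PER_LINE 4

/* Append formatted text at *pos; returns 0 on success, -1 if it would not fit. */
static int emit(char *buf, size_t cap, size_t *pos, const char *fmt, ...)
{
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(buf + *pos, cap - *pos, fmt, ap);
	va_end(ap);
	if (n < 0 || (size_t)n >= cap - *pos)
		return -1;
	*pos += (size_t)n;
	return 0;
}

/* Format one table row of len entries, wrapped in the trans() macro name. */
static int format_row(char *buf, size_t cap, const uint32_t row[], int len,
		      const char *trans)
{
	size_t pos = 0;
	int i;

	if (len < 1 || cap == 0)
		return -1;
	if (emit(buf, cap, &pos, "{"))
		return -1;
	for (i = 0; i < len - 1; i++) {
		if (i % ENTRIES_PER_LINE == 0 && emit(buf, cap, &pos, "\n"))
			return -1;
		if (emit(buf, cap, &pos, "%s(0x%8.8xL), ", trans, (unsigned)row[i]))
			return -1;
	}
	return emit(buf, cap, &pos, "%s(0x%8.8xL)},\n", trans,
		    (unsigned)row[len - 1]);
}
```

A caller would loop over the rows it actually generated and pass each one with its real length, as in the `for (i = 0; i < LE_TABLE_ROWS; i++)` loop above.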



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v6 08/10] crc32-add-slicing-by-8.diff
  2011-09-01  3:03 ` [PATCH v6 08/10] crc32-add-slicing-by-8.diff Bob Pearson
@ 2011-09-07  7:32   ` Joakim Tjernlund
  0 siblings, 0 replies; 28+ messages in thread
From: Joakim Tjernlund @ 2011-09-07  7:32 UTC (permalink / raw)
  To: Bob Pearson; +Cc: akpm, fzago, George Spelvin, linux-kernel

Bob Pearson <rpearson@systemfabricworks.com> wrote on 2011/09/01 05:03:21:
>
> I've been looking at this stuff for too long! I just noticed that
> crc32_body incorrectly always uses CRC_LE_BITS to pick algorithm.
> Replace with a function parameter that will get optimized out
> by the compiler since crc32_body is inlined.

yes, this is a nice change.

 Jocke


^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH v6 05/10] crc32-misc-cleanup.diff
  2011-09-06 20:18             ` Bob Pearson
@ 2011-09-07  7:39               ` Joakim Tjernlund
  0 siblings, 0 replies; 28+ messages in thread
From: Joakim Tjernlund @ 2011-09-07  7:39 UTC (permalink / raw)
  To: Bob Pearson
  Cc: 'Andrew Morton', fzago, 'George Spelvin',
	linux-kernel, 'Stephen Rothwell'

"Bob Pearson" <rpearson@systemfabricworks.com> wrote on 2011/09/06 22:18:35:
>
> >
> > Yup.  lib-crc-add-slice-by-8-algorithm-to-crc32c.patch is dead meat.  I
> > sometimes keep things like that around to get them a bit of testing
> > while reminding myself that there's an open issue to track.
>
> I thought I was getting close until recently someone sent out a patch set
> for crc32c.c, the other 32 bit CRC in common use, based on an earlier
> version of the changes we have been working on for crc32.c. This has brought
> in other interested parties and created a bit of duplicated code. I am at a
> loss as to the best way to proceed. Personally I would like to see this
> change go upstream and then let the rest of the world figure out how to best
> merge things.

The crc32c code should be moved to lib and integrated into crc32.c. Then you get
all the optimizations for free. I think the new crypto crc32c code should just
be nacked.

 Jocke


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v6 08/10] crc32-add-slicing-by-8.diff
       [not found]   ` <OF3D37A60B.7A33B855-ONC1257904.00276B5B-C1257904.002951AF@LocalDomain>
@ 2011-09-07  8:30     ` Joakim Tjernlund
  0 siblings, 0 replies; 28+ messages in thread
From: Joakim Tjernlund @ 2011-09-07  8:30 UTC (permalink / raw)
  Cc: Bob Pearson, akpm, fzago, George Spelvin, linux-kernel

Joakim Tjernlund/Transmode wrote on 2011/09/07 09:31:18:
> > +# if CRC_LE_BITS == 32
> >     rem_len = len & 3;
> > -   /* load data 32 bits wide, xor data 32 bits wide. */
> >     len = len >> 2;
> > +# else
> > +   rem_len = len & 7;
> > +   len = len >> 3;
>
> I still fail to see why this is needed. You still do 32 bit loads so this
> only makes the code uglier, harder to maintain and makes small unaligned crc bufs
> slower.

Sorry, misread this part. Ignore it.

 Jocke


^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH v6 05/10] crc32-misc-cleanup.diff
  2011-09-06 19:38           ` Andrew Morton
  2011-09-06 20:18             ` Bob Pearson
@ 2011-09-07 16:30             ` Bob Pearson
  2011-09-07 17:51               ` Joakim Tjernlund
  1 sibling, 1 reply; 28+ messages in thread
From: Bob Pearson @ 2011-09-07 16:30 UTC (permalink / raw)
  To: 'Andrew Morton', 'Stephen Rothwell'
  Cc: 'Joakim Tjernlund', fzago, 'George Spelvin',
	linux-kernel

> 
> The list of needed changes based on recent comments I am aware of are:
>  - put back in a couple of header files per Andrew
>  - fix the summary phrases to conform to coding standards per Andrew
>  - add signed off by for Joakim to patch 04/10 per Joakim
>  - fix bug in patch 06/10 noted in my email by passing bits as a parameter
>    to crc32_body
Adding three additional changes:
 - update macro per Joakim
 - shorter generator code per George
 - remove unneeded local_irq_disable per Andrew
Any more?

Can we finish crc32 before we start boiling the ocean with crc32c, crc16
etc.? I am happy to pitch in on those but I would like to see something
actually get done. I actually need this one.




* RE: [PATCH v6 05/10] crc32-misc-cleanup.diff
  2011-09-07 16:30             ` Bob Pearson
@ 2011-09-07 17:51               ` Joakim Tjernlund
  0 siblings, 0 replies; 28+ messages in thread
From: Joakim Tjernlund @ 2011-09-07 17:51 UTC (permalink / raw)
  To: Bob Pearson
  Cc: 'Andrew Morton', fzago, 'George Spelvin',
	linux-kernel, 'Stephen Rothwell'

"Bob Pearson" <rpearson@systemfabricworks.com> wrote on 2011/09/07 18:30:31:
>
> >
> > The list of needed changes based on recent comments I am aware of are:
> >  - put back in a couple of header files per Andrew
> >  - fix the summary phrases to conform to coding standards per Andrew
> >  - add signed off by for Joakim to patch 04/10 per Joakim
> >  - fix bug in patch 06/10 noted in my email by passing bits as a parameter
> >    to crc32_body
> Adding three additional changes:
>  - update macro per Joakim
>  - shorter generator code per George
>  - remove unneeded local_irq_disable per Andrew
> Any more?

There was more in my comments. Don't you scroll through the whole reply
when you read it? Especially this one:
  "BTW, is the 64 CRC bits on 32 bits BE arch bug fixed?"

>
> Can we finish crc32 before we start boiling the ocean with crc32c, crc16
> etc.? I am happy to pitch in on those but I would like to see something
> actually get done. I actually need this one.

Yes, let's get this cleared first. The other stuff depends on this.



* RE: [PATCH v6 08/10] crc32-add-slicing-by-8.diff
  2011-09-07  7:31   ` Joakim Tjernlund
@ 2011-09-07 19:44     ` Bob Pearson
  0 siblings, 0 replies; 28+ messages in thread
From: Bob Pearson @ 2011-09-07 19:44 UTC (permalink / raw)
  To: 'Joakim Tjernlund'
  Cc: akpm, fzago, 'George Spelvin', linux-kernel



> -----Original Message-----
> From: Joakim Tjernlund [mailto:joakim.tjernlund@transmode.se]
> Sent: Wednesday, September 07, 2011 2:31 AM
> To: Bob Pearson
> Cc: akpm@linux-foundation.org; fzago@systemfabricworks.com; George
> Spelvin; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v6 08/10] crc32-add-slicing-by-8.diff
> 
> Bob Pearson <rpearson@systemfabricworks.com> wrote on 2011/09/01 00:30:32:
> >
> > add slicing-by-8 algorithm to the existing
> > slicing-by-4 algorithm. This consists of:
> >    - extend largest BITS size from 32 to 64
> >    - extend tables from tab[4][256] to up to tab[8][256]
> >    - Add code for inner loop.
> >
> > Signed-off-by: Bob Pearson <rpearson@systemfabricworks.com>
> >
> > ---
> >  lib/crc32.c          |   40 ++++++++++++++++++++++++++++------------
> >  lib/crc32defs.h      |   29 +++++++++++++++++++++--------
> >  lib/gen_crc32table.c |   43 +++++++++++++++++++++++++++----------------
> >  3 files changed, 76 insertions(+), 36 deletions(-)
> >
> > Index: for-next/lib/crc32.c
> > ===================================================================
> > --- for-next.orig/lib/crc32.c
> > +++ for-next/lib/crc32.c
> > @@ -47,25 +47,28 @@ MODULE_LICENSE("GPL");
> >
> >  #if CRC_LE_BITS > 8 || CRC_BE_BITS > 8
> >
> > +/* implements slicing-by-4 or slicing-by-8 algorithm */
> >  static inline u32
> > crc32_body(u32 crc, unsigned char const *buf, size_t len, const u32 (*tab)[256])
> >  {
> >  # ifdef __LITTLE_ENDIAN
> >  #  define DO_CRC(x) (crc = t0[(crc ^ (x)) & 255] ^ (crc >> 8))
> > -#  define DO_CRC4 crc = t3[(crc) & 255] ^ \
> > -         t2[(crc >> 8) & 255] ^ \
> > -         t1[(crc >> 16) & 255] ^ \
> > -         t0[(crc >> 24) & 255]
> > +#  define DO_CRC4 (t3[(q) & 255] ^ t2[(q >> 8) & 255] ^ \
> > +         t1[(q >> 16) & 255] ^ t0[(q >> 24) & 255])
> > +#  define DO_CRC8 (t7[(q) & 255] ^ t6[(q >> 8) & 255] ^ \
> > +         t5[(q >> 16) & 255] ^ t4[(q >> 24) & 255])
> >  # else
> >  #  define DO_CRC(x) (crc = t0[((crc >> 24) ^ (x)) & 255] ^ (crc << 8))
> > -#  define DO_CRC4 crc = t0[(crc) & 255] ^ \
> > -         t1[(crc >> 8) & 255] ^ \
> > -         t2[(crc >> 16) & 255] ^ \
> > -         t3[(crc >> 24) & 255]
> > +#  define DO_CRC4 (t0[(q) & 255] ^ t1[(q >> 8) & 255] ^ \
> > +         t2[(q >> 16) & 255] ^ t3[(q >> 24) & 255])
> > +#  define DO_CRC8 (t4[(q) & 255] ^ t5[(q >> 8) & 255] ^ \
> > +         t6[(q >> 16) & 255] ^ t7[(q >> 24) & 255])
> 
> Don't like the new DO_CRC8 macro. You could get by with my earlier
> suggestion:
> #  define DO_CRC4(crc, x0, x1, x2, x3) \
> 		x3[(crc) & 255] ^		\
> 		x2[(crc >> 8) & 255] ^	\
> 		x1[(crc >> 16) & 255] ^ \
> 		x0[(crc >> 24) & 255]
> 
> Then the code becomes something like
> if (bits == 64) {
> 		crc = DO_CRC4(crc, t4, t5, t6, t7);
> 		++b;
> 		crc ^= DO_CRC4(*b, t0, t1, t2, t3);
> } else
> 		crc = DO_CRC4(crc, t0, t1, t2, t3);

OK by me.
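
For illustration only (this is not the posted patch): a minimal user-space
sketch of how a single parameterized DO_CRC4 can serve both slicing-by-4 and
slicing-by-8 for the little-endian CRC. The names sketch_tab, sketch_init_le,
load_le32 and sketch_crc32_le are made up for this example. It assumes
byte-wise loads instead of the aligned u32 loads used in lib/crc32.c, so it
does not depend on host endianness, and it leaves out the alignment prologue.
The macro is a parenthesized variant of the one suggested above.

```c
#include <stddef.h>
#include <stdint.h>

#define CRCPOLY_LE 0xedb88320u

/* Four table lookups on one 32-bit word; x3 handles the lowest byte. */
#define DO_CRC4(q, x0, x1, x2, x3) \
	((x3)[(q) & 255] ^ (x2)[((q) >> 8) & 255] ^ \
	 (x1)[((q) >> 16) & 255] ^ (x0)[((q) >> 24) & 255])

static uint32_t sketch_tab[8][256];	/* hypothetical table, built at run time */

/* Row 0 is the usual byte table; row j advances row j-1 by one zero byte. */
static void sketch_init_le(void)
{
	uint32_t crc;
	int i, j;

	for (i = 0; i < 256; i++) {
		crc = (uint32_t)i;
		for (j = 0; j < 8; j++)
			crc = (crc >> 1) ^ ((crc & 1) ? CRCPOLY_LE : 0);
		sketch_tab[0][i] = crc;
	}
	for (i = 0; i < 256; i++) {
		crc = sketch_tab[0][i];
		for (j = 1; j < 8; j++) {
			crc = sketch_tab[0][crc & 255] ^ (crc >> 8);
			sketch_tab[j][i] = crc;
		}
	}
}

/* Assemble a little-endian 32-bit word from four bytes. */
static uint32_t load_le32(const unsigned char *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* bits == 64 selects slicing-by-8, anything else slicing-by-4. */
static uint32_t sketch_crc32_le(uint32_t crc, const unsigned char *p,
				size_t len, int bits)
{
	const uint32_t *t0 = sketch_tab[0], *t1 = sketch_tab[1];
	const uint32_t *t2 = sketch_tab[2], *t3 = sketch_tab[3];
	const uint32_t *t4 = sketch_tab[4], *t5 = sketch_tab[5];
	const uint32_t *t6 = sketch_tab[6], *t7 = sketch_tab[7];
	size_t step = (bits == 64) ? 8 : 4;
	uint32_t q;

	while (len >= step) {
		q = crc ^ load_le32(p);
		if (bits == 64) {
			crc = DO_CRC4(q, t4, t5, t6, t7);
			q = load_le32(p + 4);
			crc ^= DO_CRC4(q, t0, t1, t2, t3);
		} else {
			crc = DO_CRC4(q, t0, t1, t2, t3);
		}
		p += step;
		len -= step;
	}
	/* Leftover bytes, one at a time (same as DO_CRC in the LE case). */
	while (len--)
		crc = t0[(crc ^ *p++) & 255] ^ (crc >> 8);
	return crc;
}
```

Call sketch_init_le() once before use. With the conventional ~0 seed and
final inversion, both variants should agree with each other and with a
bytewise implementation on any buffer.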

> 
> >  # endif
> >     const u32 *b;
> > -   size_t    rem_len;
> > +   size_t rem_len;
> >     const u32 *t0 = tab[0], *t1 = tab[1], *t2 = tab[2], *t3 = tab[3];
> > +   const u32 *t4 = tab[4], *t5 = tab[5], *t6 = tab[6], *t7 = tab[7];
> 
> t4 to t7 are only used in 64 bit mode.

Yes, but with the CRC_XX_BITS -> bits change this point is moot.

> 
> BTW, is the 64 CRC bits on 32 bits BE arch bug fixed?

I have a 64 bit sparc machine and a bunch of x86 variants. As far as I am
able to tell, these are working for all the different algorithms.
If you have a 32 bit PPC machine, please check whether this is still a
problem.

> 
> > +   u32 q;
> >
> >     /* Align it */
> >     if (unlikely((long)buf & 3 && len)) {
> > @@ -73,13 +76,25 @@ crc32_body(u32 crc, unsigned char const
> >           DO_CRC(*buf++);
> >        } while ((--len) && ((long)buf)&3);
> >     }
> > +
> > +# if CRC_LE_BITS == 32
> >     rem_len = len & 3;
> > -   /* load data 32 bits wide, xor data 32 bits wide. */
> >     len = len >> 2;
> > +# else
> > +   rem_len = len & 7;
> > +   len = len >> 3;
> 
> I still fail to see why this is needed. You still do 32 bit loads so this
> only makes the code uglier, harder to maintain and makes small unaligned
> crc bufs slower.

At your suggestion, I changed the initial alignment code to stop at the first
available 4 byte boundary.
After that, the inner loop consumes 4 or 8 bytes of data per iteration for
the 32 and 64 bit versions of the algorithm, respectively. So the computation
of len and rem_len *must* be made differently in the two cases.
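
To make the arithmetic concrete, here is a small stand-alone sketch (not code
from the patch) of how a buffer is divided into the byte-at-a-time prologue,
the inner-loop iterations, and the leftover rem_len bytes. The struct
crc_split and the function crc_split_len are hypothetical names used only for
this illustration; the bits parameter is assumed to be 32 or 64 as in the
discussion above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: how a buffer of len bytes gets divided up. */
struct crc_split {
	size_t head;	/* bytes taken one at a time to reach 4-byte alignment */
	size_t iters;	/* inner-loop iterations (4 or 8 bytes each) */
	size_t tail;	/* bytes left after the inner loop (rem_len) */
};

static struct crc_split crc_split_len(uintptr_t addr, size_t len, int bits)
{
	struct crc_split s;
	size_t head = (size_t)((4 - (addr & 3)) & 3);

	/* A short buffer may end before the alignment boundary is reached. */
	if (head > len)
		head = len;
	len -= head;
	s.head = head;

	if (bits == 64) {
		/* slicing-by-8: two 32-bit loads per iteration */
		s.tail = len & 7;
		s.iters = len >> 3;
	} else {
		/* slicing-by-4: one 32-bit load per iteration */
		s.tail = len & 3;
		s.iters = len >> 2;
	}
	return s;
}
```

For an aligned 23-byte buffer this gives 5 iterations and 3 leftover bytes
with bits == 32, but 2 iterations and 7 leftover bytes with bits == 64, which
is why one mask and shift cannot serve both cases.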

> 
> ....
> 
> > Index: for-next/lib/gen_crc32table.c
> > ===================================================================
> > --- for-next.orig/lib/gen_crc32table.c
> > +++ for-next/lib/gen_crc32table.c
> > @@ -1,23 +1,28 @@
> ..
> >
> > -static void output_table(uint32_t (*table)[256], int len, char *trans)
> > +static void output_table(uint32_t (*table)[256], int rows, int len, char *trans)
> 
> This table is not always 256 entries. I suggested a cleaner impl. earlier.
> Something like this:
> 
> -static void output_table(uint32_t table[4][256], int len, char *trans)
> +static void output_table(uint32_t table[], int len, char *trans)
>  {
> -	int i, j;
> -
> -	for (j = 0 ; j < 4; j++) {
> -		printf("{");
> -		for (i = 0; i < len - 1; i++) {
> -			if (i % ENTRIES_PER_LINE == 0)
> -				printf("\n");
> -			printf("%s(0x%8.8xL), ", trans, table[j][i]);
> -		}
> -		printf("%s(0x%8.8xL)},\n", trans, table[j][len - 1]);
> +	int i;
> +
> +	printf("{");
> +	for (i = 0; i < len - 1; i++) {
> +		if (i % ENTRIES_PER_LINE == 0)
> +			printf("\n");
> +		printf("%s(0x%8.8xL), ", trans, table[i]);
>  	}
> +	printf("%s(0x%8.8xL)},\n", trans, table[len - 1]);
>  }
> 
>  int main(int argc, char** argv)
>  {
> +	int i;
> +
>  	printf("/* this file is generated - do not edit */\n\n");
> 
>  	if (CRC_LE_BITS > 1) {
>  		crc32init_le();
> -		printf("static const u32 crc32table_le[4][256] = {");
> -		output_table(crc32table_le, LE_TABLE_SIZE, "tole");
> +		printf("static const u32 crc32table_le[%d][%d] = {",
> +		       LE_TABLE_ROWS, LE_TABLE_SIZE);
> +		for (i = 0 ; i < LE_TABLE_ROWS; i++)
> +			output_table(crc32table_le[i], LE_TABLE_SIZE, "tole");
>  		printf("};\n");
>  	}
> 
>  	if (CRC_BE_BITS > 1) {
>  		crc32init_be();
> -		printf("static const u32 crc32table_be[4][256] = {");
> -		output_table(crc32table_be, BE_TABLE_SIZE, "tobe");
> +		printf("static const u32 crc32table_be[%d][%d] = {",
> +		       BE_TABLE_ROWS, BE_TABLE_SIZE);
> +		for (i = 0 ; i < BE_TABLE_ROWS; i++)
> +			output_table(crc32table_be[i], BE_TABLE_SIZE, "tobe");
>  		printf("};\n");
>  	}
> 




end of thread, other threads:[~2011-09-07 19:44 UTC | newest]

Thread overview: 28+ messages
     [not found] <20110831213729.395283830@systemfabricworks.com>
2011-08-31 22:29 ` [PATCH v6 01/10] crc32-remove-trailing-whitespace.diff Bob Pearson
2011-08-31 22:29 ` [PATCH v6 02/10] crc32-move-to-documentation.diff Bob Pearson
2011-08-31 22:29 ` [PATCH v6 03/10] crc32-replace-self-test.diff Bob Pearson
2011-09-02 23:51   ` Andrew Morton
2011-09-06 16:14     ` Bob Pearson
2011-08-31 22:30 ` [PATCH v6 04/10] crc32-add-pointer-to-tab.diff Bob Pearson
2011-09-01  8:16   ` Joakim Tjernlund
2011-08-31 22:30 ` [PATCH v6 05/10] crc32-misc-cleanup.diff Bob Pearson
2011-09-02 23:50   ` Andrew Morton
2011-09-03  1:44     ` Stephen Rothwell
2011-09-06 13:40       ` Joakim Tjernlund
2011-09-06 14:50         ` Stephen Rothwell
2011-09-06 19:38           ` Andrew Morton
2011-09-06 20:18             ` Bob Pearson
2011-09-07  7:39               ` Joakim Tjernlund
2011-09-07 16:30             ` Bob Pearson
2011-09-07 17:51               ` Joakim Tjernlund
2011-09-06 16:05     ` Bob Pearson
2011-08-31 22:30 ` [PATCH v6 06/10] crc32-fix-check-endian-warnings.diff Bob Pearson
2011-08-31 22:30 ` [PATCH v6 07/10] crc32-add-real-8-bit.diff Bob Pearson
2011-08-31 22:30 ` [PATCH v6 08/10] crc32-add-slicing-by-8.diff Bob Pearson
2011-09-07  7:31   ` Joakim Tjernlund
2011-09-07 19:44     ` Bob Pearson
     [not found]   ` <OF3D37A60B.7A33B855-ONC1257904.00276B5B-C1257904.002951AF@LocalDomain>
2011-09-07  8:30     ` Joakim Tjernlund
2011-08-31 22:30 ` [PATCH v6 09/10] crc32-optimize-loops-for-x86.diff Bob Pearson
2011-08-31 22:30 ` [PATCH v6 10/10] crc32-final.diff Bob Pearson
2011-09-01  3:03 ` [PATCH v6 08/10] crc32-add-slicing-by-8.diff Bob Pearson
2011-09-07  7:32   ` Joakim Tjernlund
