From: "Bob Pearson" <rpearson@systemfabricworks.com>
To: "'George Spelvin'" <linux@horizon.com>,
<fzago@systemfabricworks.com>, <linux-kernel@vger.kernel.org>
Cc: <akpm@linux-foundation.org>, <joakim.tjernlund@transmode.se>
Subject: RE: [PATCH] add slice by 8 algorithm to crc32.c
Date: Mon, 8 Aug 2011 11:50:57 -0500
Message-ID: <002901cc55eb$5983b0e0$0c8b12a0$@systemfabricworks.com>
In-Reply-To: <20110808092826.21881.qmail@science.horizon.com>
> -----Original Message-----
> From: George Spelvin [mailto:linux@horizon.com]
> Sent: Monday, August 08, 2011 4:28 AM
> To: fzago@systemfabricworks.com; linux-kernel@vger.kernel.org
> Cc: akpm@linux-foundation.org; joakim.tjernlund@transmode.se;
> linux@horizon.com; rpearson@systemfabricworks.com
> Subject: [PATCH] add slice by 8 algorithm to crc32.c
>
> Sorry I didn't see this when first posted.
>
> The "slice by 8" terminology is pretty confusing. How about
> "Extended Joakim Tjernlund's optimization from commit
> 836e2af92503f1642dbc3c3281ec68ec1dd39d2e to 8-way parallelism."
The article where I first read about this algorithm is linked below. It
describes both the 4-byte and the 8-byte versions.
I do not know who has priority, Joakim or the folks at Intel, but the other
articles I have seen usually credit Intel with the idea. The algorithm
currently in crc32.c is clearly the same as the one described in the article.
As you can see, I mis-copied the name "slicing-by-8" as "slice by 8".
http://www.intel.com/technology/comms/perfnet/download/CRC_generators.pdf
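For readers without the article at hand, here is a minimal user-space sketch
of the slicing-by-8 idea for the little-endian (reflected) CRC-32. It is not
the code from the patch: the names tab, crc32_init_tables, crc32_bytewise and
crc32_slice8 are made up for illustration, the tables are built at run time
rather than generated at build time, and the input is read byte by byte so
that alignment and host endianness do not matter.

```c
#include <stddef.h>
#include <stdint.h>

#define CRC32_POLY_LE 0xEDB88320u	/* reflected CRC-32 polynomial */

/* Hypothetical table layout: tab[0] is the ordinary byte-wise table. */
static uint32_t tab[8][256];

/* Build all eight 256-entry tables (8 KB in total). */
static void crc32_init_tables(void)
{
	for (uint32_t i = 0; i < 256; i++) {
		uint32_t c = i;

		for (int k = 0; k < 8; k++)
			c = (c >> 1) ^ ((c & 1) ? CRC32_POLY_LE : 0);
		tab[0][i] = c;
	}
	/* tab[t][i] is tab[t-1][i] advanced by one more zero byte. */
	for (uint32_t i = 0; i < 256; i++)
		for (int t = 1; t < 8; t++)
			tab[t][i] = (tab[t - 1][i] >> 8) ^
				    tab[0][tab[t - 1][i] & 0xff];
}

/* Reference version: one table lookup per input byte. */
static uint32_t crc32_bytewise(uint32_t crc, const unsigned char *p, size_t len)
{
	while (len--)
		crc = tab[0][(crc ^ *p++) & 0xff] ^ (crc >> 8);
	return crc;
}

/* Slicing-by-8: eight independent lookups per 8 input bytes. */
static uint32_t crc32_slice8(uint32_t crc, const unsigned char *p, size_t len)
{
	while (len >= 8) {
		/* Assemble two little-endian 32-bit words from the input. */
		uint32_t lo = crc ^ ((uint32_t)p[0] | (uint32_t)p[1] << 8 |
				     (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24);
		uint32_t hi = (uint32_t)p[4] | (uint32_t)p[5] << 8 |
			      (uint32_t)p[6] << 16 | (uint32_t)p[7] << 24;

		crc = tab[7][lo & 0xff] ^ tab[6][(lo >> 8) & 0xff] ^
		      tab[5][(lo >> 16) & 0xff] ^ tab[4][lo >> 24] ^
		      tab[3][hi & 0xff] ^ tab[2][(hi >> 8) & 0xff] ^
		      tab[1][(hi >> 16) & 0xff] ^ tab[0][hi >> 24];
		p += 8;
		len -= 8;
	}
	/* Fewer than 8 bytes left: finish one byte at a time. */
	return crc32_bytewise(crc, p, len);
}
```

The eight lookups in the loop body do not depend on each other, which is
where the speedup over the byte-wise loop comes from. Seeding with ~0 and
inverting the result gives the usual CRC-32.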
>
> Which is essentially what you're doing. The renaming of tab[0] to t0_le
> and t0_be, and removal of the DO_CRC4 macro just increases the diff size.
>
> If you're looking at speeding up the CRC through larger tables, have
> you tried using 10+11+11-bit tables? That would require 20K of tables
> rather than 8K, but would reduce the number of table lookups per byte.
>
>
> One more stunt you could try to increase parallelism: rather than maintain
> the CRC in one register, maintain it in several, and only XOR and collapse
> them at the end.
>
> Start with your 64-bit code, but imagine that the second code block's
> "q = *p32++" always loads 0, and therefore the whole block can be skipped.
> (Since tab[0] = 0 for all CRC tables.)
>
> This computes the CRC of the even words. Then do a second one in parallel
> for the odd words into a separate CRC register. Then combine them at the
> end.
> (Shift one up by 32 bits and XOR into the other.)
>
> This would let you get away with 5K of tables: t4 through t7, and t0.
> t1 through t3 could be skipped.
>
>
> Ideally, I'd write all this code myself, but I'm a bit crunched at work
> right now so wouldn't be able to get to it for a few days.
>
>
>
> Another possible simplification to the startup code. There's no need
> to compute init_bytes explicitly; just loop until the pointer is aligned:
>
> 	while ((unsigned long)buf & 3) {
> 		if (!len--)
> 			goto done;
> #ifdef __LITTLE_ENDIAN
> 		i0 = *buf++ ^ crc;
> 		crc = t0_le[i0] ^ (crc >> 8);
> #else
> 		i0 = *buf++ ^ (crc >> 24);
> 		crc = t0_be[i0] ^ (crc << 8);
> #endif
> 	}
> 	p32 = (u32 const *)buf;
> 	words = len >> 2;
> 	end_bytes = len & 3;
>
>
> ... although I'd prefer to keep the DO_CRC() and DO_CRC4 macros, and
> extend them to the 64-bit case, to avoid the nested #ifdefs. That would
> make:
>
> 	while ((unsigned long)buf & 3) {
> 		if (!len--)
> 			goto done;
> 		DO_CRC(*buf++);
> 	}
> 	p32 = (u32 const *)buf;
> 	words = len >> 2;
> 	end_bytes = len & 3;
Personally, as you can probably tell, I don't like macros unless they are
used very frequently. The ifdefs were somewhat reduced in the second version
of the patch.
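As a middle ground, the same startup loop can be sketched with a static
inline helper instead of a macro. This is only an illustration, not code from
either version of the patch: crc32_le_byte and crc32_le_head are hypothetical
names, the helper works bit by bit so the example needs no tables, and only
the little-endian case is shown.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for DO_CRC(): advance a little-endian (reflected)
 * CRC-32 by one input byte, one bit at a time.
 */
static inline uint32_t crc32_le_byte(uint32_t crc, unsigned char b)
{
	crc ^= b;
	for (int k = 0; k < 8; k++)
		crc = (crc >> 1) ^ ((crc & 1) ? 0xEDB88320u : 0);
	return crc;
}

/*
 * Consume leading bytes until *pbuf is 4-byte aligned or the input runs
 * out. Updates *pbuf and *plen in place and returns the new CRC, so no
 * separate init_bytes count is needed.
 */
static uint32_t crc32_le_head(uint32_t crc, const unsigned char **pbuf,
			      size_t *plen)
{
	const unsigned char *buf = *pbuf;
	size_t len = *plen;

	while (((uintptr_t)buf & 3) && len) {
		crc = crc32_le_byte(crc, *buf++);
		len--;
	}
	*pbuf = buf;
	*plen = len;
	return crc;
}
```

After the call, the caller would compute words = len >> 2 and
end_bytes = len & 3 from the updated length, as in the quoted code.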