linux-kernel.vger.kernel.org archive mirror
From: "George Spelvin" <linux@horizon.com>
To: herbert@gondor.apana.org.au, JBeulich@suse.com,
	tim.c.chen@linux.intel.com
Cc: linux@horizon.com, linux-kernel@vger.kernel.org, sandyw@twitter.com
Subject: [RFC PATCH] crypto: crc32c-pclmul - Use pmovzxdq to shrink K_table
Date: 28 May 2014 10:40:00 -0400
Message-ID: <20140528144000.28686.qmail@ns.horizon.com>

While following a number of tangents in the code (I was figuring out
how to edit lib/Kconfig; don't ask), I came across a table of 256 64-bit
words, all of which had the high half set to zero.

Since the code depends on both pclmulqdq and crc32 (an SSE 4.2
instruction), SSE 4.1 is obviously present, so it could store 32-bit
words, load them with pmovzxdq, and save 1K of kernel data.
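
To make the size argument concrete, here is a minimal C sketch, not
kernel code, of what a pmovzxdq load from memory does and how the 1K
figure falls out. The names zext_pair, table_bytes and lanes2x64 are
made up for illustration. Only values that fit in 32 bits survive a
.long; an entry such as 0x14cd00bd6 has bit 32 set and would be
truncated, so the sketch assumes the stored constants are genuinely
32-bit.

```c
#include <stdint.h>
#include <stddef.h>

/* 128 entries x 2 constants, as in the table comment (256 words total). */
#define K_ENTRIES 128

/* Two unsigned 64-bit lanes, standing in for one XMM register. */
struct lanes2x64 {
	uint64_t lo;
	uint64_t hi;
};

/*
 * Emulates pmovzxdq with a memory source: read two packed 32-bit
 * words and zero-extend each one into its own 64-bit lane.
 * (Hypothetical helper; the real instruction does this in one load.)
 */
static struct lanes2x64 zext_pair(const uint32_t *p)
{
	struct lanes2x64 r = { (uint64_t)p[0], (uint64_t)p[1] };
	return r;
}

/* Table footprint in bytes for a given per-constant word size. */
static size_t table_bytes(size_t word_size)
{
	return (size_t)K_ENTRIES * 2 * word_size;
}
```

With 8-byte words table_bytes() gives 2048 and with 4-byte words 1024,
which is where the 1K saving comes from.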

The following patch obviously lacks the kludges for old binutils,
but should convey the general idea.

Jan: Is support for SLE10's pre-2.18 binutils still required?
Your PEXTRD fix was only a year ago, so I expect so, but I wanted to ask.

Two other minor changes:

1. The current code unnecessarily puts the table in the read-write
   .data section.  Moved to .text.
2. I'm also not sure why it's necessary to force such large alignment
   on K_table.  Comments on reducing it?

Signed-off-by: George Spelvin <linux@horizon.com>


diff --git a/arch/x86/crypto/crc32c-pcl-intel-asm_64.S b/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
index dbc4339b..9f885ee4 100644
--- a/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
+++ b/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
@@ -216,15 +216,11 @@ LABEL crc_ %i
 	## 4) Combine three results:
 	################################################################
 
-	lea	(K_table-16)(%rip), bufp	# first entry is for idx 1
+	lea	(K_table-8)(%rip), bufp		# first entry is for idx 1
 	shlq    $3, %rax			# rax *= 8
-	subq    %rax, tmp			# tmp -= rax*8
-	shlq    $1, %rax
-	subq    %rax, tmp			# tmp -= rax*16
-						# (total tmp -= rax*24)
-	addq    %rax, bufp
-
-	movdqa  (bufp), %xmm0			# 2 consts: K1:K2
+	pmovzxdq (bufp,%rax), %xmm0		# 2 consts: K1:K2
+	leal	(%eax,%eax,2), %eax		# rax *= 3 (total *24)
+	subq    %rax, tmp			# tmp -= rax*24
 
 	movq    crc_init, %xmm1			# CRC for block 1
 	PCLMULQDQ 0x00,%xmm0,%xmm1		# Multiply by K2
@@ -331,136 +327,135 @@ ENDPROC(crc_pcl)
 
 	################################################################
 	## PCLMULQDQ tables
-	## Table is 128 entries x 2 quad words each
+	## Table is 128 entries x 2 words (8 bytes) each
 	################################################################
-.data
-.align 64
+.align 8
 K_table:
-        .quad 0x14cd00bd6,0x105ec76f0
+        .long 0x14cd00bd6,0x105ec76f0
-        .quad 0x0ba4fc28e,0x14cd00bd6
+        .long 0x0ba4fc28e,0x14cd00bd6
-        .quad 0x1d82c63da,0x0f20c0dfe
+        .long 0x1d82c63da,0x0f20c0dfe
-        .quad 0x09e4addf8,0x0ba4fc28e
+        .long 0x09e4addf8,0x0ba4fc28e
-        .quad 0x039d3b296,0x1384aa63a
+        .long 0x039d3b296,0x1384aa63a
-        .quad 0x102f9b8a2,0x1d82c63da
+        .long 0x102f9b8a2,0x1d82c63da
-        .quad 0x14237f5e6,0x01c291d04
+        .long 0x14237f5e6,0x01c291d04
-        .quad 0x00d3b6092,0x09e4addf8
+        .long 0x00d3b6092,0x09e4addf8

(Remaining boring bits of this hunk elided.)
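
For anyone checking the address arithmetic in the first hunk, the
following C sketch models the old and new instruction sequences. The
function names are hypothetical, idx stands for the 1-based block
index held in %rax, and the sketch assumes idx*24 fits in 32 bits,
since the leal form writes %eax and zero-extends into %rax.

```c
#include <stdint.h>

/* Old layout: 16-byte entries; base is K_table-16, so idx 1 -> offset 0. */
static uint64_t old_entry_offset(uint64_t idx)
{
	return idx * 16 - 16;
}

/* New layout: 8-byte entries; base is K_table-8, so idx 1 -> offset 0. */
static uint64_t new_entry_offset(uint64_t idx)
{
	return idx * 8 - 8;
}

/* Old sequence: rax = idx*8; tmp -= rax; rax *= 2; tmp -= rax. */
static uint64_t old_tmp_adjust(uint64_t tmp, uint64_t idx)
{
	uint64_t rax = idx << 3;	/* shlq $3, %rax */

	tmp -= rax;			/* tmp -= idx*8  */
	rax <<= 1;			/* shlq $1, %rax */
	tmp -= rax;			/* tmp -= idx*16 */
	return tmp;
}

/* New sequence: rax = idx*8 (also the table index); rax *= 3; tmp -= rax. */
static uint64_t new_tmp_adjust(uint64_t tmp, uint64_t idx)
{
	uint64_t rax = idx << 3;	/* shlq $3, %rax */

	rax = (uint32_t)(rax + rax * 2);	/* leal (%eax,%eax,2), %eax */
	tmp -= rax;			/* tmp -= idx*24 */
	return tmp;
}
```

Both sequences subtract idx*24 from tmp; the new one simply reuses
idx*8 as the table index before tripling it.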
