From: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
To: Jussi Kivilinna <jussi.kivilinna@iki.fi>,
Herbert Xu <herbert@gondor.apana.org.au>,
"David S. Miller" <davem@davemloft.net>,
Vitaly Chikunov <vt@altlinux.org>,
Eric Biggers <ebiggers@google.com>,
Eric Biggers <ebiggers@kernel.org>,
Gilad Ben-Yossef <gilad@benyossef.com>,
Ard Biesheuvel <ardb@kernel.org>,
Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will@kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
"H. Peter Anvin" <hpa@zytor.com>,
linux-crypto@vger.kernel.org, x86@kernel.org,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 5/6] crypto: x86/sm3 - add AVX assembly implementation
Date: Tue, 21 Dec 2021 15:39:28 +0800
Message-ID: <404b02be-2e94-1d80-8512-f25a5a93378e@linux.alibaba.com>
In-Reply-To: <9e70bf33-bab5-83a3-1eb0-7cae442c2f64@iki.fi>
On 12/21/21 2:03 AM, Jussi Kivilinna wrote:
> On 20.12.2021 10.22, Tianjia Zhang wrote:
>> This patch adds an AVX assembly accelerated implementation of the SM3
>> secure hash algorithm. According to the benchmark data, it is up to 38%
>> faster than the pure software implementation, sm3-generic.
>>
>> The main algorithm implementation is based on the SM3 AES/BMI2
>> accelerated work in libgcrypt at:
>> https://gnupg.org/software/libgcrypt/index.html
>>
>> Benchmark on an Intel i5-6200U at 2.30GHz, comparing the two
>> implementations: pure software sm3-generic and accelerated sm3-avx.
>> The data comes from tcrypt modes 326 and 422. The columns are the
>> per-update lengths, and the unit is Mb/s:
>>
>> update-size |     16      64     256    1024    2048    4096    8192
>> ------------+--------------------------------------------------------
>> sm3-generic | 105.97  129.60  182.12  189.62  188.06  193.66  194.88
>> sm3-avx     | 119.87  163.05  244.44  260.92  257.60  264.87  265.88
>>
>> Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
>> ---
>> arch/x86/crypto/Makefile | 3 +
>> arch/x86/crypto/sm3-avx-asm_64.S | 521 +++++++++++++++++++++++++++++++
>> arch/x86/crypto/sm3_avx_glue.c | 134 ++++++++
>> crypto/Kconfig | 13 +
>> 4 files changed, 671 insertions(+)
>> create mode 100644 arch/x86/crypto/sm3-avx-asm_64.S
>> create mode 100644 arch/x86/crypto/sm3_avx_glue.c
>>
>> diff --git a/arch/x86/crypto/Makefile b/arch/x86/crypto/Makefile
>> index f307c93fc90a..7cbe860f6201 100644
>> --- a/arch/x86/crypto/Makefile
>> +++ b/arch/x86/crypto/Makefile
>> @@ -88,6 +88,9 @@ nhpoly1305-avx2-y := nh-avx2-x86_64.o nhpoly1305-avx2-glue.o
>> obj-$(CONFIG_CRYPTO_CURVE25519_X86) += curve25519-x86_64.o
>> +obj-$(CONFIG_CRYPTO_SM3_AVX_X86_64) += sm3-avx-x86_64.o
>> +sm3-avx-x86_64-y := sm3-avx-asm_64.o sm3_avx_glue.o
>> +
>> obj-$(CONFIG_CRYPTO_SM4_AESNI_AVX_X86_64) += sm4-aesni-avx-x86_64.o
>> sm4-aesni-avx-x86_64-y := sm4-aesni-avx-asm_64.o sm4_aesni_avx_glue.o
>> diff --git a/arch/x86/crypto/sm3-avx-asm_64.S b/arch/x86/crypto/sm3-avx-asm_64.S
>> new file mode 100644
>> index 000000000000..e7a9a37f3609
>> --- /dev/null
>> +++ b/arch/x86/crypto/sm3-avx-asm_64.S
>> @@ -0,0 +1,521 @@
>> +/* SPDX-License-Identifier: GPL-2.0-or-later */
>> +/*
>> + * SM3 AVX accelerated transform.
>> + * specified in: https://datatracker.ietf.org/doc/html/draft-sca-cfrg-sm3-02
>> + *
>> + * Copyright (C) 2021 Jussi Kivilinna <jussi.kivilinna@iki.fi>
>> + * Copyright (C) 2021 Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
>> + */
> <snip>
>> +
>> +#define R(i, a, b, c, d, e, f, g, h, round, widx, wtype) \
>> +	/* rol(a, 12) => t0 */ \
>> +	roll3mov(12, a, t0); /* rorxl here would reduce perf by 6% on zen3 */ \
>> +	/* rol (t0 + e + t), 7) => t1 */ \
>> +	addl3(t0, e, t1); \
>> +	addl $K##round, t1; \
>
> It's better to use "leal K##round(t0, e, 1), t1;" here and fix the K0-K63
> macros instead, as I noted on the libgcrypt mailing list:
> https://lists.gnupg.org/pipermail/gcrypt-devel/2021-December/005209.html
>
> -Jussi
Thanks for pointing it out, I will fix it in the next patch.
Best regards,
Tianjia
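
Editorial sketch of the suggestion above, in C rather than assembly. It
rests on two facts: "leal disp(base, index, 1), dst" computes
base + index + disp modulo 2^32 in a single instruction, and an x86-64
displacement is a signed 32-bit value. One plausible reading of "fix
K0-K63 macros" is therefore that round constants with the top bit set are
rewritten as negative numbers so they can serve as displacements; that
reading is an assumption, not something stated in this message. The helper
names (rol32, sm3_k, as_disp32, sum_two_adds, sum_lea, sm3_k_selfcheck) are
hypothetical and do not appear in the patch. The constants follow the SM3
definition K_j = rol(T_j, j mod 32).

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit word left by n bits (n is reduced modulo 32). */
static uint32_t rol32(uint32_t x, unsigned int n)
{
	n &= 31;
	return n ? (x << n) | (x >> (32 - n)) : x;
}

/* SM3 round constant K_j = rol(T_j, j mod 32), for j in 0..63. */
static uint32_t sm3_k(unsigned int j)
{
	uint32_t t = (j < 16) ? 0x79cc4519u : 0x7a879d8au;

	return rol32(t, j);
}

/* Reinterpret a 32-bit constant as the signed displacement with the same bits. */
static int32_t as_disp32(uint32_t k)
{
	if (k >= 0x80000000u)
		return (int32_t)((int64_t)k - 4294967296LL);
	return (int32_t)k;
}

/* Two-instruction form quoted from the patch: addl3(t0, e, t1); addl $K, t1. */
static uint32_t sum_two_adds(uint32_t t0, uint32_t e, uint32_t k)
{
	uint32_t t1 = t0 + e;

	t1 += k;
	return t1;
}

/* One-instruction form: the value "leal K(t0, e, 1), t1" leaves in t1. */
static uint32_t sum_lea(uint32_t t0, uint32_t e, int32_t disp)
{
	return t0 + e + (uint32_t)disp;
}

/* Check that both forms agree for every round constant (assumed test inputs). */
static void sm3_k_selfcheck(void)
{
	const uint32_t t0 = 0xdeadbeefu, e = 0x89abcdefu;
	unsigned int j;

	for (j = 0; j < 64; j++) {
		uint32_t k = sm3_k(j);

		assert(sum_lea(t0, e, as_disp32(k)) == sum_two_adds(t0, e, k));
	}
}
```

For example, sm3_k(1) is 0xf3988a32, which as_disp32() maps to -208106958;
both forms of the sum give the same 32-bit result because the addition
wraps modulo 2^32.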
Thread overview:
2021-12-20 8:22 [PATCH 0/6] Introduce x86 assembly accelerated implementation for SM3 algorithm Tianjia Zhang
2021-12-20 8:22 ` [PATCH 1/6] crypto: sm3 - create SM3 stand-alone library Tianjia Zhang
2021-12-20 8:22 ` [PATCH 2/6] crypto: arm64/sm3-ce - make dependent on sm3 library Tianjia Zhang
2021-12-20 8:22 ` [PATCH 3/6] crypto: sm2 " Tianjia Zhang
2021-12-20 8:22 ` [PATCH 4/6] crypto: sm3 " Tianjia Zhang
2021-12-20 8:22 ` [PATCH 5/6] crypto: x86/sm3 - add AVX assembly implementation Tianjia Zhang
2021-12-20 18:03 ` Jussi Kivilinna
2021-12-21 7:39 ` Tianjia Zhang [this message]
2021-12-20 8:22 ` [PATCH 6/6] crypto: tcrypt - add asynchronous speed test for SM3 Tianjia Zhang