From: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
To: Eric Biggers <ebiggers@kernel.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>,
"David S. Miller" <davem@davemloft.net>,
Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will@kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
"H. Peter Anvin" <hpa@zytor.com>,
Gilad Ben-Yossef <gilad@benyossef.com>,
Ard Biesheuvel <ard.biesheuvel@linaro.org>,
"Markku-Juhani O . Saarinen" <mjos@iki.fi>,
Jussi Kivilinna <jussi.kivilinna@iki.fi>,
x86@kernel.org, linux-crypto@vger.kernel.org,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 3/3] crypto: x86/sm4 - add AES-NI/AVX/x86_64 assembler implementation
Date: Sun, 13 Jun 2021 18:14:52 +0800 [thread overview]
Message-ID: <d8df4d67-1a01-ff83-7739-af49b5b5e574@linux.alibaba.com> (raw)
In-Reply-To: <YMKf93/cnPGGtRW3@gmail.com>
Hi Eric,
On 6/11/21 7:27 AM, Eric Biggers wrote:
> On Thu, Jun 10, 2021 at 09:44:59PM +0800, Tianjia Zhang wrote:
>> This patch adds AES-NI/AVX/x86_64 assembler implementation of SM4
>> block cipher. Through two affine transforms, we can use the AES
>> S-Box to simulate the SM4 S-Box to achieve the effect of instruction
>> acceleration.
>>
>
> Benchmark results, please.
>
> Also, is this passing the self-tests, including the fuzz tests?
>
I will provide this information in the next version.
>> +/*
>> + * void sm4_aesni_avx_expand_key(const u8 *key, u32 *rk_enc,
>> + * u32 *rk_dec, const u32 *fk, const u32 *ck);
>> + */
>> +SYM_FUNC_START(sm4_aesni_avx_expand_key)
>> + /* input:
>> + * %rdi: 128-bit key
>> + * %rsi: rkey_enc
>> + * %rdx: rkey_dec
>> + * %rcx: fk array
>> + * %r8: ck array
>> + */
>> + FRAME_BEGIN
>
> Key expansion isn't performance-critical. Can the C library version be used, or
> does the key need to be expanded in a way specific to this x86 implementation?
>
It can be replaced by a common implementation of C library. For expand
key that are not called frequently, the optimization of a specific
instruction set does not bring much benefit. Of course, it is possible
to delete this implementation.
>> +/*
>> + * void sm4_aesni_avx_crypt4(const u32 *rk, u8 *dst,
>> + * const u8 *src, int nblocks)
>> + */
>> +SYM_FUNC_START(sm4_aesni_avx_crypt4)
>> + /* input:
>> + * %rdi: round key array, CTX
>> + * %rsi: dst (1..4 blocks)
>> + * %rdx: src (1..4 blocks)
>> + * %rcx: num blocks (1..4)
>> + */
>> + FRAME_BEGIN
> [...]
>
>> +static void sm4_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
>> +{
>> + const struct crypto_sm4_ctx *ctx = crypto_tfm_ctx(tfm);
>> +
>> + if (crypto_simd_usable()) {
>> + kernel_fpu_begin();
>> + sm4_aesni_avx_crypt4(ctx->rkey_enc, out, in, 1);
>> + kernel_fpu_end();
>> + } else
>> + crypto_sm4_do_crypt(ctx->rkey_enc, out, in);
>> +}
>> +
>> +static void sm4_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
>> +{
>> + const struct crypto_sm4_ctx *ctx = crypto_tfm_ctx(tfm);
>> +
>> + if (crypto_simd_usable()) {
>> + kernel_fpu_begin();
>> + sm4_aesni_avx_crypt4(ctx->rkey_dec, out, in, 1);
>> + kernel_fpu_end();
>> + } else
>> + crypto_sm4_do_crypt(ctx->rkey_dec, out, in);
>> +}
>
> Your assembly code appears to handle encrypting up to 4 blocks at a time.
> However you have only wired this up to the "cipher" API which does 1 block at a
> time. Is this intentional?
>
> What are your performance results with real-world chaining modes like XTS, and
> do you plan to implement any of these modes directly?
>
This implementation is intentional. First, a general block encryption is
provided. There is no obvious performance improvement in this
implementation. The key to optimization is to make full use of parallel
four blocks encryption at a time. This is still under development, and I
will continue to implement things like XTS in the future. Optimization
of such specific modes.
>> +
>> +static struct crypto_alg sm4_asm_alg = {
>> + .cra_name = "sm4",
>> + .cra_driver_name = "sm4-asm",
>
> In arch/x86/crypto/, "-asm" usually means a vanilla x86 assembly implementation
> without any AES-NI, SSE, AVX, etc. instructions. Calling this something like
> "sm4-aesni-avx" would make more sense. (Or is it actually avx2, not avx?)
>
will do in next version patch.
>> +config CRYPTO_SM4_AESNI_AVX_X86_64
>> + tristate "SM4 cipher algorithm (x86_64/AES-NI/AVX)"
>> + depends on X86 && 64BIT
>> + select CRYPTO_SKCIPHER
>> + select CRYPTO_SIMD
>> + select CRYPTO_ALGAPI
>> + select CRYPTO_LIB_SM4
>
> As-is, neither CRYPTO_SKCIPHER nor CRYPTO_SIMD needs to be selected here.
>
ditto.
>> + help
>> + SM4 cipher algorithms (OSCCA GB/T 32907-2016) (x86_64/AES-NI/AVX).
>> +
>> + SM4 (GBT.32907-2016) is a cryptographic standard issued by the
>> + Organization of State Commercial Administration of China (OSCCA)
>> + as an authorized cryptographic algorithms for the use within China.
>> +
>> + SMS4 was originally created for use in protecting wireless
>> + networks, and is mandated in the Chinese National Standard for
>> + Wireless LAN WAPI (Wired Authentication and Privacy Infrastructure)
>> + (GB.15629.11-2003).
>> +
>> + The latest SM4 standard (GBT.32907-2016) was proposed by OSCCA and
>> + standardized through TC 260 of the Standardization Administration
>> + of the People's Republic of China (SAC).
>> +
>> + The input, output, and key of SMS4 are each 128 bits.
>> +
>> + See also: <https://eprint.iacr.org/2008/329.pdf>
>> +
>> + If unsure, say N.
>
> This is the help text for the x86 implementation specifically. Please don't
> have boilerplate text about the algorithm here; that already exists for the
> generic implementation. The text should explain about the x86 implementation.
>
ditto.
> - Eric
>
Thanks for your suggestion.
Cheers,
Tianjia
prev parent reply other threads:[~2021-06-13 10:14 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-06-10 13:44 [PATCH 0/3] Introduce x86 assembler accelerated implementation for SM4 algorithm Tianjia Zhang
2021-06-10 13:44 ` [PATCH 1/3] crypto: sm4 - create SM4 library based on sm4 generic code Tianjia Zhang
2021-06-10 23:19 ` Eric Biggers
2021-06-13 10:14 ` Tianjia Zhang
2022-03-01 10:34 ` Jason A. Donenfeld
2022-03-01 11:50 ` Tianjia Zhang
2022-03-01 13:22 ` Jason A. Donenfeld
2022-03-01 14:17 ` Jason A. Donenfeld
2022-03-02 0:24 ` Herbert Xu
2022-03-02 0:26 ` Jason A. Donenfeld
2022-03-02 22:23 ` Eric Biggers
2022-03-11 23:03 ` Jason A. Donenfeld
2022-03-14 2:32 ` Tianjia Zhang
2022-03-14 2:40 ` Jason A. Donenfeld
2022-03-14 2:45 ` Herbert Xu
2022-03-14 2:46 ` Jason A. Donenfeld
2021-06-10 13:44 ` [PATCH 2/3] crypto: arm64/sm4-ce - Make dependent on sm4 library instead of sm4-generic Tianjia Zhang
2021-06-10 13:44 ` [PATCH 3/3] crypto: x86/sm4 - add AES-NI/AVX/x86_64 assembler implementation Tianjia Zhang
2021-06-10 23:27 ` Eric Biggers
2021-06-13 10:14 ` Tianjia Zhang [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=d8df4d67-1a01-ff83-7739-af49b5b5e574@linux.alibaba.com \
--to=tianjia.zhang@linux.alibaba.com \
--cc=ard.biesheuvel@linaro.org \
--cc=bp@alien8.de \
--cc=catalin.marinas@arm.com \
--cc=davem@davemloft.net \
--cc=ebiggers@kernel.org \
--cc=gilad@benyossef.com \
--cc=herbert@gondor.apana.org.au \
--cc=hpa@zytor.com \
--cc=jussi.kivilinna@iki.fi \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-crypto@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=mjos@iki.fi \
--cc=tglx@linutronix.de \
--cc=will@kernel.org \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).