From: Eric Biggers <ebiggers@kernel.org> To: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Cc: Herbert Xu <herbert@gondor.apana.org.au>, "David S. Miller" <davem@davemloft.net>, Catalin Marinas <catalin.marinas@arm.com>, Will Deacon <will@kernel.org>, Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>, "H. Peter Anvin" <hpa@zytor.com>, Gilad Ben-Yossef <gilad@benyossef.com>, Ard Biesheuvel <ard.biesheuvel@linaro.org>, "Markku-Juhani O . Saarinen" <mjos@iki.fi>, Jussi Kivilinna <jussi.kivilinna@iki.fi>, x86@kernel.org, linux-crypto@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH 3/3] crypto: x86/sm4 - add AES-NI/AVX/x86_64 assembler implementation Date: Thu, 10 Jun 2021 16:27:51 -0700 [thread overview] Message-ID: <YMKf93/cnPGGtRW3@gmail.com> (raw) In-Reply-To: <20210610134459.28541-4-tianjia.zhang@linux.alibaba.com> On Thu, Jun 10, 2021 at 09:44:59PM +0800, Tianjia Zhang wrote: > This patch adds AES-NI/AVX/x86_64 assembler implementation of SM4 > block cipher. Through two affine transforms, we can use the AES > S-Box to simulate the SM4 S-Box to achieve the effect of instruction > acceleration. > Benchmark results, please. Also, is this passing the self-tests, including the fuzz tests? > +/* > + * void sm4_aesni_avx_expand_key(const u8 *key, u32 *rk_enc, > + * u32 *rk_dec, const u32 *fk, const u32 *ck); > + */ > +SYM_FUNC_START(sm4_aesni_avx_expand_key) > + /* input: > + * %rdi: 128-bit key > + * %rsi: rkey_enc > + * %rdx: rkey_dec > + * %rcx: fk array > + * %r8: ck array > + */ > + FRAME_BEGIN Key expansion isn't performance-critical. Can the C library version be used, or does the key need to be expanded in a way specific to this x86 implementation? > +/* > + * void sm4_aesni_avx_crypt4(const u32 *rk, u8 *dst, > + * const u8 *src, int nblocks) > + */ > +SYM_FUNC_START(sm4_aesni_avx_crypt4) > + /* input: > + * %rdi: round key array, CTX > + * %rsi: dst (1..4 blocks) > + * %rdx: src (1..4 blocks) > + * %rcx: num blocks (1..4) > + */ > + FRAME_BEGIN [...] > +static void sm4_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in) > +{ > + const struct crypto_sm4_ctx *ctx = crypto_tfm_ctx(tfm); > + > + if (crypto_simd_usable()) { > + kernel_fpu_begin(); > + sm4_aesni_avx_crypt4(ctx->rkey_enc, out, in, 1); > + kernel_fpu_end(); > + } else > + crypto_sm4_do_crypt(ctx->rkey_enc, out, in); > +} > + > +static void sm4_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in) > +{ > + const struct crypto_sm4_ctx *ctx = crypto_tfm_ctx(tfm); > + > + if (crypto_simd_usable()) { > + kernel_fpu_begin(); > + sm4_aesni_avx_crypt4(ctx->rkey_dec, out, in, 1); > + kernel_fpu_end(); > + } else > + crypto_sm4_do_crypt(ctx->rkey_dec, out, in); > +} Your assembly code appears to handle encrypting up to 4 blocks at a time. However you have only wired this up to the "cipher" API which does 1 block at a time. Is this intentional? What are your performance results with real-world chaining modes like XTS, and do you plan to implement any of these modes directly? > + > +static struct crypto_alg sm4_asm_alg = { > + .cra_name = "sm4", > + .cra_driver_name = "sm4-asm", In arch/x86/crypto/, "-asm" usually means a vanilla x86 assembly implementation without any AES-NI, SSE, AVX, etc. instructions. Calling this something like "sm4-aesni-avx" would make more sense. (Or is it actually avx2, not avx?) > +config CRYPTO_SM4_AESNI_AVX_X86_64 > + tristate "SM4 cipher algorithm (x86_64/AES-NI/AVX)" > + depends on X86 && 64BIT > + select CRYPTO_SKCIPHER > + select CRYPTO_SIMD > + select CRYPTO_ALGAPI > + select CRYPTO_LIB_SM4 As-is, neither CRYPTO_SKCIPHER nor CRYPTO_SIMD needs to be selected here. > + help > + SM4 cipher algorithms (OSCCA GB/T 32907-2016) (x86_64/AES-NI/AVX). > + > + SM4 (GBT.32907-2016) is a cryptographic standard issued by the > + Organization of State Commercial Administration of China (OSCCA) > + as an authorized cryptographic algorithms for the use within China. > + > + SMS4 was originally created for use in protecting wireless > + networks, and is mandated in the Chinese National Standard for > + Wireless LAN WAPI (Wired Authentication and Privacy Infrastructure) > + (GB.15629.11-2003). > + > + The latest SM4 standard (GBT.32907-2016) was proposed by OSCCA and > + standardized through TC 260 of the Standardization Administration > + of the People's Republic of China (SAC). > + > + The input, output, and key of SMS4 are each 128 bits. > + > + See also: <https://eprint.iacr.org/2008/329.pdf> > + > + If unsure, say N. This is the help text for the x86 implementation specifically. Please don't have boilerplate text about the algorithm here; that already exists for the generic implementation. The text should explain about the x86 implementation. - Eric
WARNING: multiple messages have this Message-ID (diff)
From: Eric Biggers <ebiggers@kernel.org> To: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Cc: Herbert Xu <herbert@gondor.apana.org.au>, "David S. Miller" <davem@davemloft.net>, Catalin Marinas <catalin.marinas@arm.com>, Will Deacon <will@kernel.org>, Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>, "H. Peter Anvin" <hpa@zytor.com>, Gilad Ben-Yossef <gilad@benyossef.com>, Ard Biesheuvel <ard.biesheuvel@linaro.org>, "Markku-Juhani O . Saarinen" <mjos@iki.fi>, Jussi Kivilinna <jussi.kivilinna@iki.fi>, x86@kernel.org, linux-crypto@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH 3/3] crypto: x86/sm4 - add AES-NI/AVX/x86_64 assembler implementation Date: Thu, 10 Jun 2021 16:27:51 -0700 [thread overview] Message-ID: <YMKf93/cnPGGtRW3@gmail.com> (raw) In-Reply-To: <20210610134459.28541-4-tianjia.zhang@linux.alibaba.com> On Thu, Jun 10, 2021 at 09:44:59PM +0800, Tianjia Zhang wrote: > This patch adds AES-NI/AVX/x86_64 assembler implementation of SM4 > block cipher. Through two affine transforms, we can use the AES > S-Box to simulate the SM4 S-Box to achieve the effect of instruction > acceleration. > Benchmark results, please. Also, is this passing the self-tests, including the fuzz tests? > +/* > + * void sm4_aesni_avx_expand_key(const u8 *key, u32 *rk_enc, > + * u32 *rk_dec, const u32 *fk, const u32 *ck); > + */ > +SYM_FUNC_START(sm4_aesni_avx_expand_key) > + /* input: > + * %rdi: 128-bit key > + * %rsi: rkey_enc > + * %rdx: rkey_dec > + * %rcx: fk array > + * %r8: ck array > + */ > + FRAME_BEGIN Key expansion isn't performance-critical. Can the C library version be used, or does the key need to be expanded in a way specific to this x86 implementation? > +/* > + * void sm4_aesni_avx_crypt4(const u32 *rk, u8 *dst, > + * const u8 *src, int nblocks) > + */ > +SYM_FUNC_START(sm4_aesni_avx_crypt4) > + /* input: > + * %rdi: round key array, CTX > + * %rsi: dst (1..4 blocks) > + * %rdx: src (1..4 blocks) > + * %rcx: num blocks (1..4) > + */ > + FRAME_BEGIN [...] > +static void sm4_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in) > +{ > + const struct crypto_sm4_ctx *ctx = crypto_tfm_ctx(tfm); > + > + if (crypto_simd_usable()) { > + kernel_fpu_begin(); > + sm4_aesni_avx_crypt4(ctx->rkey_enc, out, in, 1); > + kernel_fpu_end(); > + } else > + crypto_sm4_do_crypt(ctx->rkey_enc, out, in); > +} > + > +static void sm4_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in) > +{ > + const struct crypto_sm4_ctx *ctx = crypto_tfm_ctx(tfm); > + > + if (crypto_simd_usable()) { > + kernel_fpu_begin(); > + sm4_aesni_avx_crypt4(ctx->rkey_dec, out, in, 1); > + kernel_fpu_end(); > + } else > + crypto_sm4_do_crypt(ctx->rkey_dec, out, in); > +} Your assembly code appears to handle encrypting up to 4 blocks at a time. However you have only wired this up to the "cipher" API which does 1 block at a time. Is this intentional? What are your performance results with real-world chaining modes like XTS, and do you plan to implement any of these modes directly? > + > +static struct crypto_alg sm4_asm_alg = { > + .cra_name = "sm4", > + .cra_driver_name = "sm4-asm", In arch/x86/crypto/, "-asm" usually means a vanilla x86 assembly implementation without any AES-NI, SSE, AVX, etc. instructions. Calling this something like "sm4-aesni-avx" would make more sense. (Or is it actually avx2, not avx?) > +config CRYPTO_SM4_AESNI_AVX_X86_64 > + tristate "SM4 cipher algorithm (x86_64/AES-NI/AVX)" > + depends on X86 && 64BIT > + select CRYPTO_SKCIPHER > + select CRYPTO_SIMD > + select CRYPTO_ALGAPI > + select CRYPTO_LIB_SM4 As-is, neither CRYPTO_SKCIPHER nor CRYPTO_SIMD needs to be selected here. > + help > + SM4 cipher algorithms (OSCCA GB/T 32907-2016) (x86_64/AES-NI/AVX). > + > + SM4 (GBT.32907-2016) is a cryptographic standard issued by the > + Organization of State Commercial Administration of China (OSCCA) > + as an authorized cryptographic algorithms for the use within China. > + > + SMS4 was originally created for use in protecting wireless > + networks, and is mandated in the Chinese National Standard for > + Wireless LAN WAPI (Wired Authentication and Privacy Infrastructure) > + (GB.15629.11-2003). > + > + The latest SM4 standard (GBT.32907-2016) was proposed by OSCCA and > + standardized through TC 260 of the Standardization Administration > + of the People's Republic of China (SAC). > + > + The input, output, and key of SMS4 are each 128 bits. > + > + See also: <https://eprint.iacr.org/2008/329.pdf> > + > + If unsure, say N. This is the help text for the x86 implementation specifically. Please don't have boilerplate text about the algorithm here; that already exists for the generic implementation. The text should explain about the x86 implementation. - Eric _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
next prev parent reply other threads:[~2021-06-10 23:27 UTC|newest] Thread overview: 40+ messages / expand[flat|nested] mbox.gz Atom feed top 2021-06-10 13:44 [PATCH 0/3] Introduce x86 assembler accelerated implementation for SM4 algorithm Tianjia Zhang 2021-06-10 13:44 ` Tianjia Zhang 2021-06-10 13:44 ` [PATCH 1/3] crypto: sm4 - create SM4 library based on sm4 generic code Tianjia Zhang 2021-06-10 13:44 ` Tianjia Zhang 2021-06-10 23:19 ` Eric Biggers 2021-06-10 23:19 ` Eric Biggers 2021-06-13 10:14 ` Tianjia Zhang 2021-06-13 10:14 ` Tianjia Zhang 2022-03-01 10:34 ` Jason A. Donenfeld 2022-03-01 10:34 ` Jason A. Donenfeld 2022-03-01 11:50 ` Tianjia Zhang 2022-03-01 11:50 ` Tianjia Zhang 2022-03-01 13:22 ` Jason A. Donenfeld 2022-03-01 13:22 ` Jason A. Donenfeld 2022-03-01 14:17 ` Jason A. Donenfeld 2022-03-01 14:17 ` Jason A. Donenfeld 2022-03-02 0:24 ` Herbert Xu 2022-03-02 0:24 ` Herbert Xu 2022-03-02 0:26 ` Jason A. Donenfeld 2022-03-02 0:26 ` Jason A. Donenfeld 2022-03-02 22:23 ` Eric Biggers 2022-03-02 22:23 ` Eric Biggers 2022-03-11 23:03 ` Jason A. Donenfeld 2022-03-11 23:03 ` Jason A. Donenfeld 2022-03-14 2:32 ` Tianjia Zhang 2022-03-14 2:32 ` Tianjia Zhang 2022-03-14 2:40 ` Jason A. Donenfeld 2022-03-14 2:40 ` Jason A. Donenfeld 2022-03-14 2:45 ` Herbert Xu 2022-03-14 2:45 ` Herbert Xu 2022-03-14 2:46 ` Jason A. Donenfeld 2022-03-14 2:46 ` Jason A. Donenfeld 2021-06-10 13:44 ` [PATCH 2/3] crypto: arm64/sm4-ce - Make dependent on sm4 library instead of sm4-generic Tianjia Zhang 2021-06-10 13:44 ` Tianjia Zhang 2021-06-10 13:44 ` [PATCH 3/3] crypto: x86/sm4 - add AES-NI/AVX/x86_64 assembler implementation Tianjia Zhang 2021-06-10 13:44 ` Tianjia Zhang 2021-06-10 23:27 ` Eric Biggers [this message] 2021-06-10 23:27 ` Eric Biggers 2021-06-13 10:14 ` Tianjia Zhang 2021-06-13 10:14 ` Tianjia Zhang
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=YMKf93/cnPGGtRW3@gmail.com \ --to=ebiggers@kernel.org \ --cc=ard.biesheuvel@linaro.org \ --cc=bp@alien8.de \ --cc=catalin.marinas@arm.com \ --cc=davem@davemloft.net \ --cc=gilad@benyossef.com \ --cc=herbert@gondor.apana.org.au \ --cc=hpa@zytor.com \ --cc=jussi.kivilinna@iki.fi \ --cc=linux-arm-kernel@lists.infradead.org \ --cc=linux-crypto@vger.kernel.org \ --cc=linux-kernel@vger.kernel.org \ --cc=mingo@redhat.com \ --cc=mjos@iki.fi \ --cc=tglx@linutronix.de \ --cc=tianjia.zhang@linux.alibaba.com \ --cc=will@kernel.org \ --cc=x86@kernel.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.