* [PATCH 0/4] arm64: wire CRC32 instructions into core crc32 routines
From: Ard Biesheuvel @ 2018-08-27 11:02 UTC
To: linux-arm-kernel, linux-crypto
Cc: will.deacon, catalin.marinas, herbert, ebiggers, suzuki.poulose, linux-kernel, Ard Biesheuvel

There are many crc32 users in the kernel that call the library routine
rather than the crypto API wrapper, and so none of these callers use the
accelerated arm64 instructions when available. While this is not known
to cause performance issues, calling a table based time variant
implementation with a non-negligible D-cache footprint (8 KB) is wasteful
in any case, and now that the crc32 instructions have been made mandatory
in the architecture, let's wire them up into the core crc routines.

This also means that they will be exposed to the crypto API via the
generic CRC32 driver, and so we can remove the scalar routines from the
crypto API driver.

This leaves the PMULL code, which will only be useful on systems that
implement 64x64 PMULL but not the CRC32 instructions. Given that no such
systems are known to exist, this driver is removed entirely in patch #4.

Ard Biesheuvel (4):
  lib/crc32: make core crc32() routines weak so they can be overridden
  arm64: cpufeature: add feature for CRC32 instructions
  arm64/lib: add accelerated crc32 routines
  crypto: arm64/crc32 - remove PMULL based CRC32 driver

 arch/arm64/Kconfig                |   1 +
 arch/arm64/configs/defconfig      |   1 -
 arch/arm64/crypto/Kconfig         |   5 -
 arch/arm64/crypto/Makefile        |   3 -
 arch/arm64/crypto/crc32-ce-core.S | 287 --------------------
 arch/arm64/crypto/crc32-ce-glue.c | 244 -----------------
 arch/arm64/include/asm/cpucaps.h  |   3 +-
 arch/arm64/kernel/cpufeature.c    |   9 +
 arch/arm64/lib/Makefile           |   2 +
 arch/arm64/lib/crc32.S            |  60 ++++
 lib/crc32.c                       |  11 +-
 11 files changed, 81 insertions(+), 545 deletions(-)
 delete mode 100644 arch/arm64/crypto/crc32-ce-core.S
 delete mode 100644 arch/arm64/crypto/crc32-ce-glue.c
 create mode 100644 arch/arm64/lib/crc32.S

-- 
2.18.0
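For readers wondering where the 8 KB D-cache figure in the cover letter comes from: it is consistent with a slice-by-8 layout (the term used later in the thread), which keeps eight tables of 256 32-bit entries. The sketch below is a minimal user-space illustration of that technique and is not the code in lib/crc32.c. The names `crc32_tables`, `crc32_tables_init`, `load32_le` and `crc32_slice8` are made up for this example; only `CRC32_POLY_LE` and its value appear in the thread.

```c
#include <stddef.h>
#include <stdint.h>

#define CRC32_POLY_LE 0xedb88320u	/* reflected CRC32 polynomial */

/* 8 tables x 256 entries x 4 bytes = 8192 bytes = 8 KB */
static uint32_t crc32_tables[8][256];

/* Hypothetical helper: fill the eight lookup tables. */
static void crc32_tables_init(void)
{
	for (uint32_t i = 0; i < 256; i++) {
		uint32_t crc = i;

		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ ((crc & 1) ? CRC32_POLY_LE : 0);
		crc32_tables[0][i] = crc;
	}
	/* Table t holds the CRC of byte i followed by t zero bytes. */
	for (int t = 1; t < 8; t++)
		for (uint32_t i = 0; i < 256; i++) {
			uint32_t prev = crc32_tables[t - 1][i];

			crc32_tables[t][i] = (prev >> 8) ^
					     crc32_tables[0][prev & 0xff];
		}
}

/* Read four bytes as a little-endian 32-bit value. */
static uint32_t load32_le(const unsigned char *p)
{
	return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
	       (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/*
 * Hypothetical slice-by-8 CRC32. The caller supplies the seed and does
 * any final inversion; nothing is inverted here.
 */
static uint32_t crc32_slice8(uint32_t crc, const unsigned char *p, size_t len)
{
	while (len >= 8) {
		uint32_t lo = load32_le(p) ^ crc;
		uint32_t hi = load32_le(p + 4);

		/* Eight data-dependent loads per 8 input bytes. */
		crc = crc32_tables[7][lo & 0xff] ^
		      crc32_tables[6][(lo >> 8) & 0xff] ^
		      crc32_tables[5][(lo >> 16) & 0xff] ^
		      crc32_tables[4][lo >> 24] ^
		      crc32_tables[3][hi & 0xff] ^
		      crc32_tables[2][(hi >> 8) & 0xff] ^
		      crc32_tables[1][(hi >> 16) & 0xff] ^
		      crc32_tables[0][hi >> 24];
		p += 8;
		len -= 8;
	}
	while (len--)	/* remaining 0..7 bytes, one table lookup each */
		crc = (crc >> 8) ^ crc32_tables[0][(crc ^ *p++) & 0xff];
	return crc;
}
```

Every 8 input bytes cost eight loads at data-dependent addresses spread over the 8 KB of tables, which is the "time variant" and cache-footprint concern the cover letter raises. A caller would run `crc32_tables_init()` once and then call `crc32_slice8(~0u, buf, len) ^ ~0u` for the conventional CRC-32 of a buffer.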
* [PATCH 1/4] lib/crc32: make core crc32() routines weak so they can be overridden
From: Ard Biesheuvel @ 2018-08-27 11:02 UTC
To: linux-arm-kernel, linux-crypto
Cc: will.deacon, catalin.marinas, herbert, ebiggers, suzuki.poulose, linux-kernel, Ard Biesheuvel

Allow architectures to drop in accelerated CRC32 routines by making
the crc32_le/__crc32c_le entry points weak, and exposing non-weak
aliases for them that may be used by the accelerated versions as
fallbacks in case the instructions they rely upon are not available.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 lib/crc32.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/lib/crc32.c b/lib/crc32.c
index a6c9afafc8c8..45b1d67a1767 100644
--- a/lib/crc32.c
+++ b/lib/crc32.c
@@ -183,21 +183,21 @@ static inline u32 __pure crc32_le_generic(u32 crc, unsigned char const *p,
 }
 
 #if CRC_LE_BITS == 1
-u32 __pure crc32_le(u32 crc, unsigned char const *p, size_t len)
+u32 __pure __weak crc32_le(u32 crc, unsigned char const *p, size_t len)
 {
 	return crc32_le_generic(crc, p, len, NULL, CRC32_POLY_LE);
 }
-u32 __pure __crc32c_le(u32 crc, unsigned char const *p, size_t len)
+u32 __pure __weak __crc32c_le(u32 crc, unsigned char const *p, size_t len)
 {
 	return crc32_le_generic(crc, p, len, NULL, CRC32C_POLY_LE);
 }
 #else
-u32 __pure crc32_le(u32 crc, unsigned char const *p, size_t len)
+u32 __pure __weak crc32_le(u32 crc, unsigned char const *p, size_t len)
 {
 	return crc32_le_generic(crc, p, len,
 			(const u32 (*)[256])crc32table_le, CRC32_POLY_LE);
 }
-u32 __pure __crc32c_le(u32 crc, unsigned char const *p, size_t len)
+u32 __pure __weak __crc32c_le(u32 crc, unsigned char const *p, size_t len)
 {
 	return crc32_le_generic(crc, p, len,
 			(const u32 (*)[256])crc32ctable_le, CRC32C_POLY_LE);
@@ -206,6 +206,9 @@ u32 __pure __crc32c_le(u32 crc, unsigned char const *p, size_t len)
 EXPORT_SYMBOL(crc32_le);
 EXPORT_SYMBOL(__crc32c_le);
 
+u32 crc32_le_base(u32, unsigned char const *, size_t) __alias(crc32_le);
+u32 __crc32c_le_base(u32, unsigned char const *, size_t) __alias(__crc32c_le);
+
 /*
  * This multiplies the polynomials x and y modulo the given modulus.
  * This follows the "little-endian" CRC convention that the lsbit
-- 
2.18.0
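The weak-plus-alias pattern in this patch can be tried outside the kernel. The following is a minimal user-space sketch, assuming GCC or Clang on an ELF target (the `weak` and `alias` attributes are compiler extensions and are not available everywhere). It spells out the attributes that the kernel's `__weak` and `__alias()` macros are assumed to wrap, and it uses a plain bitwise loop as a stand-in for the generic routine, so the function bodies are illustrative rather than the contents of lib/crc32.c.

```c
#include <stddef.h>
#include <stdint.h>

#define CRC32_POLY_LE 0xedb88320u

/*
 * Weak generic entry point: if another object file provides a strong
 * crc32_le(), the linker picks that one instead of this definition.
 */
__attribute__((weak))
uint32_t crc32_le(uint32_t crc, const unsigned char *p, size_t len)
{
	while (len--) {
		crc ^= *p++;
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ ((crc & 1) ? CRC32_POLY_LE : 0);
	}
	return crc;
}

/*
 * Non-weak second name for the code above. It keeps pointing at this
 * generic body even when crc32_le itself is overridden, so an
 * accelerated crc32_le() can branch to it as a fallback.
 */
uint32_t crc32_le_base(uint32_t crc, const unsigned char *p, size_t len)
	__attribute__((alias("crc32_le")));
```

With only this file linked, `crc32_le` and `crc32_le_base` are the same function. Adding a second file with a strong `crc32_le` changes what callers of `crc32_le` get, while `crc32_le_base` still reaches the generic code.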
* Re: [PATCH 1/4] lib/crc32: make core crc32() routines weak so they can be overridden
From: Herbert Xu @ 2018-09-04 9:44 UTC
To: Ard Biesheuvel
Cc: linux-arm-kernel, linux-crypto, will.deacon, catalin.marinas, ebiggers, suzuki.poulose, linux-kernel

On Mon, Aug 27, 2018 at 01:02:42PM +0200, Ard Biesheuvel wrote:
> Allow architectures to drop in accelerated CRC32 routines by making
> the crc32_le/__crc32c_le entry points weak, and exposing non-weak
> aliases for them that may be used by the accelerated versions as
> fallbacks in case the instructions they rely upon are not available.
>
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

Acked-by: Herbert Xu <herbert@gondor.apana.org.au>

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
* [PATCH 2/4] arm64: cpufeature: add feature for CRC32 instructions
From: Ard Biesheuvel @ 2018-08-27 11:02 UTC
To: linux-arm-kernel, linux-crypto
Cc: will.deacon, catalin.marinas, herbert, ebiggers, suzuki.poulose, linux-kernel, Ard Biesheuvel

Add a CRC32 feature bit and wire it up to the CPU id register so we
will be able to use alternatives patching for CRC32 operations.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/include/asm/cpucaps.h | 3 ++-
 arch/arm64/kernel/cpufeature.c   | 9 +++++++++
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index ae1f70450fb2..9932aca9704b 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -51,7 +51,8 @@
 #define ARM64_SSBD				30
 #define ARM64_MISMATCHED_CACHE_TYPE		31
 #define ARM64_HAS_STAGE2_FWB			32
+#define ARM64_HAS_CRC32				33
 
-#define ARM64_NCAPS				33
+#define ARM64_NCAPS				34
 
 #endif /* __ASM_CPUCAPS_H */
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index e238b7932096..7626b80128f5 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1222,6 +1222,15 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		.cpu_enable = cpu_enable_hw_dbm,
 	},
 #endif
+	{
+		.desc = "CRC32 instructions",
+		.capability = ARM64_HAS_CRC32,
+		.type = ARM64_CPUCAP_SYSTEM_FEATURE,
+		.matches = has_cpuid_feature,
+		.sys_reg = SYS_ID_AA64ISAR0_EL1,
+		.field_pos = ID_AA64ISAR0_CRC32_SHIFT,
+		.min_field_value = 1,
+	},
 	{},
 };
 
-- 
2.18.0
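The capability entry above amounts to "read a 4-bit field of ID_AA64ISAR0_EL1 and require it to be at least 1". A rough user-space model of that comparison is sketched below. It assumes the CRC32 field occupies bits [19:16], which is why the shift constant is defined as 16 here, and it treats the field as unsigned. The helper names `id_field` and `has_crc32` are invented for the example; the kernel's `has_cpuid_feature()` does more than this (for instance, it works on sanitised system-wide register values), so this is not its implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed position of the CRC32 field in ID_AA64ISAR0_EL1: bits [19:16]. */
#define ID_AA64ISAR0_CRC32_SHIFT 16

/* Hypothetical helper: extract one unsigned 4-bit ID register field. */
static unsigned int id_field(uint64_t reg, unsigned int shift)
{
	return (unsigned int)((reg >> shift) & 0xf);
}

/*
 * Hypothetical model of the check the table entry describes:
 * field value >= .min_field_value, with .min_field_value = 1.
 */
static bool has_crc32(uint64_t id_aa64isar0)
{
	return id_field(id_aa64isar0, ID_AA64ISAR0_CRC32_SHIFT) >= 1;
}
```

A register value of 0x10000 (field value 1) passes the check, while a value with only lower fields set, such as 0xffff, does not.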
* Re: [PATCH 2/4] arm64: cpufeature: add feature for CRC32 instructions
From: Will Deacon @ 2018-08-28 17:01 UTC
To: Ard Biesheuvel
Cc: linux-arm-kernel, linux-crypto, catalin.marinas, herbert, ebiggers, suzuki.poulose, linux-kernel

On Mon, Aug 27, 2018 at 01:02:43PM +0200, Ard Biesheuvel wrote:
> Add a CRC32 feature bit and wire it up to the CPU id register so we
> will be able to use alternatives patching for CRC32 operations.
>
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> ---
>  arch/arm64/include/asm/cpucaps.h | 3 ++-
>  arch/arm64/kernel/cpufeature.c   | 9 +++++++++
>  2 files changed, 11 insertions(+), 1 deletion(-)

Acked-by: Will Deacon <will.deacon@arm.com>

With the minor caveat below...

> diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
> index ae1f70450fb2..9932aca9704b 100644
> --- a/arch/arm64/include/asm/cpucaps.h
> +++ b/arch/arm64/include/asm/cpucaps.h
> @@ -51,7 +51,8 @@
>  #define ARM64_SSBD				30
>  #define ARM64_MISMATCHED_CACHE_TYPE		31
>  #define ARM64_HAS_STAGE2_FWB			32
> +#define ARM64_HAS_CRC32				33
>
> -#define ARM64_NCAPS				33
> +#define ARM64_NCAPS				34

... if this goes via crypto, you'll almost certainly get a (trivial) conflict
with arm64, since these numbers get bumped all the time.

Will

>  #endif /* __ASM_CPUCAPS_H */
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index e238b7932096..7626b80128f5 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -1222,6 +1222,15 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
>  		.cpu_enable = cpu_enable_hw_dbm,
>  	},
>  #endif
> +	{
> +		.desc = "CRC32 instructions",
> +		.capability = ARM64_HAS_CRC32,
> +		.type = ARM64_CPUCAP_SYSTEM_FEATURE,
> +		.matches = has_cpuid_feature,
> +		.sys_reg = SYS_ID_AA64ISAR0_EL1,
> +		.field_pos = ID_AA64ISAR0_CRC32_SHIFT,
> +		.min_field_value = 1,
> +	},
>  	{},
>  };
>
> --
> 2.18.0
>
* Re: [PATCH 2/4] arm64: cpufeature: add feature for CRC32 instructions
From: Ard Biesheuvel @ 2018-08-28 18:43 UTC
To: Will Deacon
Cc: linux-arm-kernel, open list:HARDWARE RANDOM NUMBER GENERATOR CORE, Catalin Marinas, Herbert Xu, Eric Biggers, Suzuki K. Poulose, Linux Kernel Mailing List

On 28 August 2018 at 19:01, Will Deacon <will.deacon@arm.com> wrote:
> On Mon, Aug 27, 2018 at 01:02:43PM +0200, Ard Biesheuvel wrote:
>> Add a CRC32 feature bit and wire it up to the CPU id register so we
>> will be able to use alternatives patching for CRC32 operations.
>>
>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>> ---
>>  arch/arm64/include/asm/cpucaps.h | 3 ++-
>>  arch/arm64/kernel/cpufeature.c   | 9 +++++++++
>>  2 files changed, 11 insertions(+), 1 deletion(-)
>
> Acked-by: Will Deacon <will.deacon@arm.com>
>
> With the minor caveat below...
>
>> diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
>> index ae1f70450fb2..9932aca9704b 100644
>> --- a/arch/arm64/include/asm/cpucaps.h
>> +++ b/arch/arm64/include/asm/cpucaps.h
>> @@ -51,7 +51,8 @@
>>  #define ARM64_SSBD				30
>>  #define ARM64_MISMATCHED_CACHE_TYPE		31
>>  #define ARM64_HAS_STAGE2_FWB			32
>> +#define ARM64_HAS_CRC32				33
>>
>> -#define ARM64_NCAPS				33
>> +#define ARM64_NCAPS				34
>
>
> ... if this goes via crypto, you'll almost certainly get a (trivial)
> conflict with arm64, since these numbers get bumped all the time.
>

I think the first three patches should go through the arm64 tree. The
last one just removes the now redundant crc32 SIMD driver, and Herbert
could pick that up separately, i.e., it should be totally independent.

>>  #endif /* __ASM_CPUCAPS_H */
>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
>> index e238b7932096..7626b80128f5 100644
>> --- a/arch/arm64/kernel/cpufeature.c
>> +++ b/arch/arm64/kernel/cpufeature.c
>> @@ -1222,6 +1222,15 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
>>  		.cpu_enable = cpu_enable_hw_dbm,
>>  	},
>>  #endif
>> +	{
>> +		.desc = "CRC32 instructions",
>> +		.capability = ARM64_HAS_CRC32,
>> +		.type = ARM64_CPUCAP_SYSTEM_FEATURE,
>> +		.matches = has_cpuid_feature,
>> +		.sys_reg = SYS_ID_AA64ISAR0_EL1,
>> +		.field_pos = ID_AA64ISAR0_CRC32_SHIFT,
>> +		.min_field_value = 1,
>> +	},
>>  	{},
>>  };
>>
>> --
>> 2.18.0
>>
* Re: [PATCH 2/4] arm64: cpufeature: add feature for CRC32 instructions
From: Herbert Xu @ 2018-09-04 3:18 UTC
To: Ard Biesheuvel
Cc: Will Deacon, linux-arm-kernel, open list:HARDWARE RANDOM NUMBER GENERATOR CORE, Catalin Marinas, Eric Biggers, Suzuki K. Poulose, Linux Kernel Mailing List

On Tue, Aug 28, 2018 at 08:43:35PM +0200, Ard Biesheuvel wrote:
> On 28 August 2018 at 19:01, Will Deacon <will.deacon@arm.com> wrote:
> > On Mon, Aug 27, 2018 at 01:02:43PM +0200, Ard Biesheuvel wrote:
> >> Add a CRC32 feature bit and wire it up to the CPU id register so we
> >> will be able to use alternatives patching for CRC32 operations.
> >>
> >> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> >> ---
> >>  arch/arm64/include/asm/cpucaps.h | 3 ++-
> >>  arch/arm64/kernel/cpufeature.c   | 9 +++++++++
> >>  2 files changed, 11 insertions(+), 1 deletion(-)
> >
> > Acked-by: Will Deacon <will.deacon@arm.com>
> >
> > With the minor caveat below...
> >
> >> diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
> >> index ae1f70450fb2..9932aca9704b 100644
> >> --- a/arch/arm64/include/asm/cpucaps.h
> >> +++ b/arch/arm64/include/asm/cpucaps.h
> >> @@ -51,7 +51,8 @@
> >>  #define ARM64_SSBD				30
> >>  #define ARM64_MISMATCHED_CACHE_TYPE		31
> >>  #define ARM64_HAS_STAGE2_FWB			32
> >> +#define ARM64_HAS_CRC32				33
> >>
> >> -#define ARM64_NCAPS				33
> >> +#define ARM64_NCAPS				34
> >
> >
> > ... if this goes via crypto, you'll almost certainly get a (trivial)
> > conflict with arm64, since these numbers get bumped all the time.
> >
>
> I think the first three patches should go through the arm64 tree. The
> last one just removes the now redundant crc32 SIMD driver, and Herbert
> could pick that up separately, i.e., it should be totally independent.

Yes let's do that.

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
* Re: [PATCH 2/4] arm64: cpufeature: add feature for CRC32 instructions
From: Will Deacon @ 2018-09-04 9:38 UTC
To: Herbert Xu
Cc: Ard Biesheuvel, linux-arm-kernel, open list:HARDWARE RANDOM NUMBER GENERATOR CORE, Catalin Marinas, Eric Biggers, Suzuki K. Poulose, Linux Kernel Mailing List

On Tue, Sep 04, 2018 at 11:18:55AM +0800, Herbert Xu wrote:
> On Tue, Aug 28, 2018 at 08:43:35PM +0200, Ard Biesheuvel wrote:
> > On 28 August 2018 at 19:01, Will Deacon <will.deacon@arm.com> wrote:
> > > On Mon, Aug 27, 2018 at 01:02:43PM +0200, Ard Biesheuvel wrote:
> > >> Add a CRC32 feature bit and wire it up to the CPU id register so we
> > >> will be able to use alternatives patching for CRC32 operations.
> > >>
> > >> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> > >> ---
> > >>  arch/arm64/include/asm/cpucaps.h | 3 ++-
> > >>  arch/arm64/kernel/cpufeature.c   | 9 +++++++++
> > >>  2 files changed, 11 insertions(+), 1 deletion(-)
> > >
> > > Acked-by: Will Deacon <will.deacon@arm.com>
> > >
> > > With the minor caveat below...
> > >
> > >> diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
> > >> index ae1f70450fb2..9932aca9704b 100644
> > >> --- a/arch/arm64/include/asm/cpucaps.h
> > >> +++ b/arch/arm64/include/asm/cpucaps.h
> > >> @@ -51,7 +51,8 @@
> > >>  #define ARM64_SSBD				30
> > >>  #define ARM64_MISMATCHED_CACHE_TYPE		31
> > >>  #define ARM64_HAS_STAGE2_FWB			32
> > >> +#define ARM64_HAS_CRC32				33
> > >>
> > >> -#define ARM64_NCAPS				33
> > >> +#define ARM64_NCAPS				34
> > >
> > >
> > > ... if this goes via crypto, you'll almost certainly get a (trivial)
> > > conflict with arm64, since these numbers get bumped all the time.
> > >
> >
> > I think the first three patches should go through the arm64 tree. The
> > last one just removes the now redundant crc32 SIMD driver, and Herbert
> > could pick that up separately, i.e., it should be totally independent.
>
> Yes let's do that.

Okey doke! In which case, please can we have your Ack on the first patch?

Cheers,

Will
* Re: [PATCH 2/4] arm64: cpufeature: add feature for CRC32 instructions
From: Herbert Xu @ 2018-09-04 9:44 UTC
To: Will Deacon
Cc: Ard Biesheuvel, linux-arm-kernel, open list:HARDWARE RANDOM NUMBER GENERATOR CORE, Catalin Marinas, Eric Biggers, Suzuki K. Poulose, Linux Kernel Mailing List

On Tue, Sep 04, 2018 at 10:38:45AM +0100, Will Deacon wrote:
> On Tue, Sep 04, 2018 at 11:18:55AM +0800, Herbert Xu wrote:
> > On Tue, Aug 28, 2018 at 08:43:35PM +0200, Ard Biesheuvel wrote:
> > > On 28 August 2018 at 19:01, Will Deacon <will.deacon@arm.com> wrote:
> > > > On Mon, Aug 27, 2018 at 01:02:43PM +0200, Ard Biesheuvel wrote:
> > > >> Add a CRC32 feature bit and wire it up to the CPU id register so we
> > > >> will be able to use alternatives patching for CRC32 operations.
> > > >>
> > > >> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> > > >> ---
> > > >>  arch/arm64/include/asm/cpucaps.h | 3 ++-
> > > >>  arch/arm64/kernel/cpufeature.c   | 9 +++++++++
> > > >>  2 files changed, 11 insertions(+), 1 deletion(-)
> > > >
> > > > Acked-by: Will Deacon <will.deacon@arm.com>
> > > >
> > > > With the minor caveat below...
> > > >
> > > >> diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
> > > >> index ae1f70450fb2..9932aca9704b 100644
> > > >> --- a/arch/arm64/include/asm/cpucaps.h
> > > >> +++ b/arch/arm64/include/asm/cpucaps.h
> > > >> @@ -51,7 +51,8 @@
> > > >>  #define ARM64_SSBD				30
> > > >>  #define ARM64_MISMATCHED_CACHE_TYPE		31
> > > >>  #define ARM64_HAS_STAGE2_FWB			32
> > > >> +#define ARM64_HAS_CRC32				33
> > > >>
> > > >> -#define ARM64_NCAPS				33
> > > >> +#define ARM64_NCAPS				34
> > > >
> > > >
> > > > ... if this goes via crypto, you'll almost certainly get a (trivial)
> > > > conflict with arm64, since these numbers get bumped all the time.
> > > >
> > >
> > > I think the first three patches should go through the arm64 tree. The
> > > last one just removes the now redundant crc32 SIMD driver, and Herbert
> > > could pick that up separately, i.e., it should be totally independent.
> >
> > Yes let's do that.
>
> Okey doke! In which case, please can we have your Ack on the first patch?

Sure, I have just sent an ack for that patch.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
* Re: [PATCH 2/4] arm64: cpufeature: add feature for CRC32 instructions
From: Catalin Marinas @ 2018-09-10 15:45 UTC
To: Herbert Xu
Cc: Ard Biesheuvel, Suzuki K. Poulose, Eric Biggers, Will Deacon, Linux Kernel Mailing List, open list:HARDWARE RANDOM NUMBER GENERATOR CORE, linux-arm-kernel

On Tue, Sep 04, 2018 at 11:18:55AM +0800, Herbert Xu wrote:
> On Tue, Aug 28, 2018 at 08:43:35PM +0200, Ard Biesheuvel wrote:
> > On 28 August 2018 at 19:01, Will Deacon <will.deacon@arm.com> wrote:
> > > On Mon, Aug 27, 2018 at 01:02:43PM +0200, Ard Biesheuvel wrote:
> > >> Add a CRC32 feature bit and wire it up to the CPU id register so we
> > >> will be able to use alternatives patching for CRC32 operations.
> > >>
> > >> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> > >> ---
> > >>  arch/arm64/include/asm/cpucaps.h | 3 ++-
> > >>  arch/arm64/kernel/cpufeature.c   | 9 +++++++++
> > >>  2 files changed, 11 insertions(+), 1 deletion(-)
> > >
> > > Acked-by: Will Deacon <will.deacon@arm.com>
> > >
> > > With the minor caveat below...
> > >
> > >> diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
> > >> index ae1f70450fb2..9932aca9704b 100644
> > >> --- a/arch/arm64/include/asm/cpucaps.h
> > >> +++ b/arch/arm64/include/asm/cpucaps.h
> > >> @@ -51,7 +51,8 @@
> > >>  #define ARM64_SSBD				30
> > >>  #define ARM64_MISMATCHED_CACHE_TYPE		31
> > >>  #define ARM64_HAS_STAGE2_FWB			32
> > >> +#define ARM64_HAS_CRC32				33
> > >>
> > >> -#define ARM64_NCAPS				33
> > >> +#define ARM64_NCAPS				34
> > >
> > >
> > > ... if this goes via crypto, you'll almost certainly get a (trivial)
> > > conflict with arm64, since these numbers get bumped all the time.
> > >
> >
> > I think the first three patches should go through the arm64 tree. The
> > last one just removes the now redundant crc32 SIMD driver, and Herbert
> > could pick that up separately, i.e., it should be totally independent.
>
> Yes let's do that.

I queued the first 3 patches for 4.19.

Thanks.

-- 
Catalin
* [PATCH 3/4] arm64/lib: add accelerated crc32 routines
From: Ard Biesheuvel @ 2018-08-27 11:02 UTC
To: linux-arm-kernel, linux-crypto
Cc: will.deacon, catalin.marinas, herbert, ebiggers, suzuki.poulose, linux-kernel, Ard Biesheuvel

Unlike crc32c(), which is wired up to the crypto API internally so the
optimal driver is selected based on the platform's capabilities,
crc32_le() is implemented as a library function using a slice-by-8 table
based C implementation. Even though few of the call sites may be
bottlenecks, calling a time variant implementation with a non-negligible
D-cache footprint is a bit of a waste, given that ARMv8.1 and up mandates
support for the CRC32 instructions that were optional in ARMv8.0, but are
already widely available, even on the Cortex-A53 based Raspberry Pi.

So implement routines that use these instructions if available, and fall
back to the existing generic routines otherwise. The selection is based
on alternatives patching.

Note that this unconditionally selects CONFIG_CRC32 as a builtin. Since
CRC32 is relied upon by core functionality such as CONFIG_OF_FLATTREE,
this just codifies the status quo.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/Kconfig      |  1 +
 arch/arm64/lib/Makefile |  2 +
 arch/arm64/lib/crc32.S  | 60 ++++++++++++++++++++
 3 files changed, 63 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 29e75b47becd..0625355f12fa 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -75,6 +75,7 @@ config ARM64
 	select CLONE_BACKWARDS
 	select COMMON_CLK
 	select CPU_PM if (SUSPEND || CPU_IDLE)
+	select CRC32
 	select DCACHE_WORD_ACCESS
 	select DMA_DIRECT_OPS
 	select EDAC_SUPPORT
diff --git a/arch/arm64/lib/Makefile b/arch/arm64/lib/Makefile
index 68755fd70dcf..f28f91fd96a2 100644
--- a/arch/arm64/lib/Makefile
+++ b/arch/arm64/lib/Makefile
@@ -25,3 +25,5 @@ KCOV_INSTRUMENT_atomic_ll_sc.o	:= n
 UBSAN_SANITIZE_atomic_ll_sc.o	:= n
 
 lib-$(CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE) += uaccess_flushcache.o
+
+obj-$(CONFIG_CRC32) += crc32.o
diff --git a/arch/arm64/lib/crc32.S b/arch/arm64/lib/crc32.S
new file mode 100644
index 000000000000..5bc1e85b4e1c
--- /dev/null
+++ b/arch/arm64/lib/crc32.S
@@ -0,0 +1,60 @@
+/*
+ * Accelerated CRC32(C) using AArch64 CRC instructions
+ *
+ * Copyright (C) 2016 - 2018 Linaro Ltd <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/linkage.h>
+#include <asm/alternative.h>
+#include <asm/assembler.h>
+
+	.cpu		generic+crc
+
+	.macro		__crc32, c
+0:	subs		x2, x2, #16
+	b.mi		8f
+	ldp		x3, x4, [x1], #16
+CPU_BE(	rev		x3, x3		)
+CPU_BE(	rev		x4, x4		)
+	crc32\c\()x	w0, w0, x3
+	crc32\c\()x	w0, w0, x4
+	b.ne		0b
+	ret
+
+8:	tbz		x2, #3, 4f
+	ldr		x3, [x1], #8
+CPU_BE(	rev		x3, x3		)
+	crc32\c\()x	w0, w0, x3
+4:	tbz		x2, #2, 2f
+	ldr		w3, [x1], #4
+CPU_BE(	rev		w3, w3		)
+	crc32\c\()w	w0, w0, w3
+2:	tbz		x2, #1, 1f
+	ldrh		w3, [x1], #2
+CPU_BE(	rev16		w3, w3		)
+	crc32\c\()h	w0, w0, w3
+1:	tbz		x2, #0, 0f
+	ldrb		w3, [x1]
+	crc32\c\()b	w0, w0, w3
+0:	ret
+	.endm
+
+	.align		5
+ENTRY(crc32_le)
+alternative_if_not ARM64_HAS_CRC32
+	b		crc32_le_base
+alternative_else_nop_endif
+	__crc32
+ENDPROC(crc32_le)
+
+	.align		5
+ENTRY(__crc32c_le)
+alternative_if_not ARM64_HAS_CRC32
+	b		__crc32c_le_base
+alternative_else_nop_endif
+	__crc32		c
+ENDPROC(__crc32c_le)
-- 
2.18.0
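The control flow of the `__crc32` macro may be easier to follow in C. The sketch below is a portable model of it under stated assumptions, not a transcription: the main loop consumes 16 bytes per iteration as two 8-byte steps (the `ldp` plus two `crc32x`), and the tail uses bits 3..0 of the remaining length to pick one 8-, 4-, 2- and 1-byte step each (the `tbz` chain). The `crc32b/h/w/x` instructions are replaced by a software stand-in, and the names `crc32_step`, `load_le` and `crc32_le_model` are hypothetical. Data is always assembled in little-endian order, which corresponds to what the `CPU_BE(rev ...)` lines arrange on big-endian kernels.

```c
#include <stddef.h>
#include <stdint.h>

#define CRC32_POLY_LE 0xedb88320u

/*
 * Software stand-in for crc32b/h/w/x: fold the low nbytes bytes of
 * data into crc, least significant byte first.
 */
static uint32_t crc32_step(uint32_t crc, uint64_t data, unsigned int nbytes)
{
	for (unsigned int i = 0; i < nbytes; i++) {
		crc ^= (uint8_t)(data >> (8 * i));
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ ((crc & 1) ? CRC32_POLY_LE : 0);
	}
	return crc;
}

/* Assemble nbytes (1..8) bytes from memory as a little-endian value. */
static uint64_t load_le(const unsigned char *p, unsigned int nbytes)
{
	uint64_t v = 0;

	for (unsigned int i = 0; i < nbytes; i++)
		v |= (uint64_t)p[i] << (8 * i);
	return v;
}

/* Hypothetical C model of the macro's loop and tail handling. */
uint32_t crc32_le_model(uint32_t crc, const unsigned char *p, size_t len)
{
	/* Main loop: 16 bytes per iteration, as two 8-byte steps. */
	while (len >= 16) {
		crc = crc32_step(crc, load_le(p, 8), 8);
		crc = crc32_step(crc, load_le(p + 8, 8), 8);
		p += 16;
		len -= 16;
	}
	/* Tail: one step each for the 8, 4, 2 and 1 bits of len. */
	for (unsigned int n = 8; n >= 1; n >>= 1) {
		if (len & n) {
			crc = crc32_step(crc, load_le(p, n), n);
			p += n;
		}
	}
	return crc;
}
```

Because the tail is driven by the low four bits of the length, at most four further steps follow the main loop regardless of input size. As with `crc32_le()`, the seed and any final inversion are left to the caller, so `crc32_le_model(~0u, buf, len) ^ ~0u` gives the conventional CRC-32.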
* [PATCH 4/4] crypto: arm64/crc32 - remove PMULL based CRC32 driver 2018-08-27 11:02 [PATCH 0/4] arm64: wire CRC32 instructions into core crc32 routines Ard Biesheuvel ` (2 preceding siblings ...) 2018-08-27 11:02 ` [PATCH 3/4] arm64/lib: add accelerated crc32 routines Ard Biesheuvel @ 2018-08-27 11:02 ` Ard Biesheuvel 2018-09-04 5:21 ` Herbert Xu 2018-08-27 14:53 ` [PATCH 0/4] arm64: wire CRC32 instructions into core crc32 routines Theodore Y. Ts'o 4 siblings, 1 reply; 15+ messages in thread From: Ard Biesheuvel @ 2018-08-27 11:02 UTC (permalink / raw) To: linux-arm-kernel, linux-crypto Cc: will.deacon, catalin.marinas, herbert, ebiggers, suzuki.poulose, linux-kernel, Ard Biesheuvel Now that the scalar fallbacks have been moved out of this driver into the core crc32()/crc32c() routines, we are left with a CRC32 crypto API driver for arm64 that is based only on 64x64 polynomial multiplication, which is an optional instruction in the ARMv8 architecture, and is less and less likely to be available on cores that do not also implement the CRC32 instructions, given that those are mandatory in the architecture as of ARMv8.1. Since the scalar instructions do not require the special handling that SIMD instructions do, and since they turn out to be considerably faster on some cores (Cortex-A53) as well, there is really no point in keeping this code around so let's just remove it. 
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> --- arch/arm64/configs/defconfig | 1 - arch/arm64/crypto/Kconfig | 5 - arch/arm64/crypto/Makefile | 3 - arch/arm64/crypto/crc32-ce-core.S | 287 -------------------- arch/arm64/crypto/crc32-ce-glue.c | 244 ----------------- 5 files changed, 540 deletions(-) diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig index f67e8d5e93ad..323da306e9f4 100644 --- a/arch/arm64/configs/defconfig +++ b/arch/arm64/configs/defconfig @@ -703,7 +703,6 @@ CONFIG_CRYPTO_SHA3_ARM64=m CONFIG_CRYPTO_SM3_ARM64_CE=m CONFIG_CRYPTO_GHASH_ARM64_CE=y CONFIG_CRYPTO_CRCT10DIF_ARM64_CE=m -CONFIG_CRYPTO_CRC32_ARM64_CE=m CONFIG_CRYPTO_AES_ARM64_CE_CCM=y CONFIG_CRYPTO_AES_ARM64_CE_BLK=y CONFIG_CRYPTO_CHACHA20_NEON=m diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig index e3fdb0fd6f70..63dc00423ca0 100644 --- a/arch/arm64/crypto/Kconfig +++ b/arch/arm64/crypto/Kconfig @@ -66,11 +66,6 @@ config CRYPTO_CRCT10DIF_ARM64_CE depends on KERNEL_MODE_NEON && CRC_T10DIF select CRYPTO_HASH -config CRYPTO_CRC32_ARM64_CE - tristate "CRC32 and CRC32C digest algorithms using ARMv8 extensions" - depends on CRC32 - select CRYPTO_HASH - config CRYPTO_AES_ARM64 tristate "AES core cipher using scalar instructions" select CRYPTO_AES diff --git a/arch/arm64/crypto/Makefile b/arch/arm64/crypto/Makefile index bcafd016618e..776357a3be35 100644 --- a/arch/arm64/crypto/Makefile +++ b/arch/arm64/crypto/Makefile @@ -32,9 +32,6 @@ ghash-ce-y := ghash-ce-glue.o ghash-ce-core.o obj-$(CONFIG_CRYPTO_CRCT10DIF_ARM64_CE) += crct10dif-ce.o crct10dif-ce-y := crct10dif-ce-core.o crct10dif-ce-glue.o -obj-$(CONFIG_CRYPTO_CRC32_ARM64_CE) += crc32-ce.o -crc32-ce-y:= crc32-ce-core.o crc32-ce-glue.o - obj-$(CONFIG_CRYPTO_AES_ARM64_CE) += aes-ce-cipher.o aes-ce-cipher-y := aes-ce-core.o aes-ce-glue.o diff --git a/arch/arm64/crypto/crc32-ce-core.S b/arch/arm64/crypto/crc32-ce-core.S deleted file mode 100644 index 8061bf0f9c66..000000000000 --- 
a/arch/arm64/crypto/crc32-ce-core.S +++ /dev/null @@ -1,287 +0,0 @@ -/* - * Accelerated CRC32(C) using arm64 CRC, NEON and Crypto Extensions instructions - * - * Copyright (C) 2016 Linaro Ltd <ard.biesheuvel@linaro.org> - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License version 2 as - * published by the Free Software Foundation. - */ - -/* GPL HEADER START - * - * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License version 2 only, - * as published by the Free Software Foundation. - * - * This program is distributed in the hope that it will be useful, but - * WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - * General Public License version 2 for more details (a copy is included - * in the LICENSE file that accompanied this code). - * - * You should have received a copy of the GNU General Public License - * version 2 along with this program; If not, see http://www.gnu.org/licenses - * - * Please visit http://www.xyratex.com/contact if you need additional - * information or have any questions. - * - * GPL HEADER END - */ - -/* - * Copyright 2012 Xyratex Technology Limited - * - * Using hardware provided PCLMULQDQ instruction to accelerate the CRC32 - * calculation. 
- * CRC32 polynomial:0x04c11db7(BE)/0xEDB88320(LE) - * PCLMULQDQ is a new instruction in Intel SSE4.2, the reference can be found - * at: - * http://www.intel.com/products/processor/manuals/ - * Intel(R) 64 and IA-32 Architectures Software Developer's Manual - * Volume 2B: Instruction Set Reference, N-Z - * - * Authors: Gregory Prestas <Gregory_Prestas@us.xyratex.com> - * Alexander Boyko <Alexander_Boyko@xyratex.com> - */ - -#include <linux/linkage.h> -#include <asm/assembler.h> - - .section ".rodata", "a" - .align 6 - .cpu generic+crypto+crc - -.Lcrc32_constants: - /* - * [x4*128+32 mod P(x) << 32)]' << 1 = 0x154442bd4 - * #define CONSTANT_R1 0x154442bd4LL - * - * [(x4*128-32 mod P(x) << 32)]' << 1 = 0x1c6e41596 - * #define CONSTANT_R2 0x1c6e41596LL - */ - .octa 0x00000001c6e415960000000154442bd4 - - /* - * [(x128+32 mod P(x) << 32)]' << 1 = 0x1751997d0 - * #define CONSTANT_R3 0x1751997d0LL - * - * [(x128-32 mod P(x) << 32)]' << 1 = 0x0ccaa009e - * #define CONSTANT_R4 0x0ccaa009eLL - */ - .octa 0x00000000ccaa009e00000001751997d0 - - /* - * [(x64 mod P(x) << 32)]' << 1 = 0x163cd6124 - * #define CONSTANT_R5 0x163cd6124LL - */ - .quad 0x0000000163cd6124 - .quad 0x00000000FFFFFFFF - - /* - * #define CRCPOLY_TRUE_LE_FULL 0x1DB710641LL - * - * Barrett Reduction constant (u64`) = u` = (x**64 / P(x))` - * = 0x1F7011641LL - * #define CONSTANT_RU 0x1F7011641LL - */ - .octa 0x00000001F701164100000001DB710641 - -.Lcrc32c_constants: - .octa 0x000000009e4addf800000000740eef02 - .octa 0x000000014cd00bd600000000f20c0dfe - .quad 0x00000000dd45aab8 - .quad 0x00000000FFFFFFFF - .octa 0x00000000dea713f10000000105ec76f0 - - vCONSTANT .req v0 - dCONSTANT .req d0 - qCONSTANT .req q0 - - BUF .req x19 - LEN .req x20 - CRC .req x21 - CONST .req x22 - - vzr .req v9 - - /** - * Calculate crc32 - * BUF - buffer - * LEN - sizeof buffer (multiple of 16 bytes), LEN should be > 63 - * CRC - initial crc32 - * return %eax crc32 - * uint crc32_pmull_le(unsigned char const *buffer, - * size_t len, 
uint crc32) - */ - .text -ENTRY(crc32_pmull_le) - adr_l x3, .Lcrc32_constants - b 0f - -ENTRY(crc32c_pmull_le) - adr_l x3, .Lcrc32c_constants - -0: frame_push 4, 64 - - mov BUF, x0 - mov LEN, x1 - mov CRC, x2 - mov CONST, x3 - - bic LEN, LEN, #15 - ld1 {v1.16b-v4.16b}, [BUF], #0x40 - movi vzr.16b, #0 - fmov dCONSTANT, CRC - eor v1.16b, v1.16b, vCONSTANT.16b - sub LEN, LEN, #0x40 - cmp LEN, #0x40 - b.lt less_64 - - ldr qCONSTANT, [CONST] - -loop_64: /* 64 bytes Full cache line folding */ - sub LEN, LEN, #0x40 - - pmull2 v5.1q, v1.2d, vCONSTANT.2d - pmull2 v6.1q, v2.2d, vCONSTANT.2d - pmull2 v7.1q, v3.2d, vCONSTANT.2d - pmull2 v8.1q, v4.2d, vCONSTANT.2d - - pmull v1.1q, v1.1d, vCONSTANT.1d - pmull v2.1q, v2.1d, vCONSTANT.1d - pmull v3.1q, v3.1d, vCONSTANT.1d - pmull v4.1q, v4.1d, vCONSTANT.1d - - eor v1.16b, v1.16b, v5.16b - ld1 {v5.16b}, [BUF], #0x10 - eor v2.16b, v2.16b, v6.16b - ld1 {v6.16b}, [BUF], #0x10 - eor v3.16b, v3.16b, v7.16b - ld1 {v7.16b}, [BUF], #0x10 - eor v4.16b, v4.16b, v8.16b - ld1 {v8.16b}, [BUF], #0x10 - - eor v1.16b, v1.16b, v5.16b - eor v2.16b, v2.16b, v6.16b - eor v3.16b, v3.16b, v7.16b - eor v4.16b, v4.16b, v8.16b - - cmp LEN, #0x40 - b.lt less_64 - - if_will_cond_yield_neon - stp q1, q2, [sp, #.Lframe_local_offset] - stp q3, q4, [sp, #.Lframe_local_offset + 32] - do_cond_yield_neon - ldp q1, q2, [sp, #.Lframe_local_offset] - ldp q3, q4, [sp, #.Lframe_local_offset + 32] - ldr qCONSTANT, [CONST] - movi vzr.16b, #0 - endif_yield_neon - b loop_64 - -less_64: /* Folding cache line into 128bit */ - ldr qCONSTANT, [CONST, #16] - - pmull2 v5.1q, v1.2d, vCONSTANT.2d - pmull v1.1q, v1.1d, vCONSTANT.1d - eor v1.16b, v1.16b, v5.16b - eor v1.16b, v1.16b, v2.16b - - pmull2 v5.1q, v1.2d, vCONSTANT.2d - pmull v1.1q, v1.1d, vCONSTANT.1d - eor v1.16b, v1.16b, v5.16b - eor v1.16b, v1.16b, v3.16b - - pmull2 v5.1q, v1.2d, vCONSTANT.2d - pmull v1.1q, v1.1d, vCONSTANT.1d - eor v1.16b, v1.16b, v5.16b - eor v1.16b, v1.16b, v4.16b - - cbz LEN, fold_64 - -loop_16: /* 
Folding rest buffer into 128bit */ - subs LEN, LEN, #0x10 - - ld1 {v2.16b}, [BUF], #0x10 - pmull2 v5.1q, v1.2d, vCONSTANT.2d - pmull v1.1q, v1.1d, vCONSTANT.1d - eor v1.16b, v1.16b, v5.16b - eor v1.16b, v1.16b, v2.16b - - b.ne loop_16 - -fold_64: - /* perform the last 64 bit fold, also adds 32 zeroes - * to the input stream */ - ext v2.16b, v1.16b, v1.16b, #8 - pmull2 v2.1q, v2.2d, vCONSTANT.2d - ext v1.16b, v1.16b, vzr.16b, #8 - eor v1.16b, v1.16b, v2.16b - - /* final 32-bit fold */ - ldr dCONSTANT, [CONST, #32] - ldr d3, [CONST, #40] - - ext v2.16b, v1.16b, vzr.16b, #4 - and v1.16b, v1.16b, v3.16b - pmull v1.1q, v1.1d, vCONSTANT.1d - eor v1.16b, v1.16b, v2.16b - - /* Finish up with the bit-reversed barrett reduction 64 ==> 32 bits */ - ldr qCONSTANT, [CONST, #48] - - and v2.16b, v1.16b, v3.16b - ext v2.16b, vzr.16b, v2.16b, #8 - pmull2 v2.1q, v2.2d, vCONSTANT.2d - and v2.16b, v2.16b, v3.16b - pmull v2.1q, v2.1d, vCONSTANT.1d - eor v1.16b, v1.16b, v2.16b - mov w0, v1.s[1] - - frame_pop - ret -ENDPROC(crc32_pmull_le) -ENDPROC(crc32c_pmull_le) - - .macro __crc32, c -0: subs x2, x2, #16 - b.mi 8f - ldp x3, x4, [x1], #16 -CPU_BE( rev x3, x3 ) -CPU_BE( rev x4, x4 ) - crc32\c\()x w0, w0, x3 - crc32\c\()x w0, w0, x4 - b.ne 0b - ret - -8: tbz x2, #3, 4f - ldr x3, [x1], #8 -CPU_BE( rev x3, x3 ) - crc32\c\()x w0, w0, x3 -4: tbz x2, #2, 2f - ldr w3, [x1], #4 -CPU_BE( rev w3, w3 ) - crc32\c\()w w0, w0, w3 -2: tbz x2, #1, 1f - ldrh w3, [x1], #2 -CPU_BE( rev16 w3, w3 ) - crc32\c\()h w0, w0, w3 -1: tbz x2, #0, 0f - ldrb w3, [x1] - crc32\c\()b w0, w0, w3 -0: ret - .endm - - .align 5 -ENTRY(crc32_armv8_le) - __crc32 -ENDPROC(crc32_armv8_le) - - .align 5 -ENTRY(crc32c_armv8_le) - __crc32 c -ENDPROC(crc32c_armv8_le) diff --git a/arch/arm64/crypto/crc32-ce-glue.c b/arch/arm64/crypto/crc32-ce-glue.c deleted file mode 100644 index 34b4e3d46aab..000000000000 --- a/arch/arm64/crypto/crc32-ce-glue.c +++ /dev/null @@ -1,244 +0,0 @@ -/* - * Accelerated CRC32(C) using arm64 NEON and Crypto 
Extensions instructions - * - * Copyright (C) 2016 - 2017 Linaro Ltd <ard.biesheuvel@linaro.org> - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License version 2 as - * published by the Free Software Foundation. - */ - -#include <linux/cpufeature.h> -#include <linux/crc32.h> -#include <linux/init.h> -#include <linux/kernel.h> -#include <linux/module.h> -#include <linux/string.h> - -#include <crypto/internal/hash.h> - -#include <asm/hwcap.h> -#include <asm/neon.h> -#include <asm/simd.h> -#include <asm/unaligned.h> - -#define PMULL_MIN_LEN 64L /* minimum size of buffer - * for crc32_pmull_le_16 */ -#define SCALE_F 16L /* size of NEON register */ - -asmlinkage u32 crc32_pmull_le(const u8 buf[], u64 len, u32 init_crc); -asmlinkage u32 crc32_armv8_le(u32 init_crc, const u8 buf[], size_t len); - -asmlinkage u32 crc32c_pmull_le(const u8 buf[], u64 len, u32 init_crc); -asmlinkage u32 crc32c_armv8_le(u32 init_crc, const u8 buf[], size_t len); - -static u32 (*fallback_crc32)(u32 init_crc, const u8 buf[], size_t len); -static u32 (*fallback_crc32c)(u32 init_crc, const u8 buf[], size_t len); - -static int crc32_pmull_cra_init(struct crypto_tfm *tfm) -{ - u32 *key = crypto_tfm_ctx(tfm); - - *key = 0; - return 0; -} - -static int crc32c_pmull_cra_init(struct crypto_tfm *tfm) -{ - u32 *key = crypto_tfm_ctx(tfm); - - *key = ~0; - return 0; -} - -static int crc32_pmull_setkey(struct crypto_shash *hash, const u8 *key, - unsigned int keylen) -{ - u32 *mctx = crypto_shash_ctx(hash); - - if (keylen != sizeof(u32)) { - crypto_shash_set_flags(hash, CRYPTO_TFM_RES_BAD_KEY_LEN); - return -EINVAL; - } - *mctx = le32_to_cpup((__le32 *)key); - return 0; -} - -static int crc32_pmull_init(struct shash_desc *desc) -{ - u32 *mctx = crypto_shash_ctx(desc->tfm); - u32 *crc = shash_desc_ctx(desc); - - *crc = *mctx; - return 0; -} - -static int crc32_update(struct shash_desc *desc, const u8 *data, - unsigned int length) -{ 
- u32 *crc = shash_desc_ctx(desc); - - *crc = crc32_armv8_le(*crc, data, length); - return 0; -} - -static int crc32c_update(struct shash_desc *desc, const u8 *data, - unsigned int length) -{ - u32 *crc = shash_desc_ctx(desc); - - *crc = crc32c_armv8_le(*crc, data, length); - return 0; -} - -static int crc32_pmull_update(struct shash_desc *desc, const u8 *data, - unsigned int length) -{ - u32 *crc = shash_desc_ctx(desc); - unsigned int l; - - if ((u64)data % SCALE_F) { - l = min_t(u32, length, SCALE_F - ((u64)data % SCALE_F)); - - *crc = fallback_crc32(*crc, data, l); - - data += l; - length -= l; - } - - if (length >= PMULL_MIN_LEN && may_use_simd()) { - l = round_down(length, SCALE_F); - - kernel_neon_begin(); - *crc = crc32_pmull_le(data, l, *crc); - kernel_neon_end(); - - data += l; - length -= l; - } - - if (length > 0) - *crc = fallback_crc32(*crc, data, length); - - return 0; -} - -static int crc32c_pmull_update(struct shash_desc *desc, const u8 *data, - unsigned int length) -{ - u32 *crc = shash_desc_ctx(desc); - unsigned int l; - - if ((u64)data % SCALE_F) { - l = min_t(u32, length, SCALE_F - ((u64)data % SCALE_F)); - - *crc = fallback_crc32c(*crc, data, l); - - data += l; - length -= l; - } - - if (length >= PMULL_MIN_LEN && may_use_simd()) { - l = round_down(length, SCALE_F); - - kernel_neon_begin(); - *crc = crc32c_pmull_le(data, l, *crc); - kernel_neon_end(); - - data += l; - length -= l; - } - - if (length > 0) { - *crc = fallback_crc32c(*crc, data, length); - } - - return 0; -} - -static int crc32_pmull_final(struct shash_desc *desc, u8 *out) -{ - u32 *crc = shash_desc_ctx(desc); - - put_unaligned_le32(*crc, out); - return 0; -} - -static int crc32c_pmull_final(struct shash_desc *desc, u8 *out) -{ - u32 *crc = shash_desc_ctx(desc); - - put_unaligned_le32(~*crc, out); - return 0; -} - -static struct shash_alg crc32_pmull_algs[] = { { - .setkey = crc32_pmull_setkey, - .init = crc32_pmull_init, - .update = crc32_update, - .final = crc32_pmull_final, - 
.descsize = sizeof(u32),
-	.digestsize = sizeof(u32),
-
-	.base.cra_ctxsize = sizeof(u32),
-	.base.cra_init = crc32_pmull_cra_init,
-	.base.cra_name = "crc32",
-	.base.cra_driver_name = "crc32-arm64-ce",
-	.base.cra_priority = 200,
-	.base.cra_flags = CRYPTO_ALG_OPTIONAL_KEY,
-	.base.cra_blocksize = 1,
-	.base.cra_module = THIS_MODULE,
-}, {
-	.setkey = crc32_pmull_setkey,
-	.init = crc32_pmull_init,
-	.update = crc32c_update,
-	.final = crc32c_pmull_final,
-	.descsize = sizeof(u32),
-	.digestsize = sizeof(u32),
-
-	.base.cra_ctxsize = sizeof(u32),
-	.base.cra_init = crc32c_pmull_cra_init,
-	.base.cra_name = "crc32c",
-	.base.cra_driver_name = "crc32c-arm64-ce",
-	.base.cra_priority = 200,
-	.base.cra_flags = CRYPTO_ALG_OPTIONAL_KEY,
-	.base.cra_blocksize = 1,
-	.base.cra_module = THIS_MODULE,
-} };
-
-static int __init crc32_pmull_mod_init(void)
-{
-	if (IS_ENABLED(CONFIG_KERNEL_MODE_NEON) && (elf_hwcap & HWCAP_PMULL)) {
-		crc32_pmull_algs[0].update = crc32_pmull_update;
-		crc32_pmull_algs[1].update = crc32c_pmull_update;
-
-		if (elf_hwcap & HWCAP_CRC32) {
-			fallback_crc32 = crc32_armv8_le;
-			fallback_crc32c = crc32c_armv8_le;
-		} else {
-			fallback_crc32 = crc32_le;
-			fallback_crc32c = __crc32c_le;
-		}
-	} else if (!(elf_hwcap & HWCAP_CRC32)) {
-		return -ENODEV;
-	}
-	return crypto_register_shashes(crc32_pmull_algs,
-				       ARRAY_SIZE(crc32_pmull_algs));
-}
-
-static void __exit crc32_pmull_mod_exit(void)
-{
-	crypto_unregister_shashes(crc32_pmull_algs,
-				  ARRAY_SIZE(crc32_pmull_algs));
-}
-
-static const struct cpu_feature crc32_cpu_feature[] = {
-	{ cpu_feature(CRC32) }, { cpu_feature(PMULL) }, { }
-};
-MODULE_DEVICE_TABLE(cpu, crc32_cpu_feature);
-
-module_init(crc32_pmull_mod_init);
-module_exit(crc32_pmull_mod_exit);
-
-MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
-MODULE_LICENSE("GPL v2");
-- 
2.18.0
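
Note on the removed glue code: crc32_pmull_update() splits each buffer into a misaligned head, a bulk part that is a multiple of 16 bytes and at least 64 bytes long, and a tail. Only the bulk part goes to the PMULL routine; head and tail go to the scalar fallback. The following is a minimal portable sketch of that split, not the driver itself. It assumes a bit-at-a-time crc32_le_bitwise() as a stand-in for the fallback and a hypothetical bulk_crc32() in place of crc32_pmull_le(); neither name comes from the patch. The may_use_simd() and kernel_neon_begin()/kernel_neon_end() handling is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK    16u	/* size of one NEON register (SCALE_F in the driver) */
#define BULK_MIN 64u	/* smallest bulk length (PMULL_MIN_LEN in the driver) */

/*
 * Portable bit-at-a-time CRC32, reflected polynomial 0xEDB88320.
 * Like the kernel's crc32_le(), it does no pre- or post-inversion;
 * the caller supplies the initial value and applies any final XOR.
 */
static uint32_t crc32_le_bitwise(uint32_t crc, const uint8_t *buf, size_t len)
{
	assert(buf != NULL || len == 0);
	while (len--) {
		crc ^= *buf++;
		for (int i = 0; i < 8; i++)
			crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
	}
	return crc;
}

/*
 * Hypothetical stand-in for crc32_pmull_le(): same argument order and
 * the same preconditions the glue code guarantees, but computed with
 * the scalar routine so the sketch runs anywhere.
 */
static uint32_t bulk_crc32(const uint8_t *buf, size_t len, uint32_t crc)
{
	assert((uintptr_t)buf % BLOCK == 0);
	assert(len % BLOCK == 0 && len >= BULK_MIN);
	return crc32_le_bitwise(crc, buf, len);
}

/* Head / bulk / tail split, following the shape of crc32_pmull_update(). */
static uint32_t crc32_split_update(uint32_t crc, const uint8_t *data, size_t len)
{
	size_t misalign = (uintptr_t)data % BLOCK;
	size_t l;

	if (misalign) {
		/* Head: scalar code up to the next 16-byte boundary. */
		l = BLOCK - misalign;
		if (l > len)
			l = len;
		crc = crc32_le_bitwise(crc, data, l);
		data += l;
		len -= l;
	}

	if (len >= BULK_MIN) {
		/* Bulk: largest multiple of 16 bytes that fits. */
		l = len - (len % BLOCK);
		crc = bulk_crc32(data, l, crc);
		data += l;
		len -= l;
	}

	if (len > 0)	/* Tail: whatever is left, scalar again. */
		crc = crc32_le_bitwise(crc, data, len);

	return crc;
}
```

Because a CRC update can be chained over consecutive pieces of a buffer, crc32_split_update() returns the same value as one crc32_le_bitwise() call over the whole buffer, whatever the alignment. That property is what allowed the driver to mix the PMULL and scalar paths freely.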
* Re: [PATCH 4/4] crypto: arm64/crc32 - remove PMULL based CRC32 driver
  2018-08-27 11:02 ` [PATCH 4/4] crypto: arm64/crc32 - remove PMULL based CRC32 driver Ard Biesheuvel
@ 2018-09-04  5:21 ` Herbert Xu
From: Herbert Xu @ 2018-09-04  5:21 UTC
To: Ard Biesheuvel
Cc: linux-arm-kernel, linux-crypto, will.deacon, catalin.marinas, ebiggers, suzuki.poulose, linux-kernel

On Mon, Aug 27, 2018 at 01:02:45PM +0200, Ard Biesheuvel wrote:
> Now that the scalar fallbacks have been moved out of this driver into
> the core crc32()/crc32c() routines, we are left with a CRC32 crypto API
> driver for arm64 that is based only on 64x64 polynomial multiplication,
> which is an optional instruction in the ARMv8 architecture, and is less
> and less likely to be available on cores that do not also implement the
> CRC32 instructions, given that those are mandatory in the architecture
> as of ARMv8.1.
>
> Since the scalar instructions do not require the special handling that
> SIMD instructions do, and since they turn out to be considerably faster
> on some cores (Cortex-A53) as well, there is really no point in keeping
> this code around so let's just remove it.
>
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

Patch applied.  Thanks.
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
* Re: [PATCH 0/4] arm64: wire CRC32 instructions into core crc32 routines
  2018-08-27 11:02 [PATCH 0/4] arm64: wire CRC32 instructions into core crc32 routines Ard Biesheuvel
  ` (3 preceding siblings ...)
  2018-08-27 11:02 ` [PATCH 4/4] crypto: arm64/crc32 - remove PMULL based CRC32 driver Ard Biesheuvel
@ 2018-08-27 14:53 ` Theodore Y. Ts'o
  2018-08-27 15:18 ` Ard Biesheuvel
From: Theodore Y. Ts'o @ 2018-08-27 14:53 UTC
To: Ard Biesheuvel
Cc: linux-arm-kernel, linux-crypto, will.deacon, catalin.marinas, herbert, ebiggers, suzuki.poulose, linux-kernel

On Mon, Aug 27, 2018 at 01:02:41PM +0200, Ard Biesheuvel wrote:
> While this is not known to cause performance issues, calling a table based
> time variant implementation with a non-negligible D-cache footprint (8 KB)
> is wasteful in any case, and now that the crc32 instructions have been made
> mandatory in the architecture, let's wire them up into the core crc routines.

Stupid question --- are there any arm64 SOC's out there which do *not*
have the crc32 instructions?  Presumably there won't be in the future,
because it's now mandatory --- but were there any in the past?

- Ted
* Re: [PATCH 0/4] arm64: wire CRC32 instructions into core crc32 routines
  2018-08-27 14:53 ` [PATCH 0/4] arm64: wire CRC32 instructions into core crc32 routines Theodore Y. Ts'o
@ 2018-08-27 15:18 ` Ard Biesheuvel
From: Ard Biesheuvel @ 2018-08-27 15:18 UTC
To: Theodore Y. Ts'o, Ard Biesheuvel, linux-arm-kernel, open list:HARDWARE RANDOM NUMBER GENERATOR CORE, Will Deacon, Catalin Marinas, Herbert Xu, Eric Biggers, Suzuki K. Poulose, Linux Kernel Mailing List

On 27 August 2018 at 16:53, Theodore Y. Ts'o <tytso@mit.edu> wrote:
> On Mon, Aug 27, 2018 at 01:02:41PM +0200, Ard Biesheuvel wrote:
>> While this is not known to cause performance issues, calling a table based
>> time variant implementation with a non-negligible D-cache footprint (8 KB)
>> is wasteful in any case, and now that the crc32 instructions have been made
>> mandatory in the architecture, let's wire them up into the core crc routines.
>
> Stupid question --- are there any arm64 SOC's out there which do *not*
> have the crc32 instructions?  Presumably there won't be in the future,
> because it's now mandatory --- but were there any in the past?
>

Yes, the APM Xgene for instance.
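
Since parts such as the APM X-Gene lack the CRC32 instructions, code cannot assume them and has to check at run time. The removed driver did so in the kernel with elf_hwcap & HWCAP_CRC32. A userspace analogue is sketched below under stated assumptions: it relies on Linux getauxval() and the arm64 HWCAP_CRC32 bit, and the helper name cpu_has_crc32() is made up for illustration. On any other platform it conservatively reports false.

```c
#include <stdbool.h>

#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>	/* getauxval(), AT_HWCAP */
#include <asm/hwcap.h>	/* HWCAP_CRC32 */
#endif

/*
 * Hypothetical helper: report whether the CPU advertises the ARMv8
 * CRC32 instructions. Only arm64 Linux is actually probed; every
 * other target returns false so callers fall back to table code.
 */
static bool cpu_has_crc32(void)
{
#if defined(__linux__) && defined(__aarch64__)
	return (getauxval(AT_HWCAP) & HWCAP_CRC32) != 0;
#else
	return false;
#endif
}
```

A caller would pick between an instruction-based routine and a generic table-based one depending on the result, much as crc32_pmull_mod_init() chose its fallback function pointers.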
Thread overview: 15+ messages (latest: 2018-09-10 15:45 UTC)
2018-08-27 11:02 [PATCH 0/4] arm64: wire CRC32 instructions into core crc32 routines Ard Biesheuvel
2018-08-27 11:02 ` [PATCH 1/4] lib/crc32: make core crc32() routines weak so they can be overridden Ard Biesheuvel
2018-09-04  9:44 ` Herbert Xu
2018-08-27 11:02 ` [PATCH 2/4] arm64: cpufeature: add feature for CRC32 instructions Ard Biesheuvel
2018-08-28 17:01 ` Will Deacon
2018-08-28 18:43 ` Ard Biesheuvel
2018-09-04  3:18 ` Herbert Xu
2018-09-04  9:38 ` Will Deacon
2018-09-04  9:44 ` Herbert Xu
2018-09-10 15:45 ` Catalin Marinas
2018-08-27 11:02 ` [PATCH 3/4] arm64/lib: add accelerated crc32 routines Ard Biesheuvel
2018-08-27 11:02 ` [PATCH 4/4] crypto: arm64/crc32 - remove PMULL based CRC32 driver Ard Biesheuvel
2018-09-04  5:21 ` Herbert Xu
2018-08-27 14:53 ` [PATCH 0/4] arm64: wire CRC32 instructions into core crc32 routines Theodore Y. Ts'o
2018-08-27 15:18 ` Ard Biesheuvel