* [PATCH v3 0/4] arm64: accelerate crc32_be
From: Kevin Bracey <kevin@bracey.fi> @ 2022-01-18 10:23 UTC
To: Herbert Xu
Cc: linux-crypto, linux-arm-kernel, Ard Biesheuvel, Will Deacon,
    Catalin Marinas, Kevin Bracey

Originally sent only to the arm-linux list - now including linux-crypto.
Ard suggested that Herbert take the series.

This series completes the arm64 crc32 helper acceleration by adding
crc32_be. There are plenty of users, for example OF.

To compensate for the extra supporting cruft in lib/crc32.c, a couple of
minor tidies.

Changes since v2:
- no code change, but sent to Herbert+crypto with Catalin's ack for arm64

Changes since v1:
- assembler style fixes from Ard's review

Kevin Bracey (4):
  lib/crc32.c: remove unneeded casts
  lib/crc32.c: Make crc32_be weak for arch override
  lib/crc32test.c: correct printed bytes count
  arm64: accelerate crc32_be

 arch/arm64/lib/crc32.S | 87 +++++++++++++++++++++++++++++++++++-------
 lib/crc32.c            | 14 +++----
 lib/crc32test.c        |  2 +-
 3 files changed, 80 insertions(+), 23 deletions(-)

-- 
2.25.1

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
* [PATCH v3 1/4] lib/crc32.c: remove unneeded casts
From: Kevin Bracey <kevin@bracey.fi> @ 2022-01-18 10:23 UTC
To: Herbert Xu
Cc: linux-crypto, linux-arm-kernel, Ard Biesheuvel, Will Deacon,
    Catalin Marinas, Kevin Bracey

Casts were added in commit 8f243af42ade ("sections: fix const sections
for crc32 table") to cope with the tables not being const. They are no
longer required since commit f5e38b9284e1 ("lib: crc32: constify crc32
lookup table").

Signed-off-by: Kevin Bracey <kevin@bracey.fi>
---
 lib/crc32.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/lib/crc32.c b/lib/crc32.c
index 2a68dfd3b96c..7f062a2639df 100644
--- a/lib/crc32.c
+++ b/lib/crc32.c
@@ -194,13 +194,11 @@ u32 __pure __weak __crc32c_le(u32 crc, unsigned char const *p, size_t len)
 #else
 u32 __pure __weak crc32_le(u32 crc, unsigned char const *p, size_t len)
 {
-	return crc32_le_generic(crc, p, len,
-			(const u32 (*)[256])crc32table_le, CRC32_POLY_LE);
+	return crc32_le_generic(crc, p, len, crc32table_le, CRC32_POLY_LE);
 }
 
 u32 __pure __weak __crc32c_le(u32 crc, unsigned char const *p, size_t len)
 {
-	return crc32_le_generic(crc, p, len,
-			(const u32 (*)[256])crc32ctable_le, CRC32C_POLY_LE);
+	return crc32_le_generic(crc, p, len, crc32ctable_le, CRC32C_POLY_LE);
 }
 #endif
 EXPORT_SYMBOL(crc32_le);
@@ -339,8 +337,7 @@ u32 __pure crc32_be(u32 crc, unsigned char const *p, size_t len)
 #else
 u32 __pure crc32_be(u32 crc, unsigned char const *p, size_t len)
 {
-	return crc32_be_generic(crc, p, len,
-			(const u32 (*)[256])crc32table_be, CRC32_POLY_BE);
+	return crc32_be_generic(crc, p, len, crc32table_be, CRC32_POLY_BE);
 }
 #endif
 EXPORT_SYMBOL(crc32_be);
-- 
2.25.1
* [PATCH v3 2/4] lib/crc32.c: Make crc32_be weak for arch override
From: Kevin Bracey <kevin@bracey.fi> @ 2022-01-18 10:23 UTC
To: Herbert Xu
Cc: linux-crypto, linux-arm-kernel, Ard Biesheuvel, Will Deacon,
    Catalin Marinas, Kevin Bracey

crc32_le and __crc32c_le can be overridden - extend this to crc32_be.

Signed-off-by: Kevin Bracey <kevin@bracey.fi>
---
 lib/crc32.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/lib/crc32.c b/lib/crc32.c
index 7f062a2639df..5649847d0a8d 100644
--- a/lib/crc32.c
+++ b/lib/crc32.c
@@ -206,6 +206,7 @@ EXPORT_SYMBOL(__crc32c_le);
 
 u32 __pure crc32_le_base(u32, unsigned char const *, size_t) __alias(crc32_le);
 u32 __pure __crc32c_le_base(u32, unsigned char const *, size_t) __alias(__crc32c_le);
+u32 __pure crc32_be_base(u32, unsigned char const *, size_t) __alias(crc32_be);
 
 /*
  * This multiplies the polynomials x and y modulo the given modulus.
@@ -330,12 +331,12 @@ static inline u32 __pure crc32_be_generic(u32 crc, unsigned char const *p,
 }
 
 #if CRC_BE_BITS == 1
-u32 __pure crc32_be(u32 crc, unsigned char const *p, size_t len)
+u32 __pure __weak crc32_be(u32 crc, unsigned char const *p, size_t len)
 {
 	return crc32_be_generic(crc, p, len, NULL, CRC32_POLY_BE);
 }
 #else
-u32 __pure crc32_be(u32 crc, unsigned char const *p, size_t len)
+u32 __pure __weak crc32_be(u32 crc, unsigned char const *p, size_t len)
 {
 	return crc32_be_generic(crc, p, len, crc32table_be, CRC32_POLY_BE);
 }
-- 
2.25.1
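[Editor's note: the override mechanism above relies on ELF weak symbols: the generic lib/crc32.c definition is marked __weak so that an arch-specific strong definition wins at link time, while __alias gives the generic code a stable second name (crc32_be_base) that the arch code can branch to when the CPU lacks CRC instructions. A minimal single-file sketch of the pattern, using plain GCC/Clang attributes and a bitwise CRC body rather than the kernel's table-driven code:]

```c
#include <stddef.h>
#include <stdint.h>

#define CRC32_POLY_BE 0x04C11DB7u

/*
 * Generic fallback. __weak means a strong definition of crc32_be in
 * another object file (e.g. arch/arm64/lib/crc32.S) replaces this one
 * at link time without any source changes here.
 */
uint32_t __attribute__((weak))
crc32_be(uint32_t crc, const unsigned char *p, size_t len)
{
	while (len--) {
		crc ^= (uint32_t)*p++ << 24;
		for (int i = 0; i < 8; i++)
			crc = (crc & 0x80000000u)
				? (crc << 1) ^ CRC32_POLY_BE
				: crc << 1;
	}
	return crc;
}

/*
 * Second, non-weak name for the generic version, so an arch override
 * of crc32_be can still reach the generic code as its fallback path.
 */
uint32_t crc32_be_base(uint32_t, const unsigned char *, size_t)
	__attribute__((alias("crc32_be")));
```

With no strong override linked in, both names resolve to the generic body, which is exactly the fallback the arm64 patch branches to when ARM64_HAS_CRC32 is absent.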
* [PATCH v3 3/4] lib/crc32test.c: correct printed bytes count
From: Kevin Bracey <kevin@bracey.fi> @ 2022-01-18 10:23 UTC
To: Herbert Xu
Cc: linux-crypto, linux-arm-kernel, Ard Biesheuvel, Will Deacon,
    Catalin Marinas, Kevin Bracey

The crc32c_le self test had a stray multiply by two inherited from the
crc32_le+crc32_be test loop.

Signed-off-by: Kevin Bracey <kevin@bracey.fi>
---
 lib/crc32test.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/crc32test.c b/lib/crc32test.c
index 61ddce2cff77..9b4af79412c4 100644
--- a/lib/crc32test.c
+++ b/lib/crc32test.c
@@ -675,7 +675,7 @@ static int __init crc32c_test(void)
 
 	/* pre-warm the cache */
 	for (i = 0; i < 100; i++) {
-		bytes += 2*test[i].length;
+		bytes += test[i].length;
 
 		crc ^= __crc32c_le(test[i].crc, test_buf +
 		    test[i].start, test[i].length);
-- 
2.25.1
* [PATCH v3 4/4] arm64: accelerate crc32_be
From: Kevin Bracey <kevin@bracey.fi> @ 2022-01-18 10:23 UTC
To: Herbert Xu
Cc: linux-crypto, linux-arm-kernel, Ard Biesheuvel, Will Deacon,
    Catalin Marinas, Kevin Bracey

It makes no sense to leave crc32_be using the generic code while we
only accelerate the little-endian ops. Even though the big-endian form
doesn't fit as smoothly into the arm64, we can speed it up and avoid
hitting the D cache.

Tested on Cortex-A53. Without acceleration:

  crc32: CRC_LE_BITS = 64, CRC_BE BITS = 64
  crc32: self tests passed, processed 225944 bytes in 192240 nsec
  crc32c: CRC_LE_BITS = 64
  crc32c: self tests passed, processed 112972 bytes in 21360 nsec

With acceleration:

  crc32: CRC_LE_BITS = 64, CRC_BE BITS = 64
  crc32: self tests passed, processed 225944 bytes in 53480 nsec
  crc32c: CRC_LE_BITS = 64
  crc32c: self tests passed, processed 112972 bytes in 21480 nsec

Signed-off-by: Kevin Bracey <kevin@bracey.fi>
Tested-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/lib/crc32.S | 87 +++++++++++++++++++++++++++++++++++-------
 1 file changed, 73 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/lib/crc32.S b/arch/arm64/lib/crc32.S
index 0f9e10ecda23..8340dccff46f 100644
--- a/arch/arm64/lib/crc32.S
+++ b/arch/arm64/lib/crc32.S
@@ -11,7 +11,44 @@
 
 	.arch		armv8-a+crc
 
-	.macro		__crc32, c
+	.macro		byteorder, reg, be
+	.if		\be
+CPU_LE( rev		\reg, \reg	)
+	.else
+CPU_BE( rev		\reg, \reg	)
+	.endif
+	.endm
+
+	.macro		byteorder16, reg, be
+	.if		\be
+CPU_LE( rev16		\reg, \reg	)
+	.else
+CPU_BE( rev16		\reg, \reg	)
+	.endif
+	.endm
+
+	.macro		bitorder, reg, be
+	.if		\be
+	rbit		\reg, \reg
+	.endif
+	.endm
+
+	.macro		bitorder16, reg, be
+	.if		\be
+	rbit		\reg, \reg
+	lsr		\reg, \reg, #16
+	.endif
+	.endm
+
+	.macro		bitorder8, reg, be
+	.if		\be
+	rbit		\reg, \reg
+	lsr		\reg, \reg, #24
+	.endif
+	.endm
+
+	.macro		__crc32, c, be=0
+	bitorder	w0, \be
 	cmp		x2, #16
 	b.lt		8f		// less than 16 bytes
 
@@ -24,10 +61,14 @@
 	add		x8, x8, x1
 	add		x1, x1, x7
 	ldp		x5, x6, [x8]
-CPU_BE(	rev		x3, x3		)
-CPU_BE(	rev		x4, x4		)
-CPU_BE(	rev		x5, x5		)
-CPU_BE(	rev		x6, x6		)
+	byteorder	x3, \be
+	byteorder	x4, \be
+	byteorder	x5, \be
+	byteorder	x6, \be
+	bitorder	x3, \be
+	bitorder	x4, \be
+	bitorder	x5, \be
+	bitorder	x6, \be
 
 	tst		x7, #8
 	crc32\c\()x	w8, w0, x3
@@ -55,33 +96,43 @@ CPU_BE(	rev		x6, x6		)
 32:	ldp		x3, x4, [x1], #32
 	sub		x2, x2, #32
 	ldp		x5, x6, [x1, #-16]
-CPU_BE(	rev		x3, x3		)
-CPU_BE(	rev		x4, x4		)
-CPU_BE(	rev		x5, x5		)
-CPU_BE(	rev		x6, x6		)
+	byteorder	x3, \be
+	byteorder	x4, \be
+	byteorder	x5, \be
+	byteorder	x6, \be
+	bitorder	x3, \be
+	bitorder	x4, \be
+	bitorder	x5, \be
+	bitorder	x6, \be
 	crc32\c\()x	w0, w0, x3
 	crc32\c\()x	w0, w0, x4
 	crc32\c\()x	w0, w0, x5
 	crc32\c\()x	w0, w0, x6
 	cbnz		x2, 32b
-0:	ret
+0:	bitorder	w0, \be
+	ret
 
 8:	tbz		x2, #3, 4f
 	ldr		x3, [x1], #8
-CPU_BE(	rev		x3, x3		)
+	byteorder	x3, \be
+	bitorder	x3, \be
 	crc32\c\()x	w0, w0, x3
 
 4:	tbz		x2, #2, 2f
 	ldr		w3, [x1], #4
-CPU_BE(	rev		w3, w3		)
+	byteorder	w3, \be
+	bitorder	w3, \be
 	crc32\c\()w	w0, w0, w3
 
 2:	tbz		x2, #1, 1f
 	ldrh		w3, [x1], #2
-CPU_BE(	rev16		w3, w3		)
+	byteorder16	w3, \be
+	bitorder16	w3, \be
 	crc32\c\()h	w0, w0, w3
 
 1:	tbz		x2, #0, 0f
 	ldrb		w3, [x1]
+	bitorder8	w3, \be
 	crc32\c\()b	w0, w0, w3
-0:	ret
+0:	bitorder	w0, \be
+	ret
 	.endm
 
 	.align		5
@@ -99,3 +150,11 @@ alternative_if_not ARM64_HAS_CRC32
 alternative_else_nop_endif
 	__crc32		c
 SYM_FUNC_END(__crc32c_le)
+
+	.align		5
+SYM_FUNC_START(crc32_be)
+alternative_if_not ARM64_HAS_CRC32
+	b		crc32_be_base
+alternative_else_nop_endif
+	__crc32		be=1
+SYM_FUNC_END(crc32_be)
-- 
2.25.1
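[Editor's note: the arm64 crc32b/h/w/x instructions compute only the bit-reflected (little-endian) CRC-32. The patch obtains the non-reflected big-endian CRC from them by bit-reversing the running CRC and each input on the way in (the new bitorder macros model rbit) and bit-reversing the result on the way out. A standalone C model of that equivalence, using bitwise reference code rather than the kernel implementation:]

```c
#include <stddef.h>
#include <stdint.h>

#define CRC32_POLY_BE 0x04C11DB7u
#define CRC32_POLY_LE 0xEDB88320u  /* bit-reversed BE polynomial */

/* Reference big-endian CRC: MSB-first shifts, normal polynomial. */
uint32_t crc32_be_bitwise(uint32_t crc, const unsigned char *p, size_t len)
{
	while (len--) {
		crc ^= (uint32_t)*p++ << 24;
		for (int i = 0; i < 8; i++)
			crc = (crc & 0x80000000u)
				? (crc << 1) ^ CRC32_POLY_BE
				: crc << 1;
	}
	return crc;
}

/* LE core: LSB-first shifts, reflected polynomial -- this is the
 * algorithm the hardware crc32x/w/h/b instructions implement. */
uint32_t crc32_le_bitwise(uint32_t crc, const unsigned char *p, size_t len)
{
	while (len--) {
		crc ^= *p++;
		for (int i = 0; i < 8; i++)
			crc = (crc & 1u) ? (crc >> 1) ^ CRC32_POLY_LE
					 : crc >> 1;
	}
	return crc;
}

/* Software model of the arm64 rbit instruction (32-bit). */
uint32_t rbit32(uint32_t x)
{
	uint32_t r = 0;
	for (int i = 0; i < 32; i++)
		r |= ((x >> i) & 1u) << (31 - i);
	return r;
}

unsigned char rbit8(unsigned char x)
{
	return (unsigned char)(rbit32(x) >> 24);
}

/* The trick used by the patch: reflect the CRC and the data bits in,
 * run the (hardware) LE algorithm, reflect the result out. */
uint32_t crc32_be_via_le(uint32_t crc, const unsigned char *p, size_t len)
{
	crc = rbit32(crc);                        /* reflect CRC in */
	while (len--) {
		unsigned char b = rbit8(*p++);    /* reflect data byte */
		crc = crc32_le_bitwise(crc, &b, 1);
	}
	return rbit32(crc);                       /* reflect result out */
}
```

Each BE step (test the MSB, shift left, XOR the normal polynomial) is the exact mirror image of an LE step (test the LSB, shift right, XOR the reflected polynomial) under bit reversal, which is why the two routes compute the same value.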
* Re: [PATCH v3 0/4] arm64: accelerate crc32_be
From: Herbert Xu @ 2022-01-28 6:25 UTC
To: Kevin Bracey
Cc: linux-crypto, linux-arm-kernel, Ard Biesheuvel, Will Deacon,
    Catalin Marinas

On Tue, Jan 18, 2022 at 12:23:47PM +0200, Kevin Bracey wrote:
> Originally sent only to the arm-linux list - now including linux-crypto.
> Ard suggested that Herbert take the series.
>
> This series completes the arm64 crc32 helper acceleration by adding crc32_be.
>
> There are plenty of users, for example OF.
>
> To compensate for the extra supporting cruft in lib/crc32.c, a couple of minor
> tidies.
>
> changes since v2:
> - no code change, but sent to Herbert+crypto with Catalin's ack for arm64
>
> changes since v1:
> - assembler style fixes from Ard's review
>
> Kevin Bracey (4):
>   lib/crc32.c: remove unneeded casts
>   lib/crc32.c: Make crc32_be weak for arch override
>   lib/crc32test.c: correct printed bytes count
>   arm64: accelerate crc32_be
>
>  arch/arm64/lib/crc32.S | 87 +++++++++++++++++++++++++++++++++++-------
>  lib/crc32.c            | 14 +++----
>  lib/crc32test.c        |  2 +-
>  3 files changed, 80 insertions(+), 23 deletions(-)

All applied.  Thanks.
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Thread overview: 12+ messages
2022-01-18 10:23 [PATCH v3 0/4] arm64: accelerate crc32_be (Kevin Bracey)
2022-01-18 10:23 ` [PATCH v3 1/4] lib/crc32.c: remove unneeded casts (Kevin Bracey)
2022-01-18 10:23 ` [PATCH v3 2/4] lib/crc32.c: Make crc32_be weak for arch override (Kevin Bracey)
2022-01-18 10:23 ` [PATCH v3 3/4] lib/crc32test.c: correct printed bytes count (Kevin Bracey)
2022-01-18 10:23 ` [PATCH v3 4/4] arm64: accelerate crc32_be (Kevin Bracey)
2022-01-28  6:25 ` Re: [PATCH v3 0/4] arm64: accelerate crc32_be (Herbert Xu)