All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] crypto: arm64/aegis128 - use explicit vector load for permute vectors
@ 2019-08-19 14:15 Ard Biesheuvel
  2019-08-19 15:44 ` Nathan Chancellor
  2019-08-30  8:14 ` Herbert Xu
  0 siblings, 2 replies; 3+ messages in thread
From: Ard Biesheuvel @ 2019-08-19 14:15 UTC (permalink / raw)
  To: linux-crypto; +Cc: herbert, natechancellor, Ard Biesheuvel

When building the new aegis128 NEON code in big endian mode, Clang
complains about the const uint8x16_t permute vectors in the following
way:

  crypto/aegis128-neon-inner.c:58:40: warning: vector initializers are not
      compatible with NEON intrinsics in big endian mode
      [-Wnonportable-vector-initialization]
                static const uint8x16_t shift_rows = {
                                                     ^
  crypto/aegis128-neon-inner.c:58:40: note: consider using vld1q_u8() to
      initialize a vector from memory, or vcombine_u8(vcreate_u8(), vcreate_u8())
      to initialize from integer constants

Since the same issue applies to the uint8x16x4_t loads of the AES Sbox,
update those references as well. However, since GCC does not implement
the vld1q_u8_x4() intrinsic, switch from IS_ENABLED() to a preprocessor
conditional to conditionally include this code.

Reported-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 crypto/aegis128-neon-inner.c | 38 ++++++++++----------
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/crypto/aegis128-neon-inner.c b/crypto/aegis128-neon-inner.c
index ed55568afd1b..f05310ca22aa 100644
--- a/crypto/aegis128-neon-inner.c
+++ b/crypto/aegis128-neon-inner.c
@@ -26,7 +26,7 @@ struct aegis128_state {
 	uint8x16_t v[5];
 };
 
-extern const uint8x16x4_t crypto_aes_sbox[];
+extern const uint8_t crypto_aes_sbox[];
 
 static struct aegis128_state aegis128_load_state_neon(const void *state)
 {
@@ -55,39 +55,39 @@ uint8x16_t aegis_aes_round(uint8x16_t w)
 
 #ifdef CONFIG_ARM64
 	if (!__builtin_expect(aegis128_have_aes_insn, 1)) {
-		static const uint8x16_t shift_rows = {
+		static const uint8_t shift_rows[] = {
 			0x0, 0x5, 0xa, 0xf, 0x4, 0x9, 0xe, 0x3,
 			0x8, 0xd, 0x2, 0x7, 0xc, 0x1, 0x6, 0xb,
 		};
-		static const uint8x16_t ror32by8 = {
+		static const uint8_t ror32by8[] = {
 			0x1, 0x2, 0x3, 0x0, 0x5, 0x6, 0x7, 0x4,
 			0x9, 0xa, 0xb, 0x8, 0xd, 0xe, 0xf, 0xc,
 		};
 		uint8x16_t v;
 
 		// shift rows
-		w = vqtbl1q_u8(w, shift_rows);
+		w = vqtbl1q_u8(w, vld1q_u8(shift_rows));
 
 		// sub bytes
-		if (!IS_ENABLED(CONFIG_CC_IS_GCC)) {
-			v = vqtbl4q_u8(crypto_aes_sbox[0], w);
-			v = vqtbx4q_u8(v, crypto_aes_sbox[1], w - 0x40);
-			v = vqtbx4q_u8(v, crypto_aes_sbox[2], w - 0x80);
-			v = vqtbx4q_u8(v, crypto_aes_sbox[3], w - 0xc0);
-		} else {
-			asm("tbl %0.16b, {v16.16b-v19.16b}, %1.16b" : "=w"(v) : "w"(w));
-			w -= 0x40;
-			asm("tbx %0.16b, {v20.16b-v23.16b}, %1.16b" : "+w"(v) : "w"(w));
-			w -= 0x40;
-			asm("tbx %0.16b, {v24.16b-v27.16b}, %1.16b" : "+w"(v) : "w"(w));
-			w -= 0x40;
-			asm("tbx %0.16b, {v28.16b-v31.16b}, %1.16b" : "+w"(v) : "w"(w));
-		}
+#ifndef CONFIG_CC_IS_GCC
+		v = vqtbl4q_u8(vld1q_u8_x4(crypto_aes_sbox), w);
+		v = vqtbx4q_u8(v, vld1q_u8_x4(crypto_aes_sbox + 0x40), w - 0x40);
+		v = vqtbx4q_u8(v, vld1q_u8_x4(crypto_aes_sbox + 0x80), w - 0x80);
+		v = vqtbx4q_u8(v, vld1q_u8_x4(crypto_aes_sbox + 0xc0), w - 0xc0);
+#else
+		asm("tbl %0.16b, {v16.16b-v19.16b}, %1.16b" : "=w"(v) : "w"(w));
+		w -= 0x40;
+		asm("tbx %0.16b, {v20.16b-v23.16b}, %1.16b" : "+w"(v) : "w"(w));
+		w -= 0x40;
+		asm("tbx %0.16b, {v24.16b-v27.16b}, %1.16b" : "+w"(v) : "w"(w));
+		w -= 0x40;
+		asm("tbx %0.16b, {v28.16b-v31.16b}, %1.16b" : "+w"(v) : "w"(w));
+#endif
 
 		// mix columns
 		w = (v << 1) ^ (uint8x16_t)(((int8x16_t)v >> 7) & 0x1b);
 		w ^= (uint8x16_t)vrev32q_u16((uint16x8_t)v);
-		w ^= vqtbl1q_u8(v ^ w, ror32by8);
+		w ^= vqtbl1q_u8(v ^ w, vld1q_u8(ror32by8));
 
 		return w;
 	}
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] crypto: arm64/aegis128 - use explicit vector load for permute vectors
  2019-08-19 14:15 [PATCH] crypto: arm64/aegis128 - use explicit vector load for permute vectors Ard Biesheuvel
@ 2019-08-19 15:44 ` Nathan Chancellor
  2019-08-30  8:14 ` Herbert Xu
  1 sibling, 0 replies; 3+ messages in thread
From: Nathan Chancellor @ 2019-08-19 15:44 UTC (permalink / raw)
  To: Ard Biesheuvel; +Cc: linux-crypto, herbert

On Mon, Aug 19, 2019 at 05:15:00PM +0300, Ard Biesheuvel wrote:
> When building the new aegis128 NEON code in big endian mode, Clang
> complains about the const uint8x16_t permute vectors in the following
> way:
> 
>   crypto/aegis128-neon-inner.c:58:40: warning: vector initializers are not
>       compatible with NEON intrinsics in big endian mode
>       [-Wnonportable-vector-initialization]
>                 static const uint8x16_t shift_rows = {
>                                                      ^
>   crypto/aegis128-neon-inner.c:58:40: note: consider using vld1q_u8() to
>       initialize a vector from memory, or vcombine_u8(vcreate_u8(), vcreate_u8())
>       to initialize from integer constants
> 
> Since the same issue applies to the uint8x16x4_t loads of the AES Sbox,
> update those references as well. However, since GCC does not implement
> the vld1q_u8_x4() intrinsic, switch from IS_ENABLED() to a preprocessor
> conditional to conditionally include this code.
> 
> Reported-by: Nathan Chancellor <natechancellor@gmail.com>
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

I am not familiar enough with vectors and such to confidently give a
review but I can say this fixes the warning and doesn't introduce any
new ones. Thank you for the fix!

Tested-by: Nathan Chancellor <natechancellor@gmail.com>

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] crypto: arm64/aegis128 - use explicit vector load for permute vectors
  2019-08-19 14:15 [PATCH] crypto: arm64/aegis128 - use explicit vector load for permute vectors Ard Biesheuvel
  2019-08-19 15:44 ` Nathan Chancellor
@ 2019-08-30  8:14 ` Herbert Xu
  1 sibling, 0 replies; 3+ messages in thread
From: Herbert Xu @ 2019-08-30  8:14 UTC (permalink / raw)
  To: Ard Biesheuvel; +Cc: linux-crypto, natechancellor

On Mon, Aug 19, 2019 at 05:15:00PM +0300, Ard Biesheuvel wrote:
> When building the new aegis128 NEON code in big endian mode, Clang
> complains about the const uint8x16_t permute vectors in the following
> way:
> 
>   crypto/aegis128-neon-inner.c:58:40: warning: vector initializers are not
>       compatible with NEON intrinsics in big endian mode
>       [-Wnonportable-vector-initialization]
>                 static const uint8x16_t shift_rows = {
>                                                      ^
>   crypto/aegis128-neon-inner.c:58:40: note: consider using vld1q_u8() to
>       initialize a vector from memory, or vcombine_u8(vcreate_u8(), vcreate_u8())
>       to initialize from integer constants
> 
> Since the same issue applies to the uint8x16x4_t loads of the AES Sbox,
> update those references as well. However, since GCC does not implement
> the vld1q_u8_x4() intrinsic, switch from IS_ENABLED() to a preprocessor
> conditional to conditionally include this code.
> 
> Reported-by: Nathan Chancellor <natechancellor@gmail.com>
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> ---
>  crypto/aegis128-neon-inner.c | 38 ++++++++++----------
>  1 file changed, 19 insertions(+), 19 deletions(-)

Patch applied.  Thanks.
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2019-08-30  8:14 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-19 14:15 [PATCH] crypto: arm64/aegis128 - use explicit vector load for permute vectors Ard Biesheuvel
2019-08-19 15:44 ` Nathan Chancellor
2019-08-30  8:14 ` Herbert Xu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.