* [PATCH v3 0/4] crypto: aegis128 enhancements
@ 2020-11-17 13:32 ` Ard Biesheuvel
  0 siblings, 0 replies; 22+ messages in thread
From: Ard Biesheuvel @ 2020-11-17 13:32 UTC (permalink / raw)
  To: linux-crypto
  Cc: herbert, linux-arm-kernel, Ard Biesheuvel, Ondrej Mosnacek, Eric Biggers

This series supersedes [0] '[PATCH] crypto: aegis128/neon - optimize tail
block handling', which is included as patch #2 here, but hasn't been
modified substantially.

Patch #1 should probably go to -stable, even though aegis128 does not appear
to be widely used.

Patches #2 and #3 improve the SIMD code paths.

Patch #4 enables fuzz testing for the SIMD code by registering the generic
code as a separate driver if the SIMD code path is enabled.

Changes since v2:
- add Ondrej's ack to #1
- fix an issue spotted by Ondrej in #4 where the generic code path would still
  use some of the SIMD helpers

Cc: Ondrej Mosnacek <omosnacek@gmail.com>
Cc: Eric Biggers <ebiggers@kernel.org>

[0] https://lore.kernel.org/linux-crypto/20201107195516.13952-1-ardb@kernel.org/

Ard Biesheuvel (4):
  crypto: aegis128 - wipe plaintext and tag if decryption fails
  crypto: aegis128/neon - optimize tail block handling
  crypto: aegis128/neon - move final tag check to SIMD domain
  crypto: aegis128 - expose SIMD code path as separate driver

 crypto/aegis128-core.c       | 245 ++++++++++++++------
 crypto/aegis128-neon-inner.c | 122 ++++++++--
 crypto/aegis128-neon.c       |  21 +-
 3 files changed, 287 insertions(+), 101 deletions(-)

-- 
2.17.1



* [PATCH v3 1/4] crypto: aegis128 - wipe plaintext and tag if decryption fails
  2020-11-17 13:32 ` Ard Biesheuvel
@ 2020-11-17 13:32   ` Ard Biesheuvel
  -1 siblings, 0 replies; 22+ messages in thread
From: Ard Biesheuvel @ 2020-11-17 13:32 UTC (permalink / raw)
  To: linux-crypto
  Cc: herbert, linux-arm-kernel, Ard Biesheuvel, Ondrej Mosnacek, Eric Biggers

The AEGIS spec mentions explicitly that the security guarantees hold
only if the resulting plaintext and tag of a failed decryption are
withheld. So ensure that we abide by this.

While at it, drop the unused struct aead_request *req parameter from
crypto_aegis128_process_crypt().

Reviewed-by: Ondrej Mosnacek <omosnacek@gmail.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 crypto/aegis128-core.c | 32 ++++++++++++++++----
 1 file changed, 26 insertions(+), 6 deletions(-)

diff --git a/crypto/aegis128-core.c b/crypto/aegis128-core.c
index 44fb4956f0dd..3a71235892f5 100644
--- a/crypto/aegis128-core.c
+++ b/crypto/aegis128-core.c
@@ -154,6 +154,12 @@ static void crypto_aegis128_ad(struct aegis_state *state,
 	}
 }
 
+static void crypto_aegis128_wipe_chunk(struct aegis_state *state, u8 *dst,
+				       const u8 *src, unsigned int size)
+{
+	memzero_explicit(dst, size);
+}
+
 static void crypto_aegis128_encrypt_chunk(struct aegis_state *state, u8 *dst,
 					  const u8 *src, unsigned int size)
 {
@@ -324,7 +330,6 @@ static void crypto_aegis128_process_ad(struct aegis_state *state,
 
 static __always_inline
 int crypto_aegis128_process_crypt(struct aegis_state *state,
-				  struct aead_request *req,
 				  struct skcipher_walk *walk,
 				  void (*crypt)(struct aegis_state *state,
 					        u8 *dst, const u8 *src,
@@ -403,14 +408,14 @@ static int crypto_aegis128_encrypt(struct aead_request *req)
 	if (aegis128_do_simd()) {
 		crypto_aegis128_init_simd(&state, &ctx->key, req->iv);
 		crypto_aegis128_process_ad(&state, req->src, req->assoclen);
-		crypto_aegis128_process_crypt(&state, req, &walk,
+		crypto_aegis128_process_crypt(&state, &walk,
 					      crypto_aegis128_encrypt_chunk_simd);
 		crypto_aegis128_final_simd(&state, &tag, req->assoclen,
 					   cryptlen);
 	} else {
 		crypto_aegis128_init(&state, &ctx->key, req->iv);
 		crypto_aegis128_process_ad(&state, req->src, req->assoclen);
-		crypto_aegis128_process_crypt(&state, req, &walk,
+		crypto_aegis128_process_crypt(&state, &walk,
 					      crypto_aegis128_encrypt_chunk);
 		crypto_aegis128_final(&state, &tag, req->assoclen, cryptlen);
 	}
@@ -438,19 +443,34 @@ static int crypto_aegis128_decrypt(struct aead_request *req)
 	if (aegis128_do_simd()) {
 		crypto_aegis128_init_simd(&state, &ctx->key, req->iv);
 		crypto_aegis128_process_ad(&state, req->src, req->assoclen);
-		crypto_aegis128_process_crypt(&state, req, &walk,
+		crypto_aegis128_process_crypt(&state, &walk,
 					      crypto_aegis128_decrypt_chunk_simd);
 		crypto_aegis128_final_simd(&state, &tag, req->assoclen,
 					   cryptlen);
 	} else {
 		crypto_aegis128_init(&state, &ctx->key, req->iv);
 		crypto_aegis128_process_ad(&state, req->src, req->assoclen);
-		crypto_aegis128_process_crypt(&state, req, &walk,
+		crypto_aegis128_process_crypt(&state, &walk,
 					      crypto_aegis128_decrypt_chunk);
 		crypto_aegis128_final(&state, &tag, req->assoclen, cryptlen);
 	}
 
-	return crypto_memneq(tag.bytes, zeros, authsize) ? -EBADMSG : 0;
+	if (unlikely(crypto_memneq(tag.bytes, zeros, authsize))) {
+		/*
+		 * From Chapter 4. 'Security Analysis' of the AEGIS spec [0]
+		 *
+		 * "3. If verification fails, the decrypted plaintext and the
+		 *     wrong authentication tag should not be given as output."
+		 *
+		 * [0] https://competitions.cr.yp.to/round3/aegisv11.pdf
+		 */
+		skcipher_walk_aead_decrypt(&walk, req, false);
+		crypto_aegis128_process_crypt(NULL, &walk,
+					      crypto_aegis128_wipe_chunk);
+		memzero_explicit(&tag, sizeof(tag));
+		return -EBADMSG;
+	}
+	return 0;
 }
 
 static struct aead_alg crypto_aegis128_alg = {
-- 
2.17.1
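
The decrypt-then-verify-then-wipe ordering described in this patch can be
modelled in portable C. What follows is only a sketch under loud assumptions:
the XOR "cipher" and 4-byte tag are toys and have nothing to do with AEGIS,
and wipe(), ct_memneq(), toy_tag() and toy_open() are hypothetical names that
merely stand in for memzero_explicit(), crypto_memneq() and the decrypt path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for memzero_explicit(): volatile stores are not optimized away. */
static void wipe(void *p, size_t n)
{
	volatile uint8_t *v = p;

	while (n--)
		*v++ = 0;
}

/* Constant-time inequality test, in the spirit of crypto_memneq(). */
static int ct_memneq(const uint8_t *a, const uint8_t *b, size_t n)
{
	uint8_t diff = 0;

	for (size_t i = 0; i < n; i++)
		diff |= a[i] ^ b[i];
	return diff != 0;
}

/* Toy 4-byte tag over the ciphertext. NOT AEGIS, illustration only. */
static void toy_tag(const uint8_t *ct, size_t len, uint8_t key, uint8_t tag[4])
{
	for (size_t j = 0; j < 4; j++)
		tag[j] = (uint8_t)(key + j);
	for (size_t i = 0; i < len; i++)
		tag[i % 4] ^= (uint8_t)(ct[i] + i);
}

/*
 * Decrypt first and verify afterwards; on a tag mismatch, wipe both the
 * plaintext already written to dst and the computed tag before returning.
 * Returns 0 on success and -1 on failure (the kernel returns -EBADMSG).
 */
static int toy_open(uint8_t *dst, const uint8_t *src, size_t len,
		    uint8_t key, const uint8_t tag[4])
{
	uint8_t calc[4];

	assert(dst != NULL && src != NULL && tag != NULL);

	for (size_t i = 0; i < len; i++)
		dst[i] = src[i] ^ key;		/* toy XOR "cipher" */

	toy_tag(src, len, key, calc);
	if (ct_memneq(calc, tag, sizeof(calc))) {
		wipe(dst, len);
		wipe(calc, sizeof(calc));
		return -1;
	}
	return 0;
}
```

The point of the sketch is only the control flow: by the time the tag is
known to be wrong, the output buffer already holds unauthenticated plaintext,
so the failure path has to walk it a second time and clear it.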



* [PATCH v3 2/4] crypto: aegis128/neon - optimize tail block handling
  2020-11-17 13:32 ` Ard Biesheuvel
@ 2020-11-17 13:32   ` Ard Biesheuvel
  -1 siblings, 0 replies; 22+ messages in thread
From: Ard Biesheuvel @ 2020-11-17 13:32 UTC (permalink / raw)
  To: linux-crypto
  Cc: herbert, linux-arm-kernel, Ard Biesheuvel, Ondrej Mosnacek, Eric Biggers

Avoid copying the tail block via a stack buffer if the total size
exceeds a single AEGIS block. In this case, we can use overlapping
loads and stores and NEON permutation instructions instead, which
leads to a modest performance improvement on some cores (< 5%),
and is slightly cleaner. Note that we still need to use a stack
buffer if the entire input is smaller than 16 bytes, given that
we cannot use 16 byte NEON loads and stores safely in this case.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 crypto/aegis128-neon-inner.c | 89 +++++++++++++++++---
 1 file changed, 75 insertions(+), 14 deletions(-)

diff --git a/crypto/aegis128-neon-inner.c b/crypto/aegis128-neon-inner.c
index 2a660ac1bc3a..cd1b3ad1d1f3 100644
--- a/crypto/aegis128-neon-inner.c
+++ b/crypto/aegis128-neon-inner.c
@@ -20,7 +20,6 @@
 extern int aegis128_have_aes_insn;
 
 void *memcpy(void *dest, const void *src, size_t n);
-void *memset(void *s, int c, size_t n);
 
 struct aegis128_state {
 	uint8x16_t v[5];
@@ -173,10 +172,46 @@ void crypto_aegis128_update_neon(void *state, const void *msg)
 	aegis128_save_state_neon(st, state);
 }
 
+#ifdef CONFIG_ARM
+/*
+ * AArch32 does not provide these intrinsics natively because it does not
+ * implement the underlying instructions. AArch32 only provides 64-bit
+ * wide vtbl.8/vtbx.8 instructions, so use those instead.
+ */
+static uint8x16_t vqtbl1q_u8(uint8x16_t a, uint8x16_t b)
+{
+	union {
+		uint8x16_t	val;
+		uint8x8x2_t	pair;
+	} __a = { a };
+
+	return vcombine_u8(vtbl2_u8(__a.pair, vget_low_u8(b)),
+			   vtbl2_u8(__a.pair, vget_high_u8(b)));
+}
+
+static uint8x16_t vqtbx1q_u8(uint8x16_t v, uint8x16_t a, uint8x16_t b)
+{
+	union {
+		uint8x16_t	val;
+		uint8x8x2_t	pair;
+	} __a = { a };
+
+	return vcombine_u8(vtbx2_u8(vget_low_u8(v), __a.pair, vget_low_u8(b)),
+			   vtbx2_u8(vget_high_u8(v), __a.pair, vget_high_u8(b)));
+}
+#endif
+
+static const uint8_t permute[] __aligned(64) = {
+	-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
+	 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15,
+	-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
+};
+
 void crypto_aegis128_encrypt_chunk_neon(void *state, void *dst, const void *src,
 					unsigned int size)
 {
 	struct aegis128_state st = aegis128_load_state_neon(state);
+	const int short_input = size < AEGIS_BLOCK_SIZE;
 	uint8x16_t msg;
 
 	preload_sbox();
@@ -186,7 +221,8 @@ void crypto_aegis128_encrypt_chunk_neon(void *state, void *dst, const void *src,
 
 		msg = vld1q_u8(src);
 		st = aegis128_update_neon(st, msg);
-		vst1q_u8(dst, msg ^ s);
+		msg ^= s;
+		vst1q_u8(dst, msg);
 
 		size -= AEGIS_BLOCK_SIZE;
 		src += AEGIS_BLOCK_SIZE;
@@ -195,13 +231,26 @@ void crypto_aegis128_encrypt_chunk_neon(void *state, void *dst, const void *src,
 
 	if (size > 0) {
 		uint8x16_t s = st.v[1] ^ (st.v[2] & st.v[3]) ^ st.v[4];
-		uint8_t buf[AEGIS_BLOCK_SIZE] = {};
+		uint8_t buf[AEGIS_BLOCK_SIZE];
+		const void *in = src;
+		void *out = dst;
+		uint8x16_t m;
 
-		memcpy(buf, src, size);
-		msg = vld1q_u8(buf);
-		st = aegis128_update_neon(st, msg);
-		vst1q_u8(buf, msg ^ s);
-		memcpy(dst, buf, size);
+		if (__builtin_expect(short_input, 0))
+			in = out = memcpy(buf + AEGIS_BLOCK_SIZE - size, src, size);
+
+		m = vqtbl1q_u8(vld1q_u8(in + size - AEGIS_BLOCK_SIZE),
+			       vld1q_u8(permute + 32 - size));
+
+		st = aegis128_update_neon(st, m);
+
+		vst1q_u8(out + size - AEGIS_BLOCK_SIZE,
+			 vqtbl1q_u8(m ^ s, vld1q_u8(permute + size)));
+
+		if (__builtin_expect(short_input, 0))
+			memcpy(dst, out, size);
+		else
+			vst1q_u8(out - AEGIS_BLOCK_SIZE, msg);
 	}
 
 	aegis128_save_state_neon(st, state);
@@ -211,6 +260,7 @@ void crypto_aegis128_decrypt_chunk_neon(void *state, void *dst, const void *src,
 					unsigned int size)
 {
 	struct aegis128_state st = aegis128_load_state_neon(state);
+	const int short_input = size < AEGIS_BLOCK_SIZE;
 	uint8x16_t msg;
 
 	preload_sbox();
@@ -228,14 +278,25 @@ void crypto_aegis128_decrypt_chunk_neon(void *state, void *dst, const void *src,
 	if (size > 0) {
 		uint8x16_t s = st.v[1] ^ (st.v[2] & st.v[3]) ^ st.v[4];
 		uint8_t buf[AEGIS_BLOCK_SIZE];
+		const void *in = src;
+		void *out = dst;
+		uint8x16_t m;
 
-		vst1q_u8(buf, s);
-		memcpy(buf, src, size);
-		msg = vld1q_u8(buf) ^ s;
-		vst1q_u8(buf, msg);
-		memcpy(dst, buf, size);
+		if (__builtin_expect(short_input, 0))
+			in = out = memcpy(buf + AEGIS_BLOCK_SIZE - size, src, size);
 
-		st = aegis128_update_neon(st, msg);
+		m = s ^ vqtbx1q_u8(s, vld1q_u8(in + size - AEGIS_BLOCK_SIZE),
+				   vld1q_u8(permute + 32 - size));
+
+		st = aegis128_update_neon(st, m);
+
+		vst1q_u8(out + size - AEGIS_BLOCK_SIZE,
+			 vqtbl1q_u8(m, vld1q_u8(permute + size)));
+
+		if (__builtin_expect(short_input, 0))
+			memcpy(dst, out, size);
+		else
+			vst1q_u8(out - AEGIS_BLOCK_SIZE, msg);
 	}
 
 	aegis128_save_state_neon(st, state);
-- 
2.17.1
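
For readers without the NEON intrinsics at hand, the overlapping load/store
trick in this patch can be modelled with scalar C. This is a sketch, not the
patch's code: tbl() is a hypothetical scalar model of vqtbl1q_u8 (an index of
16 or more yields zero), load_tail() and store_tail() are illustrative names,
and the cipher itself is left out. It assumes at least one full block
precedes the tail, which is the non-short_input case.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLOCK 16

/* Same layout as the patch's permute[] table (0xff is the -1 entry). */
static const uint8_t permute[3 * BLOCK] = {
	0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
	0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
	   0,    1,    2,    3,    4,    5,    6,    7,
	   8,    9,   10,   11,   12,   13,   14,   15,
	0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
	0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
};

/* Scalar model of vqtbl1q_u8: out-of-range indexes select zero. */
static void tbl(uint8_t out[BLOCK], const uint8_t *a, const uint8_t *idx)
{
	uint8_t tmp[BLOCK];

	for (int i = 0; i < BLOCK; i++)
		tmp[i] = idx[i] < BLOCK ? a[idx[i]] : 0;
	memcpy(out, tmp, BLOCK);
}

/*
 * Load the 16 bytes that END at the end of the input, then shift the
 * `size` tail bytes down to lane 0 and zero-pad the remaining lanes.
 */
static void load_tail(uint8_t block[BLOCK], const uint8_t *tail,
		      unsigned int size)
{
	assert(size > 0 && size < BLOCK);
	tbl(block, tail + size - BLOCK, permute + 32 - size);
}

/*
 * Shift the result back up and store it so that it ends at the end of the
 * output. This writes zeros over part of the previous block, which is why
 * the previous block (prev) is stored again afterwards.
 */
static void store_tail(uint8_t *out, const uint8_t x[BLOCK],
		       const uint8_t prev[BLOCK], unsigned int size)
{
	uint8_t v[BLOCK];

	assert(size > 0 && size < BLOCK);
	tbl(v, x, permute + size);
	memcpy(out + size - BLOCK, v, BLOCK);
	memcpy(out - BLOCK, prev, BLOCK);
}
```

With size = 5, the index vector taken from permute + 27 is 11..15 followed
by eleven 0xff entries, so lanes 0..4 receive the tail bytes and lanes 5..15
become zero. The store uses permute + 5, which is the inverse shift.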



* [PATCH v3 3/4] crypto: aegis128/neon - move final tag check to SIMD domain
  2020-11-17 13:32 ` Ard Biesheuvel
@ 2020-11-17 13:32   ` Ard Biesheuvel
  -1 siblings, 0 replies; 22+ messages in thread
From: Ard Biesheuvel @ 2020-11-17 13:32 UTC (permalink / raw)
  To: linux-crypto
  Cc: herbert, linux-arm-kernel, Ard Biesheuvel, Ondrej Mosnacek, Eric Biggers

Instead of calculating the tag and returning it to the caller on
decryption, use a SIMD compare and min across vector to perform
the comparison. This is slightly more efficient, and removes the
need on the caller's part to wipe the tag from memory if the
decryption failed.

While at it, switch to unsigned int when passing cryptlen and
assoclen - we don't support input sizes where it matters anyway.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 crypto/aegis128-core.c       | 21 +++++++++----
 crypto/aegis128-neon-inner.c | 33 ++++++++++++++++----
 crypto/aegis128-neon.c       | 21 +++++++++----
 3 files changed, 57 insertions(+), 18 deletions(-)

diff --git a/crypto/aegis128-core.c b/crypto/aegis128-core.c
index 3a71235892f5..859c7b905618 100644
--- a/crypto/aegis128-core.c
+++ b/crypto/aegis128-core.c
@@ -67,9 +67,11 @@ void crypto_aegis128_encrypt_chunk_simd(struct aegis_state *state, u8 *dst,
 					const u8 *src, unsigned int size);
 void crypto_aegis128_decrypt_chunk_simd(struct aegis_state *state, u8 *dst,
 					const u8 *src, unsigned int size);
-void crypto_aegis128_final_simd(struct aegis_state *state,
-				union aegis_block *tag_xor,
-				u64 assoclen, u64 cryptlen);
+int crypto_aegis128_final_simd(struct aegis_state *state,
+			       union aegis_block *tag_xor,
+			       unsigned int assoclen,
+			       unsigned int cryptlen,
+			       unsigned int authsize);
 
 static void crypto_aegis128_update(struct aegis_state *state)
 {
@@ -411,7 +413,7 @@ static int crypto_aegis128_encrypt(struct aead_request *req)
 		crypto_aegis128_process_crypt(&state, &walk,
 					      crypto_aegis128_encrypt_chunk_simd);
 		crypto_aegis128_final_simd(&state, &tag, req->assoclen,
-					   cryptlen);
+					   cryptlen, 0);
 	} else {
 		crypto_aegis128_init(&state, &ctx->key, req->iv);
 		crypto_aegis128_process_ad(&state, req->src, req->assoclen);
@@ -445,8 +447,15 @@ static int crypto_aegis128_decrypt(struct aead_request *req)
 		crypto_aegis128_process_ad(&state, req->src, req->assoclen);
 		crypto_aegis128_process_crypt(&state, &walk,
 					      crypto_aegis128_decrypt_chunk_simd);
-		crypto_aegis128_final_simd(&state, &tag, req->assoclen,
-					   cryptlen);
+		if (unlikely(crypto_aegis128_final_simd(&state, &tag,
+							req->assoclen,
+							cryptlen, authsize))) {
+			skcipher_walk_aead_decrypt(&walk, req, false);
+			crypto_aegis128_process_crypt(NULL, &walk,
+						      crypto_aegis128_wipe_chunk);
+			return -EBADMSG;
+		}
+		return 0;
 	} else {
 		crypto_aegis128_init(&state, &ctx->key, req->iv);
 		crypto_aegis128_process_ad(&state, req->src, req->assoclen);
diff --git a/crypto/aegis128-neon-inner.c b/crypto/aegis128-neon-inner.c
index cd1b3ad1d1f3..7de485907d81 100644
--- a/crypto/aegis128-neon-inner.c
+++ b/crypto/aegis128-neon-inner.c
@@ -199,6 +199,17 @@ static uint8x16_t vqtbx1q_u8(uint8x16_t v, uint8x16_t a, uint8x16_t b)
 	return vcombine_u8(vtbx2_u8(vget_low_u8(v), __a.pair, vget_low_u8(b)),
 			   vtbx2_u8(vget_high_u8(v), __a.pair, vget_high_u8(b)));
 }
+
+static int8_t vminvq_s8(int8x16_t v)
+{
+	int8x8_t s = vpmin_s8(vget_low_s8(v), vget_high_s8(v));
+
+	s = vpmin_s8(s, s);
+	s = vpmin_s8(s, s);
+	s = vpmin_s8(s, s);
+
+	return vget_lane_s8(s, 0);
+}
 #endif
 
 static const uint8_t permute[] __aligned(64) = {
@@ -302,8 +313,10 @@ void crypto_aegis128_decrypt_chunk_neon(void *state, void *dst, const void *src,
 	aegis128_save_state_neon(st, state);
 }
 
-void crypto_aegis128_final_neon(void *state, void *tag_xor, uint64_t assoclen,
-				uint64_t cryptlen)
+int crypto_aegis128_final_neon(void *state, void *tag_xor,
+			       unsigned int assoclen,
+			       unsigned int cryptlen,
+			       unsigned int authsize)
 {
 	struct aegis128_state st = aegis128_load_state_neon(state);
 	uint8x16_t v;
@@ -311,13 +324,21 @@ void crypto_aegis128_final_neon(void *state, void *tag_xor, uint64_t assoclen,
 
 	preload_sbox();
 
-	v = st.v[3] ^ (uint8x16_t)vcombine_u64(vmov_n_u64(8 * assoclen),
-					       vmov_n_u64(8 * cryptlen));
+	v = st.v[3] ^ (uint8x16_t)vcombine_u64(vmov_n_u64(8ULL * assoclen),
+					       vmov_n_u64(8ULL * cryptlen));
 
 	for (i = 0; i < 7; i++)
 		st = aegis128_update_neon(st, v);
 
-	v = vld1q_u8(tag_xor);
-	v ^= st.v[0] ^ st.v[1] ^ st.v[2] ^ st.v[3] ^ st.v[4];
+	v = st.v[0] ^ st.v[1] ^ st.v[2] ^ st.v[3] ^ st.v[4];
+
+	if (authsize > 0) {
+		v = vqtbl1q_u8(~vceqq_u8(v, vld1q_u8(tag_xor)),
+			       vld1q_u8(permute + authsize));
+
+		return vminvq_s8((int8x16_t)v);
+	}
+
 	vst1q_u8(tag_xor, v);
+	return 0;
 }
diff --git a/crypto/aegis128-neon.c b/crypto/aegis128-neon.c
index 8271b1fa0fbc..94d591a002a4 100644
--- a/crypto/aegis128-neon.c
+++ b/crypto/aegis128-neon.c
@@ -14,8 +14,10 @@ void crypto_aegis128_encrypt_chunk_neon(void *state, void *dst, const void *src,
 					unsigned int size);
 void crypto_aegis128_decrypt_chunk_neon(void *state, void *dst, const void *src,
 					unsigned int size);
-void crypto_aegis128_final_neon(void *state, void *tag_xor, uint64_t assoclen,
-				uint64_t cryptlen);
+int crypto_aegis128_final_neon(void *state, void *tag_xor,
+			       unsigned int assoclen,
+			       unsigned int cryptlen,
+			       unsigned int authsize);
 
 int aegis128_have_aes_insn __ro_after_init;
 
@@ -60,11 +62,18 @@ void crypto_aegis128_decrypt_chunk_simd(union aegis_block *state, u8 *dst,
 	kernel_neon_end();
 }
 
-void crypto_aegis128_final_simd(union aegis_block *state,
-				union aegis_block *tag_xor,
-				u64 assoclen, u64 cryptlen)
+int crypto_aegis128_final_simd(union aegis_block *state,
+			       union aegis_block *tag_xor,
+			       unsigned int assoclen,
+			       unsigned int cryptlen,
+			       unsigned int authsize)
 {
+	int ret;
+
 	kernel_neon_begin();
-	crypto_aegis128_final_neon(state, tag_xor, assoclen, cryptlen);
+	ret = crypto_aegis128_final_neon(state, tag_xor, assoclen, cryptlen,
+					 authsize);
 	kernel_neon_end();
+
+	return ret;
 }
-- 
2.17.1
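
The compare, permute and min-across-vector sequence used for the tag check
can also be modelled in scalar C. This is a sketch under assumptions, not
the NEON code: tag_mismatch() is a hypothetical name, each loop stands in
for one vector operation, and authsize is assumed to be between 1 and 16.

```c
#include <assert.h>
#include <stdint.h>

#define BLOCK 16

/* Same layout as the patch's permute[] table (0xff is the -1 entry). */
static const uint8_t permute[3 * BLOCK] = {
	0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
	0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
	   0,    1,    2,    3,    4,    5,    6,    7,
	   8,    9,   10,   11,   12,   13,   14,   15,
	0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
	0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
};

/*
 * Returns 0 if the first `authsize` bytes of calc and tag are equal,
 * and -1 otherwise.
 */
static int tag_mismatch(const uint8_t calc[BLOCK], const uint8_t tag[BLOCK],
			unsigned int authsize)
{
	uint8_t neq[BLOCK];
	int8_t min = 0;

	assert(authsize > 0 && authsize <= BLOCK);

	/* Model of ~vceqq_u8(): 0xff in every lane where the bytes differ. */
	for (int i = 0; i < BLOCK; i++)
		neq[i] = calc[i] == tag[i] ? 0x00 : 0xff;

	for (int i = 0; i < BLOCK; i++) {
		/* Model of vqtbl1q_u8(): keep lanes 0..authsize-1 only. */
		uint8_t idx = permute[authsize + i];
		uint8_t lane = idx < BLOCK ? neq[idx] : 0x00;
		/* 0xff read as a signed byte is -1. */
		int8_t s = lane ? -1 : 0;

		/* Model of vminvq_s8(): signed minimum across all lanes. */
		if (s < min)
			min = s;
	}
	return min;
}
```

Because a differing lane is all ones, which is -1 as a signed byte, a signed
minimum across the vector is -1 if any selected lane differs and 0 if none
does. Lanes beyond authsize are dropped by the table lookup, so a truncated
tag is compared only over the bytes that were actually transmitted.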


 void crypto_aegis128_decrypt_chunk_neon(void *state, void *dst, const void *src,
 					unsigned int size);
-void crypto_aegis128_final_neon(void *state, void *tag_xor, uint64_t assoclen,
-				uint64_t cryptlen);
+int crypto_aegis128_final_neon(void *state, void *tag_xor,
+			       unsigned int assoclen,
+			       unsigned int cryptlen,
+			       unsigned int authsize);
 
 int aegis128_have_aes_insn __ro_after_init;
 
@@ -60,11 +62,18 @@ void crypto_aegis128_decrypt_chunk_simd(union aegis_block *state, u8 *dst,
 	kernel_neon_end();
 }
 
-void crypto_aegis128_final_simd(union aegis_block *state,
-				union aegis_block *tag_xor,
-				u64 assoclen, u64 cryptlen)
+int crypto_aegis128_final_simd(union aegis_block *state,
+			       union aegis_block *tag_xor,
+			       unsigned int assoclen,
+			       unsigned int cryptlen,
+			       unsigned int authsize)
 {
+	int ret;
+
 	kernel_neon_begin();
-	crypto_aegis128_final_neon(state, tag_xor, assoclen, cryptlen);
+	ret = crypto_aegis128_final_neon(state, tag_xor, assoclen, cryptlen,
+					 authsize);
 	kernel_neon_end();
+
+	return ret;
 }
-- 
2.17.1
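
For readers unfamiliar with the NEON idiom, here is a rough scalar model
of the tag check in crypto_aegis128_final_neon() above. It is a sketch,
not the kernel code. It assumes that the lookup through the permute table
keeps only the first authsize comparison bytes and zeroes the rest (the
table contents are not part of this hunk). The names pmin8(),
min_across_s8() and tag_mismatch() are made up for illustration. A
mismatching byte becomes all-ones, i.e. -1 as a signed byte, so the signed
minimum over all lanes is nonzero exactly when a compared byte differs.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TAG_BYTES 16

/* Model of vpmin_s8(a, b): minima of adjacent pairs of a, then of b. */
static void pmin8(int8_t out[8], const int8_t a[8], const int8_t b[8])
{
	int8_t tmp[8];
	int i;

	for (i = 0; i < 4; i++) {
		tmp[i]     = a[2 * i] < a[2 * i + 1] ? a[2 * i] : a[2 * i + 1];
		tmp[i + 4] = b[2 * i] < b[2 * i + 1] ? b[2 * i] : b[2 * i + 1];
	}
	memcpy(out, tmp, sizeof(tmp));
}

/* Model of the vminvq_s8() fallback: four pairwise steps, result in lane 0. */
static int8_t min_across_s8(const int8_t v[TAG_BYTES])
{
	int8_t s[8];
	int i;

	pmin8(s, v, v + 8);		/* 16 lanes -> 8 partial minima */
	for (i = 0; i < 3; i++)
		pmin8(s, s, s);		/* 8 -> 4 -> 2 -> 1 */
	return s[0];
}

/* Returns 0 if the first authsize bytes match, a negative value otherwise. */
static int tag_mismatch(const uint8_t *computed, const uint8_t *received,
			unsigned int authsize)
{
	int8_t lanes[TAG_BYTES];
	unsigned int i;

	assert(authsize > 0 && authsize <= TAG_BYTES);

	for (i = 0; i < TAG_BYTES; i++) {
		/* -1 where a compared byte differs, 0 elsewhere (assumed effect
		 * of ~vceqq_u8() followed by the permute lookup) */
		lanes[i] = (i < authsize && computed[i] != received[i]) ? -1 : 0;
	}
	return min_across_s8(lanes);
}
```

In this model, a difference confined to bytes 8..15 is reported for a
16-byte tag but ignored when authsize is 8.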


^ permalink raw reply related	[flat|nested] 22+ messages in thread
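
A side note on the 8ULL change in patch 3/4 above: once assoclen and
cryptlen are passed as unsigned int rather than u64, a plain
8 * assoclen would be evaluated in unsigned int and could wrap before
being widened to 64 bits. The stand-alone illustration below assumes a
32-bit unsigned int; bits_narrow() and bits_wide() are made-up names and
do not appear in the patch.

```c
#include <assert.h>
#include <stdint.h>

static_assert(sizeof(unsigned int) == 4, "model assumes 32-bit unsigned int");

/* Multiply in unsigned int: wraps modulo 2^32, then gets widened. */
static uint64_t bits_narrow(unsigned int len)
{
	return 8 * len;
}

/* Multiply in unsigned long long, as the patch does: no wrap-around. */
static uint64_t bits_wide(unsigned int len)
{
	return 8ULL * len;
}
```

The two agree for small lengths and diverge from 2^29 bytes (512 MiB)
upwards, where the narrow product no longer fits in 32 bits.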

* [PATCH v3 4/4] crypto: aegis128 - expose SIMD code path as separate driver
  2020-11-17 13:32 ` Ard Biesheuvel
@ 2020-11-17 13:32   ` Ard Biesheuvel
  -1 siblings, 0 replies; 22+ messages in thread
From: Ard Biesheuvel @ 2020-11-17 13:32 UTC (permalink / raw)
  To: linux-crypto
  Cc: herbert, linux-arm-kernel, Ard Biesheuvel, Ondrej Mosnacek, Eric Biggers

Wiring the SIMD code into the generic driver has the unfortunate side
effect that the tcrypt testing code cannot distinguish the two, and will
therefore not use the generic code to fuzz test the SIMD code, as it
does for other algorithms.

So let's refactor the code a bit so we can register two implementations:
aegis128-generic and aegis128-simd.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
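
Illustrative note, not part of the patch: both algorithms are registered
under the same cra_name with different cra_priority values, so a lookup
by "aegis128" should resolve to aegis128-simd when it has been
registered, while aegis128-generic remains reachable by its driver name.
The toy registry below sketches that selection rule and the
unwind-on-failure order used in module_init. toy_alg, toy_register(),
toy_unregister(), toy_lookup() and toy_init() are hypothetical and are
not crypto API functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_alg {
	const char *name;		/* plays the role of cra_name */
	const char *driver_name;	/* plays the role of cra_driver_name */
	int priority;			/* plays the role of cra_priority */
};

#define TOY_MAX_ALGS 8
static const struct toy_alg *toy_algs[TOY_MAX_ALGS];

static int toy_register(const struct toy_alg *alg)
{
	size_t i;

	assert(alg && alg->name && alg->driver_name);
	for (i = 0; i < TOY_MAX_ALGS; i++) {
		if (!toy_algs[i]) {
			toy_algs[i] = alg;
			return 0;
		}
	}
	return -1;			/* table full */
}

static void toy_unregister(const struct toy_alg *alg)
{
	size_t i;

	for (i = 0; i < TOY_MAX_ALGS; i++)
		if (toy_algs[i] == alg)
			toy_algs[i] = NULL;
}

/* An exact driver name wins; otherwise the highest priority for the name. */
static const struct toy_alg *toy_lookup(const char *name)
{
	const struct toy_alg *best = NULL;
	size_t i;

	for (i = 0; i < TOY_MAX_ALGS; i++) {
		const struct toy_alg *a = toy_algs[i];

		if (!a)
			continue;
		if (!strcmp(a->driver_name, name))
			return a;
		if (!strcmp(a->name, name) &&
		    (!best || a->priority > best->priority))
			best = a;
	}
	return best;
}

static const struct toy_alg toy_generic = { "aegis128", "aegis128-generic", 100 };
static const struct toy_alg toy_simd    = { "aegis128", "aegis128-simd",    200 };

/* Same shape as module_init: generic first, roll back if SIMD fails. */
static int toy_init(int have_simd)
{
	int ret = toy_register(&toy_generic);

	if (ret)
		return ret;
	if (have_simd) {
		ret = toy_register(&toy_simd);
		if (ret) {
			toy_unregister(&toy_generic);
			return ret;
		}
	}
	return 0;
}
```
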
 crypto/aegis128-core.c | 220 +++++++++++++-------
 1 file changed, 143 insertions(+), 77 deletions(-)

diff --git a/crypto/aegis128-core.c b/crypto/aegis128-core.c
index 859c7b905618..2b05f79475d3 100644
--- a/crypto/aegis128-core.c
+++ b/crypto/aegis128-core.c
@@ -86,9 +86,10 @@ static void crypto_aegis128_update(struct aegis_state *state)
 }
 
 static void crypto_aegis128_update_a(struct aegis_state *state,
-				     const union aegis_block *msg)
+				     const union aegis_block *msg,
+				     bool do_simd)
 {
-	if (aegis128_do_simd()) {
+	if (do_simd) {
 		crypto_aegis128_update_simd(state, msg);
 		return;
 	}
@@ -97,9 +98,10 @@ static void crypto_aegis128_update_a(struct aegis_state *state,
 	crypto_aegis_block_xor(&state->blocks[0], msg);
 }
 
-static void crypto_aegis128_update_u(struct aegis_state *state, const void *msg)
+static void crypto_aegis128_update_u(struct aegis_state *state, const void *msg,
+				     bool do_simd)
 {
-	if (aegis128_do_simd()) {
+	if (do_simd) {
 		crypto_aegis128_update_simd(state, msg);
 		return;
 	}
@@ -128,27 +130,28 @@ static void crypto_aegis128_init(struct aegis_state *state,
 	crypto_aegis_block_xor(&state->blocks[4], &crypto_aegis_const[1]);
 
 	for (i = 0; i < 5; i++) {
-		crypto_aegis128_update_a(state, key);
-		crypto_aegis128_update_a(state, &key_iv);
+		crypto_aegis128_update_a(state, key, false);
+		crypto_aegis128_update_a(state, &key_iv, false);
 	}
 }
 
 static void crypto_aegis128_ad(struct aegis_state *state,
-			       const u8 *src, unsigned int size)
+			       const u8 *src, unsigned int size,
+			       bool do_simd)
 {
 	if (AEGIS_ALIGNED(src)) {
 		const union aegis_block *src_blk =
 				(const union aegis_block *)src;
 
 		while (size >= AEGIS_BLOCK_SIZE) {
-			crypto_aegis128_update_a(state, src_blk);
+			crypto_aegis128_update_a(state, src_blk, do_simd);
 
 			size -= AEGIS_BLOCK_SIZE;
 			src_blk++;
 		}
 	} else {
 		while (size >= AEGIS_BLOCK_SIZE) {
-			crypto_aegis128_update_u(state, src);
+			crypto_aegis128_update_u(state, src, do_simd);
 
 			size -= AEGIS_BLOCK_SIZE;
 			src += AEGIS_BLOCK_SIZE;
@@ -180,7 +183,7 @@ static void crypto_aegis128_encrypt_chunk(struct aegis_state *state, u8 *dst,
 			crypto_aegis_block_xor(&tmp, &state->blocks[1]);
 			crypto_aegis_block_xor(&tmp, src_blk);
 
-			crypto_aegis128_update_a(state, src_blk);
+			crypto_aegis128_update_a(state, src_blk, false);
 
 			*dst_blk = tmp;
 
@@ -196,7 +199,7 @@ static void crypto_aegis128_encrypt_chunk(struct aegis_state *state, u8 *dst,
 			crypto_aegis_block_xor(&tmp, &state->blocks[1]);
 			crypto_xor(tmp.bytes, src, AEGIS_BLOCK_SIZE);
 
-			crypto_aegis128_update_u(state, src);
+			crypto_aegis128_update_u(state, src, false);
 
 			memcpy(dst, tmp.bytes, AEGIS_BLOCK_SIZE);
 
@@ -215,7 +218,7 @@ static void crypto_aegis128_encrypt_chunk(struct aegis_state *state, u8 *dst,
 		crypto_aegis_block_xor(&tmp, &state->blocks[4]);
 		crypto_aegis_block_xor(&tmp, &state->blocks[1]);
 
-		crypto_aegis128_update_a(state, &msg);
+		crypto_aegis128_update_a(state, &msg, false);
 
 		crypto_aegis_block_xor(&msg, &tmp);
 
@@ -241,7 +244,7 @@ static void crypto_aegis128_decrypt_chunk(struct aegis_state *state, u8 *dst,
 			crypto_aegis_block_xor(&tmp, &state->blocks[1]);
 			crypto_aegis_block_xor(&tmp, src_blk);
 
-			crypto_aegis128_update_a(state, &tmp);
+			crypto_aegis128_update_a(state, &tmp, false);
 
 			*dst_blk = tmp;
 
@@ -257,7 +260,7 @@ static void crypto_aegis128_decrypt_chunk(struct aegis_state *state, u8 *dst,
 			crypto_aegis_block_xor(&tmp, &state->blocks[1]);
 			crypto_xor(tmp.bytes, src, AEGIS_BLOCK_SIZE);
 
-			crypto_aegis128_update_a(state, &tmp);
+			crypto_aegis128_update_a(state, &tmp, false);
 
 			memcpy(dst, tmp.bytes, AEGIS_BLOCK_SIZE);
 
@@ -279,7 +282,7 @@ static void crypto_aegis128_decrypt_chunk(struct aegis_state *state, u8 *dst,
 
 		memset(msg.bytes + size, 0, AEGIS_BLOCK_SIZE - size);
 
-		crypto_aegis128_update_a(state, &msg);
+		crypto_aegis128_update_a(state, &msg, false);
 
 		memcpy(dst, msg.bytes, size);
 	}
@@ -287,7 +290,8 @@ static void crypto_aegis128_decrypt_chunk(struct aegis_state *state, u8 *dst,
 
 static void crypto_aegis128_process_ad(struct aegis_state *state,
 				       struct scatterlist *sg_src,
-				       unsigned int assoclen)
+				       unsigned int assoclen,
+				       bool do_simd)
 {
 	struct scatter_walk walk;
 	union aegis_block buf;
@@ -304,13 +308,13 @@ static void crypto_aegis128_process_ad(struct aegis_state *state,
 			if (pos > 0) {
 				unsigned int fill = AEGIS_BLOCK_SIZE - pos;
 				memcpy(buf.bytes + pos, src, fill);
-				crypto_aegis128_update_a(state, &buf);
+				crypto_aegis128_update_a(state, &buf, do_simd);
 				pos = 0;
 				left -= fill;
 				src += fill;
 			}
 
-			crypto_aegis128_ad(state, src, left);
+			crypto_aegis128_ad(state, src, left, do_simd);
 			src += left & ~(AEGIS_BLOCK_SIZE - 1);
 			left &= AEGIS_BLOCK_SIZE - 1;
 		}
@@ -326,7 +330,7 @@ static void crypto_aegis128_process_ad(struct aegis_state *state,
 
 	if (pos > 0) {
 		memset(buf.bytes + pos, 0, AEGIS_BLOCK_SIZE - pos);
-		crypto_aegis128_update_a(state, &buf);
+		crypto_aegis128_update_a(state, &buf, do_simd);
 	}
 }
 
@@ -368,7 +372,7 @@ static void crypto_aegis128_final(struct aegis_state *state,
 	crypto_aegis_block_xor(&tmp, &state->blocks[3]);
 
 	for (i = 0; i < 7; i++)
-		crypto_aegis128_update_a(state, &tmp);
+		crypto_aegis128_update_a(state, &tmp, false);
 
 	for (i = 0; i < AEGIS128_STATE_BLOCKS; i++)
 		crypto_aegis_block_xor(tag_xor, &state->blocks[i]);
@@ -396,7 +400,7 @@ static int crypto_aegis128_setauthsize(struct crypto_aead *tfm,
 	return 0;
 }
 
-static int crypto_aegis128_encrypt(struct aead_request *req)
+static int crypto_aegis128_encrypt_generic(struct aead_request *req)
 {
 	struct crypto_aead *tfm = crypto_aead_reqtfm(req);
 	union aegis_block tag = {};
@@ -407,27 +411,18 @@ static int crypto_aegis128_encrypt(struct aead_request *req)
 	struct aegis_state state;
 
 	skcipher_walk_aead_encrypt(&walk, req, false);
-	if (aegis128_do_simd()) {
-		crypto_aegis128_init_simd(&state, &ctx->key, req->iv);
-		crypto_aegis128_process_ad(&state, req->src, req->assoclen);
-		crypto_aegis128_process_crypt(&state, &walk,
-					      crypto_aegis128_encrypt_chunk_simd);
-		crypto_aegis128_final_simd(&state, &tag, req->assoclen,
-					   cryptlen, 0);
-	} else {
-		crypto_aegis128_init(&state, &ctx->key, req->iv);
-		crypto_aegis128_process_ad(&state, req->src, req->assoclen);
-		crypto_aegis128_process_crypt(&state, &walk,
-					      crypto_aegis128_encrypt_chunk);
-		crypto_aegis128_final(&state, &tag, req->assoclen, cryptlen);
-	}
+	crypto_aegis128_init(&state, &ctx->key, req->iv);
+	crypto_aegis128_process_ad(&state, req->src, req->assoclen, false);
+	crypto_aegis128_process_crypt(&state, &walk,
+				      crypto_aegis128_encrypt_chunk);
+	crypto_aegis128_final(&state, &tag, req->assoclen, cryptlen);
 
 	scatterwalk_map_and_copy(tag.bytes, req->dst, req->assoclen + cryptlen,
 				 authsize, 1);
 	return 0;
 }
 
-static int crypto_aegis128_decrypt(struct aead_request *req)
+static int crypto_aegis128_decrypt_generic(struct aead_request *req)
 {
 	static const u8 zeros[AEGIS128_MAX_AUTH_SIZE] = {};
 	struct crypto_aead *tfm = crypto_aead_reqtfm(req);
@@ -442,27 +437,11 @@ static int crypto_aegis128_decrypt(struct aead_request *req)
 				 authsize, 0);
 
 	skcipher_walk_aead_decrypt(&walk, req, false);
-	if (aegis128_do_simd()) {
-		crypto_aegis128_init_simd(&state, &ctx->key, req->iv);
-		crypto_aegis128_process_ad(&state, req->src, req->assoclen);
-		crypto_aegis128_process_crypt(&state, &walk,
-					      crypto_aegis128_decrypt_chunk_simd);
-		if (unlikely(crypto_aegis128_final_simd(&state, &tag,
-							req->assoclen,
-							cryptlen, authsize))) {
-			skcipher_walk_aead_decrypt(&walk, req, false);
-			crypto_aegis128_process_crypt(NULL, req, &walk,
-						      crypto_aegis128_wipe_chunk);
-			return -EBADMSG;
-		}
-		return 0;
-	} else {
-		crypto_aegis128_init(&state, &ctx->key, req->iv);
-		crypto_aegis128_process_ad(&state, req->src, req->assoclen);
-		crypto_aegis128_process_crypt(&state, &walk,
-					      crypto_aegis128_decrypt_chunk);
-		crypto_aegis128_final(&state, &tag, req->assoclen, cryptlen);
-	}
+	crypto_aegis128_init(&state, &ctx->key, req->iv);
+	crypto_aegis128_process_ad(&state, req->src, req->assoclen, false);
+	crypto_aegis128_process_crypt(&state, &walk,
+				      crypto_aegis128_decrypt_chunk);
+	crypto_aegis128_final(&state, &tag, req->assoclen, cryptlen);
 
 	if (unlikely(crypto_memneq(tag.bytes, zeros, authsize))) {
 		/*
@@ -482,42 +461,128 @@ static int crypto_aegis128_decrypt(struct aead_request *req)
 	return 0;
 }
 
-static struct aead_alg crypto_aegis128_alg = {
-	.setkey = crypto_aegis128_setkey,
-	.setauthsize = crypto_aegis128_setauthsize,
-	.encrypt = crypto_aegis128_encrypt,
-	.decrypt = crypto_aegis128_decrypt,
+static int crypto_aegis128_encrypt_simd(struct aead_request *req)
+{
+	struct crypto_aead *tfm = crypto_aead_reqtfm(req);
+	union aegis_block tag = {};
+	unsigned int authsize = crypto_aead_authsize(tfm);
+	struct aegis_ctx *ctx = crypto_aead_ctx(tfm);
+	unsigned int cryptlen = req->cryptlen;
+	struct skcipher_walk walk;
+	struct aegis_state state;
 
-	.ivsize = AEGIS128_NONCE_SIZE,
-	.maxauthsize = AEGIS128_MAX_AUTH_SIZE,
-	.chunksize = AEGIS_BLOCK_SIZE,
+	if (!aegis128_do_simd())
+		return crypto_aegis128_encrypt_generic(req);
 
-	.base = {
-		.cra_blocksize = 1,
-		.cra_ctxsize = sizeof(struct aegis_ctx),
-		.cra_alignmask = 0,
+	skcipher_walk_aead_encrypt(&walk, req, false);
+	crypto_aegis128_init_simd(&state, &ctx->key, req->iv);
+	crypto_aegis128_process_ad(&state, req->src, req->assoclen, true);
+	crypto_aegis128_process_crypt(&state, &walk,
+				      crypto_aegis128_encrypt_chunk_simd);
+	crypto_aegis128_final_simd(&state, &tag, req->assoclen, cryptlen, 0);
 
-		.cra_priority = 100,
+	scatterwalk_map_and_copy(tag.bytes, req->dst, req->assoclen + cryptlen,
+				 authsize, 1);
+	return 0;
+}
 
-		.cra_name = "aegis128",
-		.cra_driver_name = "aegis128-generic",
+static int crypto_aegis128_decrypt_simd(struct aead_request *req)
+{
+	struct crypto_aead *tfm = crypto_aead_reqtfm(req);
+	union aegis_block tag;
+	unsigned int authsize = crypto_aead_authsize(tfm);
+	unsigned int cryptlen = req->cryptlen - authsize;
+	struct aegis_ctx *ctx = crypto_aead_ctx(tfm);
+	struct skcipher_walk walk;
+	struct aegis_state state;
 
-		.cra_module = THIS_MODULE,
+	if (!aegis128_do_simd())
+		return crypto_aegis128_decrypt_generic(req);
+
+	scatterwalk_map_and_copy(tag.bytes, req->src, req->assoclen + cryptlen,
+				 authsize, 0);
+
+	skcipher_walk_aead_decrypt(&walk, req, false);
+	crypto_aegis128_init_simd(&state, &ctx->key, req->iv);
+	crypto_aegis128_process_ad(&state, req->src, req->assoclen, true);
+	crypto_aegis128_process_crypt(&state, &walk,
+				      crypto_aegis128_decrypt_chunk_simd);
+
+	if (unlikely(crypto_aegis128_final_simd(&state, &tag, req->assoclen,
+						cryptlen, authsize))) {
+		skcipher_walk_aead_decrypt(&walk, req, false);
+		crypto_aegis128_process_crypt(NULL, &walk,
+					      crypto_aegis128_wipe_chunk);
+		return -EBADMSG;
 	}
+	return 0;
+}
+
+static struct aead_alg crypto_aegis128_alg_generic = {
+	.setkey			= crypto_aegis128_setkey,
+	.setauthsize		= crypto_aegis128_setauthsize,
+	.encrypt		= crypto_aegis128_encrypt_generic,
+	.decrypt		= crypto_aegis128_decrypt_generic,
+
+	.ivsize			= AEGIS128_NONCE_SIZE,
+	.maxauthsize		= AEGIS128_MAX_AUTH_SIZE,
+	.chunksize		= AEGIS_BLOCK_SIZE,
+
+	.base.cra_blocksize	= 1,
+	.base.cra_ctxsize	= sizeof(struct aegis_ctx),
+	.base.cra_alignmask	= 0,
+	.base.cra_priority	= 100,
+	.base.cra_name		= "aegis128",
+	.base.cra_driver_name	= "aegis128-generic",
+	.base.cra_module	= THIS_MODULE,
+};
+
+static struct aead_alg crypto_aegis128_alg_simd = {
+	.setkey			= crypto_aegis128_setkey,
+	.setauthsize		= crypto_aegis128_setauthsize,
+	.encrypt		= crypto_aegis128_encrypt_simd,
+	.decrypt		= crypto_aegis128_decrypt_simd,
+
+	.ivsize			= AEGIS128_NONCE_SIZE,
+	.maxauthsize		= AEGIS128_MAX_AUTH_SIZE,
+	.chunksize		= AEGIS_BLOCK_SIZE,
+
+	.base.cra_blocksize	= 1,
+	.base.cra_ctxsize	= sizeof(struct aegis_ctx),
+	.base.cra_alignmask	= 0,
+	.base.cra_priority	= 200,
+	.base.cra_name		= "aegis128",
+	.base.cra_driver_name	= "aegis128-simd",
+	.base.cra_module	= THIS_MODULE,
 };
 
 static int __init crypto_aegis128_module_init(void)
 {
+	int ret;
+
+	ret = crypto_register_aead(&crypto_aegis128_alg_generic);
+	if (ret)
+		return ret;
+
 	if (IS_ENABLED(CONFIG_CRYPTO_AEGIS128_SIMD) &&
-	    crypto_aegis128_have_simd())
+	    crypto_aegis128_have_simd()) {
+		ret = crypto_register_aead(&crypto_aegis128_alg_simd);
+		if (ret) {
+			crypto_unregister_aead(&crypto_aegis128_alg_generic);
+			return ret;
+		}
 		static_branch_enable(&have_simd);
-
-	return crypto_register_aead(&crypto_aegis128_alg);
+	}
+	return 0;
 }
 
 static void __exit crypto_aegis128_module_exit(void)
 {
-	crypto_unregister_aead(&crypto_aegis128_alg);
+	if (IS_ENABLED(CONFIG_CRYPTO_AEGIS128_SIMD) &&
+	    crypto_aegis128_have_simd())
+		crypto_unregister_aead(&crypto_aegis128_alg_simd);
+
+	crypto_unregister_aead(&crypto_aegis128_alg_generic);
 }
 
 subsys_initcall(crypto_aegis128_module_init);
@@ -528,3 +593,4 @@ MODULE_AUTHOR("Ondrej Mosnacek <omosnacek@gmail.com>");
 MODULE_DESCRIPTION("AEGIS-128 AEAD algorithm");
 MODULE_ALIAS_CRYPTO("aegis128");
 MODULE_ALIAS_CRYPTO("aegis128-generic");
+MODULE_ALIAS_CRYPTO("aegis128-simd");
-- 
2.17.1
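
To illustrate what registering two drivers makes possible, here is a
minimal user-space sketch of the idea of checking an accelerated
implementation against a reference one on random inputs. It is not the
kernel test code. The XOR checksum merely stands in for the cipher, and
toy_simd_usable, toy_xorsum_generic(), toy_xorsum_fast() and toy_fuzz()
are hypothetical names. The fast variant falls back to the generic one
when acceleration is unavailable, mirroring the shape of the _simd entry
points in the patch above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

static bool toy_simd_usable = true;	/* stands in for aegis128_do_simd() */

/* Reference implementation: one byte at a time. */
static uint8_t toy_xorsum_generic(const uint8_t *buf, size_t len)
{
	uint8_t acc = 0;
	size_t i;

	for (i = 0; i < len; i++)
		acc ^= buf[i];
	return acc;
}

/* "Accelerated" implementation: eight bytes at a time, then a byte tail. */
static uint8_t toy_xorsum_fast(const uint8_t *buf, size_t len)
{
	uint64_t acc = 0, w;
	uint8_t out = 0;
	size_t i;

	if (!toy_simd_usable)		/* fall back, as the _simd paths do */
		return toy_xorsum_generic(buf, len);

	for (i = 0; i + 8 <= len; i += 8) {
		memcpy(&w, buf + i, 8);
		acc ^= w;
	}
	for (; i < len; i++)		/* leftover bytes */
		out ^= buf[i];
	for (; acc; acc >>= 8)		/* fold the 64-bit accumulator */
		out ^= (uint8_t)acc;
	return out;
}

/* Feed both implementations the same random inputs; count disagreements. */
static unsigned int toy_fuzz(unsigned int rounds, unsigned int seed)
{
	uint8_t buf[64];
	unsigned int bad = 0, r;
	size_t len, i;

	srand(seed);
	for (r = 0; r < rounds; r++) {
		len = (size_t)rand() % (sizeof(buf) + 1);
		assert(len <= sizeof(buf));
		for (i = 0; i < len; i++)
			buf[i] = (uint8_t)rand();
		if (toy_xorsum_fast(buf, len) != toy_xorsum_generic(buf, len))
			bad++;
	}
	return bad;
}
```

A nonzero return from toy_fuzz() would mean the two implementations
disagree on some input, which is the kind of discrepancy the comparison
is meant to surface.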


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 4/4] crypto: aegis128 - expose SIMD code path as separate driver
  2020-11-17 13:32   ` Ard Biesheuvel
@ 2020-11-20  8:55     ` Ondrej Mosnáček
  -1 siblings, 0 replies; 22+ messages in thread
From: Ondrej Mosnáček @ 2020-11-20  8:55 UTC (permalink / raw)
  To: Ard Biesheuvel; +Cc: linux-crypto, Herbert Xu, linux-arm-kernel, Eric Biggers

On Tue, 17 Nov 2020 at 14:32, Ard Biesheuvel <ardb@kernel.org> wrote:
> Wiring the SIMD code into the generic driver has the unfortunate side
> effect that the tcrypt testing code cannot distinguish them, and will
> therefore not use the latter to fuzz test the former, as it does for
> other algorithms.
>
> So let's refactor the code a bit so we can register two implementations:
> aegis128-generic and aegis128-simd.
>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>

Reviewed-by: Ondrej Mosnacek <omosnacek@gmail.com>
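
For readers following along, the effect of registering two implementations under one algorithm name can be modelled in user space. This is only a sketch under stated assumptions: every toy_* name is hypothetical and not part of the kernel crypto API, and the lookup rule (an exact driver-name match wins, otherwise the highest priority for the algorithm name) is a simplified model of how the kernel resolves names. The init routine mirrors the ordering in the patch: generic first, then SIMD, undoing the first registration if the second fails.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct aead_alg. */
struct toy_alg {
	const char *name;        /* algorithm name, shared by all drivers */
	const char *driver_name; /* unique per implementation */
	int priority;            /* higher wins a lookup by algorithm name */
};

#define TOY_MAX_ALGS 8
static struct toy_alg *toy_registry[TOY_MAX_ALGS];

/* Returns 0 on success, -1 if the table is full or the driver name is taken. */
static int toy_register(struct toy_alg *alg)
{
	size_t i, free_slot = TOY_MAX_ALGS;

	for (i = 0; i < TOY_MAX_ALGS; i++) {
		if (!toy_registry[i]) {
			if (free_slot == TOY_MAX_ALGS)
				free_slot = i;
		} else if (!strcmp(toy_registry[i]->driver_name,
				   alg->driver_name)) {
			return -1;
		}
	}
	if (free_slot == TOY_MAX_ALGS)
		return -1;
	toy_registry[free_slot] = alg;
	return 0;
}

static void toy_unregister(struct toy_alg *alg)
{
	size_t i;

	for (i = 0; i < TOY_MAX_ALGS; i++)
		if (toy_registry[i] == alg)
			toy_registry[i] = NULL;
}

/* Exact driver-name match wins; otherwise pick the highest priority. */
static struct toy_alg *toy_lookup(const char *name)
{
	struct toy_alg *best = NULL;
	size_t i;

	for (i = 0; i < TOY_MAX_ALGS; i++) {
		struct toy_alg *a = toy_registry[i];

		if (!a)
			continue;
		if (!strcmp(a->driver_name, name))
			return a;
		if (!strcmp(a->name, name) &&
		    (!best || a->priority > best->priority))
			best = a;
	}
	return best;
}

static struct toy_alg toy_generic = { "aegis128", "aegis128-generic", 100 };
static struct toy_alg toy_simd    = { "aegis128", "aegis128-simd",    200 };

/* Same shape as the module init in the patch: unwind on failure. */
static int toy_init(int have_simd)
{
	int ret = toy_register(&toy_generic);

	if (ret)
		return ret;
	if (have_simd) {
		ret = toy_register(&toy_simd);
		if (ret) {
			toy_unregister(&toy_generic);
			return ret;
		}
	}
	return 0;
}
```

With both entries registered, a lookup of "aegis128" resolves to the SIMD entry, while "aegis128-generic" still reaches the generic one by driver name. That separate handle is what lets a test harness compare one implementation against the other.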

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 0/4] crypto: aegis128 enhancements
  2020-11-17 13:32 ` Ard Biesheuvel
@ 2020-11-27  6:24   ` Herbert Xu
  -1 siblings, 0 replies; 22+ messages in thread
From: Herbert Xu @ 2020-11-27  6:24 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-crypto, linux-arm-kernel, Ondrej Mosnacek, Eric Biggers

On Tue, Nov 17, 2020 at 02:32:10PM +0100, Ard Biesheuvel wrote:
> This series supersedes [0] '[PATCH] crypto: aegis128/neon - optimize tail
> block handling', which is included as patch #3 here, but hasn't been
> modified substantially.
> 
> Patch #1 should probably go to -stable, even though aegis128 does not appear
> to be widely used.
> 
> Patches #2 and #3 improve the SIMD code paths.
> 
> Patch #4 enables fuzz testing for the SIMD code by registering the generic
> code as a separate driver if the SIMD code path is enabled.
> 
> Changes since v2:
> - add Ondrej's ack to #1
> - fix an issue spotted by Ondrej in #4 where the generic code path would still
>   use some of the SIMD helpers
> 
> Cc: Ondrej Mosnacek <omosnacek@gmail.com>
> Cc: Eric Biggers <ebiggers@kernel.org>
> 
> [0] https://lore.kernel.org/linux-crypto/20201107195516.13952-1-ardb@kernel.org/
> 
> Ard Biesheuvel (4):
>   crypto: aegis128 - wipe plaintext and tag if decryption fails
>   crypto: aegis128/neon - optimize tail block handling
>   crypto: aegis128/neon - move final tag check to SIMD domain
>   crypto: aegis128 - expose SIMD code path as separate driver
> 
>  crypto/aegis128-core.c       | 245 ++++++++++++++------
>  crypto/aegis128-neon-inner.c | 122 ++++++++--
>  crypto/aegis128-neon.c       |  21 +-
>  3 files changed, 287 insertions(+), 101 deletions(-)

All applied.  Thanks.
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 0/4] crypto: aegis128 enhancements
  2020-11-17 13:32 ` Ard Biesheuvel
@ 2020-11-30  9:37   ` Geert Uytterhoeven
  -1 siblings, 0 replies; 22+ messages in thread
From: Geert Uytterhoeven @ 2020-11-30  9:37 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Linux Crypto Mailing List, Eric Biggers, Ondrej Mosnacek,
	Herbert Xu, Linux ARM, Linux-Next

Hi Ard,

On Tue, Nov 17, 2020 at 2:38 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> This series supersedes [0] '[PATCH] crypto: aegis128/neon - optimize tail
> block handling', which is included as patch #3 here, but hasn't been
> modified substantially.
>
> Patch #1 should probably go to -stable, even though aegis128 does not appear
> to be widely used.
>
> Patches #2 and #3 improve the SIMD code paths.
>
> Patch #4 enables fuzz testing for the SIMD code by registering the generic
> code as a separate driver if the SIMD code path is enabled.
>
> Changes since v2:
> - add Ondrej's ack to #1
> - fix an issue spotted by Ondrej in #4 where the generic code path would still
>   use some of the SIMD helpers
>
> Cc: Ondrej Mosnacek <omosnacek@gmail.com>
> Cc: Eric Biggers <ebiggers@kernel.org>
>
> [0] https://lore.kernel.org/linux-crypto/20201107195516.13952-1-ardb@kernel.org/
>
> Ard Biesheuvel (4):
>   crypto: aegis128 - wipe plaintext and tag if decryption fails
>   crypto: aegis128/neon - optimize tail block handling
>   crypto: aegis128/neon - move final tag check to SIMD domain

crypto/aegis128-core.c: In function ‘crypto_aegis128_decrypt’:
crypto/aegis128-core.c:454:40: error: passing argument 2 of ‘crypto_aegis128_process_crypt’ from incompatible pointer type [-Werror=incompatible-pointer-types]
  454 |    crypto_aegis128_process_crypt(NULL, req, &walk,
      |                                        ^~~
      |                                        |
      |                                        struct aead_request *
crypto/aegis128-core.c:335:29: note: expected ‘struct skcipher_walk *’ but argument is of type ‘struct aead_request *’
  335 |       struct skcipher_walk *walk,
      |       ~~~~~~~~~~~~~~~~~~~~~~^~~~
crypto/aegis128-core.c:454:45: error: passing argument 3 of ‘crypto_aegis128_process_crypt’ from incompatible pointer type [-Werror=incompatible-pointer-types]
  454 |    crypto_aegis128_process_crypt(NULL, req, &walk,
      |                                             ^~~~~
      |                                             |
      |                                             struct skcipher_walk *
crypto/aegis128-core.c:336:14: note: expected ‘void (*)(struct aegis_state *, u8 *, const u8 *, unsigned int)’ {aka ‘void (*)(struct aegis_state *, unsigned char *, const unsigned char *, unsigned int)’} but argument is of type ‘struct skcipher_walk *’
  336 |       void (*crypt)(struct aegis_state *state,
      |       ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  337 |              u8 *dst, const u8 *src,
      |              ~~~~~~~~~~~~~~~~~~~~~~~
  338 |              unsigned int size))
      |              ~~~~~~~~~~~~~~~~~~
crypto/aegis128-core.c:454:4: error: too many arguments to function ‘crypto_aegis128_process_crypt’
  454 |    crypto_aegis128_process_crypt(NULL, req, &walk,
      |    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
crypto/aegis128-core.c:334:5: note: declared here
  334 | int crypto_aegis128_process_crypt(struct aegis_state *state,
      |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1: some warnings being treated as errors
make[1]: *** [scripts/Makefile.build:283: crypto/aegis128-core.o] Error 1

>   crypto: aegis128 - expose SIMD code path as separate driver

Fixes the above, but causes

ERROR: modpost: "crypto_aegis128_update_simd" [crypto/aegis128.ko] undefined!

as reported by noreply@ellerman.id.au for m68k/defconfig and
m68k/sun3_defconfig.
(neon depends on arm).

>  crypto/aegis128-core.c       | 245 ++++++++++++++------
>  crypto/aegis128-neon-inner.c | 122 ++++++++--
>  crypto/aegis128-neon.c       |  21 +-
>  3 files changed, 287 insertions(+), 101 deletions(-)

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 0/4] crypto: aegis128 enhancements
  2020-11-30  9:37   ` Geert Uytterhoeven
@ 2020-11-30  9:43     ` Ard Biesheuvel
  -1 siblings, 0 replies; 22+ messages in thread
From: Ard Biesheuvel @ 2020-11-30  9:43 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Linux Crypto Mailing List, Eric Biggers, Ondrej Mosnacek,
	Herbert Xu, Linux ARM, Linux-Next

On Mon, 30 Nov 2020 at 10:37, Geert Uytterhoeven <geert@linux-m68k.org> wrote:
>
> Hi Ard,
>
> On Tue, Nov 17, 2020 at 2:38 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > This series supersedes [0] '[PATCH] crypto: aegis128/neon - optimize tail
> > block handling', which is included as patch #3 here, but hasn't been
> > modified substantially.
> >
> > Patch #1 should probably go to -stable, even though aegis128 does not appear
> > to be widely used.
> >
> > Patches #2 and #3 improve the SIMD code paths.
> >
> > Patch #4 enables fuzz testing for the SIMD code by registering the generic
> > code as a separate driver if the SIMD code path is enabled.
> >
> > Changes since v2:
> > - add Ondrej's ack to #1
> > - fix an issue spotted by Ondrej in #4 where the generic code path would still
> >   use some of the SIMD helpers
> >
> > Cc: Ondrej Mosnacek <omosnacek@gmail.com>
> > Cc: Eric Biggers <ebiggers@kernel.org>
> >
> > [0] https://lore.kernel.org/linux-crypto/20201107195516.13952-1-ardb@kernel.org/
> >
> > Ard Biesheuvel (4):
> >   crypto: aegis128 - wipe plaintext and tag if decryption fails
> >   crypto: aegis128/neon - optimize tail block handling
> >   crypto: aegis128/neon - move final tag check to SIMD domain
>
> crypto/aegis128-core.c: In function ‘crypto_aegis128_decrypt’:
> crypto/aegis128-core.c:454:40: error: passing argument 2 of
> ‘crypto_aegis128_process_crypt’ from incompatible pointer type
> [-Werror=incompatible-pointer-types]
>   454 |    crypto_aegis128_process_crypt(NULL, req, &walk,
>       |                                        ^~~
>       |                                        |
>       |                                        struct aead_request *
> crypto/aegis128-core.c:335:29: note: expected ‘struct skcipher_walk *’
> but argument is of type ‘struct aead_request *’
>   335 |       struct skcipher_walk *walk,
>       |       ~~~~~~~~~~~~~~~~~~~~~~^~~~
> crypto/aegis128-core.c:454:45: error: passing argument 3 of
> ‘crypto_aegis128_process_crypt’ from incompatible pointer type
> [-Werror=incompatible-pointer-types]
>   454 |    crypto_aegis128_process_crypt(NULL, req, &walk,
>       |                                             ^~~~~
>       |                                             |
>       |                                             struct skcipher_walk *
> crypto/aegis128-core.c:336:14: note: expected ‘void (*)(struct
> aegis_state *, u8 *, const u8 *, unsigned int)’ {aka ‘void (*)(struct
> aegis_state *, unsigned char *, const unsigned char *, unsigned int)’}
> but argument is of type ‘struct skcipher_walk *’
>   336 |       void (*crypt)(struct aegis_state *state,
>       |       ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>   337 |              u8 *dst, const u8 *src,
>       |              ~~~~~~~~~~~~~~~~~~~~~~~
>   338 |              unsigned int size))
>       |              ~~~~~~~~~~~~~~~~~~
> crypto/aegis128-core.c:454:4: error: too many arguments to function
> ‘crypto_aegis128_process_crypt’
>   454 |    crypto_aegis128_process_crypt(NULL, req, &walk,
>       |    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> crypto/aegis128-core.c:334:5: note: declared here
>   334 | int crypto_aegis128_process_crypt(struct aegis_state *state,
>       |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> cc1: some warnings being treated as errors
> make[1]: *** [scripts/Makefile.build:283: crypto/aegis128-core.o] Error 1
>
> >   crypto: aegis128 - expose SIMD code path as separate driver
>
> Fixes the above, but causes
>
> ERROR: modpost: "crypto_aegis128_update_simd" [crypto/aegis128.ko] undefined!
>
> as reported by noreply@ellerman.id.au for m68k/defconfig and
> m68k/sun3_defconfig.
> (neon depends on arm).
>

Thanks for the report.

It seems like GCC is not optimizing away calls to routines that are
unreachable. Which GCC version are you using?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 0/4] crypto: aegis128 enhancements
  2020-11-30  9:43     ` Ard Biesheuvel
@ 2020-11-30  9:45       ` Ard Biesheuvel
  -1 siblings, 0 replies; 22+ messages in thread
From: Ard Biesheuvel @ 2020-11-30  9:45 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Linux Crypto Mailing List, Eric Biggers, Ondrej Mosnacek,
	Herbert Xu, Linux ARM, Linux-Next

On Mon, 30 Nov 2020 at 10:43, Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Mon, 30 Nov 2020 at 10:37, Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> >
> > Hi Ard,
> >
> > On Tue, Nov 17, 2020 at 2:38 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > This series supersedes [0] '[PATCH] crypto: aegis128/neon - optimize tail
> > > block handling', which is included as patch #3 here, but hasn't been
> > > modified substantially.
> > >
> > > Patch #1 should probably go to -stable, even though aegis128 does not appear
> > > to be widely used.
> > >
> > > Patches #2 and #3 improve the SIMD code paths.
> > >
> > > Patch #4 enables fuzz testing for the SIMD code by registering the generic
> > > code as a separate driver if the SIMD code path is enabled.
> > >
> > > Changes since v2:
> > > - add Ondrej's ack to #1
> > > - fix an issue spotted by Ondrej in #4 where the generic code path would still
> > >   use some of the SIMD helpers
> > >
> > > Cc: Ondrej Mosnacek <omosnacek@gmail.com>
> > > Cc: Eric Biggers <ebiggers@kernel.org>
> > >
> > > [0] https://lore.kernel.org/linux-crypto/20201107195516.13952-1-ardb@kernel.org/
> > >
> > > Ard Biesheuvel (4):
> > >   crypto: aegis128 - wipe plaintext and tag if decryption fails
> > >   crypto: aegis128/neon - optimize tail block handling
> > >   crypto: aegis128/neon - move final tag check to SIMD domain
> >
> > crypto/aegis128-core.c: In function ‘crypto_aegis128_decrypt’:
> > crypto/aegis128-core.c:454:40: error: passing argument 2 of
> > ‘crypto_aegis128_process_crypt’ from incompatible pointer type
> > [-Werror=incompatible-pointer-types]
> >   454 |    crypto_aegis128_process_crypt(NULL, req, &walk,
> >       |                                        ^~~
> >       |                                        |
> >       |                                        struct aead_request *
> > crypto/aegis128-core.c:335:29: note: expected ‘struct skcipher_walk *’
> > but argument is of type ‘struct aead_request *’
> >   335 |       struct skcipher_walk *walk,
> >       |       ~~~~~~~~~~~~~~~~~~~~~~^~~~
> > crypto/aegis128-core.c:454:45: error: passing argument 3 of
> > ‘crypto_aegis128_process_crypt’ from incompatible pointer type
> > [-Werror=incompatible-pointer-types]
> >   454 |    crypto_aegis128_process_crypt(NULL, req, &walk,
> >       |                                             ^~~~~
> >       |                                             |
> >       |                                             struct skcipher_walk *
> > crypto/aegis128-core.c:336:14: note: expected ‘void (*)(struct
> > aegis_state *, u8 *, const u8 *, unsigned int)’ {aka ‘void (*)(struct
> > aegis_state *, unsigned char *, const unsigned char *, unsigned int)’}
> > but argument is of type ‘struct skcipher_walk *’
> >   336 |       void (*crypt)(struct aegis_state *state,
> >       |       ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >   337 |              u8 *dst, const u8 *src,
> >       |              ~~~~~~~~~~~~~~~~~~~~~~~
> >   338 |              unsigned int size))
> >       |              ~~~~~~~~~~~~~~~~~~
> > crypto/aegis128-core.c:454:4: error: too many arguments to function
> > ‘crypto_aegis128_process_crypt’
> >   454 |    crypto_aegis128_process_crypt(NULL, req, &walk,
> >       |    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > crypto/aegis128-core.c:334:5: note: declared here
> >   334 | int crypto_aegis128_process_crypt(struct aegis_state *state,
> >       |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > cc1: some warnings being treated as errors
> > make[1]: *** [scripts/Makefile.build:283: crypto/aegis128-core.o] Error 1
> >
> > >   crypto: aegis128 - expose SIMD code path as separate driver
> >
> > Fixes the above, but causes
> >
> > ERROR: modpost: "crypto_aegis128_update_simd" [crypto/aegis128.ko] undefined!
> >
> > as reported by noreply@ellerman.id.au for m68k/defconfig and
> > m68k/sun3_defconfig.
> > (neon depends on arm).
> >
>
> Thanks for the report.
>
> It seems like GCC is not optimizing away calls to routines that are
> unreachable. Which GCC version are you using?

Also, mind checking whether the below works around this?

diff --git a/crypto/aegis128-core.c b/crypto/aegis128-core.c
index 2b05f79475d3..89dc1c559689 100644
--- a/crypto/aegis128-core.c
+++ b/crypto/aegis128-core.c
@@ -89,7 +89,7 @@ static void crypto_aegis128_update_a(struct aegis_state *state,
                                     const union aegis_block *msg,
                                     bool do_simd)
 {
-       if (do_simd) {
+       if (IS_ENABLED(CONFIG_CRYPTO_AEGIS128_SIMD) && do_simd) {
                crypto_aegis128_update_simd(state, msg);
                return;
        }
@@ -101,7 +101,7 @@ static void crypto_aegis128_update_a(struct aegis_state *state,
 static void crypto_aegis128_update_u(struct aegis_state *state, const void *msg,
                                     bool do_simd)
 {
-       if (do_simd) {
+       if (IS_ENABLED(CONFIG_CRYPTO_AEGIS128_SIMD) && do_simd) {
                crypto_aegis128_update_simd(state, msg);
                return;
        }
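
The idea behind the workaround above can be sketched in isolation. This is a minimal user-space illustration, not the kernel code: TOY_HAVE_SIMD is a hypothetical macro standing in for IS_ENABLED(CONFIG_CRYPTO_AEGIS128_SIMD), and the toy_* functions are made-up names. In the kernel the SIMD routine would have no definition on a non-SIMD build, so the link only succeeds if the compiler drops the call. Here the routine is defined as a counting stub so the sketch links at any optimization level.

```c
#include <stdbool.h>

/* Assumption: stands in for IS_ENABLED(CONFIG_CRYPTO_AEGIS128_SIMD). */
#ifndef TOY_HAVE_SIMD
#define TOY_HAVE_SIMD 0
#endif

static int toy_simd_calls;
static int toy_generic_calls;

/* Stub for the routine that would be undefined on a non-SIMD build. */
static void toy_update_simd(void)
{
	toy_simd_calls++;
}

static void toy_update_generic(void)
{
	toy_generic_calls++;
}

static void toy_update(bool do_simd)
{
	/*
	 * With TOY_HAVE_SIMD == 0 the condition is a compile-time zero,
	 * so the branch is dead no matter what the compiler can or cannot
	 * prove about the run-time value of do_simd.
	 */
	if (TOY_HAVE_SIMD && do_simd) {
		toy_update_simd();
		return;
	}
	toy_update_generic();
}
```

Without the constant in the condition, removing the call depends on the compiler proving that do_simd is always false at every call site, which may not hold for every compiler version or inlining decision.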

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 0/4] crypto: aegis128 enhancements
@ 2020-11-30  9:45       ` Ard Biesheuvel
  0 siblings, 0 replies; 22+ messages in thread
From: Ard Biesheuvel @ 2020-11-30  9:45 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Herbert Xu, Ondrej Mosnacek, Eric Biggers, Linux-Next,
	Linux Crypto Mailing List, Linux ARM

On Mon, 30 Nov 2020 at 10:43, Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Mon, 30 Nov 2020 at 10:37, Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> >
> > Hi Ard,
> >
> > On Tue, Nov 17, 2020 at 2:38 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > This series supersedes [0] '[PATCH] crypto: aegis128/neon - optimize tail
> > > block handling', which is included as patch #3 here, but hasn't been
> > > modified substantially.
> > >
> > > Patch #1 should probably go to -stable, even though aegis128 does not appear
> > > to be widely used.
> > >
> > > Patches #2 and #3 improve the SIMD code paths.
> > >
> > > Patch #4 enables fuzz testing for the SIMD code by registering the generic
> > > code as a separate driver if the SIMD code path is enabled.
> > >
> > > Changes since v2:
> > > - add Ondrej's ack to #1
> > > - fix an issue spotted by Ondrej in #4 where the generic code path would still
> > >   use some of the SIMD helpers
> > >
> > > Cc: Ondrej Mosnacek <omosnacek@gmail.com>
> > > Cc: Eric Biggers <ebiggers@kernel.org>
> > >
> > > [0] https://lore.kernel.org/linux-crypto/20201107195516.13952-1-ardb@kernel.org/
> > >
> > > Ard Biesheuvel (4):
> > >   crypto: aegis128 - wipe plaintext and tag if decryption fails
> > >   crypto: aegis128/neon - optimize tail block handling
> > >   crypto: aegis128/neon - move final tag check to SIMD domain
> >
> > crypto/aegis128-core.c: In function ‘crypto_aegis128_decrypt’:
> > crypto/aegis128-core.c:454:40: error: passing argument 2 of
> > ‘crypto_aegis128_process_crypt’ from incompatible pointer type
> > [-Werror=incompatible-pointer-types]
> >   454 |    crypto_aegis128_process_crypt(NULL, req, &walk,
> >       |                                        ^~~
> >       |                                        |
> >       |                                        struct aead_request *
> > crypto/aegis128-core.c:335:29: note: expected ‘struct skcipher_walk *’
> > but argument is of type ‘struct aead_request *’
> >   335 |       struct skcipher_walk *walk,
> >       |       ~~~~~~~~~~~~~~~~~~~~~~^~~~
> > crypto/aegis128-core.c:454:45: error: passing argument 3 of
> > ‘crypto_aegis128_process_crypt’ from incompatible pointer type
> > [-Werror=incompatible-pointer-types]
> >   454 |    crypto_aegis128_process_crypt(NULL, req, &walk,
> >       |                                             ^~~~~
> >       |                                             |
> >       |                                             struct skcipher_walk *
> > crypto/aegis128-core.c:336:14: note: expected ‘void (*)(struct
> > aegis_state *, u8 *, const u8 *, unsigned int)’ {aka ‘void (*)(struct
> > aegis_state *, unsigned char *, const unsigned char *, unsigned int)’}
> > but argument is of type ‘struct skcipher_walk *’
> >   336 |       void (*crypt)(struct aegis_state *state,
> >       |       ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >   337 |              u8 *dst, const u8 *src,
> >       |              ~~~~~~~~~~~~~~~~~~~~~~~
> >   338 |              unsigned int size))
> >       |              ~~~~~~~~~~~~~~~~~~
> > crypto/aegis128-core.c:454:4: error: too many arguments to function
> > ‘crypto_aegis128_process_crypt’
> >   454 |    crypto_aegis128_process_crypt(NULL, req, &walk,
> >       |    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > crypto/aegis128-core.c:334:5: note: declared here
> >   334 | int crypto_aegis128_process_crypt(struct aegis_state *state,
> >       |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > cc1: some warnings being treated as errors
> > make[1]: *** [scripts/Makefile.build:283: crypto/aegis128-core.o] Error 1
> >
> > >   crypto: aegis128 - expose SIMD code path as separate driver
> >
> > Fixes the above, but causes
> >
> > ERROR: modpost: "crypto_aegis128_update_simd" [crypto/aegis128.ko] undefined!
> >
> > as reported by noreply@ellerman.id.au for m68k/defconfig and
> > m68k/sun3_defconfig.
> > (neon depends on arm).
> >
>
> Thanks for the report.
>
> It seems like GCC is not optimizing away calls to routines that are
> unreachable. Which GCC version are you using?

Also, mind checking whether the below works around this?

diff --git a/crypto/aegis128-core.c b/crypto/aegis128-core.c
index 2b05f79475d3..89dc1c559689 100644
--- a/crypto/aegis128-core.c
+++ b/crypto/aegis128-core.c
@@ -89,7 +89,7 @@ static void crypto_aegis128_update_a(struct aegis_state *state,
                                     const union aegis_block *msg,
                                     bool do_simd)
 {
-       if (do_simd) {
+       if (IS_ENABLED(CONFIG_CRYPTO_AEGIS128_SIMD) && do_simd) {
                crypto_aegis128_update_simd(state, msg);
                return;
        }
@@ -101,7 +101,7 @@ static void crypto_aegis128_update_a(struct aegis_state *state,
 static void crypto_aegis128_update_u(struct aegis_state *state, const void *msg,
                                     bool do_simd)
 {
-       if (do_simd) {
+       if (IS_ENABLED(CONFIG_CRYPTO_AEGIS128_SIMD) && do_simd) {
                crypto_aegis128_update_simd(state, msg);
                return;
        }
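
The effect of the guard above can be sketched outside the kernel. This is
only an illustration under stated assumptions: DEMO_SIMD_ENABLED is a
hypothetical stand-in for IS_ENABLED(CONFIG_CRYPTO_AEGIS128_SIMD), and the
demo_* names are invented for this sketch. When the constant is 0, the
condition is false regardless of do_simd, so the SIMD helper is never
reached and the compiler is free to drop the call. That matters on builds
such as m68k, where the modpost error above indicates the helper has no
definition.

```c
#include <stdbool.h>

/* Hypothetical stand-in for IS_ENABLED(CONFIG_CRYPTO_AEGIS128_SIMD).
 * 0 models a build without the NEON code (for example m68k). */
#define DEMO_SIMD_ENABLED 0

/* Invented for this sketch: counts which path handled each update. */
struct demo_state {
	unsigned int simd_calls;
	unsigned int generic_calls;
};

/* Stands in for crypto_aegis128_update_simd(). It is defined here so the
 * sketch links with any compiler and optimization level; in the failing
 * build from this thread the real helper was undefined. */
static void demo_update_simd(struct demo_state *s)
{
	s->simd_calls++;
}

/* Stands in for the generic (non-SIMD) update path. */
static void demo_update_generic(struct demo_state *s)
{
	s->generic_calls++;
}

/* Mirrors the shape of the patched helpers: the compile-time constant is
 * tested first, so with DEMO_SIMD_ENABLED == 0 the SIMD branch is dead
 * code even when the caller passes do_simd == true. */
static void demo_update(struct demo_state *s, bool do_simd)
{
	if (DEMO_SIMD_ENABLED && do_simd) {
		demo_update_simd(s);
		return;
	}
	demo_update_generic(s);
}
```

Calling demo_update() with do_simd set to true still takes the generic
path in this configuration; setting DEMO_SIMD_ENABLED to 1 restores the
runtime choice.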

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 0/4] crypto: aegis128 enhancements
  2020-11-30  9:45       ` Ard Biesheuvel
@ 2020-11-30 12:14         ` Geert Uytterhoeven
  -1 siblings, 0 replies; 22+ messages in thread
From: Geert Uytterhoeven @ 2020-11-30 12:14 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Linux Crypto Mailing List, Eric Biggers, Ondrej Mosnacek,
	Herbert Xu, Linux ARM, Linux-Next

Hi Ard,

On Mon, Nov 30, 2020 at 10:45 AM Ard Biesheuvel <ardb@kernel.org> wrote:
> On Mon, 30 Nov 2020 at 10:43, Ard Biesheuvel <ardb@kernel.org> wrote:
> > On Mon, 30 Nov 2020 at 10:37, Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > > On Tue, Nov 17, 2020 at 2:38 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > This series supersedes [0] '[PATCH] crypto: aegis128/neon - optimize tail
> > > > block handling', which is included as patch #3 here, but hasn't been
> > > > modified substantially.
> > > >
> > > > Patch #1 should probably go to -stable, even though aegis128 does not appear
> > > > to be widely used.
> > > >
> > > > Patches #2 and #3 improve the SIMD code paths.
> > > >
> > > > Patch #4 enables fuzz testing for the SIMD code by registering the generic
> > > > code as a separate driver if the SIMD code path is enabled.
> > > >
> > > > Changes since v2:
> > > > - add Ondrej's ack to #1
> > > > - fix an issue spotted by Ondrej in #4 where the generic code path would still
> > > >   use some of the SIMD helpers
> > > >
> > > > Cc: Ondrej Mosnacek <omosnacek@gmail.com>
> > > > Cc: Eric Biggers <ebiggers@kernel.org>
> > > >
> > > > [0] https://lore.kernel.org/linux-crypto/20201107195516.13952-1-ardb@kernel.org/
> > > >
> > > > Ard Biesheuvel (4):
> > > >   crypto: aegis128 - wipe plaintext and tag if decryption fails
> > > >   crypto: aegis128/neon - optimize tail block handling
> > > >   crypto: aegis128/neon - move final tag check to SIMD domain
> > >
> > > crypto/aegis128-core.c: In function ‘crypto_aegis128_decrypt’:
> > > crypto/aegis128-core.c:454:40: error: passing argument 2 of
> > > ‘crypto_aegis128_process_crypt’ from incompatible pointer type
> > > [-Werror=incompatible-pointer-types]
> > >   454 |    crypto_aegis128_process_crypt(NULL, req, &walk,
> > >       |                                        ^~~
> > >       |                                        |
> > >       |                                        struct aead_request *
> > > crypto/aegis128-core.c:335:29: note: expected ‘struct skcipher_walk *’
> > > but argument is of type ‘struct aead_request *’
> > >   335 |       struct skcipher_walk *walk,
> > >       |       ~~~~~~~~~~~~~~~~~~~~~~^~~~
> > > crypto/aegis128-core.c:454:45: error: passing argument 3 of
> > > ‘crypto_aegis128_process_crypt’ from incompatible pointer type
> > > [-Werror=incompatible-pointer-types]
> > >   454 |    crypto_aegis128_process_crypt(NULL, req, &walk,
> > >       |                                             ^~~~~
> > >       |                                             |
> > >       |                                             struct skcipher_walk *
> > > crypto/aegis128-core.c:336:14: note: expected ‘void (*)(struct
> > > aegis_state *, u8 *, const u8 *, unsigned int)’ {aka ‘void (*)(struct
> > > aegis_state *, unsigned char *, const unsigned char *, unsigned int)’}
> > > but argument is of type ‘struct skcipher_walk *’
> > >   336 |       void (*crypt)(struct aegis_state *state,
> > >       |       ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > >   337 |              u8 *dst, const u8 *src,
> > >       |              ~~~~~~~~~~~~~~~~~~~~~~~
> > >   338 |              unsigned int size))
> > >       |              ~~~~~~~~~~~~~~~~~~
> > > crypto/aegis128-core.c:454:4: error: too many arguments to function
> > > ‘crypto_aegis128_process_crypt’
> > >   454 |    crypto_aegis128_process_crypt(NULL, req, &walk,
> > >       |    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > > crypto/aegis128-core.c:334:5: note: declared here
> > >   334 | int crypto_aegis128_process_crypt(struct aegis_state *state,
> > >       |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > > cc1: some warnings being treated as errors
> > > make[1]: *** [scripts/Makefile.build:283: crypto/aegis128-core.o] Error 1
> > >
> > > >   crypto: aegis128 - expose SIMD code path as separate driver
> > >
> > > Fixes the above, but causes
> > >
> > > ERROR: modpost: "crypto_aegis128_update_simd" [crypto/aegis128.ko] undefined!
> > >
> > > as reported by noreply@ellerman.id.au for m68k/defconfig and
> > > m68k/sun3_defconfig.
> > > (neon depends on arm).
> > >
> >
> > Thanks for the report.
> >
> > It seems like GCC is not optimizing away calls to routines that are

The code is not unreachable. Both crypto_aegis128_encrypt_simd() and
crypto_aegis128_decrypt_simd() call crypto_aegis128_process_ad(..., true);

> > unreachable. Which GCC version are you using?

I'm using 9.3.0, Kisskb is using 8.1.0.

> Also, mind checking whether the below works around this?
>
> diff --git a/crypto/aegis128-core.c b/crypto/aegis128-core.c
> index 2b05f79475d3..89dc1c559689 100644
> --- a/crypto/aegis128-core.c
> +++ b/crypto/aegis128-core.c
> @@ -89,7 +89,7 @@ static void crypto_aegis128_update_a(struct aegis_state *state,
>                                      const union aegis_block *msg,
>                                      bool do_simd)
>  {
> -       if (do_simd) {
> +       if (IS_ENABLED(CONFIG_CRYPTO_AEGIS128_SIMD) && do_simd) {
>                 crypto_aegis128_update_simd(state, msg);
>                 return;
>         }
> @@ -101,7 +101,7 @@ static void crypto_aegis128_update_a(struct aegis_state *state,
>  static void crypto_aegis128_update_u(struct aegis_state *state, const void *msg,
>                                      bool do_simd)
>  {
> -       if (do_simd) {
> +       if (IS_ENABLED(CONFIG_CRYPTO_AEGIS128_SIMD) && do_simd) {
>                 crypto_aegis128_update_simd(state, msg);
>                 return;
>         }

Thanks, that fixes the build for me.

Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2020-11-30 12:16 UTC | newest]

Thread overview: 22+ messages
2020-11-17 13:32 [PATCH v3 0/4] crypto: aegis128 enhancements Ard Biesheuvel
2020-11-17 13:32 ` Ard Biesheuvel
2020-11-17 13:32 ` [PATCH v3 1/4] crypto: aegis128 - wipe plaintext and tag if decryption fails Ard Biesheuvel
2020-11-17 13:32   ` Ard Biesheuvel
2020-11-17 13:32 ` [PATCH v3 2/4] crypto: aegis128/neon - optimize tail block handling Ard Biesheuvel
2020-11-17 13:32   ` Ard Biesheuvel
2020-11-17 13:32 ` [PATCH v3 3/4] crypto: aegis128/neon - move final tag check to SIMD domain Ard Biesheuvel
2020-11-17 13:32   ` Ard Biesheuvel
2020-11-17 13:32 ` [PATCH v3 4/4] crypto: aegis128 - expose SIMD code path as separate driver Ard Biesheuvel
2020-11-17 13:32   ` Ard Biesheuvel
2020-11-20  8:55   ` Ondrej Mosnáček
2020-11-20  8:55     ` Ondrej Mosnáček
2020-11-27  6:24 ` [PATCH v3 0/4] crypto: aegis128 enhancements Herbert Xu
2020-11-27  6:24   ` Herbert Xu
2020-11-30  9:37 ` Geert Uytterhoeven
2020-11-30  9:37   ` Geert Uytterhoeven
2020-11-30  9:43   ` Ard Biesheuvel
2020-11-30  9:43     ` Ard Biesheuvel
2020-11-30  9:45     ` Ard Biesheuvel
2020-11-30  9:45       ` Ard Biesheuvel
2020-11-30 12:14       ` Geert Uytterhoeven
2020-11-30 12:14         ` Geert Uytterhoeven
