linux-crypto.vger.kernel.org archive mirror
* [PATCH v2 0/8] crypto: ARM/arm64 - big endian fixes
@ 2016-10-11 18:15 Ard Biesheuvel
  2016-10-11 18:15 ` [PATCH v2 1/8] crypto: arm64/aes-ce - fix for big endian Ard Biesheuvel
                   ` (10 more replies)
  0 siblings, 11 replies; 17+ messages in thread
From: Ard Biesheuvel @ 2016-10-11 18:15 UTC (permalink / raw)
  To: linux-crypto, linux-arm-kernel, herbert
  Cc: will.deacon, catalin.marinas, linux, Ard Biesheuvel

As it turns out, none of the accelerated crypto routines under arch/arm64/crypto
currently work, or have ever worked correctly when built for big endian. So this
series fixes all of them. This v2 now includes a similar fix for 32-bit ARM as
well, and an additional fix for XTS which escaped my attention before.

Each of these patches carries a fixes tag, and could be backported to stable.
However, for patches #1 and #5, the fixes tag denotes the oldest commit that the
fix is compatible with, not the patch that introduced the algorithm. This is due
to the fact that the key schedules are incompatible between generic AES and the
arm64 Crypto Extensions implementation (but only when building for big endian).
This is not a problem in practice, but it does mean that the AES-CCM and AES in
ECB/CBC/CTR/XTS mode implementations before v3.19 require a different fix, i.e.,
one that is compatible with the generic AES key schedule generation code (which
it currently no longer uses).

In any case, please apply with cc to stable.

Ard Biesheuvel (8):
  crypto: arm64/aes-ce - fix for big endian
  crypto: arm64/ghash-ce - fix for big endian
  crypto: arm64/sha1-ce - fix for big endian
  crypto: arm64/sha2-ce - fix for big endian
  crypto: arm64/aes-ccm-ce: fix for big endian
  crypto: arm64/aes-neon - fix for big endian
  crypto: arm64/aes-xts-ce: fix for big endian
  crypto: arm/aes-ce - fix for big endian

 arch/arm/crypto/aes-ce-glue.c       |  5 ++
 arch/arm64/crypto/aes-ce-ccm-core.S | 53 ++++++++++----------
 arch/arm64/crypto/aes-ce-cipher.c   | 25 +++++----
 arch/arm64/crypto/aes-ce.S          |  1 +
 arch/arm64/crypto/aes-modes.S       |  3 +-
 arch/arm64/crypto/aes-neon.S        | 25 +++++----
 arch/arm64/crypto/ghash-ce-core.S   |  6 +--
 arch/arm64/crypto/sha1-ce-core.S    |  4 +-
 arch/arm64/crypto/sha2-ce-core.S    |  4 +-
 9 files changed, 72 insertions(+), 54 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v2 1/8] crypto: arm64/aes-ce - fix for big endian
  2016-10-11 18:15 [PATCH v2 0/8] crypto: ARM/arm64 - big endian fixes Ard Biesheuvel
@ 2016-10-11 18:15 ` Ard Biesheuvel
  2016-10-11 18:15 ` [PATCH v2 2/8] crypto: arm64/ghash-ce " Ard Biesheuvel
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 17+ messages in thread
From: Ard Biesheuvel @ 2016-10-11 18:15 UTC (permalink / raw)
  To: linux-crypto, linux-arm-kernel, herbert
  Cc: will.deacon, catalin.marinas, linux, Ard Biesheuvel

The core AES cipher implementation that uses ARMv8 Crypto Extensions
instructions erroneously loads the round keys as 64-bit quantities,
which causes the algorithm to fail when built for big endian. In
addition, the key schedule generation routine fails to take endianness
into account when combining the input key with the round constants. So
fix both issues.
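
The load issue can be illustrated with a plain C model of the two AArch64 load
forms (a sketch of the architectural behaviour, not kernel code; the function
names are made up for illustration): `ld1 {v.16b}` always places the bytes in
the register in memory order, whereas on big endian `ld1 {v.2d}` byte-reverses
each 64-bit element, scrambling a key schedule that is stored as an array of
32-bit words.

```c
#include <stdint.h>
#include <string.h>

/* ld1 {v.16b}: register bytes match memory order on either endianness */
static void ld1_16b(uint8_t reg[16], const uint8_t mem[16])
{
	memcpy(reg, mem, 16);
}

/* ld1 {v.2d} as seen on a big-endian CPU: each 64-bit element is
 * byte-reversed relative to the little-endian register layout */
static void ld1_2d_be(uint8_t reg[16], const uint8_t mem[16])
{
	for (int lane = 0; lane < 2; lane++)
		for (int i = 0; i < 8; i++)
			reg[lane * 8 + i] = mem[lane * 8 + 7 - i];
}
```

On little endian the two forms happen to produce the same register contents,
which is why the bug went unnoticed until a big endian build was tested.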

Fixes: 12ac3efe74f8 ("arm64/crypto: use crypto instructions to generate AES key schedule")
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/crypto/aes-ce-cipher.c | 25 ++++++++++++--------
 1 file changed, 15 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/crypto/aes-ce-cipher.c b/arch/arm64/crypto/aes-ce-cipher.c
index f7bd9bf0bbb3..50d9fe11d0c8 100644
--- a/arch/arm64/crypto/aes-ce-cipher.c
+++ b/arch/arm64/crypto/aes-ce-cipher.c
@@ -47,24 +47,24 @@ static void aes_cipher_encrypt(struct crypto_tfm *tfm, u8 dst[], u8 const src[])
 	kernel_neon_begin_partial(4);
 
 	__asm__("	ld1	{v0.16b}, %[in]			;"
-		"	ld1	{v1.2d}, [%[key]], #16		;"
+		"	ld1	{v1.16b}, [%[key]], #16		;"
 		"	cmp	%w[rounds], #10			;"
 		"	bmi	0f				;"
 		"	bne	3f				;"
 		"	mov	v3.16b, v1.16b			;"
 		"	b	2f				;"
 		"0:	mov	v2.16b, v1.16b			;"
-		"	ld1	{v3.2d}, [%[key]], #16		;"
+		"	ld1	{v3.16b}, [%[key]], #16		;"
 		"1:	aese	v0.16b, v2.16b			;"
 		"	aesmc	v0.16b, v0.16b			;"
-		"2:	ld1	{v1.2d}, [%[key]], #16		;"
+		"2:	ld1	{v1.16b}, [%[key]], #16		;"
 		"	aese	v0.16b, v3.16b			;"
 		"	aesmc	v0.16b, v0.16b			;"
-		"3:	ld1	{v2.2d}, [%[key]], #16		;"
+		"3:	ld1	{v2.16b}, [%[key]], #16		;"
 		"	subs	%w[rounds], %w[rounds], #3	;"
 		"	aese	v0.16b, v1.16b			;"
 		"	aesmc	v0.16b, v0.16b			;"
-		"	ld1	{v3.2d}, [%[key]], #16		;"
+		"	ld1	{v3.16b}, [%[key]], #16		;"
 		"	bpl	1b				;"
 		"	aese	v0.16b, v2.16b			;"
 		"	eor	v0.16b, v0.16b, v3.16b		;"
@@ -92,24 +92,24 @@ static void aes_cipher_decrypt(struct crypto_tfm *tfm, u8 dst[], u8 const src[])
 	kernel_neon_begin_partial(4);
 
 	__asm__("	ld1	{v0.16b}, %[in]			;"
-		"	ld1	{v1.2d}, [%[key]], #16		;"
+		"	ld1	{v1.16b}, [%[key]], #16		;"
 		"	cmp	%w[rounds], #10			;"
 		"	bmi	0f				;"
 		"	bne	3f				;"
 		"	mov	v3.16b, v1.16b			;"
 		"	b	2f				;"
 		"0:	mov	v2.16b, v1.16b			;"
-		"	ld1	{v3.2d}, [%[key]], #16		;"
+		"	ld1	{v3.16b}, [%[key]], #16		;"
 		"1:	aesd	v0.16b, v2.16b			;"
 		"	aesimc	v0.16b, v0.16b			;"
-		"2:	ld1	{v1.2d}, [%[key]], #16		;"
+		"2:	ld1	{v1.16b}, [%[key]], #16		;"
 		"	aesd	v0.16b, v3.16b			;"
 		"	aesimc	v0.16b, v0.16b			;"
-		"3:	ld1	{v2.2d}, [%[key]], #16		;"
+		"3:	ld1	{v2.16b}, [%[key]], #16		;"
 		"	subs	%w[rounds], %w[rounds], #3	;"
 		"	aesd	v0.16b, v1.16b			;"
 		"	aesimc	v0.16b, v0.16b			;"
-		"	ld1	{v3.2d}, [%[key]], #16		;"
+		"	ld1	{v3.16b}, [%[key]], #16		;"
 		"	bpl	1b				;"
 		"	aesd	v0.16b, v2.16b			;"
 		"	eor	v0.16b, v0.16b, v3.16b		;"
@@ -173,7 +173,12 @@ int ce_aes_expandkey(struct crypto_aes_ctx *ctx, const u8 *in_key,
 		u32 *rki = ctx->key_enc + (i * kwords);
 		u32 *rko = rki + kwords;
 
+#ifndef CONFIG_CPU_BIG_ENDIAN
 		rko[0] = ror32(aes_sub(rki[kwords - 1]), 8) ^ rcon[i] ^ rki[0];
+#else
+		rko[0] = rol32(aes_sub(rki[kwords - 1]), 8) ^ (rcon[i] << 24) ^
+			 rki[0];
+#endif
 		rko[1] = rko[0] ^ rki[1];
 		rko[2] = rko[1] ^ rki[2];
 		rko[3] = rko[2] ^ rki[3];
-- 
2.7.4


* [PATCH v2 2/8] crypto: arm64/ghash-ce - fix for big endian
  2016-10-11 18:15 [PATCH v2 0/8] crypto: ARM/arm64 - big endian fixes Ard Biesheuvel
  2016-10-11 18:15 ` [PATCH v2 1/8] crypto: arm64/aes-ce - fix for big endian Ard Biesheuvel
@ 2016-10-11 18:15 ` Ard Biesheuvel
  2016-10-11 18:15 ` [PATCH v2 3/8] crypto: arm64/sha1-ce " Ard Biesheuvel
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 17+ messages in thread
From: Ard Biesheuvel @ 2016-10-11 18:15 UTC (permalink / raw)
  To: linux-crypto, linux-arm-kernel, herbert
  Cc: will.deacon, catalin.marinas, linux, Ard Biesheuvel

The GHASH key and digest are both pairs of 64-bit quantities, but the
GHASH code does not always refer to them as such, causing failures when
built for big endian. So replace the 16x1 loads and stores with 2x8 ones.
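
Why only the 2x8 (64-bit element) form round-trips can be modelled in C (a
sketch of the big endian behaviour, not kernel code; names are illustrative):
the C side reads the key and digest back as 64-bit quantities, so the store
must use 64-bit element semantics for those reads to see the right values.

```c
#include <stdint.h>

/* st1 {v.2d} on big endian: each 64-bit lane reaches memory in
 * big-endian byte order, matching a subsequent u64 load from C */
static void st1_2d_be(uint8_t mem[8], uint64_t lane)
{
	for (int i = 0; i < 8; i++)
		mem[i] = (uint8_t)(lane >> (8 * (7 - i)));
}

/* st1 {v.16b}: raw register bytes, which are LSB-first within each
 * 64-bit lane regardless of CPU endianness */
static void st1_16b(uint8_t mem[8], uint64_t lane)
{
	for (int i = 0; i < 8; i++)
		mem[i] = (uint8_t)(lane >> (8 * i));
}

/* a big-endian kernel's plain u64 read of those 8 bytes */
static uint64_t load_u64_be(const uint8_t mem[8])
{
	uint64_t v = 0;
	for (int i = 0; i < 8; i++)
		v = v << 8 | mem[i];
	return v;
}
```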

Fixes: b913a6404ce2 ("arm64/crypto: improve performance of GHASH algorithm")
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/crypto/ghash-ce-core.S | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/crypto/ghash-ce-core.S b/arch/arm64/crypto/ghash-ce-core.S
index dc457015884e..f0bb9f0b524f 100644
--- a/arch/arm64/crypto/ghash-ce-core.S
+++ b/arch/arm64/crypto/ghash-ce-core.S
@@ -29,8 +29,8 @@
 	 *			   struct ghash_key const *k, const char *head)
 	 */
 ENTRY(pmull_ghash_update)
-	ld1		{SHASH.16b}, [x3]
-	ld1		{XL.16b}, [x1]
+	ld1		{SHASH.2d}, [x3]
+	ld1		{XL.2d}, [x1]
 	movi		MASK.16b, #0xe1
 	ext		SHASH2.16b, SHASH.16b, SHASH.16b, #8
 	shl		MASK.2d, MASK.2d, #57
@@ -74,6 +74,6 @@ CPU_LE(	rev64		T1.16b, T1.16b	)
 
 	cbnz		w0, 0b
 
-	st1		{XL.16b}, [x1]
+	st1		{XL.2d}, [x1]
 	ret
 ENDPROC(pmull_ghash_update)
-- 
2.7.4


* [PATCH v2 3/8] crypto: arm64/sha1-ce - fix for big endian
  2016-10-11 18:15 [PATCH v2 0/8] crypto: ARM/arm64 - big endian fixes Ard Biesheuvel
  2016-10-11 18:15 ` [PATCH v2 1/8] crypto: arm64/aes-ce - fix for big endian Ard Biesheuvel
  2016-10-11 18:15 ` [PATCH v2 2/8] crypto: arm64/ghash-ce " Ard Biesheuvel
@ 2016-10-11 18:15 ` Ard Biesheuvel
  2016-10-11 18:15 ` [PATCH v2 4/8] crypto: arm64/sha2-ce " Ard Biesheuvel
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 17+ messages in thread
From: Ard Biesheuvel @ 2016-10-11 18:15 UTC (permalink / raw)
  To: linux-crypto, linux-arm-kernel, herbert
  Cc: will.deacon, catalin.marinas, linux, Ard Biesheuvel

The SHA1 digest is an array of 5 32-bit quantities, so we should refer
to them as such in order for this code to work correctly when built for
big endian. So replace 16-byte scalar loads and stores with 4x4 vector
ones where appropriate.

Fixes: 2c98833a42cd ("arm64/crypto: SHA-1 using ARMv8 Crypto Extensions")
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/crypto/sha1-ce-core.S | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/crypto/sha1-ce-core.S b/arch/arm64/crypto/sha1-ce-core.S
index 033aae6d732a..c98e7e849f06 100644
--- a/arch/arm64/crypto/sha1-ce-core.S
+++ b/arch/arm64/crypto/sha1-ce-core.S
@@ -78,7 +78,7 @@ ENTRY(sha1_ce_transform)
 	ld1r		{k3.4s}, [x6]
 
 	/* load state */
-	ldr		dga, [x0]
+	ld1		{dgav.4s}, [x0]
 	ldr		dgb, [x0, #16]
 
 	/* load sha1_ce_state::finalize */
@@ -144,7 +144,7 @@ CPU_LE(	rev32		v11.16b, v11.16b	)
 	b		1b
 
 	/* store new state */
-3:	str		dga, [x0]
+3:	st1		{dgav.4s}, [x0]
 	str		dgb, [x0, #16]
 	ret
 ENDPROC(sha1_ce_transform)
-- 
2.7.4


* [PATCH v2 4/8] crypto: arm64/sha2-ce - fix for big endian
  2016-10-11 18:15 [PATCH v2 0/8] crypto: ARM/arm64 - big endian fixes Ard Biesheuvel
                   ` (2 preceding siblings ...)
  2016-10-11 18:15 ` [PATCH v2 3/8] crypto: arm64/sha1-ce " Ard Biesheuvel
@ 2016-10-11 18:15 ` Ard Biesheuvel
  2016-10-11 18:15 ` [PATCH v2 5/8] crypto: arm64/aes-ccm-ce: " Ard Biesheuvel
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 17+ messages in thread
From: Ard Biesheuvel @ 2016-10-11 18:15 UTC (permalink / raw)
  To: linux-crypto, linux-arm-kernel, herbert
  Cc: will.deacon, catalin.marinas, linux, Ard Biesheuvel

The SHA256 digest is an array of 8 32-bit quantities, so we should refer
to them as such in order for this code to work correctly when built for
big endian. So replace 16-byte scalar loads and stores with 4x32 vector
ones where appropriate.

Fixes: 6ba6c74dfc6b ("arm64/crypto: SHA-224/SHA-256 using ARMv8 Crypto Extensions")
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/crypto/sha2-ce-core.S | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/crypto/sha2-ce-core.S b/arch/arm64/crypto/sha2-ce-core.S
index 5df9d9d470ad..01cfee066837 100644
--- a/arch/arm64/crypto/sha2-ce-core.S
+++ b/arch/arm64/crypto/sha2-ce-core.S
@@ -85,7 +85,7 @@ ENTRY(sha2_ce_transform)
 	ld1		{v12.4s-v15.4s}, [x8]
 
 	/* load state */
-	ldp		dga, dgb, [x0]
+	ld1		{dgav.4s, dgbv.4s}, [x0]
 
 	/* load sha256_ce_state::finalize */
 	ldr		w4, [x0, #:lo12:sha256_ce_offsetof_finalize]
@@ -148,6 +148,6 @@ CPU_LE(	rev32		v19.16b, v19.16b	)
 	b		1b
 
 	/* store new state */
-3:	stp		dga, dgb, [x0]
+3:	st1		{dgav.4s, dgbv.4s}, [x0]
 	ret
 ENDPROC(sha2_ce_transform)
-- 
2.7.4


* [PATCH v2 5/8] crypto: arm64/aes-ccm-ce: fix for big endian
  2016-10-11 18:15 [PATCH v2 0/8] crypto: ARM/arm64 - big endian fixes Ard Biesheuvel
                   ` (3 preceding siblings ...)
  2016-10-11 18:15 ` [PATCH v2 4/8] crypto: arm64/sha2-ce " Ard Biesheuvel
@ 2016-10-11 18:15 ` Ard Biesheuvel
  2016-10-11 18:15 ` [PATCH v2 6/8] crypto: arm64/aes-neon - " Ard Biesheuvel
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 17+ messages in thread
From: Ard Biesheuvel @ 2016-10-11 18:15 UTC (permalink / raw)
  To: linux-crypto, linux-arm-kernel, herbert
  Cc: will.deacon, catalin.marinas, linux, Ard Biesheuvel

The AES-CCM implementation that uses ARMv8 Crypto Extensions instructions
refers to the AES round keys as pairs of 64-bit quantities, which causes
failures when building the code for big endian. In addition, it byte swaps
the input counter unconditionally, while this is only required for little
endian builds. So fix both issues.
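
The CPU_LE()/CPU_BE() annotations used in the diff come from
<asm/assembler.h> (hence the added include); roughly, each macro emits its
argument only when building for the matching endianness, so the byte swap of
the counter becomes little endian only. A simplified sketch of the helpers:

```c
/* Simplified sketch of the arm64 <asm/assembler.h> helpers */
#ifdef CONFIG_CPU_BIG_ENDIAN
#define CPU_BE(code...)	code
#define CPU_LE(code...)
#else
#define CPU_BE(code...)
#define CPU_LE(code...)	code
#endif
```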

Fixes: 12ac3efe74f8 ("arm64/crypto: use crypto instructions to generate AES key schedule")
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/crypto/aes-ce-ccm-core.S | 53 ++++++++++----------
 1 file changed, 27 insertions(+), 26 deletions(-)

diff --git a/arch/arm64/crypto/aes-ce-ccm-core.S b/arch/arm64/crypto/aes-ce-ccm-core.S
index a2a7fbcacc14..3363560c79b7 100644
--- a/arch/arm64/crypto/aes-ce-ccm-core.S
+++ b/arch/arm64/crypto/aes-ce-ccm-core.S
@@ -9,6 +9,7 @@
  */
 
 #include <linux/linkage.h>
+#include <asm/assembler.h>
 
 	.text
 	.arch	armv8-a+crypto
@@ -19,7 +20,7 @@
 	 */
 ENTRY(ce_aes_ccm_auth_data)
 	ldr	w8, [x3]			/* leftover from prev round? */
-	ld1	{v0.2d}, [x0]			/* load mac */
+	ld1	{v0.16b}, [x0]			/* load mac */
 	cbz	w8, 1f
 	sub	w8, w8, #16
 	eor	v1.16b, v1.16b, v1.16b
@@ -31,7 +32,7 @@ ENTRY(ce_aes_ccm_auth_data)
 	beq	8f				/* out of input? */
 	cbnz	w8, 0b
 	eor	v0.16b, v0.16b, v1.16b
-1:	ld1	{v3.2d}, [x4]			/* load first round key */
+1:	ld1	{v3.16b}, [x4]			/* load first round key */
 	prfm	pldl1strm, [x1]
 	cmp	w5, #12				/* which key size? */
 	add	x6, x4, #16
@@ -41,17 +42,17 @@ ENTRY(ce_aes_ccm_auth_data)
 	mov	v5.16b, v3.16b
 	b	4f
 2:	mov	v4.16b, v3.16b
-	ld1	{v5.2d}, [x6], #16		/* load 2nd round key */
+	ld1	{v5.16b}, [x6], #16		/* load 2nd round key */
 3:	aese	v0.16b, v4.16b
 	aesmc	v0.16b, v0.16b
-4:	ld1	{v3.2d}, [x6], #16		/* load next round key */
+4:	ld1	{v3.16b}, [x6], #16		/* load next round key */
 	aese	v0.16b, v5.16b
 	aesmc	v0.16b, v0.16b
-5:	ld1	{v4.2d}, [x6], #16		/* load next round key */
+5:	ld1	{v4.16b}, [x6], #16		/* load next round key */
 	subs	w7, w7, #3
 	aese	v0.16b, v3.16b
 	aesmc	v0.16b, v0.16b
-	ld1	{v5.2d}, [x6], #16		/* load next round key */
+	ld1	{v5.16b}, [x6], #16		/* load next round key */
 	bpl	3b
 	aese	v0.16b, v4.16b
 	subs	w2, w2, #16			/* last data? */
@@ -60,7 +61,7 @@ ENTRY(ce_aes_ccm_auth_data)
 	ld1	{v1.16b}, [x1], #16		/* load next input block */
 	eor	v0.16b, v0.16b, v1.16b		/* xor with mac */
 	bne	1b
-6:	st1	{v0.2d}, [x0]			/* store mac */
+6:	st1	{v0.16b}, [x0]			/* store mac */
 	beq	10f
 	adds	w2, w2, #16
 	beq	10f
@@ -79,7 +80,7 @@ ENTRY(ce_aes_ccm_auth_data)
 	adds	w7, w7, #1
 	bne	9b
 	eor	v0.16b, v0.16b, v1.16b
-	st1	{v0.2d}, [x0]
+	st1	{v0.16b}, [x0]
 10:	str	w8, [x3]
 	ret
 ENDPROC(ce_aes_ccm_auth_data)
@@ -89,27 +90,27 @@ ENDPROC(ce_aes_ccm_auth_data)
 	 * 			 u32 rounds);
 	 */
 ENTRY(ce_aes_ccm_final)
-	ld1	{v3.2d}, [x2], #16		/* load first round key */
-	ld1	{v0.2d}, [x0]			/* load mac */
+	ld1	{v3.16b}, [x2], #16		/* load first round key */
+	ld1	{v0.16b}, [x0]			/* load mac */
 	cmp	w3, #12				/* which key size? */
 	sub	w3, w3, #2			/* modified # of rounds */
-	ld1	{v1.2d}, [x1]			/* load 1st ctriv */
+	ld1	{v1.16b}, [x1]			/* load 1st ctriv */
 	bmi	0f
 	bne	3f
 	mov	v5.16b, v3.16b
 	b	2f
 0:	mov	v4.16b, v3.16b
-1:	ld1	{v5.2d}, [x2], #16		/* load next round key */
+1:	ld1	{v5.16b}, [x2], #16		/* load next round key */
 	aese	v0.16b, v4.16b
 	aesmc	v0.16b, v0.16b
 	aese	v1.16b, v4.16b
 	aesmc	v1.16b, v1.16b
-2:	ld1	{v3.2d}, [x2], #16		/* load next round key */
+2:	ld1	{v3.16b}, [x2], #16		/* load next round key */
 	aese	v0.16b, v5.16b
 	aesmc	v0.16b, v0.16b
 	aese	v1.16b, v5.16b
 	aesmc	v1.16b, v1.16b
-3:	ld1	{v4.2d}, [x2], #16		/* load next round key */
+3:	ld1	{v4.16b}, [x2], #16		/* load next round key */
 	subs	w3, w3, #3
 	aese	v0.16b, v3.16b
 	aesmc	v0.16b, v0.16b
@@ -120,47 +121,47 @@ ENTRY(ce_aes_ccm_final)
 	aese	v1.16b, v4.16b
 	/* final round key cancels out */
 	eor	v0.16b, v0.16b, v1.16b		/* en-/decrypt the mac */
-	st1	{v0.2d}, [x0]			/* store result */
+	st1	{v0.16b}, [x0]			/* store result */
 	ret
 ENDPROC(ce_aes_ccm_final)
 
 	.macro	aes_ccm_do_crypt,enc
 	ldr	x8, [x6, #8]			/* load lower ctr */
-	ld1	{v0.2d}, [x5]			/* load mac */
-	rev	x8, x8				/* keep swabbed ctr in reg */
+	ld1	{v0.16b}, [x5]			/* load mac */
+CPU_LE(	rev	x8, x8			)	/* keep swabbed ctr in reg */
 0:	/* outer loop */
-	ld1	{v1.1d}, [x6]			/* load upper ctr */
+	ld1	{v1.8b}, [x6]			/* load upper ctr */
 	prfm	pldl1strm, [x1]
 	add	x8, x8, #1
 	rev	x9, x8
 	cmp	w4, #12				/* which key size? */
 	sub	w7, w4, #2			/* get modified # of rounds */
 	ins	v1.d[1], x9			/* no carry in lower ctr */
-	ld1	{v3.2d}, [x3]			/* load first round key */
+	ld1	{v3.16b}, [x3]			/* load first round key */
 	add	x10, x3, #16
 	bmi	1f
 	bne	4f
 	mov	v5.16b, v3.16b
 	b	3f
 1:	mov	v4.16b, v3.16b
-	ld1	{v5.2d}, [x10], #16		/* load 2nd round key */
+	ld1	{v5.16b}, [x10], #16		/* load 2nd round key */
 2:	/* inner loop: 3 rounds, 2x interleaved */
 	aese	v0.16b, v4.16b
 	aesmc	v0.16b, v0.16b
 	aese	v1.16b, v4.16b
 	aesmc	v1.16b, v1.16b
-3:	ld1	{v3.2d}, [x10], #16		/* load next round key */
+3:	ld1	{v3.16b}, [x10], #16		/* load next round key */
 	aese	v0.16b, v5.16b
 	aesmc	v0.16b, v0.16b
 	aese	v1.16b, v5.16b
 	aesmc	v1.16b, v1.16b
-4:	ld1	{v4.2d}, [x10], #16		/* load next round key */
+4:	ld1	{v4.16b}, [x10], #16		/* load next round key */
 	subs	w7, w7, #3
 	aese	v0.16b, v3.16b
 	aesmc	v0.16b, v0.16b
 	aese	v1.16b, v3.16b
 	aesmc	v1.16b, v1.16b
-	ld1	{v5.2d}, [x10], #16		/* load next round key */
+	ld1	{v5.16b}, [x10], #16		/* load next round key */
 	bpl	2b
 	aese	v0.16b, v4.16b
 	aese	v1.16b, v4.16b
@@ -177,14 +178,14 @@ ENDPROC(ce_aes_ccm_final)
 	eor	v0.16b, v0.16b, v2.16b		/* xor mac with pt ^ rk[last] */
 	st1	{v1.16b}, [x0], #16		/* write output block */
 	bne	0b
-	rev	x8, x8
-	st1	{v0.2d}, [x5]			/* store mac */
+CPU_LE(	rev	x8, x8			)
+	st1	{v0.16b}, [x5]			/* store mac */
 	str	x8, [x6, #8]			/* store lsb end of ctr (BE) */
 5:	ret
 
 6:	eor	v0.16b, v0.16b, v5.16b		/* final round mac */
 	eor	v1.16b, v1.16b, v5.16b		/* final round enc */
-	st1	{v0.2d}, [x5]			/* store mac */
+	st1	{v0.16b}, [x5]			/* store mac */
 	add	w2, w2, #16			/* process partial tail block */
 7:	ldrb	w9, [x1], #1			/* get 1 byte of input */
 	umov	w6, v1.b[0]			/* get top crypted ctr byte */
-- 
2.7.4


* [PATCH v2 6/8] crypto: arm64/aes-neon - fix for big endian
  2016-10-11 18:15 [PATCH v2 0/8] crypto: ARM/arm64 - big endian fixes Ard Biesheuvel
                   ` (4 preceding siblings ...)
  2016-10-11 18:15 ` [PATCH v2 5/8] crypto: arm64/aes-ccm-ce: " Ard Biesheuvel
@ 2016-10-11 18:15 ` Ard Biesheuvel
  2016-10-11 18:15 ` [PATCH v2 7/8] crypto: arm64/aes-xts-ce: " Ard Biesheuvel
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 17+ messages in thread
From: Ard Biesheuvel @ 2016-10-11 18:15 UTC (permalink / raw)
  To: linux-crypto, linux-arm-kernel, herbert
  Cc: will.deacon, catalin.marinas, linux, Ard Biesheuvel

The AES implementation using pure NEON instructions relies on the generic
AES key schedule generation routines, which store the round keys as arrays
of 32-bit quantities in native endianness. This means we should refer to
these round keys using 4x4 loads rather than 16x1 loads. In addition, the
ShiftRows tables are loaded using a single scalar load, which is also
affected by endianness, so emit these tables in the correct order depending
on whether we are building for big endian or not.
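
Concretely, the CPU_BE() table rows in the hunk are the CPU_LE() rows
reversed across the full 16 bytes, compensating for the byte order in which
the 128-bit load reads the table on big endian. A small C sketch (not kernel
code) to check the relationship for the forward table:

```c
#include <stdint.h>
#include <string.h>

/* Forward ShiftRows permutation as emitted for little endian... */
static const uint8_t fwd_le[16] = {
	0x0, 0x5, 0xa, 0xf, 0x4, 0x9, 0xe, 0x3,
	0x8, 0xd, 0x2, 0x7, 0xc, 0x1, 0x6, 0xb,
};

/* ...and as emitted for big endian */
static const uint8_t fwd_be[16] = {
	0xb, 0x6, 0x1, 0xc, 0x7, 0x2, 0xd, 0x8,
	0x3, 0xe, 0x9, 0x4, 0xf, 0xa, 0x5, 0x0,
};

/* reverse a 16-byte table in place */
static void rev16(uint8_t t[16])
{
	for (int i = 0; i < 8; i++) {
		uint8_t tmp = t[i];
		t[i] = t[15 - i];
		t[15 - i] = tmp;
	}
}
```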

Fixes: 49788fe2a128 ("arm64/crypto: AES-ECB/CBC/CTR/XTS using ARMv8 NEON and Crypto Extensions")
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/crypto/aes-neon.S | 25 ++++++++++++--------
 1 file changed, 15 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/crypto/aes-neon.S b/arch/arm64/crypto/aes-neon.S
index b93170e1cc93..85f07ead7c5c 100644
--- a/arch/arm64/crypto/aes-neon.S
+++ b/arch/arm64/crypto/aes-neon.S
@@ -9,6 +9,7 @@
  */
 
 #include <linux/linkage.h>
+#include <asm/assembler.h>
 
 #define AES_ENTRY(func)		ENTRY(neon_ ## func)
 #define AES_ENDPROC(func)	ENDPROC(neon_ ## func)
@@ -83,13 +84,13 @@
 	.endm
 
 	.macro		do_block, enc, in, rounds, rk, rkp, i
-	ld1		{v15.16b}, [\rk]
+	ld1		{v15.4s}, [\rk]
 	add		\rkp, \rk, #16
 	mov		\i, \rounds
 1111:	eor		\in\().16b, \in\().16b, v15.16b		/* ^round key */
 	tbl		\in\().16b, {\in\().16b}, v13.16b	/* ShiftRows */
 	sub_bytes	\in
-	ld1		{v15.16b}, [\rkp], #16
+	ld1		{v15.4s}, [\rkp], #16
 	subs		\i, \i, #1
 	beq		2222f
 	.if		\enc == 1
@@ -229,7 +230,7 @@
 	.endm
 
 	.macro		do_block_2x, enc, in0, in1 rounds, rk, rkp, i
-	ld1		{v15.16b}, [\rk]
+	ld1		{v15.4s}, [\rk]
 	add		\rkp, \rk, #16
 	mov		\i, \rounds
 1111:	eor		\in0\().16b, \in0\().16b, v15.16b	/* ^round key */
@@ -237,7 +238,7 @@
 	sub_bytes_2x	\in0, \in1
 	tbl		\in0\().16b, {\in0\().16b}, v13.16b	/* ShiftRows */
 	tbl		\in1\().16b, {\in1\().16b}, v13.16b	/* ShiftRows */
-	ld1		{v15.16b}, [\rkp], #16
+	ld1		{v15.4s}, [\rkp], #16
 	subs		\i, \i, #1
 	beq		2222f
 	.if		\enc == 1
@@ -254,7 +255,7 @@
 	.endm
 
 	.macro		do_block_4x, enc, in0, in1, in2, in3, rounds, rk, rkp, i
-	ld1		{v15.16b}, [\rk]
+	ld1		{v15.4s}, [\rk]
 	add		\rkp, \rk, #16
 	mov		\i, \rounds
 1111:	eor		\in0\().16b, \in0\().16b, v15.16b	/* ^round key */
@@ -266,7 +267,7 @@
 	tbl		\in1\().16b, {\in1\().16b}, v13.16b	/* ShiftRows */
 	tbl		\in2\().16b, {\in2\().16b}, v13.16b	/* ShiftRows */
 	tbl		\in3\().16b, {\in3\().16b}, v13.16b	/* ShiftRows */
-	ld1		{v15.16b}, [\rkp], #16
+	ld1		{v15.4s}, [\rkp], #16
 	subs		\i, \i, #1
 	beq		2222f
 	.if		\enc == 1
@@ -306,12 +307,16 @@
 	.text
 	.align		4
 .LForward_ShiftRows:
-	.byte		0x0, 0x5, 0xa, 0xf, 0x4, 0x9, 0xe, 0x3
-	.byte		0x8, 0xd, 0x2, 0x7, 0xc, 0x1, 0x6, 0xb
+CPU_LE(	.byte		0x0, 0x5, 0xa, 0xf, 0x4, 0x9, 0xe, 0x3	)
+CPU_LE(	.byte		0x8, 0xd, 0x2, 0x7, 0xc, 0x1, 0x6, 0xb	)
+CPU_BE(	.byte		0xb, 0x6, 0x1, 0xc, 0x7, 0x2, 0xd, 0x8	)
+CPU_BE(	.byte		0x3, 0xe, 0x9, 0x4, 0xf, 0xa, 0x5, 0x0	)
 
 .LReverse_ShiftRows:
-	.byte		0x0, 0xd, 0xa, 0x7, 0x4, 0x1, 0xe, 0xb
-	.byte		0x8, 0x5, 0x2, 0xf, 0xc, 0x9, 0x6, 0x3
+CPU_LE(	.byte		0x0, 0xd, 0xa, 0x7, 0x4, 0x1, 0xe, 0xb	)
+CPU_LE(	.byte		0x8, 0x5, 0x2, 0xf, 0xc, 0x9, 0x6, 0x3	)
+CPU_BE(	.byte		0x3, 0x6, 0x9, 0xc, 0xf, 0x2, 0x5, 0x8	)
+CPU_BE(	.byte		0xb, 0xe, 0x1, 0x4, 0x7, 0xa, 0xd, 0x0	)
 
 .LForward_Sbox:
 	.byte		0x63, 0x7c, 0x77, 0x7b, 0xf2, 0x6b, 0x6f, 0xc5
-- 
2.7.4


* [PATCH v2 7/8] crypto: arm64/aes-xts-ce: fix for big endian
  2016-10-11 18:15 [PATCH v2 0/8] crypto: ARM/arm64 - big endian fixes Ard Biesheuvel
                   ` (5 preceding siblings ...)
  2016-10-11 18:15 ` [PATCH v2 6/8] crypto: arm64/aes-neon - " Ard Biesheuvel
@ 2016-10-11 18:15 ` Ard Biesheuvel
  2016-10-11 18:15 ` [PATCH v2 8/8] crypto: arm/aes-ce - " Ard Biesheuvel
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 17+ messages in thread
From: Ard Biesheuvel @ 2016-10-11 18:15 UTC (permalink / raw)
  To: linux-crypto, linux-arm-kernel, herbert
  Cc: will.deacon, catalin.marinas, linux, Ard Biesheuvel

Emit the XTS tweak literal constants in the appropriate order for a
single 128-bit scalar literal load.
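
A quick way to check the constant: a 128-bit scalar load on big endian reads
the 16 bytes in the opposite order from little endian, so emitting the two
.quad values swapped (each stored big-endian by the assembler) must reproduce
the same register contents. A sketch in C (not kernel code):

```c
#include <stdint.h>

/* store a 64-bit value as the assembler's ".quad" directive would
 * emit it for the given target endianness */
static void put64(uint8_t *p, uint64_t v, int big_endian)
{
	for (int i = 0; i < 8; i++) {
		int shift = big_endian ? 8 * (7 - i) : 8 * i;
		p[i] = (uint8_t)(v >> shift);
	}
}
```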

Fixes: 49788fe2a128 ("arm64/crypto: AES-ECB/CBC/CTR/XTS using ARMv8 NEON and Crypto Extensions")
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/crypto/aes-ce.S    | 1 +
 arch/arm64/crypto/aes-modes.S | 3 ++-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/crypto/aes-ce.S b/arch/arm64/crypto/aes-ce.S
index 78f3cfe92c08..b46093d567e5 100644
--- a/arch/arm64/crypto/aes-ce.S
+++ b/arch/arm64/crypto/aes-ce.S
@@ -10,6 +10,7 @@
  */
 
 #include <linux/linkage.h>
+#include <asm/assembler.h>
 
 #define AES_ENTRY(func)		ENTRY(ce_ ## func)
 #define AES_ENDPROC(func)	ENDPROC(ce_ ## func)
diff --git a/arch/arm64/crypto/aes-modes.S b/arch/arm64/crypto/aes-modes.S
index f6e372c528eb..c53dbeae79f2 100644
--- a/arch/arm64/crypto/aes-modes.S
+++ b/arch/arm64/crypto/aes-modes.S
@@ -386,7 +386,8 @@ AES_ENDPROC(aes_ctr_encrypt)
 	.endm
 
 .Lxts_mul_x:
-	.word		1, 0, 0x87, 0
+CPU_LE(	.quad		1, 0x87		)
+CPU_BE(	.quad		0x87, 1		)
 
 AES_ENTRY(aes_xts_encrypt)
 	FRAME_PUSH
-- 
2.7.4


* [PATCH v2 8/8] crypto: arm/aes-ce - fix for big endian
  2016-10-11 18:15 [PATCH v2 0/8] crypto: ARM/arm64 - big endian fixes Ard Biesheuvel
                   ` (6 preceding siblings ...)
  2016-10-11 18:15 ` [PATCH v2 7/8] crypto: arm64/aes-xts-ce: " Ard Biesheuvel
@ 2016-10-11 18:15 ` Ard Biesheuvel
  2016-10-18 10:55 ` [PATCH v2 0/8] crypto: ARM/arm64 - big endian fixes Ard Biesheuvel
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 17+ messages in thread
From: Ard Biesheuvel @ 2016-10-11 18:15 UTC (permalink / raw)
  To: linux-crypto, linux-arm-kernel, herbert
  Cc: will.deacon, catalin.marinas, linux, Ard Biesheuvel

The AES key schedule generation is mostly endian agnostic, with the
exception of the rotation and the incorporation of the round constant
at the start of each round. So implement a big endian specific version
of that part to make the whole routine big endian compatible.
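
The big endian branch in the hunk is the byte-swapped image of the little
endian computation. Two identities underpin the equivalence of the two
branches: rol32(bswap32(x), 8) == bswap32(ror32(x, 8)), and
bswap32(rcon) == rcon << 24 for round constants below 0x100. A standalone
sketch to check them (not kernel code; the helpers mirror the kernel's
ror32/rol32 semantics for rotations in 1..31):

```c
#include <stdint.h>

static uint32_t ror32(uint32_t x, unsigned n)
{
	return x >> n | x << (32 - n);	/* valid for n in 1..31 */
}

static uint32_t rol32(uint32_t x, unsigned n)
{
	return x << n | x >> (32 - n);	/* valid for n in 1..31 */
}

static uint32_t bswap32(uint32_t x)
{
	return x >> 24 | (x >> 8 & 0xff00) | (x << 8 & 0xff0000) | x << 24;
}
```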

Fixes: 86464859cc77 ("crypto: arm - AES in ECB/CBC/CTR/XTS modes using ARMv8 Crypto Extensions")
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm/crypto/aes-ce-glue.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/arm/crypto/aes-ce-glue.c b/arch/arm/crypto/aes-ce-glue.c
index aef022a87c53..04410d9f5e72 100644
--- a/arch/arm/crypto/aes-ce-glue.c
+++ b/arch/arm/crypto/aes-ce-glue.c
@@ -88,8 +88,13 @@ static int ce_aes_expandkey(struct crypto_aes_ctx *ctx, const u8 *in_key,
 		u32 *rki = ctx->key_enc + (i * kwords);
 		u32 *rko = rki + kwords;
 
+#ifndef CONFIG_CPU_BIG_ENDIAN
 		rko[0] = ror32(ce_aes_sub(rki[kwords - 1]), 8);
 		rko[0] = rko[0] ^ rki[0] ^ rcon[i];
+#else
+		rko[0] = rol32(ce_aes_sub(rki[kwords - 1]), 8);
+		rko[0] = rko[0] ^ rki[0] ^ (rcon[i] << 24);
+#endif
 		rko[1] = rko[0] ^ rki[1];
 		rko[2] = rko[1] ^ rki[2];
 		rko[3] = rko[2] ^ rki[3];
-- 
2.7.4


* Re: [PATCH v2 0/8] crypto: ARM/arm64 - big endian fixes
  2016-10-11 18:15 [PATCH v2 0/8] crypto: ARM/arm64 - big endian fixes Ard Biesheuvel
                   ` (7 preceding siblings ...)
  2016-10-11 18:15 ` [PATCH v2 8/8] crypto: arm/aes-ce - " Ard Biesheuvel
@ 2016-10-18 10:55 ` Ard Biesheuvel
  2016-10-18 11:49 ` Catalin Marinas
  2016-10-21  3:16 ` Herbert Xu
  10 siblings, 0 replies; 17+ messages in thread
From: Ard Biesheuvel @ 2016-10-18 10:55 UTC (permalink / raw)
  To: linux-crypto, linux-arm-kernel, Herbert Xu
  Cc: Will Deacon, Catalin Marinas, Russell King - ARM Linux, Ard Biesheuvel

On 11 October 2016 at 19:15, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> As it turns out, none of the accelerated crypto routines under arch/arm64/crypto
> currently work, or have ever worked correctly when built for big endian. So this
> series fixes all of them. This v2 now includes a similar fix for 32-bit ARM as
> well, and an additional fix for XTS which escaped my attention before.
>
> Each of these patches carries a fixes tag, and could be backported to stable.
> However, for patches #1 and #5, the fixes tag denotes the oldest commit that the
> fix is compatible with, not the patch that introduced the algorithm. This is due
> to the fact that the key schedules are incompatible between generic AES and the
> arm64 Crypto Extensions implementation (but only when building for big endian)
> This is not a problem in practice, but it does mean that the AES-CCM and AES in
> ECB/CBC/CTR/XTS mode implementations before v3.19 require a different fix, i.e.,
> one that is compatible with the generic AES key schedule generation code (which
> it currently no longer uses)
>
> In any case, please apply with cc to stable.
>

Ping?

> Ard Biesheuvel (8):
>   crypto: arm64/aes-ce - fix for big endian
>   crypto: arm64/ghash-ce - fix for big endian
>   crypto: arm64/sha1-ce - fix for big endian
>   crypto: arm64/sha2-ce - fix for big endian
>   crypto: arm64/aes-ccm-ce: fix for big endian
>   crypto: arm64/aes-neon - fix for big endian
>   crypto: arm64/aes-xts-ce: fix for big endian
>   crypto: arm/aes-ce - fix for big endian
>
>  arch/arm/crypto/aes-ce-glue.c       |  5 ++
>  arch/arm64/crypto/aes-ce-ccm-core.S | 53 ++++++++++----------
>  arch/arm64/crypto/aes-ce-cipher.c   | 25 +++++----
>  arch/arm64/crypto/aes-ce.S          |  1 +
>  arch/arm64/crypto/aes-modes.S       |  3 +-
>  arch/arm64/crypto/aes-neon.S        | 25 +++++----
>  arch/arm64/crypto/ghash-ce-core.S   |  6 +--
>  arch/arm64/crypto/sha1-ce-core.S    |  4 +-
>  arch/arm64/crypto/sha2-ce-core.S    |  4 +-
>  9 files changed, 72 insertions(+), 54 deletions(-)
>
> --
> 2.7.4
>


* Re: [PATCH v2 0/8] crypto: ARM/arm64 - big endian fixes
  2016-10-11 18:15 [PATCH v2 0/8] crypto: ARM/arm64 - big endian fixes Ard Biesheuvel
                   ` (8 preceding siblings ...)
  2016-10-18 10:55 ` [PATCH v2 0/8] crypto: ARM/arm64 - big endian fixes Ard Biesheuvel
@ 2016-10-18 11:49 ` Catalin Marinas
  2016-10-18 12:14   ` Ard Biesheuvel
  2016-10-21  3:16 ` Herbert Xu
  10 siblings, 1 reply; 17+ messages in thread
From: Catalin Marinas @ 2016-10-18 11:49 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-crypto, linux-arm-kernel, herbert, will.deacon, linux

On Tue, Oct 11, 2016 at 07:15:12PM +0100, Ard Biesheuvel wrote:
> As it turns out, none of the accelerated crypto routines under arch/arm64/crypto
> currently work, or have ever worked correctly when built for big endian. So this
> series fixes all of them. This v2 now includes a similar fix for 32-bit ARM as
> well, and an additional fix for XTS which escaped my attention before.
> 
> Each of these patches carries a fixes tag, and could be backported to stable.
> However, for patches #1 and #5, the fixes tag denotes the oldest commit that the
> fix is compatible with, not the patch that introduced the algorithm.

I think for future reference, the Fixes tag should denote the commit
that introduced the issue. An explicit Cc: stable tag would state how
far back it should be applied.

> Ard Biesheuvel (8):
>   crypto: arm64/aes-ce - fix for big endian
>   crypto: arm64/ghash-ce - fix for big endian
>   crypto: arm64/sha1-ce - fix for big endian
>   crypto: arm64/sha2-ce - fix for big endian
>   crypto: arm64/aes-ccm-ce: fix for big endian
>   crypto: arm64/aes-neon - fix for big endian
>   crypto: arm64/aes-xts-ce: fix for big endian
>   crypto: arm/aes-ce - fix for big endian

The changes look fine to me but I can't claim I fully understand these
algorithms. FWIW:

Acked-by: Catalin Marinas <catalin.marinas@arm.com>

(Will may pick them up for 4.9-rcX)


* Re: [PATCH v2 0/8] crypto: ARM/arm64 - big endian fixes
  2016-10-18 11:49 ` Catalin Marinas
@ 2016-10-18 12:14   ` Ard Biesheuvel
  2016-10-19  3:03     ` Herbert Xu
  0 siblings, 1 reply; 17+ messages in thread
From: Ard Biesheuvel @ 2016-10-18 12:14 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: linux-crypto, linux-arm-kernel, Herbert Xu, Will Deacon,
	Russell King - ARM Linux

On 18 October 2016 at 12:49, Catalin Marinas <catalin.marinas@arm.com> wrote:
> On Tue, Oct 11, 2016 at 07:15:12PM +0100, Ard Biesheuvel wrote:
>> As it turns out, none of the accelerated crypto routines under arch/arm64/crypto
>> currently work, or have ever worked correctly when built for big endian. So this
>> series fixes all of them. This v2 now includes a similar fix for 32-bit ARM as
>> well, and an additional fix for XTS which escaped my attention before.
>>
>> Each of these patches carries a fixes tag, and could be backported to stable.
>> However, for patches #1 and #5, the fixes tag denotes the oldest commit that the
>> fix is compatible with, not the patch that introduced the algorithm.
>
> I think for future reference, the Fixes tag should denote the commit
> that introduced the issue. An explicit Cc: stable tag would state how
> far back it should be applied.
>

OK, that sounds reasonable.

>> Ard Biesheuvel (8):
>>   crypto: arm64/aes-ce - fix for big endian
>>   crypto: arm64/ghash-ce - fix for big endian
>>   crypto: arm64/sha1-ce - fix for big endian
>>   crypto: arm64/sha2-ce - fix for big endian
>>   crypto: arm64/aes-ccm-ce: fix for big endian
>>   crypto: arm64/aes-neon - fix for big endian
>>   crypto: arm64/aes-xts-ce: fix for big endian
>>   crypto: arm/aes-ce - fix for big endian
>
> The changes look fine to me but I can't claim I fully understand these
> algorithms. FWIW:
>
> Acked-by: Catalin Marinas <catalin.marinas@arm.com>
>
> (Will may pick them up for 4.9-rcX)

Thanks, although I was kind of expecting Herbert to pick these up,
given that #8 affects ARM not arm64.

But if you (or Will) can pick up #1 to #7, that is also fine; then I
can drop #8 into rmk's patch database.

Thanks,
Ard.

* Re: [PATCH v2 0/8] crypto: ARM/arm64 - big endian fixes
  2016-10-18 12:14   ` Ard Biesheuvel
@ 2016-10-19  3:03     ` Herbert Xu
  2016-10-19  8:46       ` Will Deacon
  0 siblings, 1 reply; 17+ messages in thread
From: Herbert Xu @ 2016-10-19  3:03 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Catalin Marinas, linux-crypto, linux-arm-kernel, Will Deacon,
	Russell King - ARM Linux

On Tue, Oct 18, 2016 at 01:14:38PM +0100, Ard Biesheuvel wrote:
> On 18 October 2016 at 12:49, Catalin Marinas <catalin.marinas@arm.com> wrote:
> > On Tue, Oct 11, 2016 at 07:15:12PM +0100, Ard Biesheuvel wrote:
> >> As it turns out, none of the accelerated crypto routines under arch/arm64/crypto
> >> currently work, or have ever worked correctly when built for big endian. So this
> >> series fixes all of them. This v2 now includes a similar fix for 32-bit ARM as
> >> well, and an additional fix for XTS which escaped my attention before.
> >>
> >> Each of these patches carries a fixes tag, and could be backported to stable.
> >> However, for patches #1 and #5, the fixes tag denotes the oldest commit that the
> >> fix is compatible with, not the patch that introduced the algorithm.
> >
> > I think for future reference, the Fixes tag should denote the commit
> > that introduced the issue. An explicit Cc: stable tag would state how
> > far back it should be applied.
> >
> 
> OK, that sounds reasonable.
> 
> >> Ard Biesheuvel (8):
> >>   crypto: arm64/aes-ce - fix for big endian
> >>   crypto: arm64/ghash-ce - fix for big endian
> >>   crypto: arm64/sha1-ce - fix for big endian
> >>   crypto: arm64/sha2-ce - fix for big endian
> >>   crypto: arm64/aes-ccm-ce: fix for big endian
> >>   crypto: arm64/aes-neon - fix for big endian
> >>   crypto: arm64/aes-xts-ce: fix for big endian
> >>   crypto: arm/aes-ce - fix for big endian
> >
> > The changes look fine to me but I can't claim I fully understand these
> > algorithms. FWIW:
> >
> > Acked-by: Catalin Marinas <catalin.marinas@arm.com>
> >
> > (Will may pick them up for 4.9-rcX)
> 
> Thanks, although I was kind of expecting Herbert to pick these up,
> given that #8 affects ARM not arm64.
> 
> But if you (or Will) can pick up #1 to #7, that is also fine; then I
> can drop #8 into rmk's patch database.

I was planning to merge these for 4.10, but I'm fine with them
going through the arm tree.  Let me know what you guys want to
do.

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

* Re: [PATCH v2 0/8] crypto: ARM/arm64 - big endian fixes
  2016-10-19  3:03     ` Herbert Xu
@ 2016-10-19  8:46       ` Will Deacon
  2016-10-19  8:49         ` Ard Biesheuvel
  0 siblings, 1 reply; 17+ messages in thread
From: Will Deacon @ 2016-10-19  8:46 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Catalin Marinas, Russell King - ARM Linux, linux-crypto,
	linux-arm-kernel, Ard Biesheuvel

On Wed, Oct 19, 2016 at 11:03:33AM +0800, Herbert Xu wrote:
> On Tue, Oct 18, 2016 at 01:14:38PM +0100, Ard Biesheuvel wrote:
> > On 18 October 2016 at 12:49, Catalin Marinas <catalin.marinas@arm.com> wrote:
> > > On Tue, Oct 11, 2016 at 07:15:12PM +0100, Ard Biesheuvel wrote:
> > >> As it turns out, none of the accelerated crypto routines under arch/arm64/crypto
> > >> currently work, or have ever worked correctly when built for big endian. So this
> > >> series fixes all of them. This v2 now includes a similar fix for 32-bit ARM as
> > >> well, and an additional fix for XTS which escaped my attention before.
> > >>
> > >> Each of these patches carries a fixes tag, and could be backported to stable.
> > >> However, for patches #1 and #5, the fixes tag denotes the oldest commit that the
> > >> fix is compatible with, not the patch that introduced the algorithm.
> > >
> > > I think for future reference, the Fixes tag should denote the commit
> > > that introduced the issue. An explicit Cc: stable tag would state how
> > > far back it should be applied.
> > >
> > 
> > OK, that sounds reasonable.
> > 
> > >> Ard Biesheuvel (8):
> > >>   crypto: arm64/aes-ce - fix for big endian
> > >>   crypto: arm64/ghash-ce - fix for big endian
> > >>   crypto: arm64/sha1-ce - fix for big endian
> > >>   crypto: arm64/sha2-ce - fix for big endian
> > >>   crypto: arm64/aes-ccm-ce: fix for big endian
> > >>   crypto: arm64/aes-neon - fix for big endian
> > >>   crypto: arm64/aes-xts-ce: fix for big endian
> > >>   crypto: arm/aes-ce - fix for big endian
> > >
> > > The changes look fine to me but I can't claim I fully understand these
> > > algorithms. FWIW:
> > >
> > > Acked-by: Catalin Marinas <catalin.marinas@arm.com>
> > >
> > > (Will may pick them up for 4.9-rcX)
> > 
> > Thanks, although I was kind of expecting Herbert to pick these up,
> > given that #8 affects ARM not arm64.
> > 
> > But if you (or Will) can pick up #1 to #7, that is also fine; then I
> > can drop #8 into rmk's patch database.
> 
> I was planning to merge these for 4.10, but I'm fine with them
> going through the arm tree.  Let me know what you guys want to
> do.

I assumed you'd take them through crypto, as per usual, so I didn't
queue anything in the arm64 tree.

Ard -- were you planning to get these in for 4.9?

Will

* Re: [PATCH v2 0/8] crypto: ARM/arm64 - big endian fixes
  2016-10-19  8:46       ` Will Deacon
@ 2016-10-19  8:49         ` Ard Biesheuvel
  2016-10-19  9:15           ` Will Deacon
  0 siblings, 1 reply; 17+ messages in thread
From: Ard Biesheuvel @ 2016-10-19  8:49 UTC (permalink / raw)
  To: Will Deacon
  Cc: Catalin Marinas, Russell King - ARM Linux, Herbert Xu,
	linux-arm-kernel, linux-crypto

On 19 October 2016 at 09:46, Will Deacon <will.deacon@arm.com> wrote:
> On Wed, Oct 19, 2016 at 11:03:33AM +0800, Herbert Xu wrote:
>> On Tue, Oct 18, 2016 at 01:14:38PM +0100, Ard Biesheuvel wrote:
>> > On 18 October 2016 at 12:49, Catalin Marinas <catalin.marinas@arm.com> wrote:
>> > > On Tue, Oct 11, 2016 at 07:15:12PM +0100, Ard Biesheuvel wrote:
>> > >> As it turns out, none of the accelerated crypto routines under arch/arm64/crypto
>> > >> currently work, or have ever worked correctly when built for big endian. So this
>> > >> series fixes all of them. This v2 now includes a similar fix for 32-bit ARM as
>> > >> well, and an additional fix for XTS which escaped my attention before.
>> > >>
>> > >> Each of these patches carries a fixes tag, and could be backported to stable.
>> > >> However, for patches #1 and #5, the fixes tag denotes the oldest commit that the
>> > >> fix is compatible with, not the patch that introduced the algorithm.
>> > >
>> > > I think for future reference, the Fixes tag should denote the commit
>> > > that introduced the issue. An explicit Cc: stable tag would state how
>> > > far back it should be applied.
>> > >
>> >
>> > OK, that sounds reasonable.
>> >
>> > >> Ard Biesheuvel (8):
>> > >>   crypto: arm64/aes-ce - fix for big endian
>> > >>   crypto: arm64/ghash-ce - fix for big endian
>> > >>   crypto: arm64/sha1-ce - fix for big endian
>> > >>   crypto: arm64/sha2-ce - fix for big endian
>> > >>   crypto: arm64/aes-ccm-ce: fix for big endian
>> > >>   crypto: arm64/aes-neon - fix for big endian
>> > >>   crypto: arm64/aes-xts-ce: fix for big endian
>> > >>   crypto: arm/aes-ce - fix for big endian
>> > >
>> > > The changes look fine to me but I can't claim I fully understand these
>> > > algorithms. FWIW:
>> > >
>> > > Acked-by: Catalin Marinas <catalin.marinas@arm.com>
>> > >
>> > > (Will may pick them up for 4.9-rcX)
>> >
>> > Thanks, although I was kind of expecting Herbert to pick these up,
>> > given that #8 affects ARM not arm64.
>> >
>> > But if you (or Will) can pick up #1 to #7, that is also fine; then I
>> > can drop #8 into rmk's patch database.
>>
>> I was planning to merge these for 4.10, but I'm fine with them
>> going through the arm tree.  Let me know what you guys want to
>> do.
>
> I assumed you'd take them through crypto, as per usual, so I didn't
> queue anything in the arm64 tree.
>
> Ard -- were you planning to get these in for 4.9?
>

These are arguably bug fixes, but I spotted them by accident; they
weren't reported to me or anything. Still, it seems strange to add a
Cc: stable tag and then hold off until the next merge window.

In any case, I don't care deeply either way, as long as they get
merged in the end. I think it makes sense to keep them together (arm64
+ ARM), so Herbert's tree is a more natural route for them to take. I
will leave it up to Herbert whether they are sent onward as fixes or
as part of v4.10.

Thanks,
Ard.

* Re: [PATCH v2 0/8] crypto: ARM/arm64 - big endian fixes
  2016-10-19  8:49         ` Ard Biesheuvel
@ 2016-10-19  9:15           ` Will Deacon
  0 siblings, 0 replies; 17+ messages in thread
From: Will Deacon @ 2016-10-19  9:15 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Catalin Marinas, Russell King - ARM Linux, Herbert Xu,
	linux-arm-kernel, linux-crypto

On Wed, Oct 19, 2016 at 09:49:46AM +0100, Ard Biesheuvel wrote:
> On 19 October 2016 at 09:46, Will Deacon <will.deacon@arm.com> wrote:
> > On Wed, Oct 19, 2016 at 11:03:33AM +0800, Herbert Xu wrote:
> >> I was planning to merge these for 4.10, but I'm fine with them
> >> going through the arm tree.  Let me know what you guys want to
> >> do.
> >
> > I assumed you'd take them through crypto, as per usual, so I didn't
> > queue anything in the arm64 tree.
> >
> > Ard -- were you planning to get these in for 4.9?
> >
> 
> These are arguably bug fixes, but I spotted them by accident; they
> weren't reported to me or anything. Still, it seems strange to add a
> Cc: stable tag and then hold off until the next merge window.
> 
> In any case, I don't care deeply either way, as long as they get
> merged in the end. I think it makes sense to keep them together (arm64
> + ARM), so Herbert's tree is a more natural route for them to take. I
> will leave it up to Herbert whether they are sent onward as fixes or
> as part of v4.10.

Sounds good to me.

Will

* Re: [PATCH v2 0/8] crypto: ARM/arm64 - big endian fixes
  2016-10-11 18:15 [PATCH v2 0/8] crypto: ARM/arm64 - big endian fixes Ard Biesheuvel
                   ` (9 preceding siblings ...)
  2016-10-18 11:49 ` Catalin Marinas
@ 2016-10-21  3:16 ` Herbert Xu
  10 siblings, 0 replies; 17+ messages in thread
From: Herbert Xu @ 2016-10-21  3:16 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-crypto, linux-arm-kernel, will.deacon, catalin.marinas, linux

On Tue, Oct 11, 2016 at 07:15:12PM +0100, Ard Biesheuvel wrote:
> As it turns out, none of the accelerated crypto routines under arch/arm64/crypto
> currently work, or have ever worked correctly when built for big endian. So this
> series fixes all of them. This v2 now includes a similar fix for 32-bit ARM as
> well, and an additional fix for XTS which escaped my attention before.
> 
> Each of these patches carries a fixes tag, and could be backported to stable.
> However, for patches #1 and #5, the fixes tag denotes the oldest commit that the
> fix is compatible with, not the patch that introduced the algorithm. This is due
> to the fact that the key schedules are incompatible between generic AES and the
> arm64 Crypto Extensions implementation (but only when building for big endian)
> This is not a problem in practice, but it does mean that the AES-CCM and AES in
> ECB/CBC/CTR/XTS mode implementations before v3.19 require a different fix, i.e.,
> one that is compatible with the generic AES key schedule generation code (which
> it currently no longer uses)
> 
> In any case, please apply with cc to stable.
> 
> Ard Biesheuvel (8):
>   crypto: arm64/aes-ce - fix for big endian
>   crypto: arm64/ghash-ce - fix for big endian
>   crypto: arm64/sha1-ce - fix for big endian
>   crypto: arm64/sha2-ce - fix for big endian
>   crypto: arm64/aes-ccm-ce: fix for big endian
>   crypto: arm64/aes-neon - fix for big endian
>   crypto: arm64/aes-xts-ce: fix for big endian
>   crypto: arm/aes-ce - fix for big endian
> 
>  arch/arm/crypto/aes-ce-glue.c       |  5 ++
>  arch/arm64/crypto/aes-ce-ccm-core.S | 53 ++++++++++----------
>  arch/arm64/crypto/aes-ce-cipher.c   | 25 +++++----
>  arch/arm64/crypto/aes-ce.S          |  1 +
>  arch/arm64/crypto/aes-modes.S       |  3 +-
>  arch/arm64/crypto/aes-neon.S        | 25 +++++----
>  arch/arm64/crypto/ghash-ce-core.S   |  6 +--
>  arch/arm64/crypto/sha1-ce-core.S    |  4 +-
>  arch/arm64/crypto/sha2-ce-core.S    |  4 +-
>  9 files changed, 72 insertions(+), 54 deletions(-)

All applied.  Thanks.
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

end of thread, other threads:[~2016-10-21  3:16 UTC | newest]

Thread overview: 17+ messages
2016-10-11 18:15 [PATCH v2 0/8] crypto: ARM/arm64 - big endian fixes Ard Biesheuvel
2016-10-11 18:15 ` [PATCH v2 1/8] crypto: arm64/aes-ce - fix for big endian Ard Biesheuvel
2016-10-11 18:15 ` [PATCH v2 2/8] crypto: arm64/ghash-ce " Ard Biesheuvel
2016-10-11 18:15 ` [PATCH v2 3/8] crypto: arm64/sha1-ce " Ard Biesheuvel
2016-10-11 18:15 ` [PATCH v2 4/8] crypto: arm64/sha2-ce " Ard Biesheuvel
2016-10-11 18:15 ` [PATCH v2 5/8] crypto: arm64/aes-ccm-ce: " Ard Biesheuvel
2016-10-11 18:15 ` [PATCH v2 6/8] crypto: arm64/aes-neon - " Ard Biesheuvel
2016-10-11 18:15 ` [PATCH v2 7/8] crypto: arm64/aes-xts-ce: " Ard Biesheuvel
2016-10-11 18:15 ` [PATCH v2 8/8] crypto: arm/aes-ce - " Ard Biesheuvel
2016-10-18 10:55 ` [PATCH v2 0/8] crypto: ARM/arm64 - big endian fixes Ard Biesheuvel
2016-10-18 11:49 ` Catalin Marinas
2016-10-18 12:14   ` Ard Biesheuvel
2016-10-19  3:03     ` Herbert Xu
2016-10-19  8:46       ` Will Deacon
2016-10-19  8:49         ` Ard Biesheuvel
2016-10-19  9:15           ` Will Deacon
2016-10-21  3:16 ` Herbert Xu
