* [PATCH] crypto: vmx - CTR: always increment IV as quadword
@ 2019-05-15 10:24 Daniel Axtens
  2019-05-17  1:26 ` Nayna
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Daniel Axtens @ 2019-05-15 10:24 UTC
  To: mpe, ebiggers, linux-crypto, Herbert Xu
  Cc: marcelo.cerri, Stephan Mueller, leo.barbosa, linuxppc-dev, nayna,
	pfsmorigo, leitao, gcwilson, omosnacek

The kernel self-tests picked up an issue with CTR mode:
alg: skcipher: p8_aes_ctr encryption test failed (wrong result) on test vector 3, cfg="uneven misaligned splits, may sleep"

Test vector 3 has an IV of FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFD, so
after 3 increments it should wrap around to 0.

In the aesp8-ppc code from OpenSSL, there are two paths that
increment IVs: the bulk (8 at a time) path, and the individual
path which is used when there are fewer than 8 AES blocks to
process.

In the bulk path, the IV is incremented with vadduqm: "Vector
Add Unsigned Quadword Modulo", which does 128-bit addition.

In the individual path, however, the IV is incremented with
vadduwm: "Vector Add Unsigned Word Modulo", which instead
does 4 32-bit additions. Thus the IV would instead become
FFFFFFFFFFFFFFFFFFFFFFFF00000000, throwing off the result.

Use vadduqm.

This was probably a typo originally, what with q and w being
adjacent. It is a pretty narrow edge case: I am really
impressed by the quality of the kernel self-tests!

Fixes: 5c380d623ed3 ("crypto: vmx - Add support for VMS instructions by ASM")
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Axtens <dja@axtens.net>

---

I'll pass this along internally to get it into OpenSSL as well.
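
For reference, a standalone C sketch of the two increment behaviours
described above (not part of the patch, just my mental model of the
instructions), treating the 16-byte IV as four big-endian 32-bit words
and modelling the word-wise add as adding {0, 0, 0, 1}:

#include <stdio.h>
#include <stdint.h>

/* vadduqm-style: one 128-bit addition, carry ripples across words */
static void inc_quadword(uint32_t iv[4])
{
	for (int i = 3; i >= 0 && ++iv[i] == 0; i--)
		;
}

/* vadduwm-style: independent 32-bit additions, no carry between words */
static void inc_words(uint32_t iv[4])
{
	iv[3] += 1;
}

int main(void)
{
	uint32_t a[4] = { 0xffffffff, 0xffffffff, 0xffffffff, 0xfffffffd };
	uint32_t b[4] = { 0xffffffff, 0xffffffff, 0xffffffff, 0xfffffffd };

	for (int i = 0; i < 3; i++) {
		inc_quadword(a);
		inc_words(b);
	}
	printf("quadword: %08x%08x%08x%08x\n", a[0], a[1], a[2], a[3]);
	printf("wordwise: %08x%08x%08x%08x\n", b[0], b[1], b[2], b[3]);
	/* quadword: 00000000000000000000000000000000
	   wordwise: ffffffffffffffffffffffff00000000 */
	return 0;
}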
---
 drivers/crypto/vmx/aesp8-ppc.pl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/crypto/vmx/aesp8-ppc.pl b/drivers/crypto/vmx/aesp8-ppc.pl
index de78282b8f44..9c6b5c1d6a1a 100644
--- a/drivers/crypto/vmx/aesp8-ppc.pl
+++ b/drivers/crypto/vmx/aesp8-ppc.pl
@@ -1357,7 +1357,7 @@ Loop_ctr32_enc:
 	addi		$idx,$idx,16
 	bdnz		Loop_ctr32_enc
 
-	vadduwm		$ivec,$ivec,$one
+	vadduqm		$ivec,$ivec,$one
 	 vmr		$dat,$inptail
 	 lvx		$inptail,0,$inp
 	 addi		$inp,$inp,16
-- 
2.19.1



* Re: [PATCH] crypto: vmx - CTR: always increment IV as quadword
  2019-05-15 10:24 [PATCH] crypto: vmx - CTR: always increment IV as quadword Daniel Axtens
@ 2019-05-17  1:26 ` Nayna
  2019-05-17  6:00 ` Herbert Xu
  2019-05-20  1:59 ` Daniel Axtens
  2 siblings, 0 replies; 5+ messages in thread
From: Nayna @ 2019-05-17  1:26 UTC
  To: Daniel Axtens, mpe, ebiggers, linux-crypto, Herbert Xu
  Cc: marcelo.cerri, Stephan Mueller, leo.barbosa, linuxppc-dev, nayna,
	pfsmorigo, leitao, gcwilson, omosnacek



On 05/15/2019 06:24 AM, Daniel Axtens wrote:
> The kernel self-tests picked up an issue with CTR mode:
> alg: skcipher: p8_aes_ctr encryption test failed (wrong result) on test vector 3, cfg="uneven misaligned splits, may sleep"
>
> Test vector 3 has an IV of FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFD, so
> after 3 increments it should wrap around to 0.
>
> In the aesp8-ppc code from OpenSSL, there are two paths that
> increment IVs: the bulk (8 at a time) path, and the individual
> path which is used when there are fewer than 8 AES blocks to
> process.
>
> In the bulk path, the IV is incremented with vadduqm: "Vector
> Add Unsigned Quadword Modulo", which does 128-bit addition.
>
> In the individual path, however, the IV is incremented with
> vadduwm: "Vector Add Unsigned Word Modulo", which instead
> does 4 32-bit additions. Thus the IV would instead become
> FFFFFFFFFFFFFFFFFFFFFFFF00000000, throwing off the result.
>
> Use vadduqm.
>
> This was probably a typo originally, what with q and w being
> adjacent. It is a pretty narrow edge case: I am really
> impressed by the quality of the kernel self-tests!
>
> Fixes: 5c380d623ed3 ("crypto: vmx - Add support for VMS instructions by ASM")
> Cc: stable@vger.kernel.org
> Signed-off-by: Daniel Axtens <dja@axtens.net>
>
> ---
>
> I'll pass this along internally to get it into OpenSSL as well.
> ---
>   drivers/crypto/vmx/aesp8-ppc.pl | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/crypto/vmx/aesp8-ppc.pl b/drivers/crypto/vmx/aesp8-ppc.pl
> index de78282b8f44..9c6b5c1d6a1a 100644
> --- a/drivers/crypto/vmx/aesp8-ppc.pl
> +++ b/drivers/crypto/vmx/aesp8-ppc.pl
> @@ -1357,7 +1357,7 @@ Loop_ctr32_enc:
>   	addi		$idx,$idx,16
>   	bdnz		Loop_ctr32_enc
>   
> -	vadduwm		$ivec,$ivec,$one
> +	vadduqm		$ivec,$ivec,$one
>   	 vmr		$dat,$inptail
>   	 lvx		$inptail,0,$inp
>   	 addi		$inp,$inp,16

Acked-by: Nayna Jain <nayna@linux.ibm.com>
Tested-by: Nayna Jain <nayna@linux.ibm.com>



* Re: [PATCH] crypto: vmx - CTR: always increment IV as quadword
  2019-05-15 10:24 [PATCH] crypto: vmx - CTR: always increment IV as quadword Daniel Axtens
  2019-05-17  1:26 ` Nayna
@ 2019-05-17  6:00 ` Herbert Xu
  2019-05-20  1:59 ` Daniel Axtens
  2 siblings, 0 replies; 5+ messages in thread
From: Herbert Xu @ 2019-05-17  6:00 UTC
  To: Daniel Axtens
  Cc: mpe, ebiggers, linux-crypto, marcelo.cerri, Stephan Mueller,
	leo.barbosa, linuxppc-dev, nayna, pfsmorigo, leitao, gcwilson,
	omosnacek

On Wed, May 15, 2019 at 08:24:50PM +1000, Daniel Axtens wrote:
> The kernel self-tests picked up an issue with CTR mode:
> alg: skcipher: p8_aes_ctr encryption test failed (wrong result) on test vector 3, cfg="uneven misaligned splits, may sleep"
> 
> Test vector 3 has an IV of FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFD, so
> after 3 increments it should wrap around to 0.
> 
> In the aesp8-ppc code from OpenSSL, there are two paths that
> increment IVs: the bulk (8 at a time) path, and the individual
> path which is used when there are fewer than 8 AES blocks to
> process.
> 
> In the bulk path, the IV is incremented with vadduqm: "Vector
> Add Unsigned Quadword Modulo", which does 128-bit addition.
> 
> In the individual path, however, the IV is incremented with
> vadduwm: "Vector Add Unsigned Word Modulo", which instead
> does 4 32-bit additions. Thus the IV would instead become
> FFFFFFFFFFFFFFFFFFFFFFFF00000000, throwing off the result.
> 
> Use vadduqm.
> 
> This was probably a typo originally, what with q and w being
> adjacent. It is a pretty narrow edge case: I am really
> impressed by the quality of the kernel self-tests!
> 
> Fixes: 5c380d623ed3 ("crypto: vmx - Add support for VMS instructions by ASM")
> Cc: stable@vger.kernel.org
> Signed-off-by: Daniel Axtens <dja@axtens.net>
> 
> ---
> 
> I'll pass this along internally to get it into OpenSSL as well.
> ---
>  drivers/crypto/vmx/aesp8-ppc.pl | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Patch applied.  Thanks.
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: [PATCH] crypto: vmx - CTR: always increment IV as quadword
  2019-05-15 10:24 [PATCH] crypto: vmx - CTR: always increment IV as quadword Daniel Axtens
  2019-05-17  1:26 ` Nayna
  2019-05-17  6:00 ` Herbert Xu
@ 2019-05-20  1:59 ` Daniel Axtens
  2019-05-20 16:39   ` Eric Biggers
  2 siblings, 1 reply; 5+ messages in thread
From: Daniel Axtens @ 2019-05-20  1:59 UTC
  To: mpe, ebiggers, linux-crypto, Herbert Xu
  Cc: marcelo.cerri, Stephan Mueller, leo.barbosa, linuxppc-dev, nayna,
	pfsmorigo, leitao, gcwilson, omosnacek

Daniel Axtens <dja@axtens.net> writes:

> The kernel self-tests picked up an issue with CTR mode:
> alg: skcipher: p8_aes_ctr encryption test failed (wrong result) on test vector 3, cfg="uneven misaligned splits, may sleep"
>
> Test vector 3 has an IV of FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFD, so
> after 3 increments it should wrap around to 0.
>
> In the aesp8-ppc code from OpenSSL, there are two paths that
> increment IVs: the bulk (8 at a time) path, and the individual
> path which is used when there are fewer than 8 AES blocks to
> process.
>
> In the bulk path, the IV is incremented with vadduqm: "Vector
> Add Unsigned Quadword Modulo", which does 128-bit addition.
>
> In the individual path, however, the IV is incremented with
> vadduwm: "Vector Add Unsigned Word Modulo", which instead
> does 4 32-bit additions. Thus the IV would instead become
> FFFFFFFFFFFFFFFFFFFFFFFF00000000, throwing off the result.
>
> Use vadduqm.
>
> This was probably a typo originally, what with q and w being
> adjacent. It is a pretty narrow edge case: I am really
> impressed by the quality of the kernel self-tests!
>
> Fixes: 5c380d623ed3 ("crypto: vmx - Add support for VMS instructions by ASM")
> Cc: stable@vger.kernel.org
> Signed-off-by: Daniel Axtens <dja@axtens.net>
>
> ---
>
> I'll pass this along internally to get it into OpenSSL as well.

I passed this along to OpenSSL and got pretty comprehensively schooled:
https://github.com/openssl/openssl/pull/8942

It seems we tweaked the OpenSSL code to use a 128-bit counter, whereas
the original code was in fact designed for a 32-bit counter. We must
have changed the vaddu instruction in the bulk path but not in the
individual path, as both are vadduwm (4x32-bit) upstream.
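
To make the two conventions concrete, here is a rough sketch (my own
summary, not lifted from either code base), again treating the counter
block as four big-endian 32-bit words:

#include <stdint.h>

/* "ctr32" convention (upstream OpenSSL, and what GCM wants): only the
 * low 32 bits count, wrapping mod 2^32 with no carry into the rest. */
static void ctr32_inc(uint32_t ctr[4])
{
	ctr[3] += 1;
}

/* Kernel convention: the full 128-bit block is the counter, so the
 * carry propagates all the way up (what vadduqm gives us). */
static void ctr128_inc(uint32_t ctr[4])
{
	for (int i = 3; i >= 0 && ++ctr[i] == 0; i--)
		;
}

The kernel self-tests expect the second behaviour, which is why both
the bulk and individual paths want vadduqm here even though upstream
uses vadduwm in both.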

I think this change is still correct with regard to the kernel, but I
should probably have read the documentation more thoroughly before
diving into the code, and perhaps we should note the convention in the
code somewhere too. Ah well.

Regards,
Daniel

> ---
>  drivers/crypto/vmx/aesp8-ppc.pl | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/crypto/vmx/aesp8-ppc.pl b/drivers/crypto/vmx/aesp8-ppc.pl
> index de78282b8f44..9c6b5c1d6a1a 100644
> --- a/drivers/crypto/vmx/aesp8-ppc.pl
> +++ b/drivers/crypto/vmx/aesp8-ppc.pl
> @@ -1357,7 +1357,7 @@ Loop_ctr32_enc:
>  	addi		$idx,$idx,16
>  	bdnz		Loop_ctr32_enc
>  
> -	vadduwm		$ivec,$ivec,$one
> +	vadduqm		$ivec,$ivec,$one
>  	 vmr		$dat,$inptail
>  	 lvx		$inptail,0,$inp
>  	 addi		$inp,$inp,16
> -- 
> 2.19.1


* Re: [PATCH] crypto: vmx - CTR: always increment IV as quadword
  2019-05-20  1:59 ` Daniel Axtens
@ 2019-05-20 16:39   ` Eric Biggers
  0 siblings, 0 replies; 5+ messages in thread
From: Eric Biggers @ 2019-05-20 16:39 UTC
  To: Daniel Axtens
  Cc: mpe, linux-crypto, Herbert Xu, marcelo.cerri, Stephan Mueller,
	leo.barbosa, linuxppc-dev, nayna, pfsmorigo, leitao, gcwilson,
	omosnacek

On Mon, May 20, 2019 at 11:59:05AM +1000, Daniel Axtens wrote:
> Daniel Axtens <dja@axtens.net> writes:
> 
> > The kernel self-tests picked up an issue with CTR mode:
> > alg: skcipher: p8_aes_ctr encryption test failed (wrong result) on test vector 3, cfg="uneven misaligned splits, may sleep"
> >
> > Test vector 3 has an IV of FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFD, so
> > after 3 increments it should wrap around to 0.
> >
> > In the aesp8-ppc code from OpenSSL, there are two paths that
> > increment IVs: the bulk (8 at a time) path, and the individual
> > path which is used when there are fewer than 8 AES blocks to
> > process.
> >
> > In the bulk path, the IV is incremented with vadduqm: "Vector
> > Add Unsigned Quadword Modulo", which does 128-bit addition.
> >
> > In the individual path, however, the IV is incremented with
> > vadduwm: "Vector Add Unsigned Word Modulo", which instead
> > does 4 32-bit additions. Thus the IV would instead become
> > FFFFFFFFFFFFFFFFFFFFFFFF00000000, throwing off the result.
> >
> > Use vadduqm.
> >
> > This was probably a typo originally, what with q and w being
> > adjacent. It is a pretty narrow edge case: I am really
> > impressed by the quality of the kernel self-tests!
> >
> > Fixes: 5c380d623ed3 ("crypto: vmx - Add support for VMS instructions by ASM")
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Daniel Axtens <dja@axtens.net>
> >
> > ---
> >
> > I'll pass this along internally to get it into OpenSSL as well.
> 
> I passed this along to OpenSSL and got pretty comprehensively schooled:
> https://github.com/openssl/openssl/pull/8942
> 
> It seems we tweaked the OpenSSL code to use a 128-bit counter, whereas
> the original code was in fact designed for a 32-bit counter. We must
> have changed the vaddu instruction in the bulk path but not in the
> individual path, as both are vadduwm (4x32-bit) upstream.
> 
> I think this change is still correct with regard to the kernel, but I
> should probably have read the documentation more thoroughly before
> diving into the code, and perhaps we should note the convention in the
> code somewhere too. Ah well.
> 
> Regards,
> Daniel
> 

Ah, I didn't realize there are multiple conventions for CTR.  Yes, all CTR
implementations in the kernel follow the 128-bit convention, and evidently the
test vectors test for that.  Apparently the VMX OpenSSL code got incompletely
changed from the 32-bit convention by this commit, so that's what you're fixing:

	commit 1d4aa0b4c1816e8ca92a6aadb0d8f6b43c56c0d0
	Author: Leonidas Da Silva Barbosa <leosilva@linux.vnet.ibm.com>
	Date:   Fri Aug 14 10:12:22 2015 -0300

	    crypto: vmx - Fixing AES-CTR counter bug
	    
	    AES-CTR is using a counter 8bytes-8bytes what miss match with
	    kernel specs.
	    
	    In the previous code a vadduwm was done to increment counter.
	    Replacing this for a vadduqm now considering both cases counter
	    8-8 bytes and full 16bytes.

A comment in the code would indeed be helpful.

Note that the kernel doesn't currently need a 32-bit CTR implementation for GCM
like OpenSSL does, because the kernel currently only supports 12-byte IVs with
GCM.  So the low 32 bits of the counter start at 1 and don't overflow,
regardless of whether the counter is incremented mod 2^32 or mod 2^128.
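
As a sketch of what that looks like for the 12-byte-IV case (my
reading of the GCM construction, not code quoted from the kernel):

#include <stdint.h>
#include <string.h>

/* Initial GCM counter block for a 96-bit IV: J0 = IV || 0x00000001.
 * Messages are limited to fewer than 2^32 blocks, so the low 32 bits
 * never wrap back past zero and mod 2^32 vs mod 2^128 is moot. */
static void gcm_initial_counter(const uint8_t iv[12], uint8_t j0[16])
{
	memcpy(j0, iv, 12);
	j0[12] = 0;
	j0[13] = 0;
	j0[14] = 0;
	j0[15] = 1;	/* big-endian 32-bit counter starts at 1 */
}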

- Eric

