From: Eric Biggers <ebiggers@kernel.org>
To: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: linux-crypto@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	herbert@gondor.apana.org.au
Subject: Re: [PATCH 2/2] crypto: arm64/crct10dif - revert to C code for short inputs
Date: Thu, 24 Jan 2019 23:29:49 -0800
Message-ID: <20190125072948.GC700@sol.localdomain>
In-Reply-To: <20190124182712.7142-3-ard.biesheuvel@linaro.org>

On Thu, Jan 24, 2019 at 07:27:12PM +0100, Ard Biesheuvel wrote:
> The SIMD routine ported from x86 used to have a special code path
> for inputs < 16 bytes, which got lost somewhere along the way.
> Instead, the current glue code aligns the input pointer to permit
> the NEON routine to use special versions of the vld1 instructions
> that assume 16 byte alignment, but this could result in inputs of
> less than 16 bytes being passed in.

This description doesn't quite match the patch since the arm64 version of the
assembly doesn't use any alignment specifiers.  I take it that actually means
the alignment in the glue code wasn't necessary in the first place?

> This not only fails the new
> extended tests that Eric has implemented, it also results in the
> code reading before the input pointer, which could potentially
> result in crashes when dealing with less than 16 bytes of input
> at the start of a page which is preceded by an unmapped page.
> 
> So update the glue code to only invoke the NEON routine if the
> input is at least 16 bytes.
> 
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

Can you add:

Fixes: 6ef5737f3931 ("crypto: arm64/crct10dif - port x86 SSE implementation to arm64")
Cc: stable@vger.kernel.org

> ---
>  arch/arm64/crypto/crct10dif-ce-glue.c | 25 +++++---------------
>  1 file changed, 6 insertions(+), 19 deletions(-)
> 
> diff --git a/arch/arm64/crypto/crct10dif-ce-glue.c b/arch/arm64/crypto/crct10dif-ce-glue.c
> index b461d62023f2..567c24f3d224 100644
> --- a/arch/arm64/crypto/crct10dif-ce-glue.c
> +++ b/arch/arm64/crypto/crct10dif-ce-glue.c
> @@ -39,26 +39,13 @@ static int crct10dif_update(struct shash_desc *desc, const u8 *data,
>  			    unsigned int length)
>  {
>  	u16 *crc = shash_desc_ctx(desc);
> -	unsigned int l;
>  
> -	if (unlikely((u64)data % CRC_T10DIF_PMULL_CHUNK_SIZE)) {
> -		l = min_t(u32, length, CRC_T10DIF_PMULL_CHUNK_SIZE -
> -			  ((u64)data % CRC_T10DIF_PMULL_CHUNK_SIZE));
> -
> -		*crc = crc_t10dif_generic(*crc, data, l);
> -
> -		length -= l;
> -		data += l;
> -	}
> -
> -	if (length > 0) {
> -		if (may_use_simd()) {
> -			kernel_neon_begin();
> -			*crc = crc_t10dif_pmull(*crc, data, length);
> -			kernel_neon_end();
> -		} else {
> -			*crc = crc_t10dif_generic(*crc, data, length);
> -		}
> +	if (length >= CRC_T10DIF_PMULL_CHUNK_SIZE && may_use_simd()) {
> +		kernel_neon_begin();
> +		*crc = crc_t10dif_pmull(*crc, data, length);
> +		kernel_neon_end();
> +	} else {
> +		*crc = crc_t10dif_generic(*crc, data, length);
>  	}
>  
>  	return 0;
> -- 
> 2.17.1
> 
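
For illustration, the dispatch rule in the patched crct10dif_update() can be
modelled in user space. This is a minimal sketch under stated assumptions, not
the kernel code: crc_t10dif_bitwise() is a plain bit-at-a-time CRC-T10DIF
(polynomial 0x8BB7) standing in for crc_t10dif_generic(), crc_t10dif_wide() is
a hypothetical stand-in for crc_t10dif_pmull() that only counts its calls,
simd_usable stands in for may_use_simd(), and CHUNK_SIZE is assumed to be 16
to match CRC_T10DIF_PMULL_CHUNK_SIZE.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror CRC_T10DIF_PMULL_CHUNK_SIZE (16 bytes). */
#define CHUNK_SIZE 16u

/* Bit-at-a-time CRC-T10DIF: polynomial 0x8BB7, MSB first, no reflection. */
static uint16_t crc_t10dif_bitwise(uint16_t crc, const uint8_t *data,
				   size_t len)
{
	for (size_t i = 0; i < len; i++) {
		crc ^= (uint16_t)(data[i] << 8);
		for (int bit = 0; bit < 8; bit++) {
			if (crc & 0x8000)
				crc = (uint16_t)((crc << 1) ^ 0x8BB7);
			else
				crc = (uint16_t)(crc << 1);
		}
	}
	return crc;
}

/* Hypothetical stand-in for the NEON/PMULL routine; it only counts calls. */
static unsigned int wide_calls;

static uint16_t crc_t10dif_wide(uint16_t crc, const uint8_t *data, size_t len)
{
	wide_calls++;
	return crc_t10dif_bitwise(crc, data, len);
}

/* Hypothetical stand-in for may_use_simd(). */
static int simd_usable = 1;

/*
 * Same shape as the patched update: take the wide path only when the input
 * is at least one chunk long, otherwise fall back to the scalar code. No
 * pointer alignment is performed, so the wide routine never sees a short
 * tail produced by an alignment step.
 */
static uint16_t crct10dif_update_sketch(uint16_t crc, const uint8_t *data,
					size_t len)
{
	if (len >= CHUNK_SIZE && simd_usable)
		return crc_t10dif_wide(crc, data, len);
	return crc_t10dif_bitwise(crc, data, len);
}
```

With this rule a 9-byte input never reaches the wide routine, while a 16-byte
input does, regardless of where the buffer starts.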
