From: Herbert Xu <herbert@gondor.apana.org.au>
To: Eric Biggers <ebiggers@google.com>
Cc: Greg Kaiser <gkaiser@google.com>,
	Ard Biesheuvel <ard.biesheuvel@linaro.org>,
	Michael Halcrow <mhalcrow@google.com>,
	Patrik Torstensson <totte@google.com>,
	Paul Lawrence <paullawrence@google.com>,
	linux-fscrypt@vger.kernel.org, linux-crypto@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	Paul Crowley <paulcrowley@google.com>
Subject: Re: [RFC PATCH] crypto: arm64/speck - add NEON-accelerated implementation of Speck-XTS
Date: Fri, 16 Mar 2018 23:53:43 +0800
Message-ID: <20180316155343.GC7095@gondor.apana.org.au>
In-Reply-To: <20180305191707.143961-1-ebiggers@google.com>

On Mon, Mar 05, 2018 at 11:17:07AM -0800, Eric Biggers wrote:
> Add a NEON-accelerated implementation of Speck128-XTS and Speck64-XTS
> for ARM64.  This is ported from the 32-bit version.  It may be useful on
> devices with 64-bit ARM CPUs that don't have the Cryptography
> Extensions, so cannot do AES efficiently -- e.g. the Cortex-A53
> processor on the Raspberry Pi 3.
> 
> It generally works the same way as the 32-bit version, but there are
> some slight differences due to the different instructions, registers,
> and syntax available in ARM64 vs. in ARM32.  For example, in the 64-bit
> version there are enough registers to hold the XTS tweaks for each
> 128-byte chunk, so they don't need to be saved on the stack.
> 
> Benchmarks on a Raspberry Pi 3 running a 64-bit kernel:
> 
>    Algorithm                              Encryption     Decryption
>    ---------                              ----------     ----------
>    Speck64/128-XTS (NEON)                 92.2 MB/s      92.2 MB/s
>    Speck128/256-XTS (NEON)                75.0 MB/s      75.0 MB/s
>    Speck128/256-XTS (generic)             47.4 MB/s      35.6 MB/s
>    AES-128-XTS (NEON bit-sliced)          33.4 MB/s      29.6 MB/s
>    AES-256-XTS (NEON bit-sliced)          24.6 MB/s      21.7 MB/s
> 
> The code performs well on higher-end ARM64 processors as well, though
> such processors tend to have the Crypto Extensions which make AES
> preferred.  For example, here are the same benchmarks run on a HiKey960
> (with CPU affinity set for the A73 cores), with the Crypto Extensions
> implementation of AES-256-XTS added:
> 
>    Algorithm                              Encryption     Decryption
>    ---------                              -----------    -----------
>    AES-256-XTS (Crypto Extensions)        1273.3 MB/s    1274.7 MB/s
>    Speck64/128-XTS (NEON)                  359.8 MB/s     348.0 MB/s
>    Speck128/256-XTS (NEON)                 292.5 MB/s     286.1 MB/s
>    Speck128/256-XTS (generic)              186.3 MB/s     181.8 MB/s
>    AES-128-XTS (NEON bit-sliced)           142.0 MB/s     124.3 MB/s
>    AES-256-XTS (NEON bit-sliced)           104.7 MB/s      91.1 MB/s
> 
> Signed-off-by: Eric Biggers <ebiggers@google.com>

Patch applied.  Thanks.
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Thread overview:
2018-03-05 19:17 [RFC PATCH] crypto: arm64/speck - add NEON-accelerated implementation of Speck-XTS Eric Biggers
2018-03-06 12:35 ` Dave Martin
2018-03-06 12:47   ` Ard Biesheuvel
2018-03-06 13:44     ` Dave Martin
2018-03-16 15:53 ` Herbert Xu [this message]
