From: Eric Biggers <ebiggers@kernel.org>
To: linux-crypto@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org,
	Ard Biesheuvel <ardb@kernel.org>,
	Herbert Xu <herbert@gondor.apana.org.au>,
	David Sterba <dsterba@suse.com>,
	"Jason A . Donenfeld" <Jason@zx2c4.com>,
	Paul Crowley <paulcrowley@google.com>
Subject: Re: [PATCH 0/5] crypto: add NEON-optimized BLAKE2b
Date: Wed, 16 Dec 2020 12:47:56 -0800
Message-ID: <X9pyfAaw5hQ6ngTI@gmail.com>
In-Reply-To: <20201215234708.105527-1-ebiggers@kernel.org>

On Tue, Dec 15, 2020 at 03:47:03PM -0800, Eric Biggers wrote:
> This patchset adds a NEON implementation of BLAKE2b for 32-bit ARM.
> Patches 1-4 prepare for it by making some updates to the generic
> implementation, while patch 5 adds the actual NEON implementation.
> 
> On Cortex-A7 (which these days is the most common ARM processor that
> doesn't have the ARMv8 Crypto Extensions), this is over twice as fast as
> SHA-256, and slightly faster than SHA-1.  It is also almost three times
> as fast as the generic implementation of BLAKE2b:
> 
> 	Algorithm            Cycles per byte (on 4096-byte messages)
> 	===================  =======================================
> 	blake2b-256-neon     14.1
> 	sha1-neon            16.4
> 	sha1-asm             20.8
> 	blake2s-256-generic  26.1
> 	sha256-neon	     28.9
> 	sha256-asm	     32.1
> 	blake2b-256-generic  39.9
> 
> This implementation isn't directly based on any other implementation,
> but it borrows some ideas from previous NEON code I've written as well
> as from chacha-neon-core.S.  At least on Cortex-A7, it is faster than
> the other NEON implementations of BLAKE2b I'm aware of (the
> implementation in the BLAKE2 official repository using intrinsics, and
> Andrew Moon's implementation which can be found in SUPERCOP).
> 
> NEON-optimized BLAKE2b is useful because there is interest in using
> BLAKE2b-256 for dm-verity on low-end Android devices (specifically,
> devices that lack the ARMv8 Crypto Extensions) to replace SHA-1.  On
> these devices, the performance cost of upgrading to SHA-256 may be
> unacceptable, whereas BLAKE2b-256 would actually improve performance.
> 
> Although BLAKE2b is intended for 64-bit platforms (unlike BLAKE2s which
> is intended for 32-bit platforms), on 32-bit ARM processors with NEON,
> BLAKE2b is actually faster than BLAKE2s.  This is because NEON supports
> 64-bit operations, and because BLAKE2s's block size is too small for
> NEON to be helpful for it.  The best I've been able to do with BLAKE2s
> on Cortex-A7 is 19.0 cpb with an optimized scalar implementation.

By the way, if people are interested in having my ARM scalar implementation of
BLAKE2s in the kernel as well, I can send a patchset for that.  It just ended up
being slower than the NEON implementations of BLAKE2b and SHA-1, so it wasn't as
good a fit for the use case mentioned above.  If it were added as
"blake2s-256-arm", we'd have:

	Algorithm            Cycles per byte (on 4096-byte messages)
	===================  =======================================
	blake2b-256-neon     14.1
	sha1-neon            16.4
	blake2s-256-arm      19.0
	sha1-asm             20.8
	blake2s-256-generic  26.1
	sha256-neon          28.9
	sha256-asm           32.1
	blake2b-256-generic  39.9
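
To put those numbers in relative terms, here is a minimal Python sketch that
computes ratios from the table above.  The names CPB, speedup() and
cycles_for() are made up for this illustration; the only data is the
cycles-per-byte figures already listed.  cycles_for() assumes cost scales
linearly with message length, which measurements at a single 4096-byte size
do not establish.

```python
# Cycles per byte on 4096-byte messages (Cortex-A7), copied from the table above.
CPB = {
    "blake2b-256-neon": 14.1,
    "sha1-neon": 16.4,
    "blake2s-256-arm": 19.0,
    "sha1-asm": 20.8,
    "blake2s-256-generic": 26.1,
    "sha256-neon": 28.9,
    "sha256-asm": 32.1,
    "blake2b-256-generic": 39.9,
}


def speedup(fast, slow):
    """Return how many times faster `fast` is than `slow` (cpb ratio)."""
    return CPB[slow] / CPB[fast]


def cycles_for(alg, nbytes=4096):
    """Estimate total cycles to hash `nbytes`, assuming linear scaling."""
    return CPB[alg] * nbytes


if __name__ == "__main__":
    # Compare the NEON BLAKE2b against every other entry in the table.
    for other in CPB:
        if other != "blake2b-256-neon":
            ratio = speedup("blake2b-256-neon", other)
            print(f"blake2b-256-neon vs {other:<20} {ratio:.2f}x")
    # Compare the scalar BLAKE2s against the generic BLAKE2s.
    ratio = speedup("blake2s-256-arm", "blake2s-256-generic")
    print(f"blake2s-256-arm  vs blake2s-256-generic  {ratio:.2f}x")
```

The ratios are only as meaningful as the single-size, single-CPU measurements
they come from.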
